Despite the growing popularity of virtualized services, privacy in the cloud remains an unresolved problem. Companies such as TurboTax sell services that prepare income tax returns online, in the cloud. However, if the TurboTax site is compromised, all of its customers’ personal data will be exposed. Experience suggests that every computer site is eventually hacked or compromised through human failure.
We present an anonymous tax preparation protocol that mitigates the impact of compromised computers by distributing the processing. The identities of the collaborating computers are hidden from one another using a double-locked-box cryptographic protocol that was developed for an anonymous credit card in the 1990s.
In the 1990s we were concerned with data storage. Now we are concerned with both storage and processing. Processed data and the source data from which it is computed are not independent: if the processed data is compromised, the source data is partially compromised. For instance, if we know the sum of two numbers, we know bounds on the values of the source data. In addition, the source data items are not always independent of one another. For instance, in the tax preparation system, a person’s charitable contributions or alimony payments are correlated with that person’s income. We use entropy to indicate how well the source data can be predicted, both initially and as processing components are compromised.
Entropy is a measure of the average number of bits needed to specify all of the source data. It is a function of the probability distribution of the data and of the correlation between data items. We calculate the entropy for each processing unit, and the entropy as the information from different processing units is combined.
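As a rough illustration of how correlation lowers entropy, the following Python sketch computes the Shannon entropy of two correlated items and of their joint distribution. The brackets and probabilities are invented for illustration; they are not taken from the 2013 tax statistics or from the authors' model.

```python
import math
from collections import Counter


def entropy(dist):
    """Shannon entropy in bits of a mapping outcome -> probability."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def marginal(joint, index):
    """Marginal distribution of one coordinate of a joint distribution."""
    out = Counter()
    for outcome, p in joint.items():
        out[outcome[index]] += p
    return dict(out)


# Hypothetical joint distribution of (income bracket, donation bracket).
# The numbers are made up; high income is made to favor large donations.
joint = {
    ("low", "small"): 0.45,
    ("low", "large"): 0.05,
    ("high", "small"): 0.10,
    ("high", "large"): 0.40,
}

h_income = entropy(marginal(joint, 0))
h_donation = entropy(marginal(joint, 1))
h_joint = entropy(joint)

print(f"H(income)           = {h_income:.3f} bits")
print(f"H(donation)         = {h_donation:.3f} bits")
print(f"H(income, donation) = {h_joint:.3f} bits")
```

Because the two items are correlated, the joint entropy comes out smaller than the sum of the two marginal entropies: fewer bits are needed to specify both values together than to specify each one separately.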
In this talk we:
- Give a brief description of the double locked box protocol and the anonymous credit card system.
- Describe the processing architecture of a distributed tax preparation system.
- Calculate the entropy of the US tax data, based upon the 2013 statistics.
- Give an example of the entropy calculations in two cases: given only the output of a particular processing unit, and given both the processing unit and its inputs. The first measure provides the decrease in privacy when the output of a processing unit is associated with an individual, and the second is a measure of the additional decrease in privacy when the processing unit is also compromised.
This work is co-authored by Emmanuel S. Peters and Nicholas F. Maxemchuk.
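The two measures in the last item can be sketched with a toy processing unit. This is a minimal illustration under stated assumptions, not the authors' tax model: the source data are three hypothetical independent values `a`, `b`, `c`, each uniform on 0..3, and the unit outputs `a + b`.

```python
import math
from collections import Counter
from itertools import product


def entropy(dist):
    """Shannon entropy in bits of a mapping outcome -> probability."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def conditional_entropy(joint, observe):
    """H(source | observe(source)) in bits.

    Because the observation is a function of the source data, the chain
    rule gives H(X | f(X)) = H(X) - H(f(X)).
    """
    observed = Counter()
    for source, p in joint.items():
        observed[observe(source)] += p
    return entropy(joint) - entropy(observed)


# Toy source data (an assumption): a, b, c independent, uniform on 0..3.
joint = {abc: 1 / 64 for abc in product(range(4), repeat=3)}

# Entropy before anything is compromised.
h_source = entropy(joint)

# Case 1: only the unit's output a + b is linked to the individual.
h_given_output = conditional_entropy(joint, lambda s: s[0] + s[1])

# Case 2: the unit itself is compromised, so its inputs a and b are known.
h_given_unit = conditional_entropy(joint, lambda s: (s[0], s[1]))

print(f"H(source)                 = {h_source:.3f} bits")
print(f"H(source | output)        = {h_given_output:.3f} bits")
print(f"H(source | unit + inputs) = {h_given_unit:.3f} bits")
```

The drop from the first value to the second is the privacy lost when the output is tied to an individual; the further drop to the third is the additional loss when the unit and its inputs are exposed.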
About Nick Maxemchuk
Nicholas Maxemchuk, a networking pioneer, holds a permanent double appointment as Professor at the world-leading Columbia University of New York City (New York, USA) and Chief Researcher at IMDEA Networks.
He holds an M.Sc. in Electrical Engineering and a Ph.D. in Systems Engineering, both from the University of Pennsylvania (Philadelphia, USA). Before joining Columbia University and IMDEA Networks, Nick Maxemchuk was a Technical Leader at AT&T Research Laboratories (1996 – 2001) and, prior to that, Head of the Distributed Systems Research Department at AT&T Bell Laboratories (1976 – 1996). From 1968 to 1976 he was a member of the technical staff at the RCA David Sarnoff Research Center in Princeton, New Jersey.
Many of his far-sighted contributions to computer-communications networking have been years ahead of their time and have led to the development of groundbreaking new systems. His invention of Dispersity Routing in the 1970s, for example, has recently been applied to ad hoc networks. In 2006, his achievements in the field were recognized by the IEEE, the world’s leading professional association for the advancement of technology, when he was awarded the prestigious IEEE Koji Kobayashi Computers and Communications Award.
Among his other awards, some of the most noteworthy are the RCA Laboratories Outstanding Achievement Award in 1970, the Bell Laboratories Distinguished Technical Staff Award in 1984, the IEEE’s Leonard G. Abraham Prize Paper Award in 1985 and 1987, and the William R. Bennett Prize Paper Award in 1997. He was also made a Fellow of the IEEE in 1989, and received the 1996 R&D 100 Award for his work on document marking.
As well as holding 30 patents and publishing three books, Nicholas Maxemchuk has co-authored over 100 publications. His strong reputation as an eminent scientist has earned him many editorial and advisory positions with organizations including the IEEE, ACM, NSF Expert Group and the United Nations. He has published three award-winning papers and had two of his publications voted into the Communications Society 50th Anniversary Issue. He is a member of the Board of Governors of the Armstrong Foundation and also works as a Consultant on Data Networks in Transportation Networks for The National Academies/Transportation Research Board.
This event will be conducted in English.
Image source: Flickr | FutUndBeidl