What you publish today, what you give to others (who may store it in a database and then lose it: just look at all the hard disks, laptops and USB keys recently lost by the Government and banks), what is obtained about you (through surveillance or sousveillance), and what you reveal while pursuing some specific goal, such as making a search query (see Gregory Conti's Could Googling take down a president?) or posting a geo-tagged picture (see Jan Chipchase's Great to see you. Just not around here), may haunt you when you least expect it. It can be reused in a different context: you need to share your health data with your GP and you may be OK with sharing it on the NHS Spine (if not, see the Big Opt Out how-to), but you're unlikely to want it widely available to insurers. It can also resurface years later in a completely different context, such as a potential employer or spouse checking up on you.
Living as a hermit, cut off from society, is not an option, at least not for most of us, so we should look for other ways to tackle this issue, which will grow in severity. You can't avoid surveillance, sousveillance and unanticipated use of information leakages, but you do have the choice of making it worse by blindly adopting the wisdom of the techno-utopian crowd and actively sharing your life with the world: twittering your minute actions (surely the only thing you should ever twitter is 'I'm posting a twit'?), uploading your geo-located pictures in near real time, streaming your latest ramblings live, dribbling a stream of (un)consciousness on your blog (and reposting your twits on it), etc.
Forgetting is difficult. We all have a tendency to retain things 'just in case', and even if you do decide to delete some data, how can you be sure you've effectively destroyed all traces of it once it has been copied, cached, backed up, etc.? For example, this is a question with no answer for the few people who manage to get off the National DNA Database. The issue was highlighted in a recent CTO storage roundtable feature (published in the August 2008 issue of Communications of the ACM) exploring near-term challenges and opportunities facing the commercial computing community:
MACHE CREEGER: Now that we all agree that there should be a way to make information have some sort of time-to-live or be able to disappear at some future direction, what recommendations can we make?
MARGO SELTZER: There's a fundamental conflict here. We know how to do real deletion using encryption, but for every benefit there's a cost. As an industry, people have already demonstrated that the cost for security is too high. Why are our systems insecure? No one is willing to pay the cost in either usability or performance to have true security. In terms of deletion, there's a similar cost-benefit relationship. There is a way to provide the benefit, but the cost in terms of risk of losing data forever is so high that there's a tension. This fundamental tension is never going to be fully resolved unless we come up with a different technology.
ERIC BREWER: If what you want is time to change your mind, we could just wait awhile to throw away the key.
MARGO SELTZER: The best approach I've heard is that you throw away bits of the key over time. Throwing away one bit of the key allows recovery with a little bit of effort. Throw away the second bit and it becomes harder, and so on.
ERIC BREWER: But ultimately you're either going to be able to make it go away or you're not. You have to be willing to live with what it means to delete. Experience always tells us that there's regret when you delete something you would rather keep.
(An unexpressed assumption in the dialogue above is that the cryptographic algorithms used will never be broken.)
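To make the 'throw away bits of the key over time' idea concrete, here is a minimal Python sketch. It is not the scheme the panellists had in mind, only an illustration under stated assumptions: the stream cipher is a toy built from SHA-256 (do not use it for real data), the key check value is an HMAC kept alongside the ciphertext, and all function names (keystream_xor, key_check, forget_bits, recover_key) are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Optional

KEY_BYTES = 16  # 128-bit key, an arbitrary choice for this sketch


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (illustration only): XOR data with SHA-256(key || counter).

    Applying it twice with the same key returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def key_check(key: bytes) -> bytes:
    """Value stored next to the ciphertext so a guessed key can be recognised."""
    return hmac.new(key, b"key-check", hashlib.sha256).digest()


def forget_bits(key: bytes, n: int) -> int:
    """Zero the n lowest bits of the key: those bits are 'thrown away'."""
    return (int.from_bytes(key, "big") >> n) << n


def recover_key(partial: int, n: int, check: bytes) -> Optional[bytes]:
    """Brute-force the n forgotten bits; needs up to 2**n attempts."""
    for guess in range(1 << n):
        candidate = (partial | guess).to_bytes(KEY_BYTES, "big")
        if hmac.compare_digest(key_check(candidate), check):
            return candidate
    return None


message = b"something I may regret keeping"
key = secrets.token_bytes(KEY_BYTES)
ciphertext = keystream_xor(key, message)
check = key_check(key)

# Keep only the partial key; the full key is then destroyed.
forgotten = 12
partial = forget_bits(key, forgotten)
del key

recovered = recover_key(partial, forgotten, check)
print(keystream_xor(recovered, ciphertext) == message)
```

Each additional forgotten bit doubles the worst-case search, so recovery goes from trivial to impractical as more of the key is discarded. Once every bit is gone, the data is only as safe as the cipher itself, which is the unexpressed assumption noted above.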
One approach is to adopt what has worked sufficiently well for other issues. This is the conclusion reached by Daniel J. Weitzner, Harold Abelson, Tim Berners-Lee, Joan Feigenbaum, James Hendler, and Gerald Jay Sussman in Information Accountability (published as an MIT Computer Science and Artificial Intelligence Laboratory Technical Report in June 2007):
[I]nformation accountability through policy awareness, while a departure from traditional computer and network systems policy techniques, is actually far more consistent with the way that legal rules traditionally work in democratic societies. Computer systems depart from the norm of social systems in that they seek to enforce rules up front by precluding any possibility of violation, generally through the application of strong cryptographic techniques. In contrast, the vast majority of rules we live by have high levels of actual compliance without actually forcing people to comply. Rather we follow rules because we are aware of what they are and because we know there will be consequences, after the fact, if we violate them, and we trust in our democratic systems to keep rules and enforcement reasonable and fair. We believe that technology will better support freedom by relying on these social compacts rather than by seeking to supplant them.
Data protection legislation, which exists only in some countries, is one such example. It requires organisations that collect personal information to state the purposes for which it will be used and not to keep the data longer than necessary. Unfortunately, information commissioners often do not have enough resources or enforcement power to make the consequences of a breach serious.
This month is the 25th birthday of the GNU project that led Richard Stallman to write the GPL and LGPL free software licenses. This has been a very successful use of legal compacts to ensure a set of freedoms for software users. We need to figure out and negotiate what set of societal rules we must follow to ensure information accountability.