Reconstructing a Virtual Life

In one project in our laboratory, we use software originally designed for the Electronic Beowulf project to analyze "cookies" harvested from individual web devices, with the aim of reconstructing the personal identity and life history of the device users.

The project was originally motivated by a DARPA contest launched at the end of October 2011. The goal of the contest was to develop programs capable of piecing together documents that had been ripped into bits.

DoD's plan for the technology is to have computers, which never tire, do the kind of work done by the Iranian students who reconstructed documents shredded at the US embassy in Tehran after the fall of the Shah, or by the West Germans who reconstructed documents at the East German Stasi headquarters after the fall of the Berlin Wall.

In the contest, teams of programmers and engineers were provided with scanned images of the shredded documents and had to "stitch" the images back together virtually like a picture puzzle. The figure below depicts one such reconstruction problem:

To prove they had accurately completed the reconstruction, contest participants were required to provide the answer to puzzles embedded in the content of the reconstructed document. The winning team, named "All Your Shreds Are Belong To U.S.", completed the job on December 2, 2011, and earned a $50,000 prize as a result.

Turning Cookies into Sculpture

A cookie (also known as an HTTP cookie, web cookie, or browser cookie) is usually a small chunk of data sent from a website and stored in the user's web browser while the user is browsing that website. In ordinary use, when the user returns to the same website, the browser sends the stored data back to the site, informing it of the user's previous activity. Cookies were designed to be a reliable mechanism for websites to remember state or the user's past activity, so that passwords and default selections would not all have to be re-entered on every visit. Cookies can record information such as which buttons were clicked, whether the user logged in, or which pages the user visited, even months or years earlier.
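The round trip described above can be illustrated with a minimal sketch that uses Python's standard `http.cookies` module. The header string, the cookie name `session_id`, and the helper names `parse_set_cookie` and `build_cookie_header` are hypothetical and chosen only for illustration; they are not taken from the project's software.

```python
from http.cookies import SimpleCookie


def parse_set_cookie(header_value: str) -> dict:
    """Parse one Set-Cookie header value into cookie values and attributes."""
    jar = SimpleCookie()
    jar.load(header_value)
    parsed = {}
    for name, morsel in jar.items():
        # Keep only the attributes that were actually present in the header.
        attrs = {key: val for key, val in morsel.items() if val}
        parsed[name] = {"value": morsel.value, "attributes": attrs}
    return parsed


def build_cookie_header(stored: dict) -> str:
    """Build the Cookie request header a browser would send on a later visit."""
    return "; ".join(f"{name}={info['value']}" for name, info in stored.items())


# Hypothetical header, as a site might send it on a first visit.
example = "session_id=abc123; Path=/; Max-Age=31536000; Secure; HttpOnly"
stored = parse_set_cookie(example)   # what the browser keeps
reply = build_cookie_header(stored)  # what the browser sends back later
```

Only the name and value travel back to the site; attributes such as `Max-Age` stay in the browser and control how long the cookie is kept, which is why a record can persist for months or years.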

Tracking cookies, and third-party tracking cookies in particular, are designed to assemble long-term descriptions of individuals' web histories. Other kinds of cookies carry out essential functions on the modern Web. For example, authentication cookies are the most common method web servers use to know whether a user is logged in and, if so, under which account. Without such a mechanism, a site would not know whether to send a page containing sensitive information, such as new email, or to require the user to authenticate first by logging in. The security of an authentication cookie generally depends on the security of the issuing website and the user's web browser, and on whether the cookie data are encrypted. Security vulnerabilities may allow a cookie's data to be read by a hacker, used to gain access to user data, or used to gain access (with the user's credentials) to the website to which the cookie belongs.

The information content of a cookie can be mapped into a multidimensional graph. Transforms such as the Karhunen–Loève, Fourier and Mellin transforms manipulate the graphs of many cookies translationally, rotationally and topologically in order to match them and to stitch together the histories held in web devices. Because cookies are usually on by default (they make the web easier to navigate, store passwords, and so on), active participation of the device users is not required.
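To make the matching idea more concrete, here is a minimal sketch of a discrete Karhunen–Loève transform (equivalent to principal component analysis) applied to cookie-derived feature vectors, followed by a nearest-neighbour match. It assumes NumPy is available. The feature columns, the numbers, and the function names are all hypothetical, and the sketch is not a description of the laboratory's actual pipeline.

```python
import numpy as np


def klt_basis(features: np.ndarray, n_components: int):
    """Return the mean and the top principal directions of the feature rows."""
    mean = features.mean(axis=0)
    cov = np.cov(features - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order]


def project(features: np.ndarray, mean: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Express feature rows in the reduced Karhunen-Loeve coordinates."""
    return (features - mean) @ basis


def nearest_match(query: np.ndarray, candidates: np.ndarray) -> int:
    """Index of the candidate row closest to the query (Euclidean distance)."""
    return int(np.argmin(np.linalg.norm(candidates - query, axis=1)))


# Hypothetical profiles built from cookies on one device. Assumed columns:
# visits per week, minutes per session, fraction of visits made at night.
known = np.array([
    [1.0, 0.0, 0.1],
    [10.0, 0.0, 0.2],
    [1.0, 10.0, 0.1],
    [10.0, 10.0, 0.2],
])
# Hypothetical profile observed on a second device.
unknown = np.array([[9.5, 9.0, 0.2]])

mean, basis = klt_basis(known, n_components=2)
match = nearest_match(project(unknown, mean, basis)[0], project(known, mean, basis))
```

Projecting onto the leading components discards directions in which the profiles barely differ, so the comparison is made in the coordinates that carry most of the variation.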

Using this approach, it is possible to piece together where users have been on the Internet and what they have read, done and said. As more cookie data are collected, it becomes possible to reconstruct who their friends are, where they live and vacation, what they buy and what they think about buying, how they voted, and whether they prefer blondes or brunettes. The cutting edge of the research is predicting where the users will go next and what they will do.
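As a simple illustration of what predicting a user's next step could look like, the sketch below builds a first-order Markov model from reconstructed visit sequences and returns the most frequent follow-up page. The Markov model is an assumption introduced here for illustration, not a method named above, and the site labels and function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_transitions(histories):
    """Count how often each page is followed directly by each other page."""
    transitions = defaultdict(Counter)
    for history in histories:
        for current, following in zip(history, history[1:]):
            transitions[current][following] += 1
    return transitions


def predict_next(transitions, current):
    """Return the most frequent successor of `current`, or None if unseen."""
    if current not in transitions:
        return None
    return transitions[current].most_common(1)[0][0]


# Hypothetical visit sequences reconstructed from cookie data.
histories = [
    ["news", "mail", "shop"],
    ["news", "mail", "bank"],
    ["news", "mail", "shop"],
]
model = build_transitions(histories)
guess = predict_next(model, "mail")
```

A model of this kind only looks one step back; longer histories and richer features would be needed for anything beyond a rough guess.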