Presenting the code in the Proceedings of the National Academy of Sciences (PNAS), the team say that the tools could facilitate counterterrorism efforts and infectious disease tracking while preserving the privacy of those who should not be looked at.
The boffins said that balancing the need for useful or essential gathering and analysis of data about citizens against the privacy rights of those citizens was a little tricky.
“The most striking and controversial recent example is the revelation that US intelligence agencies systemically engage in ‘bulk collection’ of civilian ‘metadata’ detailing telephonic and other types of communication and activities, with the alleged purpose of monitoring and thwarting terrorist activity.”
The Penn researchers said that there are similar problems around medical data and targeted advertising. They said that in every case, the friction is between individual privacy and some larger purpose, whether it’s corporate profits, public health, or domestic security.
They said that there is a protected subpopulation that enjoys (either by law, policy, or choice) certain privacy guarantees.
“Protected individuals might be nonterrorists, or uninfected citizens (and perhaps informants and health care professionals). They are to be contrasted with the ‘unprotected’ or targeted subpopulation, which does not share those privacy assurances.”
The team claims that its algorithms can output a list of confirmed targeted individuals discovered in the network, for whom any subsequent action (e.g., publication in a most-wanted list, further surveillance, or arrest in the case of terrorism; medical treatment or quarantine in the case of epidemics) will not compromise the privacy of the protected.
The algorithms are based on a few basic ideas. The first is that every member of a network has a sequence of bits indicating their membership in a targeted group. The second is that the algorithms have a budget: they can reveal only so many bits and no more. The algorithms then optimise the search so that as many bits of interest as possible are revealed within that budget.
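To make the budget idea concrete, here is a minimal Python sketch. It is not the Penn team's code and makes no attempt at a formal privacy guarantee: it only illustrates spending a fixed number of "reveals" on the people most likely to be targets. The function name `budgeted_search`, the toy network, the single membership bit per person, and the rule of examining the person with the most confirmed targeted neighbours first are all assumptions made for illustration.

```python
def budgeted_search(graph, is_targeted, seed, budget):
    """Reveal at most `budget` membership bits, starting from a known target.

    Hypothetical greedy rule: always examine the unexamined person who has
    the most confirmed targeted neighbours (ties broken alphabetically).
    """
    confirmed = [seed]   # targets found so far; the seed is assumed known
    examined = {seed}    # everyone whose bit has been revealed
    spent = 0            # how much of the budget has been used

    while spent < budget:
        # Score each unexamined neighbour of a confirmed target.
        scores = {}
        for target in confirmed:
            for neighbour in graph.get(target, ()):
                if neighbour not in examined:
                    scores[neighbour] = scores.get(neighbour, 0) + 1
        if not scores:
            break  # nobody left to examine next to a confirmed target

        candidate = min(scores, key=lambda n: (-scores[n], n))
        examined.add(candidate)
        spent += 1  # revealing one bit costs one unit of budget
        if is_targeted[candidate]:
            confirmed.append(candidate)

    return confirmed, spent


# Toy network: adjacency lists plus one hidden membership bit per person.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "e"],
    "d": ["b"],
    "e": ["c"],
}
is_targeted = {"a": True, "b": True, "c": False, "d": True, "e": False}

found, spent = budgeted_search(graph, is_targeted, seed="a", budget=3)
print(found, spent)  # ['a', 'b', 'd'] 3
```

In this toy run the search spends its three reveals on "b", "c" and "d"; the bit belonging to "e" is never looked at, which is the kind of restraint the budget is meant to enforce.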
Using real social networks with stochastically (randomly) generated, artificial target groups, the Penn team found that they could indeed search a network for targeted members while not revealing information about individuals in privacy-protected populations.
“Our work is of course not a complete solution to the practical problem, which can differ from our simple model in many ways. It is just one interesting modeling question for future work.”