Facilitating Access to Non-Public Data: A World in Which All Data is “Open”
At the Technologists for Public Good April 17 Demo Day, Andrew Trask, Executive Director of OpenMined, presented a vision where researchers can safely access non-public data as quickly as using a public website. This is a recap of that session.
This is made possible by combining the core strengths of various Privacy Enhancing Technologies (PETs). Individual PETs (differential privacy, secure multiparty computation, homomorphic encryption, and others) each protect a different part of the information pipeline. Used on its own, a single PET cannot protect the entire end-to-end flow of a piece of information. Used together, PETs can create a highly customizable information pipeline based on governance preferences, privacy restrictions, and other relevant constraints.
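To make one of these PETs concrete, here is a minimal sketch of differential privacy applied to a simple counting query, using the Laplace mechanism. It was not part of the presentation and does not use OpenMined's tooling. The dataset, the function names (`laplace_noise`, `private_count`), and the choice of epsilon are all assumptions made up for illustration.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential draws with mean `scale`
    follows a Laplace distribution centered at zero.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Answer 'how many records match?' with epsilon-differential privacy."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    # Adding or removing one record changes a count by at most 1,
    # so the sensitivity of this query is 1.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical non-public dataset: ages of seven individuals.
ages = [34, 51, 29, 62, 47, 38, 55]

rng = random.Random(0)  # seeded only so the illustration is repeatable
noisy = private_count(ages, lambda age: age >= 50, epsilon=1.0, rng=rng)
print(round(noisy, 2))  # a noisy answer near the true count of 3
```

The researcher sees only the noisy answer, never the underlying records. This protects the output of a query, but not the data in transit or during computation, which is where the other PETs come in. A real deployment would rely on a vetted differential privacy library rather than this floating-point sketch.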
In practice, when PETs work in concert, they can create secure information pipelines for external researchers to perform remote data science. OpenMined is developing technical infrastructure to support remote data science, which lets data scientists work securely with data that they do not, and perhaps cannot, hold a copy of. Whether you're a seasoned data scientist, a privacy expert, or just starting out, you can delve deeper into PETs and remote data science through OpenMined's free Private AI course series.
PETs offer governments a new tool to bolster their open data policies, enforce their open data regulations and legislation, and continue to increase the quality and quantity of data openly available to the research community. This is not just a technological advancement but a societal shift that governments should lead. Public interest technologists tasked with implementing open data policies, regulations, and legislation can help maximize the benefits of PETs by pushing for them to create public goods rather than walled gardens. Trask suggests public interest technologists should "go to workshops and meetups, write blog posts, and just help set the tone for this public infrastructure so that it actually becomes public infrastructure and doesn't get eaten up by the same network effects and scaling that eats most of the network effect technologies in the world. Because if the protocols for these things are open source and linked with public institutions serving the public interest, the future is radically different in terms of who gets to decide what nonpublic information is used for."
After the presentation, Shira Honig, Demo Day co-host and TPG Community Committee Member, facilitated a short discussion on how PETs relate to the treatment of data within government that may contain bias or exclusions (starting at 44:00 in a recording of the session). Trask recommends that public interest technologists start early with internal education on PETs. If internal interest exists, teams can progress to running a PET pilot on a single, low-risk dataset. OpenMined would be happy to help any team interested in conducting a PET pilot.
To learn more or ask questions, contact Andrew on the OpenMined Slack community.
Article by Katie Johnson, TPG Community Leads Committee, 2024
Special thanks to Lacey Strahm, Policy Lead at OpenMined, for support on the presentation and this article.
More Demo Days from TPG
Stay tuned for the next demo day announcements in the TPG newsletter and Slack channel. You can access these resources by becoming a member.
Are you interested in leading a Demo Day session?
If you’d like to showcase your work or nominate someone to present at a Demo Day, please email the TPG Community Leads Committee.