An open letter and open questions about the COVID19 datastore

Along with a group of researchers and civil society organisations, I’ve co-signed an open letter with some open questions on the COVID19 datastore.

The letter is copied below. The original version can be found here. Associated coverage and posts can be found at The New Statesman, The Register, Computer Weekly, IT Pro, Holyrood and the Open Knowledge Foundation. You can also find a related petition from OpenDemocracy here.

May 18, 2020, London

Dear Matt Hancock,

We are civil society organisations, privacy advocates and academic researchers writing to express concerns about the NHS’s plans to build a COVID-19 datastore. We share the common goal of preserving public confidence in systems that can help make us all safer. Therefore, before the NHS continues its plans, we urge you to provide the public with more information and take appropriate measures to reduce the risks of data sharing and keep the aggregated data under democratic control.

In March, the NHS announced a new plan to build a datastore that aggregates COVID-19 health data. Microsoft, Google, Palantir, Faculty and Amazon will assist in the development of the datastore and the processing of the data.

In the announcement, the NHS promises to provide transparency around the plans, to protect sensitive data and to not give its private partners control over the data. However, a recent report by the Guardian suggests these promises are not being upheld. Furthermore, a legal opinion by Ryder, Craven, Sarathy and Naik concludes that the plan ‘does not comply, thus far, with data protection principles’. Transparency has been lacking as well: attempts by journalism platform openDemocracy and tech-justice non-profit Foxglove to obtain more information through FOIA requests and legal letters have produced no substantive response. Questions sent to Palantir seeking clarification about its work on the project produced some assurances but failed to clarify the extent of the project and what protections exist.

Emergencies require rapid responses, but these responses should also be appropriate, lawful and just. It’s unlikely that the NHS’s current plan to build a large-scale Covid-19 datastore meets those principles. We understand the need for better health information, but maintain that the public should be consulted throughout the development of the datastore and be able to obtain adequate information about the data sharing agreements in place.

We urge the NHS to provide answers to all of the questions below and to not proceed with the development of the datastore until the public has had a chance to have their say.

Is there a real need for this solution?
We need to understand what problems the NHS is hoping to solve and whether this is the best way to solve them.
– What problems does the NHS aim to solve by building the datastore?
– What alternatives have been explored? Alternative data governance models, like data trusts, have previously been explored for similar public-private data sharing agreements and could be useful here as well.

What are we not doing while we are doing this?
With limited resources, we need to prioritise. While public money is directed towards the building of datastores, other needs may be ignored. Understanding the opportunity costs of new policies is vital.
– How is the datastore financed?
– Has the NHS considered the trade-offs? What trade-offs specifically?
– What do these considerations look like?

What agreements are in place with private companies?
The partners the NHS has chosen to work with on the datastore are not without their problems. The public has a right to know what they have been promised (now and in the future), both financially and in terms of data access.
– What agreements are in place with each private partner?
– What is the value of the contract with each private partner?
– Will the private companies be able to use the product trained under the agreement with the NHS to improve their own future products? If yes, what applications will the product(s) trained on NHS data have, and for what purposes will they be used?
– Will any additional agreements, or amendments and changes to existing agreements be shared with the public?

How does this proposal shift the balance of power from the public to the private sector?
The public has a right to know whether outsourcing large parts of the datastore’s development shifts the balance of power away from the public sector to the private sector and in what way.
– Will the NHS be able to easily switch development partners if needed?
– Will the datastore make use of software controlled by one of its private partners? What software?
– What intellectual property may be created throughout the development of the datastore? Who will hold these rights?

Who has control over the data in these public-private partnerships?
Closely related to the question of power is the question of who is in control of the data aggregated in the datastore.
– What specific data will each party have access to?
– What are the terms governing these parties’ usage of the data?
– To what extent will data access by these parties be audited?
– Has a Data Protection Impact Assessment been made for each of these partnerships? When will these assessments be made public? If they are not made public, what is the reason for not disclosing them?

Who is most at risk and how do we protect them?
When estimating the risk of data sharing efforts, it’s not enough to rely on individual consent alone, nor can we rely on de-identification as a sufficient strategy for anonymizing data. We need to take account of the negative externalities of data sharing.
– What data sources will be pooled together in the datastore and who will have access?
– Who does this put at risk and in what way?
– What measures are in place to protect the most vulnerable?
– Have these measures been reviewed?
– Who is actually doing the risk assessment? And when?

What is the exit strategy?
In the announcement the NHS promises to destroy (most of) the data once the pandemic is over. However, it provides no criteria for determining when that will be.
– For what duration is the data collected and what happens when that period ends?
– If the exit strategy depends on the pandemic ending, then what criteria are used to determine when the pandemic is indeed over? (i.e. when is the promised destruction of the datastore triggered?)

Are the measures transparent? Who is accountable?
So far information about the datastore has been scarce. We need to understand what information will be made available and who the public can hold accountable.
– What public facing documentation do you intend to provide describing this datastore and the various data sources?
– Will further use of the datastore by the Department of Health Care Services, or its partners, outside the scope as currently defined, be communicated with the public?
– What party do you intend to use for privacy compliance and security auditing of the system?

While we understand that resources are limited, these questions are fundamental to maintaining public trust in the NHS and to helping keep high-risk personal data about UK citizens safe at a time when we need that the most. The lack of transparency with which these agreements are made does not help build this trust.

We’d appreciate a response as soon as you’re able.

Signed (in alphabetical order),

Big Brother Watch
Echo Chamber Club
Fair Vote UK
Open Knowledge Foundation
Open Rights Group
Privacy International
WebRoots Democracy

Dr. Alon Lischinsky, School of History, Philosophy & Culture, Oxford Brookes University
Dr. Andres Guadamuz, School of Law, Politics and Sociology, University of Sussex
Dr. Angela Daly, Co-Director of the Strathclyde Centre for Internet Law & Policy
Anouk Ruhaak, Mozilla Fellow embedded with AlgorithmWatch
Dr. Arne Hintz, Data Justice Lab, Cardiff University
Brett Scott, author
Prof. Chris Marsden, University of Sussex
Dr Cory Doctorow, Visiting Professor of Computer Science, Open University; Co-founder of the Open Rights Group; Author of Little Brother.
Dr. Dana Naomy Mills, lecturer politics and international relations, Oxford Brookes
Dr. Elinor Carmi, Postdoc Research Associate — Digital Media & Society
Frederike Kaltheuner, Tech Policy Fellow, Mozilla
Assoc Prof Guido Noto La Diega, PhD — University of Stirling
Dr. Harry Dyer, Lecturer in Education at the University of East Anglia
Javier Ruiz, independent policy consultant
Dr. Jonathan Gray, Department of Digital Humanities, King’s College London
Jonnie Penn, University of Cambridge
Dr. Lina Dencik, Cardiff University/Data Justice Lab
Dr. Mark Coté, Department of Digital Humanities, King’s College London
Prof. Mark Graham, Oxford Internet Institute, University of Oxford
Dr. Michael Veale, University College London
Dr Michele Paule, Senior Lecturer, Oxford Brookes University
Councillor Mike Rowley (Labour)
Prof Niall Winters, University of Oxford
Dr. Nick Srnicek, Department of Digital Humanities, King’s College London
Prof. Noortje Marres, Centre for Interdisciplinary Methodologies, University of Warwick
Rachel Coldicutt, independent technology strategist
Srujana Katta, researcher at Oxford Internet Institute
Dr. Zarinah Agnew, Social Observatory
