New Paper: “OpenTrials: Towards a Collaborative Open Database of All Available Information on All Clinical Trials”

A few years ago, I had some discussions with the physician, academic and science writer Ben Goldacre, which led to a collaboration on a new project called OpenTrials.

Clinical trials are conducted to generate information about the safety and effectiveness of a given medical treatment. This information is used to make decisions which can transform people’s lives. However, research suggests that negative results are often withheld and that outcomes are often reported only selectively. OpenTrials aspires to address this by providing a collaborative database of public information about clinical trials, collated from a wide variety of sources, for and by patients, doctors, researchers, civil society groups, public institutions and others.

Ben and I have co-authored a paper outlining what we hope to do with the OpenTrials project. It has just come out in Trials, an open access journal published by BioMed Central. The abstract is copied below.

OpenTrials is a collaborative and open database for all available structured data and documents on all clinical trials, threaded together by individual trial. With a versatile and expandable data schema, it is initially designed to host and match the following documents and data for each trial: registry entries; links, abstracts, or texts of academic journal papers; portions of regulatory documents describing individual trials; structured data on methods and results extracted by systematic reviewers or other researchers; clinical study reports; and additional documents such as blank consent forms, blank case report forms, and protocols. The intention is to create an open, freely re-usable index of all such information and to increase discoverability, facilitate research, identify inconsistent data, enable audits on the availability and completeness of this information, support advocacy for better data and drive up standards around open data in evidence-based medicine. The project has phase I funding. This will allow us to create a practical data schema and populate the database initially through web-scraping, basic record linkage techniques, crowd-sourced curation around selected drug areas, and import of existing sources of structured data and documents. It will also allow us to create user-friendly web interfaces onto the data and conduct user engagement workshops to optimise the database and interface designs. Where other projects have set out to manually and perfectly curate a narrow range of information on a smaller number of trials, we aim to use a broader range of techniques and attempt to match a very large quantity of information on all trials. We are currently seeking feedback and additional sources of structured data.
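
To give a rough feel for what "threaded together by individual trial" could mean in practice, here is a minimal Python sketch of one very basic record linkage approach: grouping documents by the registry identifier they mention. This is an illustration only, not the OpenTrials schema or matching method. The field names (`kind`, `source`, `text`), the function `thread_by_trial` and the sample records are all hypothetical, and the sketch assumes ClinicalTrials.gov-style identifiers ("NCT" followed by eight digits) appear in the document text.

```python
import re
from collections import defaultdict

# Assumption: trials are identified by ClinicalTrials.gov-style IDs
# ("NCT" followed by eight digits) appearing somewhere in the text.
NCT_ID = re.compile(r"\bNCT\d{8}\b")


def thread_by_trial(documents):
    """Group documents under every trial identifier mentioned in their text."""
    threads = defaultdict(list)
    for doc in documents:
        # De-duplicate IDs so a document is attached to each trial only once.
        for trial_id in sorted(set(NCT_ID.findall(doc["text"]))):
            threads[trial_id].append((doc["kind"], doc["source"]))
    return dict(threads)


# Hypothetical sample records; the identifiers are made up for illustration.
documents = [
    {"kind": "registry entry", "source": "registry",
     "text": "NCT00000001: a hypothetical trial"},
    {"kind": "journal abstract", "source": "journal",
     "text": "Results of the trial (NCT00000001) are reported here."},
    {"kind": "protocol", "source": "sponsor",
     "text": "Protocol for NCT00000002"},
]

threads = thread_by_trial(documents)
for trial_id, docs in sorted(threads.items()):
    print(trial_id, docs)
```

Exact identifier matching like this only catches documents that cite a registry ID. Anything that does not would need fuzzier matching or the kind of crowd-sourced curation the abstract mentions.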


