Description

Clinical Trial Registry History.

Retrieves historical versions of clinical trial registry entries from <https://ClinicalTrials.gov>. Package functionality and implementation for v 1.0.0 are documented in Carlisle (2022) <DOI:10.1371/journal.pone.0270909>.

cthist

This package provides functions for mass-downloading and interpreting historical clinical trial registry entry data.

How to install

To install the stable version of cthist through CRAN:

install.packages("cthist")
library(cthist)

If you want the most recent development version of cthist, you will need to install devtools first, and then install the package from GitHub:

install.packages("devtools")
library(devtools)
install_github("bgcarlisle/cthist")
library(cthist)

Functions provided by cthist

This package provides five functions for downloading and interpreting historical clinical trial data from ClinicalTrials.gov.

Download clinical trial version dates:

## Get all the dates and status updates when the registry entries for
## NCT02110043 and NCT03281616 changed

clinicaltrials_gov_dates(c("NCT02110043", "NCT03281616"))
## A tibble: 10 × 5
##    nctid       version_number total_versions version_date overall_status       
##    <chr>                <int>          <int> <chr>        <chr>                
##  1 NCT02110043              0              8 2014-04-08   RECRUITING           
##  2 NCT02110043              1              8 2014-09-22   RECRUITING           
##  3 NCT02110043              2              8 2014-10-13   RECRUITING           
##  4 NCT02110043              3              8 2016-03-15   RECRUITING           
##  5 NCT02110043              4              8 2016-12-20   RECRUITING           
##  6 NCT02110043              5              8 2017-07-04   RECRUITING           
##  7 NCT02110043              6              8 2017-07-26   ACTIVE_NOT_RECRUITING
##  8 NCT02110043              7              8 2021-05-20   COMPLETED            
##  9 NCT03281616              0              2 2017-09-11   COMPLETED            
## 10 NCT03281616              1              2 2017-09-18   COMPLETED            

## Get all the dates when NCT02110043 had a change in overall status

clinicaltrials_gov_dates("NCT02110043", status_change_only=TRUE)
## A tibble: 3 × 5
##   nctid       version_number total_versions version_date overall_status       
##   <chr>                <int>          <int> <chr>        <chr>                
## 1 NCT02110043              0              8 2014-04-08   RECRUITING           
## 2 NCT02110043              6              8 2017-07-26   ACTIVE_NOT_RECRUITING
## 3 NCT02110043              7              8 2021-05-20   COMPLETED            
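
The version_date column is returned as character, so it has to be converted before any date arithmetic. The following is a minimal sketch in base R, not part of cthist itself. It rebuilds the three status-change rows shown above by hand (so it runs without a network connection) and computes how many days each status lasted until the next change. The column name days_until_next is introduced here for illustration only.

```r
## Rebuild the status-change rows for NCT02110043 shown above by hand;
## in practice this data frame would come from clinicaltrials_gov_dates()
status_changes <- data.frame(
  nctid = "NCT02110043",
  version_number = c(0L, 6L, 7L),
  version_date = c("2014-04-08", "2017-07-26", "2021-05-20"),
  overall_status = c("RECRUITING", "ACTIVE_NOT_RECRUITING", "COMPLETED"),
  stringsAsFactors = FALSE
)

## version_date is character, so convert it before doing date arithmetic
status_changes$version_date <- as.Date(status_changes$version_date)

## Days from each status change to the next one (NA for the last row,
## because the final status has no following version)
status_changes$days_until_next <- c(
  as.numeric(diff(status_changes$version_date)),
  NA
)

status_changes[, c("overall_status", "version_date", "days_until_next")]
```

The same conversion applies to the full, unfiltered table if the interval between every pair of consecutive versions is of interest.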

Download clinical trial registry entry version data:

## Get version number 4 of NCT02110043 (version numbers start at 0)

version_data <- clinicaltrials_gov_version("NCT02110043", 4)

## Get the enrolment for that version
version_data$enrol
## [1] 22

## Get the enrolment type for that version
version_data$enroltype
## [1] "ESTIMATED"

Mass-download clinical trial registry entry versions:

## Download all data for all versions of NCT02110043 and store in
## variable `versions`

versions <- clinicaltrials_gov_download("NCT02110043")

Mass-download clinical trial registry entry versions for many trials and save to disk:

## Download all data for all versions of NCT02110043 and NCT03281616
## and save to versions.csv

clinicaltrials_gov_download(c("NCT02110043", "NCT03281616"), "versions.csv")

Extract publications indexed on downloaded trial versions

The function clinicaltrials_gov_download downloads a data frame of versions of a trial's history, with the references column containing a nested JSON-encoded data frame of the publications that were indexed by ClinicalTrials.gov.

The function extract_publications interprets a data frame of the type returned by clinicaltrials_gov_download and returns a new data frame that contains only publications of the type specified ("RESULT", "BACKGROUND", or "DERIVED").

This function will provide one row for every publication of the type specified that was indexed on ClinicalTrials.gov for every version of the trial registry record contained in the data frame provided.

## Download only the latest clinical trial registry entries for the
## specified NCT numbers and extract PMIDs for indexed RESULT
## publications

library(dplyr) # provides %>% and select()

clinicaltrials_gov_download(
  c("NCT05784103", "NCT05780281"), 
  latest=TRUE
) %>%
  extract_publications(type="RESULT") %>%
  select(nctid, pmid)

# A tibble: 2 × 2
  nctid       pmid    
  <chr>       <chr>   
1 NCT05784103 28183823
2 NCT05780281 34928698
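
To inspect a single cell of the references column by hand, the JSON can be decoded directly. The sketch below assumes the jsonlite package is installed and uses a hand-written JSON string in place of a real cell. The field names (pmid, type, citation) and the second, placeholder entry are assumptions made for illustration; check the actual structure in your own download before relying on them.

```r
library(jsonlite)

## Hypothetical content of one cell of the `references` column. The
## field names and the placeholder entry are assumptions, not taken
## from cthist; only the PMID 28183823 appears in the output above
references_json <- '[
  {"pmid": "28183823", "type": "RESULT", "citation": "(citation text)"},
  {"pmid": "(placeholder)", "type": "BACKGROUND", "citation": "(citation text)"}
]'

## fromJSON() turns a JSON array of objects into a data frame
refs <- fromJSON(references_json)

## Keep only the publications of one type
results <- refs[refs$type == "RESULT", ]
results$pmid
```

For whole data frames, extract_publications remains the more convenient route, since it handles every version of every trial in one call.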

Calculate overall status lengths

The function clinicaltrials_gov_download downloads a data frame of versions of a trial's history, with the overall_status column indicating the status of the trial on the date the entry was updated, which is specified in the version_date column.

The function overall_status_lengths interprets a data frame of the type returned by clinicaltrials_gov_download and returns a new data frame that lists, for each NCT number, every overall status that the trial passed through and the number of days it spent in that status. Optionally, the calculation can be restricted to a specified timeframe.

## Download the clinical trial registry entries for the specified NCT
## number(s) and calculate the number of days that each registry entry
## spends in a reported overall status within a prescribed time
## interval of interest (in this case, the years 2020-2022, inclusive)

library(dplyr) # provides %>%

clinicaltrials_gov_download(
    c("NCT04338971", "NCT03461211")
) %>%
    overall_status_lengths(
        start_date = "2020-01-01",
        end_date = "2022-12-31"
    )

# A tibble: 5 × 3
# Groups:   nctid [2]
  nctid       overall_status          days    
  <chr>       <chr>                   <drtn>  
1 NCT03461211 ACTIVE_NOT_RECRUITING   406 days
2 NCT03461211 COMPLETED               668 days
3 NCT03461211 ENROLLING_BY_INVITATION  21 days
4 NCT04338971 COMPLETED               488 days
5 NCT04338971 WITHHELD                511 days
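
The idea behind this calculation can be sketched in base R. This is an illustration under stated assumptions, not the implementation used by overall_status_lengths: each status is assumed to hold from its version date until the next version date, and the last status is assumed to hold until the end of the window. The input is the status history of NCT02110043, copied from the clinicaltrials_gov_dates output earlier in this document.

```r
## Status history of NCT02110043, copied from the
## clinicaltrials_gov_dates() output earlier in this document
history <- data.frame(
  version_date = as.Date(c("2014-04-08", "2017-07-26", "2021-05-20")),
  overall_status = c("RECRUITING", "ACTIVE_NOT_RECRUITING", "COMPLETED"),
  stringsAsFactors = FALSE
)

start_date <- as.Date("2020-01-01")
end_date <- as.Date("2022-12-31")

## Assumption: a status holds from its version date until the next
## version date, and the last status holds until end_date
period_start <- history$version_date
period_end <- c(history$version_date[-1], end_date)

## Clip every period to the window of interest
clipped_start <- pmax(period_start, start_date)
clipped_end <- pmin(period_end, end_date)

## Periods that fall entirely outside the window come out negative,
## so floor them at zero
days <- pmax(
  as.numeric(difftime(clipped_end, clipped_start, units = "days")),
  0
)

## Sum the days per status and drop statuses with no time in the window
totals <- tapply(days, history$overall_status, sum)
totals <- totals[totals > 0]
totals
```

Under these assumptions the days across all statuses add up to the length of the window whenever the trial was registered before start_date.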

What data is extracted?

Variable                            Data type
Version number (0, 1, 2, etc.)      Double
Version date (ISO-8601)             Date
Overall status                      Character
Start date                          Date
Start date precision                Character
Primary completion date             Date
Primary completion date precision   Date
Primary completion date type        Character
Enrolment                           Double
Enrolment type                      Character
Inclusion and exclusion criteria    Character (HTML)
Outcome measures                    JSON-encoded table
Overall contacts                    JSON-encoded table
Central contacts                    JSON-encoded table
Responsible party                   JSON-encoded table
Lead sponsor                        JSON-encoded table
Collaborators                       JSON-encoded table
Locations                           JSON-encoded table
"Why stopped?"                      Character
Results posted                      Logical
References                          JSON-encoded table
Organization study ID               Character
Secondary IDs                       JSON-encoded table

Note regarding ClinicalTrials.gov July 2023 website re-write

For cthist v >= 2.0.0, the download method has been updated to work with the new version of ClinicalTrials.gov. Because the updated website presents data differently from the old version that was previously scraped, some output differs from that of earlier package versions. For example, the overall status field is now in all caps (RECRUITING, COMPLETED, etc.).
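
If data downloaded with an older version of cthist need to be compared with newer downloads, the status labels may have to be brought to a common form first. The sketch below uses a hypothetical helper, normalise_status, which is not part of cthist. It assumes that the old labels were mixed-case phrases such as "Active, not recruiting". Some labels may have been renamed rather than merely re-cased, so check the result against your own data.

```r
## Hypothetical helper (not part of cthist): convert a mixed-case status
## label to the all-caps, underscore-separated style
normalise_status <- function(status) {
  ## Collapse every run of non-letter characters into one underscore,
  ## then convert to upper case
  toupper(gsub("[^A-Za-z]+", "_", trimws(status)))
}

normalise_status(c("Recruiting", "Active, not recruiting", "Completed"))
```

Labels that are already in the new style pass through this helper unchanged.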

DRKS.de functions deprecated

Update as of 2022-12-11

DRKS.de has recently been updated in a manner that makes scraping data more difficult, so the functions related to DRKS.de have been deprecated, at least temporarily, while I assess the changes.

Note on use

Please note that this package is provided under AGPL v 3, so you may use it for any purpose; however, if you modify it, you must provide access to your modified version, or you will be in violation of the terms of the license.

Citing cthist

@Article{bgcarlisle-cthist,
  title          = {Analysis of Clinical Trial Registry Entry Histories Using the Novel {{R}} Package cthist},
  author         = {Carlisle, Benjamin Gregory},
  date           = {2022-07-01},
  journaltitle   = {PLOS ONE},
  shortjournal   = {PLOS ONE},
  volume         = {17},
  number         = {7},
  pages          = {e0270909},
  publisher      = {{Public Library of Science}},
  issn           = {1932-6203},
  doi            = {10.1371/journal.pone.0270909},
  url            = {https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0270909}
}

Please open an issue in the project's issue tracker if you find a bug, if you need this package to download historical trial data that it does not currently capture, or if you would like to collaborate on a project that uses this tool.

If you used my package in your research and you found it useful, I would take it as a kindness if you cited it.

Best,

Benjamin Gregory Carlisle PhD.

Metadata

Version

2.1.11

License

Unknown

Platforms (75)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows