CCDI Participant Index (V1.2.0)

Welcome to the CCDI Participant Index (CPI) documentation! This site documents the CPI application programming interface (API), which is available to authorized users.

About

The CPI is designed and developed within the National Cancer Institute (NCI) Childhood Cancer Data Initiative (CCDI) program, and is intended to provide a centralized index of participant data across multiple institutions.

The goal of the CPI is to manage and share multiple cross-linked participant IDs that represent the same individual by connecting various participant IDs from different studies/research institutions (domains). These mappings will empower researchers to explore complex questions, gain deeper insights into diseases, develop innovative therapies, and enhance existing treatments.

Identifiers and Domains

Participant ID: Public identifier that appears in a research dataset accessible to researchers (e.g., Kids First ID or GENIE ID). This also includes global identifiers used more broadly (e.g., COG USI).

Domain: A group of unique participant IDs. Domains can be categorized as:

  • Institutional Identifier. Typically refers to a larger project/network namespace of unique participant identifiers. Current institutional identifiers include GENIE (AACR Project GENIE), GMKF (Gabriella Miller Kids First), PCDC (Pediatric Cancer Data Commons), StJude_CompBIOID (St Jude Comp Bio IDs), Treehouse (Treehouse Childhood Cancer Initiative), and USI (COG USI).
  • Dataset. Refers to a single dataset for a particular study where data files for the participants can be found. This is typically a dbGaP phs accession number – for example phs000720 (dbGaP accession: Genomic sequencing of Pediatric Rhabdomyosarcoma).
  • Study. Represents a programmatic reference to a research initiative, such as TARGET, or an institution’s internal synonym for a dataset domain, such as SD_AQ9KVN5P, which is equivalent to phs002276.
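
As a rough illustration of what "cross-linked participant IDs" means, the sketch below groups (domain, participant ID) pairs that refer to the same individual. It is not the CPI implementation: the participant ID values (`id_a`, `id_b`, and so on) are hypothetical placeholders, and the helper names `build_groups` and `mapped_pairs` are invented for this example. Only the domain names come from the list above.

```python
from collections import defaultdict

# Hypothetical mappings: each entry links two (domain_name, participant_id)
# pairs that refer to the same individual. The ID values are placeholders.
LINKS = [
    (("USI", "id_a"), ("GMKF", "id_b")),
    (("GMKF", "id_b"), ("phs000720", "id_c")),
    (("PCDC", "id_x"), ("USI", "id_y")),
]


def build_groups(links):
    """Group linked (domain, ID) pairs into sets, one set per individual."""
    neighbors = defaultdict(set)
    for left, right in links:
        neighbors[left].add(right)
        neighbors[right].add(left)

    groups, seen = [], set()
    for start in neighbors:
        if start in seen:
            continue
        # Collect every pair reachable from `start` (depth-first walk).
        stack, group = [start], set()
        while stack:
            pair = stack.pop()
            if pair in group:
                continue
            group.add(pair)
            stack.extend(neighbors[pair] - group)
        seen |= group
        groups.append(group)
    return groups


def mapped_pairs(groups, pair):
    """Return every other pair mapped to the same individual as `pair`."""
    for group in groups:
        if pair in group:
            return sorted(group - {pair})
    return []
```

In this sketch, looking up `("USI", "id_a")` returns the GMKF and phs000720 pairs, because the three pairs are transitively linked to one individual.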

The CPI is envisioned as an API-mediated service with a minimal web application that serves as an administrative user interface. The CPI will not hold PII or any participant data, only IDs. Access to the data associated with those IDs will remain under the control of the primary data source owners.

Access and Support

The CPI API will be made available to applications and services authorized by NCI. Integration with the CPI is currently under development for certain CCDI applications; other interested system owners may initiate a request for access by emailing NCIChildhoodCancerDataInitiative@mail.nih.gov.

Authentication and Authorization

We follow the OAuth 2.0 standard. Once your request to access the CPI API service is approved, you will receive a Client ID, credentials, and the token server endpoint. Use the Client ID and credentials to obtain an access token from the token server, then use that token to interact with the CPI service.
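
The token exchange might look like the following minimal sketch. Several details are assumptions rather than documented CPI behavior: that the token server accepts the OAuth 2.0 client credentials grant, that the credentials are sent with HTTP Basic authentication, and that the access token is presented as a Bearer token. The token URL in the usage note is a placeholder; use the endpoint you receive after approval.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_token_request(token_url, client_id, client_secret):
    """Build (but do not send) a client credentials token request.

    Assumption: the token server accepts HTTP Basic client authentication.
    """
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode()
    return urllib.request.Request(
        token_url,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def fetch_access_token(token_request):
    """Send the token request and return the `access_token` response field."""
    with urllib.request.urlopen(token_request) as response:
        return json.load(response)["access_token"]


def bearer_headers(access_token):
    """Headers for a JSON request to the CPI service (Bearer scheme assumed)."""
    return {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    }
```

A caller would pass the request from `build_token_request("https://token.example.invalid/oauth2/token", client_id, client_secret)` to `fetch_access_token`, then attach `bearer_headers(token)` to each CPI request.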

Latest Release

📌 The latest release is described in CPI V1.3 Release notes.pdf.

Version History

Get Relevant Domain IDs

Given a list of one or more domain ID/participant ID pairs, return each input pair together with all domain ID/participant ID pairs mapped to it.

Authorizations:
LambdaAuthorizer
Request Body schema: application/json
required
Array (<= 100000000 items)
domain_name: string (pattern: r'^[a-zA-Z0-9\s_.-]*$')
participant_id: string (pattern: r'^[a-zA-Z0-9\s_.-]*$')
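
A client may want to check its input against the documented pattern before sending it. The sketch below validates each pair and serializes the array; the function name `build_pairs_body` and the choice to raise `ValueError` are illustrative assumptions, not part of the CPI API.

```python
import json
import re

# Pattern taken from the request body schema for both fields.
ID_PATTERN = re.compile(r"^[a-zA-Z0-9\s_.-]*$")


def build_pairs_body(pairs):
    """Validate (domain_name, participant_id) tuples and return a JSON array."""
    items = []
    for domain_name, participant_id in pairs:
        fields = (("domain_name", domain_name), ("participant_id", participant_id))
        for field, value in fields:
            if not ID_PATTERN.match(value):
                raise ValueError(f"{field} {value!r} contains unsupported characters")
        items.append({"domain_name": domain_name, "participant_id": participant_id})
    return json.dumps(items)
```

For example, `build_pairs_body([("USI", "participant_id_1")])` yields a one-element JSON array, while a value containing `/` is rejected before any request is made.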

Responses

Request samples

Content type
application/json
Example

An example request that includes an array of a single object containing a single Participant ID and Domain ID pair.

[
  {
  }
]

Response samples

Content type
application/json
Example

An example response for a request that included only a single Participant ID and Domain ID pair in the request body.

{
  "supplementary_domains": [
  ],
  "participant_ids": [
  ]
}

Get Status for Participant

Given a list of one or more domain ID/participant ID pairs, return the status of each input pair.

Authorizations:
LambdaAuthorizer
Request Body schema: application/json
required
Array (<= 100000000 items)
domain_name: string (pattern: r'^[a-zA-Z0-9\s_.-]*$')
participant_id: string (pattern: r'^[a-zA-Z0-9\s_.-]*$')

Responses

Request samples

Content type
application/json
Example

An example request that includes an array of a single object containing a single Participant ID and Domain ID pair.

[
  {
  }
]

Response samples

Content type
application/json

Returns an array containing the status of each input domain ID/participant ID pair.

[
  {
  },
  {
  }
]

Get Relevant Domains

Given a list of one or more participant IDs (without domain IDs), return each input ID and all domain IDs mapped to it.

Authorizations:
LambdaAuthorizer
Request Body schema: application/json
required

Sample Description for Request Body

Array (<= 100 items)
string (pattern: r'^[a-zA-Z0-9\s_.-]*$')
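
Because this request body is limited to 100 items, a client with a longer list of participant IDs would need to split it across several requests. The following is a minimal sketch of that batching; the function name `batch_request_bodies` is hypothetical, and sending each body is left to the caller.

```python
import json

MAX_IDS_PER_REQUEST = 100  # array limit stated in the request body schema


def batch_request_bodies(participant_ids, batch_size=MAX_IDS_PER_REQUEST):
    """Yield JSON array strings, each holding at most `batch_size` IDs."""
    for start in range(0, len(participant_ids), batch_size):
        yield json.dumps(participant_ids[start:start + batch_size])
```

A list of 250 IDs would produce three request bodies holding 100, 100, and 50 IDs.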

Responses

Request samples

Content type
application/json
Example

An example request that includes an array in the request body containing a single Participant ID.

[
  "participant_id_1"
]

Response samples

Content type
application/json
Example

An example response to a request that contained a single Participant ID in the request body.

{
  "participant_id": "participant_id_1",
  "associated_domains": [
  ]
}

Get Participant Counts by Domain

Given a list of one or more domain IDs, return participant counts for each domain.

Authorizations:
LambdaAuthorizer

Responses

Response samples

Content type
application/json

Returns an array of statistics for each domain and the total number of unique individuals and mapped participant IDs.

{
  "counts_by_domain": [
  ],
  "mapped_participant_ids": 30,
  "unique_individuals": 22
}
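
A client could read the top-level totals from this response as sketched below. The sample text reuses the values shown above; `counts_by_domain` is left empty because its entries are collapsed in the sample, so their structure is not assumed here. The function name `summarize_counts` is hypothetical.

```python
import json

# Sample response text using the top-level values from the example above.
SAMPLE_RESPONSE = (
    '{"counts_by_domain": [], "mapped_participant_ids": 30, '
    '"unique_individuals": 22}'
)


def summarize_counts(response_text):
    """Extract the top-level totals from a counts response."""
    data = json.loads(response_text)
    return {
        "domains_reported": len(data["counts_by_domain"]),
        "mapped_participant_ids": data["mapped_participant_ids"],
        "unique_individuals": data["unique_individuals"],
    }
```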

Get All Available Domains

Get a list of all domain IDs and their properties

Authorizations:
LambdaAuthorizer

Responses

Response samples

Content type
application/json

Returns an array of details for each domain.

[
  {
  },
  {
  }
]