Internal Function to Get Mutations/CNA/Fusion By Study ID
Source: R/genomics_by_study.R
dot-get_data_by_study.Rd
Endpoints for retrieving mutation and CNA data are structurally similar.
This internal function allows you to pull data from either endpoint. When study_id or molecular_profile_id is NULL, it guesses a sensible default for the missing value from the one that was supplied.
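The default-guessing idea described above can be pictured with a minimal sketch. It is not the package's implementation: the helpers `guess_profile_id()` and `guess_study_id()` and the suffix table are assumptions for illustration, inferred only from the profile names printed in the Examples section (`prad_msk_2019_cna`, `prad_msk_2019_mutations`, `prad_msk_2019_structural_variants`). The real function may instead look profiles up through the API.

```r
# Hypothetical suffix table, inferred from the profile IDs shown in the examples.
# The "segment" data type is left out because no profile ID for it appears there.
profile_suffix <- c(
  mutation = "mutations",
  cna = "cna",
  fusion = "structural_variants",
  structural_variant = "structural_variants"
)

# Hypothetical helper: guess a molecular profile ID from a study ID.
guess_profile_id <- function(study_id, data_type = names(profile_suffix)) {
  data_type <- match.arg(data_type)
  paste0(study_id, "_", profile_suffix[[data_type]])
}

# Hypothetical helper: recover the study ID by stripping a known suffix.
guess_study_id <- function(molecular_profile_id) {
  pattern <- paste0("_(", paste(unique(profile_suffix), collapse = "|"), ")$")
  sub(pattern, "", molecular_profile_id)
}

guess_profile_id("prad_msk_2019", "fusion")
guess_study_id("prad_msk_2019_mutations")
```

A naming convention like this only works for studies that follow it, which is one reason a lookup against the database would be the more robust choice in practice.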
Usage
.get_data_by_study(
  study_id = NULL,
  molecular_profile_id = NULL,
  data_type = c("mutation", "cna", "fusion", "structural_variant", "segment"),
  base_url = NULL,
  add_hugo = TRUE
)
Arguments
- study_id
A study ID to query. If NULL, guesses the study ID based on molecular_profile_id.
- molecular_profile_id
A molecular profile to query. If NULL, guesses molecular_profile_id based on the study ID.
- data_type
Specifies what type of data to return. Options are mutation, cna, fusion or structural_variant (same as fusion), and segment (copy number segmentation data).
- base_url
The database URL to query. If NULL, defaults to the URL set with set_cbioportal_db(<your_db>).
- add_hugo
Logical indicating whether HugoGeneSymbol should be added to your resulting data frame, if not already present in raw API results. Argument is TRUE by default. If FALSE, results will be returned as is (i.e. any existing Hugo Symbol columns in raw results will not be removed).
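The add_hugo behavior described above (add the column only when it is missing, and never remove an existing one) can be sketched in base R. The helper `add_hugo_if_missing()`, the lookup table, and the toy input are all hypothetical; the Entrez/Hugo pairs are copied from the example output on this page, and the package itself presumably resolves symbols through the API rather than a hard-coded table.

```r
# Hypothetical lookup table; pairs copied from the example output on this page.
gene_lookup <- data.frame(
  entrezGeneId = c(5728L, 545L, 596L),
  hugoGeneSymbol = c("PTEN", "ATR", "BCL2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: add hugoGeneSymbol only when it is absent.
add_hugo_if_missing <- function(df, add_hugo = TRUE, lookup = gene_lookup) {
  if (!add_hugo || "hugoGeneSymbol" %in% names(df)) {
    return(df)  # returned as is; existing columns are never dropped
  }
  # match() keeps the original row order; unknown IDs become NA
  idx <- match(df$entrezGeneId, lookup$entrezGeneId)
  df$hugoGeneSymbol <- lookup$hugoGeneSymbol[idx]
  df
}

# Toy input standing in for raw API results without a Hugo symbol column.
raw <- data.frame(
  entrezGeneId = c(545L, 5728L),
  sampleId = c("toy_sample_1", "toy_sample_2"),
  stringsAsFactors = FALSE
)

add_hugo_if_missing(raw)
add_hugo_if_missing(raw, add_hugo = FALSE)
```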
Examples
# \dontrun{
set_cbioportal_db("public")
#> ✔ You are successfully connected!
#> ✔ base_url for this R session is now set to "www.cbioportal.org/api"
.get_data_by_study(study_id = "prad_msk_2019", data_type = "cna")
#> ℹ Returning all data for the "prad_msk_2019_cna" molecular profile in the "prad_msk_2019" study
#> # A tibble: 1 × 9
#> hugoGeneSymbol entrezGeneId uniqueSampleKey uniquePatientKey
#> <chr> <int> <chr> <chr>
#> 1 PTEN 5728 c19DXzM2OTI0TF9QMDAxX2Q6cHJhZF9t… cF9DXzM2OTI0TDp…
#> # ℹ 5 more variables: molecularProfileId <chr>, sampleId <chr>,
#> # patientId <chr>, studyId <chr>, alteration <int>
.get_data_by_study(study_id = "prad_msk_2019", data_type = "mutation")
#> ℹ Returning all data for the "prad_msk_2019_mutations" molecular profile in the "prad_msk_2019" study
#> # A tibble: 26 × 28
#> hugoGeneSymbol entrezGeneId uniqueSampleKey uniquePatientKey
#> <chr> <int> <chr> <chr>
#> 1 ZFHX3 463 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#> 2 ZFHX3 463 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#> 3 ATR 545 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#> 4 BCL2 596 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#> 5 ETV1 2115 c19DX1A4SzNUUl9QMDAxX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#> 6 ETV1 2115 c19DX1A4SzNUUl9QMDAzX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#> 7 FAT1 2195 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#> 8 MSH6 2956 c19DX1A4SzNUUl9QMDAyX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#> 9 MSH6 2956 c19DX1A4SzNUUl9QMDAzX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#> 10 FOXA1 3169 c19DX0UwS0pGSl9QMDAyX2Q6cHJhZF9… cF9DX0UwS0pGSjp…
#> # ℹ 16 more rows
#> # ℹ 24 more variables: molecularProfileId <chr>, sampleId <chr>,
#> # patientId <chr>, studyId <chr>, center <chr>, mutationStatus <chr>,
#> # validationStatus <chr>, tumorAltCount <int>, tumorRefCount <int>,
#> # normalAltCount <int>, normalRefCount <int>, startPosition <int>,
#> # endPosition <int>, referenceAllele <chr>, proteinChange <chr>,
#> # mutationType <chr>, ncbiBuild <chr>, variantType <chr>, keyword <chr>, …
.get_data_by_study(study_id = "prad_msk_2019", data_type = "fusion")
#> ℹ Returning all data for the "prad_msk_2019_structural_variants" molecular profile in the "prad_msk_2019" study
#> # A tibble: 4 × 44
#> uniqueSampleKey uniquePatientKey molecularProfileId sampleId patientId studyId
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 c19DX0NBVVdUN1… cF9DX0NBVVdUNzp… prad_msk_2019_str… s_C_CAU… p_C_CAUW… prad_m…
#> 2 c19DX0RVNkVDQ1… cF9DX0RVNkVDQzp… prad_msk_2019_str… s_C_DU6… p_C_DU6E… prad_m…
#> 3 c19DX1ZDNlA5QV… cF9DX1ZDNlA5QTp… prad_msk_2019_str… s_C_VC6… p_C_VC6P… prad_m…
#> 4 c19DX1ZDNlA5QV… cF9DX1ZDNlA5QTp… prad_msk_2019_str… s_C_VC6… p_C_VC6P… prad_m…
#> # ℹ 38 more variables: site1EntrezGeneId <int>, site1HugoSymbol <chr>,
#> # site1EnsemblTranscriptId <chr>, site1Chromosome <chr>, site1Position <int>,
#> # site1Contig <chr>, site1Region <chr>, site1RegionNumber <int>,
#> # site1Description <chr>, site2EntrezGeneId <int>, site2HugoSymbol <chr>,
#> # site2EnsemblTranscriptId <chr>, site2Chromosome <chr>, site2Position <int>,
#> # site2Contig <chr>, site2Region <chr>, site2RegionNumber <int>,
#> # site2Description <chr>, site2EffectOnFrame <chr>, ncbiBuild <chr>, …
.get_data_by_study(molecular_profile_id = "prad_msk_2019_cna", data_type = "cna")
#> ℹ Returning all data for the "prad_msk_2019_cna" molecular profile in the "prad_msk_2019" study
#> # A tibble: 1 × 9
#> hugoGeneSymbol entrezGeneId uniqueSampleKey uniquePatientKey
#> <chr> <int> <chr> <chr>
#> 1 PTEN 5728 c19DXzM2OTI0TF9QMDAxX2Q6cHJhZF9t… cF9DXzM2OTI0TDp…
#> # ℹ 5 more variables: molecularProfileId <chr>, sampleId <chr>,
#> # patientId <chr>, studyId <chr>, alteration <int>
.get_data_by_study(molecular_profile_id = "prad_msk_2019_mutations", data_type = "mutation")
#> ℹ Returning all data for the "prad_msk_2019_mutations" molecular profile in the "prad_msk_2019" study
#> # A tibble: 26 × 28
#> hugoGeneSymbol entrezGeneId uniqueSampleKey uniquePatientKey
#> <chr> <int> <chr> <chr>
#> 1 ZFHX3 463 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#> 2 ZFHX3 463 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#> 3 ATR 545 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#> 4 BCL2 596 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#> 5 ETV1 2115 c19DX1A4SzNUUl9QMDAxX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#> 6 ETV1 2115 c19DX1A4SzNUUl9QMDAzX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#> 7 FAT1 2195 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#> 8 MSH6 2956 c19DX1A4SzNUUl9QMDAyX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#> 9 MSH6 2956 c19DX1A4SzNUUl9QMDAzX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#> 10 FOXA1 3169 c19DX0UwS0pGSl9QMDAyX2Q6cHJhZF9… cF9DX0UwS0pGSjp…
#> # ℹ 16 more rows
#> # ℹ 24 more variables: molecularProfileId <chr>, sampleId <chr>,
#> # patientId <chr>, studyId <chr>, center <chr>, mutationStatus <chr>,
#> # validationStatus <chr>, tumorAltCount <int>, tumorRefCount <int>,
#> # normalAltCount <int>, normalRefCount <int>, startPosition <int>,
#> # endPosition <int>, referenceAllele <chr>, proteinChange <chr>,
#> # mutationType <chr>, ncbiBuild <chr>, variantType <chr>, keyword <chr>, …
.get_data_by_study(molecular_profile_id = "prad_msk_2019_structural_variants", data_type = "fusion")
#> ℹ Returning all data for the "prad_msk_2019_structural_variants" molecular profile in the "prad_msk_2019" study
#> # A tibble: 4 × 44
#> uniqueSampleKey uniquePatientKey molecularProfileId sampleId patientId studyId
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 c19DX0NBVVdUN1… cF9DX0NBVVdUNzp… prad_msk_2019_str… s_C_CAU… p_C_CAUW… prad_m…
#> 2 c19DX0RVNkVDQ1… cF9DX0RVNkVDQzp… prad_msk_2019_str… s_C_DU6… p_C_DU6E… prad_m…
#> 3 c19DX1ZDNlA5QV… cF9DX1ZDNlA5QTp… prad_msk_2019_str… s_C_VC6… p_C_VC6P… prad_m…
#> 4 c19DX1ZDNlA5QV… cF9DX1ZDNlA5QTp… prad_msk_2019_str… s_C_VC6… p_C_VC6P… prad_m…
#> # ℹ 38 more variables: site1EntrezGeneId <int>, site1HugoSymbol <chr>,
#> # site1EnsemblTranscriptId <chr>, site1Chromosome <chr>, site1Position <int>,
#> # site1Contig <chr>, site1Region <chr>, site1RegionNumber <int>,
#> # site1Description <chr>, site2EntrezGeneId <int>, site2HugoSymbol <chr>,
#> # site2EnsemblTranscriptId <chr>, site2Chromosome <chr>, site2Position <int>,
#> # site2Contig <chr>, site2Region <chr>, site2RegionNumber <int>,
#> # site2Description <chr>, site2EffectOnFrame <chr>, ncbiBuild <chr>, …
# }