Get All Genomic Information By Sample IDs

Usage

get_genetics_by_sample(
  sample_id = NULL,
  study_id = NULL,
  sample_study_pairs = NULL,
  genes = NULL,
  panel = NULL,
  add_hugo = TRUE,
  base_url = NULL,
  return_segments = FALSE
)

Arguments

sample_id: a vector of sample IDs (character)
study_id: A string indicating the study ID from which to pull data. If no study ID is supplied, the function guesses the study ID from your database URL and reports its choice in a message. Only one study ID can be passed. If you need mutations/CNA from more than one study, see sample_study_pairs.
sample_study_pairs: A data frame with columns: sample_id, study_id and molecular_profile_id (optional). Variations in capitalization of column names are accepted. This can be used in place of the sample_id, study_id and molecular_profile_id arguments if you need to pull samples from several different studies at once. If passed, it overrides sample_id, study_id and molecular_profile_id if those are also passed.
genes: A vector of Entrez IDs or Hugo symbols. If Hugo symbols are supplied, they will be converted to Entrez IDs using the get_entrez_id() function. If panel and genes are both supplied, genes from both arguments will be returned. If both are NULL (default), it will return gene results for all available genomic data for that sample.
panel: One or more panel IDs to query (e.g. 'IMPACT468'). If panel and genes are both supplied, genes from both arguments will be returned. If both are NULL (default), it will return gene results for all available genomic data for that sample.
add_hugo: Logical indicating whether HugoGeneSymbol should be added to your resulting data frame, if not already present in raw API results. Argument is TRUE by default. If FALSE, results will be returned as is (i.e. any existing Hugo Symbol columns in raw results will not be removed).
base_url: The database URL to query. If NULL, defaults to the URL set with set_cbioportal_db(<your_db>).
return_segments: Logical. If FALSE (default), copy number segmentation data is not returned alongside the mutation, CNA and structural variant data. If TRUE, any available segmentation data is returned with the results.
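
As a minimal sketch of the sample_study_pairs argument described above, the data frame can be built in base R and passed in place of sample_id and study_id. The second study ID and its sample ID are placeholders for illustration only, and the package name (cbioportalR) and the "public" database shortcut are assumptions not stated on this page. The query is guarded because it needs the package, a network connection and a database set beforehand.

```r
# Build a sample_study_pairs data frame with base R only.
# The "blca_tcga" row is a placeholder; substitute IDs from your own studies.
pairs <- data.frame(
  sample_id = c("TCGA-OR-A5J2-01", "TCGA-OR-A5J6-01", "TCGA-2F-A9KO-01"),
  study_id  = c("acc_tcga", "acc_tcga", "blca_tcga"),
  stringsAsFactors = FALSE
)

# Only run the query in an interactive session where the package is installed.
# Assumption: the function lives in the cbioportalR package and "public"
# points at the public cBioPortal instance.
if (interactive() && requireNamespace("cbioportalR", quietly = TRUE)) {
  cbioportalR::set_cbioportal_db("public")
  res <- cbioportalR::get_genetics_by_sample(
    sample_study_pairs = pairs,
    genes = c("TP53", "KRAS")  # Hugo symbols; converted to Entrez IDs
  )
  names(res)
}
```

Because sample_study_pairs takes precedence, sample_id and study_id are left out of the call entirely.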

Value

A list of mutations, cna and structural variants (including fusions), if available. Will also return copy number segmentation data if return_segments = TRUE.
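
Because each element is returned only "if available", downstream code may want to check which elements are present before using them. The sketch below uses a small mock list in place of a real result; the element names mutation and cna follow the example output on this page, while structural_variant is assumed from the wording of the warning message shown there.

```r
# Mock result standing in for the list returned by get_genetics_by_sample().
# Gene symbols and Entrez IDs are copied from the example output on this page.
res <- list(
  mutation = data.frame(
    hugoGeneSymbol = c("ZFPM1", "ZNF787"),
    entrezGeneId   = c(161882L, 126208L),
    stringsAsFactors = FALSE
  ),
  cna = data.frame(
    hugoGeneSymbol = c("AJAP1", "NPHP4"),
    entrezGeneId   = c(55966L, 261734L),
    stringsAsFactors = FALSE
  )
)

# Check whether a given data type came back before using it.
# Assumption: a structural variant element, when present, is named
# "structural_variant".
has_sv <- "structural_variant" %in% names(res)

# Row counts per returned element, as a named integer vector.
n_rows <- vapply(res, nrow, integer(1))

if (!has_sv) {
  message("No structural variant data in this result.")
}
```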

Examples

# \dontrun{
get_genetics_by_sample(sample_id = c("TCGA-OR-A5J2-01","TCGA-OR-A5J6-01"),
 study_id = "acc_tcga",
 return_segments = TRUE)
#> The following parameters were used in query:
#> Study ID: "acc_tcga"
#> Molecular Profile ID: "acc_tcga_mutations"
#> Genes: "All available genes"
#> The following parameters were used in query:
#> Study ID: "acc_tcga"
#> Molecular Profile ID: "acc_tcga_gistic"
#> Genes: "All available genes"
#> The following parameters were used in query:
#> Study ID: "acc_tcga"
#> Molecular Profile ID: "Not Applicable"
#> Genes: "All available genes"
#> ! No "structural_variant" data returned. Error:  No molecular profile for `data_type = fusion` found in "acc_tcga".  See `available_profiles('acc_tcga')`
#> $mutation
#> # A tibble: 173 × 28
#>    hugoGeneSymbol entrezGeneId uniqueSampleKey                  uniquePatientKey
#>    <chr>                 <int> <chr>                            <chr>           
#>  1 ZFPM1                161882 VENHQS1PUi1BNUoyLTAxOmFjY190Y2dh VENHQS1PUi1BNUo…
#>  2 ZNF787               126208 VENHQS1PUi1BNUoyLTAxOmFjY190Y2dh VENHQS1PUi1BNUo…
#>  3 PODXL                  5420 VENHQS1PUi1BNUoyLTAxOmFjY190Y2dh VENHQS1PUi1BNUo…
#>  4 CCDC102A              92922 VENHQS1PUi1BNUoyLTAxOmFjY190Y2dh VENHQS1PUi1BNUo…
#>  5 TVP23C               201158 VENHQS1PUi1BNUoyLTAxOmFjY190Y2dh VENHQS1PUi1BNUo…
#>  6 ZNF628                89887 VENHQS1PUi1BNUoyLTAxOmFjY190Y2dh VENHQS1PUi1BNUo…
#>  7 TBP                    6908 VENHQS1PUi1BNUoyLTAxOmFjY190Y2dh VENHQS1PUi1BNUo…
#>  8 SEMA5B                54437 VENHQS1PUi1BNUoyLTAxOmFjY190Y2dh VENHQS1PUi1BNUo…
#>  9 CELSR2                 1952 VENHQS1PUi1BNUoyLTAxOmFjY190Y2dh VENHQS1PUi1BNUo…
#> 10 MUC5B                727897 VENHQS1PUi1BNUoyLTAxOmFjY190Y2dh VENHQS1PUi1BNUo…
#> # ℹ 163 more rows
#> # ℹ 24 more variables: molecularProfileId <chr>, sampleId <chr>,
#> #   patientId <chr>, studyId <chr>, center <chr>, mutationStatus <chr>,
#> #   validationStatus <chr>, tumorAltCount <int>, tumorRefCount <int>,
#> #   normalAltCount <int>, normalRefCount <int>, startPosition <int>,
#> #   endPosition <int>, referenceAllele <chr>, proteinChange <chr>,
#> #   mutationType <chr>, ncbiBuild <chr>, variantType <chr>, keyword <chr>, …
#> 
#> $cna
#> # A tibble: 417 × 9
#>    hugoGeneSymbol entrezGeneId uniqueSampleKey                  uniquePatientKey
#>    <chr>                 <int> <chr>                            <chr>           
#>  1 AJAP1                 55966 VENHQS1PUi1BNUoyLTAxOmFjY190Y2dh VENHQS1PUi1BNUo…
#>  2 NPHP4                261734 VENHQS1PUi1BNUoyLTAxOmFjY190Y2dh VENHQS1PUi1BNUo…
#>  3 KCNAB2                 8514 VENHQS1PUi1BNUoyLTAxOmFjY190Y2dh VENHQS1PUi1BNUo…
#>  4 CHD5                  26038 VENHQS1PUi1BNUoyLTAxOmFjY190Y2dh VENHQS1PUi1BNUo…
#>  5 RPL22                  6146 VENHQS1PUi1BNUoyLTAxOmFjY190Y2dh VENHQS1PUi1BNUo…
#>  6 RNF207               388591 VENHQS1PUi1BNUoyLTAxOmFjY190Y2dh VENHQS1PUi1BNUo…
#>  7 ICMT                  23463 VENHQS1PUi1BNUoyLTAxOmFjY190Y2dh VENHQS1PUi1BNUo…
#>  8 ICMT-DT              148645 VENHQS1PUi1BNUoyLTAxOmFjY190Y2dh VENHQS1PUi1BNUo…
#>  9 GPR153               387509 VENHQS1PUi1BNUoyLTAxOmFjY190Y2dh VENHQS1PUi1BNUo…
#> 10 HES3                 390992 VENHQS1PUi1BNUoyLTAxOmFjY190Y2dh VENHQS1PUi1BNUo…
#> # ℹ 407 more rows
#> # ℹ 5 more variables: molecularProfileId <chr>, sampleId <chr>,
#> #   patientId <chr>, studyId <chr>, alteration <int>
#> 
#> $segment
#> # A tibble: 210 × 10
#>    uniqueSampleKey  uniquePatientKey patientId  start    end segmentMean studyId
#>    <chr>            <chr>            <chr>      <int>  <int>       <dbl> <chr>  
#>  1 VENHQS1PUi1BNUo… VENHQS1PUi1BNUo… TCGA-OR-… 3.22e6 4.75e6      -0.224 acc_tc…
#>  2 VENHQS1PUi1BNUo… VENHQS1PUi1BNUo… TCGA-OR-… 4.75e6 1.13e7      -0.839 acc_tc…
#>  3 VENHQS1PUi1BNUo… VENHQS1PUi1BNUo… TCGA-OR-… 1.14e7 1.28e7       0.174 acc_tc…
#>  4 VENHQS1PUi1BNUo… VENHQS1PUi1BNUo… TCGA-OR-… 1.28e7 3.59e7      -0.226 acc_tc…
#>  5 VENHQS1PUi1BNUo… VENHQS1PUi1BNUo… TCGA-OR-… 3.59e7 3.60e7       0.478 acc_tc…
#>  6 VENHQS1PUi1BNUo… VENHQS1PUi1BNUo… TCGA-OR-… 3.60e7 4.23e7      -0.226 acc_tc…
#>  7 VENHQS1PUi1BNUo… VENHQS1PUi1BNUo… TCGA-OR-… 4.23e7 4.24e7       0.491 acc_tc…
#>  8 VENHQS1PUi1BNUo… VENHQS1PUi1BNUo… TCGA-OR-… 4.24e7 4.47e7      -0.243 acc_tc…
#>  9 VENHQS1PUi1BNUo… VENHQS1PUi1BNUo… TCGA-OR-… 4.48e7 4.48e7       0.441 acc_tc…
#> 10 VENHQS1PUi1BNUo… VENHQS1PUi1BNUo… TCGA-OR-… 4.48e7 5.33e7      -0.238 acc_tc…
#> # ℹ 200 more rows
#> # ℹ 3 more variables: sampleId <chr>, chromosome <chr>, numberOfProbes <int>
#> 
# }