Global Perspectives and Initiatives for Large-Scale Genomics
Platform sessions are abstract-driven sessions with 6 talks per session. These talks are 10 minutes in length and are cross-topical in nature, representing the broad discipline of genetics and genomics. After each talk, there will be a 5-minute Q&A with the speaker. For information on each individual session, please view the "Details" tab.
Recorded session from the 2021 virtual meeting.
COVID-19 symptoms vary widely, ranging from asymptomatic in some patients to fatal in others. Elucidating the host genetics of COVID-19 holds the potential for understanding both susceptibility to SARS-CoV-2 infection as well as heterogeneity in patient presentation and outcome. Prior work focused on identifying common variants associated with COVID-19 susceptibility and severity, but little has been done to explore the entire allele frequency spectrum of genetic variation, from common to rare exonic variants. Here, we present the largest trans-ancestry exome sequencing study of COVID-19 to date in 586,713 individuals, with a larger set of 1,012,636 individuals with imputed data across 7 studies and 5 continental ancestries.
Through exome sequencing of 21,820 COVID-19 cases and 564,893 controls, we did not identify any rare variants associated with COVID-19 after Bonferroni correction (P<9.6e-10). Burden tests identified three genes tentatively associated with COVID-19: DISP3 (P=2e-8; OR=1.8±0.3), MARK1 (P=3e-9; OR=38.4±16.9), and TLR7 (P=4e-8; OR=4.5±2.2). Despite having a 100x larger sample size, we could not replicate a previously reported role for rare variants in the interferon pathway (P=0.59).
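As a rough illustration of how a Bonferroni threshold such as P<9.6e-10 relates to the number of tests performed, the sketch below divides a family-wise error rate by the test count, and inverts that relation for a reported threshold. The family-wise alpha of 0.05 is an assumption (the abstract reports only the threshold), and the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def implied_number_of_tests(alpha: float, threshold: float) -> float:
    """Number of tests implied by a reported Bonferroni threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return alpha / threshold


# Assumption: family-wise alpha = 0.05 (not stated in the abstract).
implied = implied_number_of_tests(0.05, 9.6e-10)
print(f"Implied number of tests: {implied:.3g}")
```

Under that assumed alpha, the reported threshold would correspond to roughly 52 million tests.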
Our larger GWAS of 56,841 cases and 955,795 controls found 11 loci (P<5e-8). Most notably, we identified a strong protective association amongst SARS-CoV-2 infected cases for rs190509934, located 60bp upstream of ACE2, the primary cell receptor for the SARS-CoV-2 spike protein (P=4.5e-13; OR=0.6±0.08; EUR MAF=0.003). Using RNA-seq, we found that rs190509934 reduced ACE2 expression by 39% (P=3e-8), supporting the hypothesis that reduced ACE2 expression protects against SARS-CoV-2 infection.
Lastly, we developed a polygenic risk score (PRS) to predict hospitalization and severity of COVID-19. Among those of European ancestry, individuals in the top 10% of the PRS distribution are 1.8-fold more likely to be hospitalized (P=6e-11) and 1.58-fold more likely to be placed on a ventilator or die from COVID-19 (P=7e-10). These associations hold in non-European populations (albeit with decreased power) and after accounting for known clinical risk factors.
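A comparison of this kind can be sketched as an odds ratio for the outcome in the top fraction of a score distribution versus everyone else, with a Wald confidence interval. This is a minimal sketch on simulated data, not the authors' analysis: the comparison group (the remaining 90%), the simulated effect size, and the function name are all assumptions, and no covariates are adjusted for.

```python
import math
import random


def top_fraction_odds_ratio(scores, outcomes, fraction=0.10, z=1.96):
    """Odds ratio (with Wald interval) for the outcome in the top
    `fraction` of scores versus all remaining individuals."""
    n = len(scores)
    if n == 0 or n != len(outcomes):
        raise ValueError("scores and outcomes must be non-empty and equal in length")
    n_top = max(1, round(n * fraction))
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    in_top = [False] * n
    for i in ranked[:n_top]:
        in_top[i] = True
    a = sum(1 for i in range(n) if in_top[i] and outcomes[i])      # top, event
    b = n_top - a                                                  # top, no event
    c = sum(1 for i in range(n) if not in_top[i] and outcomes[i])  # rest, event
    d = (n - n_top) - c                                            # rest, no event
    cells = [a, b, c, d]
    if 0 in cells:
        # Continuity correction so the ratio is defined when a cell is empty.
        cells = [x + 0.5 for x in cells]
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Simulated cohort (illustrative only): risk rises with the score.
rng = random.Random(0)
scores = [rng.gauss(0.0, 1.0) for _ in range(20000)]
outcomes = [rng.random() < 1 / (1 + math.exp(2.0 - 0.5 * s)) for s in scores]
odds_ratio, lower, upper = top_fraction_odds_ratio(scores, outcomes)
print(f"OR={odds_ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```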
Our data represent the most comprehensive survey of common and rare exonic variation associated with COVID-19, identifying new loci and polygenic risk scores that predict the severity of COVID-19.
Jack Kosmicki
Regeneron Genetics Center
Rare variant analyses in 239,395 whole exome and whole genome sequenced participants of the UK Biobank reveals novel genetic associations with renal function and chronic kidney disease
Genome-wide association studies have identified common genetic variants associated with chronic kidney disease (CKD), but the burden of rare loss-of-function (LoF) or pathogenic/likely pathogenic (P/LP) variants has not been well characterized. We performed gene-/region-based and variant association analyses for 5 renal function biomarkers (eGFR estimated from serum creatinine and/or cystatin-C, BUN, UACR) and 5 CKD endpoints (ESRD and stage 4/5 CKD, CKD defined by biomarkers and/or diagnoses from NHS data, Cystic) in 239,395 UKB participants of genetically-assessed European ancestry and with whole exome (WES, n=171,172) or whole genome sequencing (WGS, n=121,019). For each trait, we fit a genome-wide regression model and tested for association using REGENIE V2.0, adjusting for age, sex, 10 principal components of ancestry, assessment center and BMI, where appropriate. For gene-based analyses, we generated 15 models to collapse ClinVar-classified P/LP, VEP(LOFTEE)-predicted putative LoF and deleterious variants predicted by 16 in silico scores (SIFT, Polyphen, BayesDel, etc.) from dbNSFP 4.1c. The WGS data further enabled annotation of promoter/enhancer variants, which were incorporated into collapsing models for gene-based association. In participants with WES, we identified 30 and 11 genes associated with ≥2 biomarkers and ≥1 CKD endpoint across collapsing models (FDR<0.05), respectively. PKD1/2, COL4A3/4, CUBN, IFT140 were associated with both biomarkers and CKD. Association analyses also highlighted other genes including: COL4A1, CST3, LAMC1, LRP2, SLC22A2, SLC34A3, SH2B3. Variant-level analyses further informed impact on protein, e.g. the SLC22A2 association signal was mainly driven by a frameshift (rs8177505) with lowering effects on eGFR (p=1.2e-27, beta=-6.2, MAF=0.12%). Exome-wide variant analyses revealed 25 genes (e.g. PDILT-UMOD) with variant associations (p<5.0e-8) with >3 biomarkers or ≥1 endpoint, including 2 that were also implicated from the gene-based analyses (COL4A4 and CUBN). Analyses of WGS allowed for sequence level validation of exome derived findings and the identification of additional variants not captured in WES. This study provides a framework for the assessment of the genetic landscape of kidney disease. The results validated known genes and identified potential novel associations with renal function.
Shuwei Li
Janssen
Novel genetic associations for rare diseases with GWAS and trans-ethnic analysis of self-reported medical data
Nearly 7,000 rare diseases are known, and although each disease affects few people, the total population prevalence of rare diseases is estimated to be 3.5-5.9%. A key challenge in the study of rare disease genetics is assembling large case cohorts for well-powered studies. Here we demonstrate use of large-scale self-reported rare disease data, combined with genetic data collected through the 23andMe direct-to-consumer platform, to study 33 rare diseases and identify genetic associations through GWAS. We developed web-based questionnaires, and gathered self-reported data on rare diseases from a cohort of over 1.6 million genotyped research-consented individuals. To reduce mis-reporting and maximize coverage, we used an autocomplete mechanism including 7,000 rare diseases. We validated the approach through simulations and replication of known rare disease associations. In simulations based on genotypes from 4,957,230 European individuals, we show that GWAS can recover genome-wide significant associations in monogenic rare diseases for a variety of architectures. In rare diseases with known genetic associations, we reidentified 29 associations at a genome-wide significance level (p-value < 5e-8) with a diverse range of minor allele frequencies (minimum MAF=0.0001, maximum MAF=0.487) and effect sizes for the risk allele (minimum OR=1.24, maximum OR=273.15). We performed the first GWAS in European ancestry for Duane retraction syndrome, vestibular schwannoma and spontaneous pneumothorax, and report novel genome-wide significant associations for these diseases. For Duane retraction syndrome, an eye movement disorder, we found two independent associations near the OLIG1 and OLIG2 genes, knockdown of which causes a similar phenotype in mice. For vestibular schwannoma, we found a single association near the CDKN2A and CDKN2B genes, which are associated with many other cancers.
We found three novel associations for spontaneous pneumothorax, two of which are also associated with lung function phenotypes. We replicated these associations in the UK Biobank and found that 3 of 5 replicated with p < 0.05, and all 5 had the same direction of effect. Trans-ethnic mixed-model analyses, including individuals of all ancestries, found the same associations with comparable or increased significance. Our results show that self-reported rare disease data are a viable resource for discovering genetic associations for rare diseases. With increasing sample size and diverse imputation reference panels, we may also be able to study rare diseases more widely in multiple populations and improve our understanding of the trans-ethnic genetic architecture of these diseases.
Suyash S Shringarpure
23andMe
Common and rare variant analysis of 21K psoriasis cases and 623K controls identifies novel, protective associations in several genes in the type 1 interferon pathway
Psoriasis is a complex autoimmune disease resulting in chronic inflammation and hyperproliferation of the skin. The aberrant immune response associated with psoriasis is mediated by pathogenic T cells, which are activated, in part, by type 1 interferons (IFNs). Prior large-scale analyses of psoriasis cases focusing on common genetic variants have implicated >63 loci, including genes in the IFN signaling pathway. However, large-scale analysis of rare exonic variation is lacking.
To study the contribution of both common and rare variants to psoriasis risk, we performed whole-exome sequencing and meta-analysis of 20,810 psoriasis cases and 623,159 controls of EUR and AFR ancestry across 6 cohorts. Common variant analysis replicated 44 significant and independent associations in known psoriasis loci, including IL23R, TYK2, IL12B, HLA-C, and DDX58, among others. Rare-variant gene-burden analysis of putative loss-of-function (pLoF) and/or predicted-deleterious missense variants (<1% AAF) identified significant and novel associations in 5 genes, including 3 genes in the IFN pathway. These include protective pLoF associations for IFIH1 (OR=0.74 [0.68, 0.81], p=4.1E-12), which encodes a pathogen sensor that activates IFN production, and TRIM65 (OR=0.63 [0.50, 0.79], p=4.8E-5), which encodes a ubiquitin ligase that binds and activates IFIH1. We find the protective TRIM65 association is driven by a rare, predicted-deleterious missense variant (rs202175254, AAF=0.1%) in the IFIH1-TRIM65 binding domain. Further, we find a nominally significant, protective association for the burden of rare pLoFs in DDX58 (OR=0.76 [0.49, 0.89], p=6.7E-3), which encodes a second pathogen sensor that activates IFNs. This DDX58 protective pLoF association helps confirm direction of effect at this known psoriasis locus.
Consistent with inhibition of IFNs being protective in psoriasis, we also found a significant and novel gene-burden association between increased odds of psoriasis and pLoFs in ADAR (OR=2.29 [1.68, 3.12], p=1.4E-7), which encodes a protein that suppresses IFNs and in which partial LoFs have been associated with Aicardi-Goutières syndrome, an inherited disorder that features over-production of IFNs.
Collectively, these results represent the largest rare-variant exome-sequencing analysis of psoriasis to date. Future experiments will characterize effects of these pLoFs on protein expression and/or function, and further analysis will determine whether an IFN gene signature can identify a clinically relevant subset of psoriasis patients who would therapeutically benefit from IFN inhibition.
Julie Horowitz
Regeneron Genetics Center
Investigating genetic and phenotypic associations for 168 blood metabolites in 120K UK Biobank participants
In this study, we accessed the large-scale metabolomics, exome sequencing and phenomics data from the UK Biobank (UKB) to investigate gene-metabolite and metabolite-phenotype relationships. Blood metabolites (N=168) were profiled by Nightingale Health in ~120,000 UKB participants, >90% of whom had exome sequences and all of whom had data on ~16,000 clinical traits.
We explored genetic associations with blood metabolites by two complementary approaches: (i) single-variant analysis, and (ii) gene-level collapsing analysis, using a linear regression model, adjusted for age, sex and BMI. For the single-variant analysis, we tested ~3.2 million variants under dominant and recessive models. For the gene-level collapsing analysis, the aggregate effect of variants in each gene was tested using 11 different models, including ones that focused on rare (MAF<0.1%) missense and protein-truncating variants. We also performed a metabolite PheWAS, in which the association for each metabolite was tested with each clinical trait.
Our analyses provide a rich catalogue of significant (p<1×10⁻⁸) associations: 10,461 variant-metabolite, 970 gene-metabolite, and 127,947 metabolite-phenotype relationships. This includes well-established, biologically plausible associations such as variants in PAH with phenylalanine levels [beta=1.2; p<1×10⁻³⁰⁰] and the concentration of intermediate-density lipoprotein particles with type 2 diabetes [beta=-1.5; p<1×10⁻³⁰⁰]. These data may also provide insights into underlying biological mechanisms: for instance, the observed metabolite signature for mutations in a gene that is a known drug target (e.g., HSD17B13) can indicate the metabolic profile expected with desirable therapeutic response.
The catalogue of genetic and phenotypic relationships for blood metabolites, which will expand further once metabolomics data becomes available in the entire UKB cohort of ~500,000 subjects, represents an excellent resource to better understand mechanisms underlying complex human diseases.
Abhishek Nag
Centre for Genomics Research, AstraZeneca
Practical implementation of polygenic risk scores and absolute risk score estimation across diverse ancestry groups
Polygenic risk scores (PRS) have generated considerable translational interest. Yet, most validation efforts focus on assessing relative rather than absolute risk scores (ARS), even though ARS are required for clinical decision making. ARS validation experiments are typically based on a single large cohort split into training/testing and rarely incorporate PRS. While such approaches typically generate calibrated ARS within the testing dataset, they do not properly capture the complex biases inherent to each healthcare context or account for environmental differences between countries and ethnicities. Consequently, the robustness of the ARS across different contexts is largely unknown.
To address these gaps, we derived a framework to combine ethnicity-specific disease baselines from a range of country-specific surveys, which capture social determinants of health, with ancestry-adjusted PRS (European OR per 1SD 1.87, 2.10, 1.51 and 2.09 respectively) for breast cancer, prostate cancer (PC), cardiovascular disease (CVD) and type 2 diabetes (T2D). We validated these ARS in independent datasets, computing calibration summary statistics, including the standardized incidence ratio (SIR), calibration slope and intercept, and the integrated calibration index.
We find that inclusion of an ethnicity-specific baseline captures substantial ARS variability not captured by the PRS, particularly for PC, where a UK African and Caribbean baseline results in calibration (0.99-1.34 95% CI SIR) whilst the UK average baseline results in strong miscalibration (2.24-3.02 95% CI SIR). The extent of the calibration varied, with challenges arising for T2D and CVD, whose incidence has fluctuated across time and location in the US over the last decades. For T2D, baselines date from 1997-2019 but prospective testing data date from 1987-1999, resulting in miscalibration for White males (1.35-1.62 95% CI SIR). For CVD, baselines for myocardial infarction and fatal heart disease date from 2004-2011 and ischemic stroke from 1999, but prospective testing data date from 1986-2000, resulting in miscalibration for White females and males (0.66-0.92 and 1.04-1.31 95% CI SIR respectively).
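For readers unfamiliar with the SIR, it is the ratio of observed to model-expected cases, and a confidence interval that contains 1 is consistent with calibration. The sketch below uses Byar's approximation to the Poisson interval; the abstract does not state how its intervals were computed, so that choice, the example counts, and the function name are assumptions.

```python
import math


def sir_with_ci(observed: int, expected: float, z: float = 1.96):
    """SIR = observed / expected, with an approximate confidence interval
    from Byar's approximation to the exact Poisson limits."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    if observed < 0:
        raise ValueError("observed must be non-negative")
    sir = observed / expected
    if observed == 0:
        lower = 0.0
    else:
        lower = observed * (
            1 - 1 / (9 * observed) - z / (3 * math.sqrt(observed))
        ) ** 3 / expected
    o1 = observed + 1
    upper = o1 * (1 - 1 / (9 * o1) + z / (3 * math.sqrt(o1))) ** 3 / expected
    return sir, lower, upper


# Hypothetical counts: 120 observed cases where the risk model expected 100.
sir, lower, upper = sir_with_ci(120, 100.0)
print(f"SIR={sir:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```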
We demonstrate that with appropriate data it is possible to translate genetic risk into clinically meaningful ARS that robustly replicate in diverse contexts. Our results also demonstrate the challenges arising from variation across ethnicity, geography and time and the need for population-relevant information on which risk prediction tools are to be applied.
Rachel Moore
Genomics plc
The Kidney genome atlas reveals a novel locus on chromosome 14 associated with adult proteinuric kidney diseases
Chronic kidney disease (CKD) affects 1 in 9 people worldwide. There is a high unmet need for drugs that extend and restore kidney function, because dialysis and organ transplantation carry substantial economic and psychological burden. To foster drug development of genetically validated targets, we have created the Kidney Genome Atlas (KGA) by assembling ~23,000 whole genomes from 2,832 kidney disease cases including proteinuric kidney disease cases such as focal segmental glomerulosclerosis (571 cases), minimal change disease (244 cases), nephrotic syndrome (196 cases) and idiopathic proteinuria (1,123 cases) and 19,804 controls. Following the gnomAD pipeline, we implemented a rigorous quality control procedure to obtain a high confidence dataset for downstream analyses of proteinuric kidney diseases. Ancestries were inferred genetically based on a k-NN model trained on 1000 Genomes data, which resulted in 597 cases and 10,127 controls of European (EUR) ancestry, 513 cases and 3,805 controls of African (AFR) ancestry, and 290 cases and 754 controls of Latino/Admixed American (AMR) ancestry for association testing. Meta-analysis of common variants across ancestries showed minimal impact of potential confounders, such as ancestry or sequencing center differences (lambda=1.03). We identified a novel locus on chromosome 14 (rs11160484; effect size = -0.42, P = 2.8×10⁻⁸) associated with proteinuric kidney disease. In addition, we confirmed the well-known association of APOL1 risk haplotypes (G1/G1, G2/G2 or G1/G2; effect size = 0.50, P = 2.4×10⁻⁹, under recessive model) in the AFR cohort. LD-score regression analysis revealed a trend towards a weak positive genetic correlation (rg = 0.097, 90% CI [0.010, 0.18]) between proteinuric kidney diseases and CKD defined by estimated glomerular filtration rate or eGFR (Wuttke et al., 2019).
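The lambda quoted for the meta-analysis is the genomic inflation factor, conventionally the median observed 1-df chi-square statistic divided by its expected median under the null (about 0.455); values near 1 suggest little confounding. A minimal sketch of that conventional calculation from p-values follows. It is an illustration on synthetic p-values, not the authors' code, and the function name is hypothetical.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(p_values):
    """Genomic inflation factor (lambda) from two-sided p-values in (0, 1]."""
    nd = NormalDist()
    # A two-sided p-value maps to a 1-df chi-square statistic via z**2,
    # where z is the normal quantile at p/2.
    chi2 = [nd.inv_cdf(p / 2) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Synthetic null p-values on an even grid: lambda should be close to 1.
null_p = [(i + 0.5) / 1000 for i in range(1000)]
lam = genomic_inflation(null_p)
print(f"lambda = {lam:.3f}")
```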
Using summary statistics from our EUR dataset, we estimated the SNP heritability of proteinuric kidney diseases at 0.15 (95% CI [0.095, 0.20]), suggesting that there may be many more genetic contributions that are yet to be discovered. These findings advance our understanding of the genetic architecture of proteinuric kidney diseases and highlight an opportunity for novel therapies and patient stratification.
Eva Fast
Goldfinch Bio