ASHG 2021 Sessions

  • Register
    • Regular Member - Free!
    • Early Career Member - Free!
    • Resident/Clinical Fellow Member - Free!
    • Postdoctoral Fellow Member - Free!
    • Graduate Student Member - Free!
    • Undergraduate Student Member - Free!
    • Emeritus Member - Free!
    • Life Member - Free!
    • Trainee Member - Free!

The ASHG 2021 Annual Meeting was held virtually from October 18-22, 2021. In this package, you can view member-recommended sessions from the meeting.

For more information on the 2021 Annual Meeting, visit our website

  • Contains 1 Component(s)

    Speakers provide an in-depth look at the newly resolved regions of the human genome.

    Since the initial release of the human genome sequence 20 years ago, human chromosomes have remained unfinished due to large regions of highly identical repeats clustered within centromeres, regions of segmental duplication, and the acrocentric short arms of chromosomes. However, recent advances in long-read sequencing technologies and associated algorithms have now made it possible to systematically assemble these regions from native DNA for the first time. 

    In this session, we will present the first complete sequence of a human genome and provide an in-depth look at the newly resolved regions, their variation across individuals, and the resulting impact on human health, disease, and evolution. Our first speaker is a co-lead of the Telomere-to-Telomere (T2T) Consortium (https://sites.google.com/ucsc.edu/t2tworkinggroup), and he will introduce the session by unveiling the complete human genome and explaining the efforts to sequence, assemble, and validate the genome assembly. Our second speaker will present the genetic and epigenetic maps of all human centromeric regions and discuss their evolution across the hominid phylogeny over the last 25 million years. Our third speaker will focus on the segmental duplications found within the genome and discuss their transcriptional and epigenetic status. Finally, our fourth speaker will present the human methylome, with a particular focus on the epigenetic profile of newly resolved regions. At the end of the session, we will host a panel discussion to allow for a Q&A between the audience and each of our four speakers.

    Recorded session from the 2021 virtual meeting.

  • Contains 1 Component(s)

    This panel will examine workforce diversity initiatives and practices that aim to redress inequities.

    A resounding call for increased workforce diversity has been made in the genomics research community in recent years (Green et al. 2020; Channaoui et al. 2020). Recognizing the lack of diversity in both research participants and in the genomics workforce, workforce diversity initiatives strive to train and retain diverse members of the scientific community such that scientific fields are more inclusive and better represent racial, ethnic, sexual, gender minority, and differently abled groups. The NHGRI 2020 Strategic Vision, for instance, articulates how building a diverse genomics workforce will be a key priority “to promote workforce diversity, leadership in the field, and inclusion practices.” A broad literature demonstrates the lack of diversity among NIH-funded investigators, even as research has demonstrated that researchers from underrepresented groups develop novel scientific projects at higher rates (Hofstra et al. 2020).

    This panel will examine workforce diversity initiatives and practices that aim to redress inequities that have excluded underrepresented groups and communities of color from genomics leadership and the workforce more broadly. Drawing on empirical cases and the experiences and perspectives of researchers and program leaders on initiatives aimed at increasing diversity and inclusion in the field, this panel will discuss how definitions of diversity, commitments to diverse experiences, distribution of resources and infrastructures, and professional networks directly impact equitable diversification of the workforce. This panel will consider how an equity framework can be brought to bear on questions of what workforce diversity efforts can and should accomplish, who should be responsible for such initiatives, and what a sustainable, lasting commitment to workforce diversity means for the genomics community moving forward.

    Recorded session from the 2021 virtual meeting.

  • Contains 1 Component(s)

    Findings from ELSI studies examining returning clinical PRS across diverse populations.

    Many polygenic risk scores (PRS) have been published with an eye towards clinical implementation. However, little work has been done on the social and ethical considerations of calculating and returning PRS, particularly across genetic ancestral backgrounds. 

    This session reports findings from embedded ELSI studies examining social and ethical considerations of returning clinical PRS across diverse populations. The panel will advance our understanding of critical issues that must be addressed to maximize potential benefits of clinical PRS. Following an introduction to the topic, Maya Sabatello describes the views of patients, clinicians and IRB members about challenges translating PRS research into improved care and strategies to promote health equity. Broadening our understanding of variation in stakeholder views, Sabrina Suckiel highlights English- and Spanish-speaking patients’ perceptions of clinical utility of PRS, preferences regarding return of information and potential barriers to uptake.

    The format of PRS results can impact patient and provider understanding of risk and responsiveness to corresponding recommendations. Anna Lewis discusses research on stakeholder preferences regarding various formats of return and the potential impacts on use and understanding. Finally, Ellen Clayton presents data on the role of patient education to ensure researchers understand racial/ethnic minority views on clinical PRS. Following discussion, closing remarks will highlight the utility of embedded ELSI projects within large-scale PRS or genomic studies and offer recommendations for future research. These studies, embedded in the Electronic Medical Records and Genomics (eMERGE) IV Network, were designed to inform return of actionable PRS for common complex diseases to patients and their healthcare providers.

    Recorded session from the 2021 virtual meeting. 

  • Contains 1 Component(s) Recorded On: 10/20/2021

    This session will discuss ethnically diverse populations and health equity in genomic medicine.

    This session will focus on opportunities, challenges and ethical issues related to studying ethnically diverse populations to improve discovery and health equity in genomic medicine. Here we gather scientists from across the globe with experience in conducting genomic research in populations under-represented in human genetics research. 

    Speakers will address research on genetic risk factors for medical phenotypes of particular relevance to the study populations. We will also discuss ethical, legal, and social issues (ELSI) that arise when conducting genomic research in Indigenous communities, and ways in which we can achieve more inclusive and equitable research and ensure benefit sharing. We will have four 15-minute presentations followed by a 30-minute panel discussion.

    Our session will start with a presentation discussing studies of pharmacogenetic variation in Indigenous peoples from South America and implications for personalized medicine in these populations. Our second speaker will describe results of a multi-ethnic genome-wide association study (GWAS) of differentiated thyroid cancer (DTC) in Melanesians from New Caledonia and Polynesians from French Polynesia, two populations with the highest incidence of DTC worldwide. The speaker will illustrate the impact of genetic studies of DTC risk on community health in Oceanian populations. The next speaker will follow with the promising future for genetic discovery that can be achieved by studying African populations that have high levels of genomic and phenotypic diversity. This speaker will also illustrate how the study of ethnically diverse African populations has shed light on the genetic basis of hearing impairment, resulting in identification of multiple novel genes influencing hearing loss. Our last speaker will discuss ethical perspectives and the challenges of conducting genomic research in Indigenous populations from North America, the potential benefit for personalized medicine, and the importance of creating a partnership with Indigenous communities.

    The panel discussion, which will include the two moderators and audience participation, will focus on how studies of ethnically diverse populations are of benefit to the global medical genetics community. We will further discuss ethical issues that arise from the consequences of research that stigmatizes Indigenous communities and will touch on principles of how to conduct research in minority and Indigenous populations in an ethical manner.

    Recorded session from the 2021 virtual meeting.

  • Contains 1 Component(s)

    This session will introduce recent efforts to level ancestry imbalance in genomic research.

    The success of genome-wide association studies (GWAS) in humans has yielded a wealth of clues about the molecular basis of many common human diseases. In addition, polygenic risk scores (PRS) for a variety of traits are increasingly becoming accurate enough to be useful for clinical practice, realizing the longstanding goal of personalized medicine. However, data collection continues to be predominantly imbalanced towards individuals of European ancestry, and it is abundantly clear that methods developed in one human ancestry group do not perform well in other ancestry groups, limiting their utility and exacerbating already severe health disparities. The speakers in this session will introduce recent efforts to level ancestry imbalance in genomic research, including the formation of large collaborative efforts and the development of novel statistical methods.

    Recorded session from the 2021 virtual meeting.

  • Contains 1 Component(s)

    Speakers discuss the molecular basis of craniofacial development.

    Platform sessions are abstract-driven sessions with 6 talks per session. These talks are 10 minutes in length and are cross-topical in nature to represent the broad discipline that our field of genetics and genomics represents. After each talk, there will be a 5-minute Q&A with each speaker. For information on each individual session, please view the "Details" tab.

    Recorded session from the 2021 virtual meeting.

    HDAC9 structural variants disrupting TWIST1 transcriptional regulation lead to craniofacial and limb malformations

    Structural variants (SVs), such as insertions, deletions, duplications, translocations and inversions, are associated with human disorders. SVs can affect protein coding sequences as well as gene regulatory elements. However, SVs disrupting protein coding sequences that also function as cis regulatory elements remain largely uncharacterized. Here, we show that, in craniosynostosis patients, SVs containing the Histone deacetylase 9 (HDAC9) protein coding sequence are associated with disruption of TWIST1 regulatory elements that reside within the HDAC9 sequence. Using epigenetic marks and in vivo enhancer assays, we characterized six craniofacial TWIST1 enhancers located in the TWIST1-HDAC9 locus. Based on SVs within the HDAC9-TWIST1 locus, we defined the 3' HDAC9 sequence (~500 kb) as a critical Twist1 regulatory region. By deleting Twist1 enhancers within the Hdac9 protein coding sequence in mice (eTw5-7Del/Del), we showed that Twist1 expression was decreased, resulting in a smaller, asymmetric skull and polydactyly. Furthermore, deletion of a Ctcf site (CtcfDel/Del) within the Hdac9 protein coding sequence disrupted Twist1 enhancer-promoter interactions and altered Twist1 expression, which led to a deformed skull and hindlimb polydactyly, resembling the Twist1+/- mouse phenotype. Deletions of Twist1 regulatory elements altered the distinct anterior/posterior expression patterns of Shh pathway genes, including Hand2 and Alx4. Using UMI-4C, we demonstrated that both the enhancers and the Ctcf site region interact with the Twist1 promoter region. These interactions are dependent on the presence of both regulatory regions, indicating a specific chromatin conformation of Hdac9 in regulating Twist1 expression. Finally, a large inversion of the entire Hdac9 sequence (Hdac9INV/+) that does not disrupt Hdac9 expression but rather repositions Twist1 regulatory elements showed a decrease in Twist1 expression that led to a subtle craniofacial phenotype and hindlimb polydactyly.
    Thus, our study elucidated essential components of the TWIST1 transcriptional machinery that reside within the HDAC9 sequence, suggesting that SVs encompassing a protein coding sequence, such as HDAC9, could lead to a phenotype that is not attributed to its protein function but rather to a disruption of the transcriptional regulation of a nearby gene, such as TWIST1.

    Ramon Y Birnbaum
    Ben Gurion University

    Kate Wilson
    Oxford Centre for Genomic Medicine, Oxford University Hospitals NHS Foundation Trust

    De novo mutations identified in nonsyndromic cleft lip/palate families from Africa

    Background: Despite successes in the investigation of de novo mutations (DNMs) in the etiology of some birth defects (autism, congenital heart defects), only a limited number have been reported for nonsyndromic cleft lip with or without cleft palate (NSCL/P), the most common craniofacial birth defect. To identify high impact DNMs controlling risk of NSCL/P, we conducted whole genome sequencing (WGS) analyses of case-parent trios from an understudied population.
    Method: A total of 150 nsCL/P African case-parent trios were sequenced for this study. Each trio comprises an affected child (with nsCL/P) and unaffected parents; trios were recruited from Ghana and Nigeria. Saliva samples were collected from these individuals and their genomes were sequenced from extracted DNA. After quality control, we screened the genomes of the remaining 130 trios for high impact DNMs possibly contributing to risk of nsCL/P. We used bioinformatic prediction tools to identify those mutations predicted to damage protein structure and function.
    Results: We identified 110 potential pathogenic DNMs. These include novel loss of function (LOF) variants in the TTN, MINK1 and ARHGAP10 genes, and missense variants in DHRS3, TULP4, SHH, TP63, FKBP10, ACAN, RECQL4 and KMT2D. These variants are predicted to be damaging and are among the most deleterious (top 1%) mutations in the human genome. Experimental evidence in published works showed that the TTN, SHH, TP63, FKBP10, ACAN, RECQL4 and KMT2D genes are involved in facial development and in the etiology of syndromic CL/P, while DHRS3, SHH and TP63 contribute to the risk of nsCL/P. Interestingly, our SHH de novo variant p.Ser362Leu has been reported to cause holoprosencephaly 3 (HPE3), a syndromic form of CL/P. Damaging mutations in the DHRS3 gene affect retinoic acid signaling during embryogenesis, which causes cleft palate. Association studies have identified TULP4 as a potential cleft candidate gene, while ARHGAP10 interacts with CTNNB1 to control WNT signaling. MINK1 plays a role in cell-cell adhesion and migration, and causes abnormal tooth morphogenesis in mice. Our gene-set enrichment analysis identified additional genes that are important in palatal development. These include DLX6 and EPHB2, both of which harbored novel damaging DNMs.
    Conclusion: Our WGS adds to the available data on African populations (a historically underrepresented group in genetic studies) and has identified novel pathogenic de novo variants that may contribute to the developmental pathogenesis of nsCL/P. These findings demonstrate the power of WGS analysis of trios for discovering potential pathogenic variants.

    Waheed Awotoye
    Iowa Institute for Oral Health Research, University of Iowa

    Identification of novel molecular pathways in syndromic orofacial clefting

    Background: Syndromic orofacial clefting (OC) accounts for 30% of cleft lip and/or palate. An updated review of molecular pathways associated with syndromic OC is unavailable. The Deciphering Developmental Disorders (DDD) study provides a source of quality data to assemble this information. The Genomics England PanelApp is a publicly available database of curated virtual gene panels and is a valuable reference tool for genes associated with syndromic OC.
    Aim: To investigate molecular pathways associated with syndromic OC by reviewing the results of exome sequencing (ES) and exon-arrayCGH in a large cohort of patients with syndromic OC.
    Methods: Patients with the HPO terms ‘cleft’ and ‘bifid uvula’ were identified through a Complementary Analysis Project within the DDD study. Possible diagnostic variants were identified by automated variant filtering and manual review. Single nucleotide variants (SNVs) within known disease-causing genes and copy number variants (CNVs) were classified according to the ACMG guidelines, the ACGS Best Practice Guidelines and consensus opinion. Functional analyses of identified genes were performed within STRING, Cytoscape and MCODE. Associated phenotypes were explored using the International Mouse Phenotyping Consortium. Gene expression analyses were performed within GENE2FUNC.
    Results: 603/13612 (4.4%) patients were identified, of whom 453/603 (75.1%) had trio ES. Pathogenic (P) or likely pathogenic (LP) variants were identified for 220/603 (36.5%) patients in 124 known disease-causing genes, with SATB2 the most common (16/220, 7.3%). 23/220 (10.5%) patients had a P or LP CNV of partial or full contribution to their phenotype. 35/124 genes fulfilled criteria to be added to the PanelApp ‘Clefting’ panel, increasing the size of the current panel by 23.8%. Gene ontology and pathway analyses identified novel molecular networks for syndromic OC which were distinct from those in non-syndromic OC. Gene expression analyses and investigation of knockout phenotypes also showed a distinction between syndromic and non-syndromic OC. Pathway and expression analyses showed an enrichment of genes associated with intellectual disability (FDR=2.8x10^-33), RNA metabolism (FDR<3.5x10^-21), transcription (FDR<2.3x10^-20) and chromatin organisation (FDR=1.03x10^-11).
    Conclusion: This study demonstrates the utility of ES and CNV analysis for patients with syndromic OC and increases the diagnostic rate for this patient cohort. It also highlights novel molecular pathways specific to syndromic OC and enhances our understanding of lip and palate development. 
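
    The proportions quoted in the Results above can be re-derived from the reported counts. The following is a minimal sketch for readers who want to check the arithmetic; the variable names are illustrative, and the implied panel size is an inference from the stated 23.8% increase, not a figure given in the abstract.

```python
# Counts as reported in the abstract's Results paragraph.
total_cohort = 13612      # patients screened in the DDD study
with_cleft_terms = 603    # patients with 'cleft' / 'bifid uvula' HPO terms
trio_es = 453             # of those, patients with trio exome sequencing
diagnosed = 220           # patients with a P or LP variant
satb2 = 16                # diagnosed patients with a SATB2 variant
cnv = 23                  # diagnosed patients with a P or LP CNV
genes_added = 35          # genes meeting criteria for the 'Clefting' panel


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


proportions = {
    "identified": pct(with_cleft_terms, total_cohort),
    "trio_es": pct(trio_es, with_cleft_terms),
    "diagnosed": pct(diagnosed, with_cleft_terms),
    "satb2": pct(satb2, diagnosed),
    "cnv": pct(cnv, diagnosed),
}

# Inference only: if 35 new genes enlarge the panel by 23.8%,
# the panel held roughly 35 / 0.238 genes beforehand.
implied_panel_size = round(genes_added / 0.238)

for name, value in proportions.items():
    print(f"{name}: {value}%")
print(f"implied prior panel size: ~{implied_panel_size} genes")
```

    Each computed percentage rounds to the value quoted in the abstract.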

    RERE deficiency contributes to the development of orofacial clefts in humans and mice

    Deletions of chromosome 1p36 are the most common telomeric deletions in humans and are associated with an increased risk of orofacial clefting. Deletion/phenotype mapping, combined with data from human and mouse studies, suggests the existence of multiple 1p36 genes associated with orofacial clefting including SKI, PRDM16, PAX7, and GRHL3. The arginine-glutamic acid dipeptide (RE) repeats gene (RERE) is located in the proximal critical region for 1p36 deletion syndrome and encodes a nuclear receptor co-regulator. Pathogenic RERE variants have been shown to cause neurodevelopmental disorder with or without anomalies of the brain, eye, or heart (NEDBEH), but have not been shown to cause orofacial clefting. Here we report the first individual with NEDBEH to have a cleft palate. We confirm that RERE is broadly expressed in the palate during mouse embryonic development, and we demonstrate that the majority of RERE-deficient mouse embryos on a C57BL/6 background have cleft palate. We go on to show that ablation of Rere in cranial neural crest cells, mediated by a Wnt1-Cre, leads to delayed elevation of the palatal shelves and cleft palate, and that proliferation of mesenchymal cells in the palatal shelves is significantly reduced in Rereflox/flox;Wnt1-Cre embryos. We conclude that loss of RERE function contributes to the development of cleft palate in individuals with proximal 1p36 deletions and NEDBEH, and that RERE expression in cranial neural crest cells and their derivatives is required for normal palatal development.

    Bum-Jun Kim
    Molecular & Human Genetics, Baylor College Medicine

    Single cell transcriptomics-directed investigation of de novo mutations and rare inherited genetic variants in cleft lip and palate

    Cleft lip and palate (CL/P) is one of the most common congenital anomalies. The etiology of CL/P is complex with both environmental and genetic risk factors. While previous studies have identified several CL/P-associated genes or regions, only a fraction of all cases can be clearly attributed to specific genes. Additional genetic causes may be due to rare inherited variants (RIVs) or de novo mutations (DNMs) in simplex CL/P cases. To investigate this further, we performed single cell RNA sequencing on epithelial cells of the lambdoidal junction (λ) from gestational day (E)10.5 wildtype mouse embryos at the point of upper lip fusion. We identified six cell clusters, and using the top 150 differentially expressed genes from each, we carried out targeted evaluation of both DNMs and RIVs in whole genome sequences from 756 CL/P case-parent trios of Asian, Latino, and European ancestry. For each cluster we analyzed enrichment of nonsynonymous, loss-of-function (LOF), and protein altering (nonsynonymous + LOF) DNMs. Of these, the olfactory epithelium cluster was enriched for protein altering variants (p=0.005) and the periderm cluster was enriched for nonsynonymous variants (p=0.005). We then evaluated exonic RIVs as defined by a minor allele frequency of <0.05% in gnomAD. A total of 2,976 variants were identified, with subsequent filtering for gene and variant specific constraint resulting in 445 variants of interest in 109 genes. Among these were several CL/P risk associated genes, including IRF6, TFAP2A, and GRHL3, all of which contained both DNMs and RIVs, suggesting other genes identified via this method may also be significant in risk for CL/P. Although all clusters were evaluated, the λ-fusion effector cluster was of specific interest given its critical role in prominence fusion during normal craniofacial development. This cluster harbored variants of interest in 31 genes, a higher percentage than other clusters (23% vs. 9-19%).
    Further, gene ontology revealed a group of 14 genes enriched for terms related to transcription and, interestingly, both negative regulation of epithelial cell proliferation (FDR 0.015) and positive regulation of mesenchymal cell proliferation (FDR 0.049). Among this group was transcription factor ZFHX3, which contained the most variants including loss-of-function DNMs and RIVs; thus, it remains of high interest for novel CL/P risk association. This investigation illustrates the benefit of integrating genomic technologies to prioritize and identify novel genetic associations with risk to CL/P. Continued evaluation focused on gene interactions and pathways represented by these variants may further elucidate CL/P etiology.

    Kelsey Robinson
    Emory University

    A novel 3 Mb deletion on 6p24 removes distant neural-crest enhancers controlling TFAP2A resulting in mild Branchiooculofacial syndrome in a multiplex family with orofacial clefting

    In a previous effort to characterize large copy number variants as genetic risk factors for orofacial clefting (OFC) following a genome-wide association study, we identified a novel 3 Mb deletion on 6p24 in three affected members of the same family from Colombia. The reported pedigree had a dominant inheritance pattern and included 5 individuals with OFC. All affected family members carried the deletion, which was inherited from the proband’s unaffected grandmother. Supporting the pathogenicity of this deletion, exome sequencing of this family found no segregating single nucleotide variants in OFC candidate genes. The 3 Mb deleted region included 12 genes, none of which are strong OFC candidates, and an 840 kb gene desert. We observed that the 3’ breakpoint of this deletion occurred within a large non-coding region directly adjacent to the important developmental transcription factor TFAP2A. TFAP2A, through loss of function, is one of the primary genes associated with Branchiooculofacial syndrome (BOFS), which includes OFC in conjunction with branchial and ocular abnormalities. Affected members of this family display OFC, broad nasal root, and slight hypotelorism characteristic of BOFS. Cell type-specific enhancers in the genomic region directly flanking TFAP2A have been previously implicated in a subset of BOFS patients; however, these enhancers remain intact in this family. We hypothesized that this deletion contains additional regulatory elements of TFAP2A, resulting in attenuated expression of this gene and a milder form of BOFS. To address this hypothesis and predict enhancer-gene interactions at this region, we used an activity by contact model (ABC-Enhancer) to integrate ChIP-seq based chromatin state annotations, chromosome conformation (Hi-C), and gene expression (RNA-Seq) from primary human craniofacial tissues and a culture model of cranial neural crest cells (CNCCs).
    We identified closely clustered, strong enhancer states that were predicted to have interactions with TFAP2A over a distance of greater than two megabases. To validate these predictions in vivo, we used CRISPR-Cas9 in human embryonic stem cells to create homozygous deletions of a 25 kb region encompassing the strong enhancer segments. We differentiated these lines to CNCCs and compared gene expression and proliferative capacity to wildtype. We found this region is essential for cell viability during CNCC differentiation, and differential gene expression analysis revealed substantial effects on TFAP2A expression. In summary, we identified a region within the inherited deletion that regulates TFAP2A gene expression and could contribute to the mild BOFS phenotype in this family.

    Tara Yankee
    Department of Genetics and Genome Sciences, UConn Health Graduate Program in Genetics and Developmental Biology

  • Contains 1 Component(s)

    Speakers discuss genomics in Africa.

    Platform sessions are abstract-driven sessions with 6 talks per session. These talks are 10 minutes in length and are cross-topical in nature to represent the broad discipline that our field of genetics and genomics represents. After each talk, there will be a 5-minute Q&A with each speaker. For information on each individual session, please view the "Details" tab.

    Recorded session from the 2021 virtual meeting.

    Revisiting the out of Africa event with a deep learning approach

    Anatomically modern humans evolved around 300 thousand years ago in Africa. Modern humans started to appear in the fossil record outside of Africa about 100 thousand years ago, though other hominins existed throughout Eurasia much earlier. Recently, several researchers argued in favour of a single out of Africa event for modern humans based on whole-genome sequence analyses. However, the single out of Africa model is in contrast with some of the findings from fossil records, which support two out of Africa events, and uniparental data, which propose a back to Africa movement. Here, we used a deep learning approach coupled with Approximate Bayesian Computation and Sequential Monte Carlo to revisit these hypotheses from the whole genome sequence perspective. Our results support the back to Africa model over other alternatives. We estimated that there were two successive splits between African and out of Africa populations, happening around 60-90 thousand years ago and separated by 13-15 thousand years. One of the populations resulting from the more recent split has to a large extent replaced the older West African population, while the other one has founded the out of Africa populations.

    Mayukh Mondal
    Institute of Genomics, University of Tartu

    An analysis of population copy number variation in sub-Saharan African genomes


    Introduction: Copy number variation (CNV) is responsible for a large component of normal human variation and has been implicated in the genetic aetiology of several rare diseases. Population reference databases containing CNV information from all global populations are critical in disease genetics research, but current resources lack diversity, especially from the African continent. This makes such databases of limited use in studies looking at genetic diseases in African individuals. This study therefore aims to address this knowledge gap by producing a map of CNV using whole-genome data from several previously unstudied African populations.
    Methods: 1027 high coverage whole genome sequences obtained from individuals across west, central, southern and east Africa were analysed using Manta and Graphtyper2. Additionally, 919 of the samples were analysed using Genome STRiP to detect multi-allelic CNV. Quality control specific to each tool was performed in order to achieve high quality variant call sets.
    Results: 56 816 CNVs were detected by the Manta pipeline, consisting of 44 671 deletions and 12 145 duplications. Due to the ability of Manta to detect small variants (<100 bp), we are able to describe this previously less studied class of variants in an African cohort. 25% of the variants detected by Manta were <100 bp and 40% of these were common variants at >5% allele frequency. 50% of these variants are novel compared to 27% of the remaining variants >100 bp. Overall, 32% of the variants identified were novel. A comparison between central, west, east and southern African regions yielded a number of variants unique to each region. We find that deletions tend to have lower allele frequencies compared to duplications. The majority of variants were found in the non-coding genome, with only 8% of variants overlapping coding transcripts. An additional 5% of variants overlapped regulatory features. Genome STRiP detected 3991 multi-allelic variants with 99% having a copy number between 3-20. There were also variants with copy numbers greater than 20, some of which appear to be instances of excessive runaway duplications not previously described.
    Conclusion: The amount of novel variation found demonstrates the importance of including African individuals from multiple African regions when producing reference databases and the rich genomic diversity of African genomes. Work is currently being performed to combine the full Genome STRiP and Manta call sets to produce a robust combined dataset. The variant database produced in this study will provide a valuable resource as a reference of normal CNV for the study of diseases in African populations.

    Emma Wiener
    Division of Human Genetics, National Health Laboratory Service & School of Pathology, Faculty of Health Sciences, University of Witwatersrand
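
    The Manta figures reported in the CNV abstract above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the variable names are hypothetical, and it assumes that the 50% and 27% novelty rates apply to the <100 bp and remaining variants respectively, and that the overall 32% figure refers to the same Manta call set.

```python
# Manta call-set counts as reported in the abstract.
deletions = 44_671
duplications = 12_145
reported_total = 56_816

# The two classes should add up to the reported total.
computed_total = deletions + duplications

# Reported fractions (rounded to whole percentages in the abstract).
frac_small = 0.25          # share of variants < 100 bp
novel_small = 0.50         # share of small variants that are novel
novel_large = 0.27         # share of remaining variants that are novel

# Overall novelty is the size-weighted average of the two classes.
overall_novel = frac_small * novel_small + (1 - frac_small) * novel_large

print(f"deletions + duplications = {computed_total}")
print(f"weighted novel fraction = {overall_novel:.4f}")
```

    The weighted estimate lands within about one percentage point of the 32% quoted in the abstract, which is plausible given that the input percentages are themselves rounded.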

    Integrative genomic analyses identify key interethnic differences in immune response to malaria

    Host responses to infection with the malaria parasite P. falciparum vary between individuals for reasons that are poorly understood. Here we reveal metabolic perturbations as a consequence of malaria infection in children and identify an immunosuppressive role of endogenous steroid production in the context of P. falciparum infection. We perform metabolomics on matched samples from children from two ethnic groups in West Africa, before and after infection with seasonal malaria. Analyzing 306 global metabolomes, we identify 92 parasitemia-associated metabolites with impact on the host adaptive immune response. Integrative metabolomic-transcriptomic and causal mediation-moderation analyses reveal an infection-driven immunosuppressive role of parasitemia-associated pregnenolone steroids on lymphocyte function and the expression of key immunoregulatory lymphocyte genes in the Gouin ethnic group. In children from the less malaria-susceptible Fulani ethnic group, we observe opposing responses upon infection, consistent with the immunosuppressive role of endogenous steroids in malaria. These findings advance our understanding of P. falciparum pathogenesis in humans and identify potential new targets for antimalarial therapeutic interventions.

    Youssef Idaghdour
    New York University Abu Dhabi


    GWAS of complex traits in a multi-population African cohort

    The diversity among present-day African populations is the result of a deep and complex history of admixture, migrations, and regional adaptations to local environments and diseases. Little is known about the impact of this evolutionary history on the genetics underlying complex traits. Here I present recent work on genetic associations for a panel of anthropometric, cardiovascular, and metabolic biomarker measurements paired with dense genotyping data. For some traits, the variation among populations is expected to reflect local adaptations, such as short stature in western Congo rainforest hunter-gatherers. The study cohort of several thousand individuals is drawn from an ancestrally diverse set of populations from western, eastern, and southern sub-Saharan Africa. Populations include current or recent hunter-gatherers, traditional agriculturalists, and semi-nomadic pastoralists, from rural regions of Cameroon, Nigeria, Ethiopia, Kenya, Tanzania, and Botswana. For many of these traits, this marks the first genotype/phenotype analysis to include these ethnic groups. The high degree of population structure presents both challenges and opportunities for genetic analysis. Genetic structure analysis indicates genetic clustering by geographic location, language family, and regional hunter-gatherer lineages. Examples include the hunter-gatherers from the Serengeti, western Congo, and Kalahari, and clusters that correlate with Niger-Congo, Afroasiatic, and Nilo-Saharan language families. We observe substantial population-level variation for many traits, such as height, skin pigmentation, and blood pressure. The proportion of the trait variance that is due to the genetic population structure varies by trait and tends to be greater for anthropometric traits like height and skin pigmentation than for metabolic biomarkers like LDL.
From genotype/phenotype association tests we find numerous independent associations at genome-wide significance for several traits, including circulating triglyceride levels and BMI. The population structure of the total additive genetic effects is also examined. European GWAS associations replicate poorly in this African cohort, while associations discovered in the African cohort show comparatively better replication in Europeans. 

    Matthew Hansen
    Univ Pennsylvania

    Genotype-by-infection interactions: Single cell RNASeq profiling of in-vivo host immune response to malaria reveals cell type and infection-specific eQTLs

    The disease burden of malaria remains a significant global public health challenge. Plasmodium falciparum is responsible for more than 99% of malaria cases in Africa and for more than 400,000 malaria-related deaths per year worldwide. Inter-individual variation in susceptibility to malaria is multifactorial and has a significant heritable component, but our understanding of the effect of infection on gene regulation of the immune response at the transcriptional level remains very limited. Here we use longitudinal matched sampling, single cell RNASeq profiling of PBMCs, and whole-genome sequencing data from children before and after natural P. falciparum infection in Banfora, Burkina Faso, West Africa. In total, we generated ~90,000 single cell RNASeq profiles and identified PBMC cell types affected by infection. Single cell RNASeq eQTL analysis revealed cell type specific eQTLs and genome-wide significant genotype-by-infection interaction effects implicating key immune genes. These results provide the first genome-wide picture of host in vivo regulatory variation events in malaria at the single cell level and highlight the role of regulatory interaction effects in modulating the host immune response in vivo.

    Odmaa Bayaraa
    New York University Abu Dhabi


    Returning secondary genetic findings: Provider perspective in Africa

    Objective: Previous research has shown that lack of resources and knowledge significantly impact the return of genomic test results. However, not much is known about the level of expertise and knowledge of clinicians providing cleft care in Africa on genetic diseases, despite the vast genetic diversity in this population.
    Methods: Providers in participating cleft-craniofacial clinics in Ethiopia, Ghana, and Nigeria were sent the link to a 63-question online survey. This survey assessed the providers' experience with genetic testing, genetics education and return of genetic results, provider knowledge, clinician comfort with returning results, available resources to assist with genomic findings, and potential barriers.
    Results: As of June 2nd, 2021, 246 providers had completed the survey. Only 2% had been involved in the delivery of exome or genome sequencing; 78.6% had no formal genetics education; and 49.6% agreed that all secondary findings should be disclosed to patients. Regarding comfort level, 89.4% were somewhat to extremely comfortable discussing genetic risk factors with patients, and 81.8% were somewhat to extremely comfortable with returning genetic results. Sixty-three percent believed that resources were currently available to enable them to access needed genetic information.
    Conclusion: The returned responses indicate that providers were aware that genetic testing could help in the clinical management of diseases. However, lack of knowledge about genomic medicine, uncertain clinical utility, and lack of available resources were cited as barriers that significantly impacted incorporating genetic testing into their practice. Data collection is ongoing and will continue until July 31st, 2021. This is the first Ethical, Legal, and Social Implications (ELSI) study to document the knowledge and comfort level of cleft providers in Africa. This study will help determine the most beneficial information with which to equip providers for the return of secondary genetic findings.

    Abimbola Oladayo
    University of Iowa

  • Contains 1 Component(s)

    Speakers discuss diversifying data, diagnostics, and treatment options for genetic disease.

    Platform sessions are abstract-driven sessions with 6 talks per session. These talks are 10 minutes in length and are cross-topical in nature, reflecting the breadth of the field of genetics and genomics. Each talk is followed by a 5-minute Q&A with the speaker. For information on each individual session, please view the "Details" tab.

    Recorded session from the 2021 virtual meeting.

    Long-Term systemic expression and cross-correction ability of HMI-203: Investigational gene therapy candidate for mucopolysaccharidosis type II or Hunter Syndrome

    Mucopolysaccharidosis type II (MPS II), or Hunter syndrome, is a rare X-linked lysosomal storage disorder caused by mutations in the iduronate-2-sulfatase (IDS) gene, resulting in loss of iduronate-2-sulfatase (I2S) enzyme activity and leading to toxic lysosomal accumulation of glycosaminoglycans (GAGs) systemically, in both peripheral organs and the central nervous system (CNS). GAGs are large polysaccharides made of repeating disaccharide units responsible for providing structure and hydration to the cell. The disease results in skeletal dysplasia, joint stiffness, organomegaly, airway obstruction and, in severe cases, neurocognitive deficits. Hunter syndrome occurs in approximately 1 in 100,000 to 1 in 170,000 males, and causes significantly reduced lifespan, with the severe form leading to life expectancy of 10 to 20 years. The proposed therapeutic mechanism of gene therapy candidate HMI‑203 is based on both intracellular expression and synthesis of active I2S, as well as high levels of expression and secretion of active I2S enzyme to support cross-correction. Herein, we report preclinical data where a single intravenous dose of HMI-203 delivering human IDS via a rAAVHSC vector in the MPS II murine model resulted in dose-dependent and long-term transduction, IDS expression and I2S enzymatic activity in the evaluated tissues, e.g., liver, brain and serum through 52 weeks post-dose. A significant correlation was observed between liver and serum I2S activity, suggesting that the liver was likely the major contributor to the elevated levels of active I2S in the serum. The circulating I2S protein in the serum was functionally active (i.e., 90 kDa form) and cross-correction activity via a mannose-6-phosphate receptor dependent pathway was demonstrated using an in vitro competition assay.
The robust and broad IDS tissue expression, along with demonstrated cross-correction, significantly reduced GAG heparan sulfate (GAG-HS) to wild type (WT) levels in all evaluated organs associated with the disease, cerebrospinal fluid (CSF) and urine. In addition, lysosomal-associated membrane protein-1 (LAMP1) levels were significantly reduced to WT-like levels in the peripheral organs and CNS tissues. Of note, positive and significant correlations were observed between reduction in GAG-HS and LAMP1 levels in the CNS and brain and CSF GAG-HS levels, suggesting that CSF GAG-HS levels could be indicative of overall brain GAG and lysosomal burden levels in the clinic. Taken together, we have demonstrated that HMI-203 combines transduction and expression with the potential for cross-correction. These HMI-203 IND-enabling studies support HMI-203 as a gene therapy candidate for the treatment of MPS II.

    Kruti Patel
    Homology Medicines, Inc.

    Tasimelteon Safely and Effectively Improves Sleep in Smith Magenis Syndrome: results from a Double-Blind Randomized Trial Followed by an Open-Label Extension

    Smith-Magenis Syndrome (SMS; OMIM #182290) is a rare genetic disorder that results from an interstitial deletion of 17p11.2 and, in rare cases, from a retinoic acid induced 1 (RAI1) gene variant (Slager et al., 2003). Currently, the prevailing theory is that there is an underlying circadian pathophysiology causing sleep disturbances in these patients, as they exhibit low overall melatonin concentrations and abnormal timing of peak plasma melatonin concentrations. This abnormal inverted circadian rhythm is estimated to occur in 95% of individuals with SMS (Boone et al., 2011; Spruyt et al., 2016). To assess the efficacy of tasimelteon, a melatonin receptor agonist, to improve sleep in SMS, a 9-week, double-blind, randomized, two-period crossover study was conducted at four U.S. clinical centers. Genetically confirmed SMS patients, aged 3 to 39, with sleep complaints participated in the study. Patients were assigned to treatment with tasimelteon or placebo for 4 weeks per period, with a one-week washout between treatments. Eligible patients participated in an open-label study and were followed for > 3 months. Improvement of sleep quality (DDSQ50) and total sleep time (DDTST50) on the worst 50% of nights were primary endpoints. Secondary measures included actigraphy and behavioral parameters. Over three years, fifty-two patients were screened and twenty-five patients completed the randomized portion of the study. DDSQ50 significantly improved over placebo (0.4, p=0.0139) and DDTST50 also improved (18.5 min, p=0.0556). Average sleep quality (0.3, p=0.0155) and actigraphy-based total sleep time (21.1 min, p=0.0134) improved significantly, consistent with the primary outcomes. Patients treated for ≥ 90 days in the open-label study showed persistent efficacy. Adverse events were similar between placebo and tasimelteon. Tasimelteon safely and effectively improved sleep in SMS.
The 17p11.2 deletion encompasses RAI1, leading to haploinsufficiency, which is considered the primary cause for most features of SMS, including dysregulation of the molecular clock via its effect on CLOCK expression. ChIP-Chip and reporter studies suggest that RAI1 binds, directly or in a complex, to the 1st intron of CLOCK and enhances its transcriptional activity; accordingly, CLOCK expression is reduced in SMS patient-derived cells (Williams et al., 2012). The results of this study suggest that treatment with a circadian regulator can, in part, ameliorate the circadian deficiencies caused by RAI1 haploinsufficiency, providing further evidence of a critical role for RAI1 in the regulation of circadian rhythms.

    Christos Polymeropoulos
    Vanda Pharmaceuticals Inc.

    Unravelling African genomes: Whole-genome sequencing of 1000 Nigerian samples spanning 50 tribal groups provides new insights into diversity and admixture

    The lack of adequate representation of diverse genomes in human genomics research may limit insights that can be made about variants influencing disease susceptibility and trait variability across populations. We are helping to address this gap by performing germline whole genome sequencing of a Nigerian cohort. Nigeria represents one of the most diverse and populous regions on earth, with a population of over 200 million and over 250 unique tribal groups. We coordinate data generation in Lagos with analysis by staff around the world by leveraging cloud resources and deploying a scalable, robust, portable pipeline for alignment and variant calling. We present results from an initial round of whole-genome sequencing of ~1000 subjects from 50 tribal groups in Nigeria. We describe patterns of variation across tribes including variants of different functional classes and frequencies. We survey patterns of autozygosity across groups and compare these to 1000 Genomes samples. We highlight genetic distances between tribes and reveal evidence of admixture with European and northern African populations. We compare frequencies within our dataset to those reported in publicly available data (e.g. 1000 Genomes) for specific loci of clinical utility, e.g. those associated with drug response, highlighting noteworthy differences. Lastly, we find widespread, tribe-specific differences in allele frequency for medically-relevant variation, underscoring the importance of variant discovery and replication in non-European ancestry cohorts. Our results add to the growing body of genomic data from diverse populations, investigating understudied groups and the unique opportunities for discovery that they represent. We highlight opportunities for precision medicine, and reveal insights about variants of most clinical importance within and between human populations. 

    Colm O'Dushlaine
    54gene

    NOTCH3 p.Arg1231Cys is Present in 1 in 92 Pakistani and Associated with Stroke

    Cerebral Autosomal Dominant Arteriopathy with Subcortical Infarcts and Leukoencephalopathy (CADASIL) is an autosomal dominant Mendelian disorder characterized by early onset of migraine with aura, recurrent stroke, and dementia. Pathogenic CADASIL variants either add or remove a cysteine (Cys+/-) residue in one of 34 epidermal growth factor-like repeats (EGFR) in the extra-cellular domain (ECD) of NOTCH3. Exome-wide association analysis of 4,882 stroke cases and 6,094 controls recruited in the Pakistan Genomic Resource (PGR) from Pakistan identified one such variant, p.Arg1231Cys, associated with subcortical stroke; p value 2.18e-8, odds ratio (OR) 2.97, 95% confidence interval (CI) 2.03 to 4.35, minor allele frequency (MAF) 7.1e-3. Analyses of the larger PGR cohort comprising 80,000 participants identified additional heterozygous and homozygous carriers of this variant; call-back studies of the carriers and their family members identified a high mortality in family members and a high prevalence of stroke. The Cys allele was found to disrupt a highly conserved domain (91% overall sequence identity between human and mouse), was predicted deleterious by PolyPhen2 (score 0.843 of 1), and was risk-increasing (cases MAF 0.016, controls MAF 0.0053). The p.Arg1231Cys variant was observed at a similar MAF in other South Asian populations sequenced by Regeneron Genetics Center, and was present but orders of magnitude rarer in European populations. Despite its rarity in Europe, p.Arg1231Cys was associated with ischemic stroke in 450 thousand UK Biobank (UKB) participants; p value 8.8e-4, OR 3.38, CI 1.65 to 6.94, MAF 2.0e-4. In addition, p.Arg1231Cys was associated with multiple brain MRI phenotypes relevant to CADASIL in a 40K subset of UKB, such as mean diffusivity in the external capsule; p value 5.41e-10, OR 1.4, CI 0.96 to 1.8, MAF 2.8e-4.
Consistent with CADASIL pathogenicity, a burden test limited to Cys+/- variants in the NOTCH3 ECD (including p.Arg1231Cys) strengthened associations in both Pakistan (subcortical stroke p value 1.5e-10, OR 3.39, CI 2.32 to 4.91) and UKB (ischemic stroke p value 9.3e-8, OR 3.38, CI 1.74 to 2.98). In both cohorts, p.Arg1231Cys was the most common Cys+/- variant in the NOTCH3 ECD. Taken together, these findings have major implications for precision medicine in South Asia, given that an estimated 1 in 92 (over 20 million of 1.9 billion) individuals are carriers of this variant and are at approximately 3-fold elevated risk for stroke. Our estimates suggest that around 2% of strokes in Pakistan may be attributable to NOTCH3 p.Arg1231Cys.

    Juan L Rodriguez-Flores
    Regeneron Genetics Center
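
The carrier count and attributable fraction quoted in the NOTCH3 abstract above can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the abstract does not say how the "around 2%" estimate was derived, so Levin's population attributable fraction formula is used here as an assumption, with the reported approximately 3-fold odds ratio standing in for a relative risk. The function names are hypothetical.

```python
def carrier_count(population: float, one_in: float) -> float:
    """Expected number of carriers when 1 in `one_in` people carry the variant."""
    return population / one_in


def attributable_fraction(carrier_freq: float, relative_risk: float) -> float:
    """Levin's population attributable fraction: p(RR - 1) / (1 + p(RR - 1)).

    Assumption: the odds ratio from the abstract is used as an approximation
    of the relative risk, which is reasonable only for uncommon outcomes.
    """
    excess = carrier_freq * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Figures quoted in the abstract: 1 in 92 carriers, 1.9 billion people, ~3-fold risk.
carriers = carrier_count(1.9e9, 92)
paf = attributable_fraction(1 / 92, 3.0)

print(f"Estimated carriers: {carriers / 1e6:.1f} million")
print(f"Attributable fraction: {paf:.1%}")
```

Under these assumptions the result is a little over 20 million carriers and an attributable fraction of roughly 2%, in line with the figures in the abstract.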

    A high-resolution panel for uncovering repeat expansions that cause ataxias

    The hereditary ataxias are a group of rare neurological diseases with similar symptoms. Many of these ataxic syndromes are caused by expansions of short tandem repeats (STRs) in a number of different genes. Molecular genetic testing to accurately determine the genetic cause of known ataxias is often employed to support clinical diagnoses. Advances in therapeutic strategies (e.g., antisense oligonucleotides) to target repeat expansions underscore the importance of understanding the genetic context and sequence complexity of ataxic repeat expansions. Further highlighting the importance of molecular genetic testing, several studies have shown that repeat sequence interruptions in certain ataxia expansions play important roles in modifying the penetrance of the disease and age of onset. PCR and Southern blotting assays are currently the most employed methods in commercially available ataxia repeat expansion panels for clinical testing. Although these electrophoresis-based methods can detect repeat expansions above the pathogenic threshold, accurate sizing of the repeat expansion is difficult to achieve when the repeat sequence is longer than a few hundred bases. Sequence interruption information is also not available with these approaches. We have recently developed an ataxia expansion panel using the PacBio No-Amp targeted sequencing approach to capture and sequence repeat expansion loci associated with fifteen ataxia diseases. The method utilizes CRISPR-Cas9 nuclease and pairs of guide RNAs to excise DNA fragments containing the repeat sequences within ataxia genes. This approach eliminates PCR amplification artifacts and amplification bias, and preserves native DNA for base modification detection. In this study, we sequenced samples with known or unknown diagnosis for ataxia with the No-Amp targeted sequencing panel utilizing PacBio highly accurate long reads (HiFi reads).
The high accuracy of HiFi reads provides certainty in both the sizing of the repeat expansion and the detection of repeat sequence interruptions within the expansion sequences. Sequencing results demonstrate the potential of using this repeat expansion panel for eventual genetic testing. As additional ataxias and related neurological diseases caused by STR expansions are discovered and studied, the No-Amp targeted sequencing panel could be expanded to include additional targets. The ability to multiplex samples from different patients also makes the method a potentially cost-effective option for molecular genetic screening in the future.

    Yu-Chih Tsai
    Pacific Biosciences

    Deployment of clinical whole genome sequencing in support of more than 1,000 resource-limited patients: four years of the iHope Program

    Patients with a suspected genetic disease are often unable to obtain a timely molecular diagnosis, and those in resource-limited locations face even greater challenges. Clinical whole genome sequencing (cWGS) shows promise as a comprehensive test which may shorten the diagnostic odyssey regardless of setting. The iHope Program is a philanthropic effort to provide cWGS to patients who are unable to obtain precision testing due to resource limitations.
    From June 2016 through June 9, 2021, 1004 individuals pursued cWGS testing through the iHope Program. Cases were received from 23 partner iHope clinical sites spanning seven countries. Forty percent of cases (n=403) were received from global partners in Mexico (n=205), Peru (n=93), Italy (n=50), Democratic Republic of Congo (n=40), New Zealand (n=10), and the United Arab Emirates (n=5). Most testing was performed on duos and trios. Proband phenotypes were complex, with nervous system, head and neck, skeletal, eye, and digestive the five most frequently identified Human Phenotype Ontology root ancestor terms.
    Variants were reported in 67% (n=677) of cases, and 40.5% of all cases (n=407) received a definitive molecular diagnosis. Reported variants per case ranged from 0 to 5, and in 33 cases (3.3%), multiple molecular diagnoses were observed. Variants spanned 468 unique single genes. Of 1020 reported variants, a majority were nuclear SNVs or MNVs (n=693, 67.9%), followed by CNVs (n=175, 17.2%), small indels (n=127, 12.5%), short tandem repeats (n=12, 1.2%), mitochondrial SNVs (n=10, 1%) and uniparental disomy (n=2, 0.1%). Copy number variants ranged in size from 3 kb to full aneuploidies. In fifteen individuals from eleven families, findings were suggestive of a structural chromosomal rearrangement.
    At least ninety days after cWGS report delivery, a clinical utility survey was requested of the ordering clinician to assess effects on care and management. To date, surveys have been obtained for 581 patients (58%), representing one of the largest cWGS clinical utility datasets in a pediatric outpatient population. Data collection is ongoing, but initial analysis indicates that cWGS results prompted follow-up such as imaging, laboratory or physiological testing, referral for specialty consultation or other evaluations in 40% (233/581) of patients. In 56.6% (329/581), cWGS results contributed to counseling about prognosis, recurrence risks, reproductive screening/testing options and screening/testing recommendations or options for family members. These findings suggest that deployment of cWGS in support of resource-limited patients is tractable globally and can have a substantial impact on patient management. 

    Erin Thorpe
    Illumina, Inc.


  • Contains 1 Component(s)

    Speakers discuss leveraging ancestry and family history to study polygenic risk prediction.

    Platform sessions are abstract-driven sessions with 6 talks per session. These talks are 10 minutes in length and are cross-topical in nature, reflecting the breadth of the field of genetics and genomics. Each talk is followed by a 5-minute Q&A with the speaker. For information on each individual session, please view the "Details" tab.

    Recorded session from the 2021 virtual meeting.

    Local ancestry allows for improved genomic prediction in underrepresented and admixed populations

    Due to the paucity of methodological and computational approaches that account for their genomic complexity, admixed populations are systematically excluded from statistical genomic studies. Admixed populations make up more than a third of the US populace but are severely underrepresented in biomedical research which may contribute to health disparities. To reap the full benefits from the ongoing efforts to collect samples from underrepresented populations and from existing mixed ancestry cohorts, tools facilitating the well-calibrated research of admixed peoples are urgently needed.
    We recently developed a local ancestry aware GWAS model, Tractor, which corrects for fine-scale population structure at the genotype level, often boosts locus discovery power, and produces ancestry-specific effect size estimates and p values. Using Tractor summary statistics from African ancestry (AFR) tracts in ~4500 admixed UK Biobank (UKB) individuals, we built polygenic risk scores (PRS) and predicted blood panel phenotypes on homogeneous African ancestry UKB individuals. We benchmarked these PRS against scores created from traditional GWAS runs on 1) the same admixed cohort, 2) a large European UKB sample, and 3) a large multi-ancestry meta-analysis of continental ancestry groups from the pan-UKB project (https://pan.ukbb.broadinstitute.org/). We also tested the accuracy of several PRS models including pruning and thresholding and PRS-CSx. We find that incorporating diverse samples and ancestry-specific estimates from admixed populations results in higher prediction accuracy for homogeneous AFR individuals. The bulk of African-descent GWAS participants are currently admixed individuals of the Americas, and some underrepresented ancestries are rarely found outside of the admixed context. Thus, building models based on ancestry-specific estimates generated from the deconvolved local ancestry tracts of admixed genomes allows for better PRS performance on many diverse populations by making better use of existing collections.
    We additionally highlight several loci which we find to have well-demonstrated effect size differences across ancestries, a phenomenon for which there are few prior examples in the literature. As our models are constructed from local ancestry components of the same admixed individuals, these results hint at genetic differences rather than environmental factors, which are often tricky to disentangle. Ultimately, our work highlights how Tractor and local ancestry allow for improved population characterization and can be leveraged to advance the understanding of complex diseases across diverse cohorts.

    Elizabeth G Atkinson
    Baylor College of Medicine

    Polygenic risk prediction of obesity across the life course and in diverse populations

    Polygenic risk scores (PRSs) for body mass index (BMI) that leverage the increasing genome-wide association study (GWAS) sample sizes may aid risk stratification and allow targeted prevention of obesity at an early age. We constructed ancestry-specific and trans-ancestral PRSs to predict obesity in adulthood, and examined their added value over and above easily measurable predictors of obesity during childhood and adolescence.
    We calculated PRSs based on summary statistics of up to 1.2 million common variants [minor allele frequency (MAF)>1%] from the GIANT consortium’s BMI GWAS meta-analysis of up to 1.6 million individuals (72% European (EUR), 16% East Asian (EAS), 6% African (AA), 4% Hispanic (HA), 2% South Asian (SAS)). Explained variance for BMI and discrimination for obesity were examined in the UK Biobank (UKB, n=437k) and Million Veteran Program (MVP, n=101k). The best performing PRS in EUR was taken forward to the Avon Longitudinal Study of Parents and Children (ALSPAC, n=5.8k), for cross-sectional and longitudinal associations with BMI across 21 time-points from birth to age 22y. We compared the predictive performance of the PRS to that of clinically available factors (maternal education, pre-pregnancy maternal BMI, household social status).
    The trans-ancestral PRS explained more of the variation in BMI than ancestry-specific PRSs in all but the EUR-populations (R2 min-max for non-EUR; UKB: 7.5-12.4% (AA/EAS); MVP: 5.7-11.1% (AA/HA); UKB-EUR (EUR-PRS): 15.8%; MVP-EUR (EUR-PRS): 13.1%). For all ancestries, maximum explained variance was roughly double that of previously published obesity PRSs. The PRSs were better at discriminating between adults with or without obesity than age, sex, or scores of genome-wide significant variants only. EUR-PRS associations with BMI were weak at birth, but increased rapidly during childhood, and remained stable from adolescence onwards (e.g., BMI-SD per PRS-SD at 13y (95%CI): 0.39 (0.37,0.42)). Consistently, longitudinal modeling of BMI trajectories using the PRS showed increasing divergence until early adolescence. When added to other factors available at birth, the PRS helped predict substantially more of BMI from 5y onwards (e.g., R2 at 5y: 13 to 18%; 11y: 11 to 21%).
    The current PRSs, based on larger GWAS sample sizes, double the previously explained variance for BMI across multiple ancestries, thereby advancing the options for prognostication in populations traditionally underrepresented in genetic research. Moreover, we find that genetic predisposition to adult obesity affects childhood growth trajectories, and shows potential to improve risk stratification for obesity at an early age. 

    Roelof A.J. Smit
    The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai
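
For readers unfamiliar with how a polygenic risk score is assembled, the sketch below shows the basic additive form that the abstracts in this session build on: a weighted sum of allele dosages, standardized so that effects can be read per PRS standard deviation (as in "BMI-SD per PRS-SD" above). It is a minimal illustration, not the authors' pipeline; the variant weights and dosages are made-up values, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def polygenic_score(dosages: list[float], weights: list[float]) -> float:
    """Additive score: sum over variants of (effect-allele dosage x weight)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must cover the same variants")
    return sum(d * w for d, w in zip(dosages, weights))


def standardize(scores: list[float]) -> list[float]:
    """Rescale scores to mean 0 and SD 1 within the sample."""
    mu = mean(scores)
    sd = pstdev(scores)
    if sd == 0:
        raise ValueError("scores have zero variance")
    return [(s - mu) / sd for s in scores]


# Hypothetical per-variant weights (e.g. GWAS effect sizes) for three variants.
weights = [0.10, -0.20, 0.05]

# Hypothetical effect-allele dosages (0, 1 or 2) for four individuals.
cohort = [
    [0, 1, 2],
    [2, 0, 1],
    [1, 1, 1],
    [2, 2, 0],
]

raw = [polygenic_score(person, weights) for person in cohort]
z = standardize(raw)

for r, s in zip(raw, z):
    print(f"raw={r:+.2f}  standardized={s:+.2f}")
```

Methods such as PRS-CS and PRS-CSx, mentioned elsewhere in this session, differ mainly in how the per-variant weights are estimated; the final scoring step keeps this additive form.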

    Improving Polygenic Prediction in Ancestrally Diverse Populations

    Polygenic risk scores (PRS) are less effective when ported across populations. While the scale of non-European genomic resources has been expanded in recent years, a clear attenuation of the predictive performance of PRS remains in individuals who are genetically distant from Europeans.
    In order to include data from all ancestral groups to ensure more equitable delivery of genomic prediction to global populations, we developed the first principled Bayesian PRS construction method, termed PRS-CSx, that jointly models GWAS summary statistics from multiple populations to improve cross-ancestry polygenic prediction. PRS-CSx couples genetic effects across populations via a shared continuous shrinkage prior, enabling more accurate effect size estimation by sharing information between summary statistics and leveraging linkage disequilibrium (LD) diversity across discovery samples, while inheriting computational efficiency and robustness from PRS-CS.
    PRS-CSx outperformed existing PRS methods across various simulation settings with different sample sizes, fractions of causal variants, and genetic correlations between populations. Using quantitative traits from biobanks, we showed that PRS-CSx substantially improved the prediction accuracy even if only a small non-European GWAS was included in the discovery data. For example, the median R2 increased by 76% for individuals of East Asian ancestry when the Biobank Japan samples (N=62K-159K) were added to the UK Biobank European samples (N=340K-360K) to train the PRS. Similarly, the median R2 increased by 22% for individuals of African ancestry when the PAGE study samples (N=20K-50K) were integrated with UK Biobank and Biobank Japan samples (400K-519K).
    Furthermore, by integrating GWAS summary statistics of schizophrenia from East Asian (14K-17K cases due to leave-one-out) and European (33K cases) populations, PRS-CSx more accurately predicted schizophrenia risk in individuals of East Asian ancestry, showing 52% and 97% improvement in the liability R2 relative to PRS constructed using East Asian or European summary statistics only, and approximately doubled the prediction accuracy when compared with alternative methods that can combine multiple GWAS to make prediction.
    Our method represents a much needed and critical breakthrough in PRS construction. Through joint modeling of multi-ancestry data, PRS-CSx substantially improves polygenic prediction in non-European populations. With the rapid expansion of non-European genomic resources, our method will help accelerate the equitable deployment of PRS in clinical settings and maximize its healthcare potential. 

    Yunfeng Ruan
    Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard

    A trans-ancestry polygenic test to predict severe hypercholesterolemia in diverse ancestry patients

    Approximately 7% of adults have severe hypercholesterolemia (SH; untreated low-density lipoprotein cholesterol (LDL-C) ≥ 190 mg/dL). SH is associated with a 6-fold increased risk of cardiovascular disease, and up to 20-fold increased risk in individuals identified with monogenic Familial Hypercholesterolemia (FH)-associated variants. Despite high frequency of cholesterol screening and awareness, individuals with SH remain undertreated, with disparities in treatment and LDL-C control observed among African American (AA) populations. Only 2.5% of individuals with SH harbor a monogenic FH-associated variant, and polygenic SH accounts for 15%-30% of clinical FH, motivating the development of a polygenic test for predicting SH in diverse populations. We obtained pre-publication summary statistics for validated trans-ancestry polygenic risk scores (PRS) to predict LDL-C from the Global Lipids Genetics Consortium. The PRS were developed from a genome-wide association study of ~1.6M trans-ethnic participants, and validated in European (EU), AA, African, Hispanic or Latino (HL), South Asian and East Asian populations. We leveraged independent genotype and phenotype data from the diverse BioMe biobank in New York City. We extracted laboratory values and medications from electronic health records for adults aged 18-95 from three population groups: AA, EU, and HL (other groups were excluded due to low sample size). SH cases were defined as participants with statin-adjusted maximum LDL-C ≥ 190 mg/dL and controls as those with statin-adjusted maximum LDL-C < 160 mg/dL (cases/controls: EU 323/4810, AA 422/3741, and HL 539/5780). In a model that included the covariates age, sex, and the top 10 principal components, we measured PRS discrimination (via AUC), which was 0.68 (0.65-0.71), 0.70 (0.68-0.73), and 0.72 (0.70-0.74) for EU, AA, and HL respectively; and 0.67 (0.64-0.70), 0.68 (0.66-0.71), and 0.65 (0.62-0.67) for the genomic predictor alone.
The effect size of the PRS, expressed as an odds ratio (OR) per standard deviation, was 1.97 (1.74-2.24), 2.0 (1.85-2.35), and 2.02 (1.81-2.25). We set the high-risk threshold at the top 3% of the PRS distribution and found ORs of 4.98 (3.30-7.34), 2.99 (1.89-4.61), and 3.96 (2.74-5.64) relative to the 97% below the threshold. We estimated prevalence-adjusted positive vs. negative predictive values for classification into the high-risk group, obtaining 0.25 vs 0.93, 0.17 vs 0.93, and 0.21 vs 0.93. In summary, we demonstrate that a PRS for LDL-C can be leveraged to predict a 3- to 5-fold increased risk of SH in diverse populations, raising the possibility that this test could be used to identify individuals predisposed to polygenic SH.
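
    As an illustration of the prevalence adjustment mentioned above, positive and negative predictive values can be derived from sensitivity, specificity, and prevalence with Bayes' rule. The sketch below assumes that standard formula; the abstract does not state how the adjustment was performed, and the function name and the sensitivity and specificity inputs are hypothetical. Only the 7% prevalence is taken from the abstract.

```python
# Minimal sketch: prevalence-adjusted predictive values via Bayes' rule.
# The sensitivity/specificity inputs below are hypothetical, not study results.

def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a binary high-risk classification."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Hypothetical classifier performance; 0.07 is the SH prevalence quoted above.
ppv, npv = predictive_values(sensitivity=0.10, specificity=0.975, prevalence=0.07)
print(f"PPV={ppv:.3f}, NPV={npv:.3f}")
```

    Because the negative predictive value is dominated by the low prevalence, it stays high even when sensitivity is modest, which is one reason the positive predictive value is usually the more informative of the two for a rare outcome.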

    Michael C Turchin
    The Institute for Genomic Health, Icahn School of Medicine at Mount Sinai

    Phenome-wide association study of polygenic risk for asthma in the UK Biobank highlights traits with shared genetic architecture and sex specific effects

    Polygenic risk scores (PRSs) aggregate additive effects of genetic variants to estimate individual risks for heritable diseases and can be used clinically to inform decisions on screening, therapeutic intervention, and lifestyle modification. The aim of this study was to develop a PRS for asthma using genetic information from a large, multiethnic (ME) cohort and investigate its association with 267 phenotypes in the UK Biobank (UKB). Two asthma PRS models were developed based on European (EU) (19,954 cases, 107,715 controls) and ME (23,948 cases, 118,538 controls) summary statistics from the Trans-National Asthma Genetic Consortium meta-analysis. Posterior SNP effect size estimates were generated using a Bayesian regression framework, implemented in PRS-CS. To evaluate PRS prediction for asthma, each model was applied to white British (36,065 cases, 314,781 controls) and ME (43,109 cases, 377,061 controls) subjects from UKB using logistic regressions adjusting for sex and ancestry. The EU PRS applied to the white British cohort had the strongest association with doctor-diagnosed asthma (p=1.96×10⁻²⁹⁵, OR=1.34, 95% CI=1.32-1.35, AUC=0.582) and was most strongly associated with childhood onset asthma (COA; onset before age 12; p=1.77×10⁻¹⁸¹, OR=1.59, 95% CI=1.54-1.64, AUC=0.624). There were significant sex-by-PRS interaction effects for COA (p=0.049) and adult onset asthma (AOA; onset after age 25; p=0.048). Given the same PRS, males had a higher risk than females for COA but females had a higher risk than males for AOA. The phenome-wide association study identified significant associations between the PRS and 27 binary and 69 quantitative traits (Bonferroni p<1.87×10⁻⁴). The most significant association was with percent eosinophils (p=9.33×10⁻²⁹⁸, β=0.11), a known asthma-associated trait. Other associated traits included asthma age of onset (p=4.12×10⁻⁹⁴) and measures of lung function (FEV1) (p=1.91×10⁻¹¹⁷). Some associations were less expected.
For example, age at first live birth was negatively correlated with the PRS (p=8.57×10⁻¹⁵, β=-0.095) and HbA1c was positively correlated with the PRS (p=4.83×10⁻³³, β=0.13). Sex-specific effects were observed for 5 binary and 15 quantitative traits, such as fat-free mass (p=1.71×10⁻⁶, β=0.028 in females; p=0.42, β=7.6×10⁻³ in males). Overall, our results suggest shared genetic architectures between asthma and a broad swath of pulmonary, cardiometabolic, anthropometric, and reproductive traits, many of which had not previously been linked to asthma and some with sex-specific effects. This research was conducted using the UK Biobank Resource under application number 44300.
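
    The Bonferroni threshold quoted in this abstract is consistent with dividing a family-wise error rate by the 267 phenotypes tested. The short check below assumes an error rate of 0.05, which the abstract does not state explicitly; the function name is illustrative.

```python
# Minimal sketch: Bonferroni-corrected significance threshold.
# alpha = 0.05 is an assumption; 267 is the number of phenotypes in the abstract.

def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold that controls the family-wise error rate."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests

threshold = bonferroni_threshold(alpha=0.05, n_tests=267)
print(f"{threshold:.2e}")  # 1.87e-04
```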

    Yu Lin Lee
    Biological Sciences Collegiate Division, Univ Chicago

    Modelling hidden genetic risk from family history for improved polygenic risk prediction

    With many polygenic risk scores demonstrating research and clinical utility, it is worth questioning whether family history, a traditional genetic predictor, still provides valuable information.
    Family history of complex traits may be influenced by transmitted rare pathogenic variants, intra-familial shared exposures to environmental factors, as well as a common genetic predisposition. Therefore, we propose and develop a latent factor model to quantify disease risk in excess of that captured by a common SNP-based polygenic risk score, but inferable from family history. This model enables calibration of polygenic risk scores with respect to family history without fitting regression models.
    We applied our model to predict adult height for 941 children in the Avon Longitudinal Study of Parents and Children. Our predictor was able to explain ~55% of the total variance in adult height, close to the estimated heritability of height and substantially higher than the ~40% captured by a polygenic risk score for height or mid-parental height alone. For nine complex diseases, including metabolic syndromes, cardiovascular diseases, neurological disorders and several types of cancer, we used our model to improve polygenic risk prediction for >400,000 White British participants in the UK Biobank. For all nine complex diseases investigated in the UK Biobank, parental disease history brought significant improvements in the discriminative power of polygenic risk prediction. For instance, combined with age and sex, our predictor achieved an area under the receiver operating characteristic curve (AUROC) of 0.734 and an area under the precision-recall curve (AUPRC) of 0.171 in identifying individuals with type 2 diabetes, exhibiting significantly stronger discriminative power than the polygenic risk score (AUROC = 0.712; AUPRC = 0.148) or the parental disease history (AUROC = 0.707; AUPRC = 0.148) alone. Compared with using a type 2 diabetes polygenic risk score, our predictor had a net reclassification index of 3.72% in identifying 20% of the population at an elevated risk.
    Taken together, our work showcases an innovative paradigm for risk calculation, and supports the utility of incorporating family history into polygenic risk score-based genetic risk prediction models. 

    Tianyuan Lu
    McGill University


  • Contains 1 Component(s)

    Speakers discuss global perspectives and initiatives for large-scale genomics.

    Platform sessions are abstract-driven sessions with 6 talks each. The talks are 10 minutes long and cross-topical, reflecting the breadth of genetics and genomics. Each talk is followed by a 5-minute Q&A with the speaker. For information on each individual session, please view the "Details" tab.

    Recorded session from the 2021 virtual meeting.

    Trans-ancestry imputation and exome sequencing of more than 1 million individuals identifies genetic variation protecting against SARS-CoV-2 infection and predicts individuals at risk for severe COVID-19 outcomes

    COVID-19 symptoms vary widely, ranging from asymptomatic in some patients to fatal in others. Elucidating the host genetics of COVID-19 holds the potential for understanding both susceptibility to SARS-CoV-2 infection as well as heterogeneity in patient presentation and outcome. Prior work focused on identifying common variants associated with COVID-19 susceptibility and severity, but little has been done to explore the entire allele frequency spectrum of genetic variation, from common to rare exonic variants. Here, we present the largest trans-ancestry exome sequencing study of COVID-19 to date in 586,713 individuals, with a larger set of 1,012,636 individuals with imputed data across 7 studies and 5 continental ancestries.
    Through exome sequencing of 21,820 COVID-19 cases and 564,893 controls, we did not identify any individual rare variants significantly associated with COVID-19 after Bonferroni correction (P<9.6e-10). Burden tests identified three genes tentatively associated with COVID-19: DISP3 (P=2e-8; OR=1.8±0.3), MARK1 (P=3e-9; OR=38.4±16.9), and TLR7 (P=4e-8; OR=4.5±2.2). Despite having a 100x larger sample size, we could not replicate a previously reported role for rare variants in the interferon pathway (P=0.59).
    Our larger GWAS of 56,841 cases and 955,795 controls found 11 loci (P<5e-8). Most notably, we identified a strong protective association amongst SARS-CoV-2 infected cases for rs190509934, located 60bp upstream of ACE2, the primary cell receptor for the SARS-CoV-2 spike protein (P=4.5e-13; OR=0.6±0.08; EUR MAF=0.003). In RNA-seq data, rs190509934 was associated with a 39% reduction in ACE2 expression (P=3e-8), supporting the hypothesis that reduced ACE2 expression protects against SARS-CoV-2 infection.
    Lastly, we developed a polygenic risk score (PRS) to predict hospitalization and severity of COVID-19. Among those of European ancestry, individuals with the top 10% PRSs are 1.8-fold more likely to be hospitalized (P=6e-11) and 1.58-fold more likely to be placed on a ventilator or die from COVID-19 (P=7e-10). These associations hold in other non-European populations (albeit with decreased power) and after accounting for known clinical risk factors.
    Our data represent the most comprehensive survey of common and rare exonic variation associated with COVID-19, identifying new loci and polygenic risk scores that predict the severity of COVID-19.

    Jack Kosmicki
    Regeneron Genetics Center

    Rare variant analyses in 239,395 whole exome and whole genome sequenced participants of the UK Biobank reveals novel genetic associations with renal function and chronic kidney disease

    Genome-wide association studies have identified common genetic variants associated with chronic kidney disease (CKD), but the burden of rare loss-of-function (LoF) or pathogenic/likely pathogenic (P/LP) variants has not been well characterized. We performed gene-/region-based and variant association analyses for 5 renal function biomarkers (eGFR estimated from serum creatinine and/or cystatin-C, BUN, UACR) and 5 CKD endpoints (ESRD and stage 4/5 CKD, CKD defined by biomarkers and/or diagnoses from NHS data, Cystic) in 239,395 UKB participants of genetically-assessed European ancestry and with whole exome (WES, n=171,172) or whole genome sequencing (WGS, n=121,019). For each trait, we fit a genome-wide regression model and tested for association using REGENIE V2.0, adjusting for age, sex, 10 principal components of ancestry, assessment center and BMI, where appropriate. For gene-based analyses, we generated 15 models to collapse ClinVar-classified P/LP, VEP(LOFTEE)-predicted putative LoF and deleterious variants predicted by 16 in silico scores (SIFT, Polyphen, BayesDel, etc.) from dbNSFP 4.1c. The WGS data further enabled annotation of promoter/enhancer variants, which were incorporated into collapsing models for gene-based association. In participants with WES, we identified 30 and 11 genes associated with ≥2 biomarkers and ≥1 CKD endpoint across collapsing models (FDR<0.05), respectively. PKD1/2, COL4A3/4, CUBN, IFT140 were associated with both biomarkers and CKD. Association analyses also highlighted other genes including: COL4A1, CST3, LAMC1, LRP2, SLC22A2, SLC34A3, SH2B3. Variant-level analyses further informed impact on protein, e.g. the SLC22A2 association signal was mainly driven by a frameshift (rs8177505) with lowering effects on eGFR (p=1.2e-27, beta=-6.2, MAF=0.12%).
Exome-wide variant analyses revealed 25 genes (e.g., PDILT-UMOD) with variant associations (p<5.0e-8) with >3 biomarkers or ≥1 endpoint, including 2 that were also implicated from the gene-based analyses (COL4A4 and CUBN). Analyses of WGS allowed for sequence-level validation of exome-derived findings and the identification of additional variants not captured in WES. This study provides a framework for the assessment of the genetic landscape of kidney disease. The results validated known genes and identified potential novel associations with renal function.

    Shuwei Li
    Janssen

    Novel genetic associations for rare diseases with GWAS and trans-ethnic analysis of self-reported medical data

    Nearly 7000 rare diseases are known, and though each disease affects a few people, the total population prevalence of rare diseases is estimated to be 3.5-5.9%. A key challenge in the study of rare disease genetics is assembling large case cohorts for well-powered studies. Here we demonstrate use of large-scale self-reported rare disease data, combined with genetic data collected through the 23andMe direct-to-consumer platform, to study 33 rare diseases and identify genetic associations through GWAS. We developed web-based questionnaires, and gathered self-reported data on rare diseases from a cohort of over 1.6 million genotyped research-consented individuals. To reduce mis-reporting and maximize coverage, we used an autocomplete mechanism including 7000 rare diseases. We validated the approach through simulations and replication of known rare disease associations. In simulations based on genotypes from 4,957,230 European individuals, we show that GWAS can recover genome-wide significant associations in monogenic rare diseases for a variety of architectures. In rare diseases with known genetic associations, we reidentified 29 associations at a genome-wide significance level (p-value < 5e-8) with a diverse range of minor allele frequencies (minimum MAF=0.0001, maximum MAF=0.487) and effect sizes for the risk allele (minimum OR=1.24, maximum OR=273.15). We performed the first GWAS in European ancestry for Duane retraction syndrome, vestibular schwannoma and spontaneous pneumothorax, and report novel genome-wide significant associations for these diseases. For Duane retraction syndrome, an eye movement disorder, we found two independent associations near the OLIG1 and OLIG2 genes, knockdown of which causes a similar phenotype in mice. For vestibular schwannoma, we find a single association near the CDKN2A and CDKN2B genes, which are associated with many other cancers. 
We found three novel associations for spontaneous pneumothorax, two of which are also associated with lung function phenotypes. We replicated these associations in the UK Biobank and found that 3 of 5 replicated with p < 0.05, and all 5 had the same direction of effect. Trans-ethnic mixed-model analyses, including individuals of all ancestries, found the same associations with comparable or increased significance. Our results show that using self-reported rare disease data is a viable method for discovering genetic associations for rare diseases. With increasing sample size and diverse imputation reference panels, we may also be able to study rare diseases more widely in multiple populations and improve our understanding of the trans-ethnic genetic architecture of these diseases.

    Suyash S Shringarpure
    23andMe

    Common and rare variant analysis of 21K psoriasis cases and 623K controls identifies novel, protective associations in several genes in the type 1 interferon pathway

    Psoriasis is a complex autoimmune disease resulting in chronic inflammation and hyperproliferation of the skin. The aberrant immune response associated with psoriasis is mediated by pathogenic T cells, which are activated, in part, by type 1 interferons (IFNs). Prior large-scale analyses of psoriasis cases focusing on common genetic variants have implicated >63 loci, including genes in the IFN signaling pathway. However, large-scale analysis of rare exonic variation is lacking.
    To study the contribution of both common and rare variants to psoriasis risk, we performed whole-exome sequencing and meta-analysis of 20,810 psoriasis cases and 623,159 controls of EUR and AFR ancestry across 6 cohorts. Common variant analysis replicated 44 significant and independent associations in known psoriasis loci, including IL23R, TYK2, IL12B, HLA-C, and DDX58, among others. Rare-variant gene-burden analysis of putative loss-of-function (pLoF) and/or predicted-deleterious missense variants (<1% AAF) identified significant and novel associations in 5 genes, including 3 genes in the IFN pathway. These include protective pLoF associations for IFIH1 (OR=0.74 [0.68, 0.81], p=4.1E-12), which encodes a pathogen sensor that activates IFN production, and TRIM65 (OR=0.63 [0.50, 0.79], p=4.8E-5), which encodes a ubiquitin ligase that binds and activates IFIH1. We find the protective TRIM65 association is driven by a rare, predicted-deleterious missense variant (rs202175254, AAF=0.1%) in the IFIH1-TRIM65 binding domain. Further, we find a nominally significant, protective association for the burden of rare pLoFs in DDX58 (OR=0.76 [0.49, 0.89], p=6.7E-3), which encodes a second pathogen sensor that activates IFNs. This DDX58 protective pLoF association helps confirm the direction of effect at this known psoriasis locus.
    Consistent with inhibition of IFNs being protective in psoriasis, we also found a significant and novel gene-burden association between increased odds of psoriasis and pLoFs in ADAR (OR=2.29 [1.68, 3.12], p=1.4E-7), which encodes a protein that suppresses IFNs and in which partial LoFs have been associated with Aicardi-Goutières syndrome, an inherited disorder that features over-production of IFNs.
    Collectively, these results represent the largest rare-variant exome-sequencing analysis of psoriasis to date. Future experiments will characterize effects of these pLoFs on protein expression and/or function, and further analysis will determine whether an IFN gene signature can identify a clinically relevant subset of psoriasis patients who would therapeutically benefit from IFN inhibition.

    Julie Horowitz
    Regeneron Genetics Center

    Investigating genetic and phenotypic associations for 168 blood metabolites in 120K UK Biobank participants

    In this study, we accessed the large-scale metabolomics, exome sequencing and phenomics data from the UK Biobank (UKB) to investigate gene-metabolite and metabolite-phenotype relationships. Blood metabolites (N=168) were profiled by Nightingale Health in ~120,000 UKB participants, >90% of whom had exome sequences and all had data on ~16,000 clinical traits.
    We explored genetic associations with blood metabolites by two complementary approaches: (i) single-variant analysis, and (ii) gene-level collapsing analysis, using a linear regression model, adjusted for age, sex and BMI. For the single-variant analysis, we tested ~3.2 million variants under dominant and recessive models. For the gene-level collapsing analysis, the aggregate effect of variants in each gene was tested using 11 different models, including ones that focused on rare (MAF<0.1%) missense and protein-truncating variants. We also performed a metabolite PheWAS, in which the association for each metabolite was tested with each clinical trait.
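
    The gene-level collapsing idea described above can be sketched as follows: qualifying rare variants in a gene are collapsed into a single carrier indicator per participant, and the metabolite level is then compared between carriers and non-carriers. This is a minimal, unadjusted illustration under stated assumptions, not the study's pipeline; the function names, the toy genotypes, and the use of a plain difference in means (rather than a covariate-adjusted linear regression) are all hypothetical simplifications.

```python
from statistics import mean

# Minimal sketch of a gene-level collapsing comparison (illustrative only).

def collapse_carriers(genotypes, mafs, maf_cutoff=0.001):
    """Collapse one gene's variants into a 0/1 carrier flag per sample.

    genotypes: one list per sample of alternate-allele counts (0, 1 or 2),
               ordered like `mafs`.
    mafs:      minor allele frequency of each variant; only variants with
               MAF below `maf_cutoff` qualify.
    """
    qualifying = [j for j, maf in enumerate(mafs) if maf < maf_cutoff]
    return [int(any(sample[j] > 0 for j in qualifying)) for sample in genotypes]

def carrier_effect(carriers, values):
    """Difference in mean trait value between carriers and non-carriers.

    For a binary predictor this equals the slope of an unadjusted simple
    linear regression; covariates such as age, sex and BMI are ignored here.
    """
    carrier_values = [v for c, v in zip(carriers, values) if c]
    other_values = [v for c, v in zip(carriers, values) if not c]
    if not carrier_values or not other_values:
        raise ValueError("need at least one carrier and one non-carrier")
    return mean(carrier_values) - mean(other_values)

# Toy data: 5 samples, 3 variants; the second variant is too common to qualify.
genotypes = [[0, 0, 1], [0, 0, 0], [1, 0, 0], [0, 2, 0], [0, 0, 0]]
mafs = [0.0005, 0.02, 0.0002]
levels = [1.4, 0.1, 1.0, 0.2, -0.1]  # hypothetical standardized metabolite levels

carriers = collapse_carriers(genotypes, mafs)
effect = carrier_effect(carriers, levels)
print(carriers, round(effect, 3))
```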
    Our analyses provide a rich catalogue of significant (p<1×10⁻⁸) associations: 10,461 variant-metabolite, 970 gene-metabolite, and 127,947 metabolite-phenotype relationships. This includes well-established, biologically plausible associations such as variants in PAH with phenylalanine levels [beta=1.2; p<1×10⁻³⁰⁰] and the concentration of intermediate-density lipoprotein particles with type 2 diabetes [beta=-1.5; p<1×10⁻³⁰⁰]. These data may also provide insights into underlying biological mechanisms: for instance, the observed metabolite signature for mutations in a gene that is a known drug target (e.g., HSD17B13) can indicate the metabolic profile expected with desirable therapeutic response.
    The catalogue of genetic and phenotypic relationships for blood metabolites, which will expand further once metabolomics data becomes available in the entire UKB cohort of ~500,000 subjects, represents an excellent resource to better understand mechanisms underlying complex human diseases. 

    Abhishek Nag
    Centre for Genomics Research, AstraZeneca

    Practical implementation of polygenic risk scores and absolute risk score estimation across diverse ancestry groups

    Polygenic risk scores (PRS) have generated considerable translational interest. Yet, most validation efforts focus on assessing relative rather than absolute risk scores (ARS), even though ARS are required for clinical decision making. ARS validation experiments are typically based on a single large cohort split into training/testing and rarely incorporate PRS. While such approaches typically generate calibrated ARS within the testing dataset, they do not properly capture the complex biases inherent to each healthcare context or account for environmental differences between countries and ethnicities. Consequently, the robustness of the ARS across different contexts is largely unknown.
    To address these gaps, we derived a framework to combine ethnicity-specific disease baselines from a range of country-specific surveys, which capture social determinants of health, with ancestry-adjusted PRS (European OR per 1SD 1.87, 2.10, 1.51 and 2.09 respectively) for breast cancer, prostate cancer (PC), cardiovascular disease (CVD) and type 2 diabetes (T2D). We validated these ARS in independent datasets, computing calibration summary statistics, including the standardized incidence ratio (SIR), calibration slope and intercept, and the integrated calibration index.
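
    To make the SIR calibration statistic concrete, the sketch below computes it as observed cases divided by the cases expected from the predicted absolute risks, with an approximate 95% interval. It assumes the conventional observed/expected definition and a simple log-scale normal approximation for the interval; the abstract does not specify either, and the function name and toy numbers are hypothetical.

```python
import math

# Minimal sketch: standardized incidence ratio (observed / expected cases).
# The interval uses a log-scale approximation, SIR * exp(+/- z / sqrt(observed)),
# which is an assumption made for illustration.

def standardized_incidence_ratio(observed_cases, predicted_risks, z=1.96):
    """Return (SIR, lower, upper) given observed cases and per-person risks."""
    expected_cases = sum(predicted_risks)
    if observed_cases <= 0 or expected_cases <= 0:
        raise ValueError("observed and expected case counts must be positive")
    sir = observed_cases / expected_cases
    half_width = z / math.sqrt(observed_cases)
    return sir, sir * math.exp(-half_width), sir * math.exp(half_width)

# Toy cohort: 1,000 people at 2% predicted risk and 1,000 at 5% (70 expected cases).
predicted = [0.02] * 1000 + [0.05] * 1000
sir, lower, upper = standardized_incidence_ratio(84, predicted)
print(f"SIR={sir:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

    Under this definition, an SIR whose interval contains 1 indicates that the predicted risks are, on average, in line with observed incidence, while an interval entirely above 1 indicates under-prediction.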
    We find that inclusion of an ethnicity-specific baseline captures substantial ARS variability not captured by the PRS, particularly for PC, where a UK African and Caribbean baseline results in calibration (0.99-1.34 95% CI SIR) whilst the UK average baseline results in strong miscalibration (2.24-3.02 95% CI SIR). The extent of the calibration varied, with challenges arising for T2D and CVD, whose incidence has fluctuated across time and location in the US over the last decades. For T2D, baselines date from 1997-2019 but prospective testing data date from 1987-1999, resulting in miscalibration for White males (1.35-1.62 95% CI SIR). For CVD, baselines for myocardial infarction and fatal heart disease date from 2004-2011 and ischemic stroke from 1999, but prospective testing data date from 1986-2000, resulting in miscalibration for White females and males (0.66-0.92 and 1.04-1.31 95% CI SIR respectively).
    We demonstrate that with appropriate data it is possible to translate genetic risk into clinically meaningful ARS that robustly replicate in diverse contexts. Our results also demonstrate the challenges arising from variation across ethnicity, geography and time and the need for population-relevant information on which risk prediction tools are to be applied. 

    Rachel Moore
    Genomics plc

    The Kidney genome atlas reveals a novel locus on chromosome 14 associated with adult proteinuric kidney diseases

    Chronic kidney disease (CKD) affects 1 in 9 people worldwide. There is a high unmet need for drugs that extend and restore kidney function, because dialysis and organ transplantation carry a substantial economic and psychological burden. To foster drug development of genetically validated targets, we have created the Kidney Genome Atlas (KGA) by assembling ~23,000 whole genomes from 2,832 kidney disease cases including proteinuric kidney disease cases such as focal segmental glomerulosclerosis (571 cases), minimal change disease (244 cases), nephrotic syndrome (196 cases) and idiopathic proteinuria (1,123 cases) and 19,804 controls. Following the gnomAD pipeline, we implemented a rigorous quality control procedure to obtain a high confidence dataset for downstream analyses of proteinuric kidney diseases. Ancestries were inferred genetically based on a k-NN model trained on 1000 Genomes data, which resulted in 597 cases and 10,127 controls of European (EUR) ancestry, 513 cases and 3,805 controls of African (AFR) ancestry, and 290 cases and 754 controls of Latino/Admixed American (AMR) ancestry for association testing. Meta-analysis of common variants across ancestries showed minimal impact of potential confounders, such as ancestry or sequencing center differences (lambda=1.03). We identified a novel locus on chromosome 14 (rs11160484; effect size = -0.42, P = 2.8×10⁻⁸) associated with proteinuric kidney disease. In addition, we confirmed the well-known association of APOL1 risk haplotypes (G1/G1, G2/G2 or G1/G2; effect size = 0.50, P = 2.4×10⁻⁹, under recessive model) in the AFR cohort. LD-score regression analysis revealed a trend towards a weak positive genetic correlation (rg = 0.097, 90% CI [0.010, 0.18]) between proteinuric kidney diseases and CKD defined by estimated glomerular filtration rate or eGFR (Wuttke et al, 2019).
Using summary statistics from our EUR dataset, we estimated the SNP heritability of proteinuric kidney diseases at 0.15 (95% CI [0.095, 0.20]), suggesting that there may be many more genetic contributions that are yet to be discovered. These findings advance our understanding of the genetic architecture of proteinuric kidney diseases and highlight an opportunity for novel therapies and patient stratification. 

    Eva Fast
    Goldfinch Bio