Upcoming Workshop Bundle (January Only)

  • Registration Closed

If you plan on watching several workshops, consider purchasing this all-access pass. By purchasing this item you will be automatically registered to access all four (4) January workshops and their materials.

  • Contains 1 Component(s)

    This workshop will appeal to researchers and clinicians interested in exploring SV calling with long-read sequencing data, as well as anyone more broadly interested in practical ways to access and analyze data in the cloud - with or without advanced computing training.

    Growing evidence that structural variants (SVs) are responsible for many types of diseases and traits is fueling interest in taking a fresh look at different disease types using long-read sequencing. Although short-read technologies have long been cheaper and more readily available, long-read sequencing produces data that can yield significantly more accurate results for identifying SVs.

    However, the large amounts of data and complexity of the computational methods involved can make it difficult for newcomers to access this exciting area of research, particularly in the context of the traditional computing environments that are provided by default to academic researchers.

    In this workshop, we will guide you through an end-to-end SV identification journey using Galaxy, a platform designed to facilitate access to computational methods for researchers without a programming background. Specifically, we will use Galaxy in Terra, in the context of the NHGRI Genomic Data Science Analysis, Visualization and Informatics Lab-space (AnVIL). This cloud-based environment enables you to analyze large genomic datasets with familiar tools and reproducible workflows securely.

    Through live demonstrations and interactive exercises, you will learn how to:

    1. Bring data into a project workspace in Terra
    2. Combine data (your own or controlled-access) with an open-access dataset
    3. Launch a Galaxy instance in Terra and run a complete workflow to identify SVs
    4. Visualize results and identify potentially pathogenic variants

    The skills you will learn in this workshop will extend to other scientific use cases, datasets and tools beyond the examples shown.

    Liz Kiernan, PhD

    Senior Science Writer, Data Sciences Platform

    Broad Institute of MIT and Harvard

    Liz Kiernan is a Senior Science Writer for the Broad Institute’s Data Sciences Platform. Prior to joining Broad, she was a postdoctoral fellow at the University of Wisconsin-Madison, studying the long-term effects of hypoxia on brain development and immune function in the lab of Dr. Jyoti Watters. Outside of her research activities, Kiernan passionately pursued science outreach and teaching in her local community, working with organizations like PEOPLE and Expanding Your Horizons to empower burgeoning young scientists. She completed her Ph.D. in Neuroscience at the University of Wisconsin-Madison in 2019, where she was awarded the Ruth L. Kirschstein National Research Service Award for research focusing on hypoxia-induced epigenetic changes in microglia, the resident immune cells of the brain. Kiernan received her M.S. in Biology (2012) and B.S. in Psychology/Pre-Med (2010) from the College of William and Mary.

    Natalie Kucher

    Project Manager

    Johns Hopkins University

    Natalie Kucher is a Project Manager for the AnVIL Platform at Johns Hopkins University. Her focus is on outreach and training, efforts to diversify genomic data science, and supporting Galaxy on AnVIL. She spent 2 years at the National Human Genome Research Institute (NHGRI) where she supported the Computational Genomic Data Science Program and served as the Executive Secretary of the Genomic Data Science Working Group of Council. She received her B.S. in Biology from Davidson College in 2019.

    Michael C. Schatz

    Bloomberg Distinguished Professor of Computer Science and Biology

    Johns Hopkins University

    Michael Schatz is the Bloomberg Distinguished Professor of Computer Science and Biology at Johns Hopkins University, and co-lead of the NHGRI AnVIL platform. His research is at the intersection of computer science, biology, and biotechnology, and focuses on the development of novel algorithms and systems for comparative genomics, human genetics, and personalized medicine. For this work, he is a recipient of the 2015 Alfred P. Sloan Foundation Fellowship and a 2014 NSF CAREER award, and since 2018 has been named a Clarivate Web of Science “Highly Cited Researcher” three times based on the publication of multiple papers that rank in the top 1% by citations at any institution in the world. Schatz received his Ph.D. and M.S. in Computer Science from the University of Maryland in 2010 and 2008, his B.S. in Computer Science from Carnegie Mellon University in 2000, and spent 5 years at the Institute for Genomic Research (TIGR) in between. More information is available on his lab website: http://schatz-lab.org

  • Contains 3 Component(s)

    This workshop is ideal for scientists who would like to find and use tools in the cloud for genomic analysis. Researchers interested in NHGRI data, such as the Human Pangenome Project, are especially encouraged to attend. Basic knowledge of Python or R is recommended but not required.

    Cloud-based analysis of genomic datasets is increasingly vital for portability, reproducibility, and multi-institution collaboration, but transitioning to the cloud can be daunting. This workshop aims to remove some of the barriers to adopting these tools. Specifically, we will teach researchers how to access and use the Analysis, Visualization, and Informatics Lab-space (AnVIL), an environment that provides hosted data, reproducible tools, collaborative workspaces, and comprehensive documentation to enable users to conduct research in the cloud. The workshop will demonstrate how to access and explore data in AnVIL. Participants will also learn to search for analysis tools in Dockstore, a platform for sharing portable, container-based tools and workflows written to be interoperable across local and cloud environments. Finally, they will analyze data in a Terra workspace, a dedicated space where researchers can access and organize data and tools and run analyses.


    This workshop will specifically explore and demonstrate open-access data from the Human Pangenome Reference Consortium (HPRC), an NHGRI-funded effort to create a more diverse and comprehensive reference human pangenome. We will present the data and methods produced and utilized within the first year of this project, which ultimately aims to release the assembly of high-quality diploid genomes from >350 ethnically diverse individuals across five years. Currently, raw data and assemblies from 45 individuals and associated Docker-based analysis workflows written in the Workflow Description Language (WDL) are available in the AnVIL for researchers to explore and utilize. Data and workflows will continue to be publicly released as early as possible to promote open science. These data provide an excellent basis for gaining hands-on experience with these data types and with new workspaces and methods.


    Using data and workflows from the HPRC, participants of this workshop will follow along with instructors to learn how to:


    1. Register for a Terra account and set up a project using $300 in free Google Cloud credits
    2. Set up a collaborative cloud workspace in Terra
    3. Access and explore Human Pangenome Data hosted by AnVIL
    4. Search for bioinformatics workflows in Dockstore and export them to a Terra workspace
    5. Configure and launch a Docker-based WDL workflow to conduct a parallel analysis
    6. Monitor cloud costs associated with an analysis


    After completing the workshop, attendees will be able to leverage AnVIL to analyze hosted datasets and launch analyses that are reproducible and scalable. Attendees will also be familiar with Human Pangenome data and resources.

    Julian Lucas

    Senior Bioinformatics Systems Analyst

    University of California, Santa Cruz

    Julian Lucas is a Senior Bioinformatics Systems Analyst in the Computational Genomics Platform at the UC Santa Cruz Genomics Institute. He leads the data coordination for the Human Pangenome Reference Consortium (HPRC) and utilizes the NHGRI AnVIL cloud compute platform for sharing data and analysis methods for this open-access project with the research community.

    Beth Sheets, MS

    Program Manager, Computational Genomics Platform

    University of California, Santa Cruz

    Beth Sheets is a Program Manager for the Computational Genomics Platform at the UC Santa Cruz Genomics Institute. She currently works with two NIH initiatives, NHLBI BioData Catalyst and NHGRI AnVIL, which are bringing researchers to secure, collaborative, cloud-based workspaces that offer petabytes of hosted data and hundreds of scientific tools. She works with a collaborative team that builds Dockstore.org, the scientific tool-sharing repository for these two NIH initiatives, which provides researchers with features and training to publish their bioinformatics pipelines using FAIR (Findable, Accessible, Interoperable, Reusable) standards.

    Trevor Pesout

    PhD Candidate

    University of California, Santa Cruz

    Trevor is a Ph.D. Candidate in the Computational Genomics Lab at the UCSC Genomics Institute.  His work revolves around the use of third-generation long reads for phasing, polishing, and haplotyping.  He has developed many of the Quality Control workflows for the HPRC assembly group. His contributions are available in the Human Pangenome Reference Consortium organization on Dockstore for the community to reuse in the AnVIL cloud ecosystem.

    Mobin Asri

    PhD Candidate

    University of California, Santa Cruz

    Mobin is a PhD Candidate in the Computational Genomics Lab at the UCSC Genomics Institute. He works on comparative genomics and developing tools for evaluating diploid assemblies. He has developed the assembly workflow and read-based QC workflows for the HPRC assembly group. His contributions are available in the Human Pangenome Reference Consortium organization on Dockstore for the community to reuse in the AnVIL cloud ecosystem.

    Karen Miga, PhD

    Assistant Professor

    University of California, Santa Cruz

    Karen Miga is an Assistant Professor in the Biomolecular Engineering Department at UCSC and Associate Director at the UCSC Genomics Institute. She co-leads the telomere-to-telomere (T2T) consortium and is the Project Director of the Human Pangenome Reference Consortium (HPRC) production center at UCSC. Her research program combines innovative computational and experimental approaches to produce the high-resolution sequence maps of human centromeric and pericentromeric DNAs.

  • Contains 1 Component(s)

    Students, postdocs, and scientists in academia and industry who work on population genetics problems can benefit from learning and using the methods introduced in this workshop. Human geneticists and medical researchers who wish to incorporate ancestry and demographic information in studying their trait of interest will find the demographic and ancestry inference part helpful. Statisticians, computer scientists, and those who develop methods could benefit from learning the simulation tools taught here.

    In this interactive workshop, we aim to introduce state-of-the-art population genetics tools that are most useful for the general ASHG community and to provide hands-on experience so that attendees can learn and understand these tools. Attendees will learn about population genetic simulations, global and local ancestry inference, genealogy and tree inference, and advanced demographic history inference from experts on these topics. Each topic takes 20 minutes, with a 10-minute discussion at the end.

    In the first half of this workshop, we will show how to perform forward simulations with SLiM and backward simulations with msprime to simulate a large number of genomes (and phenotypes) under mutation, recombination, selection, and population structure. Data from realistic simulations are useful for understanding genomic patterns, benchmarking new methods, and testing hypotheses.
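
    To give a feel for what "forward" simulation means, here is a minimal sketch in plain Python. It is not SLiM or msprime and does not use either tool's API. It is a toy haploid Wright-Fisher model that follows a single allele generation by generation under selection and random drift, with no mutation, recombination, or population structure. The function name and its parameters are hypothetical and chosen only for illustration.

```python
import random


def wright_fisher_forward(pop_size, generations, init_freq, selection=0.0, seed=None):
    """Follow one allele forward in time in a toy haploid Wright-Fisher population.

    Returns a list of allele frequencies, one per generation,
    starting with the initial frequency (generation 0).
    """
    rng = random.Random(seed)
    freq = init_freq
    trajectory = [freq]
    for _ in range(generations):
        # Selection: carriers of the allele have relative fitness 1 + selection.
        weighted = freq * (1.0 + selection)
        p = weighted / (weighted + (1.0 - freq))
        # Drift: each individual in the next generation inherits the allele
        # independently with probability p (binomial sampling).
        carriers = sum(1 for _ in range(pop_size) if rng.random() < p)
        freq = carriers / pop_size
        trajectory.append(freq)
    return trajectory


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to keep the run small.
    for generation, freq in enumerate(
        wright_fisher_forward(pop_size=200, generations=10, init_freq=0.1,
                              selection=0.05, seed=42)
    ):
        print(generation, round(freq, 3))
```

    Backward (coalescent) simulators take the opposite route: they start from the sampled genomes and trace their ancestry back in time, which is why they tend to be much faster when only a sample of the population is of interest.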

    We will then introduce methods for inferring global ancestry (i.e., the proportion of a genome that comes from each population) and local ancestry (i.e., which genomic segments come from each population). Ancestry information is useful for understanding population history, identifying selection, and mapping disease loci. In the second part of this workshop, we will teach recently developed methods for constructing whole-genome genealogies. These new methods make it possible to use genomic information efficiently and to understand selection and demography more accurately. We will then cover various methods for inferring complex demographic history and best practices for a given dataset.
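
    The relationship between the two ancestry definitions above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: local ancestry is given as non-overlapping segments on a single haploid genome, and global ancestry is taken to be the length-weighted share of each population. The function name, the segment format, and the population labels are hypothetical and do not come from any specific inference tool.

```python
from collections import defaultdict


def global_ancestry(segments):
    """Summarize local ancestry segments into global ancestry proportions.

    segments: iterable of (start, end, population) tuples, where start and end
    are base-pair coordinates of a half-open interval [start, end).
    Returns a dict mapping each population to its fraction of the covered length.
    """
    lengths = defaultdict(int)
    for start, end, population in segments:
        if end <= start:
            raise ValueError(f"empty or inverted segment: ({start}, {end})")
        lengths[population] += end - start
    total = sum(lengths.values())
    if total == 0:
        raise ValueError("no segments provided")
    return {population: length / total for population, length in lengths.items()}


if __name__ == "__main__":
    # Hypothetical local ancestry calls along one chromosome.
    local_calls = [
        (0, 400_000, "pop_A"),
        (400_000, 700_000, "pop_B"),
        (700_000, 1_000_000, "pop_A"),
    ]
    print(global_ancestry(local_calls))
```

    Real inference methods estimate these quantities from genotype data with uncertainty; the sketch only shows how the two summaries relate once the segments are known.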

    We will collect questions the audience is interested in tackling during the workshop, and our instructors will host a short discussion at the end to discuss how these tools could be used to address their specific questions.

    Leo Speidel, PhD

    Sir Henry Wellcome Fellow

    UCL and the Francis Crick Institute

    Leo is a Sir Henry Wellcome Fellow at the UCL Genetics Institute and the Francis Crick Institute. He is interested in developing powerful statistical tools that utilise the rapidly growing numbers of genomes of modern and ancient people to reconstruct our shared genetic past. During his PhD in Simon Myers’ group at the University of Oxford, Department of Statistics, he developed a new approach, Relate, to infer genealogical trees for large sample sizes, and subsequently downstream techniques that utilise these trees to study our evolutionary past.

    Philipp W. Messer

    Associate Professor

    Department of Computational Biology, Cornell University

    Philipp is a population geneticist with a broad background in computational biology. Research in his lab centers on understanding rapid evolutionary processes, with a specific interest in systems that allow us to study evolution in real-time. Philipp has contributed to a wide spectrum of topics in population and evolutionary genetics, including theoretical work on rapid adaptation by hard and soft selective sweeps, the design of methods for inference of selection from population genomic data, and the development of evolutionary simulation software.

    Natalie Telis

    Senior Scientist

    Baryshnikova Group, Calico Labs

    Natalie Telis is a statistical geneticist and computational biologist at Calico Labs focusing on inferring the shared genetic architecture of complex traits using population genetics. She holds a BS in Cell Biology and a BA in Mathematics, and completed a PhD in Biomedical Informatics with Jonathan Pritchard, focusing on the impact of recent evolution on complex human traits.

    Xinzhu (April) Wei

    Assistant Professor

    Department of Computational Biology, Cornell University

    April is a computational biologist working on population, evolutionary, and statistical genetics. She started her lab at Cornell University in Jan 2022. Research in her group focuses on developing and applying accurate and scalable methods for understanding gene flow, selection, and genotype-phenotype relationships in humans and model organisms. Before joining Cornell, she was a postdoc at UCLA and UC Berkeley. She earned her Ph.D. in Ecology and Evolutionary Biology at the University of Michigan in 2018.

  • Contains 2 Component(s)

    The Genome Browser is a valuable tool at all levels of genetics education and research, but this is not an introductory workshop. It is designed to show users who have at least a working familiarity with the Browser new features likely not seen before.

    The UCSC Genome Browser has been a workhorse providing data and visualization for genetics researchers and clinical professionals for more than 20 years. The Browser continues to grow and add new features, and even experienced users frequently disclose that they have missed important innovations. This workshop will feature some of the newest Browser offerings.

     We have recently revised our presentations of data important to the interpretation of variants in the clinical context.  Working with ClinVar, ClinGen, gnomAD and others, we now make the details of items displayed in the Browser more available via mouseover in the main Browser graphic.  In this way, multiple variants in a region can be investigated more quickly, without a required click-through. 

     We have also implemented a new feature called "Recommended Track Sets" -- one each for copy-number variants and single nucleotide variants.  Using this feature, users may, from any location in the genome, launch a pre-configured session with important data automatically displayed.

     A new data type has been introduced to aid in the display of ClinVar SNV variants with phenotypes in the five clinical classes (pathogenic > benign), simultaneously showing the variant classes and the number of reports of the variant in each class.

     We also now display a data track of exon-capture kits from various manufacturers. This allows users to evaluate the coverage of the genome both when choosing kits for use and to assist in the interpretation of whole exome sequencing experiments.  

     We will also briefly present our coronavirus resources, both as GWAS data on the genome and viral genome data.

     Participants should have experience with the Genome Browser and are encouraged to follow along with the presentation on a separate device.  Attendees with limited experience with the Browser should view the video tutorials at http://bit.ly/ucscBasics before attending. 

    Robert Kuhn, PhD

    Associate Director

    UC Santa Cruz

    Robert Kuhn received his PhD at the University of California, Santa Barbara in Biochemistry and Molecular Biology, where he studied the centromeres of yeast. Following a postdoctoral position at the UC Berkeley/USDA Plant Gene Expression Center, he taught biochemistry, molecular biology and genetics at UC Santa Cruz. He joined the UCSC Genome Browser project in 2003, where he is now Associate Director, with a particular interest in clinical genetics. The Genome Browser is a widely used visualization tool giving access to the genomes of human and more than one hundred other animals. Dr. Kuhn's responsibilities include identifying important datasets for inclusion in the Browser, enabling researchers by teaching the Genome Browser in workshops and seminars, and learning from them how to improve the Browser.