Upcoming Workshop Bundle (January Only)

Registration Closed

If you plan on watching several workshops, consider purchasing this all-access pass. By purchasing this item you will be automatically registered to access all four (4) January workshops and their materials.

Workshop: Structural variant discovery from long-read sequencing data on the cloud with Galaxy in Terra

Contains 1 Component(s)

This workshop will appeal to researchers and clinicians interested in exploring SV calling with long-read sequencing data, as well as anyone more broadly interested in practical ways to access and analyze data in the cloud - with or without advanced computing training.
Growing evidence that structural variants (SVs) are responsible for many types of diseases and traits is fueling interest in taking a fresh look at different disease types using long-read sequencing. Although short-read technologies have long been cheaper and more readily available, long-read sequencing produces data that can yield significantly more accurate results for identifying SVs.

However, the large amounts of data and complexity of the computational methods involved can make it difficult for newcomers to access this exciting area of research, particularly in the context of the traditional computing environments that are provided by default to academic researchers.

In this workshop, we will guide you through an end-to-end SV identification journey using Galaxy, a platform designed to facilitate access to computational methods for researchers without a programming background. Specifically, we will use Galaxy in Terra, in the context of the NHGRI Genomic Data Science Analysis, Visualization and Informatics Lab-space (AnVIL). This cloud-based environment enables you to analyze large genomic datasets with familiar tools and reproducible workflows securely.

Through live demonstrations and interactive exercises, you will learn how to:
1. Bring data into a project workspace in Terra
2. Combine data (your own or controlled-access) with an open-access dataset
3. Launch a Galaxy instance in Terra and run a complete workflow to identify SVs
4. Visualize results and identify potentially pathogenic variants
The skills you will learn in this workshop will extend to other scientific use cases, datasets and tools beyond the examples shown.
Liz Kiernan, PhD

Senior Science Writer, Data Sciences Platform

Broad Institute of MIT and Harvard

Liz Kiernan is a Senior Science Writer for the Broad Institute’s Data Sciences Platform. Prior to joining Broad, she was a postdoctoral fellow at the University of Wisconsin-Madison, studying the long-term effects of hypoxia on brain development and immune function in the lab of Dr. Jyoti Watters. Outside of her research activities, Kiernan passionately pursued science outreach and teaching in her local community, working with organizations like PEOPLE and Expanding Your Horizons to empower burgeoning young scientists. She completed her Ph.D. in Neuroscience from the University of Wisconsin-Madison in 2019 where she was awarded the Ruth L. Kirschstein National Research Service Award for research focusing on hypoxia-induced epigenetic changes in microglia, the resident immune cells of the brain. Kiernan received her M.S. in Biology (2012) and B.S. in Psychology/Pre-Med (2010) from the College of William and Mary.

Natalie Kucher

Project Manager

Johns Hopkins University

Natalie Kucher is a Project Manager for the AnVIL Platform at Johns Hopkins University. Her focus is on outreach and training, efforts to diversify genomic data science, and supporting Galaxy on AnVIL. She spent 2 years at the National Human Genome Research Institute (NHGRI) where she supported the Computational Genomic Data Science Program and served as the Executive Secretary of the Genomic Data Science Working Group of Council. She received her B.S. in Biology from Davidson College in 2019.

Michael C. Schatz

Bloomberg Distinguished Professor of Computer Science and Biology

Johns Hopkins University

Michael Schatz is the Bloomberg Distinguished Professor of Computer Science and Biology at Johns Hopkins University, and co-lead of the NHGRI AnVIL platform. His research is at the intersection of computer science, biology, and biotechnology, and focuses on the development of novel algorithms and systems for comparative genomics, human genetics, and personalized medicine. For this work, he is a recipient of the 2015 Alfred P. Sloan Foundation Fellowship and a 2014 NSF CAREER award, and since 2018 has been named a Clarivate Web of Science “Highly Cited Researcher” three times based on the publication of multiple papers that rank in the top 1% by citations at any institution in the world. Schatz received his Ph.D. and M.S. in Computer Science from the University of Maryland in 2010 and 2008, his B.S. in Computer Science from Carnegie Mellon University in 2000, and spent 5 years at the Institute for Genomic Research (TIGR) in between. More information is available on his lab website: http://schatz-lab.org
Workshop: Reproducible Analysis of Human Pangenome Data using the AnVIL

Contains 3 Component(s)

This workshop is ideal for scientists who would like to find and use tools in the cloud for genomic analysis. Researchers interested in NHGRI data, such as the Human Pangenome Project, are especially encouraged to attend. Basic knowledge of Python or R is recommended but not required.
Cloud-based analysis of genomic datasets is increasingly vital for portability, reproducibility, and multi-institution collaboration, but transitioning to the cloud can be daunting. This workshop aims to remove some of the barriers to adopting these tools. Specifically, we will teach researchers how to access and use the Analysis, Visualization, and Informatics Lab-space (AnVIL), an environment that provides hosted data, reproducible tools, collaborative workspaces, and comprehensive documentation to enable users to conduct research in the cloud. This workshop will demonstrate how to access and explore data in AnVIL. Participants will also learn to search for analysis tools in Dockstore, a platform for sharing portable, container-based tools and workflows written to be interoperable across local and cloud environments. Finally, they will analyze data in a Terra workspace, a dedicated space where researchers can access and organize the same data and tools and run analyses.

This workshop will specifically explore and demonstrate open-access data from the Human Pangenome Reference Consortium (HPRC), an NHGRI-funded effort to create a more diverse and comprehensive human reference pangenome. We will present the data and methods produced and used within the first year of this project, which ultimately aims to release assemblies of high-quality diploid genomes from >350 ethnically diverse individuals across five years. Currently, raw data and assemblies from 45 individuals and associated Docker-based analysis workflows written in the Workflow Description Language (WDL) are available in the AnVIL for researchers to explore and use. Data and workflows will continue to be publicly released as early as possible to promote open science. These data provide an excellent substrate for learning to work with these data types and with new workspaces and methods.

Using data and workflows from the HPRC, participants of this workshop will follow along with instructors to learn how to:
1. Register for a Terra account and set up a project using $300 in free Google Cloud credits
2. Set up a collaborative cloud workspace in Terra
3. Access and explore Human Pangenome Data hosted by AnVIL
4. Search for bioinformatics workflows in Dockstore and export them to a Terra workspace
5. Configure and launch a Docker-based WDL workflow to conduct a parallel analysis
6. Monitor cloud costs associated with an analysis
After completing the workshop, attendees will be able to leverage AnVIL to analyze hosted datasets and launch analyses that are reproducible and scalable. Attendees will also be familiar with Human Pangenome data and resources.
Julian Lucas

Senior Bioinformatics Systems Analyst

University of California, Santa Cruz

Julian Lucas is a Senior Bioinformatics Systems Analyst in the Computational Genomics Platform at the UC Santa Cruz Genomics Institute. He leads the data coordination for the Human Pangenome Reference Consortium (HPRC) and utilizes the NHGRI AnVIL cloud compute platform for sharing data and analysis methods for this open-access project with the research community.

Beth Sheets, MS

Program Manager, Computational Genomics Platform

University of California, Santa Cruz

Beth Sheets is a Program Manager for the Computational Genomics Platform at the UC Santa Cruz Genomics Institute. She currently works with two NIH initiatives, NHLBI BioData Catalyst and NHGRI AnVIL, which are bringing researchers to secure, collaborative, cloud-based workspaces that offer petabytes of hosted data and hundreds of scientific tools. She works with a collaborative team that builds Dockstore.org, the scientific tool-sharing repository for these two NIH initiatives, which provides researchers with features and training to publish their bioinformatics pipelines using FAIR (Findable, Accessible, Interoperable, Reusable) standards.

Trevor Pesout

PhD Candidate

University of California, Santa Cruz

Trevor is a Ph.D. Candidate in the Computational Genomics Lab at the UCSC Genomics Institute. His work revolves around the use of third-generation long reads for phasing, polishing, and haplotyping. He has developed many of the Quality Control workflows for the HPRC assembly group. His contributions are available in the Human Pangenome Reference Consortium organization on Dockstore for the community to reuse in the AnVIL cloud ecosystem.

Mobin Asri

PhD Candidate

University of California, Santa Cruz

Mobin is a PhD Candidate in the Computational Genomics Lab at the UCSC Genomics Institute. He works on comparative genomics and developing tools for evaluating diploid assemblies. He has developed the assembly workflow and read-based QC workflows for the HPRC assembly group. His contributions are available in the Human Pangenome Reference Consortium organization on Dockstore for the community to reuse in the AnVIL cloud ecosystem.

Karen Miga, PhD

Associate Professor

University of California, Santa Cruz

Karen Miga is an Assistant Professor in the Biomolecular Engineering Department at UCSC and Associate Director at the UCSC Genomics Institute. She co-leads the telomere-to-telomere (T2T) consortium and is the Project Director of the Human Pangenome Reference Consortium (HPRC) production center at UCSC. Her research program combines innovative computational and experimental approaches to produce the high-resolution sequence maps of human centromeric and pericentromeric DNAs.
Workshop: Simulation and inference in population genetics

Contains 1 Component(s)

Students, postdocs, and scientists in academia and industry who work on population genetics problems can benefit from learning and using the methods introduced in this workshop. Human geneticists and medical researchers who wish to incorporate ancestry and demographic information in studying their trait of interest will find the demographic and ancestry inference part helpful. Statisticians, computer scientists, and those who develop methods could benefit from learning the simulation tools taught here.

In this interactive workshop, we aim to introduce state-of-the-art population genetics tools that are most useful for the general ASHG community and to provide hands-on experience so attendees can learn and understand these tools. The audience will learn about population genetic simulations, global and local ancestry inference, genealogy and tree inference, and advanced demographic history inference from experts on these topics. Each topic takes 20 minutes, with 10 minutes of discussion at the end.

In the first half of this workshop, we will teach how to perform forward simulations with SLiM and backward simulations with msprime to simulate a large number of genomes (and phenotypes) under mutation, recombination, selection, and population structure. Data from realistic simulations are useful for understanding genomic patterns, benchmarking new methods, and testing hypotheses.
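
As a rough illustration of what "backward simulation" means, the sketch below draws coalescence waiting times for a sample of lineages under the standard (Kingman) coalescent, using only the Python standard library. It is not the msprime or SLiM API: the function name and parameters are hypothetical, and the model assumes a single haploid population of constant size with no recombination, selection, or population structure.

```python
import random


def simulate_coalescent_times(n_samples, pop_size, rng):
    """Return the waiting times (in generations) between successive
    coalescence events, going backward in time from n_samples lineages
    until a single common ancestor remains.

    Assumes a haploid, constant-size population (hypothetical toy model).
    """
    times = []
    k = n_samples
    while k > 1:
        # Number of lineage pairs that could coalesce next.
        pairs = k * (k - 1) / 2
        # Waiting time is exponential with rate pairs / pop_size.
        times.append(rng.expovariate(pairs / pop_size))
        k -= 1
    return times


rng = random.Random(42)  # fixed seed so the run is repeatable
times = simulate_coalescent_times(n_samples=10, pop_size=10_000, rng=rng)
tmrca = sum(times)  # time to the most recent common ancestor
print(f"{len(times)} coalescence events, TMRCA = {tmrca:.0f} generations")
```

A forward simulator works in the opposite direction, stepping an entire population generation by generation, which is why it can model selection more directly but is usually more expensive for large samples.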

We will then introduce inference methods for global ancestry (i.e., the proportion of a genome that comes from each population) and local ancestry (i.e., which genomic segments come from each population). Ancestry information is useful for understanding population history, identifying selection, and mapping disease loci. In the second part of this workshop, we will teach recently developed methods for constructing whole-genome genealogies. These new methods have opened up possibilities to use genomic information efficiently and to understand selection and demography more accurately. We will then teach various methods for inferring complex demographic history and best practices for a given dataset.
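
The two ancestry definitions given above are related: if local ancestry segments are already known, global ancestry proportions follow by summing segment lengths per population. The sketch below shows only that relationship. It is not an inference method, and the segment coordinates, population labels, and function name are made up for illustration.

```python
from collections import defaultdict


def global_ancestry(segments):
    """Compute global ancestry proportions from local ancestry segments.

    segments: list of (start, end, population) tuples, with half-open
    base-pair intervals [start, end) that do not overlap.
    """
    lengths = defaultdict(int)
    for start, end, population in segments:
        lengths[population] += end - start
    total = sum(lengths.values())
    return {pop: length / total for pop, length in lengths.items()}


# Hypothetical local ancestry calls along one 100 Mb haplotype.
local_ancestry = [
    (0, 40_000_000, "POP_A"),
    (40_000_000, 55_000_000, "POP_B"),
    (55_000_000, 100_000_000, "POP_A"),
]

proportions = global_ancestry(local_ancestry)
for population, fraction in sorted(proportions.items()):
    print(f"{population}: {fraction:.2f}")
```

In practice the segments themselves must be inferred from genotype data, which is the harder problem that local ancestry methods address.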

We will collect questions the audience is interested in tackling during the workshop, and our instructors will host a short discussion at the end to discuss how these tools could be used to address their specific questions.

Leo Speidel, PhD

Sir Henry Wellcome Fellow

UCL and the Francis Crick Institute

Leo is a Sir Henry Wellcome fellow at UCL, Genetics Institute and the Francis Crick Institute interested in developing powerful statistical tools that utilise the rapidly growing numbers of genomes of modern and ancient people to reconstruct our shared genetic past. During his PhD in Simon Myers’ group at the University of Oxford, Department of Statistics, he developed a new approach, Relate, to infer genealogical trees for large sample sizes and subsequently downstream techniques that utilise these trees to study our evolutionary past.

Philipp W. Messer

Associate Professor

Department of Computational Biology, Cornell University

Philipp is a population geneticist with a broad background in computational biology. Research in his lab centers on understanding rapid evolutionary processes, with a specific interest in systems that allow us to study evolution in real-time. Philipp has contributed to a wide spectrum of topics in population and evolutionary genetics, including theoretical work on rapid adaptation by hard and soft selective sweeps, the design of methods for inference of selection from population genomic data, and the development of evolutionary simulation software.

Natalie Telis

Senior Scientist

Baryshnikova Group, Calico Labs

Natalie Telis is a statistical geneticist and computational biologist at Calico Labs focusing on inferring the shared genetic architecture of complex traits using population genetics. She holds a BS in Cell Biology and a BA in Mathematics, and completed a PhD in Biomedical Informatics with Jonathan Pritchard, focusing on the impact of recent evolution on complex human traits.

Xinzhu (April) Wei

Assistant Professor

Department of Computational Biology, Cornell University

April is a computational biologist working on population, evolutionary, and statistical genetics. She started her lab at Cornell University in Jan 2022. Research in her group focuses on developing and applying accurate and scalable methods for understanding gene flow, selection, and genotype-phenotype relationships in humans and model organisms. Before joining Cornell, she was a postdoc at UCLA and UC Berkeley. She earned her Ph.D. in Ecology and Evolutionary Biology at the University of Michigan in 2018.
Workshop: UCSC Genome Browser - the latest features

Contains 2 Component(s)

The Genome Browser is a valuable tool at all levels of genetics education and research, but this is not an introductory workshop. It is designed to show users who have at least a working familiarity with the Browser new features likely not seen before.

The UCSC Genome Browser has been a workhorse providing data and visualization for genetics research and clinical professionals for more than 20 years. The Browser continues to grow and add new features, and even experienced users frequently disclose that they have missed important innovations. The proposed workshop will feature some of the newest Browser offerings.

We have recently revised our presentations of data important to the interpretation of variants in the clinical context. Working with ClinVar, ClinGen, gnomAD and others, we now make the details of items displayed in the Browser more available via mouseover in the main Browser graphic. In this way, multiple variants in a region can be investigated more quickly, without a required click-through.

We have also implemented a new feature called "Recommended Track Sets" -- one each for copy-number variants and single nucleotide variants. Using this feature, users may, from any location in the genome, launch a pre-configured session with important data automatically displayed.

A new data type has been introduced to aid in the display of ClinVar SNV variants with phenotypes in the five clinical classes (pathogenic > benign), simultaneously showing the variant classes and the number of reports of the variant in each class.

We also now display a data track of exon-capture kits from various manufacturers. This allows users to evaluate the coverage of the genome both when choosing kits for use and to assist in the interpretation of whole exome sequencing experiments.

We will also briefly present our coronavirus resources, both as GWAS data on the genome and viral genome data.

Participants should have experience with the Genome Browser and are encouraged to follow along with the presentation on a separate device. Attendees with limited experience with the Browser should view the video tutorials at http://bit.ly/ucscBasics before attending.

Robert Kuhn, PhD

Robert Kuhn Consulting

Robert Kuhn received his PhD in Biochemistry and Molecular Biology from the University of California Santa Barbara. Dr. Kuhn recently retired from the University of California Santa Cruz Genome Browser Project, where he contributed to the growth of the Browser for more than 19 years and established the outreach and training program, presenting more than 300 workshops and trainings in 30 countries. He continues to teach the Genome Browser doing business as Robert Kuhn Consulting.

Realizing the benefits of human genetics and genomics research for people everywhere.
