The 2017 Galaxy Community Conference (GCC2017) features two days of training on 27-28 June. The first day will be single track and focus on how to use Galaxy for research. The second day will have multiple tracks, with each track featuring three sessions.
Topics will then be selected and scheduled based on topic interest, and the organisers' ability to confirm instructors for each session. Some very popular sessions may be scheduled more than once. The final schedule will be posted before registration opens.
This workshop will focus on introducing the Galaxy user interface and how it can be used to analyze large datasets. We will cover the basic features of Galaxy, including where to find tools, how to import and use your data, and an introduction to workflows. This session is recommended for anyone who has not used, or only rarely uses, Galaxy.
This workshop continues where the Introduction to Galaxy session ends. This session focuses on applying Galaxy to large-scale data analysis, including:
This workshop will focus on visualisation of large datasets using the built-in tools of Galaxy, focusing on primary next-generation sequencing (NGS) data and the resulting downstream, aggregated data. First, using a multi-omic dataset consisting of exome and transcriptome (RNA-seq) data, participants will visualise alignments, variation, expression levels, and annotations using Galaxy’s built-in genome browser, Trackster. Participants will learn how to create a genome visualisation, add data, configure data, move between a linear genome browser view and a Circos view, and generate complex genome visualisations (figures) with more than 12 NGS datasets. Second, using a processed multi-omic dataset, participants will create several numerical plots (e.g., scatter plot, histogram) to gain an overview of the data. Based on insight gained from these visualisations, participants will create a heatmap to identify patterns and potential causal factors. All visualisations will be created, saved, and shared using only Galaxy and a web browser; no data or software uploads or downloads will be necessary.
With affordable sequencing, a myriad of protocols to query omics features, and large repositories of public data, it is often within reach to study omics features across cell types or conditions. However, there is very limited availability of bioinformatics methodology for integrative or comparative studies across datasets. The introduction of dataset lists in Galaxy has made it much more practical to repeatedly perform an operation on each dataset of a collection. Nevertheless, the challenge of how to meaningfully integrate or contrast results across datasets remains (across different cell types, conditions, diseases, sources, etc.).
In this training session you will learn how to use GSuite HyperBrowser, a recently finalized system that we consider to be the first to offer such analytical capabilities. Collections of up to hundreds of datasets are easily retrieved from repositories like ENCODE and Roadmap Epigenomics, stored in an editable tabular format (GSuite), customized for analysis in a variety of ways, and analyzed visually or statistically using a diverse range of tools. As the system is based on a general treatment of data as genomic tracks (e.g. BED files), it can be used for a variety of omics features and questions. Examples include the study of co-binding of transcription factors, the study of cancer mutations, and the study of disease-associated variation versus cell-specific transcription or gene regulation. A manuscript on the system has recently been re-submitted, and a preprint is available at http://biorxiv.org/content/early/2016/08/03/067561. The system itself is available from https://hyperbrowser.uio.no/hb, along with guides, demos and interactive tutorials.
In the training session you will compose data collections for a variety of genomic features, learn how to customize the collections for specific analytical aims, perform visual and statistical analyses, and browse and interpret the results.
Did my IP work? Where is my signal? How well do my replicates correlate? What might my peaks even look like? Where are my peaks (or signal) in relationship to transcription start sites (or other features)? These are common questions that biologists first pose when dealing with ChIPseq data. We will use deepTools and MACS within Galaxy to demonstrate effective methods of
RADseq data allow scientists to gather genome wide information with a low-cost approach compared to complete genome sequencing. In this training session, we will show how to analyze RADseq data to:
Stacks works with restriction-enzyme based data, including GBS, CRoPS, and single and double digest RAD. Stacks identifies loci in a set of individuals, either de novo or aligned to a reference genome (including gapped alignments), and then genotypes each locus. See the Stacks Manual for full details.
Stacks has been integrated into Galaxy and is available via the GUGGO Tool Shed.
This workshop will introduce the concepts behind transcriptomics with NGS data and how to analyze this data in Galaxy. Specifically, this workshop will focus on de novo transcriptome reconstruction of RNA-seq data with the following goals:
This workshop will introduce the basic principles of flow cytometry, various technical issues associated with its analysis, and a hands-on analysis of a published flow cytometry dataset. Participants will be introduced to the concept and purposes of flow cytometry, with an emphasis on the mechanics of the flow cytometer, the data produced by the flow cytometer, and best practices in experimental design. The workshop will proceed by describing the technical issues that need to be addressed in flow cytometry analysis and how to confront them, focusing on gating strategies and compensation with antibodies. The workshop will continue with a hands-on exercise comprehensively analyzing a published dataset with the ImmPort Galaxy platform.
This session introduces the use of Galaxy for metabolomics data analysis. During this session, we will use the Workflow4Metabolomics instance, which provides complete workflows for LC-MS, GC-MS and NMR data: pre-processing, normalization, statistical analysis and annotation.
After metagenomic data generation, you need to extract useful information such as the taxonomic composition of your samples or the metabolic functions performed by the studied environmental sample. Several tools have recently been integrated into Galaxy for metagenomic data analysis, including Mothur, QIIME, MetaPhlAn, HUMAnN, and FROGS.
We will show in this training how to analyze metagenomic and amplicon data inside Galaxy:
The Galaxy bioinformatics platform has emerged as a valuable resource for mass spectrometry (MS) based proteomic informatics. An active community of researchers and users, including the Galaxy for proteomics (Galaxy-P) team, continues to extend Galaxy for these applications.
This hands-on workshop will guide participants through the essential steps for using Galaxy for the analysis of MS-based proteomics data, focusing on protein identification and more advanced multi-omic applications. Workflows from emerging applications integrating genomic and proteomic data (such as proteogenomics and metaproteomics) will also be demonstrated.
In order to extend the reach of these workflows to the greater community, the Galaxy-P team has partnered with both the JetStream cyberinfrastructure resource (http://jetstream-cloud.org/) and Amazon Web Services (https://aws.amazon.com).
The workshop will be constructed to follow the steps based on the structure below:
At the end of the workshop, attendees will have working knowledge of MS-based proteomics tools; experience in setting up basic workflows for protein identification, as well as more advanced workflows in proteogenomics and metaproteomics.
Participants will be given temporary accounts to a cloud-based Galaxy instance to participate in hands-on workshop activities.
This hands-on workshop will explain the basis of chromosome conformation capture using Hi-C and will introduce pipelines to map, tabulate, filter, analyze and visualise the data. We will teach how to use a Hi-C browser within Galaxy to navigate Hi-C, ChIP-seq, RNA-seq, and other tracks.
The SNiPlay workflow makes it possible to exploit high-density SNP data from a VCF file. In this training, we will show how to analyze SNP data in different ways:
SNiPlay has been integrated into Galaxy and is available via the main Tool Shed as a complete workflow.
Dereeper A, Homa F, Andres G, Sempere G, Sarah G, Hueber Y, Dufayard JF, Ruiz M. SNiPlay3: a web-based application for exploration and large scale analyses of genomic variations. Nucleic Acids Res. 2015 Jul 1;43(W1):W295-300.
This workshop will cover the basics of de novo genome assembly using a small genome example. This includes project planning steps, selecting fragment sizes, initial assembly of reads into fully covered contigs, and then assembling those contigs into larger scaffolds that may include gaps. The end result will be a set of contigs and scaffolds with sufficient average length for further analysis, including genome annotation. This workshop will use tools and methods targeted at small genomes. The basics of assembly and scaffolding presented here will be useful for building larger genomes, but the specific tools and much of the project planning will be different.
Galaxy has an always-growing API that allows for external programs to upload and download data, manage histories and datasets, run tools and workflows, and even perform admin tasks. This session will cover various approaches to access the API, in particular using the BioBlend Python library.
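As a rough illustration of the kind of API access described above, the following sketch uses BioBlend to list histories and workflows on a Galaxy server. The helper names (`connect`, `history_names`, `workflow_ids_by_name`) are hypothetical and chosen for this example only; the server URL and API key in the usage note are placeholders. BioBlend is imported lazily so that the helpers can also be tried against a stand-in object.

```python
from typing import Any, Dict, List


def connect(url: str, api_key: str) -> Any:
    """Return a BioBlend GalaxyInstance (assumes `pip install bioblend`)."""
    # Imported here so the helpers below work without BioBlend installed.
    from bioblend.galaxy import GalaxyInstance

    return GalaxyInstance(url=url, key=api_key)


def history_names(gi: Any) -> List[str]:
    """Names of the histories visible to the owner of the API key."""
    histories: List[Dict[str, Any]] = gi.histories.get_histories()
    return [h["name"] for h in histories]


def workflow_ids_by_name(gi: Any) -> Dict[str, str]:
    """Map each workflow name to its Galaxy-encoded ID."""
    workflows: List[Dict[str, Any]] = gi.workflows.get_workflows()
    return {w["name"]: w["id"] for w in workflows}
```

Against a real server this might be used as `gi = connect("https://galaxy.example.org", "YOUR_API_KEY")` followed by `print(history_names(gi))`, where both arguments are placeholders to be replaced with your own instance URL and key.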
What is important when you set up a Galaxy server from scratch? What pitfalls might you run into? How should you interact with the potential users of the service you are going to offer, and how do you make sure that the Galaxy instance you have set up is really used in the end? After a general introduction, several Galaxy installations are presented. The session will include some demonstrations and hands-on exercises. We will finish with a panel discussion, where we intend to discuss questions from the workshop participants.
Do you have your lab's Galaxy instance set up and configured but want to give it some more love without diving too deep into the code? This training will show you step by step how to modify some advanced but not complex parts of the installation. We will teach you how to:
This session is a compressed, top-level review of the advanced parts of the Galaxy Administrators Course offered in Salt Lake City in November and in Melbourne in February. Given the scope of this topic, we will explain advanced concepts, point out resources, and provide guidance, tips, and tricks rather than going through the exercises in detail.
The Galaxy project has developed a significant number of Ansible roles that enable anyone to build a production-level Galaxy server on any infrastructure without much manual effort. In this workshop, we will cover the purpose of the available roles and how they relate to each other. To showcase their use, we will build a complete Galaxy server with a personal choice of tools using only a handful of commands.
This training session will discuss how the Galaxy team tests Galaxy and how you can contribute. We will cover:
Since the last GCC, an all-new CloudLaunch application was developed as a method for deploying, accessing and monitoring a wide range of application services on the world’s compute infrastructures. It allows existing publicly available services - even non-Galaxy ones - to be linked into this central repository as well as new cloud appliances to be deployed on demand. A range of cloud computing infrastructures are supported. In this training, we will take a look at the CloudLaunch features, learn how to launch appliances on various cloud computing infrastructures and then take a deeper look at how to define and add one’s own appliance to CloudLaunch. Each appliance can have its own user interface elements and present the end-user with a suitable launch experience.
Want to know the big picture about what is going on inside Galaxy? This workshop will give participants a practical introduction to the Galaxy code base with a focus on changing those parts of Galaxy most often modified by local deployers and new contributors.
The workshop will include the following specific content:
This session will walk developers and bioinformaticians through the process of taking a working script or application and turning it into a Galaxy tool. It will also cover the basics of using Planemo: a command-line utility to assist in building and publishing Galaxy tools. We will investigate wrapping, common parameters, tool linting, best practices, loading tools into Galaxy, citations, and publishing tools to Github and the Galaxy Tool Shed. Common tips and tricks will be discussed as well as insights from experienced tool developers.
This workshop is aimed at people with some experience developing tools and will cover more advanced topics in tool development, more complex tools, and recent enhancements to the Galaxy tool development process including:
In this age of high-throughput analysis and big data, visualisations have become an invaluable resource for the presentation and exploration of these often high-dimensional, complex, and large datasets.
While many tools in Galaxy produce static visual outputs (graphs, trees, etc), often some more interactivity is desired to aid in the exploration of these datasets. To support this need, Galaxy offers a range of visualisation options, such as Trackster for browsing genomic data and Charts for the interactive visualisation of tabular data and other datatypes.
In this session, you will get an introduction to the integration of external resources into Galaxy. You will learn how to create new “Get Data” and “Send Data” tools and also how to create new links that appear within History Items that allow users to send datasets to external web resources with a single click (“External Display Applications”). We will cover existing examples and learn how to create new connections to external resources as well as to custom applications.
This workshop is aimed at people with some experience developing tools but may also be of use to deployers who need to manage complex sets of dependencies for tools.
Galaxy tools define the applications and other dependencies they require to run using their requirements section. This training session will cover the elements of the requirements section and how Galaxy can be configured to utilize these.
The current best practice for resolving these dependencies is using Conda and Bioconda, and so a substantial amount of time will be spent on these topics. We will go through the process of creating, testing, and publishing a Bioconda package. We will work through an example of connecting these packages to Galaxy.
We will also discuss how the Biocontainers project constructs Docker containers from Bioconda packages and how to emulate this process for local testing before publication. Finally, we will review approaches to leveraging these containers from Galaxy to run jobs within containers.
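To make the requirements section discussed above more concrete, here is a minimal sketch that reads the `<requirements>` block of a tool definition and turns each package requirement into a `name=version` string of the kind one would pass to `conda install`. The tool XML is an invented example, the function names are hypothetical, and the name-and-version mapping is a simplification of how a dependency resolver might match packages, not a description of Galaxy's internal code.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Invented tool definition, used only for illustration.
TOOL_XML = """\
<tool id="example_tool" name="Example tool" version="0.1.0">
    <requirements>
        <requirement type="package" version="1.9">samtools</requirement>
        <requirement type="package" version="2.27.1">bedtools</requirement>
    </requirements>
    <command>samtools --version</command>
</tool>
"""


def package_requirements(tool_xml: str) -> List[Tuple[str, str]]:
    """Return (name, version) pairs for every package requirement."""
    root = ET.fromstring(tool_xml)
    found: List[Tuple[str, str]] = []
    for req in root.findall("./requirements/requirement"):
        if req.get("type") == "package":
            name = (req.text or "").strip()
            found.append((name, req.get("version", "")))
    return found


def conda_specs(requirements: List[Tuple[str, str]]) -> List[str]:
    """Format requirements as conda match specs, e.g. 'samtools=1.9'."""
    return [f"{name}={version}" if version else name
            for name, version in requirements]


if __name__ == "__main__":
    print(conda_specs(package_requirements(TOOL_XML)))
```

Pinning an explicit version in each requirement is what lets a resolver pick the same package, or a container built from it, on every server that runs the tool.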
Galaxy has a set of useful features for running tools as Docker containers. This includes support for mulled containers, where Galaxy automatically detects that a tool has an associated Docker image, pulls it, and runs it. In addition, Galaxy is able to submit containerized jobs to Docker-enabled HPC platforms, e.g. HTCondor, Docker Swarm, and Kubernetes.
By attending this training you will be able to:
In this session you will get an introduction to Interactive Environments (IEs) as an easy and powerful way to integrate arbitrary interactive web services into Galaxy. We will demonstrate the IPython Galaxy project and the general concept of IEs.
In this session you will get an in-depth introduction to Interactive Environments (IEs). You will learn how to set up and secure IEs in a production Galaxy instance. Moreover, we will create an IE on the fly to get you started in creating your own Interactive Environments.
Tripal software enables the construction of genetic and genomic databases with the content management system Drupal. Typical Tripal sites house records for organisms, genes, genomes, germplasm, genotypes and phenotypes. We now have a Tripal Galaxy module that will enable anyone with a Tripal site to couple it to a Galaxy instance and enable site users to run workflows. In this tutorial, we will use NSF's new cloud resource, Jetstream, to deploy a Docker container with Galaxy and Tripal. The tutorial will cover setup and explore the Tripal-ized Galaxy interface from both an administrative and a user perspective.
The Galaxy Genome Annotation project (https://github.com/galaxy-genome-annotation) aims to provide a software infrastructure to ease the annotation of genomes inside an integrated environment. This project is based on various popular GMOD components (JBrowse, Apollo, Chado, Tripal, ...), as well as a dedicated Galaxy flavor.
Developments are focusing on:
With these developments it is now possible to set up a complete web environment where one can produce genomic data with standard Galaxy tools, then directly visualise them in a JBrowse genome browser, insert them into a Chado database, or use them for manual curation of gene models with Apollo.
This project also makes the deployment of a complete reference information system very easy, flexible, and reliable.
In this workshop, you will learn how to deploy a custom Docker-based annotation environment with Galaxy that includes multiple GMOD tools and how to leverage this environment to do annotation using example data.
Scientific workflows handle growing amounts of data, and sometimes the volume is sufficient to bog down conventional computational methods. Hadoop and related technologies can help in these cases, letting workflows scale to new levels of throughput through distributed computing.
In this session you will learn the basics about Hadoop and running Hadoop applications, how to integrate Hadoop applications with Galaxy and how to string them together to form workflows. We will discuss pros and cons of using Hadoop, as well as the shortcomings of the current Hadoop-Galaxy integration methods. In the practical session we will build a DNA-alignment pipeline using available Hadoop-based tools.
Galaxy is a great platform for teaching diverse scientific topics to a broad user base. The flexibility, reproducibility, and scalability of Galaxy make it an ideal environment for teaching and training. The Galaxy Training Network is a community initiative dedicated to high-quality Galaxy-based training around the world. One of its objectives is to support trainers with complete training material and recommendations about bioinformatics training. Templates and best training practices were defined to help trainers create new material, unify the different material, and make training materials more accessible and easy for users to learn and for teachers to teach.
This workshop will first introduce participants to the infrastructure of the GTN training materials and describe how to generate training materials following best practices. Participants will generate Galaxy Interactive Tours and create Docker Flavours intended for teaching and training sessions. The workshop will also cover best practices for running Galaxy-based workshops, focusing on how to plan a training session based on number of attendees, time constraints, resource availability, and some best practices for leading Galaxy training sessions.