BCC2020 has ended
Registration closed July 15.

BCC2020 is online, global, and affordable. The meeting and training are now done, and the CoFest is under way.

The 2020 Bioinformatics Community Conference brings together the Bioinformatics Open Source Conference (BOSC) and the Galaxy Community Conference into a single event featuring training, a meeting, and a CollaborationFest. Events run from July 17 through July 25 and are held in both the eastern and western hemispheres.

Friday, July 17 • 12:16 - 14:45
Scaling genomic analysis with Glow and Apache Spark


Tutorial: Get Started, Ingest Data, Transform Variants, Run GloWgR

Glow makes genomic data work with Apache Spark, the leading engine for working with large structured datasets. It fits natively into the ecosystem of tools that have enabled thousands of organizations to scale their workflows to petabytes of data. Glow bridges the gap between bioinformatics and the Spark ecosystem by working with datasets in common file formats like VCF, BGEN, and PLINK as well as high-performance big data standards. You can write queries using the native Spark SQL APIs in Python, SQL, R, Java, and Scala. The same APIs allow you to bring your genomic data together with other datasets such as electronic health records, real-world evidence, and medical images. Glow also makes it easy to parallelize existing tools and libraries implemented as command-line tools or pandas functions.
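As a rough illustration of the "native Spark SQL APIs" point, the sketch below reads a VCF file through Glow's `vcf` data source and counts variants per contig with ordinary DataFrame calls. It is not taken from the tutorial materials. It assumes that `pyspark` is installed, that the Glow package is available on the Spark classpath, and that the resulting DataFrame has a `contigName` column; the function name and any file path passed to it are hypothetical.

```python
# Minimal sketch, not the tutorial's code.
# Assumptions: Glow is on the Spark classpath so that the "vcf" data source
# resolves, and the loaded DataFrame exposes a "contigName" column.


def count_variants_per_contig(spark, vcf_path):
    """Load a VCF as a Spark DataFrame and count rows (variants) per contig.

    `spark` is an existing SparkSession; `vcf_path` is a caller-supplied
    (hypothetical) path to a VCF file.
    """
    # Read the file with the "vcf" data source instead of a custom parser.
    variants = spark.read.format("vcf").load(vcf_path)
    # From here on this is plain Spark: group by contig and count the rows.
    return variants.groupBy("contigName").count()
```

With a running session, this could be called as `count_variants_per_contig(spark, "/path/to/sample.vcf").show()`. Because the result is a regular DataFrame, it could in principle be joined with other tabular datasets using the same API.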

  • Basic Python
  • Some exposure to Spark is useful but not necessary


Henry Davidge


Karen Feng

Software Engineer, Databricks

Rishi Ghose

Solutions Architect, Databricks

Michael Shtelma

Lead Specialist Solutions Architect, Databricks

Frank Nothaft

GTM Lead-Genomics, Databricks

Kiavash Kianfar

Sr. Software Engineer, Databricks
I am currently a Sr. Software Engineer in the Health and Life Sciences team at Databricks working on scalable unified analytics for Genomics, while being on leave from my tenured Associate Professor position at Texas A&M University. @Databricks: Developing algorithms and software...

Amir Kermany


Will Brandler


Friday July 17, 2020 12:16 - 14:45 EDT
Training B