Thanks to improvements in both the cost and speed of sequencing technology, the amount of genomic data available for processing is growing exponentially. Our goal is to build scalable pipelines for processing genomic data on top of high-performance distributed computing frameworks.
At the moment, we are working on three projects:
- ADAM: A scalable API & CLI for genome processing
- bdg-formats: Schemas for genomic data
- avocado: A Variant Caller, Distributed
The source for these projects is available on GitHub.
All of our software is released under the Apache License, Version 2.0, an open source software (OSS) license. This license is permissive (non-viral): users may use, modify, and redistribute the software, provided they retain the license and attribution notices.