ADAM version 0.21.0 has been released!

Due to major changes between Spark versions 1.6 and 2.0, we now build for four combinations of Apache Spark and Scala versions: Spark 1.x and Scala 2.10, Spark 1.x and Scala 2.11, Spark 2.x and Scala 2.10, and Spark 2.x and Scala 2.11. The Spark 2.x build-time dependency will be bumped to version 2.1.0 in the next release of ADAM; see issue #1330.

One focus of this release was documentation, both at the developer API level, including extensive javadoc and scaladoc source code comments, and at the user level. The user docs can be compiled to PDF or HTML with pandoc, but to be honest they look better rendered as Markdown on GitHub.

Another focus was to more closely follow the VCF specification(s) when reading from and writing to VCF. For this we made significant changes to our variant and variant annotation schema and added support for version 1.0 of the VCF INFO ‘ANN’ key specification. This work will continue for our genotype and genotype annotation schema in the next version of ADAM.
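For readers unfamiliar with the ‘ANN’ key, here is a minimal Python sketch of how one ANN value breaks down into fields. It is not ADAM's implementation (ADAM is written in Scala and does this conversion internally). The field names follow our reading of version 1.0 of the ANN specification, the function name parse_ann is hypothetical, and the example value is made up for illustration.

```python
# Field order as we understand it from version 1.0 of the ANN specification.
# The snake_case names are our own labels, not identifiers from ADAM.
ANN_FIELDS = [
    "allele", "annotation", "annotation_impact", "gene_name", "gene_id",
    "feature_type", "feature_id", "transcript_biotype", "rank",
    "hgvs_c", "hgvs_p", "cdna_pos_length", "cds_pos_length",
    "aa_pos_length", "distance", "messages",
]


def parse_ann(value):
    """Split an ANN INFO value into one dict per annotation (hypothetical helper)."""
    records = []
    # Multiple annotations for one variant are separated by commas.
    for entry in value.split(","):
        parts = entry.split("|")
        # Pad short entries so every record has all 16 keys.
        parts += [""] * (len(ANN_FIELDS) - len(parts))
        record = dict(zip(ANN_FIELDS, parts))
        # Several effects on the same feature are joined with '&'.
        record["annotation"] = record["annotation"].split("&")
        records.append(record)
    return records


# Made-up example value with a single annotation.
example = (
    "T|missense_variant|MODERATE|GENE1|ID1|transcript|TX1|protein_coding|"
    "3/10|c.101A>T|p.Lys34Met|201/2000|101/1500|34/499||"
)
annotations = parse_ann(example)
```

Each record then exposes, for instance, annotations[0]["gene_name"] and annotations[0]["annotation"]. A production parser would also need to handle escaped characters and validate the field count, which this sketch skips.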

The full list of changes since version 0.20.0 is below.

Closed issues:

  • Update Markdown docs with ValidationStringency in VCF<->ADAM CLI #1342
  • Variant VCFHeaderLine metadata does not handle wildcards properly #1339
  • Close called multiple times on VCF header stream #1337
  • BroadcastRegionJoin has serialization failures #1334
  • adam-cli uses git-commit-id-plugin which breaks release? #1322
  • move_to_xyz scripts should have interlocks… #1317
  • Lineage for partitionAndJoin in ShuffleRegionJoin causes StackOverflow Errors #1308
  • Add script and update README to mention #1307
  • adam-submit transform fails with Exception in thread “main” java.lang.IncompatibleClassChangeError: Implementing class #1306
  • private ADAMContext constructor? #1296
  • AlignmentRecord.mateAlignmentEnd never set #1290
  • how to submit my own driver class via adam-submit? #1289
  • ReferenceRegion on Genotype seems busted? #1286
  • Clarify strandedness in ReferenceRegion apply methods #1285
  • Parquet and CRAM debug logging during unit tests #1280
  • Add more ANN field parsing unit tests #1273
  • loadVariantAnnotations returns empty RDD #1271
  • Implement joinVariantAnnotations with region join #1259
  • Count how many chromosome in the range of the kmer #1249
  • ADAM minor release to support htsjdk 2.7.0? #1248
  • how to config kryo.registrator programmatically #1245
  • Does the nested record Flattener drop Maps/Arrays? #1244
  • Dead-ish code cleanup in org.bdgenomics.adam.utils #1242
  • for old adam file after upgrade to adam0.20 #1240
  • please add maven-source-plugin into the pom file #1239
  • Assembly jar doesn’t get rebuilt on CLI changes #1238
  • how to compare with the last the column for the same chromosome name? #1237
  • Need a way for users to add VCF header lines #1233
  • Enhancements to VCF save #1232
  • Must we split multi-allelic sites in our Genotype model? #1231
  • Can’t override default -collapse in reads2coverage #1228
  • Reads2coverage NPEs on unmapped reads #1227
  • Strand bias doesn’t get exported #1226
  • Move ADAMFunSuite helper functions upstream to SparkFunSuite #1225
  • broadcast join using interval tree #1224
  • Instrumentation is lost in ShuffleRegionJoin #1222
  • Bump Spark, Scala, Hadoop dependency versions #1221
  • GenomicRDD shuffle region join passes partition count to partition size #1220
  • Scala compile errors downstream of Spark 2 Scala 2.11 artifacts #1218
  • Javac error: incompatible types: SparkContext cannot be converted to ADAMContext #1217
  • Release 0.20.0 artifacts failed Sonatype Nexus validation #1212
  • Release script failed for 0.20.0 release #1211
  • gVCF – can’t load multi-allelic sites #1202
  • Allow open-ended intervals in loadIndexedBam #1196
  • Interval tree join in ADAM #1171
  • spark-submit throw exception in spark-standalone using .adam which transformed from .vcf #1121
  • BroadcastRegionJoin is not a broadcast join #1110
  • Improve test coverage of VariantContextConverter #1107
  • Variant dbsnp rs id tracking in vcf2adam and ADAM2Vcf #1103
  • Document core ADAM transform methods #1085
  • Document deploying ADAM on Toil #1084
  • Clean up packages #1083
  • VariantCallingAnnotations is getting populated with INFO fields #1063
  • How to load DatabaseVariantAnnotation information ? #1049
  • Release ADAM version 0.20.0 #1048
  • Support VCF annotation ANN field in vcf2adam and adam2vcf #1044
  • How to create a rich(er) VariantContext RDD? Reconstruct VCF INFO fields. #878
  • Add biologist targeted section to the README #497
  • Update usage docs running for EC2 and CDH #493
  • Add docs about building downstream apps on top of ADAM #291
  • Variant filter representation #194

Merged and closed pull requests: