
A few ADAM releases have been made since the last announcement; we’ll attempt to catch up here.

The most recent is version 0.18.2, a bugfix release built for both Scala 2.10 and Scala 2.11. It fixes a minor issue with the binary distribution artifact from version 0.18.1.

Prior to version 0.18.2, we made significant changes to support version 0.6.0 of the Big Data Genomics Avro data formats. We also improved the performance of the core transforms (markdups, indel realignment, BQSR) by using finer-grained projection. We fixed issues in TwoBitFile when handling gaps and masked regions, and improved round-trip transformations from native formats (e.g., FASTA, FASTQ, SAM, BAM) to ADAM and back. Extending ADAM is also more straightforward now.

ADAM now runs on a wide range of Apache Spark (1.2.1 up to and including the most recent, 1.5.1) and Apache Hadoop (currently 1.0.4, 2.3.0 and 2.6.0) versions. This is verified by a compatibility matrix of Spark, Hadoop, and Scala version builds in our continuous integration system.
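
As a rough illustration of what such a matrix covers, the sketch below enumerates combinations of the versions named above. It assumes every combination is built, which may not match the actual continuous integration configuration, and it lists only the oldest and newest Spark versions because the intermediate releases are not named here. The function name `build_matrix` is hypothetical.

```python
from itertools import product

# Versions named in this post. Intermediate Spark releases are omitted
# because only the endpoints of the supported range are given.
SPARK_VERSIONS = ["1.2.1", "1.5.1"]
HADOOP_VERSIONS = ["1.0.4", "2.3.0", "2.6.0"]
SCALA_VERSIONS = ["2.10", "2.11"]


def build_matrix(spark=SPARK_VERSIONS, hadoop=HADOOP_VERSIONS, scala=SCALA_VERSIONS):
    """Return every (spark, hadoop, scala) combination as a list of dicts.

    Assumption: the full cartesian product is built; a real CI setup
    may exclude some pairings.
    """
    return [
        {"spark": s, "hadoop": h, "scala": c}
        for s, h, c in product(spark, hadoop, scala)
    ]


if __name__ == "__main__":
    for combo in build_matrix():
        print("Spark {spark} / Hadoop {hadoop} / Scala {scala}".format(**combo))
```

With the versions listed, this yields twelve combinations; adding an intermediate Spark release to `SPARK_VERSIONS` adds six more.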

The full list of changes since version 0.17.0 is below.

Version 0.18.2

  • ISSUE 877: Minor fix to commit script to support https.
  • ISSUE 876: Separate command line argument words by underscores
  • ISSUE 875: P Operator parsing for MDTag
  • ISSUE 873: [ADAM-872] Modify regex to capture release and SNAPSHOT jars but not javadoc or sources jars
  • ISSUE 866: [ADAM-864] Don’t force shuffle if reducing partition count.
  • ISSUE 856: export valid fastq
  • ISSUE 847: Updating build dependency versions to latest minor versions
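
Issue 873 concerns a pattern that matches release and SNAPSHOT jars while skipping javadoc and sources jars. The pattern actually used in the ADAM scripts is not reproduced here; the following is a minimal sketch of one way to write such a pattern. The name `is_main_jar` and the example file names are hypothetical.

```python
import re

# Hypothetical pattern: an artifact name, a dotted numeric version, an
# optional -SNAPSHOT suffix, then ".jar". Nothing else is allowed between
# the version and ".jar", so "-javadoc.jar" and "-sources.jar" are rejected.
JAR_PATTERN = re.compile(r"^[\w.-]+?-\d+(?:\.\d+)*(?:-SNAPSHOT)?\.jar$")


def is_main_jar(filename):
    """Return True for release or SNAPSHOT jars, False for classifier jars."""
    return JAR_PATTERN.match(filename) is not None


if __name__ == "__main__":
    for name in [
        "adam-cli-0.18.2.jar",
        "adam-cli-0.18.3-SNAPSHOT.jar",
        "adam-cli-0.18.2-javadoc.jar",
        "adam-cli-0.18.2-sources.jar",
    ]:
        print(name, is_main_jar(name))
```

Anchoring the pattern at both ends is what excludes the classifier jars: an unanchored match on the version alone would accept all four names.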

Version 0.18.1

  • ISSUE 870: [ADAM-867] add pull requests missing from 0.18.0 release to CHANGES.md
  • ISSUE 869: [ADAM-868] make release branch and tag names consistent
  • ISSUE 862: [ADAM-861] use -d to check for repo assembly dir

Version 0.18.0

  • ISSUE 860: New release and pr-commit scripts
  • ISSUE 859: [ADAM-857] Corrected handling of env vars in bin scripts
  • ISSUE 854: [ADAM-853] allow main class in adam-submit to be specified
  • ISSUE 852: [ADAM-851] Silenced Parquet logging.
  • ISSUE 850: [ADAM-848] TwoBitFile now supports nBlocks and maskBlocks
  • ISSUE 846: Updating maven build plugin dependency versions
  • ISSUE 845: [ADAM-780] Make DecadentRead package private.
  • ISSUE 844: [ADAM-843] Aggressively project out metadata fields.
  • ISSUE 840: fix flagstat output file encoding
  • ISSUE 839: let flagstat write to file
  • ISSUE 831: Support loading paired fastqs
  • ISSUE 830: better validation when saving paired fastqs
  • ISSUE 829: fix Long != null warnings
  • ISSUE 819: Implement custom ReferenceRegion hashcode
  • ISSUE 816: [ADAM-793] adding command to convert ADAM nucleotide contig fragments to FASTA files
  • ISSUE 815: Upgrade to bdg-formats:0.6.0, add Fragment datatype converters
  • ISSUE 814: [ADAM-812] fix for javadoc errors on JDK8
  • ISSUE 813: [ADAM-808] build an assembly cli jar with maven shade plugin
  • ISSUE 810: [ADAM-807] workaround for ktoso/maven-git-commit-id-plugin#61
  • ISSUE 809: [ADAM-785] Add support for all numeric array (TYPE=B) tags
  • ISSUE 806: [ADAM-755] updating utils dependency version to 0.2.3
  • ISSUE 805: Better transform error when file doesn’t exist
  • ISSUE 803: fix unmapped-read sorting
  • ISSUE 802: stop writing contig names as md5 sums
  • ISSUE 798: fix SAM-attr conversion bug; int[]’s not byte[]’s
  • ISSUE 790: optionally add MDTags to reads with transform
  • ISSUE 782: Fix SAM Attribute parser for numeric array tags
  • ISSUE 773: [ADAM-772] fix some bash var quoting
  • ISSUE 765: [ADAM-752] Build for many combos of Spark/Hadoop versions.
  • ISSUE 764: More involved README restructuring
  • ISSUE 762: [ADAM-132] allowing list of commands to be injected into adam-cli ADAMMain

Version 0.17.1

  • ISSUE 784: [ADAM-783] Write @SQ header lines in sorted order.
  • ISSUE 792: [ADAM-791] Add repartition parameter to Fasta2ADAM.
  • ISSUE 781: [ADAM-777] Add validation stringency flag for BQSR.
  • ISSUE 757: We should print a warning message if the user has ADAM_OPTS set.
  • ISSUE 770: [ADAM-769] Fix serialization issue in known indel consensus model.
  • ISSUE 763: Clean up README links, other nits
  • ISSUE 749: Remove adam-cli jar from classpath during adam-submit
  • ISSUE 754: Bump ADAM to Spark 1.4
  • ISSUE 753: Bump Spark to 1.4
  • ISSUE 748: Fix for mdtag issues with insertions
  • ISSUE 746: Upgrade to Parquet 1.8.1.
  • ISSUE 744: [ADAM-743] exclude conflicting jackson dependencies
  • ISSUE 737: Reverse complement negative strand reads in fastq output
  • ISSUE 731: Fixed bug preventing use of TLEN attribute
  • ISSUE 730: [ADAM-729] Stuff TLEN into attributes.
  • ISSUE 728: [ADAM-709] Remove FeatureHierarchy and FeatureHierarchySuite
  • ISSUE 719: [ADAM-718] Use filesystem path to get underlying file system.
  • ISSUE 712: unify header-setting between BAM/SAM and VCF
  • ISSUE 696: include SequenceRecords from second-in-pair reads
  • ISSUE 698: class-ify ShuffleRegionJoin, force setting seqdict
  • ISSUE 706: restore clause guarding pruneCache check
  • ISSUE 705: GeneFeatureRDDFunctions → FeatureRDDFunctions
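
Issue 737 above reverse-complements negative strand reads when writing FASTQ. Aligned reads on the negative strand are stored in reference orientation, so recovering the read as sequenced means reverse-complementing the bases and reversing the quality string. The sketch below shows that transformation in isolation; it is not ADAM's implementation, and the function name `to_fastq_orientation` is hypothetical.

```python
# Translation table for nucleotide complements; N stays N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def to_fastq_orientation(sequence, qualities, negative_strand):
    """Return (sequence, qualities) in the orientation the read was sequenced.

    Forward strand reads are returned unchanged. Negative strand reads have
    their bases reverse-complemented and their qualities reversed so that
    each quality score stays paired with its base.
    """
    if not negative_strand:
        return sequence, qualities
    return sequence.translate(_COMPLEMENT)[::-1], qualities[::-1]


if __name__ == "__main__":
    print(to_fastq_orientation("AACG", "IIH#", negative_strand=True))
```

Reversing the quality string alongside the bases matters: complementing without reversing the qualities would attach each score to the wrong base.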
