Releases: exomiser/Exomiser
All Change Please...
Data updated:
- DbSNP b150_GRCh37p13
- ExAC 0.3.1
- ESP ESP6500SI-V2
- dbNSFP 3.4a
CLI library changes:
- See https://github.com/exomiser/Exomiser/projects/2 for a complete list of changes.
- application.properties file has changed to use exomiser namespace prefix. Will allow property placeholder substitution - e.g. exomiser.property=foo can be used elsewhere in the file as ${exomiser.property}. Will support user-defined property values too.
- Analysis file now requires proband id to be specified via the --proband option when using multi-sample VCF and pedigrees. Bug-fix for multi-sample VCF files where the proband sample is not the first sample in the genotypes section leading to occasional scores of 0 for the exomiser_gene_variant_score in cases where the variants are heterozygous and consistent with autosomal recessive.
- Analysis file scoringMode option has now been removed as it was never used.
- Analysis now supports a new failedVariantFilter: {} to remove variants without a 'PASS' or '.' in the FILTER field.
- Can now filter variants by LOCAL frequency source.
- It is now possible to use UCSC, ENSEMBL or REFSEQ transcript identifiers.
- REMM data is no longer bundled with the distribution. If you want to use this for non-coding variant pathogenicity scoring you'll need to manually download and install it.
- Memory requirements are now reduced.
- Fixed AR comp-het scoring bug.
- Now partly normalises incoming variant data enabling better performance for multi-allelic sites.
- Variants contributing to the exomiser score are now flagged in output files.
- VCF output now has valid headers for info fields and more informative content.
- VCF output no longer contains invalid values in FILTER field for failed variants.
- VCF lines containing multiple alleles now contain the field ExContribAltAllele with a zero-based integer indicating the ALT allele contributing to the score.
- HTML output now shows individual variant scores and flags contributing variants along with displaying them first.
- HTML output tweaked to display data more clearly in the genes section.
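The property placeholder substitution described above for application.properties can be pictured with a small sketch. This is only an illustration of the idea in Python, not how the application resolves properties internally; the exomiser.other key and the helper names are made up for the example.

```python
import re

# Matches ${some.key} and captures the key name.
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def resolve(props):
    """Replace each ${key} with that key's value, following chains of references."""
    def expand(value, seen):
        def substitute(match):
            key = match.group(1)
            if key in seen:
                raise ValueError(f"circular placeholder: {key}")
            if key not in props:
                return match.group(0)  # leave unknown placeholders untouched
            return expand(props[key], seen | {key})
        return PLACEHOLDER.sub(substitute, value)

    return {key: expand(value, {key}) for key, value in props.items()}


# exomiser.other is a hypothetical user-defined property.
example = """
exomiser.property=foo
exomiser.other=${exomiser.property}/bar
"""
resolved = resolve(parse_properties(example))
print(resolved["exomiser.other"])  # foo/bar
```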
Core library changes:
- Namespace changed from de.charite.compbio to org.monarchinitiative.
- Package layout has been changed to be more modular. New packages include genome and phenotype.
- phenotype package is independent of the others and contains the new PhenodigmModelScorer.
- Many classes are now immutable value objects, for example the Frequency, FrequencyData and RsId classes. These use static of() constructors.
- Builders are now used extensively and are exposed using the static Class.builder() method.
- Prioritisers have been extensively refactored and test coverage has been much improved from zero.
- Prioritiser interface signature change.
- Exomiser class now has static getAnalysisBuilder() exposing a fluent API for building and running an analysis.
- New GeneSymbol class for storing mappings between HGNC and the UCSC/ENSEMBL/REFSEQ gene identifiers.
- New TranscriptAnnotation class for storing transcript annotations. This provides a much-improved memory footprint.
- New AllelePosition class for storing POS, REF and ALT and also providing basic variant normalisation/trimming.
- New TabixDataSource interface to abstract the TabixReader allowing simpler testing and other benefits.
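For readers unfamiliar with the variant normalisation/trimming mentioned for AllelePosition, the sketch below shows one common way to trim bases shared by REF and ALT. It is written in Python for brevity and is an assumption about the general technique, not a copy of the AllelePosition implementation; the trim function name is invented for the example.

```python
def trim(pos, ref, alt):
    """Trim bases shared by REF and ALT, keeping at least one base in each."""
    # Trim the shared suffix first; POS is unaffected by right-trimming.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then trim the shared prefix, moving POS right by one per base removed.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


print(trim(100, "GCACA", "GCA"))  # (100, 'GCA', 'G')
print(trim(100, "AGT", "ACT"))    # (101, 'G', 'C')
```

Applying the same trimming to each ALT of a multi-allelic site separately gives one minimal REF/ALT pair per allele, which makes lookups against frequency and pathogenicity data more likely to match.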
AR HET partial bugfix
- Partial bug-fix for multi-sample VCF files where the proband sample is not the first sample in the genotypes section leading to occasional scores of 0 for the exomiser_gene_variant_score in cases where the variants are heterozygous and consistent with autosomal recessive.
IMPORTANT! As a workaround for this issue ensure the proband sample is the first sample in the VCF file. This will be properly fixed in the next major release.
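One way to apply the workaround is to rewrite the VCF so the proband's genotype column comes first. The following is a minimal Python sketch, assuming a plain-text, tab-delimited VCF with the nine standard fixed columns; the proband_first function and the sample names are hypothetical, and established VCF tooling may be a better choice for real data.

```python
FIXED_COLUMNS = 9  # CHROM POS ID REF ALT QUAL FILTER INFO FORMAT precede the samples


def proband_first(vcf_lines, proband):
    """Return VCF lines with the proband's genotype column moved to first place."""
    order = None
    out = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            out.append(line)  # meta-information lines pass through unchanged
            continue
        fields = line.split("\t")
        samples = fields[FIXED_COLUMNS:]
        if line.startswith("#CHROM"):
            if proband not in samples:
                raise ValueError(f"sample {proband!r} not in VCF header")
            idx = samples.index(proband)
            order = [idx] + [i for i in range(len(samples)) if i != idx]
        elif order is None:
            raise ValueError("data line found before #CHROM header")
        # Header and data lines are reordered with the same column order.
        out.append("\t".join(fields[:FIXED_COLUMNS] + [samples[i] for i in order]))
    return out


vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tmother\tfather\tchild",
    "1\t12345\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t0/0\t0/1",
]
reordered = proband_first(vcf, "child")
print(reordered[1].split("\t")[FIXED_COLUMNS:])  # ['child', 'mother', 'father']
```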
Squashing bugs
- Fix for issue when using OmimPrioritiser with UNDEFINED inheritance mode which led to gene phenotype scores being halved.
- Fix for VCF output multiple allele line duplications. VCF output will now have alternate alleles written out on the same line if they were originally like that in the input VCF. The variant scores will be concatenated to correspond with the alleles. VCFs containing alleles split onto separate lines in the input file will continue to have them like this in the output file.
The Genomiser On A Diet
Changed the bundling of data and binaries a bit more so this distribution is a lot slimmer. The data part is still enormous.
7.2.1 2016-01-05
- Fix for incorrect inheritance mode calculations where the variant chromosome number is prefixed with 'chr' in VCF file.
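The class of bug fixed here comes from VCF files naming the same chromosome either 'X' or 'chrX'. A minimal Python sketch of the usual remedy, normalising the name before any comparison, is shown below; the function names are invented for illustration and this is not the code used in the fix.

```python
def normalise_chromosome(name):
    """Strip a leading 'chr' so that 'chrX' and 'X' compare equal."""
    return name[3:] if name.lower().startswith("chr") else name


def is_x_chromosome(name):
    # Assumption for illustration: inheritance-mode checks need to know whether
    # a variant lies on X, whichever naming convention the VCF uses.
    return normalise_chromosome(name).upper() == "X"


print(normalise_chromosome("chr1"))  # 1
print(is_x_chromosome("chrX"))       # True
print(is_x_chromosome("X"))          # True
```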
7.2.0 2015-11-25
- Performance in identification of causal regulatory variants as the top candidate of simulated whole genomes now improved to over 80%.
- Enhancer variants are assigned to TADs
- Variant gene assignment improvements and bug-fixes.
The Genomiser Speedomised
7.1.0 2015-10-21
- Variants in FANTOM5 enhancer and ENSEMBL regulatory regions are now all marked REGULATORY_REGION_VARIANT even without the regulatoryFeatureFilter being run.
- Massive performance increase when running regulatoryFeatureFilter.
- Running Exomiser in exome analysis mode now requires REGULATORY_FEATURE to be included in the variantEffectFilter. See test-analysis-exome.yml
- Added missing regulatoryFeatureFilter step to the analysis steps in test-analysis-genome.yml
The Genomiser Unleashed
Exomiser can now exomise whole genomes!
The Nature Protocols Release
Core API changes:
- Package tidy-up - all packages now use their maven package name as the root package for that project.
- PhenixPriority now dies immediately and with an informative message if no HPO terms are supplied.
- Added NONE PriorityType for when you really don't want to run any prioritiser.
- Re-named the ExomiserMouse and ExomiserAllSpecies prioritisers to their published Phive and HiPhive names.
- Removed unused List requirement from writers.
- TSV output now comes in TSV_GENE and TSV_VARIANT flavours.
- Removed unused getBefore and getAfter methods from Priority interface.
- Removed getConnection and setConnection from Priority interface as these were only used by some prioritisers.
- Prioritisers requiring database access now use a DataSource rather than a direct connection.
- Added getRowIndexForGene(entrezGeneId) and getColumnMatrixForGene(entrezGeneId) methods to DataMatrix.
- Removed getRowToEntrezIdIndex from DataMatrix.
- Refactored ExomeWalkerPriority and ExomiserAllSpeciesPriority to use new DataMatrix methods.
CLI changes:
- Added 'none' type prioritiser for when you really don't want to run any prioritiser.
- Exomiser will now show the help options when no parameters are supplied.
- New test settings files for different prioritisers and the batch file.
- Changed input parameters (these are optional switches):
  --remove-path-filter-cutoff to --keep-non-pathogenic
  --remove-off-target-syn to --keep-off-target
- Renamed somewhat misleading example.settings to template.settings to reflect its intended use.
- TSV output now comes in TSV_GENE and TSV_VARIANT flavours.
- Added missing ehcache.xml to the distribution.
- Switched PostgreSQL driver to use pgjdbc-ng which allegedly has better performance.
- Consolidated JDBC Connection pool to use HikariCP.
New HTML output and supporting API
Exomiser-core has gained a newly styled HTML output and includes new APIs to enable this functionality.
VCF output now reports unannotated variants
- Added ability for the VariantEvaluation to report whether the Variant it is associated with has been annotated by Jannovar.
- VCF output format will now indicate which variants, if any, have not been annotated by Jannovar for whatever reason.
- VariantEvaluation can now report a FilterStatus to indicate whether it has passed, failed or is unfiltered.
- Further under the hood clean-ups and improved test coverage - now at ~30%
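The three filter states mentioned above (passed, failed, unfiltered) can be derived from the filters a variant has been through. The Python sketch below only mirrors that description; the function, the enum values and the filter names in the example are assumptions and do not reproduce the Java API.

```python
from enum import Enum


class FilterStatus(Enum):
    UNFILTERED = "unfiltered"
    PASSED = "passed"
    FAILED = "failed"


def filter_status(passed_filters, failed_filters):
    """Failing any filter means FAILED; no filters run at all means UNFILTERED."""
    if failed_filters:
        return FilterStatus.FAILED
    if passed_filters:
        return FilterStatus.PASSED
    return FilterStatus.UNFILTERED


# Filter names here are hypothetical labels for the example.
print(filter_status(set(), set()))                  # FilterStatus.UNFILTERED
print(filter_status({"frequency"}, set()))          # FilterStatus.PASSED
print(filter_status({"frequency"}, {"quality"}))    # FilterStatus.FAILED
```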
Upgrading from the previous version? Just add the new exomiser-cli and exomiser-core jars into the relevant place and it should all work.
Patch for Jannovar NullPointerException
- Changed Jannovar to version 0.9 to fix a null pointer caused by inability to translate certain variants.