My summary of the paper by the authors of the LAST aligner: http://bioinformatics.oxfordjournals.org/content/early/2011/10/05/bioinformatics.btr537.short?rss=1
Introduce "aligned column accuracy", which differs from "mapping accuracy":
the latter only checks whether the mapped location overlaps the true
location, whereas the former is per-base and matters more for the accuracy of variant calls.
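
To make the distinction concrete, here is a minimal Python sketch of the two kinds of check. The function names, the half-open interval convention, and the exact definition of column accuracy (fraction of predicted aligned columns that are also in the true alignment) are my assumptions for illustration, not necessarily the paper's definitions.

```python
def mapping_correct(true_start, true_end, mapped_start, mapped_end):
    """Read-level check: do the half-open intervals [start, end) overlap?"""
    return mapped_start < true_end and true_start < mapped_end


def aligned_column_accuracy(true_cols, predicted_cols):
    """Per-base check: fraction of predicted (read_pos, ref_pos) columns
    that also appear in the true alignment (assumed definition)."""
    predicted = set(predicted_cols)
    if not predicted:
        return 0.0
    return len(predicted & set(true_cols)) / len(predicted)


# Toy example: a 6 bp read that truly aligns, ungapped, to ref 100..105.
true_cols = [(i, 100 + i) for i in range(6)]
# Predicted alignment is off by one base from read position 3 onwards,
# as happens when a gap is misplaced.
pred_cols = [(0, 100), (1, 101), (2, 102), (3, 104), (4, 105), (5, 106)]

mapping_ok = mapping_correct(100, 106, 100, 107)
col_acc = aligned_column_accuracy(true_cols, pred_cols)
print(mapping_ok, col_acc)
```

The read counts as correctly mapped because the intervals overlap, yet only half of its columns are right, which is the kind of error that would hurt a variant caller.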
Use probabilistic alignments based on posterior decoding (I imagine this is
like the dynamic programming of converting colorspace alignments to
base-space?) instead of relying on the maximum score.
Actually try 2 probabilistic alignment models.
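
For my own understanding, a rough sketch of what posterior decoding means for pairwise alignment: instead of keeping only the single best-scoring path, sum over all alignments with a forward and a backward pass and ask how probable each aligned pair is. This is a generic textbook-style toy (global alignment, linear gap penalty, made-up scores, no quality values), not the models used in LAST; `match_posteriors` and its parameters are hypothetical names.

```python
import math


def match_posteriors(x, y, match=1.0, mismatch=-1.0, gap=-2.0):
    """Posterior probability that x[i] is aligned to y[j], summing over
    all global alignments weighted by exp(alignment score)."""
    n, m = len(x), len(y)
    g = math.exp(gap)

    def w(i, j):
        # Weight of one aligned column x[i]:y[j] (0-based indices).
        return math.exp(match if x[i] == y[j] else mismatch)

    # Forward: total weight of all partial alignments of x[:i] and y[:j].
    fwd = [[0.0] * (m + 1) for _ in range(n + 1)]
    fwd[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                fwd[i][j] += fwd[i - 1][j - 1] * w(i - 1, j - 1)
            if i > 0:
                fwd[i][j] += fwd[i - 1][j] * g
            if j > 0:
                fwd[i][j] += fwd[i][j - 1] * g

    # Backward: total weight of all alignments of x[i:] and y[j:].
    bwd = [[0.0] * (m + 1) for _ in range(n + 1)]
    bwd[n][m] = 1.0
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i < n and j < m:
                bwd[i][j] += w(i, j) * bwd[i + 1][j + 1]
            if i < n:
                bwd[i][j] += g * bwd[i + 1][j]
            if j < m:
                bwd[i][j] += g * bwd[i][j + 1]

    total = fwd[n][m]  # weight of all complete alignments
    return [[fwd[i][j] * w(i, j) * bwd[i + 1][j + 1] / total
             for j in range(m)] for i in range(n)]


post = match_posteriors("ACGT", "AGT")
for row in post:
    print(["%.3f" % p for p in row])
```

Columns with high posterior are the ones most alignments agree on; a decoder can then keep those columns, or report the probabilities as per-base confidence, rather than trusting one maximum-score path.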
Use dwgsim and stampy for sims and a set of 36 and 76 bp reads from SRA.
Compare their LAST with BWA, Bowtie, Novoalign, SHRiMP2, and Stampy.
Probabilistic alignment improves LAST sensitivity by 2%, 6% for indel, gap
LAST has the highest sensitivity, accuracy, and PPV among the tested aligners
for mapping, aligned column accuracy, and gap accuracy. Novoalign fares well.
BWA doesn't look so great "because it is designed to be more accurate and
faster on queries with low error rates".
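
As a reminder of what the two headline metrics measure, here are the standard definitions in Python. The counts below are made up purely for illustration and are not taken from the paper.

```python
def sensitivity(tp, fn):
    """Fraction of true items that were recovered: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def ppv(tp, fp):
    """Positive predictive value: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical counts, for illustration only.
tp, fp, fn = 90, 30, 10
sens = sensitivity(tp, fn)
prec = ppv(tp, fp)
print(sens, prec)
```

An aligner can trade one for the other, so a tool being best on both at once is the notable part of the result.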
They credit LAST's adaptive seed for its good performance.
Table 1. The probabilistic models in LAST take 4-5 times longer (194 and 184
minutes) than unmodified LAST (41 minutes) but are still faster than Novoalign
(518 minutes). Bowtie takes 3 minutes and BWA 16.
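
A quick check of the ratios, using only the runtimes quoted above. The labels "model A" and "model B" are placeholders of mine, since these notes do not record which probabilistic model took which time.

```python
# Runtimes in minutes as quoted in these notes from Table 1.
runtimes = {
    "LAST": 41,
    "LAST prob. model A": 194,  # placeholder label
    "LAST prob. model B": 184,  # placeholder label
    "Novoalign": 518,
    "Bowtie": 3,
    "BWA": 16,
}

# Runtime of each aligner relative to unmodified LAST.
ratios = {name: minutes / runtimes["LAST"] for name, minutes in runtimes.items()}
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name:20s} {ratio:5.1f}x")
```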
LAST seems to map many fewer reads on the 36 bp data: 341304 compared to 400K
to 500K for the other aligners. On 76 bp data it's more in line with the
others. LAST uses a lot more memory than other aligners (though only 15 GB).
Simulated Data for SNP calling
Table 2. It seems that due to the high error rate, LAST greatly
outperforms the other aligners in terms of downstream SNP calls (by samtools).
Much higher sensitivity and PPV especially at 20x and 40x coverage (less so at
Would have been interesting to see how this compares to SRMA or GATK
realignment post-processing. Not sure why that wasn't included...