How a little bug got involved in a big battle: MiSeq vs Ion

A lot has been said about the sequencing of E. coli from the recent outbreak in Germany. Over five isolates have been sequenced on almost every sequencing platform: HiSeq, 454, Ion and MiSeq. Interestingly, I am not aware of a SOLiD or a Sanger genome. I’d recommend GenomeWeb’s coverage as a good starter if you’re interested in finding out more.

Just this week Illumina made available a slide deck and data from their analysis of E. coli K-12 MG1655 sequenced on HiSeq (PE100) and MiSeq (PE150).

The first 8 slides are the HiSeq vs MiSeq comparison:

Throughput seems to have gone up to 1.5Gb, which is a 50% increase over the initial specs, so growth seems to be as fast on MiSeq as it has been on the GA and HiSeq platforms, albeit with just one data point so far. And there is still a long way to go before it gets to 25Gb; see my first blog.

Libraries for the comparison were made using TruSeq and not the Nextera kits from the recent Epicentre purchase, which I was a little surprised about. It would have been great to see a multiplexed run on the two platforms for a TruSeq and Nextera comparison in the same data set.

Essentially MiSeq outperforms HiSeq on quality a tiny bit. Other than that the datasets are pretty much the same in all respects. The HiSeq run has the characteristic intensity fluctuation at about 75bp where laser power is adjusted mid-run. Both runs have a stepped profile in mean Qscore, which I guess is a function of alignment getting better till about 20bp then declining with read quality. In this presentation Illumina do not show the mean Qscore at 150bp for MiSeq.

Illumina subsampled the HiSeq run for a de novo assembly comparison and the results were strikingly similar in all respects, other than MiSeq giving 11 contigs where HiSeq gave 12. I am not an assembly whiz so can’t think why this would be the case, nor whether it would matter a great deal.

The next 7 slides are a comparison of MiSeq to Ion Torrent:

Again, I am not a bioinformatician so can’t realistically comment on the fairness of this comparison, and others are getting the data for a more impartial assessment.

The Ion data was all from the 314 chip, which was specced at 10Mb; the average run yield was 11-24Mb across the three sites that have generated data. So it looks like Ion is increasing the yield as expected. However, it is very clear that MiSeq has a huge advantage over the Ion platform with respect to yield, as it gave 1.7Gb from this run. Quality on Ion was lower but without the stepped profile seen in MiSeq and HiSeq, with MiSeq averaging Q31 vs Q19 for Ion.
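For anyone who wants to put the Q31 vs Q19 gap into more concrete terms, here is a rough sketch using the standard Phred definition, where the error probability is 10^(-Q/10). The function names and the 100bp read length are my own illustrative choices, not anything from the Illumina slides, and treating an average Q score as if it applied to every base is a simplification, so take the numbers as a ballpark only.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to a per-base error probability.

    Uses the standard Phred definition: P = 10 ** (-Q / 10).
    """
    return 10 ** (-q / 10)


def expected_errors(q, read_length):
    """Rough expected number of errors in a read, assuming every base
    has the same quality q (a simplifying assumption)."""
    return phred_to_error_prob(q) * read_length


# Average Q scores quoted in the post.
miseq_q = 31
ion_q = 19

# Hypothetical read length, chosen only to make the comparison concrete.
read_length = 100

p_miseq = phred_to_error_prob(miseq_q)
p_ion = phred_to_error_prob(ion_q)

print(f"MiSeq Q{miseq_q}: per-base error probability {p_miseq:.5f}")
print(f"Ion   Q{ion_q}: per-base error probability {p_ion:.5f}")
print(f"Fold difference in error probability: {p_ion / p_miseq:.1f}")
print(f"Expected errors per {read_length}bp read, MiSeq: "
      f"{expected_errors(miseq_q, read_length):.2f}")
print(f"Expected errors per {read_length}bp read, Ion:   "
      f"{expected_errors(ion_q, read_length):.2f}")
```

Because the scale is logarithmic, a 12 point gap in Q score is more than an order of magnitude in error probability, which is why a difference that looks modest on a slide can matter a lot downstream.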

The comparison is an interesting one, although it will probably be obsolete by the time I hit post due to the rapid developments from Ion. I did hear Broad have now obtained over 290Mb of data from a 316 chip, so it looks like their roadmap is on track.

It remains to be seen which platform is going to win this particular VHS vs Betamax battle (showing my age here). An unanswered question is also whether these platforms will ultimately replace systems like HiSeq, especially when whole genome sequencing can be purchased for $4000 with a 60-day turnaround. This could drop to $2500 next year if the 1Tb runs from Illumina work outside their development labs.

HiSeq costs $750,000 to buy including a cBot, MiSeq is $125,000. If yield is not your primary concern then MiSeq may well turn out to be the kind of instrument that finally democratises sequencing.