The announcement of the first $100 30x human genome (reagents only) from Rade Drmanac in the final slot at #AGBT20 means that@MGI_BGI‘s DNBSeq Tx is going to be the biggest sequencer in the world. However. it remains to be seen if the $100 genome will be an MGI exclusive for long since Illumina started litigation in the US after MGI announced they’ll start shipping to the USA in April (as I’d prefer to focus on the cool news rather than patent wars I’ll leave discussion of this till the bottom of this post).

Performance comparison: with the capability to do just over 100 runs per year, with each 3.5 day run processing 8 slides and generating 150 genomes each, DNBSeq Tx70 kicks out 124,800 30x Human genomes per year!

  • Illumina NovaSeq S4 – 44 hours runtime for PE150 and 3Tb per flowcell.
  • MGI_BGI DNBSeq Tx – 84 hours runtime for PE150/2001 and 9Tb per flowcell2.
  1. Can someone at AGBT comment on the read length?
  2. Based on Tx 70 terabases per run, or 20 terabases per day, and a run time of 3.5 days. It can process up to eight independent slides, each of which can produce data for up to 150 human genomes at 30X coverage.

 

In his AGBT talk Drmanac presented initial sequencing data that showed high concordance to DNBSEQ-400 SNP and indel accuracy. And the accompanying bioRxiv preprint lists higher accuracy (along with: longer reads, higher throughput, lower cost, etc) as one of the performance benefits in table 2. However, the preprint somehow manages NOT to mention Illumina or their systems once (they do mention Ion Torrent, and also shy away from NGS, using MPS instead); as such, it is not stated that table 2 (see below) is saying these performance benefits are comparing to Illumina – but I think most of the people hiding from the sun in Florida while Drmanac spoke will be looking at it that way!

 

The biggest sequencer on the block

The DNBSEQ Tx sequencer is a core-lab or genome-institute in a box. Flowcells are gone and robotics abound. If anyone ran microarrays on the SciGene Little Dipper you’ll probably get an idea how it works. Big slides carry DNA nanoballs to be washed in vats of SBS reagents – goodbye vici valves! And the compute delivers all the way to VCFs.

It comes with the automation necessary to feed it – DNA extraction and library prep systems that can process significantly more samples than the almost 125,000 genomes a lab could process per year. GenomeWeb’s detailed coverage quoted Drmanac on the kind of work users might do “sequence all newborns in the US each year (on 50 boxes)” or “run a million oncology tests with cell-free DNA” or “100 million single-cell transcriptomes” or, my favourite, “shallow human genome sequencing at 2X coverage, the cost of which would be less than $10 per genome”!

The newest chemistry on the block

CoolMPS is a sequencing-by-synthesis technology but one that uses unlabelled nucleotides. Ultimately it works very similarly to Illumina’s SBS (more on that in a sec) but because the nucleotides are almost native they may well represent an advance in incorporation performance – although whether that leads to any overall performance increase remains to be seen. To enable detection of the incorporated bases MGI make use of an monoclonal CoolMPS antibodies (read the bioRxiv preprint and Keith’s blog with excellent, and more detailed, coverage than I’ll go into) that are significantly brighter (according to MGI) than labelled-nucleotides leading to improvements in signal detection and therefore sequencing quality. MGI have also been working on paired-end sequencing using a new MDA on their DNA nanoball arrays; this generates “branched DNBs” with 1-3 template copies per branch, which apparently increases signal in read 2 by increasing the number of priming sites (neat). However, Keith highlights the data in their preprint that shows a drop in signal and accuracy in the second read, still with over 99% mapping; but where external data would be great (Rade announced that GIAB data is coming soon).

Early thoughts from those in the know:

GenomeWeb quoted Shawn Levy (HudsonAlpha) and Chris Mason (Weill Cornell) who both responded positively  to the announcements. But Nik Matthews spoke in more detail at BGI’s AGBT workshop about his groups early use of the (DNBSEQ-400) technology, that is a taster of how the Cancer community might take up this non-Illumina technology. He also put two Star Wars themed posts onto his LinkedIN profile (see below) – glad to see the Star Wars theme won’t die Nik!

The elephant in the room

Until we see data from users in the field it is impossible to say how much market share MGI can take with this box. However, with projects like the UKB500k in progress on Illumina who’s to say if the 5M initiative (currently under discussion) could not be run on this box? I’d be pretty sure Illumina can dial up NovaSeq to even dizzier heights but nowhere near the numbers of a single DNBSeq Tx70.

Illumina started litigation in the US as soon as MGI announced they planned to drop a few systems to early access users for free and planned to start shipping to the USA in April. In the litigation (see GenomeWeb for more detail) Illumina say the new technology infringes its intellectual property and that they are “likely to suffer irreparable harm” from MGI being allowed to operate in the USA. The initial damage would be to their HUGE margins; and Illumina can probably keep MGI out of USA simply by reducing profits, with a bit of a knock on the share price.

Can MGI keep up the momentum and break through the US courts? Qiagen was the most recent sequencing company to throw in the towel; killing off further development of their IBS technology – the Walmart/Aldi of the sequencing world! Admittedly, MGI has more to win in this space as they appear to have a technology that can deliver what people want – cheap sequencing. But they face an uphill battle in the US courts and my bet is that protectionism in the USA is more likely to be a road block than technical issues. It makes me more than a little sick to think that Trump getting a second term be good for Illumina!

Users are looking for a price point and if they can generate data at lower cost they will – the $64,000 question is how big does the differential need to be?