When will nanopore sequencing push short reads (i.e. Illumina) off their pedestal? According to Clive, PromethION is the Illumina killer…but this same conversation came up many times at London Calling. I wanted to highlight two areas that might be about to flip to ONT’s advantage (one of which I’m really excited about).
Genomes:
With costs dropping precipitously on MinION (if you can get 10GB+) and PromethION on the way for the end of this year as a real production machine, it looks like the sensible choice for a high-quality genome would be Nanopore + Illumina. Given a little more improvement, might genomes be best done “nanopore only” very soon?
My lab does not sequence many genomes, but when the next 1000 genome project comes along I guess I should pick up the phone and talk to Clive as well as Francis?
RNA-Seq:
Let’s be absolutely clear: until ONT’s Direct RNA-Seq was launched (see the most recent bioRxiv submission from Andrew Smith et al.), pretty much nobody was actually sequencing RNA! However, there were talks about 10 million+ 1D cDNA reads from the new protocols being developed by Dan Turner’s applications group at ONT.
New cDNA strand switching protocol by @nanopore tested @Genomique_ENS. Almost 10 million 1D reads ! Where will it stop ????#nanoporeconf
— Génomique ENS (@Genomique_ENS) May 4, 2017
My lab does a LOT of RNA-Seq, primarily for DGE and mainly using single-end 50bp reads (feel free to comment Mick). We aim for 10-20 million per sample.
Assuming a 1000bp ONT cDNA read equates to 20 Illumina SE50 reads (something we should debate), those 10 million ONT cDNA reads might allow me to sequence 20 samples (at 1000bp average read length) or 40 (at 2000bp) on a single MinION flowcell. I’ve not really pushed higher than 2000bp as the proportion of human transcripts longer than this is probably less than 50%.
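The arithmetic behind those sample counts can be sketched as below. This is only a back-of-envelope illustration: the function name and parameters are made up for this example, the "one ONT read equals read length / 50 SE50 reads" equivalence is the debatable assumption stated above, and the default of 10 million reads per sample is the low end of the 10-20 million target.

```python
def samples_per_flowcell(ont_reads, ont_read_length_bp,
                         illumina_read_length_bp=50,
                         illumina_reads_per_sample=10_000_000):
    """Estimate how many samples fit on one flowcell.

    Assumes (debatably) that one ONT cDNA read is worth
    ont_read_length_bp / illumina_read_length_bp short reads.
    """
    # Total sequenced bases, expressed in units of one sample's short-read bases.
    total_bases = ont_reads * ont_read_length_bp
    bases_per_sample = illumina_read_length_bp * illumina_reads_per_sample
    return total_bases // bases_per_sample


if __name__ == "__main__":
    for read_length in (1000, 2000):
        n = samples_per_flowcell(10_000_000, read_length)
        print(f"{read_length}bp average read length: {n} samples")
```

Raising `illumina_reads_per_sample` to 20 million halves both figures, so the estimate is sensitive to which end of the 10-20 million range you pick.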
I’ve also held off debating 1) the additional benefits of counting isoforms directly, and 2) the impact direct RNA-Seq will have on biological analysis.
Food for thought…see you in Gordon’s “good morning” rock family trees intro.
Let me know what you think will be the first application “lost” by short-read technologies?
From a cancer perspective I think Nanopore should win first for fusion gene detection (DNA and RNA), allele phasing, transcript variation and large rearrangements such as inversions and deletions. Super high accuracy is less important here and targeting seems to be improving.
Short reads are still fine for panel capture for SNVs and for counting applications like ChIP-seq, DGE (without differential transcripts) and lcWGS.
The biggest thing for a clinical lab now is consistency and reproducibility. If VolTRAX really works that could be great, but I’ve learned to take Nanopore’s timescales with a pinch of salt. Seriously interested now though.
Hi Mike,
I completely agree on the need for consistency and reproducibility…people complain to me when they only get 320 million reads from a HiSeq 4000 lane!
DGE might be more tractable than we think. I’m just about to read through Georgi’s paper, and 5 million reads is not super expensive and the cost is likely to drop over the next 12 months. I’ll look at working out how much an RNA-Seq core might cost to set up on PromethION!
Assuming a 1000bp ONT cDNA read equates to 20 Illumina SE50 reads (something we should debate) then those 10 million ONT cDNA reads might allow me to sequence 20 samples (at 1000bp average read length) or 40 (at 2000bp) on a single MinION flowcell. I’ve not really pushed higher than 2000bp as the number of Human transcripts longer than this is probably less than 50%.
It doesn’t really work that way because what you are counting is individual transcripts. You are not sampling from an approximately infinite population after PCR.
I did some theoretical calculations on the subject here:
https://www.ncbi.nlm.nih.gov/pubmed/28334071
You will probably need in the neighborhood of 5M direct RNA-seq reads for good results on bulk RNA samples. And you might be limited in how low you can go on input material by the capture probability of individual transcripts, which is at present poorly characterized. But if they can indeed do 10M reads, that will be more than sufficient in many cases.
Thanks for the pointer to the paper and the reply Georgi,
I’m going to run some cDNA-Seq in my core lab and we’ll see how far we can push this along over the next few months!
I hope to get a chance to play with one myself at some point.
The key unknown to me is the capture probability (which is the same issue that plagues single-cell RNA-seq): if there is an individual transcript molecule floating in solution, what are the chances that it will successfully go through the pore and produce a usable sequence?
So some careful experiments with spike-ins are needed.
The ONT guidelines for regular DNA sequencing call for a lot of input material, which would point in the direction of generally low capture success, but it has to be examined directly.
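To illustrate why capture probability matters at low input, here is a minimal sketch. It assumes every molecule of a transcript is captured independently with the same probability, which is a simplifying assumption and not a measured property of the platform. The function name and the probabilities tried are hypothetical values chosen for illustration only.

```python
def detection_probability(molecules, capture_probability):
    """Chance that at least one of `molecules` copies yields a usable read.

    Assumes independent capture with identical probability per molecule
    (a simplification; real capture rates are poorly characterized).
    """
    if not 0.0 <= capture_probability <= 1.0:
        raise ValueError("capture_probability must be between 0 and 1")
    if molecules < 0:
        raise ValueError("molecules must be non-negative")
    # P(at least one captured) = 1 - P(none captured)
    return 1.0 - (1.0 - capture_probability) ** molecules


if __name__ == "__main__":
    # Hypothetical capture probabilities, not measured values.
    for p in (0.01, 0.1, 0.5):
        for n in (1, 10, 100):
            print(f"p={p}, copies={n}: {detection_probability(n, p):.3f}")
```

Under this model a transcript present at only a few copies is easily missed when the per-molecule capture probability is low, which is the kind of effect spike-in experiments would help quantify.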