I’m excited to be at ONT’s London Calling event today and tomorrow. Expect to see lots of coverage on Twitter. If you missed the workshop yesterday, or want a late preview of what’s coming, head over to Keith’s OmicsOmics blog. Update: coverage from @Gringer, ONT, NextGenSeek (storified & here)…let me know about other sources and I’ll add them here!
— Keith Robison (@OmicsOmicsBlog) May 3, 2017
Plenary session 1
- Karen Miga (Haussler lab), Linear assembly of a human Y centromere using MinION nanopore long read sequences: short reads can’t sequence through many repeats, let alone centromeres! Technical issues include the need to sequence high molecular weight DNA. They used 200kb BACs containing centromeric DNA (9 BACs from a 2001 Nature paper, Tilford et al) with the UCSC Longboard protocol: linearise the BAC with a 1D prep that gives a single transposition event per BAC, to get an average read length of 198kb. Showed data from 1 BAC with 38 CENY repeats, 2 tandem structural rearrangements, and homopolymers – this is complex sequencing. Sequencing all 9 BACs generated a new consensus sequence for the Y centromere of 346kb. Future work: want to move away from the BAC approach to genomic DNA, which requires long reads, so they are optimising the Longboard protocol for WGS. Needs long DNA (see earlier post on DNA extraction). Questions: how long do reads need to be to get whole centromeres on all chromosomes?
- Björn Usadel, Complex tomato genomes: Easy with nanopores: Plant genomes are “particularly nasty beasts”, highly repetitive and big – really big (figure in slide redrawn from 2017 review)! Tomato genome sequenced in 2014 on Illumina with scaffold N50 of 87kb. Nanopore sequencing flowcells show high variability (can’t wait till this settles down). Using Pippin Prep for size selection. Slide comparing other genomes shows how well Nanopore is doing: “can get a good plant genome in a few months…will switch off Illumina sequencing” (said colleagues in the plant community are switching to Nanopore), which means small labs can sequence plant genomes. Questions: how about repetitive elements and centromeres?
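Both assembly talks lean on N50 as the headline number, so here is a minimal Python sketch of how it is usually computed: sort the contig or scaffold lengths from longest to shortest and report the length at which the running total first reaches half of the assembly. The function name and the toy lengths are made up for illustration and are not data from either talk.

```python
def n50(lengths):
    """Return the N50 of a list of contig/scaffold lengths.

    N50 is the length L such that sequences of length >= L
    together contain at least half of all assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no lengths supplied")


# Toy scaffold lengths in kb (hypothetical, chosen only to show the mechanics).
toy_scaffolds_kb = [120, 90, 87, 87, 60, 50, 40]
print(n50(toy_scaffolds_kb))
```

A few very long contigs pull N50 up quickly, which is why moving from hundreds of contigs to tens changes the number so dramatically.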
During the coffee break in the Mini Theatre, Josh Quick gave a talk about the Loman lab’s long-read sequencing: Thar she blows! Ultra-long read methods for nanopore sequencing.
Plenary session 2
- Jared Simpson, Analysis tools for nanopore data: Jared focussed on signal-level nanopore data analysis, working either from raw current measurements sampled at 4kHz or from segmented current signal (event) data. Segmented event data allow algorithms to run faster, and little data is lost in this transformation (I’d be interested to find out more about segmentation algorithms as this is an issue in copy-number calling). Nanopolish: a software package for signal-level analysis of Oxford Nanopore sequencing data; it uses an HMM to compute a consensus sequence. Now at 99.95% accuracy for an E. coli “nanopore only” genome assembly. Applied to the human genome sequence to get 99.96% accuracy (see Nanopore sequencing and assembly paper on BioRxiv). Jared focussed on 3 main analysis tasks: SNP calling and genotyping, base modifications, and phasing variants (mentioned WhatsHap: Weighted Haplotype Assembly for Future-Generation Sequencing Reads).
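To make the idea of segmenting a current trace into events concrete, here is a deliberately naive Python sketch. It is not the algorithm used by Nanopolish or by ONT’s software: it simply starts a new event whenever a sample jumps away from the running mean of the current event. The threshold, the minimum event length, and the toy trace are all assumptions for illustration.

```python
def segment(samples, threshold=5.0, min_len=3):
    """Split a current trace into (mean_level, n_samples) events.

    A new event starts when a sample differs from the running mean
    of the current event by more than `threshold`, provided the
    current event already holds at least `min_len` samples.
    """
    if not samples:
        return []
    events = []
    current = [samples[0]]
    for x in samples[1:]:
        mean = sum(current) / len(current)
        if abs(x - mean) > threshold and len(current) >= min_len:
            events.append((mean, len(current)))
            current = [x]
        else:
            current.append(x)
    # Close the final event.
    events.append((sum(current) / len(current), len(current)))
    return events


# Toy trace in arbitrary current units: three obvious levels.
trace = [80.0, 81.0, 79.0, 80.0, 100.0, 101.0, 99.0, 60.0, 61.0, 59.0, 60.0]
for mean_level, n in segment(trace):
    print(mean_level, n)
```

Eleven samples collapse to three events here, which is the kind of data reduction that lets downstream algorithms run faster.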
Jared Simpson talking about consensus algorithms for nanopore data. Great work! #nanoporeconf
— mattloose (@mattloose) May 4, 2017
4 minutes, no more than 15 slides…where is Chris Mason? (Chris was back in the US but had 3 people from his lab at the meeting.)
- Michael Clark (University of Oxford): CACNA1C, a voltage-gated calcium channel gene implicated in neuropsychiatric disorders. Need to know which isoforms are expressed in brain, and want to understand how expression varies between normal and disease phenotypes. A 50 exon gene with lots of isoforms. Used post-mortem human brain, from which long-range RT-PCR amplified the full coding sequence for Nanopore sequencing; identified novel isoforms.
- Graziano Pesole: Epigenetics in mtDNA by Nanopore sequencing using Nanopolish (see Jared’s talk) with 800x coverage of the mitochondrial genome. Detected 430 methylated cytosines in CpG context, with 93% overlap with known methylation sites. Methylation levels detected by Nanopore were higher than seen in bisulphite data.
- Raja Mugasimangalam (Genotypic): Testing dairy products and probiotics using nanopore sequencing. Are products labelled with the correct bacterial species? Sometimes! His slide mentioned home-made yoghurt in villages in India…maybe cultivating local strains for many hundreds, even thousands, of years – what a biological resource! Nanopore sequencing shows mixes of Lactobacillus and Streptococcus, but no major surprises. Need at least 500 reads to understand the metagenome of your yoghurt!
- Sebastian Johanssonn (SciLifeLab): multiplexed HLA-genotyping on nanopores. Long-range PCR (lrPCR) in 8 patients to create HLA amplicons for sequencing; amplicons are pooled for each patient, nanopore barcodes are added (now looking at the native barcoding kit), then sequenced.
- Franz-Josef Muller: SelectION, rapid ID of genomic regions in Nanopore data sets. It allows filtering of data to focus the analysis on regions of interest, with dramatic reductions in compute requirements. Showed an example of the repeat expansion in the FMR1 gene for diagnosing Fragile X Syndrome; can actually count repeats. Poster on github: paygiesselmann/selectION
- Sally James (University of York): Telomere to telomere sequencing in Galdieria sulphuraria. Comparable genome assembly size with Illumina or Nanopore, but tens not hundreds of contigs using Nanopore. Sequencing telomeric repeats: contig 37, a 290kb scaffold. Slides show they are actually sequencing both ends of a chromosome!
- David Eccles (Malaghan Institute): Nippo genome assembly. Every run at ONT was significantly better than any runs done back home. Here’s a user talking about bad experiences on Nanopore; we’re doing this all the time for Illumina!
- Bethany Lodge (ONT Applications Group): Beth presented an update on VolTRAX – VIP is open! Portable, programmable (heater, Peltier, magnets), disposable (returnable) sample prep. Burn-in kit released now, HT kit next, then a Multiplex kit coming soon with four samples barcoded. Showed lambda burn-in yield variability (350-800 Mb; VolTRAX reduces this to essentially zero variation). Extraction and PCR are possible. Sample prep outside the lab using Omnilyse, SPRI, VolTRAX and sequencing.
— Génomique ENS (@Genomique_ENS) May 4, 2017
Plenary session 3
- Daniel Turner, Nanopore Applications. Focuses on protocol and method developments in his Oxford lab, but also presented work from the New York team. Started by talking about long reads, from PhiX in 2012 to Josh’s 950kb read – a record that keeps on growing, but you need long DNA molecules (you cannot overestimate the importance of sample extraction and handling). Dan was the second person to mention the discrepancy between what ONT get in-house and what users get in the field (for me this variation needs to go, and it is the opposite of what we’d see on an Illumina sequencer). Introduced the library prep kit selector: it starts with “DNA or RNA?” and goes through multiple questions to lead users to the best approach. Spent some time talking about structural variations (difficult to get with short reads); unfortunately no Nanopores in the current ENCODE paper on BioRxiv, which used HiC and BioNano. Strand-switching for full-length RNA-Seq of transcripts, with a PCR-free version coming soon; now possible to get millions of reads. I wonder how soon RNA-Seq will get to the inflection point where we all drop Illumina? Dan presented a test for Ewing sarcoma using RNA-seq to detect the EWSR1 translocation. Not a big update from Dan on Direct RNA-Seq; you’ll have to wait till Libby’s talk tomorrow. Dan updated work on Nanopores as biological sensors: sequencing reporter oligos bound to antibodies gave a linear relationship between protein and oligo concentration, with improved analysis from multiplexing. Preimplantation genetic screening from low-coverage WGS on 5 day blastocysts, with high coverage of specific risk genes – made a PCR-amplified WGS library and mixed this with extra-amplified target-specific PCR. Finished up with a video showing Beth doing a full workflow with minimal equipment – Omnilyse, SPRI, VolTRAX and Seq (described above). Questions: can proteins be detected on Nanopores instead of using reporter oligos? Yes! What are the opportunities for really low input? We’re working on it!
- Eric van der Helm & Lejla Imamovic (Novo Nordisk), Rapid resistome mapping using nanopore sequencing. An interesting talk on antibiotic resistance, which is estimated to impact us with up to 10 million extra deaths by 2050 (is that all?). The resistome is the whole collection of antibiotic resistance genes; it changes over time and impacts response to therapy. Profiling the resistome is a perfect application for Nanopores as you need to detect both known and novel antibiotic resistance genes. The method avoids culture by cloning fragmented metagenomic DNA and screening on antibiotic-laced media for sequencing. Presented results from Spring 2016, published as Rapid resistome mapping using nanopore sequencing in NAR 2017. Future work: removing PCR from the workflow, and making other protocol optimisations.
- Philipp Euskirchen, Rapid (epi-)genomic classification of brain tumors using nanopore sequencing. Brain tumours have been classified by histology for a long time; molecular testing has now been introduced by the WHO. Aiming to use Nanopore sequencing to simplify the molecular testing process by generating low-coverage WGS for CNV and methylation (in 1 day) and amplicon sequencing of hotspots (e.g. IDH1, TP53). CNV by rapid library prep and about 6h sequencing (about 30,000 reads or 0.1x coverage); data aligned and fed into QDNAseq to generate CNV plots. Discovered EGFR double-minutes (extrachromosomal circular sequences). Methylation “for free” from the same reads (even at low coverage), which compared well to Illumina 450k arrays for IDH status. Pan-cancer classification using both CNV and methylation data to determine Glio sub-type…in 24 hours! Using realtime monitoring of read-depth for amplicon sequencing…results in just 10 minutes.
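For anyone wanting to sanity-check low-coverage numbers like the ones in the brain tumour talk, mean coverage is just total sequenced bases divided by genome size. A minimal Python sketch follows; the 10kb mean read length and the ~3.1Gb genome size are my assumptions, not figures from the talk.

```python
def mean_coverage(n_reads, mean_read_len_bp, genome_size_bp):
    """Mean depth of coverage = total sequenced bases / genome size."""
    return n_reads * mean_read_len_bp / genome_size_bp


# Assumed values for illustration only.
HUMAN_GENOME_BP = 3_100_000_000   # approximate haploid human genome size
MEAN_READ_LEN_BP = 10_000         # hypothetical mean read length

cov = mean_coverage(31_000, MEAN_READ_LEN_BP, HUMAN_GENOME_BP)
print(round(cov, 3))
```

Under those assumptions, a few tens of thousands of reads land at roughly a tenth of a genome’s worth of coverage, which is enough for binned copy-number plots but nowhere near enough for SNP calling.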
Breakout room 2: Applications: Epigenetics and methylation
There were four breakouts going on today and I chose this one, so you’ll have to find coverage of the others elsewhere.
- Marcus Stoiber, Applications of raw nanopore signal processing: From modified bases to streaming basecalling and beyond. Get as much as we can from the raw nanopore signal using Nanoraw and basecRAWller (links to preprints). Lots of InDels in nanopore data; the aim is to use raw signal to resolve these in sequencing data and to detect modified bases, e.g. 5mC and 6mA, with nanoraw. basecRAWller, “a new paradigm for nanopore basecalling”, takes one observation at a time, segments and averages the data, then streams read sequence “live”. Read the preprint for basecRAWller to get lots of information on parameters for analysis.
- Winston Timp, Measuring DNA methylation with the MinION. An analogy to computer systems: DNA sequence = hardware, User input = environment, Systems Biology = Running programs, Epigenetics = RAM (cellular memory). Methylation was detectable back in 2013 (Laszlo and Akeson publications from 2013). Used M.SssI CpG-methylated PCR templates for training (+/- controls). Some kmers show big differences between 5mC and C, but some are very small and therefore difficult to call. Signal changes along the kmer, getting bigger at the ends. Presented data from the NA12878 genome: 94% accurate methylation detection across 77% of sites. RRBS on MCF10a by size-selection of genome data clearly shows phasing of methylation across reads; also seen in NA12878 data. Methylation in the mitochondrial genome using a single-site enzymatic digestion to linearise the mtGenome (possible to detect cell-of-origin in circulating tumour cells using this approach?). Future work: want to look at 6mA and 5hmC next.
- Miten Jain, Recent progress at UCSC: long reads, DNA, and RNA sequencing. Long reads: the UCSC Longboard protocol means 100kb is no longer long! Megabase reads are almost here. Direct RNA-Seq: (watch out for Andrew Smith’s talk tomorrow) burn-in of enolase 2 poly-A RNA shows a fairly normal read distribution and high accuracy, but only 50% of reads are full-length. RNA base modifications are also detectable. Questions: you quoted Evan Eichler saying “long reads will kill short reads when 2x cost”, but how many MinION reads are needed to replace Illumina RNA-Seq? Do you NEED full-length RNA-Seq reads?
- Discussions included comparison of signal-space versus base-space analysis. Sounds very much like point-of-care devices need to use signal space with a small target. Make the analysis as low-impact on compute as possible to get a POC on an iPhone in the field/clinic.
“Some mundane and incremental updates…” a downbeat title, but an upbeat conference. Sequence anything anywhere – not quite there yet, but Clive’s update will show us what’s going to get us closer to this goal. Compare and contrast Clive’s earlier talks to what we can do today…it’s a pretty good track record.
Protein pores for now but working on other technologies.
Entire chromosomes through the pore. Splint to the telomere and read until….the other end! Maybe by 2018?
MinION sits in a no-man’s-land between field and lab. Lots of electronics on the flowcell, which increases flowcell price and makes them non-disposable. 4,500 MinIONs in the field.
CliveOME 2: Clive’s been in the lab, put on gloves and sequenced his genome. Initial run on 14 flowcells to get 30x genome. Second run on 1 GridION (5 flowcells) for 22x genome. A human genome in 2 days for a few $1000s. 200Gb of CliveOME is coming to public release soon.
Variability: 16Gb per flowcell, but not in ALL labs. Why? Sample, sample prep, “green fingers”? How much impact will VolTRAX have on reducing variability? Should be able to get 20Gb but many only get 1-5Gb. Need to do proper quant of starting material. Start with 200 fmoles to load 10 fmoles into the flowcell for optimum performance (what happens if you load too much?). Clive was actually talking about real-time support to troubleshoot runs! Making things easier by shipping flowcells with run buffer already loaded so users just need to add samples. A team is working on recommended protocols to help users. Better software interface “like an iPhone” to improve user experience (yay). Simplifying the MinKNOW UI. VolTRAX automates sample prep and is moving along very nicely (albeit with a small user group right now). VolTRAX 2.0 uses lyophilised reagents and simpler flowcells, coming end of 2017. Everything should be simple and painless!
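Since Clive’s quantities are in fmoles and most of us quant in ng, here is a minimal Python sketch of the usual conversion for double-stranded DNA. It assumes the standard approximation of about 660 g/mol per base pair; the 8kb mean fragment length is a made-up example, not a number from the talk.

```python
AVG_BP_MASS_G_PER_MOL = 660.0  # standard approximation for dsDNA


def ng_to_fmol(mass_ng, mean_fragment_len_bp):
    """Convert a dsDNA mass in ng to fmol of molecules."""
    return mass_ng * 1e6 / (AVG_BP_MASS_G_PER_MOL * mean_fragment_len_bp)


def fmol_to_ng(amount_fmol, mean_fragment_len_bp):
    """Convert fmol of dsDNA molecules to a mass in ng."""
    return amount_fmol * AVG_BP_MASS_G_PER_MOL * mean_fragment_len_bp / 1e6


# Hypothetical library with a mean fragment length of 8 kb.
print(round(ng_to_fmol(1000, 8000), 2))   # fmol in 1 µg
print(round(fmol_to_ng(200, 8000), 1))    # ng needed for 200 fmol
```

The point to notice is that the same mass contains far fewer molecules as fragments get longer, so a mass-only quant will mislead you on long-read libraries.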
GridION: 2012 AGBT presentation presented original concept for GridION, a modular rack-based real-time sequencing device. Parked for a few years and re-released as a 5 MinION flowcell box (probably with PromethION flowcells soon). “Run-until” and “read-until” as part of the package. Licensed as Academic/Commercial fee-for-service. Real-time online base-calling also in the box on accelerated FPGA hardware to cope with the huge amount of data. FPGAs also allow very fast feedback, making read-until even better across large numbers of pores.
$125,000 upfront cost with $299 flowcells, or $0 upfront and $475 flowcells, with a $15,000 maintenance contract (payable after year 1 on both boxes). Aiming to make 3 a day after June so there should be about 600 out in the field by Christmas! First GridION shipping on May 15th – it’s a secondhand CliveOME machine.
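A quick way to compare the two GridION pricing options is to find the flowcell count at which the upfront purchase starts to win. This Python sketch reflects my reading of the prices above ($125,000 upfront with $299 flowcells versus $0 upfront with $475 flowcells) and ignores the maintenance contract because it applies to both options; the function names are mine.

```python
import math


def total_cost(upfront, price_per_flowcell, n_flowcells):
    """Total spend on the box plus flowcells."""
    return upfront + price_per_flowcell * n_flowcells


def break_even_flowcells(upfront_a, price_a, upfront_b, price_b):
    """Smallest flowcell count at which option A is cheaper than option B.

    Assumes A has the higher upfront cost and the lower per-flowcell price.
    """
    if price_b <= price_a:
        raise ValueError("option B must have the higher per-flowcell price")
    exact = (upfront_a - upfront_b) / (price_b - price_a)
    n = math.ceil(exact)
    # If the costs are exactly equal at n, A only wins from n + 1.
    if total_cost(upfront_a, price_a, n) >= total_cost(upfront_b, price_b, n):
        n += 1
    return n


print(break_even_flowcells(125_000, 299, 0, 475))
```

On these numbers a lab needs to run several hundred flowcells before the upfront option pays off, so the decision mostly comes down to expected throughput over the life of the instrument.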
PromethION: 48 individually addressable flowcells. Chemistry is the same but everything else is new. Designed to take on Illumina X-Ten/NovaSeq. In principle it can produce more data than a NovaSeq = 2Tb now and 11Tb theoretical maximum (NovaSeq should generate 3Tb in December). Major benefit is the flexibility to run 1 or 48 flowcells, avoiding the need to batch samples for big sequencers.
Currently debugging flowcell production. Getting about 1Gb per hour across 48 flowcells = 22Gb in 24 hours. Extending run-time to 4 days. Base-calling in real-time using attached compute (it’s a big server): a bank of FPGAs can do a huge amount of processing = 40 Tflops.
Should expect 1.2Tb on alpha, 4.8Tb on beta (Q4 2017), and 9.6Tb on MK1 in 2018! The box that will turn the market from short-reads to long-reads.
1D2: R9.5 1D2 for improved template – complement data. A sequencing scheme where reads are not joined, uses trickery to make sure second strand follows the 1st strand through at least 50% of the time. Short and simple sample-prep with better data than 2D. Aiming towards 1% error. Need to change pore but this affects hardly anything (Clive promised backwardly compatible upgrades last year). Released May 8th, and 2D goes in the bin!
Accuracy: getting towards 99.99% by this year, maybe even by this Summer!
Other stuff: New pores include a very exciting class of pore that has two “read heads”: how soon till we get 8-track Nanopore? Raw data basecaller coming soon. MinKNOW moving to writing out a FASTQ file. Improvements to shipping: removal of the cold chain, with flowcells stored and shipped at ambient temperatures. New packaging in wool and cardboard (knitting pattern included). Updated rapid ligation kit with freeze-dried reagents and multi-step tube-to-tube processing. Zumbador is still in development (no data today); the sample passes through sequential layers of chemistry. The new device (no name yet) is a mobile analysis dongle which contains an FPGA to allow real-time basecalling on MinION as an in-line or off-line analysis tool – the laptop is going (coming before Summer 2017)!
Flongle/SmidgION: a flowcell dongle (there is one at London Calling); minimal electronics allow cheaper, simpler, smaller flowcells. Allows high-volume production. “The MinION we should have made”. Coming at the end of 2017. Beautiful slides (I’ll try to grab some shots). All of this needs to come together to enable SmidgION as a follow-on to Flongle. Photo of a MinION powered and run on an Android mobile.
More than just sequencing: Blob counting is coming. Solid-state nanopores are coming. CAS9 programmable blobs – target fragments for selection and sequencing using beads. Or just count CAS9 blobs. “CAS me if you can” an Oxford Nanopore enrichment kit! Design your own gene panels, or buy ONT panels.
Metrichor and The Genome Foundry: Metrichor: deliver end-to-end commercial products from ONT technology. Will make an analytics platform called EPI2ME, which takes standard bioinformatics workflows and pipelines. New features include ability to design and build workflows. Might be possible to go from squiggle to answer using combination of FPGA dongle and Metrichor “get rid of basecalling”.
The Genome Foundry, on the old Solexa campus, is working on synthetic DNA (a Twist Bioscience competitor perhaps).
A shitload of amazing updates coming by December this year…rock on Christmas!
Sadly I will not be at dinner so will miss the evening speaker. Thanks to ONT and all of today’s speakers. See you tomorrow morning.