Stacey is the Director of the Genomics Platform at the Broad Institute. She gave us a brief look back to 2006: 117 instruments (GAs) producing 600,000 bases per day and 1 genome, versus 30 machines producing 12,000,000,000,000 bases per day and 32,693 genomes!
But one size does not fit all where genome sequencing is concerned. Different types of sequencing suit specific projects: population genomics needs large scale, Mendelian disease needs phasing, and cancer needs very deep sequencing. From a 10X perspective, the Broad is interested primarily in phased sequencing and low input. So far they’ve produced about 30 Chromium genomes and have assessed bias and SNP calling from a single library (rather than the previously recommended one TruSeq library and one 10X library per sample).
The workflow is compatible and scalable with current platforms at the Broad; high-molecular-weight DNA is the biggest challenge, but should be “easily automatable”. The instrument is robust and automation friendly. Library prep is improved and can now run on any platform. GC bias is better on 10X Chromium than with the Broad’s PCR+ method, and is almost as good as with PCR-free preps. Chromium produces good SNP/InDel calls out of the box.
Stacey presented a test of phasing analysis from Dan MacArthur’s lab. These were cell lines with mutations in TITIN, so very long-range information was needed. The N50 was 2.4 Mb, with the longest block at 11.7 Mb. She then presented another case resolving complexity in tumour rearrangements, and wrapped up by talking about large-scale sequencing studies. She used the recent C4 association example in schizophrenia and asked whether this discovery could have been facilitated by linked reads. They took the NA12878 10X Chromium genomes and saw that they could phase both forms of C4. They were able to genotype the sample for “BS” and “AL-BL” haplotypes, which allowed direct observation of haplotypes without trios.
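As an aside on the N50 figure quoted for the phasing test: N50 is the block length at which blocks of that size or longer cover at least half of the total phased length. A minimal sketch of the calculation follows, assuming you already have a list of phase block lengths. The function name and the example lengths are hypothetical and chosen for illustration; they are not data from the talk.

```python
def n50(block_lengths):
    """Return the N50 of a collection of block lengths.

    Blocks are sorted longest first; the N50 is the length of the block
    at which the running total first reaches half of the summed length.
    """
    lengths = sorted(block_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one block length")

    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Hypothetical phase block lengths in Mb (illustrative values only).
example_blocks = [8, 5, 3, 2, 2]
print(n50(example_blocks))
```

Because N50 is weighted by length, a handful of very long blocks can pull it well above the median block size, which is why it is usually reported alongside the longest block.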
Next steps include development of GATK to use linked-read information, and scaling up to large-scale population studies: will the Broad use 10X as the default on X Ten?
Scott’s been working on the Chromium single-cell application in graft-versus-host disease. He wants to marry the complexity of the disease to the complexity of the available tools: flow cytometry can only query a limited number of markers, and bulk RNA-seq cannot be sufficiently resolved. Single-cell on 10X Chromium allows analysis of 2000-3000 cells per sample. They used several controls in their experiment: an autologous control, an allogeneic control, and a T-cell stimulation bead control.
Measuring gene expression changes over time is something Scott described as very difficult in bulk RNA-seq, but single-cell resolves this using Monocle (a novel unsupervised machine-learning algorithm) to track cells through “pseudotime”. They could see three major branches: 1) + proliferating – activating, 2) + proliferating + activating, 3) – proliferating – activating. What does this mean? Branch 1 may be the most interesting, as no activation is seen (were they ever activated?) but the cells are proliferating. Most drugs inhibit activation, NOT proliferation, so these cells may represent a new therapeutic target. All this was from 6000 cells, with another 44,284 cells still to analyse!