I’ve been meaning to write up a post on a bioRxiv report from earlier this year: “Scaling single cell transcriptomics through split pool barcoding”1. The Seelig Lab at the University of Washington have developed a single-cell RNA sequencing method that labels RNA molecules with cell-of-origin information using combinatorial indexing2,3.

Use of single-cell RNA sequencing has exploded in my lab. We’re running lots of 10X Genomics, as well as some Fluidigm C1, Drop-Seq and other home-brew methods. scRNA-Seq is rapidly becoming a mainstream tool in biological research. However, the current methods have several barriers to entry, primarily the capital cost of hardware and expensive reagents. At least one system provider is working to solve this: DolomiteBio’s new Nadia system allows low-cost entry to the Drop-seq method, and potentially complete flexibility over single-cell methods development (I’ll cover this in a future post).


The Seelig lab has removed the need for any specific single-cell instrumentation. The SPLiT-Seq method uses only basic laboratory equipment, and implements the Drop-Seq library prep4 after cell RNA is labelled. In the report the authors describe the method (see figure below) and present data from an experiment where they analysed the transcriptome of the postnatal day 5 mouse brain. They prepped and sequenced over 100,000 cells and identified 13 neuronal populations.

Perhaps the major drawback is the need to fix cells, something the single-cell community has not had huge success with.

The group have a webpage for SPLiT-Seq where you can find the protocol, oligo sequences, an FAQ and videos of the process. The FAQ offers advice on the maximum number of cells you can process (see note below), suggests that capture efficiency is around 30-50% (from in situ RT to the final library), and points users to an email address for questions: splitseq@gmail.com.

The number of cells that can be processed in an experiment is determined by the number of barcodes available. The group suggest that you should process no more cells than ~5% of the total barcode combinations (although cells can be split into sub-libraries before lysis, so more cells could be run in the initial ligation-barcoding steps and then sub-divided for RT and library prep). There are 884,736 barcode combinations after 3 ligations. As such, the method is likely to work optimally, and without the need to create sub-libraries, on up to 44,236 cells.
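The 5% guideline can be checked with a few lines of arithmetic. The sketch below is mine rather than the authors': it assumes 96 barcodes per ligation round (inferred only from 96 × 96 × 96 = 884,736) and that every cell picks its combination uniformly and independently at random. The function names are hypothetical.

```python
# Assumption: 96 barcodes per ligation round, inferred from 96**3 == 884,736.
WELLS_PER_ROUND = 96
LIGATION_ROUNDS = 3
MAX_PERCENT_OF_BARCODES = 5  # the ~5% guideline quoted above


def barcode_combinations(wells: int, rounds: int) -> int:
    """Number of distinct barcode combinations after `rounds` split-pool steps."""
    return wells ** rounds


def max_cells(combinations: int, percent: int = MAX_PERCENT_OF_BARCODES) -> int:
    """Largest whole number of cells that stays within `percent` of the combinations."""
    return combinations * percent // 100


def expected_collision_fraction(n_cells: int, combinations: int) -> float:
    """Expected fraction of cells sharing a barcode combination with at least
    one other cell, assuming uniform and independent barcode assignment."""
    return 1.0 - (1.0 - 1.0 / combinations) ** (n_cells - 1)


combos = barcode_combinations(WELLS_PER_ROUND, LIGATION_ROUNDS)
limit = max_cells(combos)
collisions = expected_collision_fraction(limit, combos)

print(f"barcode combinations: {combos:,}")
print(f"cells at 5% of combinations: {limit:,}")
print(f"expected barcode-collision fraction: {collisions:.1%}")
```

Under these assumptions, loading cells at 5% of the barcode space means a little under 5% of cells would be expected to share a barcode combination with another cell. That is a barcode-collision estimate only; it says nothing about cells that physically stick together.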

The authors did perform a species-mixing experiment using mouse (NIH/3T3) and human (HeLa-S3) cells at approximately equal proportions to assess doublet rates. They present two different experiments using varying numbers of cells. The first resulted in just 629 uniquely barcoded cells, and generated a doublet rate of just under 10%. The second generated almost 10,000 uniquely barcoded cells, and maintained the same 10% doublet rate. Slightly unexpected is their report that increasing cell numbers to 25,000 reduces doublet rates to 6.6% (I am sure this is something reviewers will pick up on). This doublet rate is high when compared to other platforms, but this is an issue the single-cell community still has to fix, and is no more of a problem with SPLiT-Seq than other methods, as long as users are aware.

The costs of SPLiT-Seq

The abstract states that SPLiT-Seq costs $0.01 per cell, but Fig. 1B shows a cost more like $0.025 at 40,000 cells per experiment. That is only $0.015 per cell more, but it equates to $1000 rather than $400 per sample. Whether single-cell RNA sequencing users will be willing to pay such high costs for a non-commercial method is debatable. However, any development in single-cell RNA-Seq methodology is welcome. This is a technology space where domination by a single company appears to be less likely than Illumina’s total domination in the sequencing space.
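For anyone who wants to plug in their own cell numbers, the per-sample figures above are simple multiplication. This is a minimal sketch using only the two per-cell costs quoted in this post; the function name is hypothetical.

```python
def cost_per_sample(n_cells: int, cost_per_cell: float) -> float:
    """Total cost of one experiment at a flat per-cell cost (US dollars)."""
    return n_cells * cost_per_cell


CELLS_PER_EXPERIMENT = 40_000

abstract_cost = cost_per_sample(CELLS_PER_EXPERIMENT, 0.01)   # abstract's figure
figure_cost = cost_per_sample(CELLS_PER_EXPERIMENT, 0.025)    # my reading of Fig. 1B

print(f"at $0.01 per cell:  ${abstract_cost:,.0f}")
print(f"at $0.025 per cell: ${figure_cost:,.0f}")
print(f"difference:         ${figure_cost - abstract_cost:,.0f}")
```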


In the report the group state that “as sequencing capacity increases, SPLiT-seq will enable profiling of billions of cells in a single experiment”. The method can probably scale; addition of a 4th barcoding ligation will extend the number of barcodes to about 85 million, and a 5th would get to 8 billion. The authors also describe how significantly more barcode combinations can be generated by performing the RT with barcoded oligo-dT primers, processed in multiple wells of a plate. If RT is done in a 384-well plate they estimate that 22 billion barcode combinations could be generated.
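The growth in barcode space with each extra ligation round is just repeated multiplication. The sketch below again assumes 96 barcodes per round (my inference from 96 × 96 × 96 = 884,736, not a figure taken from the protocol) and applies the ~5% loading guideline to each case. It does not try to reproduce the authors' 384-well RT estimate.

```python
# Assumption: 96 barcodes per ligation round (inferred from 96**3 == 884,736).
WELLS_PER_ROUND = 96

# Barcode combinations for 3, 4 and 5 ligation rounds.
combinations_by_rounds = {
    rounds: WELLS_PER_ROUND ** rounds for rounds in (3, 4, 5)
}

for rounds, combos in combinations_by_rounds.items():
    # ~5% guideline: keep cell numbers at or below 5% of the combinations.
    cells_at_5_percent = combos * 5 // 100
    print(f"{rounds} rounds: {combos:>13,} combinations, "
          f"~{cells_at_5_percent:,} cells at 5% loading")
```

With these assumptions a 4th round gives 84,934,656 combinations and a 5th gives 8,153,726,976, which matches the roughly 85 million and 8 billion quoted above.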

However, a billion cells sequenced at 50,000 reads per cell would still require some 40 NovaSeq S4 flowcells! And even Chan-Zuckerberg might not have pockets deep enough for a $1 million per sample experiment!