Background

As droplet-based single cell technologies have advanced, increasingly large numbers of samples have been used to answer research questions at single cell resolution. As more droplets are captured, a larger proportion of them are doublets, droplets that contain two cells (Figure 1).

These larger sample numbers have been made possible because, as droplet-based capture technologies have been optimized, methods have been developed to pool samples and then demultiplex them, that is, to assign each droplet to an individual in the pool. These multiplexing methods decrease the cost and time of scRNA-seq experiments. However, if left in the dataset, doublets can significantly impact scientific conclusions, for example by producing spurious cell trajectories or false novel cell types. Therefore, it is crucial to effectively remove doublets from datasets prior to downstream analyses.

In addition to demultiplexing software, there is also doublet detecting software that identifies doublets by simulating doublets and comparing them with the transcriptional profiles of the observed droplets.
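
The simulation-based idea can be illustrated with a minimal sketch. It is not the implementation of any particular tool. It assumes a droplets-by-genes count matrix, creates artificial doublets by summing the counts of random pairs of droplets, and scores each observed droplet by the fraction of its nearest neighbours that are simulated. The function names (`simulate_doublets`, `doublet_scores`), the parameters (`n_doublets`, `k`, `seed`) and the normalisation are assumptions made for illustration. Real tools typically add steps such as gene selection and dimensionality reduction before the neighbour search.

```python
import numpy as np


def simulate_doublets(counts, n_doublets, rng):
    """Sum the counts of random pairs of distinct droplets (hypothetical helper)."""
    n = counts.shape[0]
    first = rng.integers(0, n, size=n_doublets)
    # A non-zero offset guarantees that the second droplet differs from the first.
    second = (first + rng.integers(1, n, size=n_doublets)) % n
    return counts[first] + counts[second], first, second


def doublet_scores(counts, n_doublets=None, k=10, seed=0):
    """Score observed droplets by the fraction of simulated doublets among
    their k nearest neighbours (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    n = counts.shape[0]
    if n_doublets is None:
        n_doublets = n
    simulated, _, _ = simulate_doublets(counts, n_doublets, rng)

    # Normalise observed and simulated profiles together: scale each droplet
    # to 10,000 counts, then log-transform (an assumed, simple normalisation).
    combined = np.vstack([counts, simulated]).astype(float)
    totals = combined.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0
    profiles = np.log1p(combined / totals * 1e4)
    is_simulated = np.concatenate(
        [np.zeros(n, dtype=bool), np.ones(n_doublets, dtype=bool)]
    )

    # Squared Euclidean distance from every observed droplet to every profile.
    observed = profiles[:n]
    sq_dist = (
        (observed ** 2).sum(axis=1)[:, None]
        + (profiles ** 2).sum(axis=1)[None, :]
        - 2.0 * observed @ profiles.T
    )
    # A droplet must not count itself as a neighbour.
    sq_dist[np.arange(n), np.arange(n)] = np.inf

    neighbours = np.argsort(sq_dist, axis=1)[:, :k]
    return is_simulated[neighbours].mean(axis=1)


if __name__ == "__main__":
    # Toy input: random Poisson counts, used only to show the call.
    toy_counts = np.random.default_rng(1).poisson(2.0, size=(200, 50))
    scores = doublet_scores(toy_counts, k=10)
    print(scores[:5])
```

Each score lies between 0 and 1, and a higher score means that more of a droplet's nearest neighbours are simulated doublets. How a score is turned into a doublet call (for example, by a threshold) is left open here, since that choice differs between tools.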