DoubletFinder

DoubletFinder is a transcription-based doublet detection software that uses simulated doublets to find droplets that have a high proportion of doublet neighbors. We provide a wrapper script that takes common DoubletFinder arguments, as well as an example script that you can run manually in R if you prefer.

Data

This is the data that you will need to have prepared to run DoubletFinder:

Required

  • A QC-filtered and normalized Seurat object saved as an RDS file ($SEURAT_RDS)

    • For example, using the Seurat Vignette

    • If you run DoubletFinder manually, you can use any data format and read it in with whatever method works for your data.

  • Output directory ($DOUBLETFINDER_OUTDIR)

  • Expected number of doublets ($DOUBLETS)
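
If you do not already have an expected doublet count, one way to get a rough starting value is sketched below. This is an assumption rather than part of Demuxafy or DoubletFinder: it uses the commonly quoted rule of thumb of about 0.8% doublets per 1,000 droplets captured, and the function name expected_doublets is hypothetical. Adjust the rate to suit your capture platform and loading.

```shell
# Hypothetical helper: estimate the expected number of doublets.
# Assumption: doublet rate grows by ~0.8% per 1,000 droplets captured,
# so doublets = N * (N / 1000) * rate, rounded to the nearest integer.
# Usage: expected_doublets N_DROPLETS [RATE_PER_1000]
expected_doublets() {
    awk -v n="$1" -v rate="${2:-0.008}" \
        'BEGIN { printf "%d\n", n * (n / 1000) * rate + 0.5 }'
}

# Example: 20,000 captured droplets
DOUBLETS=$(expected_doublets 20000)
echo "$DOUBLETS"
```

With this heuristic, 20,000 droplets give an estimate of 3200. Treat the result as a starting point, not a definitive value.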

Run DoubletFinder

⏱️ Expected Resource Usage

~1 h using a total of 15 GB of memory with 2 threads for the full Test Dataset, which contains ~20,982 droplets from 13 multiplexed donors.

You can either run DoubletFinder with the wrapper script we have provided or you can run it manually if you would prefer to alter more parameters.

DOUBLETFINDER_OUTDIR=/path/to/output/DoubletFinder
SEURAT_RDS=/path/to/TestData_Seurat.rds
DOUBLETS=3200
singularity exec Demuxafy.sif DoubletFinder.R -o $DOUBLETFINDER_OUTDIR -s $SEURAT_RDS -c TRUE -d $DOUBLETS
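
Before launching the container, it can help to check the inputs first. The following is a minimal sketch and not part of Demuxafy: the function name preflight_doubletfinder is hypothetical, and it assumes the wrapper may not create the output directory for you.

```shell
# Hypothetical pre-flight check; not part of Demuxafy.
# Usage: preflight_doubletfinder OUTDIR SEURAT_RDS DOUBLETS
preflight_doubletfinder() {
    outdir=$1; rds=$2; doublets=$3
    # The Seurat object must exist and be readable.
    [ -r "$rds" ] || { echo "Cannot read Seurat RDS: $rds" >&2; return 1; }
    # The expected doublet count must consist of digits only...
    case $doublets in
        ''|*[!0-9]*) echo "DOUBLETS must be a positive integer: $doublets" >&2; return 1 ;;
    esac
    # ...and be greater than zero.
    [ "$doublets" -gt 0 ] || { echo "DOUBLETS must be greater than 0" >&2; return 1; }
    # Create the output directory if it is missing.
    mkdir -p "$outdir"
}

# Example (commented out so the sketch has no side effects):
# preflight_doubletfinder "$DOUBLETFINDER_OUTDIR" "$SEURAT_RDS" "$DOUBLETS" &&
#     singularity exec Demuxafy.sif DoubletFinder.R -o "$DOUBLETFINDER_OUTDIR" -s "$SEURAT_RDS" -c TRUE -d "$DOUBLETS"
```

Quoting the variables, as in the commented example, keeps paths that contain spaces intact.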

You can also provide other parameters. To list them, run the wrapper with the help option:

singularity exec Demuxafy.sif DoubletFinder.R -h


usage: DoubletFinder.R [-h] -o OUT -s SEURAT_OBJECT -c SCT -d DOUBLET_NUMBER [-p PCS] [-n PN]

optional arguments:
  -h, --help            show this help message and exit
  -o OUT, --out OUT     The output directory where results will be saved
  -s SEURAT_OBJECT, --seurat_object SEURAT_OBJECT
                        A QC, normalized seurat object with classifications/clusters as Idents() saved as an rds object.
  -c SCT, --sct SCT     Whether sctransform was used for normalization.
  -d DOUBLET_NUMBER, --doublet_number DOUBLET_NUMBER
                        Number of expected doublets based on droplets captured.
  -p PCS, --PCs PCS     Number of PCs to use for 'doubletFinder_v3' function.
  -n PN, --pN PN        Number of doublets to simulate as a proportion of the pool size.

DoubletFinder Results and Interpretation

After running DoubletFinder, you will have multiple files in the $DOUBLETFINDER_OUTDIR:

/path/to/output/DoubletFinder
├── DoubletFinder_doublets_singlets.tsv
├── DoubletFinder_doublet_summary.tsv
└── pKvBCmetric.png

Here’s a more detailed description of the contents of each of those files:

  • DoubletFinder_doublet_summary.tsv

    • A summary of the number of singlets and doublets predicted by DoubletFinder.

      Classification    Droplet N
      doublet           3014
      singlet           16395

  • DoubletFinder_doublets_singlets.tsv

    • The per-barcode singlet and doublet classification from DoubletFinder.

      Barcode               DoubletFinder_score    DoubletFinder_DropletType
      AAACCTGAGATAGCAT-1    0.206401766004415      singlet
      AAACCTGAGCAGCGTA-1    0.144039735099338      singlet
      AAACCTGAGCGATGAC-1    0.191501103752759      singlet
      AAACCTGAGCGTAGTG-1    0.212472406181015      singlet
      AAACCTGAGGAGTTTA-1    0.242273730684327      singlet
      AAACCTGAGGCTCATT-1    0.211368653421634      singlet
      AAACCTGAGGGCACTA-1    0.626379690949227      doublet

  • pKvBCmetric.png

    • This is the metric that DoubletFinder uses to call doublets and singlets. Typically the pK value at the maximum BC value is the best doublet calling threshold.

      _images/pKvBCmetric.png
    • If you do not have a clear BC maximum, see responses from the DoubletFinder developer here and here for possible solutions.
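
Once the run has finished, the per-barcode file can be tallied and compared against DoubletFinder_doublet_summary.tsv, or filtered down to the singlet barcodes. The helpers below are a sketch with hypothetical function names. They assume the file is tab-separated with a single header row and has the classification in the third column (DoubletFinder_DropletType), as in the example table.

```shell
# Hypothetical helper: count droplets per classification.
# Assumes a tab-separated file with one header row and the
# classification (DoubletFinder_DropletType) in column 3.
count_droplet_types() {
    awk -F '\t' 'NR > 1 { n[$3]++ }
                 END { for (k in n) printf "%s\t%d\n", k, n[k] }' "$1" | sort
}

# Hypothetical helper: print the barcodes classified as singlets, one per line.
singlet_barcodes() {
    awk -F '\t' 'NR > 1 && $3 == "singlet" { print $1 }' "$1"
}

# Example (commented out; requires a finished DoubletFinder run):
# count_droplet_types "$DOUBLETFINDER_OUTDIR/DoubletFinder_doublets_singlets.tsv"
# singlet_barcodes "$DOUBLETFINDER_OUTDIR/DoubletFinder_doublets_singlets.tsv" > singlet_barcodes.txt
```

The counts printed by count_droplet_types should agree with the summary file. If they do not, check that the file really is tab-separated and has only one header row.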

Merging Results with Other Software Results

We have provided a script that helps merge and summarize the results from multiple software tools. See Combine Results.

Citation

If you used the Demuxafy platform for analysis, please cite our preprint as well as DoubletFinder.