Clarification of QC metrics

Hi,

Thanks for making Scanpy! It works wonderfully, and is in general easy to use.

However, I am confused by the exact meaning of the QC metrics calculated by scanpy.pp.calculate_qc_metrics. As used in multiple tutorials, this function outputs ‘n_genes_by_counts’. I am struggling to understand exactly what this value means.

I cannot see a further explanation in the documentation for this function, but the pbmc3k tutorial lists this as:

  • the number of genes expressed in the count matrix

But I still cannot wrap my head around exactly what it means.

I think I figured it out. The pbmc3k tutorial is a replicate of Seurat's tutorial, and it seems like the equivalent metric in Seurat is ‘nFeature_RNA’. There this metric is the number of genes (features) detected per cell, as opposed to ‘nCount_RNA’, which is the number of RNA molecules detected per cell. So for every cell we would have a value for nFeature_RNA. For example, cell A might have a value of 732. This means that we detected 732 genes in this cell.

I don’t know what threshold scanpy uses for detection, but I guess it would be a count value greater than 0.
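
To check my understanding, here is a minimal sketch in plain Python (not scanpy itself). It assumes the detection rule is "count greater than 0", which is my guess above, and uses a made-up toy count matrix with cells as rows and genes as columns. The variable names are just for illustration.

```python
# Toy count matrix: rows are cells, columns are genes (made-up values).
counts = [
    [0, 3, 1, 0, 7],  # cell A
    [2, 0, 0, 0, 0],  # cell B
    [1, 1, 4, 2, 9],  # cell C
]

# Assumed detection rule: a gene is "detected" in a cell if its count is > 0.
# Under that assumption this is what I expect n_genes_by_counts to hold.
n_genes_by_counts = [sum(1 for c in cell if c > 0) for cell in counts]

# For contrast: total counts per cell (the analogue of Seurat's nCount_RNA).
total_counts = [sum(cell) for cell in counts]

print(n_genes_by_counts)  # [3, 1, 5]
print(total_counts)       # [11, 2, 17]
```

If this is right, cell A has 3 detected genes even though it has 11 counts in total, so the two metrics answer different questions.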