Calculate_qc_metrics gene count

Hi,

I’m fairly new to scanpy, but I’m sure I’ll be using it a lot in the coming months! :slight_smile:

I’m a bit confused about the “pct_counts_in_top_X_genes” columns produced by the calculate_qc_metrics function.
I’ve been demultiplexing data with STARsolo and loaded the raw matrix into scanpy (the barcodes, BCs, do not have their usual meaning in this experiment).

import scanpy as sc

adata = sc.read_10x_mtx("<path>/<ID>.noPCRdup.Solo.out/Gene/raw/")

I ran calculate_qc_metrics on the count matrix and found ~15k genes with at least one count.

sc.pp.calculate_qc_metrics(adata, inplace=True)
# number of genes with at least one count across all BCs
(adata.var.total_counts > 0).sum()

But when I look for BCs that have counts in genes outside the top 500 most expressed, only one is listed, and it has only ~900 different genes (n_genes_by_counts).

display(adata.obs.loc[adata.obs.pct_counts_in_top_500_genes < 100, :])

How is this possible? Where am I going wrong?

Cheers,
Mathieu

Hi,

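To answer my own question, in case anyone runs into the same thing:
I had mixed things up. The top X genes are determined for each cell individually, not across all cells. So any BC with X or fewer detected genes has pct_counts_in_top_X_genes equal to 100, and only BCs with more than X detected genes can fall below 100.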
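
Here is a small toy sketch of what I mean. It is plain Python, not scanpy's actual implementation; the helper name `pct_counts_in_top_n`, the made-up counts, and the way empty BCs are handled are all my own assumptions for illustration.

```python
# Plain-Python sketch of the idea; NOT scanpy's actual implementation.
def pct_counts_in_top_n(cell_counts, n):
    """Percentage of one cell's counts that fall in that cell's own n most expressed genes."""
    total = sum(cell_counts)
    if total == 0:
        return float("nan")  # assumption: an empty BC has no defined percentage
    top = sorted(cell_counts, reverse=True)[:n]  # ranking is done within this cell only
    return 100.0 * sum(top) / total

# Toy BC-by-gene counts (made-up numbers), 6 genes in the dataset.
counts = {
    "BC1": [5, 0, 3, 0, 0, 0],  # 2 detected genes
    "BC2": [0, 4, 0, 0, 1, 0],  # 2 detected genes, different from BC1's
    "BC3": [2, 2, 2, 2, 1, 1],  # 6 detected genes
}

n = 3
pct = {bc: pct_counts_in_top_n(row, n) for bc, row in counts.items()}

# Genes with at least one count anywhere in the dataset.
n_genes = len(next(iter(counts.values())))
genes_detected_overall = sum(
    any(row[g] > 0 for row in counts.values()) for g in range(n_genes)
)

print(pct)                     # only BC3 is below 100
print(genes_detected_overall)  # all 6 genes are detected somewhere
```

All 6 genes are detected somewhere in the dataset, yet only BC3 drops below 100, because it is the only BC with more than n = 3 detected genes. That is the same pattern as my ~15k genes overall versus a single BC with pct_counts_in_top_500_genes < 100.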

Cheers,
Mathieu