Why does the UMAP plot look the same regardless of the number of PCs?

Hi,

I am totally new to Scanpy as I have been using Seurat so far.

I am trying to use Scanpy in conjunction with scVI to analyze some 10X data. In Seurat, I am used to the UMAP plot looking somewhat different depending on the number of PCs used in the analysis. In Scanpy, sc.tl.umap has no option to set the number of PCs, so I assume that the only parameter that matters is n_pcs in sc.pp.neighbors. However, setting this parameter to different values changes nothing for me. Does this make sense? Am I missing something?

Thank you in advance!
Bogdan

Hi!

Your assumption sounds right: the n_pcs passed to sc.pp.neighbors should affect the resulting UMAP. It does for me. Here’s an example:

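import scanpy as sc
from matplotlib import pyplot as plt

# Load the PBMC 3k example dataset
pbmc = sc.datasets.pbmc3k()

# Basic preprocessing, then PCA (50 components by default)
sc.pp.filter_genes(pbmc, min_counts=1)
sc.pp.normalize_total(pbmc)
sc.pp.log1p(pbmc)
sc.pp.pca(pbmc)

# Build two neighbor graphs that differ only in the number of PCs used
a = sc.pp.neighbors(pbmc, n_pcs=50, copy=True)
b = sc.pp.neighbors(pbmc, n_pcs=20, copy=True)

# Compute a UMAP embedding from each neighbor graph
sc.tl.umap(a)
sc.tl.umap(b)

# Plot both embeddings, one above the other
fig, (ax1, ax2) = plt.subplots(2)
sc.pl.umap(a, ax=ax1, show=False)
sc.pl.umap(b, ax=ax2, show=False)
plt.show()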

Do you get different results from this?


Hi,

Thanks a lot for your answer and for confirming my suspicion about how UMAP should behave. I was able to run your code, and I realized that the UMAP plot does change with the number of PCs for my data set too, as long as I don’t pass use_rep="X_scvi" to sc.pp.neighbors (I was following the scVI/scanpy tutorial https://scvi.readthedocs.io/en/stable/tutorials/scanpy.html). I guess the number of PCs didn’t matter because I forced the function not to use the default X_pca representation. Seems kind of obvious in retrospect. Anyway, perhaps it would be helpful to explain in the sc.pp.neighbors help file in which cases n_pcs will be ignored?
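
To illustrate what I think happened, here is a toy sketch in plain Python, with no scanpy involved. The helper choose_representation and the made-up embeddings are hypothetical and only mimic the behavior described in this thread; this is not scanpy's actual code, and a given scanpy version may handle n_pcs differently when use_rep is set, so it is worth checking the documentation of the installed version.

```python
import math


def knn(points, k):
    """For each point, return the indices of its k nearest neighbors (Euclidean)."""
    neighbors = []
    for i, p in enumerate(points):
        # Sort all other points by (distance, index) so ties break deterministically
        ranked = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors.append([j for _, j in ranked[:k]])
    return neighbors


def choose_representation(obsm, use_rep=None, n_pcs=None):
    """Hypothetical stand-in for the behavior described in this thread."""
    if use_rep is None:
        # Default path: first n_pcs columns of X_pca (all columns if n_pcs is None)
        return [row[:n_pcs] for row in obsm["X_pca"]]
    # Explicit representation: used as-is, n_pcs is ignored (assumption)
    return obsm[use_rep]


# Made-up 2-column embeddings for 20 "cells"; the values carry no biological meaning
n_cells = 20
obsm = {
    "X_pca": [[float(i), float((7 * i) % n_cells)] for i in range(n_cells)],
    "X_scvi": [[float((3 * i) % n_cells), float((11 * i) % n_cells)] for i in range(n_cells)],
}

pca_1 = knn(choose_representation(obsm, n_pcs=1), k=3)
pca_2 = knn(choose_representation(obsm, n_pcs=2), k=3)
scvi_1 = knn(choose_representation(obsm, use_rep="X_scvi", n_pcs=1), k=3)
scvi_2 = knn(choose_representation(obsm, use_rep="X_scvi", n_pcs=2), k=3)

print("default rep, graphs differ:", pca_1 != pca_2)
print("use_rep set, graphs differ:", scvi_1 != scvi_2)
```

On this toy data the first line reports True and the second False. Since UMAP is computed from the neighbor graph, an unchanged graph would explain a UMAP plot that looks the same whatever n_pcs is set to.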

Thanks!
