I have 10x scRNA-seq data from human + SARS-CoV-2.
After aligning to the correct genome and loading the data in scanpy, I can view the counts for viral genes.
I am interested in cells that have viral counts, i.e. infected cells.
To retrieve the cells that express viral genes, I have a list of the genes that belong to the virus.
I need to make a new adata containing only the infected cells.
How do I retrieve the adata.obs only for certain adata.var_names?
Specific instructions would be super helpful!
So far I have this:
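# Note: startswith matches prefixes, so 'S', 'E', 'M' and 'N' also match any human gene whose name starts with those letters
covid_genes = adata.var_names.str.startswith(('ORF1a', 'S', 'ORF3a', 'E', 'M', 'ORF6', 'ORF7a', 'ORF7b', 'ORF8', 'N', 'ORF10'))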
I added a column with the fraction of each cell's counts that come from viral genes:
adata.obs['covid'] = np.sum(
    adata[:, covid_genes].X, axis=1).A1 / np.sum(adata.X, axis=1).A1