Best practices for cross-species comparisons

I have scRNA-seq data from two species, human and mouse, for the same tissues at several developmental time points. I am interested in comparing the data between the species to identify common (and different) cell types and lineages, etc. Are there any best-practice guidelines or advice on how to use scanpy for this?

Naively, I imagine running the analysis separately on the human and mouse matrices, rather than treating them as batches within the same analysis (because of the heterogeneity). Later, one could compare the annotated cell populations and expressed genes (by mapping the human-mouse orthologs) and somehow ‘align’ the cell lineages. I expect this to be non-trivial, given the divergence of the species and the lack of direct equivalence between tissues and developmental time points.
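The ortholog-mapping step described above could look roughly like the following. This is only a minimal sketch in plain Python: the expression values, the `GENEX`/`Genex1`/`Genex2` entries, and the helper names `one_to_one_orthologs` and `shared_gene_space` are hypothetical, and a real ortholog table would come from a resource such as Ensembl. It keeps only one-to-one orthologs so that both species end up in the same gene space.

```python
from collections import Counter

# Hypothetical toy data: mean expression per gene for one cell population per species.
human_expr = {"GATA1": 5.0, "SOX2": 0.0, "MKI67": 2.5, "GENEX": 7.0}
mouse_expr = {"Gata1": 4.0, "Sox2": 0.5, "Mki67": 3.0, "Genex1": 6.0, "Genex2": 5.5}

# Hypothetical ortholog table as (human_gene, mouse_gene) pairs.
# GENEX has two mouse partners (one-to-many), so it is dropped below.
ortholog_pairs = [
    ("GATA1", "Gata1"),
    ("SOX2", "Sox2"),
    ("MKI67", "Mki67"),
    ("GENEX", "Genex1"),
    ("GENEX", "Genex2"),
]


def one_to_one_orthologs(pairs):
    """Keep only pairs where each gene appears exactly once on its side."""
    human_counts = Counter(h for h, _ in pairs)
    mouse_counts = Counter(m for _, m in pairs)
    return {h: m for h, m in pairs if human_counts[h] == 1 and mouse_counts[m] == 1}


def shared_gene_space(human, mouse, pairs):
    """Return the shared genes (human names) and aligned expression vectors."""
    mapping = one_to_one_orthologs(pairs)
    shared = sorted(h for h, m in mapping.items() if h in human and m in mouse)
    human_vec = [human[h] for h in shared]
    mouse_vec = [mouse[mapping[h]] for h in shared]
    return shared, human_vec, mouse_vec


genes, human_vec, mouse_vec = shared_gene_space(human_expr, mouse_expr, ortholog_pairs)
print(genes, human_vec, mouse_vec)
```

Dropping one-to-many and many-to-many orthologs is the simplest choice, not necessarily the best one; it loses genes but avoids having to decide how to combine paralogs.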

I hope this isn’t too broad to ask here and thank you for any help!

I tried to edit this post to make it less vague (but apparently I can’t). I’m thinking of things like using canonical correlation analysis and mutual nearest neighbours to merge datasets across species, and dynamic time warping to compare different pseudotime lineages. There are some implementations in Seurat v3 and CellAlign, and the new CAPITAL for Python.
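To make the dynamic time warping idea concrete, here is a minimal sketch of the textbook algorithm in plain Python. It is not the implementation used by any of the tools named above. The two input sequences are hypothetical values of one gene's expression, binned along pseudotime in each species, and the function name `dtw` is just for illustration.

```python
def dtw(a, b):
    """Classic dynamic time warping between two 1-D sequences.

    Returns the total alignment cost and the warping path as a list of
    (index_in_a, index_in_b) pairs.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end to recover which points were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


# Hypothetical expression of one gene along pseudotime in each species.
human_traj = [0, 1, 2]
mouse_traj = [0, 0, 1, 2]  # same shape, but the early stage lasts longer

distance, path = dtw(human_traj, mouse_traj)
print(distance, path)
```

In this toy case the first human point is matched to both early mouse points, which is the kind of stretching one would hope for when developmental timing differs between species. Real trajectories would need many genes and some normalisation first.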

I am not aware of any ‘best practice’, but usually you will end up doing some alignment between the cells of the two species to identify common types.
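As an illustration of what such an alignment can mean at its simplest, here is a sketch of the mutual nearest neighbours idea mentioned earlier in the thread, in plain Python. It assumes the cells from both species have already been placed in a shared space (for example, using one-to-one orthologs); the coordinates and the function names `nearest` and `mutual_nearest_neighbours` are hypothetical, and real tools do considerably more than this.

```python
import math


def nearest(query, reference, k):
    """For each query vector, the set of indices of its k nearest reference vectors."""
    result = []
    for q in query:
        order = sorted(range(len(reference)), key=lambda j: math.dist(q, reference[j]))
        result.append(set(order[:k]))
    return result


def mutual_nearest_neighbours(human_cells, mouse_cells, k=1):
    """Pairs (human_index, mouse_index) that are in each other's k nearest neighbours."""
    human_to_mouse = nearest(human_cells, mouse_cells, k)
    mouse_to_human = nearest(mouse_cells, human_cells, k)
    return [
        (i, j)
        for i, neighbours in enumerate(human_to_mouse)
        for j in sorted(neighbours)
        if i in mouse_to_human[j]
    ]


# Hypothetical cells in a shared 2-D space.
human_cells = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
mouse_cells = [(0.5, 0.2), (5.2, 4.8), (20.0, 20.0)]

pairs = mutual_nearest_neighbours(human_cells, mouse_cells, k=1)
print(pairs)
```

Cells that have no mutual partner (the third cell in each toy list) are left unpaired, which is how species-specific populations would show up in this simplified picture.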

You may want to try Scanorama (http://cb.csail.mit.edu/cb/scanorama/) for the cell alignment, as it works with scanpy and its output is supposed to be comparable to that of CCA, although the method itself is entirely different.

Hi, I’d love to join the discussion, since I am confused about integrating time course scRNA-seq data.

I have tried Seurat integration (v2 or v3), but I think the result over-corrects the biological variation when integrating time course data.

My current approach is simply merging them. My mentor and I are worried that reviewers will object to this approach.

It looks like Scanorama performs better on time course data, according to the article’s claims (I have NOT tried it yet).