Load, process and save data in chunks

My dataset is too big (5M cells) to process in memory.
I know about backed mode and chunked loading, but can I save data back to h5ad in chunks?
Here is a more concrete example of me normalizing a large data file:

```python
import scanpy as sc
from anndata import AnnData

chunk_size = 100
ad = sc.read_h5ad(file_name, backed='r+')
for chunk, start, end in ad.chunked_X(chunk_size):
    # this normalizes an in-memory copy of the chunk
    sc.pp.normalize_total(AnnData(chunk), target_sum=1e6, inplace=True)
```

But how can I write the processed chunks incrementally to a new file?
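One possible workaround, while waiting for a proper answer, would be to go through `h5py` directly and append each processed chunk to a resizable dataset. A minimal sketch, assuming a dense matrix — the file name, dimensions, and random stand-in data are all hypothetical, and the result is a plain HDF5 file rather than a full `.h5ad`:

```python
import h5py
import numpy as np

chunk_size = 100
n_obs, n_vars = 1000, 50            # hypothetical dimensions
rng = np.random.default_rng(0)

with h5py.File("processed.h5", "w") as out:
    # Resizable dataset: grows along the obs axis as chunks arrive.
    dset = out.create_dataset(
        "X",
        shape=(0, n_vars),
        maxshape=(None, n_vars),
        dtype="float32",
        chunks=(chunk_size, n_vars),
    )
    for start in range(0, n_obs, chunk_size):
        end = min(start + chunk_size, n_obs)
        # Stand-in for a processed chunk from the loop above.
        chunk = rng.random((end - start, n_vars), dtype="float32")
        dset.resize(end, axis=0)    # grow the dataset
        dset[start:end] = chunk     # write the new rows
```

Only the matrix itself is written this way; `obs`/`var` metadata would still need to be added for the file to round-trip as an AnnData object.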

Unfortunately I’m not very familiar with chunked loading and writing. @ivirshup could you help here?

So, you can modify at least the `X` component of your backed AnnData object in place. We're currently looking at a more thorough way to handle this by integrating with dask; in particular, `normalize_total` should be able to work with dask as of this PR. Unfortunately, there's not much support for sparse arrays in dask at the moment.
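As a side note on why chunking over cells is safe for this particular step: total-count normalization is purely row-wise, so normalizing each obs-chunk independently gives the same result as normalizing the whole matrix at once. A small NumPy sketch of that equivalence, with hypothetical data and the scaling reimplemented directly rather than calling scanpy:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.poisson(2.0, size=(300, 40)).astype(float)  # hypothetical counts matrix

def normalize_rows(mat, target_sum=1e6):
    # Scale each row (cell) so its counts sum to target_sum,
    # mirroring the dense case of total-count normalization.
    sums = mat.sum(axis=1, keepdims=True)
    return mat / sums * target_sum

# Whole-matrix result.
full = normalize_rows(X)

# Chunked result: normalize each block of rows independently.
chunk_size = 100
chunked = np.vstack([
    normalize_rows(X[s:s + chunk_size])
    for s in range(0, X.shape[0], chunk_size)
])

assert np.allclose(full, chunked)
```

The same row-independence is what makes both the chunked loop above and dask's block-wise evaluation valid here.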

@Koncopd may know a bit more about current chunked capabilities in scanpy.