Python API


Basic usage:

import bio2zarr.tskit as ts2z

ts2z.convert(ts_path, vcz_path, worker_processes=8)

This converts the tskit tree sequence stored at ts_path to VCF Zarr stored at vcz_path using 8 worker processes. The mapping from the tskit data model to VCF Zarr is handled by the tskit.TreeSequence.map_to_vcf_model() method, which convert() calls with no arguments when its model_mapping parameter is not specified.
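The fallback described above can be pictured with a small sketch. The helper name resolve_model_mapping is hypothetical and this is not bio2zarr's actual source; it only mirrors the documented behaviour, assuming the tree sequence object exposes map_to_vcf_model().

```python
def resolve_model_mapping(ts, model_mapping=None):
    """Return the mapping to use, falling back to the default one.

    Hypothetical helper: it mirrors the documented behaviour of convert(),
    not bio2zarr's actual implementation.
    """
    if model_mapping is None:
        # No mapping supplied: ask the tree sequence for its default mapping.
        model_mapping = ts.map_to_vcf_model()
    return model_mapping
```

Under this reading, calling convert() without model_mapping and calling it with model_mapping=ts.map_to_vcf_model() should give the same result.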

For more control over the output, for example to include only a specific subset of individuals, call map_to_vcf_model() yourself and pass the resulting mapping to convert():

import tskit
ts = tskit.load(ts_path)  # map_to_vcf_model() needs a loaded TreeSequence, not a path
model_mapping = ts.map_to_vcf_model(individuals=[0, 1])
ts2z.convert(ts, vcz_path, model_mapping=model_mapping)
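When the subset is defined by a rule rather than a fixed list of IDs, the selection step can be factored out. The sketch below is one possible approach, not part of bio2zarr: select_individuals and convert_subset are hypothetical helper names, and it assumes only that ts.individuals() yields objects with an id attribute, as tskit tree sequences do.

```python
def select_individuals(ts, keep):
    """Return the IDs of individuals for which keep(individual) is true."""
    return [ind.id for ind in ts.individuals() if keep(ind)]


def convert_subset(ts, vcz_path, keep, **convert_kwargs):
    """Convert only the individuals selected by the predicate `keep`.

    Hypothetical wrapper around the documented convert() call.
    """
    import bio2zarr.tskit as ts2z  # imported lazily; requires bio2zarr

    individuals = select_individuals(ts, keep)
    # Build a mapping restricted to the chosen individuals, then convert.
    model_mapping = ts.map_to_vcf_model(individuals=individuals)
    ts2z.convert(ts, vcz_path, model_mapping=model_mapping, **convert_kwargs)
    return individuals
```

For instance, convert_subset(ts, vcz_path, keep=lambda ind: len(ind.nodes) == 2) would restrict the output to individuals with exactly two nodes.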

API reference

bio2zarr.tskit.convert(ts_or_path, vcz_path, *, model_mapping=None, contig_id=None, isolated_as_missing=False, variants_chunk_size=None, samples_chunk_size=None, worker_processes=0, show_progress=False)

Convert a tskit.TreeSequence (or path to a tree sequence file) to VCF Zarr format stored at the specified path.
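Everything after the * in the signature is keyword-only, so options must be passed by name. As a rough sketch, options collected elsewhere (say, from a configuration dictionary) can be checked against the signature before any conversion work starts. The names CONVERT_OPTIONS, check_convert_options and run_conversion are hypothetical; only the option names themselves come from the signature above, and their meanings are not assumed here.

```python
# Keyword-only option names copied from the convert() signature above.
CONVERT_OPTIONS = frozenset({
    "model_mapping",
    "contig_id",
    "isolated_as_missing",
    "variants_chunk_size",
    "samples_chunk_size",
    "worker_processes",
    "show_progress",
})


def check_convert_options(**options):
    """Return the options unchanged, or raise TypeError on unknown names."""
    unknown = sorted(set(options) - CONVERT_OPTIONS)
    if unknown:
        raise TypeError(f"unknown convert() option(s): {', '.join(unknown)}")
    return options


def run_conversion(ts_or_path, vcz_path, **options):
    """Validate option names, then delegate to bio2zarr.tskit.convert()."""
    checked = check_convert_options(**options)
    import bio2zarr.tskit as ts2z  # imported lazily; requires bio2zarr

    ts2z.convert(ts_or_path, vcz_path, **checked)
```

A call such as run_conversion(ts_path, vcz_path, worker_processes=8, show_progress=True) passes straight through, while a misspelled option name fails immediately with a TypeError.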

Todo

Document parameters