bio2zarr#
bio2zarr efficiently converts common bioinformatics formats to Zarr format.
Tools#
vcfpartition is a utility to split an input VCF into a given number of partitions. This is useful for parallel processing of VCF data.
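To illustrate the idea of partitioning for parallel work, here is a minimal conceptual sketch in Python. It is not how vcfpartition is implemented and does not use the bio2zarr package; the function name `partition_regions` and its parameters are hypothetical, and it simply divides a contig's coordinate range into near-equal region strings.

```python
def partition_regions(contig, start, end, num_partitions):
    """Split the 1-based inclusive interval [start, end] on a contig
    into contiguous, non-overlapping region strings.

    Hypothetical helper for illustration only; not part of bio2zarr.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    length = end - start + 1
    if length < 1:
        raise ValueError("end must not be less than start")

    # Never produce more partitions than there are positions.
    num_partitions = min(num_partitions, length)

    # Spread any remainder over the first `extra` partitions so that
    # partition sizes differ by at most one position.
    base, extra = divmod(length, num_partitions)

    regions = []
    pos = start
    for i in range(num_partitions):
        size = base + (1 if i < extra else 0)
        regions.append(f"{contig}:{pos}-{pos + size - 1}")
        pos += size
    return regions


if __name__ == "__main__":
    for region in partition_regions("chr1", 1, 10, 3):
        print(region)
```

Each resulting region could then be handed to a separate worker process. Equal-width coordinate ranges do not guarantee an equal number of variants per partition, so treat this only as a starting point.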
Development status#
bio2zarr is in development; contributions, feedback, and issues are welcome at the GitHub repository.
Support for converting PLINK data to VCF Zarr is partially implemented, and support for BGEN and tskit is also planned. If you would like to see support for other formats (or are interested in helping to implement them), please open an issue on GitHub to discuss!
The package is currently focused on command line interfaces, but a Python API is also planned.
Warning
Although it is possible to import the bio2zarr Python package, the APIs are purely internal for the moment and will change in arbitrary ways. Please don’t use them (or open issues about them on GitHub).