Python API#
Basic usage:
import bio2zarr.plink as p2z
root = p2z.convert(plink_prefix, vcz_path)
This will convert the PLINK fileset with the given path prefix
(i.e. the shared prefix of the .bed, .bim, and .fam files)
to VCF Zarr stored at vcz_path.
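Because the prefix stands for three files, it can help to confirm the fileset is complete before converting. The following is a minimal sketch, not part of bio2zarr: the helper names `plink_fileset_files`, `missing_fileset_files`, and `convert_checked` are hypothetical, and only `bio2zarr.plink.convert` comes from the library.

```python
from pathlib import Path


def plink_fileset_files(prefix):
    """Return the .bed, .bim and .fam paths that share the given prefix."""
    prefix = str(prefix)
    # Append the extensions as plain strings instead of using
    # Path.with_suffix, so a prefix that itself contains dots
    # (e.g. "cohort.v2") is left intact.
    return [Path(prefix + ext) for ext in (".bed", ".bim", ".fam")]


def missing_fileset_files(prefix):
    """Return the fileset members that do not exist on disk."""
    return [p for p in plink_fileset_files(prefix) if not p.is_file()]


def convert_checked(prefix, vcz_path):
    """Hypothetical wrapper: verify the fileset, then call convert."""
    missing = missing_fileset_files(prefix)
    if missing:
        names = ", ".join(str(p) for p in missing)
        raise FileNotFoundError(f"Missing PLINK files: {names}")
    # Imported lazily so the helpers above work without bio2zarr installed.
    import bio2zarr.plink as p2z

    return p2z.convert(prefix, vcz_path)
```

Calling `convert_checked("data/cohort", "cohort.vcz")` would then fail early with a clear message if, say, `data/cohort.fam` were absent; both paths here are placeholders.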
API reference#
- bio2zarr.plink.convert(prefix, out=None, *, mode='r', variants_chunk_size=None, samples_chunk_size=None, worker_processes=0, show_progress=False)#
Convert a PLINK fileset to VCF Zarr format.
Parameters#
- prefix : str or Path
Path prefix for the PLINK fileset (i.e. the shared prefix of the .bed, .bim, and .fam files).
- out : str, Path, or None
Output path for the Zarr store. The output format depends on the value:
  - None: write to a temporary directory and return an in-memory zarr.storage.MemoryStore-backed group.
  - Ends with .zip: write to a directory, then package as a zip archive readable via zarr.storage.ZipStore. The intermediate directory is removed.
  - Otherwise: write directly to the given directory path.
- mode : str
Mode in which the returned zarr.Group is opened. Use "r" (default) for read-only access or "r+" for read-write access.
- variants_chunk_size : int, optional
Number of variants per chunk. If None, a default is used.
- samples_chunk_size : int, optional
Number of samples per chunk. If None, a default is used.
- worker_processes : int
Number of worker processes for parallel encoding. 0 (the default) means use the main process only.
- show_progress : bool
If True, display a progress bar during conversion.
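The three cases for the output path can be restated as a small decision function. This is only an illustration of the documented rules, not the library's implementation: `describe_output` and `convert_with_options` are hypothetical names, the `.zip` check is assumed to be case-sensitive, and the chunk sizes and worker count are arbitrary example values.

```python
def describe_output(out):
    """Classify `out` according to the documented rules for convert."""
    if out is None:
        return "memory"  # temporary directory, returned as an in-memory group
    if str(out).endswith(".zip"):
        return "zip"  # written to a directory, then packaged as a zip archive
    return "directory"  # written directly to the given directory path


def convert_with_options(prefix, out=None):
    """Hypothetical wrapper passing every documented keyword explicitly."""
    # Imported lazily so describe_output works without bio2zarr installed.
    import bio2zarr.plink as p2z

    return p2z.convert(
        prefix,
        out,
        mode="r+",                   # open the returned group read-write
        variants_chunk_size=10_000,  # example value, not a recommendation
        samples_chunk_size=1_000,    # example value, not a recommendation
        worker_processes=4,          # 0 would use the main process only
        show_progress=True,
    )
```

Everything after `out` is keyword-only in the signature above, so those arguments have to be passed by name.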
Returns#
- zarr.Group
The root group of the Zarr store containing the converted data.
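Once the root group is returned, a quick way to see what was written is to list its arrays and their shapes. The sketch below assumes only that the group exposes `arrays()`, which yields `(name, array)` pairs in zarr; `summarize_group` is a hypothetical helper, not a bio2zarr function.

```python
def summarize_group(root):
    """Map each array name directly under `root` to its shape."""
    # Group.arrays() yields (name, array) pairs; sort them by name so the
    # summary is stable regardless of the order the store returns them in.
    return {name: array.shape for name, array in sorted(root.arrays())}


def print_summary(root):
    """Print one line per array, e.g. for a quick look after conversion."""
    for name, shape in summarize_group(root).items():
        print(f"{name}: {shape}")
```

With the default mode="r" the group is read-only, which is enough for this kind of inspection.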