plink2zarr
Convert plink data to the VCF Zarr specification reliably in parallel.
See CLI Reference for detailed documentation on command line options.
Conversion of the plink data model to VCF follows the semantics of plink1.9 as closely as possible. That is, given a binary plink fileset with prefix “fileset” (i.e., fileset.bed, fileset.bim, fileset.fam), running
$ plink2zarr convert fileset out.vcz
should produce the same result in out.vcz as
$ plink1.9 --bfile fileset --keep-allele-order --recode vcf-iid --out tmp
$ vcf2zarr convert tmp.vcf out.vcz
Warning
We follow the same convention as plink 2.0: the A1 allele in the bim file is the VCF ALT and A2 is the REF.
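To make the allele convention concrete, here is a minimal Python sketch that reads one row of a .bim file and assigns REF and ALT as described in the warning. It is an illustration only and not part of plink2zarr: the `BimVariant` type, the `bim_line_to_vcf_alleles` helper, and the sample row are hypothetical. It assumes the standard six-column .bim layout (chromosome, variant ID, cM position, base-pair position, A1, A2).

```python
from typing import NamedTuple


class BimVariant(NamedTuple):
    """One .bim row, with alleles already mapped to VCF REF/ALT (hypothetical type)."""

    chrom: str
    variant_id: str
    position: int
    ref: str
    alt: str


def bim_line_to_vcf_alleles(line: str) -> BimVariant:
    """Hypothetical helper: map a .bim row to VCF-style REF/ALT."""
    # A .bim row has six whitespace-separated fields:
    # chromosome, variant ID, cM position, bp position, A1, A2.
    chrom, variant_id, _cm, bp, a1, a2 = line.split()
    # Convention from the warning above: A1 is the VCF ALT, A2 is the REF.
    return BimVariant(chrom, variant_id, int(bp), ref=a2, alt=a1)


# Made-up example row, for illustration only.
variant = bim_line_to_vcf_alleles("1\trs123\t0\t1000\tG\tA")
print(variant.ref, variant.alt)  # prints "A G"
```

Under this convention the fifth column (A1) ends up as ALT and the sixth (A2) as REF, so a bim file whose columns are ordered the other way round would yield swapped alleles in the output.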
Note
Currently we only convert the basic VCF-like data from plink, and don’t include phenotypes and pedigree information. These are planned as future enhancements. Please comment on this issue if you are interested in this functionality.