sgkit.io.plink.write_plink
- sgkit.io.plink.write_plink(ds, *, path=None, bed_path=None, bim_path=None, fam_path=None)
Convert a dataset to a PLINK file.
If any of the following pedigree-specific variables are defined in the dataset, they will be included in the PLINK fam file. Otherwise, the PLINK fam file will contain missing values for these fields, except for the within-family identifier for samples, which will be taken from the dataset sample_id.
- sample_family_id: Family identifier, commonly referred to as FID
- sample_member_id: Within-family identifier for sample
- sample_paternal_id: Within-family identifier for father of sample
- sample_maternal_id: Within-family identifier for mother of sample
- sample_sex: Sex code equal to 1 for male, 2 for female, and -1 for missing
- sample_phenotype: Phenotype code equal to 1 for control, 2 for case, and -1 for missing
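The fallback rule for the pedigree variables can be sketched in plain Python. This is only an illustration and not sgkit's implementation: the helper name `resolve_pedigree_fields` is hypothetical, a plain dict stands in for the dataset, and `None` is used as a placeholder for "missing" (the value actually written to the fam file is decided by the writer and is not reproduced here).

```python
# Hypothetical illustration of the fallback rule; not sgkit's implementation.
PEDIGREE_VARS = (
    "sample_family_id",
    "sample_member_id",
    "sample_paternal_id",
    "sample_maternal_id",
    "sample_sex",
    "sample_phenotype",
)


def resolve_pedigree_fields(variables):
    """Return one list of per-sample values for each pedigree variable.

    `variables` maps variable names to per-sample lists and must contain
    "sample_id". None marks a missing value in this sketch.
    """
    sample_ids = list(variables["sample_id"])
    resolved = {}
    for name in PEDIGREE_VARS:
        if name in variables:
            # Variable is defined in the dataset: use it as-is.
            resolved[name] = list(variables[name])
        elif name == "sample_member_id":
            # The within-family sample identifier falls back to sample_id.
            resolved[name] = sample_ids
        else:
            # Every other undefined field is treated as missing.
            resolved[name] = [None] * len(sample_ids)
    return resolved


fields = resolve_pedigree_fields({"sample_id": ["S0", "S1"], "sample_sex": [1, 2]})
print(fields["sample_member_id"])  # ['S0', 'S1']
print(fields["sample_family_id"])  # [None, None]
```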
- Parameters:
- ds
Dataset
Dataset to convert to PLINK.
- path
str | Path | None (default: None)
Path to PLINK file set. This should not include a suffix, i.e. if the files are at data.{bed,fam,bim} then only ‘data’ should be provided (suffixes are added internally). Either this path must be provided or all 3 of bed_path, bim_path and fam_path.
- bed_path
str | Path | None (default: None)
Path to PLINK bed file. This should be a full path including the .bed extension and cannot be specified in conjunction with path.
- bim_path
str | Path | None (default: None)
Path to PLINK bim file. This should be a full path including the .bim extension and cannot be specified in conjunction with path.
- fam_path
str | Path | None (default: None)
Path to PLINK fam file. This should be a full path including the .fam extension and cannot be specified in conjunction with path.
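The interaction between path and the three file-specific arguments can be sketched as follows. This is a minimal illustration of the documented rule, not sgkit's own code: `resolve_plink_paths` is a hypothetical helper, and raising ValueError when neither form is fully supplied is an assumption (the documentation only names the error for the conflicting case).

```python
from pathlib import Path


def resolve_plink_paths(path=None, bed_path=None, bim_path=None, fam_path=None):
    """Hypothetical helper: return (bed, bim, fam) paths from either argument form."""
    individual = (bed_path, bim_path, fam_path)
    if path is not None:
        if any(p is not None for p in individual):
            # Documented conflict: path together with any file-specific path.
            raise ValueError(
                "path cannot be combined with bed_path, bim_path or fam_path"
            )
        # Append suffixes rather than replace them, so a prefix such as
        # "run.v2" keeps its own dot-separated part.
        return tuple(Path(f"{path}.{ext}") for ext in ("bed", "bim", "fam"))
    if any(p is None for p in individual):
        # Assumption: an incomplete set of file-specific paths is an error.
        raise ValueError(
            "provide either path or all of bed_path, bim_path and fam_path"
        )
    return tuple(Path(p) for p in individual)


bed, bim, fam = resolve_plink_paths(path="out/data")
print(bed.name, bim.name, fam.name)  # data.bed data.bim data.fam
```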
- Return type:
None
Warning
This function is only applicable to diploid, biallelic datasets.
- Raises:
ValueError – If path and one of bed_path, bim_path or fam_path are provided.
ValueError – If the ploidy of the provided dataset is not 2.
ValueError – If the maximum number of alleles in the provided dataset is not 2.
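A caller may want to check the diploid, biallelic requirement before attempting a write. The sketch below is a hypothetical pre-check, not part of sgkit: it assumes the dataset follows the usual sgkit convention of having "ploidy" and "alleles" dimensions, and that the sizes of those dimensions are what matters. It takes a plain mapping of dimension sizes, such as `dict(ds.sizes)` for an xarray dataset.

```python
def check_plink_writable(sizes):
    """Hypothetical pre-check mirroring the documented ValueError conditions.

    `sizes` maps dimension names to lengths, e.g. dict(ds.sizes).
    """
    ploidy = sizes["ploidy"]
    alleles = sizes["alleles"]
    if ploidy != 2:
        raise ValueError(f"PLINK output requires ploidy 2, got {ploidy}")
    if alleles != 2:
        raise ValueError(f"PLINK output requires 2 alleles, got {alleles}")


# Passes silently for a diploid, biallelic layout.
check_plink_writable({"variants": 100, "samples": 10, "ploidy": 2, "alleles": 2})
```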