sgkit.infer_call_ploidy

sgkit.infer_call_ploidy(ds, *, call_genotype='call_genotype', call_genotype_non_allele='call_genotype_non_allele', merge=True)

Infer the ploidy of each genotype call from the number of non-allele values it contains.

Parameters
ds : Dataset

Dataset containing genotype calls.

call_genotype : Hashable (default: 'call_genotype')

Input variable name holding call_genotype as defined by sgkit.variables.call_genotype_spec. Must be present in ds.

call_genotype_non_allele : Hashable (default: 'call_genotype_non_allele')

Input variable name holding call_genotype_non_allele as defined by sgkit.variables.call_genotype_non_allele_spec. If the variable is not present in ds, it will be computed assuming that allele values less than -1 are non-alleles in mixed ploidy datasets, or that no non-alleles are present in fixed ploidy datasets.

merge : bool (default: True)

If True (the default), merge the input dataset and the computed output variables into a single dataset; otherwise, return only the computed output variables. See Dataset merge behavior for more details.

Return type

Dataset

Returns

A dataset containing sgkit.variables.call_ploidy_spec.
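
Examples

The counting rule described above can be illustrated with plain NumPy. This is a minimal sketch rather than sgkit's implementation. It assumes a hypothetical mixed-ploidy call_genotype array with dimensions (variants, samples, ploidy), in which -1 marks a missing allele and -2 is the padding value for non-alleles. The array contents and the -2 padding value are made up for illustration; the only rule taken from the parameter description is that values less than -1 are treated as non-alleles.

```python
import numpy as np

# Hypothetical call_genotype array with dims (variants, samples, ploidy).
# Assumed encoding: >= 0 is an allele index, -1 is a missing allele,
# and -2 is a non-allele used to pad calls of lower ploidy.
call_genotype = np.array(
    [
        [[0, 1, -2, -2], [0, 0, 1, 1]],
        [[1, -1, -2, -2], [0, 1, 1, -2]],
    ],
    dtype=np.int8,
)

# Non-allele mask, assuming that values below -1 are non-alleles.
call_genotype_non_allele = call_genotype < -1

# Ploidy of each call: the number of entries along the last axis that
# are not non-alleles. Missing alleles (-1) still count towards ploidy.
call_ploidy = (~call_genotype_non_allele).sum(axis=-1)

print(call_ploidy)
# [[2 4]
#  [2 3]]
```

In this sketch the second sample at the second variant is called as triploid, and the call containing a missing allele (-1) is still counted as diploid, because only values below -1 are excluded from the count.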