sgkit.convert_call_to_index
- sgkit.convert_call_to_index(ds, *, call_genotype='call_genotype', merge=True)
Convert each call genotype to a single integer value.
- Parameters:
- ds (Dataset): Dataset containing genotype calls.
- call_genotype (Hashable, default: 'call_genotype'): Input variable name holding call_genotype as defined by sgkit.variables.call_genotype_spec. Must be present in ds.
- merge (bool, default: True): If True (the default), merge the input dataset and the computed output variables into a single dataset; otherwise return only the computed output variables. See Dataset merge behavior for more details.
- Return type: Dataset
- Returns: A dataset containing sgkit.variables.call_genotype_index_spec and sgkit.variables.call_genotype_index_mask_spec. Genotype calls with missing alleles will result in an index of -1.
Warning
This method does not support mixed-ploidy datasets.
- Raises:
ValueError – If the dataset contains mixed-ploidy genotype calls.
Examples
>>> import sgkit as sg
>>> ds = sg.simulate_genotype_call_dataset(
...     n_variant=4,
...     n_sample=2,
...     missing_pct=0.05,
...     seed=1,
... )
>>> sg.display_genotypes(ds)
samples    S0   S1
variants
0         ./0  1/0
1         1/0  1/1
2         0/1  1/0
3         ./0  0/0
>>> sg.convert_call_to_index(ds)["call_genotype_index"].values
array([[-1,  1],
       [ 1,  2],
       [ 1,  1],
       [-1,  0]]...)

>>> import sgkit as sg
>>> ds = sg.simulate_genotype_call_dataset(
...     n_variant=4,
...     n_sample=2,
...     n_allele=10,
...     missing_pct=0.05,
...     seed=1,
... )
>>> sg.display_genotypes(ds)
samples    S0   S1
variants
0         5/4  1/0
1         7/7  8/8
2         4/7  ./9
3         3/0  5/5
>>> sg.convert_call_to_index(ds)["call_genotype_index"].values
array([[19,  1],
       [35, 44],
       [32, -1],
       [ 6, 20]]...)
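
The index values in the examples above are consistent with the VCF-style ordering of unphased genotypes, in which the alleles are sorted and each one contributes a binomial coefficient to the index. The following is a minimal pure-Python sketch of that ordering, not sgkit's implementation. The helper name `genotype_index` is hypothetical, and the sketch assumes that missing alleles are encoded as negative integers.

```python
from math import comb


def genotype_index(alleles):
    """Hypothetical helper: index of one unphased genotype call.

    Assumes VCF-style genotype ordering and that a missing allele is
    encoded as a negative integer. Returns -1 if any allele is missing.
    """
    if any(a < 0 for a in alleles):
        return -1
    # After sorting, the allele at 0-based position k contributes
    # C(allele + k, k + 1); for a diploid call a <= b this is a + b*(b+1)/2.
    return sum(comb(a + k, k + 1) for k, a in enumerate(sorted(alleles)))


# Calls taken from the second example above (n_allele=10).
print(genotype_index([5, 4]))   # 19
print(genotype_index([8, 8]))   # 44
print(genotype_index([-1, 9]))  # -1
```

Because the alleles are sorted first, `1/0` and `0/1` map to the same index, which matches the first example.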