sgkit.count_call_alleles
- sgkit.count_call_alleles(ds, *, call_genotype='call_genotype', merge=True)
Compute per-sample allele counts from genotype calls.
- Parameters:
- ds
Dataset
Dataset containing genotype calls.
- call_genotype
Hashable (default: 'call_genotype')
Input variable name holding call_genotype as defined by sgkit.variables.call_genotype_spec. Must be present in ds.
- merge
bool (default: True)
If True (the default), merge the input dataset and the computed output variables into a single dataset; otherwise, return only the computed output variables. See Dataset merge behavior for more details.
- Return type:
Dataset
- Returns:
A dataset containing sgkit.variables.call_allele_count_spec of allele counts with shape (variants, samples, alleles) and values corresponding to the number of non-missing occurrences of each allele.
Examples
>>> import sgkit as sg
>>> ds = sg.simulate_genotype_call_dataset(n_variant=4, n_sample=2, seed=1)
>>> sg.display_genotypes(ds)
samples    S0   S1
variants
0         1/0  1/0
1         1/0  1/1
2         0/1  1/0
3         0/0  0/0

>>> sg.count_call_alleles(ds)["call_allele_count"].values
array([[[1, 1],
        [1, 1]],

       [[1, 1],
        [0, 2]],

       [[1, 1],
        [1, 1]],

       [[2, 0],
        [2, 0]]], dtype=uint8)
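
The counting rule described under Returns (the number of non-missing occurrences of each allele, per variant and sample) can be illustrated with a minimal NumPy-only sketch. This is not sgkit's implementation: the helper name `count_call_alleles_sketch` and its `n_alleles` argument are hypothetical, and the sketch assumes missing alleles are encoded as negative values (such as -1) in an array of shape (variants, samples, ploidy).

```python
import numpy as np


def count_call_alleles_sketch(call_genotype, n_alleles):
    """Hypothetical illustration of per-sample allele counting (not sgkit code)."""
    gt = np.asarray(call_genotype)
    # Compare every call against each allele index via broadcasting:
    # (variants, samples, ploidy, 1) == (alleles,) -> (variants, samples, ploidy, alleles)
    matches = gt[..., np.newaxis] == np.arange(n_alleles)
    # Sum over the ploidy axis. Negative (assumed missing) calls match no
    # allele index, so they contribute nothing to any count.
    return matches.sum(axis=-2).astype(np.uint8)


# Same genotypes as the example above, except that the last call of the
# final sample is replaced by an assumed missing value (-1).
genotypes = np.array(
    [
        [[1, 0], [1, 0]],
        [[1, 0], [1, 1]],
        [[0, 1], [1, 0]],
        [[0, 0], [0, -1]],
    ]
)
counts = count_call_alleles_sketch(genotypes, n_alleles=2)
print(counts.shape)  # (variants, samples, alleles)
print(counts[3, 1])  # one reference allele counted; the missing call is ignored
```

Broadcasting against the allele indices keeps the sketch short, at the cost of a temporary boolean array with one extra dimension; it is meant only to make the output shape and the handling of missing calls concrete.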