sgkit.call_allele_frequencies
- sgkit.call_allele_frequencies(ds, *, call_allele_count='call_allele_count', merge=True)
Compute per-sample allele frequencies from genotype calls.
- Parameters:
- ds
Dataset
Dataset containing genotype calls.
- call_allele_count
Hashable
(default: 'call_allele_count') Input variable name holding call_allele_count as defined by sgkit.variables.call_allele_count_spec. If the variable is not present in ds, it will be computed using count_call_alleles().
- merge
bool
(default: True) If True (the default), merge the input dataset and the computed output variables into a single dataset; otherwise, return only the computed output variables. See Dataset merge behavior for more details.
- Return type:
Dataset
- Returns:
A dataset containing the variable defined by sgkit.variables.call_allele_frequency_spec: an array of allele frequencies with shape (variants, samples, alleles), where each value is the frequency of that allele among the non-missing alleles of the call.
Examples
>>> import sgkit as sg
>>> ds = sg.simulate_genotype_call_dataset(n_variant=4, n_sample=2, seed=1)
>>> sg.display_genotypes(ds)
samples    S0   S1
variants
0         1/0  1/0
1         1/0  1/1
2         0/1  1/0
3         0/0  0/0
>>> sg.call_allele_frequencies(ds)["call_allele_frequency"].values
array([[[0.5, 0.5],
        [0.5, 0.5]],

       [[0.5, 0.5],
        [0. , 1. ]],

       [[0.5, 0.5],
        [0.5, 0.5]],

       [[1. , 0. ],
        [1. , 0. ]]])
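The values in the example above can be checked by hand: for each call, count the occurrences of each allele and divide by the number of non-missing alleles in that call. The following is a minimal pure-Python sketch of that arithmetic, not sgkit's implementation. The helper name call_allele_frequencies_sketch is hypothetical, the genotypes are copied from the display_genotypes output above, and two conventions are assumptions made for illustration: negative values mark missing alleles, and a call with no non-missing alleles yields NaN.

```python
import math


def call_allele_frequencies_sketch(genotypes, n_allele):
    """Hypothetical helper, not part of sgkit.

    genotypes: nested list with shape (variants, samples, ploidy).
    A negative value marks a missing allele (assumption for this sketch).
    Returns a nested list with shape (variants, samples, alleles).
    """
    result = []
    for variant in genotypes:
        row = []
        for call in variant:
            # Count each allele, skipping missing (negative) entries.
            counts = [0] * n_allele
            for allele in call:
                if allele >= 0:
                    counts[allele] += 1
            total = sum(counts)
            if total == 0:
                # No non-missing alleles: frequency is undefined (assumed NaN).
                row.append([math.nan] * n_allele)
            else:
                row.append([c / total for c in counts])
        result.append(row)
    return result


# Genotypes from the example: 4 variants, 2 samples, diploid calls.
genotypes = [
    [[1, 0], [1, 0]],
    [[1, 0], [1, 1]],
    [[0, 1], [1, 0]],
    [[0, 0], [0, 0]],
]
freqs = call_allele_frequencies_sketch(genotypes, n_allele=2)
```

Here freqs[1][1] corresponds to variant 1, sample S1 (genotype 1/1): zero copies of allele 0 and two copies of allele 1. A partially missing call such as [0, -1] is divided by one non-missing allele rather than by the ploidy.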