sgkit.hardy_weinberg_test

sgkit.hardy_weinberg_test(ds, *, genotype_count=None, ploidy=None, alleles=None, merge=True)

Exact test for Hardy-Weinberg equilibrium (HWE), as described in Wigginton et al. 2005 [1].

Parameters
ds : Dataset

Dataset containing genotype calls or precomputed genotype counts.

genotype_count : Optional[Hashable] (default: None)

Name of the variable containing precomputed genotype counts. If not provided, the counts are computed automatically from genotype calls. If provided, the variable must be an (N, 3) array, where N is the number of variants and the 3 columns contain the heterozygous, homozygous reference, and homozygous alternate counts (in that order) across all samples for each variant.
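As an illustration of the expected (N, 3) layout, the following minimal sketch builds such a table in plain Python. The helper name `count_diploid_genotypes` and the nested-list input are hypothetical and are not part of sgkit; the sketch also assumes that missing alleles are encoded as negative values and that calls containing them are skipped.

```python
def count_diploid_genotypes(calls):
    """Count genotypes per variant for diploid, biallelic calls.

    `calls` is a list of variants; each variant is a list of (a1, a2)
    allele pairs, one per sample, with 0 = reference and 1 = alternate.
    Assumption: a negative allele marks a missing call, which is skipped.
    Returns one [het, hom_ref, hom_alt] row per variant.
    """
    counts = []
    for variant in calls:
        het = hom_ref = hom_alt = 0
        for a1, a2 in variant:
            if a1 < 0 or a2 < 0:
                continue  # missing call, not counted
            if a1 != a2:
                het += 1
            elif a1 == 0:
                hom_ref += 1
            else:
                hom_alt += 1
        # Column order: heterozygous, homozygous reference, homozygous alternate
        counts.append([het, hom_ref, hom_alt])
    return counts


# Two variants, four samples each; the third call of variant 2 is missing.
calls = [
    [(0, 0), (0, 1), (1, 1), (1, 0)],
    [(0, 0), (0, 0), (-1, -1), (1, 1)],
]
genotype_counts = count_diploid_genotypes(calls)
print(genotype_counts)
```

Each row here corresponds to one variant, so a table like this has the shape that a precomputed counts variable is expected to have.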

ploidy : Optional[int] (default: None)

Genotype ploidy. Defaults to the size of the ploidy dimension of the provided dataset. If the ploidy dimension is not present, this value must be set explicitly. Currently, HWE calculations are only supported for diploid datasets, i.e. ploidy must equal 2.

alleles : Optional[int] (default: None)

Genotype allele count. Defaults to the size of the alleles dimension of the provided dataset. If the alleles dimension is not present, this value must be set explicitly. Currently, HWE calculations are only supported for biallelic datasets, i.e. alleles must equal 2.

merge : bool (default: True)

If True (the default), merge the input dataset and the computed output variables into a single dataset; otherwise, return only the computed output variables. See Dataset merge behavior for more details.

Warning

This function is only applicable to diploid, biallelic datasets.

Return type

Dataset

Returns

Dataset containing (N = num variants):

variant_hwe_p_value : [array-like, shape: (N,)]

P values from HWE test for each variant as float in [0, 1].
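For intuition about how each p value arises, the following is a minimal pure-Python sketch of the exact test described in [1] for a single variant. The function name `hwe_exact_p_value` is hypothetical, and sgkit's own implementation may differ in detail (for example in vectorization and edge-case handling).

```python
def hwe_exact_p_value(n_het, n_hom_ref, n_hom_alt):
    """Exact HWE p value for one diploid, biallelic variant (sketch)."""
    if min(n_het, n_hom_ref, n_hom_alt) < 0:
        raise ValueError("genotype counts must be non-negative")
    n = n_het + n_hom_ref + n_hom_alt  # number of genotyped samples
    if n == 0:
        return float("nan")

    hom_rare = min(n_hom_ref, n_hom_alt)
    rare = 2 * hom_rare + n_het  # copies of the rarer allele

    # Unnormalized probability of every possible heterozygote count,
    # conditional on the observed allele counts.
    probs = [0.0] * (rare + 1)

    # Start near the expected heterozygote count; it must have the same
    # parity as the rare allele count.
    mid = rare * (2 * n - rare) // (2 * n)
    if mid % 2 != rare % 2:
        mid += 1
    probs[mid] = 1.0
    total = 1.0

    # Recurrence towards fewer heterozygotes.
    het, homr = mid, (rare - mid) // 2
    homc = n - het - homr
    while het >= 2:
        probs[het - 2] = (
            probs[het] * het * (het - 1.0) / (4.0 * (homr + 1.0) * (homc + 1.0))
        )
        total += probs[het - 2]
        het, homr, homc = het - 2, homr + 1, homc + 1

    # Recurrence towards more heterozygotes.
    het, homr = mid, (rare - mid) // 2
    homc = n - het - homr
    while het <= rare - 2:
        probs[het + 2] = (
            probs[het] * 4.0 * homr * homc / ((het + 2.0) * (het + 1.0))
        )
        total += probs[het + 2]
        het, homr, homc = het + 2, homr - 1, homc - 1

    # Sum the probabilities of all outcomes no more likely than the observed one.
    p = sum(x for x in probs if x <= probs[n_het]) / total
    return min(1.0, p)


# Arguments follow the column order used for genotype counts:
# heterozygous, homozygous reference, homozygous alternate.
p_value = hwe_exact_p_value(0, 1, 1)
print(p_value)
```

The p value is the total probability of all heterozygote counts that are no more likely than the observed one, so it always lies in [0, 1] and does not depend on which allele is labelled as the reference.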

References

  • [1] Wigginton, Janis E., David J. Cutler, and Goncalo R. Abecasis. 2005. “A Note on Exact Tests of Hardy-Weinberg Equilibrium.” American Journal of Human Genetics 76 (5): 887–93.

Raises