sgkit.hardy_weinberg_test
- sgkit.hardy_weinberg_test(ds, *, genotype_count=None, ploidy=None, alleles=None, merge=True)
Exact test for Hardy-Weinberg equilibrium (HWE) as described in Wigginton et al. 2005 [1].
- Parameters
- ds :
Dataset Dataset containing genotype calls or precomputed genotype counts.
- genotype_count :
Optional[Hashable] (default: None) Name of the variable containing precomputed genotype counts. If not provided, these counts are computed automatically from genotype calls. If present, it must correspond to an (N, 3) array where N is the number of variants and the 3 columns contain heterozygous, homozygous reference, and homozygous alternate counts (in that order) across all samples for a variant.
- ploidy :
Optional[int] (default: None) Genotype ploidy, defaults to the ploidy dimension of the provided dataset. If the ploidy dimension is not present, then this value must be set explicitly. Currently HWE calculations are only supported for diploid datasets, i.e. ploidy must equal 2.
- alleles :
Optional[int] (default: None) Genotype allele count, defaults to the alleles dimension of the provided dataset. If the alleles dimension is not present, then this value must be set explicitly. Currently HWE calculations are only supported for biallelic datasets, i.e. alleles must equal 2.
- merge :
bool (default: True) If True (the default), merge the input dataset and the computed output variables into a single dataset; otherwise return only the computed output variables. See Dataset merge behavior for more details.
Warning
This function is only applicable to diploid, biallelic datasets.
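The (N, 3) layout expected by genotype_count can be illustrated with a small pure-Python sketch. It is not sgkit code: the helper name count_genotypes, the nested-list input format, and the choice to skip calls containing a negative (missing) allele are assumptions made for illustration only.

```python
def count_genotypes(calls):
    """Count genotypes per variant for diploid, biallelic calls.

    `calls` is a hypothetical nested list: one entry per variant, each a
    list of (allele_0, allele_1) pairs per sample, with alleles coded as
    0 (reference) or 1 (alternate). A negative allele is treated as a
    missing call and the sample is skipped (an assumption of this sketch).

    Returns one [het, hom_ref, hom_alt] row per variant, matching the
    column order described for genotype_count.
    """
    counts = []
    for variant in calls:
        het = hom_ref = hom_alt = 0
        for a0, a1 in variant:
            if a0 < 0 or a1 < 0:
                continue  # missing call: not counted
            if a0 != a1:
                het += 1
            elif a0 == 0:
                hom_ref += 1
            else:
                hom_alt += 1
        counts.append([het, hom_ref, hom_alt])
    return counts


example_calls = [
    [(0, 0), (0, 1), (1, 1), (1, 0)],    # variant 0
    [(0, 0), (0, 0), (-1, -1), (1, 1)],  # variant 1, one missing call
]
example_counts = count_genotypes(example_calls)
print(example_counts)  # [[2, 1, 1], [0, 2, 1]]
```

Each row has one count per genotype class, so a dataset with N variants yields an N-by-3 table in the (heterozygous, homozygous reference, homozygous alternate) order.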
- Return type
Dataset
- Returns
Dataset containing (N = num variants):
- variant_hwe_p_value [array-like, shape: (N, O)]
P values from HWE test for each variant as float in [0, 1].
References
- [1] Wigginton, Janis E., David J. Cutler, and Goncalo R. Abecasis. 2005.
“A Note on Exact Tests of Hardy-Weinberg Equilibrium.” American Journal of Human Genetics 76 (5): 887–93.
- Raises
NotImplementedError – If ploidy of provided dataset != 2
NotImplementedError – If maximum number of alleles in provided dataset != 2
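To make the statistic concrete, the following is a minimal pure-Python sketch of an exact HWE test for a single diploid, biallelic variant, in the spirit of the recurrence described in [1]. It is not sgkit's implementation; the function name hwe_exact_p_value and its argument names are hypothetical, and the argument order simply mirrors the (heterozygous, homozygous reference, homozygous alternate) column order used for genotype_count.

```python
import math


def hwe_exact_p_value(obs_hets, obs_hom1, obs_hom2):
    """Exact HWE p-value for one variant (illustrative sketch only)."""
    if min(obs_hets, obs_hom1, obs_hom2) < 0:
        raise ValueError("genotype counts must be non-negative")

    obs_homc = max(obs_hom1, obs_hom2)  # common homozygotes
    obs_homr = min(obs_hom1, obs_hom2)  # rare homozygotes
    rare_copies = 2 * obs_homr + obs_hets
    genotypes = obs_hets + obs_homc + obs_homr
    if genotypes == 0:
        return math.nan  # no observed genotypes, nothing to test

    # Unnormalised probability of each possible heterozygote count.
    het_probs = [0.0] * (rare_copies + 1)

    # Start near the expected heterozygote count, matching the parity
    # of rare_copies (the het count and rare allele count share parity).
    mid = rare_copies * (2 * genotypes - rare_copies) // (2 * genotypes)
    if (mid % 2) != (rare_copies % 2):
        mid += 1
    het_probs[mid] = 1.0
    total = 1.0

    # Walk down from mid: two hets become one rare and one common homozygote.
    hets, homr = mid, (rare_copies - mid) // 2
    homc = genotypes - hets - homr
    while hets > 1:
        het_probs[hets - 2] = (
            het_probs[hets] * hets * (hets - 1.0)
            / (4.0 * (homr + 1.0) * (homc + 1.0))
        )
        total += het_probs[hets - 2]
        hets, homr, homc = hets - 2, homr + 1, homc + 1

    # Walk up from mid: one rare and one common homozygote become two hets.
    hets, homr = mid, (rare_copies - mid) // 2
    homc = genotypes - hets - homr
    while hets <= rare_copies - 2:
        het_probs[hets + 2] = (
            het_probs[hets] * 4.0 * homr * homc
            / ((hets + 2.0) * (hets + 1.0))
        )
        total += het_probs[hets + 2]
        hets, homr, homc = hets + 2, homr - 1, homc - 1

    # p-value: total probability of outcomes no more likely than observed.
    target = het_probs[obs_hets] / total
    p_value = sum(p / total for p in het_probs if p / total <= target)
    return min(1.0, p_value)


# Two samples, one homozygous for each allele and no heterozygotes.
print(hwe_exact_p_value(0, 1, 1))  # approximately 0.3333
```

The sketch handles one variant at a time and favours readability over speed. For two samples carrying two copies of each allele, the only possible heterozygote counts are 0 (probability 1/3) and 2 (probability 2/3), which gives the value printed above.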