sgkit.genee

sgkit.genee(ds, ld, *, reg_covar=1e-06)

Compute gene-ε as described in Cheng et al. (2020) [1].

Parameters:
ds Dataset

Dataset containing beta values (OLS betas or regularized betas).

ld Union[ndarray, Array]

2D array of LD values.

reg_covar float (default: 1e-06)

Non-negative regularization added to the diagonal of the covariance. Passed to scikit-learn's GaussianMixture.

Warning

Unlike the implementation in [2], this function always uses the mixture component with the second-largest variance. The implementation in [2] instead uses the component with the largest variance when that component accounts for more than 50% of the SNPs.
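The difference between the two selection rules can be illustrated with a small sketch. This is not sgkit's code or the code in [2]: `pick_component` is a hypothetical helper, the variances and weights are made-up numbers, and the mixture weights are assumed to stand in for the fraction of SNPs assigned to each component.

```python
def pick_component(variances, weights, *, reference_rule=False):
    """Return the index of the mixture component whose variance is used.

    Hypothetical helper for illustration only.
    variances: per-component variances of a fitted mixture.
    weights:   per-component mixture weights, assumed here to approximate
               the fraction of SNPs in each component.
    """
    if len(variances) < 2:
        raise ValueError("need at least two mixture components")

    # Component indices ordered from largest to smallest variance.
    order = sorted(range(len(variances)), key=lambda i: variances[i], reverse=True)
    largest, second_largest = order[0], order[1]

    # Rule attributed to [2] in the warning: prefer the largest-variance
    # component when it holds more than 50% of the SNPs.
    if reference_rule and weights[largest] > 0.5:
        return largest

    # Rule described for this function: always the second-largest variance.
    return second_largest


# Made-up example: the largest-variance component holds 60% of the SNPs.
variances = [0.9, 0.2, 0.01]
weights = [0.6, 0.3, 0.1]

sgkit_style = pick_component(variances, weights)
reference_style = pick_component(variances, weights, reference_rule=True)
print(sgkit_style, reference_style)
```

With these inputs the two rules pick different components, which is the case where results may differ from those produced by [2].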

Return type:

DataFrame

Returns:

A dataframe containing the following fields:

  • test_q: test statistic

  • q_var: test variance

  • pval: p-value

References

[1] - W. Cheng, S. Ramachandran, and L. Crawford (2020). Estimation of non-null SNP effect size distributions enables the detection of enriched genes underlying complex traits. PLOS Genetics. 16(6): e1008855.

[2] - ramachandran-lab/genee