sgkit.genee

sgkit.genee(ds, ld, *, reg_covar=1e-06)

Compute gene-ε as described in Cheng et al. (2020) [1].

Parameters
ds : Dataset

Dataset containing beta values (OLS betas or regularized betas).

ld : Union[ndarray, Array]

2D array of LD values.

reg_covar : float (default: 1e-06)

Non-negative regularization added to the diagonal of covariance. Passed to scikit-learn GaussianMixture.
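As a rough illustration of what diagonal regularization does in general, the sketch below adds a small constant to the diagonal of a covariance matrix so that a singular matrix becomes invertible. The helper names `regularize_covariance` and `det2` are hypothetical and exist only for this example; this is not scikit-learn's internal code, only the idea behind the parameter.

```python
def regularize_covariance(cov, reg_covar=1e-6):
    """Return a copy of a square covariance matrix with reg_covar added to its diagonal.

    Hypothetical helper for illustration; not part of sgkit or scikit-learn.
    """
    if reg_covar < 0:
        raise ValueError("reg_covar must be non-negative")
    return [
        [value + (reg_covar if i == j else 0.0) for j, value in enumerate(row)]
        for i, row in enumerate(cov)
    ]


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Two perfectly correlated variables give a singular covariance matrix.
singular = [[1.0, 1.0], [1.0, 1.0]]
regularized = regularize_covariance(singular, reg_covar=1e-6)

# The off-diagonal entries are untouched; only the diagonal grows slightly,
# which makes the determinant strictly positive.
singular_det = det2(singular)
regularized_det = det2(regularized)
```

A larger reg_covar makes the fit more stable at the cost of slightly inflating the estimated variances.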

Warning

Unlike the implementation in [2], this function always uses the mixture component with the second-largest variance. The implementation in [2] instead uses the component with the largest variance when that component accounts for more than 50% of the SNPs.
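The difference described in this warning can be sketched as two selection rules applied to the fitted mixture components. This is a minimal illustration, not the code of either implementation: the function names are hypothetical, the inputs are plain lists of per-component variances and mixing weights, and the fallback to the second-largest variance in the reference-style rule is an assumption inferred from the wording above.

```python
def select_variance_second_largest(variances):
    """Always pick the second-largest component variance (the rule described for this function)."""
    if len(variances) < 2:
        raise ValueError("need at least two mixture components")
    return sorted(variances, reverse=True)[1]


def select_variance_reference_style(variances, weights, threshold=0.5):
    """Pick the largest variance if its component holds more than `threshold`
    of the SNPs; otherwise fall back to the second largest (assumed fallback)."""
    if len(variances) < 2 or len(variances) != len(weights):
        raise ValueError("need at least two components with matching weights")
    ordered = sorted(zip(variances, weights), reverse=True)
    largest_variance, largest_weight = ordered[0]
    if largest_weight > threshold:
        return largest_variance
    return ordered[1][0]


# Hypothetical fitted components: variances and the share of SNPs in each.
variances = [0.5, 0.1, 0.01]
dominant_weights = [0.6, 0.3, 0.1]   # largest-variance component holds 60% of SNPs
balanced_weights = [0.2, 0.5, 0.3]   # largest-variance component holds 20% of SNPs

second_largest = select_variance_second_largest(variances)
reference_dominant = select_variance_reference_style(variances, dominant_weights)
reference_balanced = select_variance_reference_style(variances, balanced_weights)
```

Under these assumptions the two rules disagree only when the largest-variance component contains more than half of the SNPs.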

Return type

DataFrame

Returns

A dataframe containing the following fields:

  • test_q: test statistic

  • q_var: test variance

  • pval: p-value

References

[1] - W. Cheng, S. Ramachandran, and L. Crawford (2020). Estimation of non-null SNP effect size distributions enables the detection of enriched genes underlying complex traits. PLOS Genetics. 16(6): e1008855.

[2] - ramachandran-lab/genee