sgkit.Weir_Goudet_beta
- sgkit.Weir_Goudet_beta(ds, *, stat_identity_by_state='stat_identity_by_state', merge=True)
Estimate pairwise beta between all pairs of samples as described in Weir and Goudet 2017 [1].
Beta is the kinship scaled by the average kinship of all pairs of individuals in the dataset such that the non-diagonal (non-self) values sum to zero.
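The scaling described above can be sketched in plain NumPy. This is a minimal illustration rather than sgkit's own implementation: it assumes a symmetric matrix of pairwise identity-by-state values, averages the off-diagonal pairs, and rescales each entry as (value - average) / (1 - average). The function name `beta_from_ibs` and the input matrix are hypothetical.

```python
import numpy as np


def beta_from_ibs(ibs):
    """Scale a pairwise IBS matrix by the average off-diagonal value.

    Hypothetical helper for illustration only; not part of sgkit.
    """
    ibs = np.asarray(ibs, dtype=float)
    # Indices of the upper triangle, excluding the diagonal (self pairs).
    upper = np.triu_indices_from(ibs, k=1)
    # Average identity by state over all distinct pairs of samples.
    avg = ibs[upper].mean()
    # Rescale so that the off-diagonal values sum to zero.
    return (ibs - avg) / (1.0 - avg)


# Made-up symmetric IBS matrix for three samples (assumed values).
ibs = np.array(
    [
        [0.750, 0.250, 0.500],
        [0.250, 0.625, 0.375],
        [0.500, 0.375, 0.750],
    ]
)
beta = beta_from_ibs(ibs)
off_diagonal_sum = beta[np.triu_indices_from(beta, k=1)].sum()
```

After scaling, `off_diagonal_sum` is zero up to floating-point error, which is the property stated above.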
Beta may be corrected to more accurately reflect pedigree-based kinship estimates using the formula \(\hat{\beta}^c=\frac{\hat{\beta}-\hat{\beta}_0}{1-\hat{\beta}_0}\), where \(\hat{\beta}_0\) is the estimated beta between samples that are known to be unrelated [1].
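The correction formula can be applied directly to a beta matrix. The sketch below assumes that the unrelated samples are supplied as a list of index pairs and that \(\hat{\beta}_0\) is taken as the mean beta over those pairs; both the helper name `correct_beta` and the averaging choice are assumptions made for illustration, not sgkit API.

```python
import numpy as np


def correct_beta(beta, unrelated_pairs):
    """Apply (beta - beta0) / (1 - beta0) to a beta matrix.

    Hypothetical helper: beta0 is assumed to be the mean beta over the
    given pairs of samples that are known to be unrelated.
    """
    beta = np.asarray(beta, dtype=float)
    rows, cols = zip(*unrelated_pairs)
    # Mean beta over the known-unrelated pairs.
    beta0 = beta[list(rows), list(cols)].mean()
    return (beta - beta0) / (1.0 - beta0)


# Beta matrix for three samples; samples 0 and 1 are assumed unrelated.
beta = np.array(
    [
        [0.50, -0.25, 0.25],
        [-0.25, 0.25, 0.00],
        [0.25, 0.00, 0.50],
    ]
)
beta_corrected = correct_beta(beta, unrelated_pairs=[(0, 1)])
```

With a single unrelated pair, the corrected beta for that pair becomes zero and all other values are rescaled relative to it.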
- Parameters:
- ds
Dataset
Genotype call dataset.
- stat_identity_by_state
Hashable (default: 'stat_identity_by_state')
Input variable name holding stat_identity_by_state as defined by sgkit.variables.stat_identity_by_state_spec. If the variable is not present in ds, it will be computed using identity_by_state().
- merge
bool (default: True)
If True (the default), merge the input dataset and the computed output variables into a single dataset, otherwise return only the computed output variables. See Dataset merge behavior for more details.
- Return type:
Dataset
- Returns:
A dataset containing sgkit.variables.stat_Weir_Goudet_beta_spec, which is a matrix of estimated pairwise kinship relative to the average kinship of all pairs of individuals in the dataset. The dimensions are named samples_0 and samples_1.
Examples
>>> import sgkit as sg
>>> ds = sg.simulate_genotype_call_dataset(n_variant=3, n_sample=3, n_allele=10, seed=3)
>>> # sample 2 "inherits" alleles from samples 0 and 1
>>> ds.call_genotype.data[:, 2, 0] = ds.call_genotype.data[:, 0, 0]
>>> ds.call_genotype.data[:, 2, 1] = ds.call_genotype.data[:, 1, 0]
>>> sg.display_genotypes(ds)
samples    S0   S1   S2
variants
0         7/1  8/6  7/8
1         9/5  3/6  9/3
2         8/8  8/3  8/8
>>> # estimate beta
>>> ds = sg.Weir_Goudet_beta(ds).compute()
>>> ds.stat_Weir_Goudet_beta.values
array([[ 0.5 , -0.25,  0.25],
       [-0.25,  0.25,  0.  ],
       [ 0.25,  0.  ,  0.5 ]])
>>> # correct beta assuming least related samples are unrelated
>>> beta = ds.stat_Weir_Goudet_beta
>>> beta0 = beta.min()
>>> beta_corrected = (beta - beta0) / (1 - beta0)
>>> beta_corrected.values
array([[0.6, 0. , 0.4],
       [0. , 0.4, 0.2],
       [0.4, 0.2, 0.6]])
References
[1] - Weir, Bruce S., and Jérôme Goudet. 2017. “A Unified Characterization of Population Structure and Relatedness.” Genetics 206 (4): 2085-2103.