sgkit.Weir_Goudet_beta
- sgkit.Weir_Goudet_beta(ds, *, stat_identity_by_state='stat_identity_by_state', merge=True)
Estimate pairwise beta between all pairs of samples as described in Weir and Goudet 2017 [1].
Beta is the kinship scaled by the average kinship of all pairs of individuals in the dataset such that the non-diagonal (non-self) values sum to zero.
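The scaling described above can be sketched with plain NumPy. This is a minimal illustration, not sgkit's implementation: it assumes beta is obtained from an identity-by-state (IBS) matrix M as (M - M_B) / (1 - M_B), where M_B is the mean IBS over all pairs of distinct samples. The helper name `beta_from_ibs` is hypothetical, and the IBS matrix was worked out by hand from the three genotypes in the Examples section, assuming IBS is the probability that an allele drawn from each sample matches, averaged over variants.

```python
import numpy as np


def beta_from_ibs(ibs):
    """Scale an IBS matrix by its mean over pairs of distinct samples.

    Hypothetical helper for illustration; not part of sgkit.
    """
    ibs = np.asarray(ibs, dtype=float)
    # Strict lower triangle = every pair of distinct samples, counted once.
    lower = np.tril_indices_from(ibs, k=-1)
    mean_between = np.nanmean(ibs[lower])
    return (ibs - mean_between) / (1.0 - mean_between)


# Assumed IBS matrix, derived by hand from the genotypes in the Examples section.
ibs = np.array(
    [
        [2 / 3, 1 / 6, 1 / 2],
        [1 / 6, 1 / 2, 1 / 3],
        [1 / 2, 1 / 3, 2 / 3],
    ]
)

beta = beta_from_ibs(ibs)

# The non-self (off-diagonal) values sum to zero, up to floating-point error.
off_diagonal_sum = beta[~np.eye(3, dtype=bool)].sum()
```

Under these assumptions, `beta` agrees with the matrix shown in the Examples section, and `off_diagonal_sum` is zero within rounding.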
Beta may be corrected to more accurately reflect pedigree-based kinship estimates using the formula
$$\frac{\hat{\beta} - \hat{\beta}_0}{1 - \hat{\beta}_0}$$
where $\hat{\beta}_0$ is the estimated beta between samples which are known to be unrelated [1].
- Parameters
- ds
Dataset
Genotype call dataset.
- stat_identity_by_state
Hashable (default: 'stat_identity_by_state')
Input variable name holding stat_identity_by_state as defined by sgkit.variables.stat_identity_by_state_spec. If the variable is not present in ds, it will be computed using identity_by_state().
- merge
bool (default: True)
If True (the default), merge the input dataset and the computed output variables into a single dataset, otherwise return only the computed output variables. See Dataset merge behavior for more details.
- Return type
Dataset
- Returns
A dataset containing sgkit.variables.stat_Weir_Goudet_beta_spec, which is a matrix of estimated pairwise kinship relative to the average kinship of all pairs of individuals in the dataset. The dimensions are named samples_0 and samples_1.
Examples
>>> import sgkit as sg
>>> ds = sg.simulate_genotype_call_dataset(n_variant=3, n_sample=3, n_allele=10, seed=3)
>>> # sample 2 "inherits" alleles from samples 0 and 1
>>> ds.call_genotype.data[:, 2, 0] = ds.call_genotype.data[:, 0, 0]
>>> ds.call_genotype.data[:, 2, 1] = ds.call_genotype.data[:, 1, 0]
>>> sg.display_genotypes(ds)
samples    S0   S1   S2
variants
0         7/1  8/6  7/8
1         9/5  3/6  9/3
2         8/8  8/3  8/8
>>> # estimate beta
>>> ds = sg.Weir_Goudet_beta(ds).compute()
>>> ds.stat_Weir_Goudet_beta.values
array([[ 0.5 , -0.25,  0.25],
       [-0.25,  0.25,  0.  ],
       [ 0.25,  0.  ,  0.5 ]])
>>> # correct beta assuming least related samples are unrelated
>>> beta = ds.stat_Weir_Goudet_beta
>>> beta0 = beta.min()
>>> beta_corrected = (beta - beta0) / (1 - beta0)
>>> beta_corrected.values
array([[0.6, 0. , 0.4],
       [0. , 0.4, 0.2],
       [0.4, 0.2, 0.6]])
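The correction above takes beta0 from the least related pair. When specific pairs are known to be unrelated, beta0 could instead be taken from those pairs directly. The sketch below is one way to do that with NumPy alone. The helper `correct_beta` and its `unrelated_pairs` argument are hypothetical and not part of sgkit, and averaging beta over several unrelated pairs is an assumption made here for illustration.

```python
import numpy as np


def correct_beta(beta, unrelated_pairs):
    """Rescale beta so that known-unrelated pairs have a value of zero.

    Hypothetical helper for illustration; not part of sgkit.
    `unrelated_pairs` is a list of (i, j) sample index pairs.
    """
    beta = np.asarray(beta, dtype=float)
    rows, cols = zip(*unrelated_pairs)
    # Assumption: beta0 is the mean beta over the supplied unrelated pairs.
    beta0 = beta[list(rows), list(cols)].mean()
    return (beta - beta0) / (1.0 - beta0)


# Beta matrix copied from the example output above.
beta = np.array(
    [
        [0.5, -0.25, 0.25],
        [-0.25, 0.25, 0.0],
        [0.25, 0.0, 0.5],
    ]
)

# Treat samples 0 and 1 (the two "parents") as the known-unrelated pair.
corrected = correct_beta(beta, unrelated_pairs=[(0, 1)])
```

Because samples 0 and 1 are also the least related pair in this example, `corrected` matches the `beta_corrected` values shown above.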
References
[1] - Weir, Bruce S., and Jérôme Goudet. 2017. “A Unified Characterization of Population Structure and Relatedness.” Genetics 206 (4): 2085-2103.