sgkit.pedigree_contribution
- sgkit.pedigree_contribution(ds, *, method='even', chunks=-1, parent='parent', stat_Hamilton_Kerr_tau='stat_Hamilton_Kerr_tau', merge=True)
Calculate the expected genomic contribution of each sample to each other sample based on pedigree structure.
- Parameters:
- ds
Dataset
Dataset containing pedigree structure.
- method {‘even’, ‘variable’}
Literal['even', 'variable'] (default: 'even')
The method used for estimating genomic contributions. The ‘even’ method assumes that all samples are of a single, even ploidy (e.g., diploid) and have even contributions from each parent. The ‘variable’ method allows for uneven contributions due to ploidy manipulation and/or clonal reproduction.
- chunks
Hashable (default: -1)
Optionally specify chunks for the returned array. A single chunk is used by default. Currently, chunking is only supported for a single axis.
- parent
Hashable (default: 'parent')
Input variable name holding parents of each sample as defined by sgkit.variables.parent_spec. If the variable is not present in ds, it will be computed using parent_indices().
- stat_Hamilton_Kerr_tau
Hashable (default: 'stat_Hamilton_Kerr_tau')
Input variable name holding stat_Hamilton_Kerr_tau as defined by sgkit.variables.stat_Hamilton_Kerr_tau_spec. This variable is only required for the ‘variable’ method.
- merge
bool (default: True)
If True (the default), merge the input dataset and the computed output variables into a single dataset; otherwise, return only the computed output variables.
- Return type:
Dataset
- Returns:
A dataset containing sgkit.variables.stat_pedigree_contribution_spec.
- Raises:
ValueError – If an unknown method is specified.
ValueError – If the ‘even’ method is specified for an odd-ploidy dataset.
ValueError – If the ‘even’ method is specified and the length of the ‘parents’ dimension is not 2.
NotImplementedError – If chunking is specified for both axes.
Note
Dimensions of sgkit.variables.stat_pedigree_contribution_spec are named samples_0 and samples_1.
Examples
>>> import xarray as xr
>>> from sgkit import pedigree_contribution
>>> ds = xr.Dataset()
>>> ds["sample_id"] = "samples", ["S0", "S1", "S2", "S3", "S4", "S5"]
>>> ds["parent_id"] = ["samples", "parents"], [
...     [ ".",  "."],
...     [ ".",  "."],
...     ["S1", "S0"],
...     ["S2", "S0"],
...     ["S0", "S2"],
...     ["S1", "S3"]
... ]
>>> ds = pedigree_contribution(ds)
>>> ds.stat_pedigree_contribution.values
array([[1.   , 0.   , 0.5  , 0.75 , 0.75 , 0.375],
       [0.   , 1.   , 0.5  , 0.25 , 0.25 , 0.625],
       [0.   , 0.   , 1.   , 0.5  , 0.5  , 0.25 ],
       [0.   , 0.   , 0.   , 1.   , 0.   , 0.5  ],
       [0.   , 0.   , 0.   , 0.   , 1.   , 0.   ],
       [0.   , 0.   , 0.   , 0.   , 0.   , 1.   ]])
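To make the ‘even’ method concrete, the following is a minimal NumPy-only sketch of the arithmetic behind the matrix above. It is not sgkit's implementation. The helper name even_contribution is hypothetical, and the sketch assumes that "." marks an unknown parent, that every sample is listed after its parents, and that rows index the contributing sample and columns the receiving sample (as the example output suggests). Each sample contributes 1 to itself and inherits half of each known parent's column.

```python
import numpy as np


def even_contribution(sample_ids, parent_ids, missing="."):
    """Hypothetical helper, not part of sgkit.

    Returns a matrix whose entry [i, j] is the expected genomic
    contribution of sample i to sample j, assuming two parents per
    sample with equal (0.5) contributions.
    """
    index = {s: i for i, s in enumerate(sample_ids)}
    # Map parent IDs to sample indices; -1 marks an unknown parent.
    parent = np.array(
        [[-1 if p == missing else index[p] for p in row] for row in parent_ids]
    )
    n = len(sample_ids)
    contribution = np.zeros((n, n))
    for j in range(n):
        # Every sample contributes fully to itself.
        contribution[j, j] = 1.0
        for p in parent[j]:
            if p < 0:
                continue  # unknown parent: nothing to inherit
            if p >= j:
                # Assumption of this sketch: parents precede their children.
                raise ValueError("samples must be listed after their parents")
            # Half of everything that contributed to parent p reaches sample j.
            contribution[:, j] += 0.5 * contribution[:, p]
    return contribution


sample_ids = ["S0", "S1", "S2", "S3", "S4", "S5"]
parent_ids = [
    [".", "."],
    [".", "."],
    ["S1", "S0"],
    ["S2", "S0"],
    ["S0", "S2"],
    ["S1", "S3"],
]
result = even_contribution(sample_ids, parent_ids)
```

Under these assumptions, result matches the stat_pedigree_contribution values shown above. For example, result[0, 5] is 0.375: S0 reaches S5 only through S3, to which it contributes 0.75, and S3 passes on half of that.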