sgkit.invert_relationship_matrix

sgkit.invert_relationship_matrix(ds, *, relationship, subset_sample=None, subset_rechunk=None, merge=True)

Calculate the inverse relationship (sub-) matrix.

Parameters:
ds Dataset

Dataset containing a relationship matrix.

relationship Hashable

Variable containing the relationship matrix.

subset_sample Hashable | None (default: None)

Optionally specify a variable containing an array of booleans which indicate samples defining a sub-matrix of relationships to invert.

subset_rechunk int | None (default: None)

Optionally specify sizes for re-chunking the sub-matrix defined by the subset variable. This can be used to avoid value errors caused by uneven chunk sizes in the resulting sub-matrix.

merge bool (default: True)

If True (the default), merge the input dataset and the computed output variables into a single dataset, otherwise return only the computed output variables. See Dataset merge behavior for more details.

Return type:

Dataset

Returns:

A dataset containing sgkit.variables.stat_inverse_relationship_spec. The dimensions are named samples_0 and samples_1. If a subset of samples was specified, then nan values are used to indicate relationships that are not included within the subset.

Examples

>>> import xarray as xr
>>> import sgkit as sg
>>> ds = xr.Dataset()
>>> ds["stat_pedigree_relationship"] = ["samples_0", "samples_1"], [
...     [1.   , 0.   , 0.   , 0.5  , 0.   , 0.25 ],
...     [0.   , 1.   , 0.   , 0.5  , 0.5  , 0.5  ],
...     [0.   , 0.   , 1.   , 0.   , 0.5  , 0.25 ],
...     [0.5  , 0.5  , 0.   , 1.   , 0.25 , 0.625],
...     [0.   , 0.5  , 0.5  , 0.25 , 1.   , 0.625],
...     [0.25 , 0.5  , 0.25 , 0.625, 0.625, 1.125]
... ]
>>> sg.invert_relationship_matrix(
...     ds,
...     relationship="stat_pedigree_relationship",
... ).stat_inverse_relationship.values  
array([[ 1.5,  0.5,  0. , -1. ,  0. ,  0. ],
       [ 0.5,  2. ,  0.5, -1. , -1. ,  0. ],
       [ 0. ,  0.5,  1.5,  0. , -1. ,  0. ],
       [-1. , -1. ,  0. ,  2.5,  0.5, -1. ],
       [ 0. , -1. , -1. ,  0.5,  2.5, -1. ],
       [ 0. ,  0. ,  0. , -1. , -1. ,  2. ]])
>>> # inverse of a sub-matrix
>>> ds["subset_sample"] = "samples", [False, False, False, True, True, True]
>>> sg.invert_relationship_matrix(
...     ds,
...     relationship="stat_pedigree_relationship",
...     subset_sample="subset_sample",
... ).stat_inverse_relationship.values.round(3)  
array([[   nan,    nan,    nan,    nan,    nan,    nan],
       [   nan,    nan,    nan,    nan,    nan,    nan],
       [   nan,    nan,    nan,    nan,    nan,    nan],
       [   nan,    nan,    nan,  1.567,  0.233, -1.   ],
       [   nan,    nan,    nan,  0.233,  1.567, -1.   ],
       [   nan,    nan,    nan, -1.   , -1.   ,  2.   ]])
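The nan-padded layout of the sub-matrix result above can be illustrated with plain NumPy. This is only a sketch of the output convention, not sgkit's implementation: the helper name invert_submatrix is hypothetical, and it assumes a dense in-memory array rather than the chunked arrays sgkit operates on.

```python
import numpy as np

# Same relationship matrix as in the example above.
A = np.array([
    [1.0,  0.0,  0.0,  0.5,   0.0,   0.25],
    [0.0,  1.0,  0.0,  0.5,   0.5,   0.5],
    [0.0,  0.0,  1.0,  0.0,   0.5,   0.25],
    [0.5,  0.5,  0.0,  1.0,   0.25,  0.625],
    [0.0,  0.5,  0.5,  0.25,  1.0,   0.625],
    [0.25, 0.5,  0.25, 0.625, 0.625, 1.125],
])
subset = np.array([False, False, False, True, True, True])


def invert_submatrix(matrix, mask):
    """Hypothetical helper: invert the sub-matrix selected by a boolean mask.

    Rows and columns outside the mask are filled with nan, mirroring the
    output layout described under "Returns".
    """
    idx = np.flatnonzero(mask)
    out = np.full(matrix.shape, np.nan)
    block = np.ix_(idx, idx)  # row/column index pair selecting the sub-matrix
    out[block] = np.linalg.inv(matrix[block])
    return out


A_inv_sub = invert_submatrix(A, subset)
```

Only the block of rows and columns selected by the mask holds finite values. Note that this block is the inverse of the sub-matrix itself, which differs from the corresponding block of the full inverse shown in the first example.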