sgkit.sample_stats

sgkit.sample_stats(ds, *, call_genotype_mask='call_genotype_mask', call_genotype='call_genotype', variant_allele_count='variant_allele_count', merge=True)

Compute quality control sample statistics from genotype calls.

Parameters
ds : Dataset

Dataset containing genotype calls.

call_genotype : Hashable (default: 'call_genotype')

Input variable name holding call_genotype. Defined by sgkit.variables.call_genotype_spec. Must be present in ds.

call_genotype_mask : Hashable (default: 'call_genotype_mask')

Input variable name holding call_genotype_mask. Defined by sgkit.variables.call_genotype_mask_spec. Must be present in ds.

variant_allele_count : Hashable (default: 'variant_allele_count')

Input variable name holding variant_allele_count, as defined by sgkit.variables.variant_allele_count_spec. If the variable is not present in ds, it will be computed using count_variant_alleles().

merge : bool (default: True)

If True (the default), merge the input dataset and the computed output variables into a single dataset, otherwise return only the computed output variables. See Dataset merge behavior for more details.

Return type

Dataset

Returns

A dataset containing the following variables: