sgkit.sample_stats
- sgkit.sample_stats(ds, *, call_genotype_mask='call_genotype_mask', call_genotype='call_genotype', variant_allele_count='variant_allele_count', merge=True)
- Compute quality control sample statistics from genotype calls.
- Parameters
- ds : Dataset
- Dataset containing genotype calls. 
- call_genotype : Hashable (default: 'call_genotype')
- Input variable name holding call_genotype, as defined by sgkit.variables.call_genotype_spec. Must be present in ds.
- call_genotype_mask : Hashable (default: 'call_genotype_mask')
- Input variable name holding call_genotype_mask, as defined by sgkit.variables.call_genotype_mask_spec. Must be present in ds.
- variant_allele_count : Hashable (default: 'variant_allele_count')
- Input variable name holding variant_allele_count, as defined by sgkit.variables.variant_allele_count_spec. If the variable is not present in ds, it will be computed using count_variant_alleles().
- merge : bool (default: True)
- If True (the default), merge the input dataset and the computed output variables into a single dataset, otherwise return only the computed output variables. See Dataset merge behavior for more details. 
 
- Return type
- Dataset
- Returns
- A dataset containing the following variables:
- sgkit.variables.sample_n_called_spec (samples): The number of variants with called genotypes.
- sgkit.variables.sample_call_rate_spec (samples): The fraction of variants with called genotypes.
- sgkit.variables.sample_n_het_spec (samples): The number of variants with heterozygous calls.
- sgkit.variables.sample_n_hom_ref_spec (samples): The number of variants with homozygous reference calls.
- sgkit.variables.sample_n_hom_alt_spec (samples): The number of variants with homozygous alternate calls.
- sgkit.variables.sample_n_non_ref_spec (samples): The number of variants that are not homozygous reference calls.
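
To make the definitions above concrete, the following is a minimal NumPy sketch that computes the same six quantities on a hand-written genotype array. It is an illustration only and not sgkit's implementation. It assumes that -1 marks a missing allele, that a call counts as called only when none of its alleles is missing, and that missing calls are excluded from the non-reference count.

```python
import numpy as np

# Toy genotype calls with shape (variants, samples, ploidy).
# Assumption: -1 marks a missing allele.
gt = np.array(
    [
        [[0, 0], [0, 1], [1, 1]],
        [[0, 1], [-1, -1], [1, 1]],
        [[0, 0], [0, 0], [0, 1]],
        [[1, 1], [0, 1], [-1, -1]],
    ]
)

# Assumption: a call is "called" only if no allele is missing.
called = (gt >= 0).all(axis=-1)

# Homozygous: called, and every allele equals the first allele.
hom = called & (gt == gt[..., :1]).all(axis=-1)
hom_ref = hom & (gt[..., 0] == 0)
hom_alt = hom & (gt[..., 0] > 0)
het = called & ~hom

# Reduce over the variants axis to get one value per sample.
stats = {
    "sample_n_called": called.sum(axis=0),
    "sample_call_rate": called.mean(axis=0),
    "sample_n_het": het.sum(axis=0),
    "sample_n_hom_ref": hom_ref.sum(axis=0),
    "sample_n_hom_alt": hom_alt.sum(axis=0),
    "sample_n_non_ref": (het | hom_alt).sum(axis=0),
}

for name, values in stats.items():
    print(name, values.tolist())
```

Under these assumptions, sample_n_non_ref equals sample_n_het plus sample_n_hom_alt, and equivalently sample_n_called minus sample_n_hom_ref.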
 
 
