sgkit.simulate_genotype_call_dataset
- sgkit.simulate_genotype_call_dataset(n_variant, n_sample, n_ploidy=2, n_allele=2, n_contig=1, seed=0, missing_pct=None, phased=None, additional_variant_fields=None)
Simulate genotype calls and variant/sample data.
Note that the data simulated by this function has no biological interpretation, and summary statistics or other methods applied to it will produce meaningless results. This function is primarily a convenience for generating xarray.Dataset containers, so quantities of interest should be overwritten, where appropriate, within the context of a more specific application.
- Parameters
- n_variant :
int Number of variants to simulate
- n_sample :
int Number of samples to simulate
- n_ploidy :
int (default:2) Number of chromosome copies in each sample
- n_allele :
int (default:2) Number of alleles to simulate
- n_contig :
int (default:1) Number of contigs to partition variants with, controlling values in variant_contig. Values will all be 0 by default when n_contig is 1.
- seed :
Optional[int] (default:0) Seed for random number generation, optional
- missing_pct :
Optional[float] (default:None) The proportion of missing calls, expressed as a fraction within [0.0, 1.0], optional
- phased :
Optional[bool] (default:None) Whether genotypes are phased, default is unphased, optional
- additional_variant_fields :
Optional[dict] (default:None) Additional variant fields to add to the dataset, as a dictionary of {field_name: field_dtype}, optional
- Return type
xarray.Dataset
- Returns
A dataset containing the following variables:
- sgkit.variables.variant_contig_spec (variants)
- sgkit.variables.variant_position_spec (variants)
- sgkit.variables.variant_allele_spec (variants)
- sgkit.variables.sample_id_spec (samples)
- sgkit.variables.call_genotype_spec (variants, samples, ploidy)
- sgkit.variables.call_genotype_mask_spec (variants, samples, ploidy)
- sgkit.variables.call_genotype_phased_spec (variants, samples), if phased is not None
- Those specified in additional_variant_fields, if provided