sgkit.simulate_genotype_call_dataset
- sgkit.simulate_genotype_call_dataset(n_variant, n_sample, n_ploidy=2, n_allele=2, n_contig=1, seed=0, missing_pct=None, phased=None, additional_variant_fields=None)
Simulate genotype calls and variant/sample data.
Note that the data simulated by this function has no biological interpretation, and summary statistics or other methods applied to it will produce meaningless results. This function is primarily a convenience for generating xarray.Dataset containers, so quantities of interest should be overwritten, where appropriate, within the context of a more specific application.
- Parameters
- n_variant : int
  Number of variants to simulate.
- n_sample : int
  Number of samples to simulate.
- n_ploidy : int (default: 2)
  Number of chromosome copies in each sample.
- n_allele : int (default: 2)
  Number of alleles to simulate.
- n_contig : int (default: 1)
  Number of contigs to partition variants into, controlling the values in variant_contig. All values are 0 when n_contig is 1 (the default). Optional.
- seed : Optional[int] (default: 0)
  Seed for random number generation. Optional.
- missing_pct : Optional[float] (default: None)
  The proportion of missing calls, expressed as a fraction within [0.0, 1.0] rather than as a value out of 100. Optional.
- phased : Optional[bool] (default: None)
  Whether genotypes are phased; the default is unphased. Optional.
- additional_variant_fields : Optional[dict] (default: None)
  Additional variant fields to add to the dataset, given as a dictionary of {field_name: field_dtype}. Optional.
- Return type
  Dataset
- Returns
  A dataset containing the following variables:
  - sgkit.variables.variant_contig_spec (variants)
  - sgkit.variables.variant_position_spec (variants)
  - sgkit.variables.variant_allele_spec (variants)
  - sgkit.variables.sample_id_spec (samples)
  - sgkit.variables.call_genotype_spec (variants, samples, ploidy)
  - sgkit.variables.call_genotype_mask_spec (variants, samples, ploidy)
  - sgkit.variables.call_genotype_phased_spec (variants, samples), if phased is not None
  - Those specified in additional_variant_fields, if provided