API

class tspop.PopAncestry(left, right, population, ancestor, child, sample_nodes, sequence_length)

Bases: object

An object holding local ancestry information and various summaries of that information. In most cases, it should be created with the tspop.get_pop_ancestry() function rather than constructed directly.

Parameters:
  • left (list(float)) – The array of left coordinates.

  • right (list(float)) – The array of right coordinates.

  • population (list(int)) – The array of population labels.

  • sample_nodes (list(int)) – The list of IDs corresponding to sample nodes.

  • sequence_length (float) – The physical length of the region represented.

ancestry_table

A pandas.DataFrame object with column labels sample, left, right, ancestor, population. Each row (sample, left, right, ancestor, population) indicates that over the genomic interval with coordinates [left, right), the sample node with ID sample has inherited from the ancestral node with ID ancestor in the population with ID population. Ancestral nodes and population labels are taken from the specified census time.
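To make the half-open interval convention [left, right) concrete, the following minimal sketch looks up which ancestor and population a sample inherited from at a given position. It uses plain tuples in place of the pandas.DataFrame. The rows and the helper name ancestor_at are made up for illustration and are not part of tspop.

```python
# Hypothetical rows in (sample, left, right, ancestor, population) order,
# standing in for the rows of an ancestry table.
ROWS = [
    (0, 0.0, 40.0, 10, 0),
    (0, 40.0, 100.0, 11, 1),
    (1, 0.0, 100.0, 12, 1),
]


def ancestor_at(rows, sample, position):
    """Return (ancestor, population) for `sample` at `position`, or None.

    Intervals are half-open: a row covers left <= position < right.
    """
    for row_sample, left, right, ancestor, population in rows:
        if row_sample == sample and left <= position < right:
            return ancestor, population
    return None


print(ancestor_at(ROWS, 0, 39.9))   # (10, 0)
print(ancestor_at(ROWS, 0, 40.0))   # (11, 1): 40.0 belongs to the second row
print(ancestor_at(ROWS, 0, 100.0))  # None: the right endpoint is excluded
```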

ancestry_table_write_csv(outfile, **kwargs)

Writes the ancestry table to a csv file.

Parameters:
  • outfile (str) – The name of the output file.

  • kwargs – Other keyword arguments, passed to pandas.DataFrame.to_csv.

calculate_ancestry_fraction(population, sample=None)

Returns the total fraction of genomic material inherited from a given population.

Parameters:
  • population (int) – The index of the population to use.

  • sample (int) – A specific sample node.

Returns:

the global ancestry fraction.
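As a rough illustration of what such a fraction means, the sketch below sums the lengths of the intervals inherited from one population and divides by the total length considered. It is not the library's implementation: the rows, the function name ancestry_fraction, and the choice of denominator (sequence length times number of samples when no sample is given, sequence length alone for a single sample) are assumptions made for this example.

```python
# Hypothetical rows in (sample, left, right, population) order.
ROWS = [
    (0, 0.0, 40.0, 0),
    (0, 40.0, 100.0, 1),
    (1, 0.0, 100.0, 1),
]


def ancestry_fraction(rows, population, sequence_length, num_samples, sample=None):
    """Fraction of genomic material inherited from `population`.

    Assumption: with sample=None the fraction is taken over all samples,
    otherwise over the single given sample node.
    """
    inherited = sum(
        right - left
        for row_sample, left, right, row_pop in rows
        if row_pop == population and (sample is None or row_sample == sample)
    )
    denominator = sequence_length * (num_samples if sample is None else 1)
    return inherited / denominator


# Population 1 across both samples: (60 + 100) / (100 * 2)
print(ancestry_fraction(ROWS, 1, sequence_length=100.0, num_samples=2))
# Population 0 in sample 0 only: 40 / 100
print(ancestry_fraction(ROWS, 0, sequence_length=100.0, num_samples=2, sample=0))
```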

coverage

The proportion of the total genome length that has an ancestor recorded in the tspop.PopAncestry.squashed_table and tspop.PopAncestry.ancestry_table.

num_ancestors

The number of ancestral haplotypes. Always less than or equal to the number of input ancestral nodes.

num_samples

The number of provided samples.

plot_karyotypes(sample_pair, colors=None, pop_labels=None, title=None, length_in_Mb=True, outfile=None, height=12, width=20)

Note

Diploid only for now.

Creates a plot of the ancestry tracts in a sample pair of chromosomes using matplotlib.

Parameters:
  • sample_pair (list(int)) – a pair of sample node IDs in the PopAncestry object.

  • colors (list(str)) – A list of pyplot-compatible colours to use for the ancestral populations, given in order of their appearance in the tspop.PopAncestry.squashed_table. If None, uses the default matplotlib colour cycle.

  • pop_labels (list(str)) – Ancestral population labels for the plot legend. If None, defaults to Pop0, Pop1 etc.

  • title (str) – The title of the plot. If None, defaults to ‘Ancestry in admixed individual’.

  • length_in_Mb (bool) – Whether or not to label the horizontal axis in megabases. Defaults to True.

  • outfile (str) – The name of the output file. If None, the plot opens with the system viewer.

  • height (float) – The height of the figure in inches.

  • width (float) – The width of the figure in inches.

Returns:

a matplotlib figure.

squashed_table

A pandas.DataFrame object with column labels sample, left, right, population. Each row (sample, left, right, population) indicates that over the genomic interval with coordinates [left, right), the sample node with ID sample has inherited from an ancestral node in the population with ID population. Population labels are taken from the specified census time.
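One way to picture the relationship between the two tables is the sketch below, which drops the ancestor column and merges adjacent intervals that share a sample and a population. This assumes the squashed table is the ancestry table with such neighbouring rows combined; the function squash and the example rows are hypothetical and are not tspop's own code.

```python
def squash(ancestry_rows):
    """Merge adjacent rows that share a sample and a population.

    Input rows are (sample, left, right, ancestor, population);
    output rows are (sample, left, right, population).
    """
    squashed = []
    # Sort by sample, then by left coordinate, so neighbours are consecutive.
    for sample, left, right, _ancestor, population in sorted(
        ancestry_rows, key=lambda row: (row[0], row[1])
    ):
        if squashed:
            prev_sample, prev_left, prev_right, prev_pop = squashed[-1]
            if (prev_sample, prev_pop, prev_right) == (sample, population, left):
                # Extend the previous interval instead of starting a new one.
                squashed[-1] = (sample, prev_left, right, population)
                continue
        squashed.append((sample, left, right, population))
    return squashed


ROWS = [
    (0, 0.0, 25.0, 10, 0),
    (0, 25.0, 40.0, 13, 0),   # different ancestor, same population
    (0, 40.0, 100.0, 11, 1),
    (1, 0.0, 100.0, 12, 1),
]
print(squash(ROWS))
# [(0, 0.0, 40.0, 0), (0, 40.0, 100.0, 1), (1, 0.0, 100.0, 1)]
```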

squashed_table_write_csv(outfile, **kwargs)

Writes the squashed table to a csv file.

Parameters:
  • outfile (str) – The name of the output file.

  • kwargs – Other keyword arguments, passed to pandas.DataFrame.to_csv.

subset_tables(subset_samples, inplace=False)

Subsets the ancestry table and squashed table by sample. Note: by default this returns a copy of the original tables. To overwrite the original tables, set inplace=True. (In this case, the function returns nothing).

Parameters:
  • subset_samples (list(int)) – The sample nodes to keep.

  • inplace (bool) – Whether to overwrite the original tables.

Returns:

The subsetted ancestry table and squashed table (only if inplace=False).
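The copy-versus-overwrite behaviour described above can be sketched with a small stand-in class. TinyTables is hypothetical and holds plain lists of tuples rather than DataFrames; it only mirrors the convention that inplace=False returns new tables and leaves the originals alone, while inplace=True overwrites them and returns nothing.

```python
class TinyTables:
    """Hypothetical stand-in for an object holding the two tables.

    Each row is a tuple whose first element is the sample node ID.
    """

    def __init__(self, ancestry_rows, squashed_rows):
        self.ancestry_rows = list(ancestry_rows)
        self.squashed_rows = list(squashed_rows)

    def subset_tables(self, subset_samples, inplace=False):
        keep = set(subset_samples)
        ancestry = [row for row in self.ancestry_rows if row[0] in keep]
        squashed = [row for row in self.squashed_rows if row[0] in keep]
        if inplace:
            # Overwrite the stored tables and return nothing.
            self.ancestry_rows, self.squashed_rows = ancestry, squashed
            return None
        # Default: hand back new tables, leaving the originals untouched.
        return ancestry, squashed


tables = TinyTables(
    ancestry_rows=[(0, 0.0, 100.0, 10, 0), (1, 0.0, 100.0, 12, 1)],
    squashed_rows=[(0, 0.0, 100.0, 0), (1, 0.0, 100.0, 1)],
)

subset = tables.subset_tables([1])
print(subset)                     # the two subsetted tables
print(len(tables.ancestry_rows))  # originals still hold both samples

result = tables.subset_tables([1], inplace=True)
print(result)                     # None
print(tables.ancestry_rows)       # now only sample 1 remains
```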

total_genome_length

Sequence length times the number of samples.

tspop.get_pop_ancestry(ts, census_time)

Creates a tspop.PopAncestry object from a simulated tree sequence containing ancestral census nodes. Population-based ancestry is calculated with respect to these census nodes.

Parameters:
  • ts (tskit.TreeSequence) – A tree sequence containing census nodes.

  • census_time (list(int)) – The time at which the census nodes are recorded.

Returns:

a tspop.PopAncestry object