API Reference

pytrf.version

pytrf.version()

Get the current version of pytrf

Returns:

version

Return type:

str

pytrf.STRFinder

class pytrf.STRFinder(chrom, seq, mono=12, di=7, tri=5, tetra=4, penta=4, hexa=4)

Find all exact (perfect) short tandem repeats (STRs), also known as simple sequence repeats (SSRs) or microsatellites, that meet the minimum repeat thresholds in the input sequence

Parameters:
  • chrom (str) – the sequence name

  • seq (str) – the input DNA sequence

  • mono (int) – minimum number of tandem repeats for mono-nucleotide motifs

  • di (int) – minimum number of tandem repeats for di-nucleotide motifs

  • tri (int) – minimum number of tandem repeats for tri-nucleotide motifs

  • tetra (int) – minimum number of tandem repeats for tetra-nucleotide motifs

  • penta (int) – minimum number of tandem repeats for penta-nucleotide motifs

  • hexa (int) – minimum number of tandem repeats for hexa-nucleotide motifs

Returns:

STRFinder object
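The default thresholds in the signature above imply a minimum tract length for each motif size. The following is a small sketch of that arithmetic only; the dictionary and helper names are illustrative and are not part of pytrf, and it assumes a repeat that exactly reaches the threshold qualifies.

```python
# Default minimum repeat counts keyed by motif length in bases,
# copied from the STRFinder signature (mono=12, di=7, tri=5, tetra=4, penta=4, hexa=4).
DEFAULT_MIN_REPEATS = {1: 12, 2: 7, 3: 5, 4: 4, 5: 4, 6: 4}


def min_tract_length(motif_length, min_repeats=DEFAULT_MIN_REPEATS):
    """Shortest perfect repeat, in bases, that reaches the repeat threshold."""
    return motif_length * min_repeats[motif_length]


# Print motif length, minimum repeats and the implied minimum tract length.
for size in sorted(DEFAULT_MIN_REPEATS):
    print(size, DEFAULT_MIN_REPEATS[size], min_tract_length(size))
```

Raising a threshold, for example di=10, would raise the implied minimum di-nucleotide tract from 14 to 20 bases under the same assumption.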

as_list()

Return all SSRs as a list. Each SSR has 7 columns: [sequence name, start position, end position, motif sequence, motif length, repeats, SSR length]

Returns:

all SSRs found

Return type:

list
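Because each SSR is a plain 7-column record, it can help to attach names to the columns. The sketch below wraps rows in a namedtuple; the field names are illustrative labels for the documented column order, not identifiers defined by pytrf, and the example row is written by hand rather than produced by the library.

```python
from collections import namedtuple

# Illustrative labels for the seven documented columns, in documented order.
STRRow = namedtuple(
    "STRRow",
    ["chrom", "start", "end", "motif", "motif_length", "repeats", "length"],
)


def label_rows(rows):
    """Wrap each 7-column row in a namedtuple for access by name."""
    return [STRRow(*row) for row in rows]


# Hand-written example row (not real pytrf output): an AC motif repeated 8 times.
example_rows = [["chr1", 5, 20, "AC", 2, 8, 16]]
labelled = label_rows(example_rows)
print(labelled[0].motif, labelled[0].repeats)
```

With pytrf installed, the rows would instead come from a call such as pytrf.STRFinder('chr1', seq).as_list().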

pytrf.GTRFinder

class pytrf.GTRFinder(chrom, seq, min_motif=1, max_motif=100, min_repeat=3, min_length=10)

Find all exact (perfect) generic tandem repeats (GTRs) that meet the minimum repeat number and minimum length in the input sequence

Parameters:
  • chrom (str) – the sequence name

  • seq (str) – the input DNA sequence

  • min_motif (int) – minimum length of motif sequence

  • max_motif (int) – maximum length of motif sequence

  • min_repeat (int) – minimum number of tandem repeats

  • min_length (int) – minimum length of tandem repeats

Returns:

GTRFinder object

as_list()

Return all GTRs as a list. Each GTR has 7 columns: [sequence name, start position, end position, motif sequence, motif length, repeats, GTR length]

Returns:

all GTRs found

Return type:

list
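To make the interplay of min_motif, max_motif, min_repeat and min_length concrete, here is a brute-force pure-Python sketch that reports perfect tandem repeats in the same 7-column shape. It is not pytrf's implementation and may pick motifs or boundaries differently from the library; the function name and the smallest-motif-first rule are assumptions made for illustration.

```python
def find_exact_repeats(chrom, seq, min_motif=1, max_motif=100,
                       min_repeat=3, min_length=10):
    """Naive scan for perfect tandem repeats (illustration only)."""
    rows = []
    n = len(seq)
    i = 0
    while i < n:
        hit = None
        # Try the shortest motif first (an assumption of this sketch).
        for m in range(min_motif, max_motif + 1):
            if i + m > n:
                break
            # Extend while each base equals the base one motif length earlier.
            j = i + m
            while j < n and seq[j] == seq[j - m]:
                j += 1
            repeats = (j - i) // m      # whole motif copies only
            length = repeats * m
            if repeats >= min_repeat and length >= min_length:
                # One-based, inclusive coordinates.
                hit = [chrom, i + 1, i + length, seq[i:i + m], m, repeats, length]
                break
        if hit:
            rows.append(hit)
            i += hit[6]                 # skip past the reported repeat
        else:
            i += 1
    return rows


demo = find_exact_repeats("chr1", "GG" + "ACT" * 4 + "GG")
print(demo)
```

In this sketch a candidate must satisfy both thresholds at once: four copies of ACT pass with the defaults, while the same call with min_length=13 returns no rows.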

pytrf.ATRFinder

class pytrf.ATRFinder(chrom, seq, min_motif_size=1, max_motif_size=6, min_seed_repeat=3, min_seed_length=10, max_consecutive_error=3, min_extend_identity=70, max_extend_length=2000)

Find all approximate (imperfect) tandem repeats (ATRs) in the input sequence

Parameters:
  • chrom (str) – the sequence name

  • seq (str) – the input DNA sequence

  • min_motif_size (int) – minimum length of motif

  • max_motif_size (int) – maximum length of motif

  • min_seed_repeat (int) – minimum number of repeats for a seed

  • min_seed_length (int) – minimum length of seed

  • max_consecutive_error (int) – maximum number of allowed consecutive aligned errors

  • min_extend_identity (float) – minimum identity of extended alignment (0~1)

  • max_extend_length (int) – maximum length allowed to extend

Returns:

ATRFinder object

as_list()

Return all ATRs as a list. Each ATR has 15 columns: [sequence name, seed start position, seed end position, motif sequence, motif length, seed repeat, ATR start position, ATR end position, ATR repeat, ATR length, extend matches, extend substitutions, extend insertions, extend deletions, extend identity]

Returns:

all ATRs found

Return type:

list
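The ATR rows can be written out with a header using only the standard library. The sketch below is one way to do that; the column labels are illustrative names for the columns listed above, the helper is hypothetical, and the example row holds placeholder values rather than real pytrf output.

```python
import csv
import io

# Illustrative labels following the column order listed for ATRFinder.as_list().
ATR_COLUMNS = [
    "chrom", "seed_start", "seed_end", "motif", "motif_length", "seed_repeat",
    "start", "end", "repeat", "length",
    "matches", "substitutions", "insertions", "deletions", "identity",
]


def rows_to_tsv(rows, columns=ATR_COLUMNS):
    """Serialize rows to tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for row in rows:
        # Guard against rows whose width does not match the header.
        if len(row) != len(columns):
            raise ValueError("row does not match the expected column count")
        writer.writerow(row)
    return buffer.getvalue()


# Placeholder values for illustration only (not real pytrf output).
example_rows = [["chr1", 11, 22, "AC", 2, 6, 5, 30, 13, 26, 24, 1, 0, 1, 92.3]]
tsv_text = rows_to_tsv(example_rows)
print(tsv_text, end="")
```

The width check is a deliberate choice: it fails loudly if the rows do not have the expected number of columns instead of writing a misaligned table.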

pytrf.ETR

class pytrf.ETR

Read-only exact tandem repeat (ETR) object generated by iterating over an STRFinder or GTRFinder object

chrom

name of the chromosome or sequence on which the ETR is located

start

one-based start position of the ETR on the sequence

end

one-based end position of the ETR on the sequence

motif

motif sequence

type

motif length

repeat

number of repeats

length

length of ETR

seq

get the sequence of ETR

as_list()

convert ETR object to a list

as_dict()

convert ETR object to a dict

as_gff(terminator='')

convert ETR object to a GFF-formatted string

as_string(separator='\t', terminator='')

convert ETR object to a TSV or CSV string using the given separator and terminator

Parameters:
  • separator (str) – a separator between columns

  • terminator (str) – a terminator added to the end of string

Returns:

a formatted string

Return type:

str
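The separator and terminator arguments of as_string() can be pictured as a join over the record's columns. A minimal pure-Python sketch follows, assuming the string is simply the columns joined in as_list() order; format_row is a hypothetical helper, not a pytrf function, and the actual output of as_string() may differ.

```python
def format_row(fields, separator="\t", terminator=""):
    """Join column values with a separator and append a terminator."""
    return separator.join(str(field) for field in fields) + terminator


# Hand-written 7-column record (not real pytrf output).
record = ["chr1", 5, 20, "AC", 2, 8, 16]

tsv_line = format_row(record)                                   # tab-separated, no line ending
csv_line = format_row(record, separator=",", terminator="\n")   # comma-separated, newline-terminated
print(repr(tsv_line))
print(repr(csv_line))
```

Passing terminator='\n' is convenient when writing one record per line to a file, since the default terminator is empty.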

pytrf.ATR

class pytrf.ATR

Read-only imperfect or approximate tandem repeat (ATR) object generated by iterating over an ATRFinder object

chrom

name of the chromosome or sequence on which the ATR is located

start

one-based start position of the ATR on the sequence

end

one-based end position of the ATR on the sequence

seed_start

start position of seed

seed_end

end position of seed

seed_repeat

number of repeats in the seed

motif

motif sequence

type

motif length

repeat

repeat number of perfect counterpart

length

length of ATR

matches

number of matches in the extended alignment

substitutions

number of substitutions in the extended alignment

insertions

number of insertions in the extended alignment

deletions

number of deletions in the extended alignment

identity

identity of the extended alignment

seq

get the sequence of ATR

as_list()

convert ATR object to a list

as_dict()

convert ATR object to a dict

as_gff(terminator='')

convert ATR object to a GFF-formatted string

as_string(separator='\t', terminator='')

convert ATR object to a TSV or CSV string using the given separator and terminator

Parameters:
  • separator (str) – a separator between columns

  • terminator (str) – a terminator added to the end of string

Returns:

a formatted string

Return type:

str
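One-based positions such as start, end, seed_start and seed_end need an offset before they can be used as Python slice indices. The sketch below shows the conversion, assuming the end position is inclusive and that the seed coordinates are also one-based (the reference states one-based only for start and end); slice_one_based is a hypothetical helper and the coordinates are chosen by hand.

```python
def slice_one_based(seq, start, end):
    """Return the subsequence for one-based coordinates, assuming end is inclusive."""
    if start < 1 or end > len(seq) or start > end:
        raise ValueError("coordinates fall outside the sequence")
    # One-based inclusive [start, end] maps to the zero-based slice [start - 1, end).
    return seq[start - 1:end]


# Hand-built sequence: four flanking bases, eight AC copies, two flanking bases.
seq = "TTGG" + "AC" * 8 + "TT"

whole = slice_one_based(seq, 5, 20)   # the full repeat tract
seed = slice_one_based(seq, 7, 12)    # a shorter stretch inside the tract
print(whole, seed)
```

Under the inclusive-end assumption, the length of a tract equals end - start + 1, which gives a quick consistency check against the length attribute.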