API Reference
pytrf.version
- pytrf.version()
Get the current version of pytrf
- Returns:
version
- Return type:
str
pytrf.STRFinder
- class pytrf.STRFinder(chrom, seq, mono=12, di=7, tri=5, tetra=4, penta=4, hexa=4)
Find all exact (perfect) short tandem repeats (STRs), also called simple sequence repeats (SSRs) or microsatellites, in the input sequence that meet the minimum repeat number for their motif length
- Parameters:
chrom (str) – the sequence name
seq (str) – the input DNA sequence
mono (int) – the minimum tandem repeats for mono-nucleotide repeats
di (int) – the minimum tandem repeats for di-nucleotide repeats
tri (int) – the minimum tandem repeats for tri-nucleotide repeats
tetra (int) – the minimum tandem repeats for tetra-nucleotide repeats
penta (int) – the minimum tandem repeats for penta-nucleotide repeats
hexa (int) – the minimum tandem repeats for hexa-nucleotide repeats
- Returns:
STRFinder object
- as_list()
Return all SSRs as a list. Each SSR has 7 columns: [sequence name, start position, end position, motif sequence, motif length, repeats, SSR length]
- Returns:
all SSRs found
- Return type:
list
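To make the per-motif thresholds and the 7-column layout concrete, here is a minimal pure-Python sketch of a perfect-repeat scan. It does not import pytrf and is not pytrf's implementation; the helper names `find_perfect_strs` and `is_primitive` are hypothetical, and the one-based inclusive coordinates are assumed from the ETR attribute descriptions.

```python
# Default thresholds copied from the STRFinder signature, keyed by motif length.
DEFAULT_MIN_REPEATS = {1: 12, 2: 7, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    return motif not in (motif + motif)[1:-1]


def find_perfect_strs(chrom, seq, min_repeats=None):
    """Naive left-to-right scan for perfect tandem repeats (illustration only)."""
    min_repeats = DEFAULT_MIN_REPEATS if min_repeats is None else min_repeats
    rows = []
    i, n = 0, len(seq)
    while i < n:
        hit = None
        for k in sorted(min_repeats):
            motif = seq[i:i + k]
            if len(motif) < k or not is_primitive(motif):
                continue
            # Extend while the sequence keeps repeating with period k.
            j = i + k
            while j < n and seq[j] == seq[j - k]:
                j += 1
            repeats = (j - i) // k
            if repeats >= min_repeats[k]:
                hit = (motif, k, repeats)
                break
        if hit is None:
            i += 1
            continue
        motif, k, repeats = hit
        length = k * repeats
        # One-based, inclusive coordinates (assumed), in the 7-column order.
        rows.append([chrom, i + 1, i + length, motif, k, repeats, length])
        i += length
    return rows


rows = find_perfect_strs("chr1", "GGG" + "AC" * 7 + "TTT")
print(rows)  # [['chr1', 4, 17, 'AC', 2, 7, 14]]
```

The flanking GGG and TTT runs are not reported because three mono-nucleotide repeats fall below the default of 12, while seven AC repeats exactly meet the di-nucleotide default of 7.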
pytrf.GTRFinder
- class pytrf.GTRFinder(chrom, seq, min_motif=1, max_motif=100, min_repeat=3, min_length=10)
Find all exact (perfect) generic tandem repeats (GTRs) in the input sequence that meet both the minimum repeat number and the minimum length
- Parameters:
chrom (str) – the sequence name
seq (str) – the input DNA sequence
min_motif (int) – minimum length of motif sequence
max_motif (int) – maximum length of motif sequence
min_repeat (int) – minimum number of tandem repeats
min_length (int) – minimum length of tandem repeats
- Returns:
GTRFinder object
- as_list()
Return all GTRs as a list. Each GTR has 7 columns: [sequence name, start position, end position, motif sequence, motif length, repeats, GTR length]
- Returns:
all GTRs found
- Return type:
list
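The way the GTRFinder thresholds combine can be sketched as a simple predicate. This is an illustration only: `passes_gtr_filters` is a hypothetical helper, not part of pytrf, and it assumes that every threshold is inclusive and that all of them must hold at once.

```python
def passes_gtr_filters(motif_len, repeats,
                       min_motif=1, max_motif=100, min_repeat=3, min_length=10):
    """Check a candidate repeat against the GTRFinder default thresholds."""
    total_length = motif_len * repeats  # a perfect repeat has no partial copies
    return (min_motif <= motif_len <= max_motif
            and repeats >= min_repeat
            and total_length >= min_length)


print(passes_gtr_filters(2, 3))  # False: 6 bases is shorter than min_length
print(passes_gtr_filters(2, 5))  # True: 5 repeats and 10 bases
print(passes_gtr_filters(5, 2))  # False: only 2 repeats
```

Under these assumptions, short motifs are limited mainly by min_length and long motifs mainly by min_repeat.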
pytrf.ATRFinder
- class pytrf.ATRFinder(chrom, seq, min_motif_size=1, max_motif_size=6, min_seed_repeat=3, min_seed_length=10, max_consecutive_error=3, min_extend_identity=70, max_extend_length=2000)
Find all approximate or imperfect tandem repeats (ATRs) from the input sequence
- Parameters:
chrom (str) – the sequence name
seq (str) – the input DNA sequence
min_motif_size (int) – minimum length of motif
max_motif_size (int) – maximum length of motif
min_seed_repeat (int) – minimum number of repeats for the seed
min_seed_length (int) – minimum length of seed
max_consecutive_error (int) – maximum number of allowed consecutive aligned errors
min_extend_identity (float) – minimum identity of extended alignment, as a percentage (0~100), consistent with the default of 70
max_extend_length (int) – maximum length allowed to extend
- Returns:
ATRFinder object
- as_list()
Return all ATRs as a list. Each ATR has 15 columns: [sequence name, seed start position, seed end position, motif sequence, motif length, seed repeat, ATR start position, ATR end position, ATR repeat, ATR length, extend matches, extend substitutions, extend insertions, extend deletions, extend identity]
- Returns:
all ATRs found
- Return type:
list
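Because each ATR row is long, it can help to label the columns. The sketch below pairs the column order listed for as_list() with the attribute names of the ATR class. The pairing itself is an assumption, and the row values are made-up placeholders, not real pytrf output.

```python
# Column order as listed for ATRFinder.as_list(); the names are borrowed from
# the ATR attributes, and pairing them this way is an assumption.
ATR_COLUMNS = [
    "chrom", "seed_start", "seed_end", "motif", "type", "seed_repeat",
    "start", "end", "repeat", "length",
    "matches", "substitutions", "insertions", "deletions", "identity",
]

# Placeholder row for illustration only (not produced by pytrf).
row = ["chr1", 11, 20, "AC", 2, 5, 5, 30, 13, 26, 24, 1, 0, 1, 92.3]

record = dict(zip(ATR_COLUMNS, row))
print(record["motif"], record["seed_repeat"], record["identity"])
```

Labelled records like this are easier to filter than positional lists, for example by keeping only rows whose identity exceeds a chosen cutoff.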
pytrf.ETR
- class pytrf.ETR
Read-only exact tandem repeat (ETR) object, generated by iterating over an STRFinder or GTRFinder object
- chrom
name of the chromosome or sequence on which the ETR is located
- start
one-based start position of the ETR on the sequence
- end
one-based end position of the ETR on the sequence
- motif
motif sequence
- type
motif length
- repeat
number of repeats
- length
length of ETR
- seq
get the sequence of ETR
- as_list()
convert ETR object to a list
- as_dict()
convert ETR object to a dict
- as_gff(terminator='')
convert ETR object to a GFF-formatted string
- as_string(separator='\t', terminator='')
convert ETR object to a TSV or CSV string by using separator and terminator
- Parameters:
separator (str) – a separator between columns
terminator (str) – a terminator added to the end of string
- Returns:
a formatted string
- Return type:
str
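The effect of the separator and terminator arguments can be sketched with a stand-in function. This is not pytrf code: `format_row` is a hypothetical helper, and it assumes the fields are joined in the same 7-column order used by as_list().

```python
def format_row(fields, separator="\t", terminator=""):
    """Join fields with a separator and append a terminator (stand-in only)."""
    return separator.join(str(field) for field in fields) + terminator


# Placeholder ETR fields: chrom, start, end, motif, type, repeat, length.
etr_fields = ["chr1", 4, 17, "AC", 2, 7, 14]

tsv_line = format_row(etr_fields)                                  # defaults
csv_line = format_row(etr_fields, separator=",", terminator="\n")  # CSV row
print(repr(tsv_line))
print(repr(csv_line))
```

Passing a newline as the terminator lets each formatted row be written straight to a file without further concatenation.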
pytrf.ATR
- class pytrf.ATR
Read-only imperfect (approximate) tandem repeat (ATR) object, generated by iterating over an ATRFinder object
- chrom
name of the chromosome or sequence on which the ATR is located
- start
one-based start position of the ATR on the sequence
- end
one-based end position of the ATR on the sequence
- seed_start
start position of seed
- seed_end
end position of seed
- seed_repeat
repeat number of seed
- motif
motif sequence
- type
motif length
- repeat
repeat number of perfect counterpart
- length
length of ATR
- matches
number of matches for extend
- substitutions
number of substitutions for extend
- insertions
number of insertions for extend
- deletions
number of deletions for extend
- identity
extend identity
- seq
get the sequence of ATR
- as_list()
convert ATR object to a list
- as_dict()
convert ATR object to a dict
- as_gff(terminator='')
convert ATR object to a GFF-formatted string
- as_string(separator='\t', terminator='')
convert ATR object to a TSV or CSV string by using separator and terminator
- Parameters:
separator (str) – a separator between columns
terminator (str) – a terminator added to the end of string
- Returns:
a formatted string
- Return type:
str