sugar.index.fastaindex module

FASTA indexer

class sugar.index.fastaindex.FastaBinarySearchFile

Bases: object

Custom BinarySearchFile used to store the index

headerstart = b'SugarFASTAindex v0.1.0, sugar v1.1.1-dev\n'
magic = b'\xfe\x8a\x01\x01'
record = (0, 50, 50, 50)
class sugar.index.fastaindex.FastaIndex(dbname=None, create=False, path='{dbpath}', mode=None)

Bases: object

Index FASTA files and query (sub-) sequences

Parameters:
  • dbname (str) – Name of index file, default: last used index file

  • create (bool) – Create a new index file, default False

  • path (str) – Common path for FASTA file names. With the default '{dbpath}', file names are relative to the path of the index file. Only needed for a new index file.

  • mode (str) – 'binary' uses a binary search file, 'db' uses a database via Python’s dbm module. Only needed for a new index file.
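
The constructor parameters above can be combined as in the following minimal sketch. The helper name open_or_create_index is hypothetical and not part of sugar. Passing create=True is an assumption based on the create=False default in the signature. The import is deferred so the function can be defined without sugar installed.

```python
def open_or_create_index(dbname, create=False, mode="binary"):
    """Open an existing FASTA index, or create a new one when create is True.

    Hypothetical helper for illustration; not part of sugar itself.
    """
    # Deferred import: sugar is only required when the helper is called.
    from sugar.index.fastaindex import FastaIndex

    if create:
        # According to the parameter list, path and mode only matter for a
        # new index file. path is left at its default '{dbpath}'.
        # Assumption: create accepts a boolean flag.
        return FastaIndex(dbname, create=True, mode=mode)
    # For an existing index file only the name is passed.
    return FastaIndex(dbname)
```

A call such as open_or_create_index('genomes.idx', create=True, mode='db') would then request a dbm-backed index; the file name is only a placeholder.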

add(fnameexpr, seek=None, force=False, silent=False)
get_basket(seqids)

Return BioBasket with all corresponding sequences

See iter() for documentation of seqids argument.

get_fasta(seqids)

Return FASTA string with all sequences

See iter() for documentation of seqids argument.

get_fastaheader(seqids)

Return FASTA header from all sequences

See iter() for documentation of seqids argument.

get_seq(seqid)

Return a BioSeq given its id

See iter() for documentation of seqid argument.

iter(seqids)

Yield BioSeq sequences from given sequence ids

Parameters:

seqids – List of sequence ids. May also be a single seqid, a 3-tuple (seqid, start, stop) with start and stop indices (either may be None), or a list of such 3-tuples.
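
The accepted forms of seqids can be illustrated by reducing them to one shape. The following is a minimal sketch, not sugar's own implementation: normalize_seqids is a hypothetical helper, and it assumes sequence ids are strings.

```python
def normalize_seqids(seqids):
    """Return a list of (seqid, start, stop) tuples for any accepted form.

    Hypothetical helper for illustration; assumes sequence ids are strings.
    """
    def is_region(item):
        # A region is a 3-tuple (seqid, start, stop); start or stop may be None.
        # Note: a tuple of exactly three plain ids would also match this test.
        return isinstance(item, tuple) and len(item) == 3

    if isinstance(seqids, str):
        # Single sequence id: select the whole sequence.
        return [(seqids, None, None)]
    if is_region(seqids):
        # Single region.
        return [seqids]
    # List of ids and/or regions: expand plain ids to whole-sequence regions.
    return [item if is_region(item) else (item, None, None) for item in seqids]
```

For example, normalize_seqids(['chr1', ('chr2', 0, 5)]) yields one whole-sequence entry for 'chr1' and one region entry for 'chr2'. How start and stop are interpreted (for instance 0-based or 1-based) is not stated here and is left open.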

iter_fasta(seqids)

Yield FASTA strings from given sequence ids

See iter() for documentation of seqids argument.

iter_fastaheader(seqids)

Yield FASTA headers from given sequence ids

See iter() for documentation of seqids argument.

property dbtype
property totalsize