cogent3.core.alphabet.CharAlphabet

class CharAlphabet(chars: TStrOrBytes | PySeq[TStrOrBytes], gap: TStrOrBytes | None = None, missing: TStrOrBytes | None = None)

Represents fundamental monomer character sets.

Attributes:
gap_char
gap_index
missing_char
missing_index
moltype
motif_len
num_canonical

Methods

array_to_bytes(seq)

returns seq as a byte string

as_bytes()

returns self as a byte string

convert_seq_array_to(*, alphabet, seq[, ...])

converts a numpy array of indices from self to the corresponding indices in another alphabet

count(value, /)

Return number of occurrences of value.

from_indices(seq)

returns a string from a sequence of indices

from_rich_dict(data)

returns an instance from a serialised dictionary

get_kmer_alphabet(k[, include_gap])

returns kmer alphabet with words of size k

get_subset(motif_subset[, excluded])

Returns a new Alphabet object containing a subset of motifs in self.

index(value[, start, stop])

Return first index of value.

is_valid(seq)

returns True if seq is valid for this alphabet

to_indices(seq[, validate])

returns a sequence of indices for the characters in seq

to_json()

returns a serialisable string

to_rich_dict([for_pickle])

returns a serialisable dictionary

with_gap_motif([gap_char, missing_char, ...])

returns new monomer alphabet with gap and missing characters added
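
As an illustration of what a k-mer alphabet "with words of size k" contains (see get_kmer_alphabet above), the following is a minimal standard-library sketch. It does not call cogent3. The function names `sketch_kmer_words` and `sketch_kmer_index`, the character order "TCAG", and the word ordering are all assumptions made for the example; the ordering cogent3 actually uses should be checked against the library.

```python
from itertools import product


def sketch_kmer_words(chars: str, k: int) -> tuple[str, ...]:
    """Enumerate every word of length k over chars (hypothetical helper).

    Words are generated in the order given by itertools.product, i.e. the
    last position varies fastest. This ordering is an assumption.
    """
    return tuple("".join(p) for p in product(chars, repeat=k))


def sketch_kmer_index(kmer: str, chars: str) -> int:
    """Compute a word's position in the enumeration above (hypothetical helper).

    Treats the word as a base-len(chars) number whose digits are the
    positions of its characters in chars.
    """
    n = len(chars)
    index = 0
    for c in kmer:
        index = index * n + chars.index(c)
    return index


words = sketch_kmer_words("TCAG", 2)  # 4**2 = 16 dinucleotide words
position = sketch_kmer_index("CA", "TCAG")
```

Under these assumptions, a monomer alphabet of n characters yields n**k words, and a word's index can be computed arithmetically without searching the word list.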

Notes

Provides methods for efficient conversion between characters and integers from fundamental types of strings, bytes and numpy arrays.
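
The character-to-integer conversion described above can be pictured with a minimal standard-library sketch. This is not cogent3's implementation: the names `sketch_to_indices`, `sketch_from_indices` and `sketch_is_valid`, the character order b"TCAG", and the use of a sentinel value for unknown bytes are assumptions chosen for illustration only.

```python
# Hypothetical stand-in for an alphabet; the character order is an assumption.
chars = b"TCAG"

# 256-entry lookup table: byte value -> index in chars.
# Bytes that are not in the alphabet map to the sentinel len(chars).
_table = bytearray([len(chars)] * 256)
for i, c in enumerate(chars):
    _table[c] = i
encode_table = bytes(_table)

# Reverse table: index -> character byte.
decode_table = bytes.maketrans(bytes(range(len(chars))), chars)


def sketch_to_indices(seq: bytes) -> bytes:
    """Map each character byte to its index in chars."""
    return seq.translate(encode_table)


def sketch_from_indices(indices: bytes) -> bytes:
    """Map each index back to its character byte."""
    return indices.translate(decode_table)


def sketch_is_valid(seq: bytes) -> bool:
    """True if every byte of seq is a member of chars."""
    return len(chars) not in sketch_to_indices(seq)


indices = sketch_to_indices(b"GATTACA")
round_trip = sketch_from_indices(indices)
```

A table lookup such as `bytes.translate` converts a whole sequence in a single call rather than one character at a time, which is one common way to make this kind of conversion cheap.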