Core

Function text_to_dna

Encodes a text string into a DNA string made up of the DNA bases A, C, G, and T.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| text | str | a string | required |
| schema | str | the conversion schema to use (default: AGCT) | 'AGCT' |
| binary_string_length | int | the length of the binary-string during conversion (use 8 for strategy='ascii' and 24 for strategy='utf-8') | 8 |
| strategy | str | the text-encoding/decoding strategy to use (default: ascii) (options: ascii, utf-8) | 'ascii' |
| converter | Optional[Converter] | optionally you can provide a converter (genespeak.converter.Converter) | None |
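
The recommended values of binary_string_length (8 for ascii, 24 for utf-8) can be motivated with a little arithmetic. The sketch below assumes each character is stored as the zero-padded binary form of its code point, which is what the chr(int(x, 2)) step in dna_to_text suggests. The names bits_needed and BASES_PER_CHAR are illustrative and not part of genespeak.

```python
# Largest code point each strategy has to represent.
MAX_ASCII = 0x7F        # 127, the last ASCII character
MAX_UNICODE = 0x10FFFF  # the highest valid Unicode code point


def bits_needed(value: int) -> int:
    """Return how many binary digits are needed to write `value`."""
    return max(1, value.bit_length())


print(bits_needed(MAX_ASCII))    # 7, so a width of 8 is enough
print(bits_needed(MAX_UNICODE))  # 21, so a width of 24 is enough

# Each DNA base carries 2 bits, so one character costs width // 2 bases.
# (Assumption: fixed-width encoding, one code point per character.)
BASES_PER_CHAR = {"ascii": 8 // 2, "utf-8": 24 // 2}
print(BASES_PER_CHAR)
```

Under these assumptions an ASCII character takes 4 bases and a utf-8 character takes 12, so a mismatched binary_string_length would split the bit stream at the wrong boundaries.
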
Source code in genespeak/core.py
def text_to_dna(
    text: str,
    schema: str = "AGCT",
    binary_string_length: int = 8,
    strategy: str = "ascii",
    converter: Optional[Converter] = None,
) -> str:
    """Encodes text string into DNA string with dna-bases (``A``, ``C``, ``G``, ``T``).

    Arguments:
        text: a string
        schema: the conversion schema to use (default: ``AGCT``)
        binary_string_length: the length of the binary-string during conversion
                              (use ``8`` for ``strategy='ascii'`` and ``24`` for ``strategy='utf-8'``)
        strategy: the text-encoding/decoding strategy to use (default: ``ascii``)
                  (options: ``ascii``, ``utf-8``)
        converter: optionally you can provide a converter (``genespeak.converter.Converter``)
    """

    if (converter is None) or (not isinstance(converter, Converter)):
        converter = Converter(
            schema=schema,
            binary_string_length=binary_string_length,
            strategy=strategy,
        )
    dnabase_chr: List[str] = []
    text_encoded_bin2N = converter.convert_info_to_2Nbit_binary(
        text, length2N=converter.strategy.binary_string_length
    )
    dnabase_chr, _ = converter.convert_to_dnabase(
        text_encoded_bin2N, strategy=converter.strategy.name
    )
    text_as_dnabase = "".join(dnabase_chr)
    return text_as_dnabase
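
The encoding idea can be illustrated without the library. The following is a minimal, self-contained sketch and not the genespeak code path: it assumes that the schema letters map, in order, to the 2-bit values 00, 01, 10 and 11, and that each character is written as a fixed-width binary code point. The actual mapping inside genespeak.converter.Converter may differ, and sketch_text_to_dna is a hypothetical name.

```python
def sketch_text_to_dna(text: str, schema: str = "AGCT", length: int = 8) -> str:
    """Illustrative re-implementation; NOT the genespeak implementation."""
    if len(schema) != 4 or len(set(schema)) != 4:
        raise ValueError("schema must contain four distinct bases")
    if length <= 0 or length % 2:
        raise ValueError("length must be a positive even number")

    # Assumption: the base at schema position i encodes the 2-bit value i.
    pair_to_base = {format(i, "02b"): base for i, base in enumerate(schema)}

    bases = []
    for ch in text:
        code = ord(ch)
        if code >= 2 ** length:
            raise ValueError(f"{ch!r} does not fit in {length} bits")
        bits = format(code, f"0{length}b")  # zero-padded binary string
        # Walk the bit string two digits at a time and look up each base.
        bases.extend(pair_to_base[bits[i:i + 2]] for i in range(0, length, 2))
    return "".join(bases)


# Under the assumed mapping, "A" (65 = 01000001) becomes 01 00 00 01.
print(sketch_text_to_dna("A"))   # GAAG
print(sketch_text_to_dna("Hi"))  # GACAGCCG
```

With the real library, the corresponding call would be text_to_dna("Hi") with the defaults listed above. Its output depends on the library's own schema mapping, so it is not reproduced here.
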

Function dna_to_text

Decodes a valid encoded DNA string back into the equivalent text string.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| dna | str | a string of dna-base | required |
| schema | str | the conversion schema to use (default: AGCT) | 'AGCT' |
| binary_string_length | int | the length of the binary-string during conversion (use 8 for strategy='ascii' and 24 for strategy='utf-8') | 8 |
| strategy | str | the text-encoding/decoding strategy to use (default: ascii) (options: ascii, utf-8) | 'ascii' |
| converter | Optional[Converter] | optionally you can provide a converter (genespeak.converter.Converter) | None |
Source code in genespeak/core.py
def dna_to_text(
    dna: str,
    schema: str = "AGCT",
    binary_string_length: int = 8,
    strategy: str = "ascii",
    converter: Optional[Converter] = None,
) -> str:
    """Decodes valid encoded DNA string back into the equivalent text string.

    Arguments:
        dna: a string of dna-base
        schema: the conversion schema to use (default: ``AGCT``)
        binary_string_length: the length of the binary-string during conversion
                              (use ``8`` for ``strategy='ascii'`` and ``24`` for ``strategy='utf-8'``)
        strategy: the text-encoding/decoding strategy to use (default: ``ascii``)
                  (options: ``ascii``, ``utf-8``)
        converter: optionally you can provide a converter (``genespeak.converter.Converter``)
    """

    converter = set_converter(
        schema=schema,
        binary_string_length=binary_string_length,
        strategy=strategy,
        converter=converter,
    )

    dnabase_as_bin = converter.encoder.chr2bin
    dnabase_as_bin2 = [dnabase_as_bin.get(x_chr) for x_chr in dna]
    length2N = converter.strategy.binary_string_length
    dnabase_as_bin2N = converter.get_bin2_to_bin2N("".join(dnabase_as_bin2), length2N=length2N)  # type: ignore
    # num_digits = 2 # length of any('00', '01', '10', '11')
    # binary to decimal conversion: int(x, 2)
    dnabase_as_chr = [chr(int(x, 2)) for x in dnabase_as_bin2N]
    dnabase_as_text = "".join(dnabase_as_chr)
    return dnabase_as_text
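
The decoding steps in the listing (base to 2-bit pair, regroup into fixed-width chunks, chunk to character) can be sketched on their own. This is a hedged illustration, not the library code: it assumes the schema letters map, in order, to 00, 01, 10 and 11, and sketch_dna_to_text is a hypothetical name. The listing looks up bases with .get, which for a plain dict returns None on an unknown base; the sketch raises an explicit error instead.

```python
def sketch_dna_to_text(dna: str, schema: str = "AGCT", length: int = 8) -> str:
    """Illustrative re-implementation; NOT the genespeak implementation."""
    # Assumption: the base at schema position i decodes to the 2-bit value i.
    base_to_pair = {base: format(i, "02b") for i, base in enumerate(schema)}

    unknown = set(dna) - set(base_to_pair)
    if unknown:
        raise ValueError(f"bases not in schema: {sorted(unknown)}")

    # Step 1: every base becomes two binary digits.
    bits = "".join(base_to_pair[base] for base in dna)
    if len(bits) % length:
        raise ValueError("DNA string does not hold a whole number of characters")

    # Step 2: cut the bit string into fixed-width chunks.
    chunks = [bits[i:i + length] for i in range(0, len(bits), length)]

    # Step 3: binary to decimal with int(x, 2), then code point to character.
    return "".join(chr(int(chunk, 2)) for chunk in chunks)


print(sketch_dna_to_text("GACAGCCG"))  # Hi
```

Decoding only recovers the original text when schema and binary_string_length match the values used for encoding, which is why both functions expose the same parameters.
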

Last update: 2022-11-20