Generates sequences containing the provided motifs and constructs a k-mer representation table from the generated data.

generate_kmer_data(
  n_seq,
  sequence_length,
  alphabet,
  motifs,
  n_injections,
  fraction = 0.5,
  seqProbs = NULL,
  n = 4,
  d = 6
)

Arguments

n_seq

number of sequences to be generated

sequence_length

length of each generated sequence

alphabet

elements used to build the sequences

motifs

list of motifs to be injected into positive sequences

n_injections

maximum number of motifs injected into each positive sequence (between 1 and n_injections motifs are injected)

fraction

fraction of positive sequences

seqProbs

probabilities of the alphabet elements used when generating sequences

n

maximum number of alphabet elements in an n-gram

d

maximum number of gaps in an n-gram
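
To make the roles of fraction, n_injections and seqProbs more concrete, here is a minimal base R sketch of how such arguments could drive sequence generation. It is not the implementation of generate_kmer_data: the rounding of the number of positive sequences, the uniform choice of the injection count, and the helper names background, n_positive and injections_per_seq are all assumptions made for illustration.

```r
# Illustrative sketch in base R only; NOT the implementation of generate_kmer_data.
set.seed(1)

n_seq <- 10
sequence_length <- 8
alphabet <- letters[1:4]
fraction <- 0.5
n_injections <- 3
seqProbs <- c(0.4, 0.3, 0.2, 0.1)  # assumed: one probability per alphabet element

# Background sequences: one row per sequence, elements drawn according to seqProbs
background <- matrix(
  sample(alphabet, n_seq * sequence_length, replace = TRUE, prob = seqProbs),
  nrow = n_seq
)

# Number of positive sequences implied by `fraction` (rounding is an assumption)
n_positive <- round(fraction * n_seq)

# For each positive sequence, draw how many motifs to inject (from 1 to n_injections)
injections_per_seq <- sample.int(n_injections, n_positive, replace = TRUE)
```

With these settings, background has one row per sequence and one column per position, and every entry of injections_per_seq lies between 1 and n_injections, which matches the description of n_injections above.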

Value

generated sequences

Examples

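# 20 sequences of length 20 over a four-letter alphabet
n_seq <- 20
sequence_length <- 20
alph <- letters[1:4]
# motifs to inject, built over the same alphabet
motifs <- generate_motifs(alph, 3, 3, 4, 6)
# inject at most one motif into each positive sequence (n_injections = 1)
results <- generate_kmer_data(n_seq, sequence_length, alph, motifs, 1)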