get_target_interactions.Rd
This function samples the target variable according to a logistic model with interactions.
get_target_interactions(kmer_dat, zero_weight = NULL, binary = TRUE)
output of generate_kmer_data
a single value denoting the weight of the no-motifs case. If NULL, the weight is sampled from the uniform distribution on the [-2, -1] interval. Defaults to NULL.
logical, indicating whether the produced target variable should be binary or continuous. Defaults to TRUE.
a binary vector of the target variable, sampled based on the interaction model and the provided/calculated probabilities.
The approach is based on logistic regression with interactions, in which the effect of one predictor depends on the value of another predictor. Define the maximum number of motifs per sequence as \(k = \max\lbrace k_i, i = 1, \ldots, n\rbrace\), and let \(w_{1}, \ldots, w_{k}\) denote the weights of the single effects. The model is:
\(g(EY) = w_0 + \sum_{i = 1}^{k} w_{i} X_{m_i} + \left(\sum_{i = 1}^{k-1}\sum_{j = i + 1}^{k} w_{ij} X_{m_i}X_{m_j}\right) + \ldots + w_{1\ldots k} X_{m_1}\ldots X_{m_k}\)
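As an illustration, for \(k = 2\) the sums above collapse to two single effects and one pairwise interaction (the highest-order term coincides with the pairwise one):
\(g(EY) = w_0 + w_{1} X_{m_1} + w_{2} X_{m_2} + w_{12} X_{m_1}X_{m_2}\)
If the \(X_{m_i}\) are read as 0/1 indicators of motif presence, which is an assumption made here and not stated in this page, a sequence containing both motifs has linear predictor \(w_0 + w_1 + w_2 + w_{12}\), while a sequence containing neither has \(w_0\).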
In the case when probs is NULL, we calculate the probabilities based on the formula \(\exp(x_i)/(1 + \exp(x_i))\), where \(x_i\) denotes the number of motifs in the \(i\)th sequence.
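The probability formula above can be sketched in a few lines of base R. This is a minimal illustration and not the package's internal code: the vector n_motifs and its values are hypothetical, and the Bernoulli draw with rbinom is an assumption about how a binary target could be sampled from such probabilities.

```r
# Hypothetical motif counts for five sequences (illustrative values only)
n_motifs <- c(0, 1, 2, 0, 3)

# Logistic transform exp(x) / (1 + exp(x)); stats::plogis(x) gives the same value
probs <- exp(n_motifs) / (1 + exp(n_motifs))

# Assumed sampling step: one Bernoulli draw per sequence
set.seed(1)
target <- rbinom(length(n_motifs), size = 1, prob = probs)
```

A sequence with no motifs gets probability 0.5 under this formula, and the probability approaches 1 as the motif count grows.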
n_seq <- 20
sequence_length <- 20
alph <- letters[1:4]
motifs <- generate_motifs(alph, 4, 4, 4, 6)
results <- generate_kmer_data(n_seq, sequence_length, alph,
                              motifs, n_injections = 4)
get_target_interactions(results)
#> [1] 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0