get_target_logic.Rd
This function samples a target variable according to the logic regression model, which assumes that the occurrence of certain combinations of motifs affects the feature. In logic models, simulating a binary variable involves defining logical conditions that determine the variable's value based on motifs: the binary variable takes the value 1 if the logical criteria are met, and 0 otherwise.
get_target_logic(
kmer_dat,
random = TRUE,
zero_weight = NULL,
weights = NULL,
n_exp = NULL,
max_exp_depth = NULL,
expressions = NULL,
binary = TRUE
)
kmer_dat: the output of generate_kmer_data.
random: a logical indicating whether the expressions should be generated randomly. Defaults to TRUE.
zero_weight: a single value denoting the weight of the no-motifs case. If NULL, the weight is sampled from the uniform distribution on the [-2, -1] interval. Defaults to NULL.
weights: a vector of weights for the logic expressions built from the available motifs. Its length should equal the number of expressions to use, n_exp. If weights is NULL, the weights are sampled from the uniform distribution on the [0, 1] interval. The probability of success for target sampling is calculated from the formula given in the details section. Defaults to NULL.
n_exp: the number of random logic expressions to create. It is used only when random equals TRUE.
max_exp_depth: the maximum number of motifs used in a logic expression. Defaults to 3.
expressions: a matrix of binary variables corresponding to custom logic expressions. You can create them based on motifs. Its dimensions should be consistent with the length of the weights vector, if one is provided. Defaults to NULL. If NULL, random logic expressions are created.
binary: a logical indicating whether the produced target variable should be binary or continuous.
Here, we consider new variables \(L_1, \ldots, L_l\), each of which is a logic expression based on a subset of the motifs \(m_1, \ldots, m_m\). For example,
\(L_1(m_1, m_2, m_3) = (X_{m_1} \land X_{m_2}) \lor X_{m_3}.\)
Each variable \(L_i\) receives its own weight in the model. The model is as follows:
\(g(EY) = w_0 + \sum_{i = 1}^{l} w_i L_i.\)
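The model above can be illustrated with a minimal base R sketch. This is not the package's implementation: the motif matrix X, the second expression L2, and the choice of the logit link for \(g\) are all assumptions made for illustration. The weight ranges follow the defaults described for zero_weight and weights.

```r
set.seed(1)

# Hypothetical binary motif-presence matrix: 10 sequences x 3 motifs.
X <- matrix(rbinom(30, size = 1, prob = 0.5), nrow = 10,
            dimnames = list(NULL, c("m1", "m2", "m3")))

# Logic expressions built from the motif indicators.
L1 <- (X[, "m1"] & X[, "m2"]) | X[, "m3"]  # (X_m1 AND X_m2) OR X_m3
L2 <- X[, "m1"] & !X[, "m3"]               # made-up second expression
expressions <- cbind(L1 = as.integer(L1), L2 = as.integer(L2))

# Weights drawn as described for the NULL defaults.
w0 <- runif(1, min = -2, max = -1)                 # no-motifs weight
w  <- runif(ncol(expressions), min = 0, max = 1)   # one weight per expression

# Linear predictor: w_0 + sum_i w_i * L_i.
eta <- as.vector(w0 + expressions %*% w)

# Assumption: g is the logit link, so P(Y = 1) = plogis(eta).
p <- plogis(eta)
y <- rbinom(length(p), size = 1, prob = p)
```

Each row of expressions plays the role of \((L_1, L_2)\) for one sequence, and y is a binary target drawn with the resulting success probabilities.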
n_seq <- 20
sequence_length <- 20
alph <- letters[1:4]
motifs <- generate_motifs(alph, 4, 4, 4, 6)
results <- generate_kmer_data(n_seq, sequence_length, alph,
                              motifs, n_injections = 4)
get_target_logic(results)
#> [1] 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 1 0 0 0 0