This function inserts NAs into the provided metabolomic matrix according to the MCAR (Missing Completely At Random), MAR (Missing At Random) and MNAR (Missing Not At Random) mechanisms, using the provided probabilities.
Arguments
- mcar
a number from (0, 1) interval. Ratio of the data missing completely at random (MCAR).
- mar
a number from (0, 1) interval. Ratio of the data missing at random (MAR).
- mnar
a number from (0, 1) interval. Ratio of the data missing not at random (MNAR).
- thresh
a value from 0 to 1. Maximum allowed ratio of missing observations in a single column.
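One way to check whether a result respects a per-column limit such as thresh is to compute the share of missing values in each column. The following is a minimal base R sketch; max_col_missing is a hypothetical helper and not part of the package, and the limit of 0.5 is an arbitrary illustrative value.

```r
# Hypothetical helper (not a package function): largest per-column share of NAs.
max_col_missing <- function(dat) {
  max(colMeans(is.na(dat)))
}

# Toy data with known missing entries.
toy <- data.frame(a = c(1, NA, 3, 4), b = c(NA, NA, 3, 4), c = 1:4)

# Column b has 2 of 4 values missing, so the largest ratio is 0.5.
max_col_missing(toy)

# TRUE if no column exceeds the chosen limit (0.5 is arbitrary here).
max_col_missing(toy) <= 0.5
```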
Details
This function uses ampute to simulate data MCAR and MAR, and the insert_MNAR implementation to simulate data missing because of the limit of detection (LOD).
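To illustrate what missingness caused by a limit of detection can look like, the base R sketch below replaces the smallest values in each column with NA. It is only an illustration of the general idea: censor_below_lod is a hypothetical name, the quantile-based cutoff is an assumption, and none of this is claimed to match how insert_MNAR works internally.

```r
# Sketch of LOD-style missingness: in each column, values below a
# column-specific quantile are replaced with NA.
# `censor_below_lod` is a hypothetical helper, not the package's insert_MNAR.
censor_below_lod <- function(dat, ratio) {
  dat[] <- lapply(dat, function(col) {
    # Assumed cutoff: the `ratio` quantile of the observed values.
    lod <- quantile(col, probs = ratio, na.rm = TRUE)
    col[!is.na(col) & col < lod] <- NA
    col
  })
  dat
}

set.seed(1)
x <- as.data.frame(matrix(rnorm(200), ncol = 10))
x_lod <- censor_below_lod(x, ratio = 0.1)

# Share of censored values per column.
colMeans(is.na(x_lod))
```

Because only the lowest values are removed, the missingness depends on the unobserved values themselves, which is what distinguishes MNAR from MCAR and MAR.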
It's a wrapper for the following functions:
The sum of mcar + mar + mnar should not exceed 1; otherwise the function throws an error.
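The constraint on the three ratios can be checked before calling the function. The sketch below is a hypothetical validation helper written in base R; check_ratios and its error messages are illustrative assumptions and do not reproduce the package's own checks.

```r
# Hypothetical validation helper (not a package function).
check_ratios <- function(mcar = 0, mar = 0, mnar = 0) {
  ratios <- c(mcar = mcar, mar = mar, mnar = mnar)
  if (any(ratios < 0 | ratios > 1)) {
    stop("Each ratio must lie between 0 and 1.")
  }
  if (sum(ratios) > 1) {
    stop("The sum of the ratios must not exceed 1.")
  }
  invisible(TRUE)
}

# Passes silently: 0.05 + 0.01 + 0.05 is well below 1.
check_ratios(mcar = 0.05, mar = 0.01, mnar = 0.05)

# Fails: 0.5 + 0.4 + 0.3 exceeds 1; capture the message instead of stopping.
res <- tryCatch(
  check_ratios(mcar = 0.5, mar = 0.4, mnar = 0.3),
  error = function(e) conditionMessage(e)
)
res
```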
Note that the missing-data mechanisms are applied in the following order: MAR, MNAR, MCAR. Missing values introduced by different mechanisms may overlap, so the resulting missing ratio may be slightly smaller than requested.
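The effect of overlapping missing values can be seen in a small base R simulation. This sketch assumes three independent random masks with arbitrary ratios of 0.10 each; it does not reproduce how the package draws MAR or MNAR values, and only shows why the realized ratio cannot exceed the sum of the requested ones.

```r
set.seed(1)
n <- 10000

# Three independent masks; the 0.10 ratios are arbitrary illustrative values.
mask_mar  <- runif(n) < 0.10
mask_mnar <- runif(n) < 0.10
mask_mcar <- runif(n) < 0.10

# Apply the masks one after another, in the order MAR, MNAR, MCAR.
x <- rnorm(n)
x[mask_mar]  <- NA
x[mask_mnar] <- NA
x[mask_mcar] <- NA

# A cell hit by more than one mask is counted once in the result,
# so the realized ratio is at most the sum of the individual ratios.
requested <- mean(mask_mar) + mean(mask_mnar) + mean(mask_mcar)
realized  <- mean(is.na(x))
realized <= requested
```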
Examples
set.seed(1)
m <- as.data.frame(matrix(rnorm(200), ncol = 50))
simulate_miss_value(m, mcar = 0.05, mar = 0.01, mnar = 0.05)
#> Error in get_missing_per_column(dat, ratio = ratio, thresh = thresh): The total number of required missing values (10) is larger than the number of missing values allowed by threshold (0)