
This function inserts NAs into the provided metabolomic matrix following the MCAR (Missing Completely At Random), MAR (Missing At Random) and MNAR (Missing Not At Random) patterns, according to the provided probabilities.

Usage

simulate_miss_value(data_set, mcar = 0, mar = 0, mnar = 0, thresh = 0.2)

Arguments

mcar

a number from (0, 1) interval. Ratio of the data missing completely at random (MCAR).

mar

a number from (0, 1) interval. Ratio of the data missing at random (MAR).

mnar

a number from (0, 1) interval. Ratio of the data missing not at random (MNAR).

thresh

a value from 0 to 1: limit value indicating maximum ratio of missing observations in one column

Value

A matrix with NA values inserted.

Details

This function uses ampute to simulate data missing under MCAR and MAR, and the insert_MNAR implementation to simulate data missing because of the limit of detection (LOD).

It's a wrapper for the following functions:

The sum mcar + mar + mnar should not exceed 1; otherwise the function will throw an error.

Note that the missingness mechanisms are applied in the following order: MAR, MNAR, MCAR. Some of the missing values produced by different mechanisms may overlap, so the resulting missing ratio may be slightly smaller than the sum of the requested ratios.
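The overlap effect described above can be illustrated with a minimal base R sketch. It does not call the package and is not its implementation: positions are drawn uniformly at random for every mechanism (real MAR and MNAR patterns depend on the data), and the ratios and the number of cells are hypothetical values chosen for illustration.

```r
set.seed(1)

n_cells <- 1000  # hypothetical number of cells in the matrix

# Hypothetical ratios, listed in the documented order: MAR, MNAR, MCAR.
ratios <- c(mar = 0.10, mnar = 0.10, mcar = 0.10)

is_missing <- logical(n_cells)
for (mechanism in names(ratios)) {
  # Simplification: each mechanism picks cells uniformly at random and does
  # not check whether a cell has already been set to missing.
  picked <- sample(n_cells, size = round(ratios[[mechanism]] * n_cells))
  is_missing[picked] <- TRUE
}

requested_ratio <- sum(ratios)       # what was asked for in total
realized_ratio  <- mean(is_missing)  # what is actually missing

cat("requested:", requested_ratio, "realized:", realized_ratio, "\n")
```

Because a cell picked by two mechanisms is only counted once, the realized ratio in this sketch can never exceed the requested one.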

Examples

set.seed(1)
m <- as.data.frame(matrix(rnorm(200), ncol = 50))
simulate_miss_value(m, mcar = 0.05, mar = 0.01, mnar = 0.05)
#> Error in get_missing_per_column(dat, ratio = ratio, thresh = thresh): The total number of required missing values (10) is larger than the number of missing values allowed by threshold (0)
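The numbers in the error message above can be reproduced with simple arithmetic. The sketch below uses base R only and assumes that the per-column allowance is the threshold times the number of rows, rounded down; that rounding rule is an assumption inferred from the message, not taken from the package source.

```r
n_rows <- 4      # matrix(rnorm(200), ncol = 50) has 200 / 50 = 4 rows
n_cols <- 50
ratio  <- 0.05   # one of the 0.05 ratios requested in the example
thresh <- 0.2    # default value of the thresh argument

# 0.05 * 200 cells = 10, matching "(10)" in the error message.
required_missing <- round(ratio * n_rows * n_cols)

# Assumption: at most floor(thresh * n_rows) missing values per column.
# With 4 rows this is floor(0.8) = 0, matching "(0)" in the error message.
allowed_per_column <- floor(thresh * n_rows)
allowed_total      <- allowed_per_column * n_cols

cat("required:", required_missing, "allowed:", allowed_total, "\n")
```

Under this assumption, a matrix with so few rows leaves no room for missing values at the default threshold, so a taller matrix or a larger thresh would presumably be needed for the call to succeed.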