Missing data
From Knowledge Discovery
Data can be missing at random, in which case a good first-order approximation is to replace each missing value with the mean of that feature, computed over the observed values.
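As a minimal sketch of mean imputation, the following assumes the data is a list of rows with missing entries encoded as None. The function name impute_mean and the fallback of 0.0 for a column with no observed values are illustrative choices, not something this page prescribes.

```python
def impute_mean(rows):
    """Replace each None with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        # Assumption: a column with no observed values falls back to 0.0.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [
        [means[j] if row[j] is None else row[j] for j in range(n_cols)]
        for row in rows
    ]


data = [[1.0, 10.0], [None, 20.0], [3.0, None]]
filled = impute_mean(data)
print(filled)
```

In a train/test setting, the means would normally be computed on the training data only and then reused for the test data.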
Often, however, data is not missing at random. In this case, the usual approach is to add an indicator feature that marks the variable as missing, or to add one more category to each categorical variable (e.g. {red, green, blue, missing}).
Thus, in a regression setting, a feature x would be replaced with two features:
x_a = if (missing(x)) then 0 else x
x_b = if (missing(x)) then 1 else 0
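Both encodings can be sketched in a few lines of Python. This is an illustration under assumptions: missing values are encoded as None, and the helper names split_missing and one_hot_with_missing are hypothetical.

```python
def split_missing(xs):
    """Split a numeric feature into a zero-filled value and a missing indicator."""
    x_a = [0.0 if x is None else x for x in xs]
    x_b = [1 if x is None else 0 for x in xs]
    return x_a, x_b


def one_hot_with_missing(values, categories):
    """One-hot encode a categorical feature with an extra 'missing' level."""
    levels = list(categories) + ["missing"]
    encoded = []
    for v in values:
        key = "missing" if v is None else v
        # A value outside the known levels encodes as all zeros in this sketch.
        encoded.append([1 if key == level else 0 for level in levels])
    return levels, encoded


x_a, x_b = split_missing([2.5, None, 4.0])
print(x_a, x_b)

levels, encoded = one_hot_with_missing(
    ["red", None, "blue"], ["red", "green", "blue"]
)
print(levels)
print(encoded)
```

In a linear model, the coefficient on x_b acts as a learned offset for the rows where x is missing, so the model does not have to treat the filled-in 0 as a real measurement.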
