LogikBot · Applied Statistics


Jan 24 in

Many machine learning algorithms are sensitive to the range and distribution of attribute values in the input data. Outliers can skew and mislead the training process, resulting in longer training times, less accurate models, and ultimately poorer results. Even before a predictive model is trained, outliers can produce misleading representations, and in turn misleading interpretations, of the collected data. They distort descriptive statistics such as the mean and standard deviation, and in plots such as histograms and scatterplots they stretch the axes and compress the body of the data. If you're building machine learning models, you ALWAYS remove outliers. You want your model to find trends in the data, not spend its time chasing down outliers.
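To make the effect on the mean concrete, here is a minimal sketch of one common way to drop outliers: the 1.5 × IQR rule. The post does not say which method to use, so the rule, the function name `remove_outliers_iqr`, the multiplier `k`, and the sample numbers are all assumptions for illustration.

```python
import statistics


def remove_outliers_iqr(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper: the IQR rule is one common convention,
    not the only way to define an outlier.
    """
    # Quartiles of the data; "inclusive" treats the data as the full population.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Made-up sample: six similar readings and one extreme value.
data = [10, 12, 11, 13, 12, 11, 95]
cleaned = remove_outliers_iqr(data)

print("mean with outlier:   ", statistics.fmean(data))
print("mean without outlier:", statistics.fmean(cleaned))
```

In this made-up sample the single extreme value pulls the mean far above every other reading, which is the distortion described above. Whether the IQR rule, a z-score cutoff, or something else fits depends on your data.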

Mike West

Jan 22 in

Applied Statistics

Applied = Real-World

Learning statistics for machine learning isn't enough. You need to be able to apply your statistical knowledge to your data. Data sourcing and data cleansing are 80% of the work of a machine learning engineer. You might know what the mean is, but do you know what mean value imputation is and how to apply it to your data? Applied statistics means taking your statistical knowledge and applying it to your data.
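Mean value imputation means replacing each missing entry in a column with the mean of the values that were actually observed. A minimal sketch in plain Python follows; the helper names `is_missing` and `impute_mean` and the sample `ages` column are assumptions made up for illustration, and treating both `None` and NaN as "missing" is a choice of this sketch.

```python
import math
import statistics


def is_missing(v):
    """Treat None and float NaN as missing (an assumption of this sketch)."""
    return v is None or (isinstance(v, float) and math.isnan(v))


def impute_mean(values):
    """Return a copy of values with missing entries replaced by the observed mean."""
    observed = [v for v in values if not is_missing(v)]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = statistics.fmean(observed)
    return [fill if is_missing(v) else v for v in values]


# Hypothetical column with two missing entries.
ages = [25.0, None, 31.0, 40.0, None]
imputed = impute_mean(ages)
print(imputed)
```

The filled-in values sit exactly at the mean, so imputation leaves the column mean unchanged but shrinks its variance. That trade-off is worth keeping in mind when many entries are missing.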


LogikBot

skool.com/logikbot-1657

Real-World Machine Learning
