👉 Preprocessing, also called data preprocessing, is the process of transforming raw data so that it is suitable for use in machine learning algorithms. Data cleaning is one part of it rather than a synonym for it. Typical steps include handling missing values, normalizing numerical data, engineering features, and handling outliers.
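As a rough illustration of three of these steps (missing values, outliers, normalization), here is a minimal sketch in plain Python. The function names, the `raw_ages` sample, the median fill, the 1.5 × IQR clipping rule, and the min-max scaling are all assumptions chosen for the example; they are one reasonable option each, not the only way to do it.

```python
from statistics import median, quantiles


def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def clip_outliers(values, k=1.5):
    """Clip values to the range [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sample: one missing entry (None) and one implausible age (120).
raw_ages = [22, 25, None, 31, 28, 120, 27]

# Fill first so the quartiles are computed on a complete column,
# then clip, then scale so the outlier does not dominate the range.
cleaned = min_max_scale(clip_outliers(fill_missing(raw_ages)))
print(cleaned)
```

The order matters in this sketch: scaling before clipping would let the single extreme value compress every other value toward zero.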
Common preprocessing techniques used in data science include:
1. Data Cleaning: Removing or imputing missing values, dropping duplicate columns, correcting erroneous entries, and converting categorical variables to numerical ones.
2. Feature