Data Preprocessing MCQs

December 22, 2025 | November 18, 2024, by u930973931_answers

1. Which of the following is the first step in data preprocessing?
(A) Data transformation
(B) Data cleaning
(C) Data integration
(D) Data reduction

2. What is the primary purpose of data cleaning in preprocessing?
(A) To convert data into a more useful format
(B) To eliminate irrelevant features
(C) To identify and correct errors in the data
(D) To visualize data for better understanding

3. Which technique is used to handle missing values in a dataset?
(A) Imputation
(B) Normalization
(C) Standardization
(D) Feature extraction

4. Which of the following is an example of data transformation?
(A) Handling missing data
(B) Combining data from different sources
(C) Removing duplicate records
(D) Scaling or normalizing numerical values

5. In data preprocessing, what does normalization refer to?
(A) Removing duplicate records from the dataset
(B) Selecting a subset of relevant features for analysis
(C) Handling missing values by substituting with the mean
(D) Adjusting data values to a common scale without distorting differences

6. What is the main goal of data reduction in preprocessing?
(A) To reduce the size of the data without losing important information
(B) To improve the accuracy of data models
(C) To combine data from multiple sources
(D) To eliminate irrelevant features from the dataset

7. Which of the following is a technique for handling outliers during preprocessing?
(A) Z-score transformation
(B) Normalization
(C) Feature extraction
(D) Data augmentation

8. What is feature scaling?
(A) Creating new features from existing ones
(B) Adjusting feature values to a uniform range
(C) Reducing the number of features in the dataset
(D) Selecting important features for analysis

9. What is one disadvantage of removing missing values during preprocessing?
(A) It increases the dataset size
(B) It introduces bias in the data
(C) It causes the model to overfit
(D) It might lead to a loss of important data

10. Which method can be used for encoding categorical variables into numeric values?
(A) Data cleaning
(B) Feature selection
(C) Normalization
(D) One-hot encoding

11. Why is data transformation important in data preprocessing?
(A) It improves model performance by making data more consistent
(B) It helps in handling different data formats
(C) It increases the complexity of the data
(D) It removes all irrelevant data

12. Which of the following preprocessing techniques is used to handle categorical data?
(A) One-hot encoding
(B) Min-max scaling
(C) Standardization
(D) Principal component analysis (PCA)

13. What is the effect of data normalization on features with different units of measurement?
(A) It keeps the data in its original form without change
(B) It increases the variance of the data
(C) It removes units and adjusts all features to the same scale
(D) It performs feature selection based on importance

14. Which of the following is an example of data imputation?
(A) Removing rows with missing values
(B) Scaling data to a specific range
(C) Combining datasets from multiple sources
(D) Replacing missing values with the mean or median of the column

15. What is the primary challenge of feature extraction in preprocessing?
(A) Reducing the number of features too much
(B) Creating new features that represent the data well
(C) Scaling features to a uniform range
(D) Dealing with missing values
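
For readers who want to see some of the techniques named in the questions in action (mean imputation, min-max scaling, z-scores, and one-hot encoding), the following is a minimal sketch in plain Python. The function names are hypothetical and chosen only for this illustration; it shows one simple way to express each idea, not a reference implementation, and real projects would more likely rely on a data-processing library.

```python
from statistics import mean, pstdev


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_scores(values):
    """Express each value as its distance from the mean in standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def one_hot(labels):
    """Encode each label as a 0/1 vector, one position per distinct category."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]
```

With these definitions, impute_mean([1.0, None, 3.0]) returns [1.0, 2.0, 3.0], min_max_scale([10, 20, 30]) returns [0.0, 0.5, 1.0], and one_hot(["red", "blue", "red"]) returns [[0, 1], [1, 0], [0, 1]] (columns ordered alphabetically as blue, red). A value whose z-score has a large magnitude (a cutoff of 3 is a common convention, assumed here rather than taken from the quiz) can be flagged as a potential outlier.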