Data transformation MCQs

1. What is the main purpose of data transformation in the data preprocessing pipeline?

a) To reduce the number of features in the dataset
b) To convert the data into a more suitable format for analysis or modeling
c) To combine data from different sources
d) To remove duplicates and errors in the dataset

Answer: b) To convert the data into a more suitable format for analysis or modeling


2. Which of the following is a common method used for data transformation?

a) Scaling
b) Data imputation
c) Data cleaning
d) Feature extraction

Answer: a) Scaling


3. What is normalization in the context of data transformation?

a) Converting categorical variables into numeric form
b) Rescaling features to a fixed range, usually [0, 1]
c) Removing duplicate records from the dataset
d) Merging data from multiple sources

Answer: b) Rescaling features to a fixed range, usually [0, 1]
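
As a rough illustration of the rescaling described in this answer, the sketch below applies the usual min-max formula, (x - min) / (max - min), to a small list. The function name `min_max_scale` and the sample ages are made up for this example, and mapping a constant column to 0.0 is just one possible convention.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        # Mapping everything to 0.0 is an assumption made for this sketch.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


ages = [18, 25, 40, 62]  # made-up sample data
scaled = min_max_scale(ages)
print(scaled)  # smallest value maps to 0.0, largest to 1.0
```

Because the minimum and maximum drive the result, a single extreme value can squeeze all other values into a narrow part of the range.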


4. Which of the following is an example of log transformation?

a) Adding noise to the data
b) Converting values using the natural logarithm function
c) Applying the Min-Max scaling
d) Encoding categorical variables as binary values

Answer: b) Converting values using the natural logarithm function


5. When is standardization typically used during data transformation?

a) When the data has a skewed distribution
b) When the data needs to be rescaled to a fixed range
c) When the data is approximately normally distributed and needs to be centered around zero with unit variance
d) When missing values need to be handled

Answer: c) When the data is approximately normally distributed and needs to be centered around zero with unit variance


6. What does log transformation help with in data preprocessing?

a) Reducing the impact of extreme values or skewed distributions
b) Increasing the variance of data
c) Handling missing values
d) Scaling features to a common range

Answer: a) Reducing the impact of extreme values or skewed distributions
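
To see how a log transformation can shrink the influence of an extreme value, the sketch below compares the ratio between the largest and smallest value before and after taking natural logarithms. The income figures are invented for illustration, and `math.log` only accepts positive inputs, so data containing zeros or negatives would need a different approach.

```python
import math

incomes = [20_000, 35_000, 50_000, 1_000_000]  # made-up, right-skewed sample

# Natural logarithm of each value; requires every value to be > 0.
logged = [math.log(x) for x in incomes]

ratio_before = max(incomes) / min(incomes)
ratio_after = max(logged) / min(logged)

print(ratio_before)  # the outlier is 50 times the smallest value
print(ratio_after)   # after the transform the ratio is far smaller
```

The order of the values is preserved, since the logarithm is strictly increasing; only the spacing between them changes.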


7. What does binning refer to in the context of data transformation?

a) Scaling numerical values to a specific range
b) Grouping continuous data into discrete categories or intervals
c) Encoding categorical variables as numerical values
d) Removing outliers from the data

Answer: b) Grouping continuous data into discrete categories or intervals
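
A minimal sketch of binning, assuming hand-picked interval edges: each age is mapped to the label of the interval it falls into. The helper `bin_value`, the edges, and the labels are hypothetical choices for this example, not a standard scheme.

```python
import bisect


def bin_value(value, edges, labels):
    """Return the label of the interval that contains value.

    edges must be sorted, and labels needs one more entry than edges.
    A value equal to an edge goes into the interval to its right.
    """
    return labels[bisect.bisect_right(edges, value)]


edges = [18, 65]                       # illustrative cut points
labels = ["child", "adult", "senior"]  # one label per interval
ages = [5, 18, 30, 64, 65, 80]         # made-up sample data

binned = [bin_value(a, edges, labels) for a in ages]
print(binned)
```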


8. Which of the following is an example of feature extraction during data transformation?

a) Applying one-hot encoding to categorical variables
b) Selecting the most relevant features using statistical methods
c) Combining two or more features to create a new feature
d) Scaling data to a standard range

Answer: c) Combining two or more features to create a new feature
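
One familiar case of combining features is deriving body mass index from weight and height. The sketch below assumes records stored as dictionaries with hypothetical keys `weight_kg` and `height_m`; the records themselves are invented.

```python
people = [
    {"weight_kg": 70.0, "height_m": 1.75},  # made-up records
    {"weight_kg": 90.0, "height_m": 1.80},
]

# Build a new feature from two existing ones: BMI = weight / height^2.
for person in people:
    person["bmi"] = person["weight_kg"] / person["height_m"] ** 2

print([round(p["bmi"], 1) for p in people])
```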


9. What is the z-score transformation used for in data transformation?

a) Removing duplicate entries from the dataset
b) Rescaling data to a fixed range
c) Standardizing data to have a mean of 0 and a standard deviation of 1
d) Handling missing values by imputing the mean

Answer: c) Standardizing data to have a mean of 0 and a standard deviation of 1
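
The z-score formula behind this answer is z = (x - mean) / standard deviation. The sketch below applies it with Python's `statistics` module, assuming the population standard deviation (`pstdev`); some tools use the sample standard deviation instead, which gives slightly different values. The function name `z_scores` and the sample scores are made up.

```python
import statistics


def z_scores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        # Constant data has no spread; returning zeros is an assumption here.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample data
standardized = z_scores(scores)
print(standardized)
```

Unlike min-max scaling, the result is not confined to a fixed range; values simply express how many standard deviations each point lies from the mean.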


10. Which of the following transformation methods is used when the data has a skewed distribution?

a) Standardization
b) Log transformation
c) Binning
d) One-hot encoding

Answer: b) Log transformation


11. In data transformation, what does feature scaling refer to?

a) Reducing the number of features by selecting the most relevant ones
b) Converting categorical data into numerical data
c) Rescaling numerical features to a similar range or distribution
d) Creating new features by combining existing ones

Answer: c) Rescaling numerical features to a similar range or distribution


12. What is the purpose of Min-Max scaling in data transformation?

a) To rescale all features into a range between 0 and 1
b) To reduce the dataset size
c) To eliminate outliers from the data
d) To combine multiple features into a single feature

Answer: a) To rescale all features into a range between 0 and 1


13. What does one-hot encoding do during data transformation?

a) Converts categorical values into binary columns
b) Scales numerical values to a standard range
c) Combines categorical features into a single column
d) Imputes missing values with the mean

Answer: a) Converts categorical values into binary columns
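
A small sketch of one-hot encoding in plain Python: each distinct category becomes its own column, and each row holds a 1 in the column of its category and 0 elsewhere. The function name `one_hot` and the color values are invented, and sorting the categories alphabetically is only one way to fix the column order.

```python
def one_hot(values):
    """Return (categories, rows) where each row is a list of 0/1 flags."""
    categories = sorted(set(values))  # column order: alphabetical (an assumption)
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


colors = ["red", "green", "blue", "green"]  # made-up sample data
categories, encoded = one_hot(colors)

print(categories)
for row in encoded:
    print(row)
```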


14. Which of the following methods is used to handle skewed data before applying machine learning algorithms?

a) Binning
b) Normalization
c) Log transformation
d) Data imputation

Answer: c) Log transformation


15. What is the result of applying data discretization in data transformation?

a) Converting continuous data into categorical data by grouping values into bins
b) Normalizing the data to a range of [0,1]
c) Creating new features by combining multiple attributes
d) Removing irrelevant features from the dataset

Answer: a) Converting continuous data into categorical data by grouping values into bins
