Classification Techniques MCQs

1. Which of the following is an advantage of using Decision Trees for classification?

A) They require a lot of data preprocessing
B) They are easy to interpret and visualize
C) They cannot handle missing data
D) They always produce highly accurate results

Answer: B) They are easy to interpret and visualize
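
To illustrate why option B holds, here is a minimal sketch, assuming scikit-learn is installed; the Iris dataset and the depth limit are purely illustrative. It fits a shallow tree and prints the learned rules as readable text.

# Sketch: fit a shallow decision tree and print its splits as plain if/else rules.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# export_text renders the learned decision rules, which is what makes
# decision trees easy to interpret and visualize.
print(export_text(tree, feature_names=load_iris().feature_names))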


2. Which of the following is a key characteristic of Naive Bayes classification?

A) It assumes that features are independent given the class label
B) It works by minimizing squared errors between predicted and true labels
C) It splits the data using a hyperplane to separate classes
D) It is an ensemble learning method

Answer: A) It assumes that features are independent given the class label
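
The conditional-independence assumption in option A is what lets Naive Bayes multiply per-feature likelihoods. A minimal sketch with scikit-learn's GaussianNB; the dataset and split are only examples.

# Sketch: Gaussian Naive Bayes treats each feature independently given the class,
# so it only needs a per-class mean and variance for every feature.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

nb = GaussianNB().fit(X_train, y_train)
print("per-class feature means, shape:", nb.theta_.shape)  # (n_classes, n_features)
print("test accuracy:", nb.score(X_test, y_test))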


3. What do the support vectors represent in Support Vector Machines (SVM)?

A) The data points that lie closest to the decision boundary
B) The data points that lie farthest from the decision boundary
C) The data points that are incorrectly classified
D) The average of the data points from both classes

Answer: A) The data points that lie closest to the decision boundary
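
To see the support vectors in practice, a linear SVC in scikit-learn exposes them directly; this is a sketch on toy blobs, with the dataset and C value chosen only for illustration.

# Sketch: the support vectors are the training points closest to the decision
# boundary; SVC stores them in the support_vectors_ attribute after fitting.
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, random_state=0)
svm = SVC(kernel="linear", C=1.0).fit(X, y)

print("number of support vectors per class:", svm.n_support_)
print("first few support vectors:")
print(svm.support_vectors_[:3])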


4. In a Random Forest classifier, what does the term “ensemble learning” refer to?

A) Using multiple models to make decisions, and combining their outputs for better accuracy
B) Using a single model trained on all the data
C) Randomly selecting a subset of features for each decision tree
D) Using a single decision tree to classify data points

Answer: A) Using multiple models to make decisions, and combining their outputs for better accuracy
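
A short sketch of option A, assuming scikit-learn: a Random Forest fits many trees on bootstrap samples and combines their votes, so a single tree's output and the ensemble's output can differ. The dataset is just an example.

# Sketch: ensemble learning in a Random Forest = many trees, one combined prediction.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# estimators_ holds the individual trees whose outputs are combined by voting.
print("trees in the ensemble:", len(forest.estimators_))
print("one tree's prediction:    ", forest.estimators_[0].predict(X[:1]))
print("the ensemble's prediction:", forest.predict(X[:1]))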


5. Which of the following algorithms is primarily used for linear classification?

A) K-Nearest Neighbors
B) Logistic Regression
C) Decision Trees
D) Random Forest

Answer: B) Logistic Regression
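
Option B is linear in the sense that its decision boundary is a weighted sum of the features passed through a sigmoid. A minimal sketch, assuming scikit-learn; the dataset and scaling step are illustrative.

# Sketch: logistic regression learns one weight per feature plus an intercept,
# i.e. a linear decision boundary w.x + b = 0.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)  # scaling helps the solver converge

clf = LogisticRegression().fit(X_scaled, y)
print("weights shape:", clf.coef_.shape)   # (1, n_features): one weight per feature
print("intercept:", clf.intercept_)
print("training accuracy:", clf.score(X_scaled, y))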


6. What is the purpose of the “kernel trick” in Support Vector Machines (SVM)?

A) It increases the complexity of the decision boundary
B) It transforms non-linearly separable data into linearly separable data
C) It makes the model simpler and easier to interpret
D) It helps in reducing the training time of SVM

Answer: B) It transforms non-linearly separable data into linearly separable data
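
A sketch of the kernel trick in action, assuming scikit-learn: concentric circles are not linearly separable in their original two dimensions, but an RBF kernel implicitly maps them into a space where a separating hyperplane exists. Dataset and parameters are illustrative.

# Sketch: compare a linear kernel with an RBF kernel on data that has no
# linear decision boundary in the original feature space.
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)

linear_svm = SVC(kernel="linear").fit(X, y)
rbf_svm = SVC(kernel="rbf").fit(X, y)

print("linear kernel accuracy:", linear_svm.score(X, y))  # near chance level
print("RBF kernel accuracy:   ", rbf_svm.score(X, y))     # close to 1.0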


7. Which of the following classifiers works by partitioning the feature space using axis-aligned rectangles?

A) K-Nearest Neighbors
B) Decision Trees
C) Logistic Regression
D) Naive Bayes

Answer: B) Decision Trees


8. In k-Nearest Neighbors (k-NN), what does the “k” represent?

A) The number of classes in the dataset
B) The number of data points considered for classification
C) The number of features used to classify a data point
D) The number of nearest neighbors to consider when making a prediction

Answer: D) The number of nearest neighbors to consider when making a prediction
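
In scikit-learn the "k" from option D is exposed as the n_neighbors parameter; a minimal sketch, with the dataset and the chosen k values serving only as an example.

# Sketch: "k" is the number of nearest neighbours whose labels are voted on.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for k in (1, 5, 15):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    print(f"k={k:2d}  test accuracy={knn.score(X_test, y_test):.3f}")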


9. Which of the following metrics is used to evaluate classification models in terms of both precision and recall?

A) Accuracy
B) F1 Score
C) Mean Squared Error
D) Area Under the Curve (AUC)

Answer: B) F1 Score
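
The F1 score in option B is the harmonic mean of precision and recall. A small worked sketch with hand-picked labels (the label vectors are made up for the arithmetic):

# Sketch: F1 = 2 * precision * recall / (precision + recall).
from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]  # 2 TP, 1 FP, 2 FN

p = precision_score(y_true, y_pred)   # 2 / (2 + 1) = 0.667
r = recall_score(y_true, y_pred)      # 2 / (2 + 2) = 0.5
print("precision:", p)
print("recall:   ", r)
print("F1 score: ", f1_score(y_true, y_pred))   # 2*p*r/(p+r) = 0.571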


10. In Gradient Boosting, how are weak learners (typically decision trees) combined to form a strong learner?

A) By averaging their predictions
B) By taking the majority vote of their classifications
C) By sequentially correcting the errors of previous trees
D) By creating new features based on the errors of previous trees

Answer: C) By sequentially correcting the errors of previous trees
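
A sketch of the sequential idea in option C, assuming scikit-learn: each new tree in GradientBoostingClassifier is fit against the errors of the ensemble built so far, so accuracy improves stage by stage. Dataset and hyperparameters are illustrative.

# Sketch: staged_predict shows the ensemble improving as trees are added
# one after another, each correcting its predecessors' mistakes.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

gbm = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                 random_state=0).fit(X_train, y_train)

for i, y_stage in enumerate(gbm.staged_predict(X_test), start=1):
    if i in (1, 10, 100):
        print(f"after {i:3d} trees: accuracy = {accuracy_score(y_test, y_stage):.3f}")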


11. Which of the following is NOT a typical application of classification techniques?

A) Email spam detection
B) Predicting stock prices
C) Handwritten digit recognition
D) Disease diagnosis

Answer: B) Predicting stock prices


12. Which of the following methods is best suited for handling imbalanced classes in classification problems?

A) Over-sampling the minority class
B) Using more features for classification
C) Applying a linear kernel to the data
D) Always predicting the majority class

Answer: A) Over-sampling the minority class
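
A minimal sketch of option A using plain scikit-learn utilities; dedicated libraries such as imbalanced-learn provide ready-made over-samplers, but resample is enough to show the idea. The class proportions below are made up.

# Sketch: over-sample the minority class by drawing from it with replacement
# until it matches the majority class in size.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.utils import resample

X, y = make_classification(n_samples=1000, weights=[0.95, 0.05], random_state=0)
print("original class counts:", np.bincount(y))

X_maj, y_maj = X[y == 0], y[y == 0]
X_min, y_min = X[y == 1], y[y == 1]

X_min_up, y_min_up = resample(X_min, y_min, replace=True,
                              n_samples=len(y_maj), random_state=0)

X_bal = np.vstack([X_maj, X_min_up])
y_bal = np.concatenate([y_maj, y_min_up])
print("balanced class counts:", np.bincount(y_bal))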


13. What type of model is Logistic Regression considered to be?

A) Non-linear model
B) Linear model for classification
C) Non-parametric model
D) Clustering model

Answer: B) Linear model for classification


14. Which of the following is the main difference between Random Forests and Boosting methods like Gradient Boosting?

A) Random Forests build their trees independently (in parallel), while Boosting builds them sequentially
B) Boosting models are simpler and faster than Random Forests
C) Random Forests combine weak learners, while Boosting combines strong learners
D) Boosting models overfit less than Random Forests

Answer: A) Random Forests build their trees independently (in parallel), while Boosting builds them sequentially


15. What is the primary role of Regularization in classification models?

A) To increase the complexity of the model
B) To reduce the number of features used in the model
C) To penalize large model coefficients and reduce overfitting
D) To speed up the training process

Answer: C) To penalize large model coefficients and reduce overfitting
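
A sketch of option C, assuming scikit-learn: in LogisticRegression the parameter C is the inverse of the regularization strength, so smaller C means a heavier penalty on large coefficients and therefore smaller weights. The dataset and the C values are illustrative.

# Sketch: stronger L2 regularization (smaller C) shrinks the learned coefficients,
# which is how regularization combats overfitting.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

for C in (100.0, 1.0, 0.01):   # C is the *inverse* of the regularization strength
    model = make_pipeline(StandardScaler(),
                          LogisticRegression(penalty="l2", C=C, max_iter=1000))
    model.fit(X, y)
    coefs = model.named_steps["logisticregression"].coef_
    print(f"C={C:6.2f}  mean |coefficient| = {np.abs(coefs).mean():.3f}")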
