1. Which of the following is an advantage of using Decision Trees for classification?
A) They require a lot of data preprocessing
B) They are easy to interpret and visualize
C) They cannot handle missing data
D) They always produce highly accurate results
Answer: B) They are easy to interpret and visualize
2. Which of the following is a key characteristic of Naive Bayes classification?
A) It assumes that features are independent given the class label
B) It works by minimizing squared errors between predicted and true labels
C) It splits the data using a hyperplane to separate classes
D) It is an ensemble learning method
Answer: A) It assumes that features are independent given the class label
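The independence assumption above can be made concrete with a tiny sketch: the class-conditional likelihood factorizes into a product of per-feature likelihoods. All probabilities below are made-up numbers for illustration only.

```python
# Naive Bayes independence assumption: P(x | class) is approximated as
# the product of per-feature likelihoods P(x_i | class).
# The priors and likelihoods here are invented for illustration.

def naive_bayes_score(prior, feature_likelihoods):
    """Unnormalized class score: P(c) * prod_i P(x_i | c)."""
    score = prior
    for p in feature_likelihoods:
        score *= p
    return score

# Two classes ("spam"/"ham"), two observed feature values.
spam = naive_bayes_score(prior=0.4, feature_likelihoods=[0.8, 0.6])  # 0.192
ham = naive_bayes_score(prior=0.6, feature_likelihoods=[0.2, 0.3])   # 0.036

# Normalize the scores to obtain a posterior.
posterior_spam = spam / (spam + ham)
print(round(posterior_spam, 3))  # 0.842
```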
3. What do the support vectors represent in Support Vector Machines (SVM)?
A) The data points that lie closest to the decision boundary
B) The data points that are farthest from the decision boundary
C) The data points that are incorrectly classified
D) The average of the data points from both classes
Answer: A) The data points that lie closest to the decision boundary
4. In a Random Forest classifier, what does the term “ensemble learning” refer to?
A) Using multiple models to make decisions, and combining their outputs for better accuracy
B) Using a single model trained on all the data
C) Randomly selecting a subset of features for each decision tree
D) Using a single decision tree to classify data points
Answer: A) Using multiple models to make decisions, and combining their outputs for better accuracy
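A minimal sketch of combining model outputs, assuming a simple majority vote over class predictions (the aggregation a Random Forest uses for classification); the model outputs are made up.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine class predictions from several models by majority vote,
    as a Random Forest does across its trees."""
    counts = Counter(predictions)
    return counts.most_common(1)[0][0]

# Three hypothetical models vote on one sample.
print(majority_vote(["cat", "dog", "cat"]))  # cat
```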
5. Which of the following algorithms is primarily used for linear classification?
A) K-Nearest Neighbors
B) Logistic Regression
C) Decision Trees
D) Random Forest
Answer: B) Logistic Regression
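Despite the name, Logistic Regression classifies by applying the sigmoid to a linear combination of features, so its decision boundary is linear. A minimal sketch with invented weights:

```python
import math

def predict_proba(weights, bias, x):
    """Logistic regression: sigmoid of a linear combination of features.
    The 0.5-probability decision boundary is the hyperplane w.x + b = 0."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Made-up weights for illustration: z = 1*2 + (-2)*1 + 0.5 = 0.5 > 0,
# so the predicted probability is above 0.5.
p = predict_proba(weights=[1.0, -2.0], bias=0.5, x=[2.0, 1.0])
print(p > 0.5)  # True
```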
6. What is the purpose of the “kernel trick” in Support Vector Machines (SVM)?
A) It increases the complexity of the decision boundary
B) It implicitly maps the data into a higher-dimensional space where it can become linearly separable
C) It makes the model simpler and easier to interpret
D) It helps in reducing the training time of SVM
Answer: B) It implicitly maps the data into a higher-dimensional space where it can become linearly separable
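The "trick" is that a kernel evaluates a dot product in the mapped space without ever constructing that space. A minimal sketch using the RBF kernel (whose implicit feature space is infinite-dimensional); the points and gamma are arbitrary:

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel: k(x, y) = exp(-gamma * ||x - y||^2).
    Equals a dot product in an infinite-dimensional feature space,
    computed without ever building that space -- the kernel trick."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

print(rbf_kernel([0.0, 0.0], [0.0, 0.0]))           # identical points -> 1.0
print(round(rbf_kernel([0.0, 0.0], [1.0, 1.0]), 4))  # exp(-2) -> 0.1353
```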
7. Which of the following classifiers works by partitioning the feature space using axis-aligned rectangles?
A) K-Nearest Neighbors
B) Decision Trees
C) Logistic Regression
D) Naive Bayes
Answer: B) Decision Trees
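Each axis-aligned cut comes from picking a single feature and a threshold that makes the resulting regions as pure as possible. A minimal sketch for one feature, assuming the common Gini-impurity criterion; the toy data is made up:

```python
def gini(labels):
    """Gini impurity of a set of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_axis_aligned_split(xs, labels):
    """Threshold on a single feature minimizing weighted Gini impurity --
    the axis-aligned cut a decision tree makes at one node."""
    best_t, best_score = None, float("inf")
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, labels) if x <= t]
        right = [y for x, y in zip(xs, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(xs)
        if score < best_score:
            best_t, best_score = t, score
    return best_t

# Perfectly separable toy data: values <= 2.0 are class 0.
print(best_axis_aligned_split([1.0, 2.0, 3.0, 4.0], [0, 0, 1, 1]))  # 2.0
```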
8. In k-Nearest Neighbors (k-NN), what does the “k” represent?
A) The number of classes in the dataset
B) The total number of data points in the training set
C) The number of features used to classify a data point
D) The number of nearest neighbors to consider when making a prediction
Answer: D) The number of nearest neighbors to consider when making a prediction
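A minimal pure-Python sketch of k-NN: rank training points by distance to the query and take a majority vote among the k closest. The points and labels are made up for illustration.

```python
from collections import Counter

def knn_predict(train_points, train_labels, query, k):
    """Classify `query` by majority vote among its k nearest training points
    (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(p, query)), label)
        for p, label in zip(train_points, train_labels)
    )
    k_nearest = [label for _, label in dists[:k]]
    return Counter(k_nearest).most_common(1)[0][0]

points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.1, 4.9)]
labels = ["A", "A", "B", "B"]
print(knn_predict(points, labels, (0.2, 0.1), k=3))  # A
```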
9. Which of the following metrics is used to evaluate classification models in terms of both precision and recall?
A) Accuracy
B) F1 Score
C) Mean Squared Error
D) Area Under the Curve (AUC)
Answer: B) F1 Score
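The F1 score is the harmonic mean of precision and recall, so it is high only when both are high. A short sketch from confusion-matrix counts; the counts are invented for illustration:

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision (tp / (tp + fp))
    and recall (tp / (tp + fn))."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up counts: precision = 0.8, recall = 2/3.
print(round(f1_score(tp=8, fp=2, fn=4), 3))  # 0.727
```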
10. In Gradient Boosting, how are weak learners (typically decision trees) combined to form a strong learner?
A) By averaging their predictions
B) By taking the majority vote of their classifications
C) By sequentially correcting the errors of previous trees
D) By creating new features based on the errors of previous trees
Answer: C) By sequentially correcting the errors of previous trees
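The sequential error correction above can be sketched in miniature: for squared loss, each round fits a weak learner (here a depth-1 stump) to the current residuals and adds it to the ensemble. This is a simplified one-feature sketch, not a production implementation; the data is made up.

```python
def fit_stump(xs, residuals):
    """Weak learner: a depth-1 regression tree (stump) fit to residuals."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def gradient_boost(xs, ys, n_rounds=10, lr=0.5):
    """Each round fits a stump to the current residuals (the negative
    gradient of squared loss) and adds it to the ensemble."""
    stumps = []
    predict = lambda x: sum(lr * s(x) for s in stumps)
    for _ in range(n_rounds):
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict

xs, ys = [1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 1.0, 1.0]
model = gradient_boost(xs, ys)
# After 10 rounds the residual error has shrunk geometrically.
print(all(abs(model(x) - y) < 0.1 for x, y in zip(xs, ys)))  # True
```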
11. Which of the following is NOT a typical application of classification techniques?
A) Email spam detection
B) Predicting stock prices
C) Handwritten digit recognition
D) Disease diagnosis
Answer: B) Predicting stock prices
12. Which of the following methods is best suited for handling imbalanced classes in classification problems?
A) Over-sampling the minority class
B) Using more features for classification
C) Applying a linear kernel to the data
D) Removing the minority class from the training data
Answer: A) Over-sampling the minority class
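A minimal sketch of random over-sampling: duplicate minority-class samples until every class matches the largest one. The helper name and toy data are made up; libraries such as imbalanced-learn offer more sophisticated variants (e.g. SMOTE).

```python
import random
from collections import Counter

def oversample_minority(samples, labels, seed=0):
    """Randomly duplicate samples of smaller classes until all classes
    have as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for s, y in zip(samples, labels):
        by_class.setdefault(y, []).append(s)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for y, group in by_class.items():
        resampled = group + [rng.choice(group) for _ in range(target - len(group))]
        out_samples.extend(resampled)
        out_labels.extend([y] * target)
    return out_samples, out_labels

# Imbalanced toy data: four samples of class 0, one of class 1.
samples = [[0], [1], [2], [3], [9]]
labels = [0, 0, 0, 0, 1]
_, new_labels = oversample_minority(samples, labels)
print(sorted(Counter(new_labels).items()))  # [(0, 4), (1, 4)]
```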
13. What type of model is Logistic Regression considered to be?
A) Non-linear model
B) Linear model for classification
C) Non-parametric model
D) Clustering model
Answer: B) Linear model for classification
14. Which of the following is the main difference between Random Forests and Boosting methods like Gradient Boosting?
A) Random Forests use parallel learning while Boosting is sequential
B) Boosting models are simpler and faster than Random Forests
C) Random Forests combine weak learners, while Boosting combines strong learners
D) Boosting models overfit less than Random Forests
Answer: A) Random Forests use parallel learning while Boosting is sequential
15. What is the primary role of Regularization in classification models?
A) To increase the complexity of the model
B) To reduce the number of features used in the model
C) To penalize large model coefficients and reduce overfitting
D) To speed up the training process
Answer: C) To penalize large model coefficients and reduce overfitting
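The trade-off can be shown numerically: on separable data the unpenalized logistic loss always prefers larger weights, while an L2 penalty `lam * sum(w^2)` makes oversized coefficients costly. A toy sketch with made-up data and weights:

```python
import math

def penalized_loss(weights, xs, ys, lam):
    """Mean logistic loss plus an L2 penalty lam * sum(w^2)
    that discourages large coefficients."""
    loss = 0.0
    for x, y in zip(xs, ys):
        z = sum(w * xi for w, xi in zip(weights, x))
        p = 1.0 / (1.0 + math.exp(-z))
        loss += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return loss / len(xs) + lam * sum(w * w for w in weights)

xs, ys = [[1.0], [-1.0]], [1, 0]
# Without the penalty, scaling weights up always lowers loss on separable data.
print(penalized_loss([10.0], xs, ys, lam=0.0) < penalized_loss([1.0], xs, ys, lam=0.0))  # True
# With the penalty, the moderate weight wins.
print(penalized_loss([1.0], xs, ys, lam=0.1) < penalized_loss([10.0], xs, ys, lam=0.1))  # True
```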