Classification MCQs

1. What is the primary goal of a classification algorithm in machine learning?

a) To predict continuous numerical values
b) To group similar data points into clusters
c) To assign each data point to one of the predefined classes or categories
d) To reduce the dimensionality of the data

Answer: c) To assign each data point to one of the predefined classes or categories


2. Which of the following is a supervised learning algorithm used for classification tasks?

a) K-means clustering
b) Naive Bayes
c) Principal Component Analysis (PCA)
d) DBSCAN

Answer: b) Naive Bayes


3. Which of the following techniques is commonly used for binary classification?

a) Decision trees
b) K-means clustering
c) Linear regression
d) Apriori algorithm

Answer: a) Decision trees


4. In a confusion matrix, what does the True Positive (TP) represent?

a) The number of instances incorrectly predicted as positive
b) The number of instances correctly predicted as negative
c) The number of instances correctly predicted as positive
d) The number of instances incorrectly predicted as negative

Answer: c) The number of instances correctly predicted as positive
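
As an illustration, the four cells of a binary confusion matrix can be counted directly from paired labels. This is a minimal sketch: the function name `confusion_counts`, the `positive` parameter, and the sample labels are all made up for the example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, TN, FN) for a binary problem."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # correctly predicted as positive
            else:
                fp += 1  # incorrectly predicted as positive
        else:
            if actual == positive:
                fn += 1  # incorrectly predicted as negative
            else:
                tn += 1  # correctly predicted as negative
    return tp, fp, tn, fn


# Hypothetical labels, chosen only for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
print(*confusion_counts(y_true, y_pred))  # prints "3 1 3 1"
```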


5. What is the accuracy metric in classification?

a) The percentage of correctly classified instances out of all instances
b) The percentage of false positive instances
c) The ratio of True Positives to False Positives
d) The number of features used for classification

Answer: a) The percentage of correctly classified instances out of all instances
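
A short sketch of this definition, assuming labels are given as two equal-length lists (the function name `accuracy` and the sample data are illustrative only):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for actual, predicted in zip(y_true, y_pred)
                  if actual == predicted)
    return correct / len(y_true)


# Hypothetical labels: 6 of the 8 predictions are correct.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
print(accuracy(y_true, y_pred))  # prints 0.75
```

Multiplying the result by 100 gives the percentage form used in the answer.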


6. Which of the following classifiers is based on Bayes’ theorem?

a) K-nearest neighbors (KNN)
b) Naive Bayes
c) Support Vector Machines (SVM)
d) Decision Trees

Answer: b) Naive Bayes


7. In the K-nearest neighbors (KNN) algorithm, how is the class of a new data point determined?

a) By finding the most frequent class among its k nearest neighbors
b) By calculating the average of all classes in the dataset
c) By applying the mean of the features for classification
d) By using a decision tree

Answer: a) By finding the most frequent class among its k nearest neighbors
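
The majority vote can be sketched in a few lines. This example assumes Euclidean distance and a tiny made-up training set; `knn_predict` is a hypothetical helper, not a library function, and ties between classes are not handled specially.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    # Pair each training label with its Euclidean distance to the query point.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # The most frequent label among the k nearest neighbors wins.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical 2-D training data with two classes.
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 7), (6, 7)]
labels = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(points, labels, (2, 2)))  # prints A
print(knn_predict(points, labels, (6, 5)))  # prints B
```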


8. What is the decision boundary in classification?

a) The point where the algorithm separates training and testing data
b) The line or surface that separates different classes in the feature space
c) The boundary between training and validation data
d) The threshold that classifies data into positive or negative

Answer: b) The line or surface that separates different classes in the feature space


9. Which of the following algorithms can be used for multiclass classification?

a) Logistic Regression
b) Naive Bayes
c) Decision Trees
d) All of the above

Answer: d) All of the above


10. In logistic regression, the output of the model is a:

a) Continuous value
b) Binary class (0 or 1)
c) Multiclass output
d) Probability value between 0 and 1

Answer: d) Probability value between 0 and 1
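
To see where the probability comes from: logistic regression passes a weighted sum of the features through the sigmoid function, which maps any real number into the interval (0, 1). A class label is obtained only afterwards, by applying a threshold. The weights, bias, and threshold below are invented for illustration, not fitted values.

```python
import math


def sigmoid(z):
    """Map a real number to (0, 1); written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(features, weights, bias):
    """Estimated probability of the positive class for one example."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def predict_class(features, weights, bias, threshold=0.5):
    """Turn the probability into a 0/1 label using a threshold."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0


# Hypothetical parameters: z = 0.8*2.0 - 0.4*1.0 - 0.2 = 1.0
weights, bias = [0.8, -0.4], -0.2
p = predict_proba([2.0, 1.0], weights, bias)
print(round(p, 3))                               # prints 0.731
print(predict_class([2.0, 1.0], weights, bias))  # prints 1
```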


11. What does the F1-score measure in classification tasks?

a) The balance between precision and recall
b) The total number of correct predictions
c) The number of false positives
d) The area under the ROC curve

Answer: a) The balance between precision and recall
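
Concretely, the F1-score is the harmonic mean of precision and recall, so it is high only when both are high. A minimal sketch follows; the counts are made up, and returning 0.0 when a denominator is zero is a convention chosen for this example.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: precision is high (0.8) but recall is only 0.5.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
print(f"{precision:.3f} {recall:.3f} {f1:.3f}")  # prints "0.800 0.500 0.615"
```

The harmonic mean (0.615) sits below the arithmetic mean of the two values (0.65), which is how the F1-score penalizes an imbalance between precision and recall.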


12. Which of the following is an advantage of Support Vector Machines (SVM) for classification?

a) It can handle both linearly and non-linearly separable data (the latter through kernel functions)
b) It is computationally less expensive than KNN
c) It requires fewer training examples
d) It can only handle binary classification tasks

Answer: a) It can handle both linearly and non-linearly separable data (the latter through kernel functions)


13. What is overfitting in a classification model?

a) The model performs well on both the training and testing data
b) The model performs well on the training data but poorly on the testing data
c) The model fails to recognize patterns in the data
d) The model cannot handle new, unseen data

Answer: b) The model performs well on the training data but poorly on the testing data


14. Which of the following metrics is threshold-independent and commonly preferred over accuracy for evaluating binary classification models when there is a class imbalance?

a) Accuracy
b) Precision
c) Recall
d) Area under the ROC curve (AUC-ROC)

Answer: d) Area under the ROC curve (AUC-ROC)
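
One way to see why AUC-ROC is more informative than accuracy here: AUC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one (ties count as half). The sketch below computes it by comparing every positive-negative pair; `roc_auc` is a hypothetical helper and the labels and scores are invented.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of positive-negative pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced data: 8 negatives, 2 positives.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# A model that gives every example the same low score would be 80% accurate
# if it always predicted "negative", yet its AUC shows it cannot rank at all.
print(roc_auc(y_true, [0.1] * 10))  # prints 0.5

# A model that scores both positives above every negative.
print(roc_auc(y_true, [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.4, 0.2, 0.9, 0.8]))  # prints 1.0
```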


15. What is cross-validation primarily used for in classification tasks?

a) To find the optimal number of features
b) To improve the classification accuracy
c) To assess the model’s generalizability and detect overfitting
d) To select the best classification algorithm

Answer: c) To assess the model’s generalizability and detect overfitting
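
The mechanics of k-fold cross-validation can be sketched as index bookkeeping: the data is split into k folds, and each fold serves once as the held-out test set while the remaining folds are used for training. The helper `k_fold_indices` below is hypothetical and, as a simplifying assumption, does not shuffle the data first.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, test_indices) pairs, one per fold."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    indices = list(range(n_samples))
    folds = []
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        folds.append((train_idx, test_idx))
        start += size
    return folds


for train_idx, test_idx in k_fold_indices(n_samples=10, k=3):
    print("train:", train_idx, "test:", test_idx)
```

A model would be trained and scored once per fold, and the scores averaged; a large gap between training and held-out scores is a sign of overfitting.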
