1. What is the main goal of the Support Vector Machine (SVM) algorithm?
- A) To minimize the distance between the support vectors and the decision boundary
- B) To maximize the margin between classes while minimizing classification errors
- C) To calculate the probability of each class
- D) To reduce the dimensionality of the feature space
Answer: B) To maximize the margin between classes while minimizing classification errors
Explanation: The main goal of SVM is to find a decision boundary (hyperplane) that maximizes the margin between the two classes, leading to better generalization.
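As a minimal sketch of this idea (assuming scikit-learn is available; the toy points are invented for illustration), fitting a maximum-margin classifier takes only a few lines:

```python
import numpy as np
from sklearn.svm import SVC

# Two small, linearly separable point clouds (toy data for illustration).
X = np.array([[1, 1], [2, 1], [1, 2], [5, 5], [6, 5], [5, 6]])
y = np.array([0, 0, 0, 1, 1, 1])

# SVC with a linear kernel finds the hyperplane that maximizes the
# margin between the two classes.
clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)
```

New points are then classified by which side of the learned hyperplane they fall on.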
2. In SVM, what is the “support vector”?
- A) A data point that is located far from the decision boundary
- B) A data point that lies on or near the margin and affects the position of the decision boundary
- C) A data point used for cross-validation
- D) A data point that is misclassified by the algorithm
Answer: B) A data point that lies on or near the margin and affects the position of the decision boundary
Explanation: Support vectors are the critical points that influence the placement of the decision boundary in SVM.
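A quick way to see this (a sketch assuming scikit-learn; the data is made up) is to inspect the fitted model's `support_vectors_` attribute: only the margin-defining points are retained, and the remaining points could be deleted without moving the boundary.

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[0, 0], [1, 1], [2, 0], [4, 4], [5, 5], [6, 4]])
y = np.array([0, 0, 0, 1, 1, 1])
clf = SVC(kernel="linear", C=1.0).fit(X, y)

# The points that lie on or inside the margin become support vectors.
print(clf.support_vectors_)
```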
3. Which of the following is true about the kernel trick in SVM?
- A) It is used to make the SVM algorithm more interpretable.
- B) It maps the data into a higher-dimensional space to make it linearly separable.
- C) It reduces the computational cost of training an SVM.
- D) It is only applicable for binary classification tasks.
Answer: B) It maps the data into a higher-dimensional space to make it linearly separable.
Explanation: The kernel trick is used to transform data into a higher-dimensional space where a linear hyperplane can separate the data, even if it’s not linearly separable in the original space.
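This can be demonstrated with a hedged sketch (assuming scikit-learn): concentric circles are impossible to split with a straight line in 2-D, so a linear kernel performs near chance, while the RBF kernel's implicit high-dimensional mapping separates them cleanly.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Concentric circles: not linearly separable in the original 2-D space.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf").fit(X, y)

# The RBF kernel implicitly maps the points into a space where a
# linear separator exists; the linear kernel has no such escape.
print(linear.score(X, y), rbf.score(X, y))
```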
4. Which kernel is commonly used in SVM for non-linear classification problems?
- A) Linear kernel
- B) Polynomial kernel
- C) Sigmoid kernel
- D) Radial basis function (RBF) kernel
Answer: D) Radial basis function (RBF) kernel
Explanation: The RBF kernel is the most commonly used kernel in SVM for non-linear classification, as it can handle cases where the data is not linearly separable.
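The RBF kernel's behavior is governed by its `gamma` parameter, which sets how far a single training point's influence reaches. The sketch below (assuming scikit-learn; the gamma values are arbitrary illustrations) compares training accuracy across a few settings:

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# Small gamma -> smooth, broad decision boundary;
# large gamma -> tight, wiggly boundary that can overfit.
scores = {g: SVC(kernel="rbf", gamma=g).fit(X, y).score(X, y)
          for g in (0.1, 1.0, 10.0)}
print(scores)
```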
5. What does the “margin” in SVM refer to?
- A) The distance between the decision boundary and the nearest data points of each class
- B) The difference between the maximum and minimum values of the features
- C) The gap between training and testing data
- D) The number of support vectors used by the algorithm
Answer: A) The distance between the decision boundary and the nearest data points of each class
Explanation: The margin is the distance between the decision boundary (hyperplane) and the nearest data points from either class. SVM aims to maximize this margin.
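For a linear SVM with weight vector w, the margin width works out to 2 / ||w||, which can be checked numerically. In this sketch (assuming scikit-learn; the four points are constructed so the true margin is exactly 4) a very large C approximates a hard margin:

```python
import numpy as np
from sklearn.svm import SVC

# Two class "walls" at x1 = 0 and x1 = 4, so the true margin width is 4.
X = np.array([[0.0, 0.0], [0.0, 1.0], [4.0, 0.0], [4.0, 1.0]])
y = np.array([0, 0, 1, 1])

# Very large C approximates a hard margin on separable data.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

# Margin width for a linear SVM: 2 / ||w||.
w = clf.coef_[0]
print(2.0 / np.linalg.norm(w))
```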
6. In the context of SVM, what does the parameter “C” control?
- A) The size of the margin between classes
- B) The complexity of the decision boundary
- C) The regularization of the SVM, controlling the trade-off between achieving a larger margin and allowing some misclassification
- D) The choice of kernel to use in the model
Answer: C) The regularization of the SVM, controlling the trade-off between achieving a larger margin and allowing some misclassification
Explanation: The “C” parameter sets the regularization strength: a larger C penalizes training misclassifications more heavily, producing a narrower margin with fewer errors, while a smaller C tolerates more misclassifications in exchange for a wider margin.
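The trade-off shows up in the number of support vectors. In the sketch below (assuming scikit-learn; the overlapping clouds and C values are invented for illustration), a small C yields a wide margin that many points fall inside, so more of them become support vectors:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
# Overlapping clouds: some misclassification is unavoidable.
X = np.vstack([rng.randn(50, 2) - 1.0, rng.randn(50, 2) + 1.0])
y = np.array([0] * 50 + [1] * 50)

# Small C -> wide margin, many points inside it (more support vectors);
# large C -> narrow margin, fewer margin violations tolerated.
n_sv = {C: len(SVC(kernel="linear", C=C).fit(X, y).support_)
        for C in (0.01, 100.0)}
print(n_sv)
```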
7. Which of the following is a common disadvantage of using SVM?
- A) SVM is highly interpretable and easy to explain.
- B) SVM is sensitive to the choice of kernel.
- C) SVM is not effective for high-dimensional data.
- D) SVM is only suitable for binary classification problems.
Answer: B) SVM is sensitive to the choice of kernel.
Explanation: The performance of SVM heavily depends on the choice of kernel. Selecting the right kernel and hyperparameters can be challenging.
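Because of this sensitivity, a common remedy is to search over kernels and hyperparameters with cross-validation rather than guess. A minimal sketch (assuming scikit-learn; the parameter grid is an arbitrary example):

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Cross-validated search over kernel and C instead of a manual choice.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.1, random_state=0)
param_grid = {"kernel": ["linear", "rbf"], "C": [0.1, 1.0, 10.0]}
search = GridSearchCV(SVC(), param_grid, cv=5).fit(X, y)
print(search.best_params_)
```

On this circular data the search should settle on the RBF kernel, since no linear boundary fits.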
8. In SVM, what does the “slack variable” represent in the context of soft margin classification?
- A) The number of support vectors
- B) The penalty for misclassification
- C) The error tolerance for non-separable data
- D) The margin width
Answer: C) The error tolerance for non-separable data
Explanation: Slack variables allow for some misclassification in cases where the data is not linearly separable, providing a soft margin that balances the margin width and misclassification penalties.
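Slack values can be recovered from a fitted model's decision function as ξᵢ = max(0, 1 − yᵢ·f(xᵢ)). The sketch below (assuming scikit-learn; the overlapping clouds are invented) computes them explicitly:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(1)
X = np.vstack([rng.randn(30, 2) - 1.0, rng.randn(30, 2) + 1.0])
y = np.array([-1] * 30 + [1] * 30)

clf = SVC(kernel="linear", C=1.0).fit(X, y)

# Slack per point: xi = max(0, 1 - y * f(x)).
#   xi = 0      -> outside the margin on the correct side
#   0 < xi <= 1 -> inside the margin but still correctly classified
#   xi > 1      -> misclassified
slack = np.maximum(0.0, 1.0 - y * clf.decision_function(X))
print(int((slack > 0).sum()), "points need nonzero slack")
```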
9. Which of the following would be an ideal application for a Support Vector Machine?
- A) When you have a very large dataset with many features but few samples
- B) When the data is linearly separable and you want to classify it efficiently
- C) When the task is unsupervised learning
- D) When the data is highly unstructured, such as images or text data
Answer: B) When the data is linearly separable and you want to classify it efficiently
Explanation: SVM works best when the data is either linearly separable or can be transformed into a higher-dimensional space where it becomes linearly separable.
10. What is the difference between “hard margin” and “soft margin” in SVM?
- A) Hard margin allows for misclassification, while soft margin does not.
- B) Hard margin is used only in binary classification, while soft margin can be used in multi-class classification.
- C) Hard margin does not allow any misclassification, while soft margin allows for some errors in the case of non-separable data.
- D) Hard margin requires a larger dataset compared to soft margin.
Answer: C) Hard margin does not allow any misclassification, while soft margin allows for some errors in the case of non-separable data.
Explanation: Hard margin SVM requires the data to be linearly separable without any errors, whereas soft margin SVM allows for some misclassification, making it more flexible for real-world datasets.