1. Which of the following is a key assumption made by the Naive Bayes classifier?
- A) Features are independent of each other.
- B) Features are dependent on each other.
- C) The target variable is normally distributed.
- D) The features are linearly related to the target variable.
Answer: A) Features are independent of each other.
Explanation: Naive Bayes assumes that features are conditionally independent, given the class label, which is a simplification that makes the algorithm computationally efficient.
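A minimal sketch makes the assumption concrete (toy data and helper names are hypothetical): under conditional independence, the score for a class is just the class prior multiplied by each feature's per-class likelihood, estimated by counting.

```python
from collections import Counter

# Hypothetical toy training data: each row is (weather, temp) -> play
data = [
    (("sunny", "hot"), "no"),
    (("sunny", "mild"), "yes"),
    (("rainy", "mild"), "yes"),
    (("rainy", "hot"), "no"),
    (("sunny", "mild"), "yes"),
]

classes = Counter(label for _, label in data)

def likelihood(feature_idx, value, label):
    """P(feature=value | class=label), estimated by counting."""
    in_class = [f for f, lbl in data if lbl == label]
    matches = sum(1 for f in in_class if f[feature_idx] == value)
    return matches / len(in_class)

def posterior_scores(features):
    """Unnormalized P(class | features) under the naive independence assumption."""
    scores = {}
    for label, count in classes.items():
        score = count / len(data)  # class prior
        for i, value in enumerate(features):
            score *= likelihood(i, value, label)  # likelihoods multiply independently
        scores[label] = score
    return scores

print(posterior_scores(("sunny", "mild")))
```

Because each feature contributes only a simple count-based factor, training reduces to tallying frequencies, which is what makes the algorithm so cheap.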
2. In Naive Bayes, which distribution is commonly assumed for continuous data?
- A) Poisson distribution
- B) Exponential distribution
- C) Normal distribution
- D) Binomial distribution
Answer: C) Normal distribution
Explanation: For continuous data, Naive Bayes typically assumes that the features follow a normal (Gaussian) distribution.
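Under that Gaussian assumption, the per-class likelihood of a continuous feature is its normal density, using the class's estimated mean and variance. A small sketch (the class statistics here are hypothetical):

```python
import math

def gaussian_pdf(x, mean, var):
    """Probability density of x under a normal distribution N(mean, var)."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Per-class (mean, variance) of one continuous feature, estimated from training data
stats = {"spam": (7.0, 2.0), "ham": (2.0, 1.0)}  # hypothetical numbers

x = 6.5
for label, (mean, var) in stats.items():
    print(label, gaussian_pdf(x, mean, var))  # likelihood of x under each class
```

In Gaussian Naive Bayes, these densities replace the count-based probabilities used for categorical features; everything else in the classifier stays the same.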
3. What is the main advantage of the Naive Bayes classifier?
- A) It performs well even with a small amount of training data.
- B) It can only be used for binary classification tasks.
- C) It is sensitive to irrelevant features.
- D) It requires a lot of computational resources.
Answer: A) It performs well even with a small amount of training data.
Explanation: Because Naive Bayes estimates only simple per-class statistics for each feature, it can achieve good accuracy with relatively small training sets, especially when the conditional-independence assumption approximately holds.
4. In Naive Bayes, what is the role of Bayes’ theorem?
- A) It computes the posterior probability of a class given the observed features.
- B) It is used to optimize the hyperparameters of the model.
- C) It is used for feature selection.
- D) It calculates the loss function for model training.
Answer: A) It computes the posterior probability of a class given the observed features.
Explanation: Bayes’ theorem is used to calculate the posterior probability of a class given the features, which is the core of the Naive Bayes algorithm.
5. Which type of data can Naive Bayes be applied to?
- A) Only numerical data
- B) Only categorical data
- C) Both categorical and numerical data
- D) Only data with binary features
Answer: C) Both categorical and numerical data
Explanation: Naive Bayes handles both: variants such as Multinomial or Bernoulli Naive Bayes model categorical and count data, while Gaussian Naive Bayes models continuous data under a normal assumption.
6. What is the primary disadvantage of the Naive Bayes classifier?
- A) It requires a large amount of training data.
- B) It assumes that features are independent, which is often unrealistic.
- C) It is computationally expensive.
- D) It performs poorly with large datasets.
Answer: B) It assumes that features are independent, which is often unrealistic.
Explanation: The independence assumption is a strong simplification that often does not hold in real-world data, which can lead to suboptimal performance.
7. In the context of Naive Bayes, what does the term “likelihood” refer to?
- A) The probability of a class given the features
- B) The probability of the features given the class
- C) The prior probability of a class
- D) The probability of the features occurring
Answer: B) The probability of the features given the class
Explanation: In Naive Bayes, the likelihood refers to the probability of observing the given features under each class.
8. What is the purpose of Laplace smoothing in Naive Bayes?
- A) To ensure that no probability is zero when a feature value does not appear in the training data.
- B) To reduce the computational complexity of the model.
- C) To optimize the prior probabilities.
- D) To handle missing data.
Answer: A) To ensure that no probability is zero when a feature value does not appear in the training data.
Explanation: Laplace smoothing adds a small constant to the probability estimates to avoid zero probabilities for unseen feature combinations.
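The smoothing formula can be sketched in a few lines (word counts and the alpha default are hypothetical): add a constant alpha to every count and alpha times the vocabulary size to the denominator, so an unseen feature value gets a small nonzero probability instead of zeroing out the whole product.

```python
from collections import Counter

def smoothed_prob(count, class_total, vocab_size, alpha=1.0):
    """Laplace-smoothed estimate of P(feature | class)."""
    return (count + alpha) / (class_total + alpha * vocab_size)

# Hypothetical word counts observed in class "spam"
spam_counts = Counter({"win": 4, "money": 3, "hello": 0})
total = sum(spam_counts.values())  # 7 observed words
vocab = 3                          # vocabulary size

# "hello" never appeared in spam, yet its probability is nonzero: (0+1)/(7+3) = 0.1
print(smoothed_prob(spam_counts["hello"], total, vocab))
```

With alpha = 1 this is classic "add-one" smoothing; smaller alpha values shrink the correction toward the raw counts.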
9. Naive Bayes is particularly suited for which of the following tasks?
- A) Regression with large datasets
- B) Multiclass classification problems
- C) Feature engineering for deep learning models
- D) Clustering similar data points
Answer: B) Multiclass classification problems
Explanation: Naive Bayes extends naturally to multiclass problems: it computes a posterior score for every class and predicts the class with the highest score, at essentially the same cost as the binary case.
10. Which of the following would likely reduce the performance of a Naive Bayes classifier?
- A) Using features that are highly correlated
- B) Using a smaller training dataset
- C) Using a very large number of features
- D) Using Laplace smoothing
Answer: A) Using features that are highly correlated
Explanation: Naive Bayes assumes feature independence. If features are highly correlated, the model’s performance can degrade because the independence assumption is violated.
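A toy calculation (all numbers hypothetical) shows why: a perfectly correlated duplicate of a feature carries no new information, yet Naive Bayes multiplies its likelihood in again, double-counting the evidence and pushing the posterior toward overconfidence.

```python
# Hypothetical priors and P(feature=1 | class) for two classes
prior = {"A": 0.5, "B": 0.5}
p_f = {"A": 0.9, "B": 0.6}

def score(label, copies):
    """Unnormalized posterior when the same feature is repeated `copies` times."""
    return prior[label] * p_f[label] ** copies

for copies in (1, 2, 4):
    a, b = score("A", copies), score("B", copies)
    print(copies, a / (a + b))  # normalized posterior for class A
```

The normalized posterior for class A climbs as the redundant copies accumulate, even though the underlying evidence has not changed, which is exactly the degradation the independence violation causes.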