Decision Trees MCQs

Basic Concepts:

What is a decision tree in machine learning?

A. A method for clustering data points
B. A supervised learning algorithm used for classification and regression
C. An unsupervised learning technique for dimensionality reduction
D. A technique for visualizing high-dimensional data
Answer: B
In a decision tree, what are internal nodes responsible for?

A. Representing the root node of the tree
B. Making predictions based on input features
C. Connecting nodes to form branches
D. Testing specific conditions on input features
Answer: D
In implementations that accept only numeric inputs, how are categorical variables typically handled for a decision tree?

A. By converting them into binary variables
B. By using them directly without any transformation
C. By clustering them into distinct groups
D. By applying logistic regression to them
Answer: A
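Converting a categorical variable into binary indicator columns (often called one-hot encoding) can be sketched as follows. This is a minimal illustration in plain Python; the function name `one_hot` and the sample colours are assumptions made for the example, not part of the quiz.

```python
def one_hot(values):
    """Turn a list of category labels into binary indicator rows.

    Returns the sorted category names and one 0/1 row per input value.
    """
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

# Hypothetical sample data: one categorical feature with three levels
colours = ["red", "green", "blue", "green"]
categories, encoded = one_hot(colours)
print(categories)
print(encoded)
```

Each original value becomes a row with exactly one 1, so a tree can test a single binary column at each split.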
What is pruning in the context of decision trees?

A. Adding more branches to increase model complexity
B. Removing branches to prevent overfitting
C. Standardizing the splits across different nodes
D. Converting continuous variables into categorical variables
Answer: B
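One way to remove branches is reduced-error pruning, sketched below under stated assumptions: the tree is a nested dictionary, every internal node stores a hypothetical `majority` label recorded at training time, and a separate validation set is available. Real libraries use other representations and often other pruning rules, so treat this as an illustration only.

```python
def predict(node, row):
    """Walk from the root to a leaf and return that leaf's label."""
    while "label" not in node:
        go_left = row[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]

def errors(node, rows, labels):
    """Count misclassified validation rows for the (sub)tree rooted at node."""
    return sum(predict(node, r) != y for r, y in zip(rows, labels))

def prune(node, rows, labels):
    """Bottom-up: replace a subtree by a leaf if validation error does not increase."""
    if "label" in node:
        return node
    f, t = node["feature"], node["threshold"]
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    pruned = dict(node)
    pruned["left"] = prune(node["left"], [r for r, _ in left], [y for _, y in left])
    pruned["right"] = prune(node["right"], [r for r, _ in right], [y for _, y in right])
    leaf = {"label": node["majority"]}
    if errors(leaf, rows, labels) <= errors(pruned, rows, labels):
        return leaf
    return pruned

# Hypothetical tree whose right branch has fitted a noisy training point
tree = {
    "feature": 0, "threshold": 5.0, "majority": "b",
    "left": {"label": "a"},
    "right": {
        "feature": 0, "threshold": 8.0, "majority": "b",
        "left": {"label": "b"},
        "right": {"label": "a"},
    },
}
val_rows = [[1.0], [3.0], [6.0], [7.0], [9.0]]
val_labels = ["a", "a", "b", "b", "b"]
pruned_tree = prune(tree, val_rows, val_labels)
print(pruned_tree)
```

In this example the inner split at 8.0 is collapsed into a single leaf because it only hurts validation accuracy, while the root split is kept.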
What is entropy used for in decision tree algorithms?

A. To measure the impurity or randomness of a dataset
B. To calculate the variance of the target variable
C. To normalize the distribution of residuals
D. To penalize complex models
Answer: A
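As a rough illustration of entropy as an impurity measure, the sketch below computes the Shannon entropy H = -sum(p_i * log2(p_i)) of a list of class labels. The function name `entropy` and the sample labels are assumptions made for this example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    # Add up -p * log2(p) over the classes that actually occur
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

pure = ["yes", "yes", "yes", "yes"]    # a single class: no impurity
mixed = ["yes", "yes", "no", "no"]     # an even two-class mix: maximum impurity
print(entropy(pure), entropy(mixed))
```

A node containing a single class has entropy 0, and an evenly mixed two-class node has entropy 1 bit, which is why a split that lowers entropy is considered informative.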
Splitting Criteria:

Which criterion is commonly used to measure impurity in classification trees?

A. Mean Squared Error (MSE)
B. Information Gain (IG)
C. Gini Index
D. Variance
Answer: C

How does the Gini Index differ from Information Gain as a splitting criterion?

A. Gini Index prefers splits that maximize the information gain.
B. Gini Index is more sensitive to outliers compared to Information Gain.
C. Gini Index is based on the variance of the target variable.
D. Gini Index measures the probability of incorrect classification.
Answer: D
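The Gini Index can be read as the chance of mislabelling a randomly drawn sample if its label were assigned at random according to the class proportions in the node: G = 1 - sum(p_i^2). A minimal sketch, with the function name `gini` and the sample labels chosen for illustration:

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())

print(gini(["a", "a", "a", "a"]))  # pure node
print(gini(["a", "a", "b", "b"]))  # evenly mixed two-class node
print(gini(["a", "a", "a", "b"]))  # 3:1 mix
```

Unlike entropy, this needs no logarithm, which is one commonly cited reason it is cheap to evaluate.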
When constructing a decision tree, what is the role of the splitting criterion?

A. To evaluate the statistical significance of each feature
B. To choose the best feature and threshold for splitting at each node
C. To standardize the coefficients of the independent variables
D. To preprocess data for analysis
Answer: B
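How a splitting criterion picks a feature and threshold can be sketched with an exhaustive search: try the midpoints between consecutive distinct values of every feature and keep the split with the lowest weighted impurity. The sketch assumes numeric features and Gini impurity; the names `gini` and `best_split` and the toy data are hypothetical, and real implementations are considerably more optimized.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best binary split."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({row[f] for row in rows})
        # Candidate thresholds: midpoints between consecutive distinct values
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best

# Toy data: feature 0 separates the classes cleanly, feature 1 does not
rows = [[2.0, 7.0], [3.0, 1.0], [6.0, 8.0], [7.0, 2.0]]
labels = ["a", "a", "b", "b"]
print(best_split(rows, labels))
```

Here the search selects feature 0 with a threshold of 4.5, because that split leaves both child nodes pure.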
What does the term “pruning” refer to in decision trees?

A. The process of removing outliers from the dataset
B. The method of handling missing values in variables
C. The technique for reducing the size of the tree to avoid overfitting
D. The step of transforming categorical variables into numerical ones
Answer: C
Which splitting criterion is preferred when the target variable is continuous, as in regression trees?

A. Gini Index
B. Information Gain
C. Mean Squared Error (MSE)
D. Root Mean Squared Error (RMSE)
Answer: C
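For a continuous target, a candidate split can be scored by the weighted mean squared error of the two child nodes around their own means; lower is better. The sketch below assumes a single numeric feature, and the names `mse` and `split_mse` and the sample values are illustrative.

```python
def mse(values):
    """Mean squared error of values around their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def split_mse(xs, ys, threshold):
    """Weighted MSE of the two children produced by splitting at threshold."""
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    return (len(left) * mse(left) + len(right) * mse(right)) / len(ys)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [10.0, 12.0, 30.0, 32.0]
print(mse(ys))                 # error before splitting
print(split_mse(xs, ys, 2.5))  # split between the two clusters
print(split_mse(xs, ys, 1.5))  # a worse split
```

The split at 2.5 groups similar target values together, so its weighted MSE is far lower than both the unsplit node and the split at 1.5.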
Model Evaluation and Applications:

How does a decision tree handle missing values during training?

A. It imputes missing values using the mean of the variable.
B. It removes instances with missing values from the dataset.
C. It assigns missing values to a separate category.
D. It splits the node based on available data.
Answer: D

What does the term “feature importance” refer to in decision trees?

A. The complexity of the decision tree model
B. The significance of each feature in making accurate predictions
C. The distribution of residuals in the dataset
D. The number of nodes and branches in the tree
Answer: B
How does a decision tree model prevent overfitting?

A. By using regularization techniques like Ridge or Lasso
B. By increasing the number of nodes and branches
C. By pruning the tree to reduce its size
D. By converting continuous variables into categorical variables
Answer: C
In what scenarios would you prefer using a decision tree over other machine learning algorithms?

A. When dealing with high-dimensional data
B. When the relationships between variables are linear
C. When transparency and interpretability are important
D. When there are multicollinearity issues among predictors
Answer: C
How does the depth of a decision tree affect its performance?

A. Deeper trees can capture more complex patterns but are more prone to overfitting.
B. Shallower trees are more accurate in predicting outcomes.
C. Deeper trees are less sensitive to changes in the dataset.
D. Shallower trees are more computationally expensive to train.
Answer: A
Practical Considerations and Interpretations:

What does a decision tree classifier output for a given instance in a dataset?

A. The predicted class label
B. The value of the target variable
C. The distribution of residuals
D. The p-value of the regression coefficients
Answer: A

How does the CART algorithm differ from other decision tree algorithms?

A. CART uses the Gini Index as its default splitting criterion for classification.
B. CART is specifically designed for regression tasks only.
C. CART cannot handle categorical variables.
D. CART is more computationally intensive compared to other algorithms.
Answer: A
What is the primary advantage of using decision trees in ensemble methods like Random Forests?

A. They reduce the variance of predictions and improve accuracy.
B. They increase the bias of the model and prevent overfitting.
C. They standardize the coefficients of independent variables.
D. They simplify complex relationships between variables.
Answer: A
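The variance-reduction effect of averaging can be illustrated with a small simulation. This is not a Random Forest: each "tree" is replaced by a hypothetical stand-in, `noisy_prediction`, that returns the true value plus independent Gaussian noise. Real trees in a forest are correlated, so the reduction seen in practice is smaller than in this idealized sketch.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def noisy_prediction(true_value=10.0, noise=2.0):
    """Stand-in for one high-variance model: the truth plus Gaussian noise."""
    return random.gauss(true_value, noise)

# Predictions from a single noisy model, repeated many times
single = [noisy_prediction() for _ in range(2000)]

# Predictions from an "ensemble" that averages 25 independent noisy models
averaged = [
    statistics.mean([noisy_prediction() for _ in range(25)])
    for _ in range(2000)
]

print(statistics.pvariance(single))
print(statistics.pvariance(averaged))
```

The averaged predictions spread much less around the true value than the single-model predictions, which is the intuition behind option A.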
How does the computational complexity of training a decision tree approximately scale with the number of training samples in the worst case, such as a highly unbalanced tree?

A. Linearly
B. Quadratically
C. Logarithmically
D. Exponentially
Answer: B
What is the primary disadvantage of using decision trees in machine learning?

A. They are prone to overfitting, especially with noisy data.
B. They cannot handle both categorical and numerical variables.
C. They require extensive preprocessing of data.
D. They are computationally expensive for large datasets.
Answer: A
