1. What is the primary characteristic of Big Data?
A. It involves large amounts of structured data only
B. It is used for storing and analyzing small datasets
C. It requires specialized software for real-time processing
D. It involves high volume, velocity, and variety of data
Answer: D
(Big Data is characterized by the 3Vs: volume, velocity, and variety)
2. Which of the following is NOT a typical feature of Big Data?
A. Variety
B. Volume
C. Verification
D. Velocity
Answer: C
(Verification is not one of the “Vs” of Big Data, which are volume, variety, and velocity)
3. Which of the following is an example of structured data?
A. Tweets from Twitter
B. A relational database
C. Video content from YouTube
D. Sensor data from a smart device
Answer: B
(A relational database stores data in a structured format, with clearly defined tables and schemas)
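To make "clearly defined tables and schemas" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `orders` and its columns are hypothetical, chosen only for illustration. Because every row must fit the declared schema, a query can rely on the columns being present.

```python
import sqlite3


def build_orders_table():
    """Create an in-memory database with one table that has a fixed schema."""
    conn = sqlite3.connect(":memory:")
    # The schema is declared up front: every row has the same typed columns.
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, "
        "customer TEXT NOT NULL, "
        "amount REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer, amount) VALUES (?, ?)",
        [("alice", 10.0), ("bob", 7.25), ("alice", 5.5)],
    )
    conn.commit()
    return conn


def total_for_customer(conn, customer):
    """Sum the amounts for one customer using a parameterized query."""
    row = conn.execute(
        "SELECT SUM(amount) FROM orders WHERE customer = ?", (customer,)
    ).fetchone()
    return row[0]


conn = build_orders_table()
print(total_for_customer(conn, "alice"))
```

Tweets and video content have no such fixed set of typed columns, which is why they are not treated as structured data in this question.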
4. In Data Mining, what is the main goal?
A. To collect and store data
B. To clean data for processing
C. To analyze large datasets and find patterns or relationships
D. To visualize data for better understanding
Answer: C
(Data Mining involves extracting useful patterns, trends, and relationships from large datasets)
5. Which of the following best describes Data Mining?
A. The process of storing data in a warehouse for future use
B. The process of analyzing large datasets to discover hidden patterns
C. The process of cleaning data for storage in databases
D. The process of designing a data storage system
Answer: B
(Data Mining is the process of analyzing large datasets to uncover hidden patterns and relationships)
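As a small illustration of "discovering hidden patterns", the sketch below counts which pairs of items appear together across a set of shopping baskets. It is a simplified stand-in for association-rule mining, not a full algorithm such as Apriori. The function name `frequent_pairs`, the `min_support` threshold, and the sample baskets are all assumptions made up for this example.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs that occur together in at least min_support transactions."""
    counts = Counter()
    for items in transactions:
        # Deduplicate and sort so each pair has one canonical ordering.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical sample data.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "milk", "eggs"],
]
print(frequent_pairs(baskets, min_support=2))
```

Here the pair (bread, milk) appears in three baskets and (butter, milk) in two, so both pass a threshold of 2. That is a relationship no single record shows on its own.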
6. Which of the following Big Data technologies is primarily used for distributed storage?
A. Hadoop
B. R
C. Weka
D. RapidMiner
Answer: A
(Hadoop provides distributed storage through HDFS and distributed processing through MapReduce; R, Weka, and RapidMiner are analysis tools, not storage systems)
7. What is the main function of Hadoop Distributed File System (HDFS)?
A. To process large datasets in parallel
B. To provide a fault-tolerant storage system for Big Data
C. To analyze data using machine learning algorithms
D. To visualize data trends in real-time
Answer: B
(HDFS is a distributed file system that stores large datasets reliably across a cluster by replicating each data block on several nodes, so the failure of one node does not lose data)
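HDFS gets its fault tolerance from keeping several copies of each block on different nodes (the default replication factor is 3). The toy simulation below shows the idea only. The round-robin placement, the node names, and the helper functions `place_blocks` and `readable_blocks` are assumptions for illustration; real HDFS uses a rack-aware placement policy and none of this is its actual API.

```python
def place_blocks(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin (simplified)."""
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    placement = {}
    for block in range(num_blocks):
        placement[block] = [
            nodes[(block + i) % len(nodes)] for i in range(replication)
        ]
    return placement


def readable_blocks(placement, failed_nodes):
    """Return the blocks that still have at least one replica on a live node."""
    failed = set(failed_nodes)
    return [
        block
        for block, replicas in placement.items()
        if any(node not in failed for node in replicas)
    ]


# Hypothetical four-node cluster storing six blocks.
nodes = ["node-a", "node-b", "node-c", "node-d"]
placement = place_blocks(6, nodes, replication=3)

# With three replicas per block, any two node failures leave every block readable.
print(len(readable_blocks(placement, ["node-a", "node-b"])))
```

With three replicas per block, a block becomes unreadable only if all three nodes holding it fail at once.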
8. What does NoSQL stand for?
A. Not Only SQL
B. New SQL
C. Non-Structured Query Language
D. None of the above
Answer: A
(NoSQL is commonly expanded as Not Only SQL and refers to databases that store data in models other than the traditional relational one, such as key-value, document, column-family, and graph stores)
9. Which of the following is a key technique in Data Mining?
A. Clustering
B. Sorting
C. Indexing
D. Fragmenting
Answer: A
(Clustering is a common technique used in Data Mining to group similar data points together)
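One way to see what "grouping similar data points together" means is a minimal k-means sketch on one-dimensional data. k-means is only one of several clustering methods, and this version is deliberately simplified: the function name `kmeans_1d`, the fixed iteration count, and the hand-picked starting centroids are assumptions for illustration.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Cluster 1-D points around the given starting centroids (simplified k-means)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)), key=lambda i: abs(p - centroids[i])
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical data with two obvious groups.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = kmeans_1d(points, centroids=[1.0, 12.0])
print(centroids)
print(clusters)
```

No labels are supplied; the two groups emerge from the distances between the points alone. Sorting and indexing, by contrast, only reorder or locate existing records.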
10. What is the main difference between Big Data and traditional data processing?
A. Big Data requires a faster network connection
B. Big Data involves smaller datasets with faster processing
C. Big Data requires new technologies to handle large, diverse, and fast data streams
D. Big Data involves more manual work for data cleaning and analysis
Answer: C
(Big Data requires new technologies, like Hadoop, to handle large, diverse, and fast-moving data streams)