# Data Mining Methods Q&A

This is a set of 20 questions on Data Mining Methods. Please note that all questions and answers are based on our research and self-study.

1. The process of extracting valid, useful, previously unknown information from data and using it to make proactive, knowledge-driven business decisions is called ____________

Ans: Data Mining

2. What is the other name for Data Preparation stage of Knowledge Discovery Process in data mining?

Ans: ETL

3. Which of the following roles is responsible for performing validation on analysis datasets?
Ans: Statisticians

4. Which of the following activities is performed as part of data preprocessing?
Ans: Detect Missing Values
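
As an illustration of detecting missing values, here is a minimal sketch in plain Python. The records, field names, and the rule that `None` or a blank string counts as missing are all assumptions made for the example, not part of the question.

```python
# Hypothetical records; None and blank strings are treated as missing.
records = [
    {"age": 34, "income": 52000, "city": "Pune"},
    {"age": None, "income": 61000, "city": "Delhi"},
    {"age": 29, "income": None, "city": ""},
    {"age": 41, "income": 48000, "city": "Mumbai"},
]


def is_missing(value):
    """Return True for None or for a string that is empty after stripping."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def count_missing(rows):
    """Count missing values per field across all rows."""
    counts = {}
    for row in rows:
        for field, value in row.items():
            counts[field] = counts.get(field, 0) + (1 if is_missing(value) else 0)
    return counts


missing = count_missing(records)
print(missing)
```

Once the counts are known, the usual next step is to decide per field whether to drop, impute, or flag the affected rows.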

5. Which of the following modelling types should be used for labelled data?
Ans: Predictive Modelling

6. Noisy values are values that are valid for the dataset but are incorrectly recorded.
Ans: True

7. Which statistical technique deals with finding a structure in a collection of unlabeled data?
Ans: Clustering

8. Probability of theft in an area is 0.03 with expected loss of 20% or 30% of things with probabilities 0.55 and 0.45. Insurance policy from A costs \$150 pa with 100% repayment. Policy with B, costs \$100 pa and first \$500 of any loss has to be paid by the owner. Which data mining technique can be used to choose the policy?
Ans: Decision Tree
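
A decision tree for this question compares the expected annual cost of each branch. The sketch below is only illustrative: the question does not state the total value of the property, so `PROPERTY_VALUE` is a hypothetical figure, and the helper `expected_cost` is a name introduced here.

```python
def expected_cost(premium, p_theft, outcomes, owner_share):
    """Premium plus P(theft) times the owner's expected share of the loss."""
    expected_share = sum(p * owner_share(loss) for p, loss in outcomes)
    return premium + p_theft * expected_share


PROPERTY_VALUE = 10_000  # assumption: not given in the question
P_THEFT = 0.03

# Chance node after a theft: 20% of the value lost (p=0.55) or 30% lost (p=0.45).
outcomes = [(0.55, 0.20 * PROPERTY_VALUE), (0.45, 0.30 * PROPERTY_VALUE)]

options = {
    # Policy A: $150 premium, insurer repays 100%, so the owner pays nothing more.
    "Policy A": expected_cost(150, P_THEFT, outcomes, lambda loss: 0.0),
    # Policy B: $100 premium, owner pays the first $500 of any loss.
    "Policy B": expected_cost(100, P_THEFT, outcomes, lambda loss: min(loss, 500)),
}

best = min(options, key=options.get)
print(options, best)
```

Under the assumed property value, both possible losses exceed \$500, so Policy B adds an expected 0.03 × 500 = \$15 to its premium and comes out cheaper in expectation than Policy A. A different property value or a risk-averse owner could change the choice.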

9. What is the type of learning where a function is inferred to describe hidden structure from unlabeled data?
Ans: Unsupervised Learning

10. The statistical technique used for investigating and modelling the relationship between two or more variables is:
Ans: Regression analysis

11. If time is used as an independent variable in a simple linear regression analysis, which of the following assumptions could be violated?
Ans: Successive observations of the dependent variable are uncorrelated
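
One common way to check this assumption is the Durbin-Watson statistic on the regression residuals. The sketch below uses a made-up time series (a linear trend plus a slow wave) and hand-written helpers `fit_line` and `durbin_watson`; both the data and the helper names are assumptions for illustration.

```python
from math import sin


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    return y_mean - slope * x_mean, slope


def durbin_watson(residuals):
    """Sum of squared successive differences over sum of squared residuals."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    return numerator / sum(e ** 2 for e in residuals)


# Hypothetical series: neighbouring errors move together because of the slow wave.
times = list(range(30))
values = [2.0 * t + 5.0 * sin(t / 3) for t in times]

intercept, slope = fit_line(times, values)
residuals = [y - (intercept + slope * t) for t, y in zip(times, values)]
dw = durbin_watson(residuals)
print(dw)
```

A value near 2 suggests little lag-1 autocorrelation, while a value well below 2, as with this series, points to positively correlated successive observations.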

12. The machine learning task of inferring a function from labelled training data is known as ____________
Ans: Supervised Learning

13. Which is the statistical technique used for investigating and modelling the relationship between two or more variables?
Ans: Regression analysis

14. Regression is typically carried out to develop a mathematical model of the process.
Ans: True

15. Association rule mining is also known as ____________
Ans: Affinity analysis
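
Affinity analysis is usually described in terms of the support and confidence of a rule. A minimal sketch follows, using a hypothetical set of four shopping baskets; the item names and the helper functions are assumptions for illustration.

```python
# Hypothetical market baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)


rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
print(rule_support, rule_confidence)
```

Here the rule bread -> milk holds in 2 of 4 baskets (support 0.5) and in 2 of the 3 baskets that contain bread (confidence about 0.67).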

16. Which data mining method groups together objects that are similar to each other and dissimilar to the other objects?
Ans: Clustering
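
To make the idea of grouping similar objects concrete, here is a minimal one-dimensional k-means sketch. The data points, the starting centroids, and the fixed iteration count are assumptions chosen for the example; k-means is only one of several clustering methods.

```python
def k_means_1d(values, centroids, iterations=10):
    """Assign each value to its nearest centroid, then move centroids to cluster means."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical data with two visibly separate groups.
values = [1.0, 1.2, 0.8, 8.0, 8.3, 7.9]
centroids, clusters = k_means_1d(values, centroids=[0.0, 10.0])
print(centroids, clusters)
```

No labels are supplied: the two groups emerge purely from the distances between the values.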

17. Which of the following activities are performed as part of data preprocessing?
Ans: All the options

18. ______________ are the values that mark the boundaries of the confidence interval.
Ans: Confidence limits
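
As a rough illustration, the confidence limits for a sample mean can be computed as mean ± z × standard error. The sketch below uses the normal approximation from Python's `statistics` module; the sample values are made up, and for a sample this small a t-based interval would normally be preferred.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_limits(sample, confidence=0.95):
    """Lower and upper limits for the mean, using a normal (z) approximation."""
    centre = mean(sample)
    std_error = stdev(sample) / sqrt(len(sample))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return centre - z * std_error, centre + z * std_error


# Hypothetical measurements.
sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7]
lower, upper = confidence_limits(sample)
print(lower, upper)
```

The two returned numbers are the confidence limits; the range between them is the confidence interval.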

19. Simulations are carried out to develop a mathematical model of the process.
Ans: False

20. Which of the following is not applicable to Data Mining?
Ans: Involves working with known information