Quiz
#ch1_review
Posted: Sep 24, 2013 6:39:08 AM
Discuss whether or not each of the following activities is a data mining task. If so, which style of task is it (classification, association, clustering, or numeric prediction)?
(a) Dividing the customers of a company according to their gender.
(b) Dividing the customers of a company according to their profitability.
(c) Computing the total sales of a company.
(d) Sorting a student database based on student identification numbers.
(e) Predicting the outcomes of tossing a (fair) pair of dice.
(f) Predicting the future stock price of a company using historical records.
(g) Monitoring the heart rate of a patient for abnormalities.
(h) Monitoring seismic waves for earthquake activities.
(i) Extracting the frequencies of a sound wave.
For each of the following data sets, explain whether or not data privacy is an important issue.
(a) Census data collected from 1900-1950.
(b) IP addresses and visit times of Web users who visit your Website.
(c) Images from Earth-orbiting satellites.
(d) Names and addresses of people from the telephone book.
(e) Names and email addresses collected from the Web.
(f) Wireless data collected from a non-encrypted WiFi network.
Watch the following videos on TED and write your thoughts on personal data and privacy protection in the digital era.
Suppose you are asked to develop a stock price prediction model. Consider the following questions.
(a) What attributes do you want to use? Write down the reasons.
(b) Where will you collect the attributes you want to use? List the URLs from which they can be downloaded.
(c) In our course, I mentioned two methods used in power load forecasting: kNN and linear regression. Which one would you apply to stock price prediction? Write down your reason.
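To make the comparison in (c) concrete, here is a minimal sketch of both methods applied to a one-dimensional price series. Everything in it is an assumption made for illustration: the prices are made-up numbers, the function names `linear_regression_forecast` and `knn_forecast` are hypothetical, and the only attributes used are time (for the regression) and the most recent `window` prices (for kNN). It is not the power load forecasting setup from the course.

```python
# Toy closing prices (made-up numbers, for illustration only).
prices = [10.0, 10.5, 11.0, 10.8, 11.5, 12.0, 11.8, 12.5]


def linear_regression_forecast(series):
    """Fit y = a + b*t by ordinary least squares and predict the next step."""
    n = len(series)
    ts = list(range(n))
    t_mean = sum(ts) / n
    y_mean = sum(series) / n
    # Slope: covariance of (t, y) divided by variance of t.
    b = (sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, series))
         / sum((t - t_mean) ** 2 for t in ts))
    a = y_mean - b * t_mean
    return a + b * n  # value of the fitted line at the next time index


def knn_forecast(series, window=2, k=2):
    """Average the values that followed the k past windows most similar
    (Euclidean distance) to the most recent window."""
    query = series[-window:]
    candidates = []
    for start in range(len(series) - window):
        past = series[start:start + window]
        following = series[start + window]
        dist = sum((p - q) ** 2 for p, q in zip(past, query)) ** 0.5
        candidates.append((dist, following))
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]
    return sum(value for _, value in nearest) / len(nearest)


print("linear regression forecast:", linear_regression_forecast(prices))
print("kNN forecast:", knn_forecast(prices, window=2, k=2))
```

The regression extrapolates a single global trend, while kNN can only return an average of values that have already occurred in the series. That difference is worth weighing when you justify your choice.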
Consider the problem of finding the k-nearest neighbors of a data object. A programmer designs an algorithm for this task as follows.
(a) Describe the potential problems with this algorithm if there are duplicate objects in the data set. Assume the distance function will only return a distance of 0 for objects that are the same.
(b) How would you fix this problem?
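For experimenting with this question, the following is a generic brute-force nearest-neighbor search. It is not the programmer's algorithm referred to above, only an assumed baseline: the function name `k_nearest`, the Euclidean distance, and the sample points are all hypothetical choices. The sample data deliberately contains three copies of the same object.

```python
import math


def k_nearest(data, query_index, k):
    """Return the indices of the k objects closest to data[query_index].

    The query object is skipped by index, but other objects with
    identical coordinates are still eligible neighbors.
    """
    query = data[query_index]
    distances = []
    for i, point in enumerate(data):
        if i == query_index:
            continue
        distances.append((math.dist(query, point), i))
    distances.sort()  # ascending distance; ties broken by index
    return [i for _, i in distances[:k]]


# Three duplicates of (0, 0) plus two distinct objects.
points = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
neighbors = k_nearest(points, 0, 2)
print(neighbors)
```

Try varying `k` and the number of duplicates, and look at which objects appear in the result and at what distances.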