Quiz
#ch1_review

Posted: Sep 24, 2013 6:39:08 AM

    (a) Dividing the customers of a company according to their gender.

    (b) Dividing the customers of a company according to their profitability.

    (c) Computing the total sales of a company.

    (d) Sorting a student database based on student identification numbers.

    (e) Predicting the outcomes of tossing a (fair) pair of dice.

    (f) Predicting the future stock price of a company using historical records.

    (g) Monitoring the heart rate of a patient for abnormalities.

    (h) Monitoring seismic waves for earthquake activities.

    (i) Extracting the frequencies of a sound wave.


    (a) Census data collected from 1900-1950.

    (b) IP addresses and visit times of Web users who visit your Website.

    (c) Images from Earth-orbiting satellites.

    (d) Names and addresses of people from the telephone book.

    (e) Names and email addresses collected from the Web.

    (f) Wireless data collected from an unencrypted WiFi network.



    (a) What attributes do you want to use? Write down your reasons.

    (b) Where will you collect the attributes you want to use? List the URLs from which they can be downloaded.

    (c) In our course, I mentioned two methods used in power load forecasting: kNN and linear regression. Which one would you apply to stock price prediction? Write down your reason.
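
For reference, the two methods named in (c) can be sketched on a toy series as follows. This is a minimal illustration, not the course's implementation: the prices are made-up numbers, the helper names (`make_lagged`, `knn_predict`, `fit_line`) are hypothetical, and the choice of two lagged prices as features and k = 3 is an assumption made only for the example.

```python
# Toy closing prices (made-up numbers, for illustration only).
prices = [10.0, 10.5, 11.0, 10.8, 11.2, 11.6, 11.4, 11.9, 12.3, 12.1]


def make_lagged(series, n_lags):
    """Build (features, target) pairs: the previous n_lags values predict the next one."""
    X, y = [], []
    for i in range(n_lags, len(series)):
        X.append(series[i - n_lags:i])
        y.append(series[i])
    return X, y


def knn_predict(X, y, query, k):
    """kNN regression: average the targets of the k rows closest to query (Euclidean)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)) ** 0.5, target)
        for row, target in zip(X, y)
    )
    nearest = dists[:k]
    return sum(target for _, target in nearest) / len(nearest)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Assumption: use the two previous prices as features for kNN.
X, y = make_lagged(prices, n_lags=2)
knn_forecast = knn_predict(X, y, query=prices[-2:], k=3)

# Assumption: regress the next price on the previous price only.
slope, intercept = fit_line([row[-1] for row in X], y)
lr_forecast = slope * prices[-1] + intercept

print("kNN forecast:", knn_forecast)
print("Linear regression forecast:", lr_forecast)
```

Note one structural difference the sketch makes visible: the kNN forecast is an average of past targets, so it can never fall outside the range of prices already seen, whereas the fitted line can extrapolate beyond it.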


    (a) Describe the potential problems with this algorithm if there are duplicate objects in the data set. Assume the distance function will only return a distance of 0 for objects that are the same.

    (b) How would you fix this problem?