Course Information (Spring 2016)

General Information
  • Course Number: CE5020 @ NCU, Taiwan
  • Location: E6-A205
  • Semester: Spring 2016, Friday 9:00am-11:50am
  • Target students: Senior undergraduates, graduates
Teaching Assistant
  • 林圓皓, 張國斌
  • Office: E6-A307 ext. 35327 or R2-206 ext. 57868
Evaluation and Grading
  • Three Assignments: 20%
  • Oral presentation: 15%
  • Quiz: 10%
  • Mid-term Exam: 25%
  • Project: 30%
Course Topics
  • Preliminary (1 week)
  • Predictive data mining (2 weeks)
  • Association rule mining (2 weeks)
  • Cluster analysis (2 weeks)
  • Recommendation (2 weeks)
  • Finding Similar Items (2 weeks)


Recent Announcements

  • Assignment #4
    1. Use the given GPS data from Flickr to complete a clustering task. Try different clustering algorithms.
    2. Show your clustering result on "3D Map", a feature of Excel 2016 (named "Power View" in Excel 2013).
    3. Compute the correlation of your clustering result for evaluation.
    References: 3D Map tutorial, Power View tutorial
    Due date: 2016/4/15 23:59 (1st round), 2016/4/22 23:55 (2nd round)
    Submission ...
    Posted Apr 8, 2016, 12:25 AM by 林圓皓
  • Reference Answer for Assignment #2
    Posted Apr 7, 2016, 5:57 PM by Chia-Hui Chang
  • Assignment #3
    1. Video lectures: More Data Mining with Weka, 3.3~3.4. Exercise: Use data.txt attached below to find the top 10 association rules with the highest support and a minimum confidence greater than 0.8.
    2. Use the given grocery shopping dataset released by ACM RecSys to find interesting patterns. The dataset contains users' transaction data for 4 months, from November 2000 to February 2001. It holds 817741 transactions in total, belonging to 32266 users and 23812 products. The file D.txt records users' transaction history. Each line in the file corresponds to a transaction in the following format: Transaction date; customerID; Age group; Residence Area ...
    Posted Mar 25, 2016, 6:39 PM by Chia-Hui Chang
  • Assignment #2
    Problem 1: For the given dataset, (a) produce a learning curve (accuracy vs. training size); (b) plot the ROC curve for your model; (c) produce a plot (accuracy on training and testing vs. model size) to observe overfitting and underfitting scenarios by tuning the parameters of your learning algorithms.
    Problem 2: Performance evaluation. (a) For a k-label (k>2) classification problem, propose a metric for performance evaluation. (b) For a 5-grade outcome, propose a measure for performance evaluation.
    Due date: 2016/3/18 23:59 (1st round), 2016/3/25 23:59 (2nd round)
    Submission: upload to LMS
    Posted Mar 12, 2016, 5:31 PM by Chia-Hui Chang
  • Assignment #1
    1. Construct a pivot table for the given precipitation data (in file 2015_rainfall.xlsx) to (a) generate a summary of the data and (b) generate a report of each location's average rainfall for each month, i.e. location ("BANQIAO, 板橋", "TAMSUI, 淡水") vs. week.
    2. Prepare an ARFF file from the given weather data (2015_weather.csv) so that Weka can import it.
    References: Getting Started with Weka 1.1~1.5
    Due date: 2016/3/3 23:59 (1st round), 2016/3/10 23:59 (2nd round)
    Posted Mar 10, 2016, 5:04 PM by Chia-Hui Chang
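
Note on Assignment #1, item 2: an ARFF file is plain text with a @RELATION line, one @ATTRIBUTE line per column, and a @DATA section of comma-separated rows. The Python sketch below shows one possible conversion from CSV text. It is only an illustration: the function name csv_to_arff and the sample columns (station, temp, rain) are made up here, since the actual columns of 2015_weather.csv are not listed on this page. It also assumes attribute names and values contain no spaces or commas (these would need quoting) and that a column is numeric whenever all of its values parse as numbers.

```python
import csv
import io


def csv_to_arff(csv_text, relation):
    """Convert CSV text with a header row into ARFF text.

    A column is declared NUMERIC if every non-missing value parses as a
    float; otherwise it is declared nominal with its sorted distinct values.
    Empty cells are written as '?', the ARFF marker for a missing value.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]

    def is_number(text):
        try:
            float(text)
            return True
        except ValueError:
            return False

    lines = [f"@RELATION {relation}", ""]
    for i, name in enumerate(header):
        values = [row[i] for row in data if row[i] != ""]
        if values and all(is_number(v) for v in values):
            lines.append(f"@ATTRIBUTE {name} NUMERIC")
        else:
            nominal = ",".join(sorted(set(values)))
            lines.append(f"@ATTRIBUTE {name} {{{nominal}}}")

    lines += ["", "@DATA"]
    for row in data:
        lines.append(",".join(v if v != "" else "?" for v in row))
    return "\n".join(lines) + "\n"


# Hypothetical sample; the real file's columns may differ.
sample = "station,temp,rain\nBANQIAO,18.5,yes\nTAMSUI,,no\n"
print(csv_to_arff(sample, "weather"))
```

Weka can also import CSV files directly and save them as ARFF, so a script like this is mainly useful for checking that the attribute types come out as intended.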
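
Note on Assignment #3: the two measures named there can be stated in a few lines. The support of an itemset is the fraction of transactions that contain it, and the confidence of a rule A -> B is support(A and B together) divided by support(A). The Python sketch below only illustrates these definitions on a made-up toy basket list; it is not Weka's Apriori implementation, and the function names are chosen here for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante > 0 else 0.0


# Toy data, invented for this example (not taken from data.txt or D.txt).
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "diaper", "beer"}),
    frozenset({"milk", "diaper", "beer"}),
    frozenset({"bread", "milk", "diaper"}),
    frozenset({"bread", "milk", "beer"}),
]

# {bread, milk} appears in 3 of 5 baskets; bread appears in 4 of 5.
print(support(transactions, {"bread", "milk"}))
print(confidence(transactions, {"bread"}, {"milk"}))
```

With a minimum confidence of 0.8, as in the exercise, the rule bread -> milk in this toy data (confidence about 0.75) would be filtered out.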