### data mining/R-Stats

#### TYPE 1 AND TYPE 2 ERROR

posted Feb 5, 2015, 12:40 PM by shuo zhang   [ updated Feb 5, 2015, 12:45 PM ]

Which one is it? Suppose we do know the true mean of the sampling distribution, and it turns out that our estimate from a sample of 30 is correct (H0).
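As a rough illustration of the case where H0 is true, the sketch below draws many samples of size 30 from a normal distribution whose mean really is the hypothesized value and counts how often a two-sided z-test rejects anyway. Every such rejection is a Type 1 error, so the rate should land near the significance level (about 0.05 for a critical value of 1.96). The function name `type1_error_rate`, the normal model with known sigma, and all parameter values are assumptions made for illustration, not part of the original note.

```python
import random

def type1_error_rate(true_mean=0.0, sigma=1.0, n=30, trials=2000,
                     z_crit=1.96, seed=0):
    """Fraction of samples that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Draw a sample whose true mean equals the hypothesized mean (H0 holds).
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        # z statistic for a known-sigma test of the mean.
        z = (sample_mean - true_mean) / (sigma / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1  # rejecting a true H0 is a Type 1 error
    return rejections / trials

rate = type1_error_rate()
print(rate)
```

A Type 2 error is the opposite case: H0 is false, but the test fails to reject it.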

#### great explanation of prosecutor's fallacy

posted Mar 12, 2014, 9:52 AM by shuo zhang

 http://www.programminglogic.com/the-prosecutors-fallacy/

#### set theory

posted Mar 11, 2014, 6:46 PM by shuo zhang

union and intersection: http://online.math.uh.edu/MiddleSchool/Modules/Module_5_Prob_Stat/Content/Prob/Ch3_3.pdf

#### second order derivative

posted Jan 15, 2014, 12:13 PM by shuo zhang   [ updated Feb 13, 2014, 6:56 PM ]

#### must update R to 3.0.2 to support Mac OS X Mavericks

posted Jan 3, 2014, 9:42 AM by shuo zhang

#### classification evaluation measures

posted Dec 15, 2013, 9:16 AM by shuo zhang

|          | predicted + | predicted - |
|----------|-------------|-------------|
| actual + | TP          | FN          |
| actual - | FP          | TN          |

TPR=recall=TP/(TP+FN)
precision=TP/(TP+FP)
TNR=TN/(TN+FP)=specificity
FPR=FP/(FP+TN)
FNR=FN/(FN+TP)
F=2*recall*precision/(precision+recall)=2*TP/(2*TP+FP+FN)

The F-measure (F1 score) is the harmonic mean of precision and recall, i.e., with r = recall and p = precision,
F=2/(1/r+1/p)
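The formulas above can be checked with a small sketch that takes the four confusion-matrix counts and returns each measure. The function name `classification_metrics` and the example counts are made up for illustration.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute the evaluation measures from confusion-matrix counts."""
    recall = tp / (tp + fn)          # TPR
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)     # TNR
    fpr = fp / (fp + tn)
    fnr = fn / (fn + tp)
    # Harmonic mean of precision and recall.
    f1 = 2 * recall * precision / (precision + recall)
    return {
        "recall": recall,
        "precision": precision,
        "specificity": specificity,
        "fpr": fpr,
        "fnr": fnr,
        "f1": f1,
    }

# Hypothetical counts: 40 TP, 10 FN, 20 FP, 30 TN.
m = classification_metrics(tp=40, fn=10, fp=20, tn=30)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Note that FPR = 1 - TNR and FNR = 1 - TPR, so only two of the four rates are independent.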

#### k-medoids algorithm and demo

posted Dec 13, 2013, 12:24 PM by shuo zhang

The most common realisation of k-medoid clustering is the Partitioning Around Medoids (PAM) algorithm and is as follows:

1. Initialize: randomly select k of the n data points as the medoids.
2. Associate each data point to the closest medoid. ("Closest" here is defined using any valid distance metric, most commonly Euclidean distance, Manhattan distance or Minkowski distance.)
3. For each medoid m and each non-medoid data point o, swap m and o and compute the total cost of the configuration.
4. Select the configuration with the lowest cost.
5. Repeat steps 2 to 4 until there is no change in the medoids.

demo: http://en.wikipedia.org/wiki/K-medoids
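The PAM steps described above can be sketched in plain Python as follows. This is a minimal illustration rather than a tuned implementation: the names `pam`, `total_cost` and `manhattan` are hypothetical, Manhattan distance is just one of the metrics mentioned, and the toy points are invented.

```python
import random

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def total_cost(points, medoids, dist):
    # Sum of distances from every point to its closest medoid.
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)

def pam(points, k, dist=manhattan, seed=0):
    rng = random.Random(seed)
    # Step 1: pick k data points (by index) as the initial medoids.
    medoids = rng.sample(range(len(points)), k)
    best = total_cost(points, medoids, dist)
    changed = True
    while changed:  # Step 5: repeat until the medoids stop changing.
        changed = False
        best_swap = None
        # Step 3: try swapping each medoid with each non-medoid point.
        for i in range(k):
            for o in range(len(points)):
                if o in medoids:
                    continue
                candidate = medoids[:i] + [o] + medoids[i + 1:]
                cost = total_cost(points, candidate, dist)
                if cost < best:
                    best, best_swap = cost, candidate
        # Step 4: keep the lowest-cost configuration found, if any.
        if best_swap is not None:
            medoids = best_swap
            changed = True
    # Step 2: associate each point with its closest medoid.
    labels = [min(range(k), key=lambda j: dist(p, points[medoids[j]]))
              for p in points]
    return medoids, labels, best

# Toy data: two well-separated groups of four points each.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9)]
medoids, labels, cost = pam(points, k=2)
print(medoids, labels, cost)
```

Each pass evaluates every medoid/non-medoid swap and recomputes the full cost, which is what makes PAM expensive on large data sets compared with k-means.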

#### Cute illustration of k-means

posted Dec 13, 2013, 12:13 PM by shuo zhang   [ updated Feb 13, 2014, 6:56 PM ]

#### using plot3d to visualize 3d data in R

posted Dec 12, 2013, 12:57 PM by shuo zhang