Hypothesis 1

H0: There is no difference between the mean energy of pop songs and the mean energy of non-pop songs.

· T-test:

The t-test is used to test for a difference between two means when the two samples are drawn independently from normal distributions. The t statistic measures how far apart the two sample means are relative to the variability of the data, and a p-value is calculated from it. If the p-value is lower than the threshold, we can reject the null hypothesis. Here we set the threshold to 0.05.
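For reference, one common form of the statistic is the pooled-variance (Student's) two-sample t statistic. The text does not say which variant of the test was used, so the equal-variance form here is an assumption, and the symbols (sample means \bar{x}_1, \bar{x}_2, sample variances s_1^2, s_2^2, and sample sizes n_1, n_2 for the pop and non-pop groups) are introduced only for illustration.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}
```

Under H0 and these assumptions, t follows a t distribution with n_1 + n_2 - 2 degrees of freedom, and the two-sided p-value is the probability of observing a value at least as extreme as |t|.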

Motivation:

In the real world, there is no connection between pop songs and non-pop songs, so it is reasonable to assume that the pop-song data and the non-pop-song data come from independent normal distributions. To compare the mean energy of two independent, normally distributed samples, the t-test is a common choice.

Experiment:

1. Select the records whose Parentcat is pop; these form the pop group.

2. The remaining records form the non-pop group.

3. Apply a t-test to the energy values of the two groups.
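The three steps can be sketched in Python as follows. This is a minimal illustration, not the code used for the experiment: the records are made-up values, the exact label "pop" and the field name "energy" are assumptions, and the pooled-variance (equal-variance) form of the t-test is assumed because the text does not say which variant was run. The p-value is obtained by numerically integrating the t density, which is accurate enough for illustration.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 - 2 * (area under the density on [0, |t|]).

    The area is computed with Simpson's rule; steps must be even.
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1.0 - 2.0 * area)


def pooled_t_test(a, b):
    """Pooled-variance two-sample t-test; returns (t statistic, p-value)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, two_sided_p(t, df)


# Hypothetical records; the real dataset and its values are not shown here.
records = [
    {"Parentcat": "pop", "energy": 0.81},
    {"Parentcat": "pop", "energy": 0.77},
    {"Parentcat": "pop", "energy": 0.85},
    {"Parentcat": "pop", "energy": 0.79},
    {"Parentcat": "pop", "energy": 0.83},
    {"Parentcat": "rock", "energy": 0.42},
    {"Parentcat": "jazz", "energy": 0.55},
    {"Parentcat": "folk", "energy": 0.38},
    {"Parentcat": "rock", "energy": 0.61},
    {"Parentcat": "jazz", "energy": 0.47},
]

# Step 1: records with Parentcat equal to pop form the pop group.
pop = [r["energy"] for r in records if r["Parentcat"] == "pop"]
# Step 2: all remaining records form the non-pop group.
non_pop = [r["energy"] for r in records if r["Parentcat"] != "pop"]
# Step 3: apply the t-test to the energy values of the two groups.
t_stat, p_value = pooled_t_test(pop, non_pop)

ALPHA = 0.05  # threshold stated in the text
reject_h0 = p_value < ALPHA
print(f"t = {t_stat:.3f}, p = {p_value:.5f}, reject H0: {reject_h0}")
```

With a real dataset, a library routine such as SciPy's `scipy.stats.ttest_ind` would normally replace the hand-written test; the manual version is used here only to keep the sketch dependency-free and to show what the test computes.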

Result: The statistical test result is

Since the p-value is smaller than 0.05, we can reject the null hypothesis. There is a significant difference between the mean energy of pop songs and that of non-pop songs. We can infer that songs of a given category have their own characteristics, which makes us more confident in conducting further analysis such as classification.
