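Hypothesis Testing of Age and Unemployment Rate

Scenario

  • A sample of 50 people in a city was surveyed; their ages and weeks of unemployment are given in the attached file.
  • National average weeks of unemployment = 14.6

Questions

  • Use descriptive statistics to summarize the age and weeks-of-unemployment data.
  • At a 95% confidence level, estimate the interval for the mean age of the unemployed.
  • At a 0.01 significance level, test whether the sample mean weeks of unemployment is greater than the national average.
  • Analyze whether age is related to weeks of unemployment.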

GNU R

# Read the data file into a table
AgeWeeks <- read.table("HypothesisTesting-Case2.dat", header=TRUE)

# Descriptive statistics
summary(AgeWeeks)

# Result
#      Age            Weeks      
# Min.   :22.00   Min.   : 1.00  
# 1st Qu.:25.00   1st Qu.:11.50  
# Median :40.00   Median :18.00  
# Mean   :38.93   Mean   :17.47  
# 3rd Qu.:48.50   3rd Qu.:22.00  
# Max.   :59.00   Max.   :37.00  

# Draw box plots side by side (one row, two columns)
par(mfrow=c(1,2))
boxplot(AgeWeeks$Age, main="Age", ylab="Age")
boxplot(AgeWeeks$Weeks, main="Weeks", ylab="Weeks")

###############################################################################
Age.Size = length(AgeWeeks$Age)
Age.Mean = mean(AgeWeeks$Age)
Age.StdDeviation = sd(AgeWeeks$Age)

Confidence.Level = 0.95
# Critical value z(alpha/2): qnorm() returns the quantile (about 1.96 at 95%),
# whereas pnorm() would return a probability.
Z.SemiAlpha = qnorm((1 - Confidence.Level)/2, lower.tail=FALSE)
AgeMean.IntervalEstimation = Z.SemiAlpha * Age.StdDeviation / sqrt(Age.Size)
AgeMean.Max = Age.Mean + AgeMean.IntervalEstimation
AgeMean.Min = Age.Mean - AgeMean.IntervalEstimation

print(sprintf("Age Mean=%.4f StdDeviation=%.4f", Age.Mean, Age.StdDeviation))
print(sprintf("%.4f >= Age.Mean >= %.4f", AgeMean.Max, AgeMean.Min))

# Result (Age.Size = 15 in the data file)
# Age Mean=38.9333 StdDeviation=13.3495
# 45.6890 >= Age.Mean >= 32.1777

###############################################################################
Weeks.Size = length(AgeWeeks$Weeks)
Weeks.Mean = mean(AgeWeeks$Weeks)
Weeks.StdDeviation = sd(AgeWeeks$Weeks)

Level.Significance = 0.01
# 95% interval for the mean weeks; qnorm() gives the critical value z(alpha/2)
Z.SemiAlpha = qnorm((1 - Confidence.Level)/2, lower.tail=FALSE)
WeeksMean.IntervalEstimation = Z.SemiAlpha * Weeks.StdDeviation / sqrt(Weeks.Size)
WeeksMean.Max = Weeks.Mean + WeeksMean.IntervalEstimation
WeeksMean.Min = Weeks.Mean - WeeksMean.IntervalEstimation

print(sprintf("Weeks Mean=%.4f StdDeviation=%.4f", Weeks.Mean, Weeks.StdDeviation))
print(sprintf("%.4f >= Weeks.Mean >= %.4f", WeeksMean.Max, WeeksMean.Min))

# Result (Weeks.Size = 15 in the data file)
# Weeks Mean=17.4667 StdDeviation=9.9848
# 22.5196 >= Weeks.Mean >= 12.4138

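# H0: μ <= 14.6
# H1: μ > 14.6

National.Mean = 14.6
# Right-tailed test statistic: (sample mean - hypothesized mean) / standard error
Z.Value = (Weeks.Mean - National.Mean) / (Weeks.StdDeviation / sqrt(Weeks.Size))
# p-value = P(Z >= Z.Value); reject H0 only when it is below the significance level
P.Value = pnorm(Z.Value, lower.tail=FALSE)

print(sprintf("H0: μ <= %.4f", National.Mean))
print(sprintf("H1: μ > %.4f", National.Mean))
if (P.Value < Level.Significance) {
  print(sprintf("Reject H0 (Right-Tailed): %.4f < %.4f (Alpha)", P.Value, Level.Significance))
} else {
  print(sprintf("Do NOT Reject H0 (Right-Tailed): %.4f >= %.4f (Alpha)", P.Value, Level.Significance))
}

# Result
# H0: μ <= 14.6000
# H1: μ > 14.6000
# Do NOT Reject H0 (Right-Tailed): 0.1331 >= 0.0100 (Alpha)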

###############################################################################
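# Deviations of each observation from its sample mean (not variances)
Age.Var = AgeWeeks$Age - Age.Mean
Weeks.Var = AgeWeeks$Weeks - Weeks.Mean

plot(Age.Var, Weeks.Var, main="Covariance Age-Weeks", xlab="Age deviation", ylab="Weeks deviation")

# Sample covariance: sum of deviation products divided by (n - 1);
# equivalent to cov(AgeWeeks$Age, AgeWeeks$Weeks)
Sum.Var = sum(Age.Var * Weeks.Var)
Covariance.AgeWeeks = Sum.Var / (Age.Size - 1)

print(sprintf("Covariance.Age-Weeks=%.4f -> Positive Linear Association", Covariance.AgeWeeks))

# Result
# Covariance.Age-Weeks=98.9619 -> Positive Linear Association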
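
Covariance depends on the units of both variables, so its size alone does not show how strong the relationship is. The sketch below rescales the reported covariance to Pearson's correlation coefficient and computes the usual t statistic for testing zero correlation. It uses only the summary statistics reported on this page; the sample size n = 15 is an assumption inferred from the reported output (the scenario text states 50), and all variable names are illustrative.

```r
# Summary statistics copied from the reported output on this page.
cov.xy <- 98.9619   # sample covariance of Age and Weeks
sd.x   <- 13.3495   # sample standard deviation of Age
sd.y   <- 9.9848    # sample standard deviation of Weeks
n      <- 15        # ASSUMPTION: sample size inferred from the reported output

# Pearson's r is the covariance rescaled by both standard deviations.
r <- cov.xy / (sd.x * sd.y)

# t statistic and two-sided p-value for H0: rho = 0, with n - 2 degrees of freedom.
t.stat  <- r * sqrt(n - 2) / sqrt(1 - r^2)
p.value <- 2 * pt(abs(t.stat), df = n - 2, lower.tail = FALSE)

print(sprintf("r=%.4f t=%.4f p=%.4f", r, t.stat, p.value))
```

Under these assumptions r comes out near 0.74, a fairly strong positive linear relationship. With the raw data loaded, cor(AgeWeeks$Age, AgeWeeks$Weeks) and cor.test(AgeWeeks$Age, AgeWeeks$Weeks) should give the same quantities directly.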

Answer

      Age            Weeks      
 Min.   :22.00   Min.   : 1.00  
 1st Qu.:25.00   1st Qu.:11.50  
 Median :40.00   Median :18.00  
 Mean   :38.93   Mean   :17.47  
 3rd Qu.:48.50   3rd Qu.:22.00  
 Max.   :59.00   Max.   :37.00  

 Age Mean=38.9333 StdDeviation=13.3495
 45.6890 >= Age.Mean >= 32.1777

 Weeks Mean=17.4667 StdDeviation=9.9848
 22.5196 >= Weeks.Mean >= 12.4138

 H0: μ <= 14.6000
 H1: μ > 14.6000
 Do NOT Reject H0 (Right-Tailed): 0.1331 >= 0.0100 (Alpha)

 Covariance.Age-Weeks=98.9619 -> Positive Linear Association
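
When the sample is small and the standard deviation is estimated from the data, a t-based interval and test are a common cross-check on z-based results. The following is a minimal sketch under stated assumptions, not the method used in the script: it works from the reported summary statistics, takes n = 15 as the sample size implied by the reported output (the scenario text states 50), and the helper names tInterval and tTestRight are hypothetical.

```r
# Two-sided t interval for a mean, from summary statistics.
tInterval <- function(xbar, s, n, conf.level = 0.95) {
  t.crit <- qt((1 - conf.level) / 2, df = n - 1, lower.tail = FALSE)
  margin <- t.crit * s / sqrt(n)
  c(lower = xbar - margin, upper = xbar + margin)
}

# Right-tailed t test of H0: mu <= mu0 against H1: mu > mu0.
tTestRight <- function(xbar, s, n, mu0, alpha = 0.01) {
  t.stat  <- (xbar - mu0) / (s / sqrt(n))
  p.value <- pt(t.stat, df = n - 1, lower.tail = FALSE)
  list(t = t.stat, p.value = p.value, reject = p.value < alpha)
}

n <- 15  # ASSUMPTION: sample size inferred from the reported output

age.ci     <- tInterval(38.9333, 13.3495, n)               # mean age, 95%
weeks.test <- tTestRight(17.4667, 9.9848, n, mu0 = 14.6)   # weeks vs. 14.6

print(age.ci)
print(weeks.test$reject)
```

The t interval is somewhat wider than the z interval, and the test still does not reject H0 at the 0.01 level. With the raw data loaded, t.test(AgeWeeks$Weeks, mu = 14.6, alternative = "greater") performs the same test.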
HypothesisTesting-Case2.R (3k), 李智, June 28, 2011, 12:47 PM
HypothesisTesting-Case2.dat (0k), 李智, June 28, 2011, 12:24 PM