Exploring the Tacit Intention of Respondents Who Decline to State a Position

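Scenario:

The questionnaire is designed to uncover the hidden intention a respondent holds at the moment of answering. Its items must not point too directly at the questionnaire's purpose; otherwise respondents may answer in line with that purpose, or deliberately against it, and the design loses its point. The items embody a theoretical model that explains why the questions are asked this way and what results the answers can yield. Put more directly, the questionnaire is used to test whether that theoretical model is correct. Without a theoretical model, a questionnaire only collects a pile of personal information and contributes nothing to building or discovering knowledge. The real difficulty is that constructing a theoretical model touches on many things and is not easy.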

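Suppose we have a set of questionnaire data that records each respondent's stated position on a certain idea (yes/no); many respondents decline to state one (NA). The theoretical model hypothesizes that the stated position is driven by five other factors, one of which is age. Apart from age, some respondents also leave some of these factors undisclosed.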

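Question:

Is there a method and procedure for inferring the hidden intention of respondents who decline to state a position, that is, for inferring a respondent's position (yes/no) from the influencing factors?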

Data:

- Records: 1077 in total, of which Attitude = yes: 670, Attitude = no: 350, Attitude = NA: 57
- Record counts per factor:

- Summary statistics per factor:

      RawData <- read.table("RawData.csv", header=TRUE, sep=",",
                            na.strings="NA", dec=".", strip.white=TRUE)
      summary(RawData)

- Regression on one factor: at the 5% significance level (95% confidence), Factor2 and Attitude are the only significant predictors of Factor4 (p = 0.017 and p = 0.002), although the model explains little of the variance (multiple R-squared = 0.027).

      Call:
      lm(formula = Factor4 ~ Age + Factor1 + Factor2 + Factor3 + Attitude, data = SRC)

      Residuals:
          Min      1Q  Median      3Q     Max
      -6.1798 -1.1013  0.0496  1.8800  4.6377

      Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
      (Intercept)  4.448575   0.427161  10.414  < 2e-16 ***
      Age         -0.002476   0.006456  -0.384  0.70140
      Factor1     -0.027297   0.110072  -0.248  0.80421
      Factor2      0.164183   0.068692   2.390  0.01710 *
      Factor3      0.005206   0.089847   0.058  0.95381
      Attitude     1.148066   0.363276   3.160  0.00164 **
      ---
      Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

      Residual standard error: 2.536 on 721 degrees of freedom
      Multiple R-squared:  0.02735,  Adjusted R-squared:  0.02061
      F-statistic: 4.055 on 5 and 721 DF,  p-value: 0.001229
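The record counts listed above can be tallied directly from the CSV. The following is a minimal sketch, not the procedure actually used here: the `tally` helper is hypothetical, the column names follow the regression output but the `ID` column and the column order are assumptions, and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter

def tally(csv_text, label="Attitude"):
    """Count the label values and the 'NA' entries found in each column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    labels = Counter()
    missing = Counter()
    for row in reader:
        labels[row[label]] += 1
        for name, value in row.items():
            if value == "NA":
                missing[name] += 1
    return labels, missing

# Made-up rows for illustration only; this is not the survey data.
sample = """ID,Attitude,Factor1,Factor2,Factor3,Factor4,Age
1,1,3,4,2,5,34
2,0,NA,2,3,4,51
3,NA,2,NA,4,3,28
4,1,5,3,NA,6,45
"""

labels, missing = tally(sample)
print(dict(labels))   # how many records state 1, 0, or NA
print(dict(missing))  # how many NA entries each column has
```

To run it against a real file, read the file's text first and pass it to `tally`.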

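Data processing procedure:

- Split the data: based on the Attitude column of the raw data, write the records with a stated position {1=Y, 0=N} and the records with { NA } to separate files. The Python programs are as follows: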

      # Data normalization: keep records with a stated Attitude and no NA fields
      # Source file
      fileName = "RawData.csv"
      # Normalized file
      solidFileName = "Data_NA_None.csv"

      with open(fileName, 'r') as csvFile, open(solidFileName, 'w') as solidCsvFile:
          isFirstLine = True
          for csvLine in csvFile:
              csvFields = csvLine.strip().split(',')
              if isFirstLine:
                  # Copy the header, dropping column 0
                  isFirstLine = False
                  solidCsvFile.write('"%s","%s","%s","%s","%s","%s"\n' % tuple(csvFields[1:7]))
                  continue
              # Skip records with NA in columns 1-5; column 6 is not checked
              if 'NA' in csvFields[1:6]:
                  continue
              # Recode Attitude: 1 -> Y, anything else -> N
              Attitude = 'Y' if csvFields[1] == '1' else 'N'
              solidCsvFile.write('"%s",%s,%s,%s,%s,%s\n' % ((Attitude,) + tuple(csvFields[2:7])))

      # Data normalization: extract records with Attitude = NA
      # and drop those whose other fields contain NA
      # Source file
      fileName = "RawData.csv"
      # Normalized file
      solidFileName = "Data_NA.csv"

      with open(fileName, 'r') as csvFile, open(solidFileName, 'w') as solidCsvFile:
          isFirstLine = True
          for csvLine in csvFile:
              csvFields = csvLine.strip().split(',')
              if isFirstLine:
                  # Copy the header, dropping column 0
                  isFirstLine = False
                  solidCsvFile.write('"%s","%s","%s","%s","%s","%s"\n' % tuple(csvFields[1:7]))
                  continue
              # Skip records with NA in columns 2-5; column 6 is not checked
              if 'NA' in csvFields[2:6]:
                  continue
              # Write only the records whose Attitude is NA
              if csvFields[1] == 'NA':
                  solidCsvFile.write('"NA",%s,%s,%s,%s,%s\n' % tuple(csvFields[2:7]))

- Convert the data format: convert the CSV files to Weka ARFF format.

      @attribute Attitude {Y,N,NA}
      @attribute FactorA numeric
      @attribute FactorB numeric
      @attribute FactorC numeric
      @attribute FactorD numeric
      @attribute Age numeric
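The conversion step can be sketched in Python as well. This is one possible way to do it, not necessarily how the files here were converted: the `csv_to_arff` helper and the relation name `attitude` are hypothetical, the first column is assumed to be the nominal class with the values {Y,N,NA} and all others numeric, and the sample rows are made up.

```python
import csv
import io

def csv_to_arff(csv_text, relation="attitude"):
    """Convert CSV text (header row first) into ARFF text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, records = rows[0], rows[1:]
    lines = ["@relation " + relation, ""]
    # Assumption: the first column is the nominal class, the rest are numeric.
    lines.append("@attribute %s {Y,N,NA}" % header[0])
    for name in header[1:]:
        lines.append("@attribute %s numeric" % name)
    lines += ["", "@data"]
    for record in records:
        lines.append(",".join(record))
    return "\n".join(lines) + "\n"

# Made-up rows in the layout written by the splitting scripts.
sample = (
    '"Attitude","FactorA","FactorB","FactorC","FactorD","Age"\n'
    '"Y",3,4,2,5,34\n'
    '"N",1,2,3,4,51\n'
)

arff = csv_to_arff(sample)
print(arff)
```

The `csv` module strips the quotes around the header names and the class values, so the ARFF attribute names and data rows come out unquoted.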

Data classification models (Random Tree):

- Multilayer Perceptron
- Naive Bayes
- Logistic
- K-nearest Neighbours
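To make the idea concrete, here is a minimal pure-Python sketch of one of the listed models, K-nearest neighbours: train on the records with a stated position, then predict a position for the records whose Attitude is NA. It is a stand-in for illustration, not the Weka implementation; the `knn_predict` helper, the value of `k`, and all the records are assumptions.

```python
from collections import Counter

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn_predict(train, query, k=3):
    """Return the majority label among the k training rows nearest to query."""
    nearest = sorted(train, key=lambda row: squared_distance(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up records: ([FactorA, FactorB, FactorC, FactorD, Age], Attitude).
train = [
    ([3, 4, 2, 5, 34], "Y"),
    ([4, 5, 3, 6, 30], "Y"),
    ([3, 5, 2, 6, 38], "Y"),
    ([1, 2, 4, 2, 60], "N"),
    ([2, 1, 5, 3, 55], "N"),
    ([1, 1, 4, 2, 62], "N"),
]

# Made-up records whose Attitude is NA; only the factors are known.
undeclared = [[3, 4, 3, 5, 33], [1, 2, 5, 2, 58]]

predictions = [knn_predict(train, query) for query in undeclared]
print(predictions)
```

The features are left unscaled here, so Age dominates the distance; on real data the factors would likely need to be normalized first, and any predicted position should be read as a guess whose quality depends on how well the factors actually explain Attitude.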