Analysis Mode: Option 1. Numerical Analysis. Input files must contain numerical data and can be .csv, .json, .log, .txt, or .jsonl files. At least two numerical fields are required. The analysis runs K-Means, DBSCAN, SVM, Decision Tree, and Random Forest, and then produces a PCA dimensionality-reduction plot of the data distribution.
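The requirement of at least two numerical fields can be checked before any model is run. The following is a minimal sketch for the .csv case only, using the Python standard library; the function names `numeric_columns` and `has_enough_numeric_fields` are hypothetical and are not taken from the tool itself.

```python
import csv
import io


def _is_float(text):
    """Return True if the string parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def numeric_columns(csv_text):
    """Return names of columns whose non-empty values all parse as floats.

    Assumption: the first row of the CSV is a header row.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    names = reader.fieldnames or []
    result = []
    for name in names:
        values = [row[name] for row in rows if row[name] not in (None, "")]
        if values and all(_is_float(v) for v in values):
            result.append(name)
    return result


def has_enough_numeric_fields(csv_text, minimum=2):
    """Check the 'at least two numerical fields' requirement."""
    return len(numeric_columns(csv_text)) >= minimum
```

A file that passes this check could then be handed to the clustering and classification steps; a file that fails could be rejected with a message pointing back to the two-field requirement.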
Analysis Mode: Option 2. English Keyword Analysis. Input files can be .pdf, .json, .log, .txt, or .csv files, and at least two documents are required. Stop words listed in the user-provided stop word data file are excluded, and the extracted keywords are then processed with K-Means, DBSCAN, SVM, Decision Tree, and Random Forest. Finally, a PCA dimensionality-reduction plot of the data distribution is produced.
Analysis Mode: Option 3. Chinese Keyword Analysis. Input files can be .pdf, .json, .log, .txt, or .csv files, and at least two documents are required. Stop words listed in the user-provided stop word data file are excluded, and the extracted keywords are then processed with K-Means, DBSCAN, SVM, Decision Tree, and Random Forest. Finally, a PCA dimensionality-reduction plot of the data distribution is produced. Chinese text is segmented into words with jieba.
Analysis Mode: Option 4. Precise Text Passage Search. Input files can be .pdf, .json, .log, .txt, or .csv files. The user supplies keywords A and B together with a preset character length. The tool finds these keywords in the document along with the text passage surrounding them, so that the needed information can be located precisely.
Please select the files to upload (multiple selection is allowed):
Please select the stop word data file to upload:
The stop word data file contains one stop word per line.
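As an illustration of the one-word-per-line format, the sketch below loads a stop word list and removes those words before counting keywords in an English document. It assumes case-insensitive matching and a simple letters-only tokenizer; neither is confirmed by the tool's description, and the names `parse_stop_words` and `keyword_counts` are hypothetical.

```python
import re
from collections import Counter


def parse_stop_words(file_text):
    """Read one stop word per line, ignoring blank lines.

    Assumption: matching is case-insensitive, so words are lowercased.
    """
    return {line.strip().lower() for line in file_text.splitlines() if line.strip()}


def keyword_counts(document, stop_words):
    """Count the words of an English document that are not stop words.

    Assumption: a token is a run of ASCII letters or apostrophes.
    """
    tokens = re.findall(r"[A-Za-z']+", document.lower())
    return Counter(token for token in tokens if token not in stop_words)


stop_words = parse_stop_words("the\nand\nof\n")
counts = keyword_counts("The cat and the hat of the cat", stop_words)
```

Per-document counts of this kind are one plausible input for the clustering and classification steps, although the tool may weight keywords differently.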
Please select the data file containing the search terms:
The search term data file uses the format KeywordA-KeywordB-CharacterLengthC (e.g., Word-Vocabulary-25). B may be blank (e.g., Word- -25). The search finds places where keywords A and B occur within C characters of each other and returns the text passage extending C characters before and after them.
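One possible reading of this format can be sketched as follows. The sketch assumes that keywords themselves contain no hyphen, that "within C characters" refers to the gap between the two keywords, and that the returned passage extends C characters beyond the outermost keyword on each side. The tool's actual matching rules may differ, and the function names are hypothetical.

```python
def parse_search_term(line):
    """Split 'A-B-C' into (keyword_a, keyword_b or None, length_c)."""
    a, b, c = line.strip().rsplit("-", 2)
    return a.strip(), (b.strip() or None), int(c)


def find_all(text, word):
    """Return the start index of every occurrence of word in text."""
    positions = []
    start = text.find(word)
    while start != -1:
        positions.append(start)
        start = text.find(word, start + 1)
    return positions


def search_passages(text, a, b, c):
    """Return the passages around keyword A (and B, when given)."""
    passages = []
    b_positions = find_all(text, b) if b else []
    for i in find_all(text, a):
        a_end = i + len(a)
        if b is None:
            # Only A is given: take C characters on each side of A.
            passages.append(text[max(0, i - c):a_end + c])
            continue
        for j in b_positions:
            b_end = j + len(b)
            # Number of characters between the two keywords (0 if they overlap).
            gap = max(j - a_end, i - b_end, 0)
            if gap <= c:
                lo = max(0, min(i, j) - c)
                hi = max(a_end, b_end) + c
                passages.append(text[lo:hi])
    return passages


a, b, c = parse_search_term("Word- -25")
passages = search_passages("Some text with a Word in it.", a, b, c)
```

Splitting from the right with `rsplit` keeps the character length intact even if keyword A happens to contain a hyphen, though a keyword B containing a hyphen would still be parsed incorrectly.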