森川研究室（Morikawa Lab） - 研究内容／Research

研究内容/Research

我々の研究トピック（下記に限らず）：

AIセキュリティ
IoTセキュリティ
モバイルセキュリティ
Webセキュリティ
ネットワークセキュリティ
システムセキュリティ
オフェンシブセキュリティ
ユーザブルセキュリティ

詳しくは、下記の例①～⑤を参考にしてください。

Our research topics (not limited to the following):

AI Security
IoT Security
Mobile Security
Web Security
Network Security
System Security
Offensive Security
Usable Security

Please refer to examples ① to ⑤ below for more details.

①スケーラブル且つ高精度な特徴表現方法を用いたAndroidマルウェア検知に関する研究

　　スマートフォンの普及に伴い、新しいAndroidマルウェアが急激に増加してきている。アプリケーションストアからAndroidマルウェアを検出するのは、かなり手間がかかる作業である。これまでの研究では、機械学習を利用したAndroidマルウェアの自動検知に焦点が当てられているが、大規模なデータに対してはスケーラブル且つ高精度なソリューションがまだ欠けている。そこで、本研究では、Androidマルウェアの検知精度を向上させると同時に、解析処理にかかる計算時間も短縮するための特徴表現方法を提案した。それだけではなく、自動化のため、データ収集、静的特徴抽出、および機械学習アルゴリズムを組み合わせた検知システムも構築した。

① A scalable and accurate feature representation method for identifying malicious mobile applications

　　With the dramatic growth in smartphone usage, the number of new malicious mobile applications has increased rapidly. Identifying malicious applications in large-scale datasets is labor-intensive and time-consuming. Many previous studies have focused on automating malicious application detection using machine (or deep) learning. However, a scalable and accurate solution for large-scale datasets is still lacking. Therefore, in this study, we propose a novel feature representation approach that improves the accuracy of malicious application detection while reducing the computation time of the analysis. We implemented the proposed approach by combining data collection, static feature extraction, and machine learning algorithms. Using a large dataset collected from a mobile application store that includes 49,045 benign samples and 12,685 malicious samples, we demonstrate that the F-measure of our approach ranges from 0.968 to 0.995, with a false positive rate of 0.48% to 3.3%. We find that a multi-layer perceptron classifier performs best among the algorithms evaluated. Moreover, the running time of the analysis can be reduced to less than 18 min. Finally, we compare our method with two types of previous studies and report better performance in terms of scalability and accuracy.
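
　　The F-measure and false positive rate quoted above can both be derived from a confusion matrix. The following is a minimal sketch of that arithmetic, not the study's evaluation code: the function name `detection_metrics` is hypothetical, and the counts in the usage lines are made-up illustrative numbers rather than results from the dataset.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return (precision, recall, f_measure, false_positive_rate).

    tp / fn: malicious samples that were detected / missed.
    fp / tn: benign samples that were flagged / correctly passed.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    # False positive rate: share of benign samples wrongly flagged.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return precision, recall, f_measure, fpr


# Illustrative counts only (NOT taken from the study).
precision, recall, f_measure, fpr = detection_metrics(tp=90, fp=10, tn=990, fn=10)
print(f"F-measure={f_measure:.3f}, FPR={fpr:.2%}")
```

　　Reporting the false positive rate alongside the F-measure matters here because benign samples greatly outnumber malicious ones, so even a small rate can translate into many wrongly flagged applications.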

②偽レストランレビューの自動生成技術に関する研究

　　自動的に生成された偽のレストランレビューはオンラインレビューシステムに対する脅威である．最近の調査では，ユーザが実際のレストランのレビューの中に隠れている偽レビューを検出することが困難であることが明らかになっている．そこで，本研究では，トピックに沿ったレビューを生成することができるニューラル機械翻訳（NMT）の技術を利用することによって，より洗練された生成モデルを提案し，評価を行った．Amazon Mechanical Turkを使用して，英語を母国語として話す人を募集し，本研究の手法により生成された複数タイプのレビューを評価した．

② Generating Context-Specific Fake Restaurant Reviews

　　Automatically generated fake restaurant reviews are a threat to online review systems. Recent research has shown that users have difficulty detecting machine-generated fake reviews hidden among real restaurant reviews. The method used in that work (char-LSTM) has one drawback: it has difficulty staying in context, i.e., when it generates a review for a specific target entity, the resulting review may contain phrases that are unrelated to the target, thus increasing its detectability. In this work, we present and evaluate a more sophisticated technique based on neural machine translation (NMT) with which we can generate reviews that stay on-topic. We test multiple variants of our technique using native English speakers on Amazon Mechanical Turk. We demonstrate that reviews generated by the best variant have almost optimal undetectability (class-averaged F-score 47%).

③サンドボックスログを用いたマルウェア解析レポートの自動生成に関する研究

　　マルウェア解析者はマルウェア解析ツール（サンドボックス）によって生成された大量の動的分析のログを手動で調査しなくてはならない．一方で，アンチウイルスベンダーはインターネットを通じて，エキスパートにより作成されたマルウェアの解析レポートを公開している．サンドボックスが出力するログと，エキスパートが作成するマルウェア解析レポートはそれぞれ異なる原理で生成されており，直接的な対応関係はない．そこで，本研究は，自然言語処理や機械学習などの技術を適用することにより，サンドボックスによって生成されるマルウェア解析ログデータを入力とし，その入力データに対応したマルウェア解析レポートを自動生成することを狙いとする．

③ Automatically Generating Malware Analysis Reports Using Sandbox Logs

　　Analyzing a malware sample requires much more time and cost than creating it. To understand the behavior of a given malware sample, security analysts often rely on API call logs collected by dynamic malware analysis tools such as sandboxes. Because the log generated for a single malware sample can become extremely large, inspecting it is time-consuming. Meanwhile, antivirus vendors usually publish malware analysis reports (vendor reports) on their websites. These reports are the result of careful analysis by security experts. The problem is that, even though such analyzed examples exist, associating the vendor reports with the sandbox logs is difficult. This prevents security analysts from retrieving the useful information described in vendor reports. To address this issue, we developed a system called AMAR-Generator that automates the generation of malware analysis reports from sandbox logs by making use of existing vendor reports. Intended as a convenient assistant tool for security analysts, our system employs techniques including template matching, API behavior mapping, and a malicious behavior database to produce concise, human-readable reports that describe the malicious behaviors of malware programs. Through the performance evaluation, we first demonstrate that AMAR-Generator can generate human-readable reports that a security analyst can use as the first step of malware analysis. We also demonstrate that AMAR-Generator can identify the malicious behaviors conducted by malware from the sandbox logs; the detection rates are up to 96.74%, 100%, and 74.87% on the sandbox logs collected in 2013, 2014, and 2015, respectively. Finally, we show that it can detect malicious behaviors from unknown types of sandbox logs.
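
　　To make the idea of API behavior mapping combined with a report template more concrete, here is a minimal sketch. It is not AMAR-Generator's implementation: the mapping table, the sentence template, the `generate_report` function, and the assumed log format (one `API_NAME(arguments)` entry per line) are all hypothetical and chosen only for illustration.

```python
# Hypothetical mapping from Windows API names to plain-language behaviors.
API_BEHAVIOR_MAP = {
    "RegSetValueExA": "modifies a registry value",
    "CreateRemoteThread": "creates a thread in another process",
    "InternetOpenUrlA": "connects to a remote URL",
}

# Hypothetical sentence template for one line of the report.
REPORT_TEMPLATE = "The sample {behavior} (observed via {api}, {count} call(s))."


def generate_report(log_lines):
    """Turn sandbox log lines of the assumed form 'API_NAME(args)' into sentences."""
    counts = {}
    for line in log_lines:
        # The API name is everything before the first parenthesis.
        api = line.split("(", 1)[0].strip()
        if api in API_BEHAVIOR_MAP:  # ignore calls with no mapped behavior
            counts[api] = counts.get(api, 0) + 1
    # dicts keep insertion order, so behaviors appear in first-seen order.
    return [
        REPORT_TEMPLATE.format(behavior=API_BEHAVIOR_MAP[api], api=api, count=n)
        for api, n in counts.items()
    ]


# Made-up log entries for illustration.
log = [
    "RegSetValueExA(HKCU\\Software\\Run, updater)",
    "GetTickCount()",
    "InternetOpenUrlA(http://example.com/payload)",
    "RegSetValueExA(HKCU\\Software\\Run, helper)",
]
report = generate_report(log)
print("\n".join(report))
```

　　Aggregating repeated calls into a single sentence is one simple way to keep the output concise even when the raw log is very large.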

④評判情報に基づくモバイルアプリのセキュリティ対策に関する研究

　　スマートフォン等モバイル端末のアプリを配布するマーケットにおいて，アプリの評判情報を悪意で操作し，悪性アプリを多くのユーザにダウンロード・感染させる脅威が存在する．本研究は，アプリマーケットにユーザが投稿する大規模かつ不均一なレビュー・コメント情報に自然言語処理および機械学習をベースとした解析手法を適用し，偽の評判情報を高精度に検出することを狙いとする．偽の評判情報を検出することにより，関連する悪性な開発者とアプリを早期に除去することが可能となる．

④ Characterizing Promotional Attacks in Mobile App Store

　　Mobile app stores, such as Google Play, play a vital role in the ecosystem of mobile apps. When users look for an app of interest, they can obtain useful data from the app store to help them decide whether to install it. This data includes ratings, reviews, the number of installs, and the category of the app. Ratings and reviews are user-generated content (UGC) that affects the reputation of an app. Unfortunately, miscreants also exploit such channels to conduct promotional attacks (PAs) that lure victims into installing malicious apps. In this paper, we propose and develop a new system called PADetective to detect miscreants who are likely to be conducting promotional attacks. Using a dataset of 1,723 labeled samples, we demonstrate that the true positive rate of the detection model is 90%, with a false positive rate of 5.8%. We then applied PADetective to a large dataset to characterize the prevalence of PAs in the wild and found 289 K potential PA attackers who posted reviews to 21 K malicious apps.

⑤空間検索とフィルターを用いたブラックリストの自動生成技術に関する研究

　　Webアプリケーションにおいてはドライブバイダウンロードやフィッシング等，悪性サイトに誘導する媒介としてURLを利用する攻撃手法が存在する．URL Blacklistは上述の被害を食い止めるための有効な手段の一つである．しかしながら，URL Blacklist構築における未解決の課題として，対象となるデータの超大規模化と，つねに変化し続けるURLのタイムリーな発見が挙げられる．本研究では，既存のURL Blacklistを拡張しつつ，未知の悪性URLを効率的に抽出するシステム（AutoBLG）を提案する．高性能のクライアント型のハニーポットを利用した検証実験を通じて，AutoBLGはこれまで未知であった新たなドライブバイダウンロードURLを抽出したことが確認できた．

⑤ Automating URL Blacklist Generation with Similarity Search Approach

　　Modern web users may encounter a browser security threat called a drive-by-download attack when surfing the Internet. Drive-by-download attacks use exploit code to take control of the user's web browser. Many web users do not take such underlying threats into account when clicking URLs. A URL blacklist is one of the practical approaches to thwarting browser-targeted attacks. However, a URL blacklist cannot cope with previously unseen malicious URLs. Therefore, to make a URL blacklist effective, it is crucial to keep its URLs updated. Given these observations, we propose a framework called automatic blacklist generator (AutoBLG) that automates the collection of new malicious URLs by starting from a given existing URL blacklist. The primary mechanism of AutoBLG is to expand the search space of web pages while reducing the number of URLs to be analyzed by applying several pre-filters, such as similarity search, to accelerate the process of generating blacklists. AutoBLG consists of three primary components: URL expansion, URL filtration, and URL verification. Through extensive analysis using a high-performance web client honeypot, we demonstrate that AutoBLG can successfully discover new and previously unknown drive-by-download URLs from the vast web space.
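
　　The three components named above (URL expansion, URL filtration, and URL verification) can be pictured as a simple pipeline. The sketch below only illustrates that structure and is not AutoBLG's actual implementation: all function names are hypothetical, the `neighbors` lookup stands in for whatever source supplies candidate URLs, the trigram-based Jaccard similarity with a 0.5 threshold is an assumed stand-in for the similarity search, and the `is_malicious` callback stands in for the client honeypot.

```python
def trigrams(text):
    """Return the set of character trigrams of a string."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def expand(seed_urls, neighbors):
    """URL expansion: gather candidate URLs associated with each seed."""
    candidates = set()
    for url in seed_urls:
        candidates.update(neighbors.get(url, []))
    return candidates - set(seed_urls)


def filtrate(candidates, seed_urls, threshold=0.5):
    """URL filtration: keep candidates similar to some blacklisted URL.

    The threshold is an arbitrary illustrative value.
    """
    seed_grams = [trigrams(u) for u in seed_urls]
    return sorted(
        c for c in candidates
        if any(jaccard(trigrams(c), g) >= threshold for g in seed_grams)
    )


def verify(candidates, is_malicious):
    """URL verification: confirm each remaining candidate (honeypot stand-in)."""
    return [c for c in candidates if is_malicious(c)]


# Made-up URLs for illustration.
seeds = ["http://bad.example/exploit/a.html"]
neighbors = {
    seeds[0]: [
        "http://bad.example/exploit/b.html",
        "http://news.example/index.html",
    ]
}
candidates = expand(seeds, neighbors)
shortlist = filtrate(candidates, seeds)
new_entries = verify(shortlist, is_malicious=lambda url: "exploit" in url)
print(new_entries)
```

　　Placing the cheap similarity filter before the expensive verification step reflects the motivation stated above: expansion enlarges the search space, so the number of URLs reaching the honeypot has to be cut down first.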
