PowerPOI
Project Leader: Prof. Chia-Hui Chang
Team members: Hsiu-Min Chuang, Ting-Yao Kao, Chung-Ting Cheng, Ya-Yun Huang, Guo-Bin Chang,
Kai-Chien Yang, Chien-Fu Ling, Yuan-Hao Lin, Hung-Wei Chang
Abstract
With the popularity of mobile devices and smartphones, we have witnessed rapid growth in mobile applications and services, especially in location-based services (LBS). According to a mobile marketing survey, maps/location searches are among the most utilized services on smartphones. Points of interest (POIs), such as stores, shops, gas stations, parking lots, and bus stops, are particularly important for maps/location searches. Existing map services such as Google Maps and Wikimapia are constructed manually, either professionally or through crowdsourcing. However, manual annotation is costly, which limits the coverage of current POI search services. Given the abundance of information on the Web, many business POIs can instead be extracted from it.
In this project, we focus on automatically constructing a POI database to enable business POI map searches. We propose the techniques required to construct such a database, including focused crawling, information extraction, and information retrieval. We first crawl Yellow Page websites to obtain vocabularies of business names. These vocabularies are then submitted as queries to search engines, and the sentences containing the business names in the returned search snippets are used to train a business-name recognition model. To extract POIs scattered across the Web, we propose a query-based crawler that finds address-bearing pages from which addresses and business names can be extracted. We crawled 1.25 million distinct POI pairs scattered across the Web and implemented a POI search service on Solr, the search platform built on Apache Lucene. The experimental results demonstrate that the proposed geographical information retrieval model outperforms Wikimapia and a commercial app called "What's the Number?"
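As a rough illustration of how a POI search service on Solr might be queried, the sketch below builds a request URL for Solr's standard select handler, combining a keyword match with Solr's `geofilt` spatial filter. The host, the core name `poi`, and the field names `name` and `location` are hypothetical; the project's actual schema and ranking model are not described here.

```python
from urllib.parse import urlencode


def build_poi_query(keyword, lat, lon, radius_km=1.0, rows=10,
                    base_url="http://localhost:8983/solr/poi/select"):
    """Build a Solr select URL for POIs matching `keyword` near (lat, lon).

    Assumptions (not from the project): a core named `poi`, a text field
    `name`, and a spatial field `location`.
    """
    # Escape double quotes so the keyword stays inside the phrase query.
    safe_keyword = keyword.replace('"', '\\"')
    params = {
        "q": f'name:"{safe_keyword}"',   # keyword match on the business name
        "fq": "{!geofilt}",              # keep results within `d` km of `pt`
        "sfield": "location",            # spatial field used by geofilt/geodist
        "pt": f"{lat},{lon}",            # query point as "lat,lon"
        "d": radius_km,                  # search radius in kilometres
        "sort": "geodist() asc",         # nearest POIs first
        "rows": rows,
        "wt": "json",
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Build (but do not send) a query for cafes near a sample coordinate.
    print(build_poi_query("cafe", 24.968, 121.195))
```

The function only constructs the URL, so it can be inspected or tested without a running Solr instance; sending it with any HTTP client would return the matching documents as JSON.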
Publication
H.-M. Chuang, C.-H. Chang, T.-Y. Kao, C.-T. Cheng, Y.-Y. Huang, and K.-P. Cheong, Enabling Maps/Location Searches on Mobile Devices: Constructing a POI Database via Focused Crawling and Information Extraction, International Journal of Geographical Information Science, Volume 30, Issue 7, pp 1405-1425, 2016.
C.-H. Chang, H.-M. Chuang, C.-Y. Huang, Y.-S. Su, S.-Y. Li. Enhancing POI Search on Maps via Online Address Extraction and Associated Information Extraction, Applied Intelligence, Volume 44, Issue 3, pp 539–556, 2015.
H.-M. Chuang and C.-H. Chang, Verification of POI and Location Pairs via Weakly Labeled Web Data. LocWeb (WWW workshop), Italy, May 18-22, 2015. (pdf)
H.-M. Chuang, C.-H. Chang, and T.-Y. Kao, Effective Web Crawling for Chinese Addresses and Associated Information, The 15th International Conference on Electronic Commerce and Web Technologies (ECWeb 2014), Munich, Germany, Sep. 1-5, 2014. (pdf)
C.-H. Chang, C.-Y. Huang, and Y.-Y. Su: On Chinese Postal Address and Associated Information Extraction, JSAI IOS: Special Session on Web Intelligence & Data Mining, June 13-15, 2012.
C.-H. Chang and S.-Y. Lee. MapMarker: Extraction of Postal Addresses And Associated Information for General Web Pages, IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technologies (WI-IAT 2010), Toronto, Canada. Sep. 1-3, 2010.
Related Technologies (training datasets available for download)
Address extraction: (Data for Chinese; Collection for English provided by Z. Yu)
Business-name recognition: (Data)
Address-POI name pairing: (Data, window size=100)
Address-POI name verification: (Data)
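To make the "window size=100" setting of the address-POI name pairing task more concrete, here is a minimal sketch that pairs each extracted address with the nearest candidate business name found within a fixed window around it. It assumes the window is measured in characters and that the nearest name wins; both are illustrative assumptions, and the function `pair_addresses_with_names` is hypothetical rather than the method used in the publications above.

```python
def pair_addresses_with_names(text, addresses, names, window=100):
    """Pair each address with the nearest business name within `window` characters.

    Assumption: the window is counted in characters between the two spans.
    Returns a list of (name, address) tuples; addresses with no name in
    range are skipped.
    """
    pairs = []
    for addr in addresses:
        a_start = text.find(addr)
        if a_start < 0:
            continue  # address string not present in this page text
        a_end = a_start + len(addr)
        best = None  # (gap, name) of the closest candidate so far
        for name in names:
            n_start = text.find(name)
            while n_start >= 0:
                n_end = n_start + len(name)
                if n_end <= a_start:
                    gap = a_start - n_end      # name precedes the address
                elif n_start >= a_end:
                    gap = n_start - a_end      # name follows the address
                else:
                    gap = 0                    # spans overlap
                if gap <= window and (best is None or gap < best[0]):
                    best = (gap, name)
                n_start = text.find(name, n_start + 1)
        if best is not None:
            pairs.append((best[1], addr))
    return pairs


if __name__ == "__main__":
    page = ("Welcome to Blue Cafe. Visit us at 12 Main St, Taoyuan. "
            + "x" * 200 + " Far Bakery")
    print(pair_addresses_with_names(page, ["12 Main St, Taoyuan"],
                                    ["Blue Cafe", "Far Bakery"]))
```

In the sample text, "Blue Cafe" sits a few characters before the address while "Far Bakery" is more than 200 characters away, so only the first is a pairing candidate at the default window.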
Acknowledgement
This project is partially sponsored by the Ministry of Science and Technology in Taiwan under grant MOST103-2221-E-008-094.