DAHLAN Nariman (ダハラン ナリマン)
   Affiliation   Ritsumeikan Asia Pacific University, College of Sustainability and Tourism
   Position   Associate Professor
Language   English
Date of publication   2024/04/10
Publication type   Academic paper
Peer review   Peer-reviewed
Title   Enhancing Effectiveness and Efficiency of Customers Reviews Data Collection Through Multithreaded Web Scraping Approach
Authorship   Sole author
Published in   Lecture Notes on Data Engineering and Communications Technologies, vol. 200, pp. 282–291, 2024.
Publication region   International
Publisher   Springer, Cham
Volume, issue, pages   200, pp. 282–291
Total pages   11
Abstract   We are developing a tool called MULARS (Multi-languages Reviews Scrapers) to collect hotel customers' reviews, with the aim of efficiently scraping multilingual online reviews. In our previous research, we made it possible to scrape multilingual data. However, scraping large-scale data requires a faster and more effective scraping system. Therefore, we are looking for ways to enhance the performance and effectiveness of the scraping process. In this research, we propose data scraping using multithreading. We implement a multithreaded scraping system and test its performance to demonstrate its improved speed and efficiency. This paper describes the proposed system and presents the results of an experiment on data collection performance. Finally, the paper discusses the improvement gained by adopting multithreading, the advantages of the tool, and possible future applications of the tool in different areas.
URL for researchmap   https://doi.org/10.1007/978-3-031-57853-3_24
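
The abstract above proposes multithreaded scraping but does not give implementation details. The following is only a minimal sketch of the general idea, not the MULARS implementation: it assumes page fetches are I/O-bound and independent, and uses Python's standard `concurrent.futures.ThreadPoolExecutor`. The function `fetch_reviews`, the simulated latency, and the page numbers are all hypothetical stand-ins; no real site is contacted.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_reviews(page: int) -> list[str]:
    """Hypothetical stand-in for one HTTP request plus parsing of a review page."""
    time.sleep(0.05)  # simulated network latency (assumption); a real scraper waits on I/O here
    return [f"page {page} review {i}" for i in range(3)]


def scrape_sequential(pages: list[int]) -> list[str]:
    """Fetch pages one after another; total time grows with the sum of all latencies."""
    reviews: list[str] = []
    for page in pages:
        reviews.extend(fetch_reviews(page))
    return reviews


def scrape_threaded(pages: list[int], max_workers: int = 8) -> list[str]:
    """Fetch pages concurrently; threads overlap their waiting time on I/O."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the same order as the input pages
        per_page = list(executor.map(fetch_reviews, pages))
    return [review for page_reviews in per_page for review in page_reviews]


if __name__ == "__main__":
    pages = list(range(1, 17))

    start = time.perf_counter()
    sequential = scrape_sequential(pages)
    t_sequential = time.perf_counter() - start

    start = time.perf_counter()
    threaded = scrape_threaded(pages)
    t_threaded = time.perf_counter() - start

    print(f"sequential: {len(sequential)} reviews in {t_sequential:.2f} s")
    print(f"threaded:   {len(threaded)} reviews in {t_threaded:.2f} s")
```

Threads suit this kind of workload because each worker spends most of its time waiting on the network, so the waits overlap. A real scraper would also need rate limiting and error handling, which this sketch omits.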