64th ISI World Statistics Congress - Ottawa, Canada

Web Scraping for Price Statistics in the Philippines


Shushimita Pelayo



Format: CPS Abstract

Keywords: cpi, rvest, selenium, webscraping

Session: CPS 42 - Finance and business statistics III

Tuesday 18 July 8:30 a.m. - 9:40 a.m. (Canada/Eastern)


Official price statistics in the Philippines are sourced mainly from regular surveys and censuses, which entail high costs. As businesses move onto digital platforms, alternatives to these traditional data sources have become more available. One such alternative is web scraping, the automated collection of information from the web. Because commerce increasingly takes place online, web scraping offers a way to increase the frequency of price data collection while reducing its cost relative to price surveys. This paper computes an online Consumer Price Index (CPI) for the National Capital Region (NCR), specifically for Divisions 1 and 2 of the Philippine Classification of Individual Consumption According to Purpose (PCOICOP), and compares it with the official CPI for NCR calculated by the Philippine Statistics Authority (PSA). In addition to the official CPI methodology, the study introduces a hybrid approach for computing the online CPI. Finally, the paper presents the results of the year-round official run of the developed web scraping programs and provides recommendations for future web scraping projects in the Philippines.