Web Scraping of Prices of Commodities Included in the Generation of Consumer Price Index (CPI) for the National Capital Region (NCR)
Conference
64th ISI World Statistics Congress - Ottawa, Canada
Format: CPS Paper
Keywords: cpi, marketbasket, webscraping
Session: CPS 15 - Finance and business statistics IV
Monday 17 July 4 p.m. - 5:25 p.m. (Canada/Eastern)
Abstract
Online stores are becoming a popular platform for business transactions, both in the Philippines and globally. To take advantage of this trend, the Philippine Statistics Authority (PSA) began exploring in 2019 the use of web scraping as an alternative method of collecting prices for commodities included in the computation of the Consumer Price Index (CPI) for the National Capital Region (NCR). Currently, the PSA collects commodity prices face-to-face from sample outlets or stores. In this paper, prices collected through the traditional face-to-face method are called offline prices, while web-scraped prices are called online prices. Prices of 514 commodities, which comprise about 71 percent of the commodities in the NCR market basket, are web scraped.
This study aims to determine whether the offline prices used to compute the CPI for NCR can be replaced by online prices or by a combination of online and offline prices (hybrid). Results show that the behavior of online prices is comparable with that of offline prices for selected commodities that are not highly volatile, such as clothing. However, online prices of agricultural commodities, which are highly volatile, do not follow the same pattern of volatility as offline prices. Moreover, the computation of the CPI is better with offline prices for some commodity groups, while other commodity groups can use hybrid prices.
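As an illustration only, the comparison of offline, online, and hybrid indices could be sketched as follows. The abstract does not state the index formula, so a weighted average of price relatives (a Laspeyres-type form) is assumed here. The function names, commodity names, weights, and prices are all hypothetical and are not taken from the study.

```python
def price_relatives(base, current):
    """Ratio of current to base price for each commodity priced in both periods."""
    return {c: current[c] / base[c] for c in base if c in current}


def weighted_index(relatives, weights):
    """Weighted average of price relatives, scaled so the base period equals 100.

    Weights are renormalized over the commodities that actually have a relative,
    which matters for the online source because it covers only part of the basket.
    """
    covered = [c for c in relatives if c in weights]
    total = sum(weights[c] for c in covered)
    return 100.0 * sum(weights[c] * relatives[c] for c in covered) / total


def hybrid_relatives(offline_rel, online_rel, online_items):
    """Use the online relative for commodities in online_items, else the offline one."""
    return {
        c: online_rel[c] if c in online_items and c in online_rel else r
        for c, r in offline_rel.items()
    }


# Hypothetical basket: weights and prices are made up for illustration.
weights = {"rice": 0.5, "shirt": 0.3, "soap": 0.2}
offline_base = {"rice": 40.0, "shirt": 200.0, "soap": 25.0}
offline_now = {"rice": 44.0, "shirt": 210.0, "soap": 25.0}
online_base = {"shirt": 250.0, "soap": 20.0}   # rice is not sold online here
online_now = {"shirt": 260.0, "soap": 22.0}

offline_rel = price_relatives(offline_base, offline_now)
online_rel = price_relatives(online_base, online_now)

offline_index = weighted_index(offline_rel, weights)
online_index = weighted_index(online_rel, weights)
# Assumed rule: only the less volatile item (the shirt) switches to online prices.
hybrid_index = weighted_index(
    hybrid_relatives(offline_rel, online_rel, {"shirt"}), weights
)

print(round(offline_index, 2), round(online_index, 2), round(hybrid_index, 2))
```

In this sketch each source uses its own base-period price, so differences in price level between online and offline outlets do not affect the index. Only differences in price movement do, which is the kind of comparison the abstract describes.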