Title variants
Publication languages
Abstracts
Aim/purpose - Web scraping is a technique for automatically extracting data from websites. With the rise of online shopping, it makes it possible to collect information on the prices of goods sold by retailers such as supermarkets and internet shops. This study examines the possibility of using web-scraped data from one clothing store. It aims to compare known price index formulas applied to web-scraped data and to verify their sensitivity to the choice of data filter type.
Design/methodology/approach - The author uses price data scraped from one of the biggest online shops in Poland. The data were obtained as part of the eCPI (electronic Consumer Price Index) project conducted by the National Bank of Poland. Three types of products were selected for the analysis (female ballerinas, male shoes, and male oxfords), and their prices were compared over a period of more than one year. Six price indexes were calculated: the Jevons and Dutot indexes together with their chained and GEKS (an acronym formed from the names of its creators: Gini, Éltető, Köves, and Szulc) versions. Apart from the analysis conducted on the full data set, the author introduced filters to remove outliers.
Findings - Clothing and footwear are considered one of the most difficult groups of goods for measuring price change because of high product churn, which undermines the use of the traditional Jevons and Dutot indexes. However, chained and GEKS indexes can be used instead. Still, these indexes are fairly sensitive to large price changes. As observed for both product groups, the results provided by the GEKS and chained versions of the indexes differed, which could lead to the conclusion that, even though they yield promising results, they may be better suited to other COICOP (Classification of Individual Consumption by Purpose) groups.
Research implications/limitations - The findings showed that the use of filters did not significantly reduce the difference between price indexes based on the GEKS and chain formulas.
Originality/value/contribution - The use of web-scraped data is a fairly new topic in the literature. Research on the applicability of different price indexes provides useful insights for the future use of these data by statistical offices. (original abstract)
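The index formulas named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard textbook definitions: Jevons as the geometric mean of price relatives, Dutot as the ratio of average prices, a chained index as the product of consecutive bilateral links, and GEKS as the geometric mean of all indirect comparisons through each period in the window. The function names, the `prices` layout (one dictionary of item prices per period), and the toy numbers are hypothetical and are not taken from the study; the abstract does not describe the matching or filtering rules actually used, so matching on items present in both periods is an assumption.

```python
import math

# Hypothetical toy data: one dict of {item_id: price} per period.
# Item "a" disappears and item "c" appears, mimicking product churn.
prices = [
    {"a": 100.0, "b": 200.0},
    {"a": 110.0, "b": 200.0, "c": 50.0},
    {"b": 220.0, "c": 55.0},
]


def matched(prices, s, t):
    """Return the item ids that have a price in both period s and period t."""
    return sorted(prices[s].keys() & prices[t].keys())


def jevons(prices, s, t):
    """Bilateral Jevons index: geometric mean of matched price relatives."""
    items = matched(prices, s, t)
    if not items:
        raise ValueError(f"no matched items between periods {s} and {t}")
    log_sum = sum(math.log(prices[t][i] / prices[s][i]) for i in items)
    return math.exp(log_sum / len(items))


def dutot(prices, s, t):
    """Bilateral Dutot index: ratio of average prices over matched items."""
    items = matched(prices, s, t)
    if not items:
        raise ValueError(f"no matched items between periods {s} and {t}")
    return sum(prices[t][i] for i in items) / sum(prices[s][i] for i in items)


def chained(bilateral, prices, t):
    """Chained index from period 0 to t: product of consecutive links."""
    index = 1.0
    for k in range(1, t + 1):
        index *= bilateral(prices, k - 1, k)
    return index


def geks(bilateral, prices, t):
    """GEKS index from period 0 to t over the whole window of periods.

    Geometric mean, over every link period l, of the indirect comparison
    bilateral(0, l) * bilateral(l, t).
    """
    n = len(prices)
    log_sum = sum(
        math.log(bilateral(prices, 0, l) * bilateral(prices, l, t))
        for l in range(n)
    )
    return math.exp(log_sum / n)


if __name__ == "__main__":
    for name, formula in (("Jevons", jevons), ("Dutot", dutot)):
        print(name, "direct :", formula(prices, 0, 2))
        print(name, "chained:", chained(formula, prices, 2))
        print(name, "GEKS   :", geks(formula, prices, 2))
```

In this toy window the direct comparison between the first and last period rests on a single surviving item, while the chained and GEKS versions also draw on items that exist only in part of the window. This is the churn situation the abstract describes, and it is also why the chained and GEKS results need not agree.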
Year
Issue
Pages
251--269
Physical description
Authors
author
- University of Lodz, Poland
Bibliography
- Australian Bureau of Statistics [ABS]. (2018). Web scraping in the CPI Australian Bureau of Statistics. Retrieved from https://www.unece.org/fileadmin/DAM/stats/documents/ece/ces/ge.22/2018/Australia_-_poster.pdf
- Auer, J., & Boettcher, I. (2017). From price collection to price data analytics: How new large data sources require price statisticians to re-think their index compilation procedures. Experiences from web-scraped and scanner data. Paper presented at the Ottawa Group Meeting. Retrieved from https://www.ottawagroup.org/Ottawa/ottawagroup.nsf/4a256353001af3ed4b2562bb00121564/1ab31c25da944ff5ca25822c00757f87/$FILE/From price collection to price data analytics -Josef Auer, Ingolf Boettcher -Paper.pdf
- Białek, J., & Bobel, A. (2019). Comparison of price index methods for CPI measurement using scanner data. Paper presented at the 16th Meeting of the Ottawa Group on Price Indices, Rio de Janeiro, Brazil. Retrieved from https://eventos.fgv.br/sites/eventos.fgv.br/files/arquivos/u161/bialek_bobel_paper_2.pdf
- Bitner, T., & Stech, G. (2019). GUS: Big Data to nasz priorytet. Wywiad z Dominikiem Rozkrutem, prezesem GUS [CSO: Big Data is our priority. An interview with Dominik Rozkrut, president of Central Statistical Office in Poland]. Retrieved from https://www.computerworld.pl/wywiad/GUS-Big-Data-to-nasz-priorytet,412891.html
- ten Bosch, O. (n.d.). Uses of web scraping for official statistics ESTP course on big data sources - web, social media and text analytics. Retrieved from https://circabc.europa.eu/sd/a/5e250346-44a9-471b-87f1-5b5ddb59aa77/1_Big Data Sources part3-Day 1-A Use.pdf
- Cavallo, A. (2013). Online vs official price indexes: Measuring Argentina's inflation (Research Paper, No. 4975-12). Cambridge, MA: MIT Sloan. https://doi.org/10.2139/ssrn.1906704
- Cavallo, A. (2017, January). Are online and offline prices similar? Evidence from large multi-channel retailers. American Economic Review, 107(1), 283-303. https://doi.org/10.1257/aer.20160542
- Cavallo, A. (2018, March). Scraped data and sticky prices. The Review of Economics and Statistics, 100(1), 105-119. https://doi.org/10.1162/REST_a_00652
- Cavallo, A., & Rigobon, R. (2016, Spring). The billion prices project: Using online prices for measurement and research. Journal of Economic Perspectives, 30(2), 151-178. https://doi.org/10.1257/jep.30.2.151
- Chessa, A. G., & Griffioen, R. (2019). Comparing price indices of clothing and footwear for scanner data and web scraped data. Economie et Statistique, 509, 49-68. https://doi.org/10.24187/ecostat.2019.509.1984
- Chuanyang, F., & Lee Wen Hao, J. (2016). Experiences with the use of online prices in consumer price index. Singapore: Singapore Department of Statistics. Retrieved from https://www.singstat.gov.sg/-/media/files/publications/reference/newsletter/ssnsep2016.pdf
- Dutot, C. F. (1738). Reflexions politiques sur les finances et le commerce (tome 1). The Hague: Les Freres Vaillant et Nicolas Prevost.
- Eurostat. (2021). Internet purchases by individuals [Data base]. Retrieved from https://ec.europa.eu/eurostat/web/digital-economy-and-society/data/database
- International Labour Organization, International Monetary Fund, Organisation for Economic Cooperation and Development, Statistical Office of the European Communities, United Nations, The International Bank for Reconstruction and Development, The World Bank. (2004). Consumer Price Index Manual: Theory and practice. Retrieved from https://www.ilo.org/wcmsp5/groups/public/---dgreports/---stat/documents/presentation/wcms_331153.pdf
- Jevons, W. S. (1865, June). On the variation of prices and the value of the currency since 1782. Journal of the Statistical Society of London, 28, 294-320. Retrieved from https://archive.org/details/jstor-2338419/mode/2up
- Juszczak, A. (2021). Usage of scraped data in price dynamic measurement. Acta Universitatis Lodziensis. Folia Oeconomica, 1(352), 25-37. https://doi.org/10.18778/0208-6018.352.02
- Lunnemann, P., & Wintr, L. (2006). Are internet prices sticky? (ECB Working Paper, No. 645). Frankfurt am Main: European Central Bank. Retrieved from https://www.ecb.europa.eu/pub/pdf/scpwps/ecbwp645.pdf
- Macias, P., & Stelmasiak, D. (2018). Food inflation nowcasting with web scraped data (Working Paper, No. 302). Warsaw: NBP. Retrieved from https://www.nbp.pl/publikacje/materialy_i_studia/302_en.pdf
- Office for National Statistics [ONS]. (2017). Research indices using web scraped price data: August 2017 update. Retrieved June 20, 2020, from https://www.ons.gov.uk/economy/inflationandpriceindices/articles/researchindicesusingwebscrapedpricedata/august2017update
- Office for National Statistics [ONS]. (2020). Using statistical distributions to estimate weights for web-scraped price quotes in consumer price statistics. Retrieved March 11, 2021, from https://www.ons.gov.uk/economy/inflationandpriceindices/articles/usingstatisticaldistributionstoestimateweightsforwebscrapedpricequotesinconsumerpricestatistics/2020-09-01
- Persson, E. (2019). Evaluating tools and techniques for web scraping. Retrieved from https://www.diva-portal.org/smash/get/diva2:1415998/FULLTEXT01.pdf
- Polidoro, F., Giannini, R., Lo Conte, R., Mosca, S., & Rosetti, F. (2015). Web scraping techniques to collect data on consumer electronics and airfares for Italian HICP compilation. Statistical Journal of the IAOS, 31(2), 165-176. https://doi.org/10.3233/sji-150901
- Radzikowski, B., & Śmietanka, A. (2016). Online CASE CPI. Paper presented at the First International Conference on Advanced Research Methods and Analytics, Universitat Politecnica de València, València, Spain, July 6-7, 2016. https://doi.org/10.4995/CARMA2016.2016.3133
- Van Loon, K., & Roels, D. (2018). Integrating big data in the Belgian CPI. Paper presented at Meeting of the group of experts on Consumer Price Indices in Geneva, Switzerland 7-9 May. Brussels: StatBel Belgium in Figures. Retrieved from https://unece.org/fileadmin/DAM/stats/documents/ece/ces/ge.22/2018/Belgium.pdf
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.ekon-element-000171624796
