International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE)
A monthly peer-reviewed and refereed journal
ISSN Online 2278-1021 | ISSN Print 2319-5940 | Since 2012
Volume 7, Issue 3, March 2018

Implementation of Web Scraper Bot: Web Harvesting

NandanAS, Rahul S Niranjan, PB Rahul Choudhary, Karthik Srinivas, Nilesh Kumar Singh, Kavya P Hathwar

DOI: 10.17148/IJARCCE.2018.7362

Abstract: Web harvesting, also called web scraping or web data extraction, refers to methods of collecting information from across the internet. It is essentially a form of data mining: programs are written to mine the data and convert it into a meaningful, useful structure. In this paper we demonstrate a program written to harvest web data from a particular website and to output it in different file formats. The university results of a class of students are scraped from the web, stored, and used in calculations. The code can be reused any number of times and may be altered to suit the intended application. Creating a customized score sheet of all students in a college or university by hand is a tedious task; in this paper a web scraper bot is employed to do the same within minutes.
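
The workflow described in the abstract (extract result rows, compute scores, write them out in a file format) can be illustrated with a minimal Python sketch. This is not the authors' program: the HTML fragment, the column layout (ID, name, then marks), and the names `SAMPLE_HTML`, `parse_results`, and `to_csv` are all assumptions made for illustration, and the regular expressions only suit simple, well-formed tables like the sample.

```python
import csv
import io
import re

# Hypothetical results-page fragment; the real site's markup is not given
# in the abstract, so this layout (ID, name, marks...) is an assumption.
SAMPLE_HTML = """
<table>
<tr><td>ID001</td><td>Student A</td><td>72</td><td>65</td><td>80</td></tr>
<tr><td>ID002</td><td>Student B</td><td>58</td><td>49</td><td>91</td></tr>
</table>
"""

# Regexes for one table row and for the cells inside a row.
ROW_RE = re.compile(r"<tr>(.*?)</tr>", re.S)
CELL_RE = re.compile(r"<td>(.*?)</td>", re.S)


def parse_results(html):
    """Extract one record per table row and compute total and average marks."""
    records = []
    for row in ROW_RE.findall(html):
        cells = [cell.strip() for cell in CELL_RE.findall(row)]
        if len(cells) < 3:
            continue  # skip rows without at least an ID, a name, and one mark
        student_id, name, *raw_marks = cells
        marks = [int(m) for m in raw_marks]
        total = sum(marks)
        records.append({
            "id": student_id,
            "name": name,
            "total": total,
            "average": round(total / len(marks), 2),
        })
    return records


def to_csv(records):
    """Serialize the records as CSV text (one possible output format)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "name", "total", "average"])
    for r in records:
        writer.writerow([r["id"], r["name"], r["total"], r["average"]])
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(parse_results(SAMPLE_HTML)))
```

In a real scraper the HTML would be fetched from the results site rather than embedded as a string, and a proper HTML parser is usually more robust than regular expressions once the markup becomes irregular.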



Keywords: Web harvesting, web scraping, data mining, customized score sheet, web scraper bot, regular expression (regex).

How to Cite:

[1] NandanAS, Rahul S Niranjan, PB Rahul Choudhary, Karthik Srinivas, Nilesh Kumar Singh, Kavya P Hathwar, “Implementation of Web Scraper Bot: Web Harvesting,” International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE), DOI: 10.17148/IJARCCE.2018.7362