Secco, Alessandro (2014) How Good Is a Web Page? Data Collection for Experimental Evaluation of Link Analysis Algorithms. [Master's thesis (two-year degree)]
This thesis describes the motivations, techniques, and results of a large crawl designed to obtain a suitable snapshot of the web graph. This goal requires a carefully designed crawling system able to explore the whole .it domain. The resulting system is fast and stable: in a preliminary test it collected more than 308 million distinct web pages in 28 days, at an average rate of 204 pages per second, using a single high-end PC-class machine.