Institutional-Repository, University of Moratuwa.  

Identification and characterization of crawlers through analysis of web logs

Algiriyage, N; Jayasena, VSD; Dias, G; Perera, A; Dayananda, K; Sharma, K 2014-06-18T16:57:07Z 2014-06-18T16:57:07Z 2014-06-18
dc.description.abstract Web crawlers are software programs that automatically traverse the hyperlink structure of the World Wide Web in order to locate and retrieve information. In addition to crawlers from search engines, we observed many other crawlers that may gather business intelligence or confidential information, or even execute attacks based on the gathered information, while camouflaging their identity. Therefore, it is important for website owners to know who has crawled their site and what those crawlers have done. In this study, we analyzed crawler patterns in web server logs, developed a methodology to identify crawlers, and classified them into three categories. To evaluate our methodology, we used seven test crawler scenarios. We found that approximately 53.25% of web crawler sessions were from 'known' crawlers and 34.16% exhibited suspicious behavior. en_US
dc.language.iso en en_US
dc.source.uri en_US
dc.title Identification and characterization of crawlers through analysis of web logs en_US
dc.type Conference-Abstract en_US
dc.identifier.faculty Engineering en_US
dc.identifier.department Department of Computer Science and Engineering en_US
dc.identifier.year 2013 en_US
dc.identifier.conference International Conference on Industrial and Information Systems [8th ms] - ICIIS 2013 en_US Peradeniya en_US
dc.identifier.pgnos pp. 150-155 en_US
