INTELLIGENT BROWSER AGENT

This dissertation is submitted to the Department of Electronic & Telecommunication Engineering, University of Moratuwa, in partial fulfillment of the requirements for the Degree of M.Sc. in Telecommunications

Department of Electronic & Telecommunication Engineering
University of Moratuwa

Supervisor: Dr. Ajith Pasqual

Anjana Ratnayake
Index number: 06/8369

2009

Abstract

In today's information technology era, IT infrastructure and solutions provide a solid base for carrying out our day-to-day work. Yet computers themselves are unable to stand on their own with complete independence. Information technology has always been a combination of intelligent human beings and computationally efficient machines.

The objective of the research is to develop an automated internet browser module which assists the user in searching for content on the internet. In that sense, the research attempts to pass on as much human intelligence as possible to the software, such that the actions of the agent resemble those of a human being. The independent nature of the agent is a focus of the design, and the agent's actions and behaviors are made not to interfere with end-user activities.

The research begins with a literature survey on aspects of Artificial Intelligence (AI) and its evolution. Selected AI theories are then used in designing the prototype of the agent. The prototype focuses on key findings of the literature survey and on key AI areas such as modeling the environment, building domain knowledge, searching and sorting, agent learning, handling uncertainty, and adaptability. Areas for improvement in the initial version were identified and implemented as an extended version of the prototype.

The agent monitors the end user's actions and identifies the web content the user browses. It then decides on the user's possible areas of interest, tries to predict search terms, and searches for targeted sites on the internet.
If the user has opted for the agent to load the predicted sites, the agent opens them in separate browser windows. The agent is designed to resemble a typical user: it monitors the user's traffic levels and patterns and takes action only when doing so will not interfere with the user's actions.

Observations indicate that the end-user experience improves in various aspects. The agent uses only idle bandwidth, so users will not perceive any traffic overload. The agent also predicts the sites most relevant to the content currently being browsed, and thus targets the immediate requirements of the user. The agent depends heavily on how relevant a web site's meta tags are to its actual contents. Thus, if it encounters a site with poorly formatted meta tags, the predicted sites may fail to match the user's actual requirements.

Observations of TCP/IP traffic traces show that, when the agent is active, the search traffic it originates is below 20 packets per second, whereas user-originated traffic is in the range of 150 to 175 packets per second. Due to the back-end packet tracing and analysis, the CPU load is around 45% with small +/- variations.

DECLARATION

I certify that this dissertation does not incorporate without acknowledgement any material previously submitted for a degree in any University, and to the best of my knowledge and belief it does not contain any material previously published, written or orally communicated by another person or myself except where due reference is made in the text.

I also hereby give consent for my dissertation, if accepted, to be made available for photocopying and for inter-library loans, and for the title and summary to be made available to outside organizations.

Signature of the Candidate
Date:

To the best of my knowledge, the above particulars are correct.

Supervisor

ACKNOWLEDGMENTS

I would like to thank Dr. Ajith Pasqual for the valuable supervision extended towards my dissertation. The guidance and direction given to me at the project proposal and progress review stages helped me to carry out the dissertation in the most appropriate direction.

I would also like to thank Mr. Kithsiri Samarasinghe, the Head of the Electronic and Telecommunications department, for the valuable support and guidance extended.

I would also like to thank my immediate superior and my colleagues at the office for the insight provided in carrying out my dissertation.

Finally, I would like to thank my colleagues in the M.Sc. batch who extended their support and advice at various stages of the dissertation.

TABLE OF CONTENTS

DECLARATION
ABSTRACT
ACKNOWLEDGMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
INTRODUCTION
LITERATURE SURVEY
  2.1 Study on existing products
  2.2 Overview of literature survey
  2.3 Basic Agent Designs
  2.4 The Environment
  2.5 Domain Knowledge
  2.6 Searching in Agents
  2.7 Learning in Agents
  2.8 Designing for Uncertainty
  2.9 Adaptation of finding of literature survey through to the prototype
ARCHITECTURAL DESIGN
  3.1 Proposed Architecture diagram for prototype
  3.2 Design Objectives for the prototype
THE PROTOTYPE
  4.1 Introduction to the prototype
  4.2 Design Objectives of the prototype
  4.3 Proposed Design for the prototype
  4.4 Development Stage - 1 (packet capturing and filtering)
  4.5 Development Stage - 2 (site predictor)
  4.6 Development Stage - 3 (site loader)
  4.7 Resource Monitoring and Controlling
  4.8 Proposed Enhancements to the prototype
  4.9 Improvements to the prototype
    4.9.1 Adaptive traffic monitoring and controlling
    4.9.2 Enhancements to site predicting logics
RESULTS AND DISCUSSION
  5.1 Initial Research objectives
  5.2 Applicability of findings of the literature survey on the prototype
  5.3 Research Findings and Comparisons
  5.4 Statistical Observations
CONCLUSIONS AND FUTURE WORK
  6.1 Observation on the improvements in end user experience
  6.2 Proposed extensions of the current research
BIBLIOGRAPHICAL REFERENCES

LIST OF FIGURES

Figure 2.2.a - Basic components of the typical intelligent agent
Figure 2.2.b - Actuators
Figure 2.3.a - Simple reflex agents
Figure 2.3.b - Model-based reflex agents
Figure 2.3.c - Goal-based agents
Figure 2.3.d - Utility-based agents
Figure 2.3.e - Learning Agent
Figure 2.5.a - Semantic Nets
Figure 2.7.a - Reinforcement learning
Figure 2.7.b - Neural Network based learning
Figure 3.1.a - Proposed Architecture diagram for prototype
Figure 4.3.a - Architecture diagram
Figure 4.4.a - Logical architecture (Development Stage - 1)
Figure 4.4.b - Coding level architecture (Development Stage - 1)
Figure 4.5.a - Logical architecture (Development Stage - 2)
Figure 4.5.b - Coding level architecture (Development Stage - 2)
Figure 4.6.a - Logical and coding level architecture (Development Stage - 3)
Figure 4.7.a - Module status monitoring
Figure 5.4.a - HTTP request statistics (agent inactive)
Figure 5.4.b - HTTP request statistics (agent active)
Figure 5.4.c - Standalone application is running without any modules active
Figure 5.4.d - CPU load statistics - Packet tracer active
Figure 5.4.e - CPU load statistics - Site predictor active
Figure 5.4.f - CPU load statistics - Site loader active
Figure 5.4.g - CPU load statistics - All modules active

LIST OF TABLES

Table 5.2.1 - Basic components of an intelligent agent