Kasun’s Blog

Kasun Indrasiri

  • Info

    Department of CSE
    University of Moratuwa,
    Sri Lanka


Googling, Searching and Information Retrieval

Posted by kasun04 on March 8, 2009


Like many computer-addicted people, I am also a “Google” junkie. I believe that most computer users can’t exist without Google, or at least they need a good alternative like ‘LiveSearch’ or ‘Yahoo’. I don’t want to go into a search engine comparison, just to show how dependent we are on distributed information providers.

In modern parlance, the word ‘Search’ is a very ambiguous one. An ordinary user is most likely to interpret it as ‘Google’ or some other search engine. So the proficient guys in the search engine field replace the word ‘search’ with ‘Information Retrieval (IR)’.

Information Retrieval (IR) is defined as finding materials (items/documents) of an unstructured nature (text) that satisfy an information need from within large collections (stored on computers).

It’s obvious that IR is not bound to web search alone, yet web search is the dominant member of Information Retrieval. These days Information Retrieval is fast becoming the dominant form of information access, overtaking traditional database-style search.

The definition is restricted to ‘unstructured data’, but IR systems are capable of processing ‘semi-structured’ data as well. For example, a book may be structured as ‘Title’, ‘Preface’, ‘Chapters’ etc. Information Retrieval also supports users in browsing or filtering document collections or processing them further (similar to arranging books on a shelf based on their topics). The classification process is more or less automated in IR systems.

IR systems can be classified into three prominent categories.

Web Search

In web search the IR system has to deal with billions of documents distributed among millions of computers and serve billions of users across the web. So performance is a major issue, and the system is focused on handling billions of documents and serving billions of users in the most optimal manner. The hardware and other resources are provided on a large scale, and managing them in an optimal way is another issue.

Personal IR

This is the counterpart of web search. Personal IR is focused on information retrieval on a single computer and serves a single user at a time. So it’s obvious that the resources are limited and the scale of the system is small. Yet we need to provide an easy-to-use and efficient IR system to the user. The most suitable example of a personal IR system is the ‘search’ utility provided by your OS. These IR systems are extremely lightweight (hit F3 to invoke :)).

Enterprise, Institutional and Domain-Specific Search

In these IR systems, retrieval might be provided for collections such as the ‘internal documents’ of a company, a collection of research articles etc. In this scenario the documents are stored locally, distributed across an internal distributed file system, and a handful of dedicated computers may be provided to the system.

Of those three categories, web search is the most widely used and has immense influence on typical computer users. Although all the IR categories are based on a similar kind of design (document feeding, processing, indexing, ranking etc.), detailed designs and implementations of web-search-specific IR are quite rare and hardly ever published.

However, companies like ‘Google’ claim (see Sergey’s speech) that they have published a great deal of information about the ‘Google’ design, but it’s really hard to find, apart from the research article that Sergey Brin and Lawrence Page published at Stanford University during the early days of Google (http://infolab.stanford.edu/~backrub/google.html).
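To make the shared design (feeding, processing, indexing, ranking) a bit more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the three documents are made up, the function names (tokenize, build_index, search) are hypothetical, and ranking by summed term frequency is a deliberately naive stand-in for what a real engine does.

```python
import string
from collections import Counter, defaultdict


def tokenize(text):
    """Processing step: split on whitespace, strip punctuation, lowercase."""
    tokens = (word.strip(string.punctuation).lower() for word in text.split())
    return [token for token in tokens if token]


def build_index(documents):
    """Indexing step: build an inverted index, term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def search(index, query):
    """Ranking step: order documents by the summed frequency of the query terms."""
    scores = Counter()
    for term in tokenize(query):
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    return [doc_id for doc_id, _ in scores.most_common()]


# Feeding step: a tiny, made-up document collection.
documents = {
    "doc1": "Web search deals with billions of documents.",
    "doc2": "Personal search runs on a single computer.",
    "doc3": "Enterprise search indexes internal documents; search is local.",
}
index = build_index(documents)
results = search(index, "search documents")
```

The inverted index is what lets the query touch only the documents that contain its terms instead of scanning the whole collection.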

Posted in Computer Science | Leave a Comment »

The Romantic Opera – Kasun Kalhara

Posted by kasun04 on February 21, 2009

Nowadays the Sri Lankan music stream is flooded with a lot of RnB and Hip-hop music, and many people merely want to listen to such styles. All the radio stations are backing such music and ruining the natural music taste of Sri Lankans. All those commercially oriented artists tend to release 5-6 tracks a month. This is simply about quantity: they just want to create a music track using computers and other equipment. Once they create the track, there are people who can write some strange words to tally with the track and the beat. Finally they’ll come up with a music video, and all the TV stations are eagerly waiting to play it. This is the story of Iraj, BnS, .. (you name it)

But .. they have forgotten the fact that there are some (I think it’s few :() people in Sri Lanka who still love instinctive slow music (mood songs), which is a blend of soothing voices and acoustic music.

Kasun Kalhara is one of the exceptions in the modern Sri Lankan music stream. As I see it, he is the best production of Dr. Premasiri Kemadasa (Kemadasa master) and without doubt the best voice of the modern generation. It’s not only his voice; he is a composer too.. but not a synthetic composer who depends solely on a computer and keyboard.

‘The Romantic Opera’ is his latest album, and in my opinion it is his best music release. The title track, ‘The Romantic Opera’, is inspired by opera-style music.

Opera is an art form in which singers and musicians perform a dramatic work (called an opera) which combines a text (called a libretto) and a musical score -wikipedia

This album is a combination of distinct music tracks. It has some ‘Latin’ music, some ‘Chinese or Japanese’ music, some ‘Indian’ music and a lot of ‘Western’ and ‘Sri Lankan’ music. Most of the songs are composed using acoustic music, and very few musical instruments were used.

You won’t be able to listen to this sort of music when you switch on the radio or TV.. You’ll never hear any of these songs when you are travelling on a bus… Soo… you need to find/buy it and listen to it… because that’s what most true Sri Lankan music lovers used to do…

‘The Romantic Opera’ is a must-listen album for a true Sri Lankan music fan.. Believe me .. It’s awesome.

We always say that ‘music’ is a universal thing… yes it is.. That’s why we love A.R. Rahman … that’s why we love Josh Groban.. and that’s why we should love Kasun Kalhara.

Posted in Uncategorized | 19 Comments »

Visitor Pattern

Posted by kasun04 on December 25, 2008

In Object Oriented Design, the Visitor pattern is one of the more obscure patterns, yet it is powerful enough to solve many complex OO scenarios. The Visitor pattern is not really easy to understand at first glance; you need to dig into it with practical examples. (I guess that’s true not just for the Visitor pattern but for all the other patterns too.)

I would like to give a clean and simple introduction to the Visitor pattern in this post. So let’s start with a simple example of the usage of the Visitor pattern.

The Visitor pattern is often useful when there is a fair number of related classes. One common example of such a scenario is ‘drawing different shapes’. In this case we have a set of related classes: Circle, Triangle, Rectangle etc.

Now, if we implement a draw() function for all these classes, i.e. circle::draw(), triangle::draw() etc., we may be drawing different shapes, but we are replicating a fair amount of code in all the classes (because the underlying methods that we use to draw a ‘shape’ are similar).

The Visitor pattern is quite capable of solving this kind of scenario. What we do here is move the draw methods of all the classes inherited from shape (Circle, Triangle, Rectangle) into one common class called ‘DrawVisitor’ or simply ‘Visitor’. Then in the ‘DrawVisitor’ we have a set of function overloads, selected by the type of object passed in (Circle, Triangle, Rectangle etc.), which implement the ‘drawing logic’. And it’s obvious that since we use a single visitor class to do the drawing, we can share whatever resources we need for drawing and reuse the drawing code.
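A rough sketch of such a ‘DrawVisitor’ in Python might look like the following. Python has no function overloading, so I am assuming functools.singledispatchmethod as a stand-in for the type-based overloads; the shape fields, the _emit helper and the list of ‘commands’ (playing the role of the shared drawing resource) are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from functools import singledispatchmethod


@dataclass
class Circle:
    radius: float


@dataclass
class Rectangle:
    width: float
    height: float


class DrawVisitor:
    """Holds all the drawing logic so the shape classes stay plain data."""

    def __init__(self):
        # Shared drawing resource: here just a list of emitted commands.
        self.commands = []

    def _emit(self, name, **params):
        # Common code reused by every type-specific draw method.
        self.commands.append((name, params))

    @singledispatchmethod
    def draw(self, shape):
        # Fallback for types that have no registered drawing logic.
        raise TypeError(f"no drawing logic for {type(shape).__name__}")

    @draw.register(Circle)
    def _(self, shape):
        self._emit("circle", radius=shape.radius)

    @draw.register(Rectangle)
    def _(self, shape):
        self._emit("rectangle", width=shape.width, height=shape.height)


visitor = DrawVisitor()
for shape in [Circle(1.0), Rectangle(2.0, 3.0)]:
    visitor.draw(shape)
```

Adding a Triangle then means adding one more registered method to the visitor; the existing shape classes are not touched.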

There are a lot of concerns about using the Visitor pattern, because applying it to a given scenario often makes things ambiguous. However, the applicability of the Visitor pattern is justified by James Cooper (author of a Java companion to the GoF), who gives a real-world scenario which is essentially solvable only with the Visitor pattern. His primary example:

Suppose you have a hierarchy of Employee-Senior Manager-Vendor etc. They all enjoy a normal vacation day accrual policy, but Senior Manager also participates in a “bonus” vacation day program. As a result, the interface of class SeniorManager (as well as ‘Vendor’ and ‘SecurityOfficer’) is different from that of class Employee. We cannot polymorphically traverse a Composite-like organization and compute a total of the organization’s remaining vacation days. This is how we use the Visitor pattern to solve this problem.

– Create a ‘VacationVisitor’ which handles all the vacation manipulations, implementing the different vacation manipulations in polymorphic visitor methods.

– Each visited instance (Employee, SeniorManager etc.) implements an ‘accept’ method which in turn calls the polymorphic visitor method.

– This method call procedure is termed ‘Double Dispatch’.

[Figure: double dispatch]

– The complete class design looks like this.

[Figure: Visitor class design]

“The Visitor becomes more useful when there are several classes with different interfaces and we want to encapsulate how we get data from these classes.”
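The vacation example can be sketched in Python roughly as follows. This is my own minimal version, not Cooper’s code: the field names (vacation_days, bonus_days), the method names visit_employee and visit_senior_manager, the sample staff, and the choice to make SeniorManager a subclass of Employee are all assumptions for illustration.

```python
class Employee:
    def __init__(self, name, vacation_days):
        self.name = name
        self.vacation_days = vacation_days

    def accept(self, visitor):
        # First dispatch chose this accept(); the second chooses the visitor method.
        visitor.visit_employee(self)


class SeniorManager(Employee):
    def __init__(self, name, vacation_days, bonus_days):
        super().__init__(name, vacation_days)
        # Extra interface that a plain Employee does not have.
        self.bonus_days = bonus_days

    def accept(self, visitor):
        visitor.visit_senior_manager(self)


class VacationVisitor:
    """Collects the vacation-day total across differently shaped classes."""

    def __init__(self):
        self.total_days = 0

    def visit_employee(self, employee):
        self.total_days += employee.vacation_days

    def visit_senior_manager(self, manager):
        # Only the visitor needs to know about the bonus program.
        self.total_days += manager.vacation_days + manager.bonus_days


staff = [Employee("Ann", 10), SeniorManager("Bob", 12, 5)]
visitor = VacationVisitor()
for person in staff:
    person.accept(visitor)
```

The loop never checks the type of each person: the object’s own accept() picks the matching visitor method, which is the double dispatch described above.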

There are several motivations to use the visitor pattern.

– Add functions to class libraries for which the source is unrevealed or simply unavailable

– Obtain data from a distinct collection of unrelated classes and use it to present the results of a global calculation to the user program

– Gather related operations into a single class rather than force you to change or derive classes to add these operations

– Collaborate with the Composite pattern

Posted in Computer Science | Leave a Comment »

emacs rocks!

Posted by kasun04 on September 18, 2008

emacs with ecb!

Posted in Uncategorized | 1 Comment »

Chrome, Firefox, Safari and Opera – CPU and Memory usage

Posted by kasun04 on September 7, 2008

I performed a CPU usage and memory usage test on Google Chrome, Firefox, Apple Safari and Opera. I simply ignored IE7 because the real battle comes when IE8 is officially released.

Here are my results (or proofs) as screenshots.

IDLE State

Same Tabs Opened (various sites were selected: Gmail, Apple, BBC, YouTube video streaming)

So then, we can analyze the resource usage.

Chrome vs Opera

This is another comparison between Opera and Chrome.

Bottom Line

You have to decide which browser to use.

Posted in Uncategorized | Leave a Comment »

Search Engine Technology-An In depth Analysis (I)

Posted by kasun04 on September 7, 2008

A search engine is an information retrieval system (IR Systems-See my prior post) designed to help find information stored on a computer system. Search engines minimize the time required to find information and the amount of information which must be consulted.

Search engine technology has evolved over the past decade or so, owing to the continuous and rapid growth of the World Wide Web. As information systems grew bigger and bigger, the amount of information stored in them became enormous. So, as an information retrieval system, the search engine has had to evolve in order to stay in sync with the growing stored data.

The most public, visible form of a search engine is a Web search engine, which searches for information on the World Wide Web. It is more or less the most important form of search engine, because the WWW is the world’s fastest growing, distributed and fault-tolerant information system. As an important business product, Enterprise Search Engines also play a critical role, as most companies need their own customizable information retrieval system that goes beyond a conventional web search engine.

Regardless of web or enterprise search, the technology used by both forms is quite similar. In fact, the core (or heart) of both engines looks almost the same. Here is a brief overview of the different forms of search engines.

Web Search Engines

A Web search engine is a search engine designed to search for information on the World Wide Web. Information may consist of web pages, images and other types of files (including pdf, doc etc.). Web Search products are typically free to use and the vendors generate revenue via advertising.

Challenges:
  • The number of documents to be indexed (tens of billions)
  • The number of users (hundreds of millions)
Major Vendors:
  • Google
  • Microsoft
  • Yahoo

Enterprise Search Engines

Enterprise search is the practice of identifying and enabling specific content across the enterprise to be indexed, searched, and displayed to authorized users. An Enterprise Search Engine facilitates the application of search technology to information within an organization. Most ES vendors tend to focus exclusively on Enterprise Search and do not also offer Web Search or Desktop Search products.

Challenges:
  • Index documents from a variety of sources such as: File systems, Intranets, Document Management Systems, E-mail, Databases.
  • Present a consolidated list of relevance ranked documents from these various sources
  • Many applications require the integration of structured data as part of the search criteria and when presenting results back to the users
  • Access controls are vital if users are to be restricted to documents to which they are granted access by the various document repositories within the enterprise
Major Vendors:
  • Microsoft (FAST Search & Transfer, Convera)
  • Autonomy
  • Dieselpoint
  • Endeca
  • Exalead
  • IBM
  • ORACLE
  • Google
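
Two of the enterprise search challenges above, presenting one consolidated ranked list from several sources and restricting users to documents they may access, can be sketched together in Python. Everything here is hypothetical: the Hit record, the group-based access model, the sample hits, and above all the assumption that relevance scores from different sources are directly comparable, which real systems cannot take for granted.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    doc_id: str
    source: str
    score: float               # relevance score, assumed comparable across sources
    allowed_groups: frozenset  # groups that may see this document


def consolidated_results(hits_by_source, user_groups, limit=10):
    """Merge hits from all sources, drop those the user may not see, rank by score."""
    visible = (
        hit
        for hits in hits_by_source.values()
        for hit in hits
        if hit.allowed_groups & user_groups  # user shares at least one group
    )
    return heapq.nlargest(limit, visible, key=lambda hit: hit.score)


# Made-up hits from two repositories.
hits_by_source = {
    "file_system": [
        Hit("budget.xls", "file_system", 0.9, frozenset({"finance"})),
        Hit("handbook.pdf", "file_system", 0.4, frozenset({"staff"})),
    ],
    "email": [
        Hit("msg-17", "email", 0.7, frozenset({"staff", "finance"})),
    ],
}
results = consolidated_results(hits_by_source, frozenset({"staff"}))
```

Filtering before ranking means a restricted document never takes a slot in the list, so the user still gets up to the full number of visible results.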

Desktop search

Desktop search is about searching the documents personal to a single user (i.e. a local hard drive and email folders). Most of the leading Desktop Search products are provided as free downloads, or for a small fee per user.

Challenges:
  • To offer the search on the user’s local machine without impacting the performance of other applications running locally.
Vendors:
  • Microsoft
  • Google & Copernic/Coveo.

Posted in Uncategorized | Leave a Comment »

Fastest and Elegant Browser on Earth – Opera 9.52

Posted by kasun04 on August 30, 2008

Opera 9.52 is considered to be the most powerful web browser, offering a number of utilitarian features without penalizing performance. As its counterpart, Mozilla Firefox, becomes more and more popular among web users, Opera is also becoming one of its major competitors. Although many people are used to boring and tiresome Firefox, Opera’s latest version comes with a bunch of awesome features.

Faster Rendering Engine – Presto Rendering Engine

Opera uses its very own rendering engine (layout engine), ‘Presto’. ‘Presto’ is much faster than the tedious ‘Gecko’ layout engine (the Mozilla project’s layout engine).

Elegant Look

You should use it and experience it. Here are some screen captures.

Download Manager with BitTorrent

Download files quicker with Opera. Opera starts downloading as soon as you’ve saved the file, so there’s no wasted time. Pause and resume downloads with the push of a button, and choose multiple files to download simultaneously without any fuss. Opera also features built-in support for the social file distribution protocol, BitTorrent, which makes it easy to download Torrent files without the need for a separate application.
I guess you know this .. Firefox download manager sucks!!

Opera Link and Speed Dial

Access your favorite Web sites everywhere! Opera Link syncs your bookmarks and Speed Dial between your computers and mobile phone. See the screenshots to see the use of Speed Dial.
(Firefox and IE copied this from Opera.)

Tabs and Sessions

The most comprehensive tabbed browsing was introduced by Opera. Opera is very light on your computer, so it’s possible to have many tabs open at once. Drag and drop tabs around to change their order, or get a preview by pausing over them with your mouse. You can even save a session of tabs for later, making it easy to pick up where you left off.

Zoom and Fit to width

Is it difficult to read the content on a page? Use Opera’s Zoom button in the lower right corner to resize Web pages. If the page is too wide for your screen, simply hit “Fit to Width” and Opera will resize the Web page so you avoid horizontal scrolling.

Built-in e-mail and newsfeeds

Opera’s built-in e-mail client features improved responsiveness. It also retrieves and searches your mail and news feeds even faster than before. In Opera, newsfeeds are stored on your computer so you can read them later, even when you’re offline. Easily tag articles so you know which ones you want to read, which ones are important, which ones are funny, etc.

Quick Find

Have you ever forgotten the page where you found that great article or that perfect gift? When using Opera the browser remembers not only the titles and addresses, but the actual content of the Web pages you visit.

Visit – http://www.opera.com/

Posted in Computer Science | Leave a Comment »

Best IDE for Linux – Emacs Code Browser

Posted by kasun04 on August 24, 2008

A C/C++ developer on the Linux platform is someone who always seeks an IDE that suits his work, but ends up using tedious ‘vim’ or unstable Eclipse CDT or NetBeans C/C++.

But for me the most powerful IDE for C/C++ development on the Linux platform is emacs. Using emacs alone gets you only so far, but emacs gains enormous power when it is used with the Emacs Code Browser (ECB).

This is almost an IDE, enriched with all the features found in modern IDEs on Windows like VS or IntelliJ. Have a look :-).

[Screenshot: emacs with ECB]

Posted in Uncategorized | 4 Comments »

Facebook is not just an addiction–it’s a disease

Posted by kasun04 on August 22, 2008

If you are a dummy PC user, there is nothing to complain about in using Facebook (and other such social networks) .. because you don’t have any other work to do with the web or a computer.

The problem arises if you are an IT professional and still wasting time on Facebook. Here is a nice article that tells the true story. (I’m pasting it here :-))

source: Collegiate Times, by Susan Mulla

These words will go down in history: “Susan Mulla has requested to add you as a friend, but before we can do that, you must confirm that you are, in fact, friends with Susan.” If you’re ever lucky enough to receive an email saying that phrase, you best accept my friendship. If you don’t, how else can we read each other’s profiles every five seconds, or write inside jokes on each other’s “walls”? I think we all know what I’m talking about here; it’s the Facebook <WWW.THEFACEBOOK.COM>, and it has changed the way we live as college students.

Some have said, “Facebook is the worst social disease to hit college campuses nationwide,” and I would have to agree with that statement. So let’s take a deeper look into this new fad that has taken so many of us captive.

Being a member of this cult following, I’ve realized that quite possibly the most crucial aspect of the Facebook is creating a flawless profile. Many of us are guilty of spending hours upon hours crafting our profile to ensure we come across as desirable to that special someone stumbling upon it. A flattering picture is the first step to the perfect profile. Next, your music interests have to be listed, but in all honesty it’s just an opportunity for people to pretend they are really eclectic with their music tastes. For example one might write: “I’m totally into ‘Death Cab for Cutie,’ ‘The Pussycat Dolls’ and ‘Yo-Yo Ma.’” You don’t have to try to impress people by listing every band you’ve ever heard of; it’s pretty obvious you’re faking.

Then there is always the request for friendship from that old high school “friend” who you actually never said a word to in high school. Maybe it was the person who laughed in your face when you asked them to prom, and now expects you to accept their friendship. Heck no. I say reject that “friendship” and show them what they missed out on. Then, there’s that whole “poking” deal. I will never forget the first time I was “poked.” I just sat there at my computer dumbfounded, in awe of the words I saw in front of me: “You have been poked, do you want to poke back?” I wasn’t sure if I should be flattered, offended or violated.

There’s also that whole stalking thing, too. Let’s be honest. We can all admit that Facebook has opened a door to the opportunities to stalk people. For example, last year there was a guy in one of my classes who caught my eye, and being too scared to talk to him in person, I Facebooked him. About five minutes later I find myself sitting in my room listening to the mp3s I got off his AIM profile and flipping through his Web shots. What was I doing? I didn’t even know the guy and already he was serenading me.

Now, I can’t help but ask myself what is going to come of us if we continue communicating via Facebook and Instant Messenger? Will we eventually meet people, start dating, get engaged, get married and have kids and get divorced, all in one chat session? I know it might seem like I’m sitting here pointing my finger at all of you, but I’m just as guilty of Facebook addiction and compulsive away message reading as the next screen name on your buddy list. To be completely honest, I haven’t yet met my Collegiate Times editor in person, because all we do is pass e-mail back and forth.

What killed normal human interaction? I think we have simply become lazy. It’s easier to put up an away message that will let people know if you’ve had a bad day or billboard every single detail of what your schedule entails for that day: “Off to class, then lunch, then the gym, then the bathroom, then washing my hands, then drying them.” Who needs to know all of that?

So here is what I say to all of you who sit at your computers and check away messages and stalk the guy you saw at Hokie Grill once: stop living like this. I’m convinced that we could fritter our whole lives sitting in front of the computer screen. What we need to do is ask ourselves this question: “What will happen when all of my buddies are away, or when the Internet connection cuts short?” We need to face it and realize that life doesn’t happen on a computer screen, that having 202 friends on Facebook doesn’t make you cool.

Posted in Uncategorized | 1 Comment »

Atif Aslam

Posted by kasun04 on August 22, 2008

Atif Aslam is one of the youngest yet most powerful voices in the Indian pop music field. He has a unique voice which is a blend of Indian and Pakistani voices. In fact he comes from Wazirabad, Pakistan, and eventually became one of the most wanted background singers in Bollywood.

He has released several albums under the TIPS and Fire Records labels, and they were quite successful, as were his various singles released in recent movies like Race, Kismat Konnection etc.

He is well-known in the subcontinent and with ex-pats for his songs ‘Aadat’, ‘Hum Kis Gali Jaa Rahe hain’, ‘Ehsaas’, ‘Doorie’, ‘Kuch Is Tarah’, “Woh lamhe”, “Pehli nazar mein” and “Tere Bin”. He was formerly the lead singer of Jal.

If there is something that really impresses many people around the world.. it is his voice.. It’s not Hindi .. it’s not Urdu.. just listen to this..

Atif Aslam | Pehli nazar mein | race

Posted in Music | Leave a Comment »