
Evolution of Search: Information Access and Guided Summarization

By Dinesh Chand, Senior Consultant – Information Search & Access, Broadstreet Data Solutions


It all began with their inability to understand human-friendly, unstructured data. Yes, I am talking about computers. To use data effectively with a computer, the position of the data became more important than its content or context. Rows and columns became the norm of the day. Companies hoping to utilize the value of their data have put tremendous effort into organising it in a structured way. Yet despite the enormous amounts of money spent on converting data to a structured format, data in its original unstructured or semi-structured form is still useful. Many companies still hold non-structured data that they simply do not have the time or money to convert.

Then the question of looking at the data arises. Data by itself is meaningless unless it is presented as information, in a context humans can understand. Whether it is stored in structured or unstructured form, data has to be retrieved and shown with context to be of any value. Companies that have invested heavily in Business Intelligence (BI) or in custom-built applications still cannot open up their data as completely or as efficiently as they wish to. The turnaround time for a typical BI report is still measured in minutes, for instance.

Why does the traditional warehouse not fix the problem entirely? Firstly, it has difficulty getting access to unstructured and semi-structured data. Secondly, it has limited ways of relating that data to existing structured data. Add the limitations of the BI tool and it gets worse. Lastly, even if you manage to do all of it, the query powering the front-end reports that show the information you are looking for still takes a considerable amount of time. Why the discrepancy in query time? Think about it: if Google can return results from a dataset considerably larger than any company can own in less than a second, why does the typical BI tool take much longer?

Then the question of Data Integration comes to mind. If all I want to do is look at related data that happens to sit in separate silos, do we really have to go down the path of expensive Data Integration projects? Why do we try to hammer a square object into a triangular hole?

World Wide Web search might show us the way. Relational databases are arguably one of the best ways to store data. But this is not the 1980s. Disk space and memory are less expensive than consultants. We shouldn’t shy away from taking the “Google” approach and improving upon it.

Let us introduce Endeca to you. Endeca is an Information Access platform. It doesn’t replace your ETL or BI tool, but it lets you get more information out of your data, and faster. For those who have always found the BI front end too restrictive and custom-built SQL queries too complex when looking for information, Endeca is the solution.

Imagine Google-like free-form text search capabilities for your queries. The ability to associate structured data with unstructured and semi-structured data without the need to bring them into your database. The ability to integrate your information without having to integrate the underlying data, building toward an information warehouse. This is Endeca.

Picture a piece of data floating in a three-dimensional world. It is related to different elements: other data and attributes (dimensions). Every data element is linked to another data element or attribute in this world. The only difference is the level of the link. The key here is to understand that the “database” for this Information Access is not in relational format.
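To make the picture a little more concrete, here is a minimal sketch in Python of data elements linked through shared attributes instead of through table joins. It is only an illustration of the idea, not Endeca's storage format or API; the record ids, attribute names and the helper functions build_index and related are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records: each one is a loose set of attribute/value pairs,
# not a row in a fixed table. Records may carry different attributes.
records = {
    "r1": {"type": "invoice", "customer": "Acme", "region": "East"},
    "r2": {"type": "email", "customer": "Acme", "text": "late delivery complaint"},
    "r3": {"type": "invoice", "customer": "Globex", "region": "West"},
}

def build_index(records):
    """Map every (attribute, value) pair to the set of record ids holding it."""
    index = defaultdict(set)
    for rec_id, attrs in records.items():
        for attr, value in attrs.items():
            index[(attr, value)].add(rec_id)
    return index

def related(rec_id, records, index):
    """Return ids of other records that share at least one attribute value."""
    linked = set()
    for pair in records[rec_id].items():
        linked |= index[pair]
    linked.discard(rec_id)
    return linked

index = build_index(records)
print(sorted(related("r1", records, index)))
```

In this toy model the structured invoice r1 is linked to the unstructured email r2 simply because both mention the same customer; no schema had to be agreed on beforehand.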

Add a patented, “Google-like” technology that makes finding information a nifty process and a front end built on a highly customizable Java / .NET API, and the possibilities are endless. What is even better is guided summarization, which makes finding what you want even easier. This goes one step further. While free-form text search is still one of the options, what about questions that require information that does not fit properly in a report or a search box? Guided summarization shows you the way.
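One way to think about guided summarization is as a free-form search that also reports how the results break down along each dimension, so the user can see where to narrow next. The Python sketch below illustrates that idea under simple assumptions; it is not Endeca's implementation or API, and the sample documents, the dimension names and the functions search and summarize are made up for illustration.

```python
from collections import Counter

# Hypothetical mixed content: free text plus a few descriptive attributes.
documents = [
    {"text": "Late delivery complaint from customer", "source": "email", "region": "East", "year": 2008},
    {"text": "Invoice for delivery of spare parts", "source": "erp", "region": "East", "year": 2009},
    {"text": "Delivery schedule for new warehouse", "source": "wiki", "region": "West", "year": 2009},
    {"text": "Quarterly revenue summary", "source": "erp", "region": "West", "year": 2009},
]

def search(docs, term, filters=None):
    """Free-form text match, optionally narrowed by attribute filters."""
    filters = filters or {}
    term = term.lower()
    return [
        d for d in docs
        if term in d["text"].lower()
        and all(d.get(key) == value for key, value in filters.items())
    ]

def summarize(hits, dimensions):
    """Count how the hits are distributed over each requested dimension."""
    return {dim: Counter(d[dim] for d in hits if dim in d) for dim in dimensions}

hits = search(documents, "delivery")
facets = summarize(hits, ["source", "region", "year"])
for dim, counts in facets.items():
    print(dim, dict(counts))

# The summary suggests a refinement the user may not have thought to ask for.
narrowed = search(documents, "delivery", {"region": "East"})
print(len(narrowed), "hits in region East")
```

The point of the sketch is the second return value: alongside the hits, the user is shown the dimensions available for drilling down, which is what guides them toward questions that do not fit neatly in a search box.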

In addition to looking up data in its various formats, relating them to each other and searching them with free-form search, guided summarization lets you analyze dimensions of the data that you would not otherwise be aware of, easing your way to the information you need. For more information about Endeca, and to view a demo of how guided summarization works, please visit http://www.endeca.com/