Liveblogging Enterprise 2.0 – Enterprise Search

Enterprise Search

* Moderator – Larry Cannell, Enterprise Technology, Ford IT
* Speaker – Aaron Brown, Program Director, IBM
* Speaker – Lee Phillips, Senior Director, Knowledge Discovery Solutions, FAST
* Speaker – Matt Eichner, Vice President of Strategic Development, Endeca
* Speaker – Seth Gottlieb, Principal, Content Here

Larry: This won’t be a pure demonstration – but I have asked the panelists to record some screencasts and show a bit about what their products do.

Quick demos from the vendors, followed by Seth and I leading a Q&A session.

Search itself has become critical to just about everything. In some reports, 90% of navigation is search. (I even sometimes use the browser’s find command to locate the search box if it isn’t obvious.)

New VW site – a competitor of my employer – but notice the prominence of search. (vw.com)

Anecdote – his wife never types a url in the address bar – she searches on the company name – that way, instead of one option, she gets many and is more likely to hit what she wants (try Ford as an example – www.ford.com versus Ford in a google box).

Intranet search versus Internet search – why is it so hard to get Intranet search right? Isn’t search just search?

What are the components?
– Crawl (index)
– Query (what am I looking for)
– Show (here are the results)
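
As a rough illustration of those three components, here is a minimal sketch of a toy inverted index. The documents, function names, and the AND-only query behavior are all assumptions made for the example, not anything shown by the panel or how any of the vendors' products work.

```python
from collections import defaultdict

# Hypothetical documents standing in for crawled intranet pages.
DOCS = {
    "doc1": "travel expense policy for employees",
    "doc2": "expense report template",
    "doc3": "cafeteria menu for this week",
}

def build_index(docs):
    """Crawl/index step: map each word to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def query(index, terms):
    """Query step: return ids of documents containing every term (AND)."""
    words = terms.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

def show(docs, doc_ids):
    """Show step: render the matching documents in a stable order."""
    return [f"{doc_id}: {docs[doc_id]}" for doc_id in sorted(doc_ids)]

index = build_index(DOCS)
for line in show(DOCS, query(index, "expense policy")):
    print(line)  # prints "doc1: travel expense policy for employees"
```

Real engines add ranking, stemming, and much more on top of this, which is part of why the "show" step is harder than it looks.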

What’s different in intranet search?
– Access controls (what can we show or not show)
– Crawling is insufficient for intranets – much of what you need isn’t in an html page or can’t be crawled
– No page rank to support query location – there often are not rich intranet links
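
To make the access control point concrete, here is a minimal sketch of filtering search hits against a per-document access list before showing them. The ACL structure, group names, and `trim_results` function are hypothetical, chosen only for illustration.

```python
# Hypothetical access control lists: document id -> groups allowed to read it.
ACL = {
    "doc1": {"all-staff"},
    "doc2": {"finance"},
    "doc3": {"all-staff", "finance"},
}

def trim_results(doc_ids, user_groups, acl):
    """Keep only hits the user may see; documents with no ACL entry are hidden."""
    groups = set(user_groups)
    return [d for d in doc_ids if acl.get(d, set()) & groups]

raw_hits = ["doc1", "doc2", "doc3", "doc4"]
print(trim_results(raw_hits, {"all-staff"}, ACL))  # prints ['doc1', 'doc3']
```

Hiding documents that have no ACL entry is a deliberately conservative default in this sketch; a real deployment would decide that policy explicitly.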

But all is not lost – there are some chances to improve intranet search
– We know the user – we have identity in a way internet search does not

—-

Aaron – IBM.

Two quick demos on key technologies – semantic search (going from simple keywords to real concepts)

OmniFind Enterprise Edition – bringing semantics to search (for example, returning phone numbers based on pattern recognition). In addition, users can tag urls – dogear, social search. Also matches bluepages.
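
The phone number example can be approximated with a simple regular expression. This is only a sketch of pattern recognition in general, assuming North American style numbers; it says nothing about how OmniFind actually does it, and the sample text is made up.

```python
import re

# Assumed pattern: 3-3-4 digit numbers such as 313-555-0100 or (313) 555-0100.
PHONE = re.compile(r"\(?\b\d{3}\)?[-. ]\d{3}[-. ]\d{4}\b")

def extract_phone_numbers(text):
    """Return every substring of text that looks like a phone number."""
    return PHONE.findall(text)

sample = "Call Pat at 313-555-0100 or the desk at (313) 555-0199."
print(extract_phone_numbers(sample))
# prints ['313-555-0100', '(313) 555-0199']
```

An indexer could store the extracted numbers as a separate field, so that a search for a person can return the number directly rather than just a page that mentions it.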

Also shows a rich semantic search example – parsing not just keywords but a sort of natural language index which breaks the indexed content down into concepts, not just word or pattern matches – in the law enforcement space.

Lee – FAST

FastIDEA – information discovery & everyday analytics

Helps users narrow results by surfacing clusters within them, and lets users save searches as topics. (Example: Vioxx and lawsuits – not all the other items about Vioxx.)

Saved topics can be private (no one else can see my topic) or public (let other users benefit from my topic definitions).

Notifications – rss, email, many options for display, timing, etc. Enabling users to create basically a portal of topics – information discovery not typical “search.”

This is “search beyond the box” in the sense that the box is no longer the heart of the problem.

FAST also has a personal search platform – desktop search (fastPSP).

Also an advanced version with more complex query filtering / narrowing, different handling of the results.

(Sorry – got interrupted right there and missed a bit of the end of Lee’s demo)
—-

Matt – Endeca

We think about the context of search – it isn’t just about the user finding, it is about discovering new information that enables you to make a decision.

We think it is the machine’s job not just to respond to the user’s queries, but to expose information about what underlies that query response – to expose different slices across the results. Could be summaries, pie charts, bar charts, maps, whatever.

Endeca’s guided navigation automatically surfaces dimensionality – showed the wine demo which I think is available on their site.
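
For readers who haven't seen guided navigation, the core idea can be sketched as counting how many results fall under each value of a field, then recomputing those counts after the user picks one. The wine records, field names, and functions below are invented for illustration and are not Endeca's data or API.

```python
from collections import Counter

# Hypothetical result set, loosely in the spirit of a wine catalog.
WINES = [
    {"name": "Wine A", "type": "red", "region": "Napa"},
    {"name": "Wine B", "type": "red", "region": "Bordeaux"},
    {"name": "Wine C", "type": "white", "region": "Napa"},
    {"name": "Wine D", "type": "red", "region": "Napa"},
]

def facet_counts(records, field):
    """Count how many records carry each value of the given field."""
    return Counter(r[field] for r in records if field in r)

def refine(records, field, value):
    """Narrow the result set to records matching the chosen facet value."""
    return [r for r in records if r.get(field) == value]

print(dict(facet_counts(WINES, "type")))   # prints {'red': 3, 'white': 1}
reds = refine(WINES, "type", "red")
print(dict(facet_counts(reds, "region")))  # prints {'Napa': 2, 'Bordeaux': 1}
```

Each refinement changes the counts on every other facet, which is what lets the interface show only the choices that still lead somewhere.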

[Full disclosure – Endeca’s an Optaros client, though what we’re doing for them is not directly related to their core search technology.]

As Weinberger was talking about yesterday, the metadata is critical to so much information today – those metadata can be used to enable the kind of information filtering Endeca is doing. In addition, long form documents can have metadata extracted based on contents, patterns, etc.

Larry – Search has come a long way from that basic search box.

Seth – I come at search from a different perspective, having spent most of the last 10 years working in content management – as a customer of CMS, a user of CMS, a systems integrator, and working for CMS vendors. I’ve also been on the board of a membership organization (CM Pros). My company, Content Here, is a vendor neutral analyst and consulting firm focused on content management.

Search has an odd relationship with folks used to managing structured data – information architects and the like – since in theory good search means you have to worry less about how the content is organized. But ultimately good search relies on good metadata and structured content.

[More full disclosure: Seth’s a friend and former colleague of mine from Optaros and Molecular].

Discussion:

Lee: Let’s talk about portals and search. Often there’s been a challenge with search and portals – brittle layout, architectural and usability issues with traditional portal implementations. We think there’s an interest in search-driven information portals, where each sub portlet is a result of query.

Matt – one big difference we’re seeing is that there are many more comparable experiences out on the public web which get better all the time – these can make internal information discovery systems seem even worse than they are. (Part of the consumerization of technology.)

Aaron – absolutely. This consumerization has to do with easier interfaces and also other things users experience as consumers first – del.icio.us, for example, or RSS feeds from somewhere like Google of everything matching your company name, or technorati, or blog search which leverages social input.

Q from audience: Faceted search is great – but in my experience people don’t go past the first page of results, and filter by simply doing a new search rather than using filters and such. In filtered search, do you expose tags upfront, allow users to choose tags, or expose no structured filtering at all and let people brute force keywords? Pure unstructured, semi-structured, or fully structured?

Aaron – I guess I’d say they aren’t so different or separate. You might have use cases which call for different mixes of the three – in some cases unstructured data exploration might lead to more structured investigation as a second step.

Matt – Buzzillions example – facets driving the structured entities but also facets which are tags provided by users. (So that’s both taxonomy and folksonomy in a single application.)

Q from audience: What’s it going to cost to do this? What’s the line between “I have to do this” and “it is still too cost-prohibitive”?

Lee – comes down to the question of the value of the decisions the users are trying to make on the basis of what they are finding.

Seth – lots of studies on ROI of search and intranets. A lot of the measurement is based on time to find things – what’s the drag on a knowledge worker finding irrelevant results. Nothing is overly compelling as a concrete case – too many of the ROI stories overstate the drag (four hours a day trying to find things I can’t find?). But it ultimately comes down to what the cost is of having people not find what they need. The technology is probably the least expensive piece of it – as opposed to the people’s time taken to garden the index and make improvements in response to use, checking query logs, etc. Those are costly activities in people time, not just technology.