Last week I was invited to an “Expert meeting E-Discovery”. I’ve been in the search business for many years and I regularly encounter the concept and practice of “E-discovery” as well as “Enterprise search” (and E-commerce search, Search Based Applications, etc.).
So I decided to get some information about what people think about the difference between Enterprise search and E-Discovery.
Definition of E-Discovery (Wikipedia):
Electronic discovery (also e-discovery or ediscovery) refers to discovery in litigation or government investigations which deals with the exchange of information in electronic format (often referred to as electronically stored information or ESI). These data are subject to local rules and agreed-upon processes, and are often reviewed for privilege and relevance before being turned over to opposing counsel.
Definition of Enterprise search (Wikipedia):
Enterprise search is the practice of making content from multiple enterprise-type sources, such as databases and intranets, searchable to a defined audience.
When you look at the definitions, the difference is in the “goal”. E-Discovery deals with legal matters and aims to gather evidence; Enterprise search serves a “general purpose” and aims to gather answers or information to be used in some business process.
But one can also see the similarities. Both deal with digital information, multiple sources and a defined audience.
What could be seen as different is that, according to these definitions, E-Discovery does not talk about a technical solution that indexes all (possibly relevant) information and makes it searchable. Enterprise search is much closer to a technical solution.
So… not many differences there, but I am beginning to have a hunch about why people could see them as different. My quest continues.
I found two articles that are pretty clear about the differences:
- “Enterprise Search vs. E-Discovery Search: Same or Different?”, and
- “3 Reasons Enterprise Search is not eDiscovery”
I think that the differences they mention come from a conceptual view of E-Discovery vs. Enterprise search, not from a technical (solutions) point of view (and even on the conceptual level they are wrong). I also think that the authors of the articles compare the likes of the Google Search Appliance to specialized E-Discovery tools like ZyLab. They gloss over the fact that there are a lot more solutions out there that do “Enterprise search” and are far more sophisticated than the Google Search Appliance.
Below I will get into the differences mentioned in those articles from a technical or solution point of view.
- Business objective is a key consideration
“Recall vs. Precision” (getting all the relevant information vs. getting the most relevant information)
It is true that a typical Enterprise search implementation will focus on precision. To answer common queries efficiently and to speed up information-driven processes in a company, precision is important.
This does not mean that the products used for Enterprise search cannot deliver all relevant information for a query. HPE IDOL as well as Solr can retrieve all relevant information fast.
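To make the two terms concrete, here is a minimal sketch in Python. The document ids and result lists are made up for illustration; the only thing taken as given is the usual definition: precision is the share of retrieved documents that are relevant, recall is the share of relevant documents that were retrieved.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ids: four documents are truly relevant to the query.
relevant = {"d1", "d2", "d3", "d4"}

# Enterprise-search style: a short, clean top list.
top_hits = ["d1", "d2"]

# E-Discovery style: a wide net that catches everything relevant.
wide_net = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]

print(precision_recall(top_hits, relevant))  # (1.0, 0.5)
print(precision_recall(wide_net, relevant))  # (0.5, 1.0)
```

The same engine can return either list; which one you want is a matter of how the query and the result handling are configured.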
- Number of search queries matter
“Simple vs. complex queries”
Here a couple of keyword examples are given to illustrate how people use Enterprise search. I’ve been working with companies (intelligence) that use Enterprise search solutions (like HPE IDOL/Autonomy) with far more complex queries to get all possibly relevant information back.
The complex queries that are illustrated can be handled by Solr easily.
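As an illustration of what such a query could look like, here is a small Python sketch that builds a query in Solr's standard query syntax (proximity phrases, boolean operators, a date range). The field names `custodian` and `sent_date`, the collection name `discovery`, the host and the search terms are all assumptions made up for this example; they do not come from the articles.

```python
from urllib.parse import urlencode


def build_discovery_query(phrases, custodians, start, end, slop=5):
    """Combine proximity phrases, a custodian filter and a date range."""
    # "some phrase"~5 is a proximity search with a slop of 5.
    phrase_part = " OR ".join(f'"{p}"~{slop}' for p in phrases)
    custodian_part = " OR ".join(custodians)
    return (
        f"({phrase_part}) "
        f"AND custodian:({custodian_part}) "      # hypothetical field
        f"AND sent_date:[{start} TO {end}]"       # hypothetical field
    )


q = build_discovery_query(
    phrases=["insider trading", "delete emails"],
    custodians=["smith", "jones"],
    start="2015-01-01T00:00:00Z",
    end="2015-12-31T23:59:59Z",
)

# Hypothetical host and collection; only the URL is built, nothing is sent.
params = urlencode({"q": q, "rows": 100, "fl": "id,score"})
url = "http://localhost:8983/solr/discovery/select?" + params

print(q)
```

Keeping the query construction in a function like this also makes it easy to record exactly which query was run, which matters in a legal setting.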
- The cost of relevancy
“Transparent query expansion”
For every search manager it is important to know why results show up for a specific query. That knowledge is needed to tune the engine and the results that are displayed to the users.
Solr is open source, and that’s why the community invests heavily in making it transparent why results come up for a specific (complex) query.
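One way to see this transparency in practice is Solr's `debugQuery` request parameter, which adds a `debug` section with a scoring explanation per document to the response. The sketch below only builds the request URL and reads the explanations from a response; the host, the collection name `discovery`, the query and the sample response are placeholders I made up, not real Solr output.

```python
from urllib.parse import urlencode


def explain_url(base_url, query):
    """Build a select URL that asks Solr to explain its scoring."""
    params = {
        "q": query,
        "fl": "id,score",
        "debugQuery": "true",  # include the debug/explain section
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


def explanations(response):
    """Map each document id to the explanation text, if present."""
    return response.get("debug", {}).get("explain", {})


# Hypothetical host and collection name.
url = explain_url("http://localhost:8983/solr/discovery", "contract AND penalty")

# Hand-written placeholder standing in for a parsed JSON response.
sample_response = {
    "debug": {"explain": {"doc-1": "<explanation text returned by Solr>"}}
}

for doc_id, text in explanations(sample_response).items():
    print(doc_id, text)
```

Storing these explanations next to the saved query is one possible way to record why a document was part of a result set.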
Furthermore, there are tools that can be used with Solr that can even make E-Discovery better. Think of the clustering engine Carrot2. That solution makes relations in the information visible, even without knowing up front that those relations exist.
- Lengthy deployment
“All information for one audience” vs. “All information for everyone”
For this… see the first bullet under the next section “Business case”.
But also… an Enterprise search deployment can take some time because you have to find ways to get information out of some content systems. Will this be easy when using an E-Discovery solution? Do they have ways to get content out of ALL content systems? If so… please share this with the world and let that vendor get into the Enterprise search business. They will have the “golden egg”!
- Misses key data sources
E-Discovery vs. “Intranet search”
The whole promise of “Enterprise search” is to make all information in a company findable by all employees. The authors of the articles must have missed some information about this. Period.
- Not Actionable
“Viewing” vs. “Follow up”
The platforms that make up a really good Enterprise search solution are designed to support many information needs. They can support many different search based applications (SBAs). E-Discovery could just as well be such a search based application. It has specific needs in formulating queries, exploring content, saving the results, annotating them and even recording queries with their explanation.
So when I look at the differences from my point of view (implementation and technical) I see three topics:
- Business case
The Business case for an E-Discovery solution is clear: You have to implement/use this because you HAVE to. It’s a legal thing. The company has to give access to the data. Of course, doing this manually is still an option. But if there is too much information, the cost of labour will exceed the cost of a technical solution.
When we look at Enterprise search (all information within the company for all employees), no one will start implementing a technical solution without insight into the costs and benefits. Implementing a large (many sources, many documents, many users) Enterprise search solution is very costly.
The audience (target group) for E-Discovery is the investigators who have to find out if there is any relevant information concerning an indictment or absolution in a legal case. This group is highly trained and it can be assumed that they can work with complex queries, complex user interfaces, complex reporting tools etc. The focus is on getting all relevant documents, no matter how hard it is to formulate the right queries and to traverse the possible results.
The audience for Enterprise search is “everyone”. This could be skilled information specialists, but also the guys from marketing, R&D and other departments, just trying to find the right template, customer report, annual financial report or even the latest menu from the company restaurant.
The user experience has to be carefully designed so that it is usable for a broad audience with different information needs. Sometimes the most relevant answer or document is OK, but in other use cases getting all the information on a topic is needed.
For E-Discovery in legal circumstances it’s simple: Every piece of information has to be accessible. So no difficult stuff about who can see what.
In Enterprise search security is a pain in the *ss. Many different content systems, many different security mechanisms and many different users that have different identities in different systems.
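To show why this is painful, here is a minimal Python sketch of result filtering by access rights (often called security trimming). Everything in it is hypothetical: the `Document` class, the source names, the user and the group names are only there to show that one person has a different identity in each content system, and that the E-Discovery case simply skips the check.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    source: str  # content system the document came from
    # Principals allowed to read it, in that system's own naming.
    allowed: set = field(default_factory=set)


# One person, a different identity per content system (all made up).
IDENTITIES = {
    "anna": {
        "sharepoint": {"CORP\\anna", "CORP\\marketing"},
        "fileshare": {"anna", "grp-marketing"},
    },
}


def security_trim(user, results):
    """Keep only results the user may read in the originating system."""
    principals_by_source = IDENTITIES.get(user, {})
    return [
        doc for doc in results
        if principals_by_source.get(doc.source, set()) & doc.allowed
    ]


def discovery_view(results):
    """E-Discovery case as described above: everything is accessible."""
    return list(results)


results = [
    Document("1", "sharepoint", {"CORP\\marketing"}),
    Document("2", "fileshare", {"grp-finance"}),
    Document("3", "crm", {"anna@example.com"}),  # no identity mapping for crm
]

print([d.doc_id for d in security_trim("anna", results)])  # ['1']
print([d.doc_id for d in discovery_view(results)])         # ['1', '2', '3']
```

Document 3 is the typical problem case: the user may well have access in the CRM, but without a mapping of her identity in that system the search solution cannot know that and has to hide the result.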
To provide the right tools for an E-Discovery goal, a solution needs to take care of some specific demands. I am pretty sure that the search solutions I mentioned can take care of most of them. It’s all in the creation of the user interface and supporting add-ons to make it happen.
Although a typical Enterprise search implementation may not have this, the products used do offer the possibility of creating custom reports and actions (explain, store etc.).
What none of the articles mention is the complexity of getting all information out of all systems that contain the content. When abstracting from the possible tools for E-Discovery or Enterprise search, the tools for connecting to many different content systems are probably the most essential thing. When you cannot get information out of a content system, the most sophisticated tool for search will not help you.
Enterprise search vendors are well aware of that. That’s why they invest so heavily in developing connectors for many content systems. There is no “ring to rule them all” in this. If there are E-Discovery vendors that have connectors to get all information from all content systems, I would like to urge them to get into the Enterprise search business.
My conclusion is that there are a couple of products/solutions that can fulfill both Enterprise search needs and E-Discovery needs. Specifically I want to mention HPE IDOL (the former Autonomy suite) and Solr.
When looking at the cost perspective, Solr (open source) can even be the best alternative to expensive E-Discovery tools. When combining Solr with solutions that build on top of it, like LucidWorks Fusion, there is even less to build on your own.
I am only talking about two specific Enterprise search products because I want to make a point. I know that there are a lot more Enterprise search vendors/solutions that can fulfill E-Discovery needs.