Forrester recognizes HPE IDOL/Vertica as a leader in the latest (2017 Q2) “Cognitive” Search Vendor Evaluation

Last week I received e-mails from vendors like Attivio and Sinequa about the latest Forrester report “The Forrester Wave™: Cognitive Search and Knowledge Discovery Solutions“.

The enterprise search (yes, I still call it that) solutions of those vendors are placed in the “Leaders” quadrant.

Attivio has been around for some years now – about 10 years. Sinequa is a bit younger and started to gain traction about 5 years ago. At least, to my knowledge.

What I like is that HPE IDOL/Vertica is placed at the top of the leaders quadrant. It has been there for more than a decade – with a short absence because of the trouble HPE had in repositioning IDOL after the Autonomy acquisition.

Some six months ago I started working for a company specialized in implementing HPE (Autonomy) IDOL products: KnowledgePlaza. Before that (2011–2016) I worked for a company that was mostly busy with the Google Search Appliance. And before that (2006–2011) I worked for a company that mostly did Autonomy IDOL implementations (and, before that, Verity).

So I’m back in the saddle with HPE IDOL and I must say that I am still impressed by their offering. It is very complete and has been under active development over the last years. Because of its maturity, it is stable. It’s a complete suite of modules with everything you need to implement a sophisticated Enterprise search environment. A “Swiss army knife”, so to speak.

More info at “IDOL Unstructured Data Analytics: Enterprise Search & Knowledge Discovery | Micro Focus“: “Unified machine learning platform for enterprise search and big data analytics – text analytics, speech analytics, image analytics and video analytics.”

Furthermore, LucidWorks – with their Fusion offering, based on Solr – is somewhere in the middle of the Forrester wave/quadrant. Watch them, because this “Solr on steroids” offering is also very useful, or even needed, if you want to implement Solr as an enterprise search solution. Needless to say, my company also uses that product to fulfill “Cognitive Search and Knowledge Discovery” needs.

A modern intranet: connecting people to people?

Today I read “Will Intranets Disappear With the Rise of the Bots?“. The author writes about how “old” intranets were all about one-way communication and providing essential content.

But:

Intranets designed and built around document libraries, one-way communications and links to deeper knowledge are no longer the proud, highly esteemed centerpieces of old“.

According to the article this doesn’t cut it anymore. Doing business and work nowadays asks for more fluid information, fast two-way communication etc. to support decision making and innovation:

A functioning intranet has become more about people: Finding people, interacting with people, building relationships and networks.” and “Decision making needs and the drive to improve the customer experience require a more fluid and faster intranet than one that is essentially a library“.

The article goes on about bots and how those bots will assist us in getting answers to common questions and helping us with doing simple tasks.

While reading this I thought to myself: “But what about the ever-present problem of capturing tacit knowledge?“ The goal of knowledge management is to get the right information to the right people (or process) at the right time; basically, to achieve “doing the things right the first time”. There are two important use cases for managing knowledge:

  1. To make sure that new people who join the company know what the company has been doing, what works/worked and what does/did not, and where to get the information they need. Simply to make them productive as quickly as possible.
  2. To make sure that the company is doing the right things right. Think about R&D and product/business development. It makes no sense to develop products you already have or to do research on a topic that has been covered in the past and whose outcome is already known.

So when the author says:

Knowledge is instantly available, discussed, shared and fully used in the time it takes to add metadata to a document

and connecting people in a social environment is more important than securing information for future reference, we risk asking people the same questions over and over again. Also, when experienced people leave the company, the existing knowledge leaves with them. Connecting to people also poses the risk of pulling them out of their current process. This can lead to lower productivity because of the constant disturbance of notifications, instant messaging, etc.

So, I still believe in “document libraries” with high-quality information and data that any employee can access and use whenever he or she needs it. We simply need to manage the knowledge, information and data so that it is readily accessible.

When the article speaks of “bots” in that context, I translate that to “a fucking good search engine” that understands what’s important and what’s not (in the context of the user/question). Enterprise search solutions also have the ability to provide pro-active suggestions for relevant content (research, people with knowledge). It all depends on how deeply you want to integrate the different technologies.

So, connecting people remains important in a company. But for a company to survive for a long time, it needs to secure its information and “knowledge”. Surely we need more smart solutions to connect people to content, content to content, content to people and people to people.

 

Everything you wanted to know about the Dark Web… and more

Today I acquired a copy of the “Dark Web Notebook” (Investigative tools and tactics for law enforcement, security, and intelligence organizations) by Stephen E Arnold.

I know, the grumpy old man from rural Kentucky who speaks negatively about almost all large “Blue Chip” companies and “self-appointed search experts”.
I read his articles with a lot of scepticism because he seems to “know it all”.

But… with this book he surprised me.

The Dark Web is something we have all heard about, but most of us don’t know what it is, myself included. Until now.

If you are curious, you should get a copy of this book. Purchase it for $49 at https://gum.co/darkweb

From the introduction in the book:

The information in this book will equip a computer-savvy person to break the law. The purpose of the book is to help those in law enforcement, security, and intelligence to protect citizens and enforce applicable laws. The point of view in the Dark Web Notebook is pragmatic and pro-enforcement

You are warned!

Posted in: Kennis, Marktontwikkeling, Technologie by Edwin Stauthamer

HPe/IDOL (Former Autonomy IDOL) is still alive and kicking

With all the rumble about Solr, Elasticsearch and other search vendors like Coveo and Attivio, one could easily forget about that long-existing behemoth in the (enterprise) search niche: HPE/IDOL.

IDOL stands for “Intelligent Data Operating Layer”. It is a very sophisticated big data and unstructured text analytics platform that has been around for more than two decades.

HPE is still investing heavily in this technology, which consists of a very rich ecosystem of modules:

  • connectors
  • classifiers
  • taxonomy generators
  • clustering engine
  • summarization
  • language detection
  • video and audio search
  • alerting (agents)
  • visualization (Business intelligence for human information (BIFHI))
  • DIH/DAH for distribution (scalability) and mirroring (availability) of content and queries

Recently (December 2016) HPE added machine learning and Natural Language Processing to the capabilities.

IDOL can be used for knowledge search, e-commerce search, customer self service search and other use cases that require fast, accurate and relevant search.

Next to the “on premise” solution, HPE also enabled the IDOL platform to be used in the cloud with a range of services: Haven OnDemand. With this platform developers can quickly build search & data analytics applications. There are dozens of APIs available, amongst them:

  • Speech to text
  • Sentiment analysis
  • Image/video recognition
  • Geo/Spatial search
  • Graph analysis

So IDOL is still very much alive and kicking!

Looking for a specialist that can support you with first class search and text analytics based on HPE IDOL in the Netherlands? KnowledgePlaza Professional Services is a fully certified HPE Partner.

Unstructured data is still growing; Search technology alone is not the solution

Today I received an e-mail from Computerworld about IBM Watson and “How to Put Watson to Work for Powerful Information and Insights”. It’s a whitepaper with info on IBM Watson Explorer (IBM rebranded all their products that have something to do with search and text analytics with the prefix “Watson”).

One of the first things mentioned in the Whitepaper is “Unstructured data has been growing at 80%, year-over-year” (Yeah, yeah… again the magical 80%).
Mmm… where did I hear that before? Oh yeah that’s been propagated by every vendor of enterprise search solutions over the last decade or more.

The mantra of vendors of search solutions is “no matter how much information/data/documents you have, our solution is capable of finding the most relevant information”.

I’ve been a “search consultant” for years now, and I know that getting the most relevant information on top is very difficult. The more information a company has, the harder it gets.
Just adding files from file systems, SharePoint, e-mail etc. to the search environment will not make all information findable. Sure, when you are looking for a specific document of which you know some very specific features, you can probably retrieve it. But when you are looking for information on a certain topic, you will be buried under all kinds of irrelevant, non-authoritative content.

So to get your employees or customers the right information, you have to take measures:

  1. Know the processes that need information from the search engine.
  2. Identify “important”/”authoritative” sources and content, and boost that kind of content.
  3. Make it clear to the users what they can expect to find: what sources are indexed? Where can they get help?
  4. Clean, clean, clean. Yep… get rid of information that does not matter at all.
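Point 2 can be sketched in code. The following is a minimal illustration, not any vendor’s API: the field names `score` and `source` are hypothetical, and most engines express this server-side (e.g. as a boost query on Solr’s eDisMax handler) rather than re-ranking in the client.

```python
def boost_authoritative(results, boost=2.0, authoritative_sources=None):
    """Re-rank search results by multiplying the engine's base score
    for documents that come from an authoritative source.

    `results` is a list of dicts with 'id', 'score' and 'source' keys
    (hypothetical field names, for illustration only)."""
    authoritative_sources = authoritative_sources or set()
    reranked = []
    for doc in results:
        factor = boost if doc["source"] in authoritative_sources else 1.0
        # Copy the dict so the original result list stays untouched
        reranked.append({**doc, "score": doc["score"] * factor})
    # Highest (boosted) score first
    return sorted(reranked, key=lambda d: d["score"], reverse=True)
```

The point of the sketch: a document from the quality manual with a slightly lower base score can still outrank a random file-share hit once authority is taken into account.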

I could mention many more points, but this post is not intended to be a complete guide :).

One remark: using a search engine to just index as much content from as many sources as you can find in your company can be very, very useful. It gives an information manager insight into what content is available in which content system and what its quality is. That can be a first step in improving information governance: no “hearsay” but pure data and analytics.

Do not trust vendors who say that the problem is not the volume of your information/content, but the product you are now using for search.
The breeding ground for good findability is good information governance: delete outdated, irrelevant and non-authoritative content and take care of good structuring of the content that matters!

 

Posted in: Opinie, Vendors by Edwin Stauthamer

The seven (7) “deadly” sins of text analytics

John Martin of “BeyondRecognition” posted a couple of interesting articles on LinkedIn concerning the use of Text Analytics or Text Mining to classify files and documents.

Of course his “catch” is that one needs visual recognition as well as text-based pattern recognition; BeyondRecognition delivers visual recognition technology.
In nearly every article the “problem” of having “image-only” PDFs or TIFFs is mentioned; when there is no text, text mining will not work. We all know that it is very easy to OCR PDFs and TIFFs. One step further is image recognition within photos. Both technologies will give us text and metadata to associate with the files.
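That routing decision can be sketched as follows. The helper names (`needs_ocr`, `text_for_indexing`, the `ocr_fn` callback) are made up for illustration; in practice the OCR step would call an engine such as Tesseract on the rendered page images.

```python
def needs_ocr(extracted_text, min_chars=25):
    """Heuristic: a PDF/TIFF whose text layer yields (almost) no
    machine-readable characters is treated as image-only."""
    return len(extracted_text.strip()) < min_chars

def text_for_indexing(raw_text, ocr_fn):
    """Use the embedded text layer when present; otherwise fall back
    to the supplied OCR callback before handing text to the indexer."""
    return ocr_fn() if needs_ocr(raw_text) else raw_text
```

With a pipeline like this in front of the indexer, “image-only” documents stop being a blind spot for text mining.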

But still, the articles make some good points that have to be taken into account when using text-based classification solutions:

Parts 5 through 7 are still to come…

 

Posted in: Technologie by Edwin Stauthamer

Knock, Knock. Is anybody alive here?

It’s been pretty quiet around here… Not that I haven’t published anything, though.

Just read my posts and shares on LinkedIn and/or Twitter if you want to keep up with the news on enterprise search and data analytics.

 

Posted in: WS Nieuws by Edwin Stauthamer

Open source search thriving on Google Search Appliance withdrawal?

Last week I had my first encounter with a potential client that changed their policy on open source search because of a recent event.

They were in the middle of an RFI (request for information) to see what options there are for their demands regarding enterprise search, when Google announced the end-of-life of their flagship enterprise search product: the Google Search Appliance.

This has led them to think about this: “What if we choose a commercial or closed source product for our enterprise search solution and the vendor decides to discontinue it?”.

The news from Google has gotten a lot of attention on the internet, through blog posts and tweets. Of course there are commercial vendors trying to step into this “gap” like Mindbreeze and SearchBlox.

I have seen this happen before, in the time of the “great enterprise search take-overs”. Remember HP and Autonomy, IBM and Vivisimo, Oracle and Endeca, Microsoft and FAST ESP?
At that time organizations also started wondering what would happen to their investments in these high-class, high-priced “pure search” solutions.

In the case of the mentioned potential client, the GSA was on their list of possible solutions (especially because of the needed connectors and the “document preview” feature). Now it’s gone.

Because of this, they started to embrace the strength of the open source alternatives, like Elasticsearch and Solr. It’s even becoming a policy.
Surely open source will take some effort in getting all the required functionalities up and running, and they will need an implementation partner. But… they will own every piece of software that is developed for them.

I wonder if there are other examples out there of companies switching to open source search solutions, like Apache Solr, because of this kind of unexpected “turn” by a commercial / closed source vendor.

Has Google unwillingly set the enterprise search world on the path of open source search solutions like Apache Solr or Elasticsearch?

 

Enterprise Search vs. E-Discovery from a solution point of view

Last week I was invited to an “Expert meeting E-Discovery”. I’ve been in the search business for many years and I regularly encounter the concepts and practices of “E-Discovery” as well as “Enterprise search” (and e-commerce search, and search-based applications, etc.).

So I decided to get some information about what people think about the difference between Enterprise search and E-Discovery.

Definition of E-Discovery (Wikipedia):

Electronic discovery (also e-discovery or ediscovery) refers to discovery in litigation or government investigations which deals with the exchange of information in electronic format (often referred to as electronically stored information or ESI). These data are subject to local rules and agreed-upon processes, and are often reviewed for privilege and relevance before being turned over to opposing counsel.

Definition of Enterprise search (Wikipedia):

Enterprise search is the practice of making content from multiple enterprise-type sources, such as databases and intranets, searchable to a defined audience.

When you look at the definitions, the difference is in the “goal”. E-Discovery deals with legal matters to gather evidence; Enterprise search deals with “general purpose” needs to gather answers or information to be used in some business process.
But one can also see the similarities. Both deal with digital information, multiple sources and a defined audience.

What could be seen as different is that, according to these definitions, E-Discovery does not talk about a technical solution that indexes all (possibly relevant) information and makes it searchable. Enterprise search is much closer to a technical solution.

So… not many differences there, but I am beginning to have a hunch about why people could see them as different. My quest continues.

I found two articles that are pretty clear about the differences:

I think that the differences that are mentioned come from a conceptual view of E-Discovery vs. Enterprise search, not from a technical (solutions) point of view (and even on the conceptual point they are wrong). Also, I think that the authors of the articles compare the likes of the Google Search Appliance to specialized E-Discovery tools like ZyLab. They simply ignore the fact that there are a lot more solutions out there that do “Enterprise search” but are far more sophisticated than the Google Search Appliance.

Below I will get into the differences mentioned in those articles from a technical or solution point of view.

From “Enterprise Search vs. E-Discovery Search: Same or Different?“:

  1. Business objective is a key consideration
    “Recall vs. precision” (getting all the relevant information vs. getting the most relevant information).
    It is true that a typical Enterprise search implementation will focus on precision. To support efficient answering of common queries and to speed up information-driven processes in a company, precision is important.
    This does not mean that the products used for Enterprise search cannot deliver all relevant information for a query. HPE IDOL as well as Solr can retrieve all relevant information fast.
  2. Number of search queries matters
    “Simple vs. complex queries”
    Here a couple of keyword examples are given to illustrate how people use Enterprise search. I’ve been working with companies (intelligence) that use Enterprise search solutions (like HPE IDOL/Autonomy) with far more complex queries to get all possibly relevant information back.
    The complex queries that are illustrated can be handled by Solr easily.
  3. The cost of relevancy
    “Transparent query expansion”
    For every search manager it is important to know why results show up for a specific query. This is needed to tune the engine and the results that are displayed to the users.
    Solr is open source, and that’s why the community invests heavily in making it transparent why results come up for a specific (complex) query.
    Furthermore, there are tools that can be used with Solr that can make E-Discovery even better. Think of the clustering engine Carrot2. That solution makes relations in information visible even without knowing up front that those relations exist.
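The recall-vs-precision trade-off in point 1 is easy to make concrete. Given relevance judgments for one query (the document ids below are made up), the two measures are computed as:

```python
def precision_recall(retrieved, relevant):
    """Precision: fraction of retrieved documents that are relevant.
    Recall: fraction of relevant documents that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall
```

An e-discovery-style review tunes toward recall (find everything, accept noise in the result list); a typical intranet search tunes toward high precision on the first page of results.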

From “3 Reasons Enterprise Search is not eDiscovery“:

  1. Lengthy deployment
    “All information for one audience” vs. “all information for everyone”
    For this… see the first bullet under the next section, “Business case”.
    But also… an Enterprise search deployment can take some time because you have to find ways to get information out of some content systems. Will this be easier when using an E-Discovery solution? Do they have ways to get content out of ALL content systems? If so… please share this with the world and let that vendor get into the Enterprise search business. They will have the “golden egg”!
  2. Misses key data sources
    E-Discovery vs. “Intranet search”
    The whole promise of “Enterprise search” is to make all information in a company findable by all employees. The authors of the articles must have missed some information about this. Period.
  3. Not actionable
    “Viewing” vs. “follow up”
    The platforms that make up a really good Enterprise search solution are designed to support many information needs. They can support many different search-based applications (SBAs). E-Discovery could just as well be such a search-based application. It has specific needs in formulating queries, exploring content, saving the results, annotating them and even recording queries with their explanation.

Analysis

So when I look at the differences from my point of view (implementation and technical), I see four topics:

  • Business case
    The business case for an E-Discovery solution is clear: you have to implement/use it because you HAVE to. It’s a legal thing. The company has to give access to the data. Of course there is still a choice of doing this manually, but if there is too much information, the cost of labour will exceed the cost of a technical solution.
    When we look at Enterprise search (all information within the company for all employees), no one will start implementing a technical solution without insight into the costs and benefits. Implementing a large (many sources, many documents, many users) Enterprise search solution is very costly.
  • Audience
    The audience (target group) for E-Discovery is the investigators who have to find out if there is any relevant information concerning an indictment or absolution in a legal case. This group is highly trained, and it can be assumed that they can work with complex queries, complex user interfaces, complex reporting tools, etc. The focus is on getting all relevant documents, no matter how hard it is to formulate the right queries and traverse the possible results.
    The audience for Enterprise search is “everyone”. This could be skilled information specialists, but also the people from marketing, R&D and other departments, just trying to find the right template, customer report, annual financial report or even the latest menu from the company restaurant.
    The user experience has to be carefully designed so that it is usable for a broad audience with different information needs. Sometimes the most relevant answer or document is OK, but in other use cases getting all the information on a topic is needed.
  • Security
    For E-Discovery in legal circumstances it’s simple: every piece of information has to be accessible. So no difficult stuff about who can see what.
    In Enterprise search, security is a pain in the *ss. Many different content systems, many different security mechanisms and many different users that have different identities in different systems.
  • Functionality
    To provide the right tools for an E-Discovery goal, a solution needs to take care of some specific demands. I am pretty sure that the search solutions I mentioned can take care of most of them. It’s all in the creation of the user interface and supporting add-ons to make it happen.
    Although a typical Enterprise search implementation may not have all this, the products used do offer the possibility of creating custom reports and actions (explain, store, etc.).

Connectors?

What none of the articles mention is the complexity of getting all information out of all the systems that contain the content. Abstracting from the possible tools for E-Discovery or Enterprise search, the tooling for connecting to many different content systems is probably the most essential thing. When you cannot get information out of a content system, the most sophisticated search tool will not help you.
Enterprise search vendors are well aware of that. That’s why they invest so heavily in developing connectors for many content systems. There is no “one ring to rule them all” in this. If there are E-Discovery vendors that have connectors to get all information from all content systems, I would urge them to get into the Enterprise search business.

Conclusion

My conclusion is that there are a couple of products/solutions that can fulfill both Enterprise search needs and E-Discovery needs. Specifically I want to mention HPE IDOL (the former Autonomy suite) and Solr.
From a cost perspective, Solr (open source) can even be the best alternative to expensive E-Discovery tools. When combining Solr with solutions that build on top of it, like LucidWorks Fusion, there is even less to build on your own.

PS

I am only talking about two specific Enterprise search products because I want to make a point. I know that there are a lot more Enterprise search vendors/solutions that can fulfill E-Discovery needs.

Posted in: Kennis, Opinie, Vendors by Edwin Stauthamer

Enterprise search or Search Based Applications or… a vision?

Reading the article on CMSWire, “Enterprise search is bringing me down“ by Martin White, I also wonder why companies acknowledge that they have a lot of information (forgive me the term, but what can you make of the combination of “documents”, “databases”, “records”, “intranets”, “webpages”, “products”, “people cards” etc.? Yep: “information”) spread around that they see is valuable for everyone. There are plenty of solutions and products that can help them achieve the goal of re-using that information and putting it to good use.

Still, most organizations focus on maintaining (storing, archiving) that information in silos (CMS, DMS, file shares, e-mail, databases) and not on combining it to see what business value the combination of that information can (and will) bring. It’s pretty simple: if the information cannot be found, it’s useless. Why store it, maintain it? Just get rid of it!

But as humans… we do not like to delete information. It’s like the working of our brain. In our brain we keep on storing information, and at some point we make use of that information to make a decision, have a conversation, sing a song or whatever we want to do with it, because we can retrieve it and make use of it!

Is it the costs?
Okay, building a search solution is not cheap. But if you can find a vendor/solution that can grow along the way, it will not be expensive right from day one. There are commercial vendors and open source solutions that can deliver what you want. Just know what you want (in the end) and then discuss this with the implementation partners of product vendors. Maybe a “one size fits all” can be the way to go. Maybe cooperation with an open source implementation partner can make it feasible in the long run.
But… always keep in mind the business case and the value that it can deliver for your business. You are already investing in the storage and maintenance of your information right now, right? What are those costs? Why not make the information usable? Remember that search solutions are also very good “cleaning agents”. They surface all information and make it clear that something has to be done about deleting information. I don’t even want to start on the gaps in information security that a good enterprise search system can surface…

Is it the complexity?
Whenever you want to take on a big project, at first it seems quite complex and you don’t even want to get started. It is no different with doing things in your own house. But once you have made a plan – do it room by room – you will see results in a short amount of time. And you are happy that you redecorated the room. That will give you energy to take on the next room, right? It is the same with a search solution. If you take it on source by source or target group by target group, you will see improvement. And that will give you positive feedback to start on another source or target group!

Is it lack of vision?
… Yes… it’s the lack of vision. Vision doesn’t start with building a “deus ex machina” that will do everything you ever wanted. It starts with small steps that will make that vision come true. It’s about making employees and customers happy. That can be achieved by having a vision, having a big plan, making small steps and scaling fast.

Is the future of Enterprise search the creation of SBAs (search-based applications) that optimize a couple of processes / business lines instead of optimizing the whole company? The Big Data movement is surely propagating that. They deliver solutions for analyzing big data sets that revolve around a specific business process or goal. But you don’t want lots of applications doing practically the same thing, right? That will cost you a lot of money. Well-designed SBAs work on the same set of information while delivering value to many processes and target groups. The underlying infrastructure should be capable of serving many different business processes and information needs.

I still believe in the “Google adage” that all information should be made accessible and usable for everyone, within and/or (you don’t want your internal information out there for everyone to explore) outside of a company.

In my opinion, everyone inside and outside a company should have a decent solution that gives access to the information that is valuable and sometimes needed to perform their daily tasks. Google can do this on the internet, so why don’t you use the solutions at hand that can bring that to your customers and employees?

I won’t say it will be easy, but what has “easy” ever delivered? Surely not your edge over your competitors. Because then we would all be entrepreneurs with big revenues, right?

But as Martin wrote in the article, I sometimes get tired of explaining (and selling) the value of a good search solution to people who just don’t get it. Still, they use Google every day and take that value for granted… without realizing that they could have that for themselves in their own company and for their customers.

Posted in: Opinie, Technologie by Edwin Stauthamer