Forrester recognizes HPE IDOL/Vertica as a leader in the latest (2017 Q2) “Cognitive” Search vendor evaluation

Last week I received e-mails from vendors like Attivio and Sinequa about the latest Forrester report “The Forrester Wave™: Cognitive Search and Knowledge Discovery Solutions“.

The enterprise search (yes, I still call it that) solutions of those vendors are placed in the “Leaders” quadrant.

Attivio has been around for about 10 years now. Sinequa is a bit younger and started to gain traction about 5 years ago – at least, to my knowledge.

What I like is that HPE IDOL/Vertica is placed at the top of the leaders quadrant. It has been there for more than a decade – with a short absence caused by the trouble HPE had in repositioning IDOL after the Autonomy acquisition.

Some six months ago I started working for a company (KnowledgePlaza) that specializes in implementing HPE (Autonomy) IDOL products. Before that (2011-2016) I worked for a company that mostly worked with the Google Search Appliance, and before that (2006-2011) for a company that mostly did Autonomy IDOL implementations (and, earlier still, Verity).

So I’m back in the saddle with HPE IDOL and I must say that I am still impressed by their offering. It is very complete and has kept evolving over the last years. Because of its maturity, it is stable. It’s a complete suite of modules with everything you need to implement a sophisticated enterprise search environment – a “Swiss army knife”, so to speak.

More info at “IDOL Unstructured Data Analytics: Enterprise Search & Knowledge Discovery | Micro Focus“: “Unified machine learning platform for enterprise search and big data analytics – text analytics, speech analytics, image analytics and video analytics.”

Furthermore, LucidWorks – with their Fusion offering, based on Solr – sits somewhere in the middle of the Forrester Wave. Watch them, because this “Solr on steroids” offering is very useful, or even necessary, if you want to implement Solr as an enterprise search solution. Needless to say, my company also uses that product to fulfill “Cognitive Search and Knowledge Discovery” needs.

A modern intranet: connecting people to people?

Today I read “Will Intranets Disappear With the Rise of the Bots?“. The author writes about how “old” intranets were all about one-way communication and providing essential content.

But:

“Intranets designed and built around document libraries, one-way communications and links to deeper knowledge are no longer the proud, highly esteemed centerpieces of old.”

According to the article, this no longer cuts it. Doing business and work nowadays asks for more fluid information, fast two-way communication and so on, to support decision making and innovation:

“A functioning intranet has become more about people: Finding people, interacting with people, building relationships and networks.” and “Decision making needs and the drive to improve the customer experience require a more fluid and faster intranet than one that is essentially a library.”

The article goes on about bots and how those bots will assist us in getting answers to common questions and help us with simple tasks.

While reading this I thought to myself: “But what about the ever-present problem of capturing tacit knowledge?“ The goal of knowledge management is to get the right information to the right people (or process) at the right time – basically, to achieve “doing things right the first time”. There are two important use cases for managing knowledge:

  1. To make sure that new people who join the company know what the company has been doing, what works and what doesn’t, and where to get the information they need – simply to make them productive as quickly as possible.
  2. To make sure that the company is doing the right things right. Think about R&D and product/business development: it makes no sense to develop products you already have, or to do research on a topic that has been covered in the past and whose outcome is already known.

So when the author says:

Knowledge is instantly available, discussed, shared and fully used in the time it takes to add metadata to a document

and connecting people in a social environment is more important than securing information for future reference, we risk asking people the same questions over and over again. Also, when experienced people leave the company, the existing knowledge leaves with them. Connecting to people also poses the risk of pulling them out of their current process, which can lead to lower productivity because of the constant disturbance of notifications, instant messaging and the like.

So, I still believe in “document libraries” with high quality information and data that any employee can access and use whenever he or she needs it. We simply need to manage the knowledge, information and data so that it is readily accessible.

When the article speaks of “bots” in that context, I translate that to “a fucking good search engine” that understands what’s important and what isn’t (in the context of the user and the question). Enterprise search solutions can also provide proactive suggestions for relevant content (research, people with knowledge). It all depends on how deeply you want to integrate the different technologies.

So, connecting people remains important in a company. But for a company to survive for a long time, it needs to secure its information and “knowledge”. Surely we need more smart solutions to connect people to content, content to content, content to people and people to people.

 

Everything you wanted to know about the Dark Web… and more

Today I acquired a copy of the “Dark Web Notebook” (Investigative tools and tactics for law enforcement, security, and intelligence organizations) by Stephen E. Arnold.

I know, the grumpy old man from rural Kentucky who speaks negatively about almost all large “blue chip” companies and “self-appointed search experts”.
I read his articles with a lot of skepticism because he seems to “know it all”.

But… with this book he surprised me.

The Dark Web is something we have all heard about, but most of us don’t know what it really is – myself included. Until now.

If you are curious, you should get a copy of this book. It can be purchased for $49 at https://gum.co/darkweb

From the introduction in the book:

The information in this book will equip a computer-savvy person to break the law. The purpose of the book is to help those in law enforcement, security, and intelligence to protect citizens and enforce applicable laws. The point of view in the Dark Web Notebook is pragmatic and pro-enforcement

You have been warned!

HPE IDOL (formerly Autonomy IDOL) is still alive and kicking

With all the rumble about Solr, Elasticsearch and other search vendors like Coveo and Attivio, one could easily forget about that long-standing behemoth in the (enterprise) search niche: HPE IDOL.

IDOL stands for “Intelligent Data Operating Layer”. It is a very sophisticated big data and unstructured text analytics platform that has been around for more than two decades.

HPE is still investing heavily in this technology, which consists of a very rich ecosystem of modules:

  • connectors
  • classifiers
  • taxonomy generators
  • clustering engine
  • summarization
  • language detection
  • video and audio search
  • alerting (agents)
  • visualization (Business intelligence for human information (BIFHI))
  • DIH/DAH for distribution (scalability) and mirroring (availability) of content and queries

Recently (December 2016) HPE added machine learning and Natural Language Processing to these capabilities.

IDOL can be used for knowledge search, e-commerce search, customer self-service search and other use cases that require fast, accurate and relevant search.

Next to the “on-premise” solution, HPE also makes the IDOL platform available in the cloud through a range of services: Haven OnDemand. With this platform, developers can quickly build search and data analytics applications. There are dozens of APIs available, amongst them (see the sketch after this list):

  • Speech to text
  • Sentiment analysis
  • Image/video recognition
  • Geo/Spatial search
  • Graph analysis
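
To give an impression of how these Haven OnDemand APIs are typically consumed, here is a minimal Python sketch. The endpoint path, parameter names and response layout below are assumptions based on the public REST pattern Haven OnDemand documented at the time (synchronous calls authenticated with an API key) and may have changed since; the API key is a placeholder.

```python
import requests

# Hedged sketch: endpoint path, parameter names and response layout are
# assumptions based on the Haven OnDemand REST pattern and may differ.
HOD_SENTIMENT_URL = "https://api.havenondemand.com/1/api/sync/analyzesentiment/v1"
API_KEY = "your-api-key"  # hypothetical placeholder


def analyze_sentiment(text):
    """Send a piece of text to the (assumed) sentiment analysis API."""
    response = requests.get(
        HOD_SENTIMENT_URL,
        params={"apikey": API_KEY, "text": text},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()


if __name__ == "__main__":
    result = analyze_sentiment("IDOL is still very much alive and kicking!")
    print(result)  # e.g. an aggregate sentiment score plus per-entity details
```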

So IDOL is still very much alive and kicking!

Looking for a specialist that can support you with first class search and text analytics based on HPE IDOL in the Netherlands? KnowledgePlaza Professional Services is a fully certified HPE Partner.

Open source search thriving on Google Search Appliance withdrawal?

Last week I had my first encounter with a potential client that changed their policy on open source search because of a recent event.

They were in the middle of an RFI (request for information) to see what options exist for their enterprise search requirements when Google announced the end-of-life of its flagship enterprise search product: the Google Search Appliance.

This led them to ask themselves: “What if we choose a commercial or closed source product for our enterprise search solution and the vendor decides to discontinue it?”.

The news from Google got a lot of attention on the internet through blog posts and tweets. Of course there are commercial vendors trying to step into this “gap”, such as Mindbreeze and SearchBlox.

I have seen this happen before, in the time of the “great enterprise search take-overs”. Remember HP and Autonomy, IBM and Vivisimo, Oracle and Endeca, Microsoft and FAST ESP?
At that time organizations also started wondering what would happen to their investments in these high-class, high-priced “pure search” solutions.

In the case of this potential client, the GSA was on their list of possible solutions (especially because of the required connectors and the “document preview” feature). Now it’s gone.

Because of this, they started to embrace the strength of the open source alternatives, such as Elasticsearch and Solr. It’s even becoming a policy.
Surely open source will take some effort to get all the required functionality up and running, and they will need an implementation partner. But… they will own every piece of software that is developed for them.

I wonder if there are other examples out there of companies switching to open source search solutions, like Apache Solr, because of this kind of unexpected “turn” by a commercial / closed source vendor.

Has Google unwillingly set the enterprise search world on the path of open source search solutions like Apache Solr or Elasticsearch?

 

“The future of search is to build the ultimate assistant”

Last week, one of my customers pointed me to an article on Search Engine Land, titled: “The rise of personal assistants and the death of the search box“.

Google’s Behshad Behzadi explains why he thinks the convenient little search box that sits in the top right corner of nearly every page on the web will be replaced. The article was written by Eric Enge and, of course, reflects his interpretation.

“Google’s goal is to emulate the “Star Trek” computer, which allowed users to have conversations with the computer while accessing all of the world’s information at the same time.”

I think that’s a great goal, and these things could be happening in the not too distant future. Of course we all know Siri, Cortana and Google Now, so this is not so hard to imagine. Below is a timeline of the growth of Google.com:

[Image: timeline from “The rise of personal assistants and the death of the search box”]

These days we are talking more and more to our computers. For most people it still feels weird, but “It’s becoming more and more acceptable to talk to a phone, even in groups.”

So… search applications are getting to know our voice, and the way we speak is becoming the way we search.

That demands a lot from search engines. They need to become more intelligent, able to interpret questions and match them against a vast number of possible answers hidden in documents, knowledge bases, graphs, databases and so on.
Having found possible answers, the search application needs to present them in a meaningful way and start a dialog to be sure it has interpreted the question correctly.

This future got me wondering about “enterprise search”. All this exciting stuff is happening on the internet; search behind the firewall is lagging behind. The vast amounts of information and development power available on the internet are not available inside the enterprise.
An answering machine needs constant development: better language interpretation, more knowledge graphs (facts and figures) to drive connections, machine learning to process queries, the clicks visitors perform, other user feedback and so on.

The question is whether on-premise enterprise search solutions can ever deliver the same experience as solutions that run in the cloud. It’s impossible to come up with a product that installs on-premise and has the same rich features that Google delivers online. One could try, but then the question is whether the product can keep up with the pace of improvement.

So will the “death of the search box” also lead to “the death of the on-premise search solution”? Google is dropping support for its on-premise search solution, the Google Search Appliance, for a reason: the move to the cloud and personal assistants is driving it.

Goodbye Google Search Appliance, we are going to the cloud!

The History

It was the year 2005 when Google decided that they could use their superior search to make information in enterprises/behind the firewall searchable.

That year Google released the Google Mini: a cute little blue server, pre-installed with Google’s search software.

The Mini could index up to 300,000 documents. The functionality was limited, but it was great at crawling web-based content, just like Google.com. The Mini was mainly used to index intranets and public websites – this was before Google introduced Site Search as a product. The Mini did not have features like faceting, nor connectors to crawl documents from sources other than websites or databases.

Google must have realized that the Mini could not fulfill enterprise search demands (many different content sources, the need for faceting, tuning relevance, coping with millions of documents, etc.), so they released the Google Search Appliance.

The first versions of the GSA were very similar to the Mini. They added some connectors, faceting, mirroring and APIs to manage the appliance.
One important feature was the ability to scale to millions of documents, distributed over several appliances; the limit one appliance could index was 10 million documents.
The proposition of the GSA shook up the enterprise search market. Managing the GSA was easy, and so enterprise search became easy – or at least so it seemed. “Google knows search and now it is bringing that knowledge to the enterprise. We can have search in our business as good as Google.com.“ NOT so fast: there is a big difference between search on the web and search in the enterprise (read “Differences Between Internet vs. Enterprise Search“).

In 2012 Google withdrew the Mini from its offerings and focused on selling more GSAs and improving the enterprise capabilities. I assume the two were not that different at all, and that there was a lot more money to be made with the GSA.

After that, more energy was put into improving the GSA. After version 6 (the Mini stopped at version 5) came version 7, with more connectors and features like wildcard search (truncation with ‘*’), entity recognition, document preview (Documill), etc. A minor detail is that the out-of-the-box search interface of the GSA was never improved; it still reflected Google.com as it looked back in 2005.

In the last few years it became clear that Google didn’t know what to do with this anomaly in its cloud offerings. Attention dropped, employees were relocated to other divisions (mainly Google Apps and Cloud), and the implementation partners were left on their own when it came to sales support. Few new features were added.

At the beginning of 2015 Google revamped its attention and dedicated more resources to the GSA again. It was clear (at that time) that the profits from the GSA were good and could be even better. Better sales support was promised to the partners (global partner meetings) and sales went slightly up. In 2015 version 7.4 was released with some small improvements, but also with a brand new connector framework (Plexi adaptors). Several technology partners invested in developing connectors to support this new model. A small detail was that the new connector framework relied heavily on crawling by the GSA, with the adaptors acting more like a “proxy”, whereas the old connector framework was largely independent of the GSA and pushed the full contents of documents to it. (Because of the open source character of the connectors, other companies started to use them in their own offerings – LucidWorks using the SharePoint connector, for example.)
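
To make the difference between the two connector models concrete, here is a purely conceptual Python sketch. The class and method names are hypothetical, not the actual GSA or Plexi adaptor APIs: the old model pushes the full document content into the index itself, while the new adaptor model only lists document IDs and serves content when the engine’s own crawler asks for it.

```python
# Conceptual sketch only: names are hypothetical, not the real GSA/Plexi APIs.

class PushConnector:
    """Old-style connector: reads documents from the source and pushes
    their full content (plus metadata) straight into the search index."""

    def __init__(self, source, index):
        self.source = source
        self.index = index

    def run(self):
        for doc in self.source.list_documents():
            content = self.source.read(doc.id)          # connector does the fetching
            self.index.add(doc.id, content, doc.metadata)


class ProxyAdaptor:
    """New-style adaptor: only tells the engine which document IDs exist
    and serves content when the engine's crawler asks for it."""

    def __init__(self, source):
        self.source = source

    def get_doc_ids(self):
        # The engine schedules its own crawl over these IDs.
        return [doc.id for doc in self.source.list_documents()]

    def get_doc_content(self, doc_id):
        # Called by the engine during crawling; the adaptor is just a proxy.
        return self.source.read(doc_id)
```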

I’ve been working with the GSA for a long time, and I must say that the solution made a lot of customers happy. The GSA really is easy to administer, and its performance and stability are close to perfect.

On Thursday, February 4th, 2016, Google sent an e-mail to all GSA owners and partners stating that the GSA is “end-of-life”. Google will continue to offer support and renewals until 2019, but the product will no longer be developed. This came as a blow to the existing customers (who had invested a lot of money very recently) and to the partners.

Google doesn’t have an alternative for enterprise search yet; it must be working on a cloud offering. That offering will certainly be able to search through Google Drive (duh…) and some cloud services like Salesforce, Dropbox, Box, etc., since the data for those applications already resides in the cloud.

Also see the article “Why Google’s enterprise search is on a sunset march to the cloud“.

Observations

  • Google is a cloud company; it doesn’t like you keeping information in on-premise or private cloud solutions.
    Supporting an on-premise solution is “strange” for Google.
  • Enterprise search is hard. Slapping an appliance onto intranets and websites doesn’t cut it.
    Enterprise search is not web search: there are many more sources and different relevancy models.
  • The license model of the GSA runs into problems with large numbers of documents/records,
    let alone when you want to combine structured information from databases.
  • Delivering a search experience like Google.com in the enterprise is not possible out-of-the-box.
    Google.com has a lot of “application logic” and call-outs to other sources; what we see is not just the search engine at work.
  • The GSA is a “relevancy machine”. It does not work well with structured content.
  • To support enterprise search, a vendor needs many connectors to tap into many different content systems.
    Google provides support for 5 content sources out-of-the-box; other connectors are delivered by partners and require additional investments/contracts.
  • To support disparate content systems with different metadata models, the search engine needs metadata mapping functionality (see the sketch after this list).
    The GSA has always relied on the quality of content and metadata in the indexed content systems, which rarely matches reality.
  • Also see the article “Why Google’s enterprise search is on a sunset march to the cloud“, which offers a slightly different take on the subject.
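
As an illustration of what such metadata mapping can look like, here is a minimal sketch with hypothetical field names, not tied to any specific product: each source system’s native fields are normalized into one common schema before indexing, so faceting and sorting work across sources.

```python
# Minimal sketch of metadata mapping: hypothetical field names, not any product's API.

# Per-source mapping from native metadata fields to a common search schema.
FIELD_MAPPINGS = {
    "sharepoint": {"Title": "title", "Editor": "author", "Modified": "last_modified"},
    "filesystem": {"filename": "title", "owner": "author", "mtime": "last_modified"},
    "crm":        {"subject": "title", "account_manager": "author", "updated_at": "last_modified"},
}


def normalize_metadata(source, raw_metadata):
    """Map a source system's native metadata onto the unified schema."""
    mapping = FIELD_MAPPINGS.get(source, {})
    unified = {target: raw_metadata[native]
               for native, target in mapping.items()
               if native in raw_metadata}
    unified["source_system"] = source  # keep provenance for filtering/faceting
    return unified


# Example: a SharePoint item and a file share item end up with the same fields.
print(normalize_metadata("sharepoint", {"Title": "Q3 report", "Editor": "j.doe"}))
print(normalize_metadata("filesystem", {"filename": "q3-report.docx", "owner": "j.doe"}))
```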

Conclusion

Google has proven not to be an enterprise search solution provider. It tried with the Google Search Appliance, but it (sadly) failed. The GSA was a good product that fit well in many areas, but Google is a cloud company and has no other on-premise solutions.
Google must have come to the conclusion that enterprise search is hard and that the investment doesn’t stand up to the profit. Google doesn’t publish revenue figures for GSA deals, but it must be a small part of its revenue.

The GSA lacks some features that would make it “enterprise ready”, and the backlog of feature requests would mean years of work to catch up with the current vendors.

Google is a cloud-born company that thinks in large volumes of users. Its offerings are all cloud-based and focus on millions of users each paying a small amount of money on a usage basis. When operating at that scale, minimal margins are fine because of the volume.
Enterprise search doesn’t work that way. The license model of the GSA (based on the number of documents) discourages opening up large amounts of documents (although that’s not only the case for the GSA; other search vendors have the same model).

Having said that, there are a couple of search vendors that are ready to step up and use Google’s retreat from the enterprise search market as their “golden egg”:

  • Mindbreeze
    Offers an Enterprise Search Appliance. They even offer a solution to migrate from GSA to Mindbreeze Inspire.
    The 300+ connectors could be the reason to switch over.
  • SearchBlox
    A long-term competitor of the GSA. Offers a similar experience but with more functionality at lower cost.
  • LucidWorks Fusion
    The commercial party behind Solr, the most widely supported open source search engine in the world, with a lot of features. Fusion adds connectors, manageability and index-time data processing to enable an advanced search experience.

This blog reflects my personal opinions and not those of my employer.

Enterprise Search – History repeats itself

Two decades ago there was one big vendor of search solutions for businesses: Verity. Verity delivered a solution to make all information within an organization searchable and findable, regardless of its source. This type of solution is also known as “Enterprise Search”.
Autonomy acquired Verity in the early 2000s, and a few years ago HP acquired Autonomy.

Since then, many vendors have entered the “enterprise search” market:

  • Coveo
  • Endeca (now Oracle)
  • Exalead (now Dassault Systèmes)
  • LucidWorks (Fusion)
  • and more

In my time as a search consultant I have implemented many solutions and followed the developments of various vendors, including new ones.

Every enterprise search solution has to address the same concerns:

  1. How do you get the information from the various systems into the index (crawling, feeding, connectors)?
  2. How do you make sure that users of the search solution can only find the results they are actually allowed to see (in accordance with the permissions that apply in the source system)?

The “old” solutions such as Autonomy have many connectors to hook up information systems, complete with solutions for permissions, updates, scalability, availability and so on.

The “new” vendors run into the same problems the “old” vendors already solved. How do you determine which user may see which result? What if a source system is unavailable? Do you simply delete all content that is no longer reachable because a connector can’t get to it?
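
To make the permission question concrete, here is a minimal sketch of query-time security trimming with hypothetical group and ACL structures. Real products solve this with early or late binding against the ACLs of the source systems, but the principle is the same: every indexed document carries its ACL, and results are filtered against the user’s group memberships.

```python
# Minimal sketch of query-time security trimming (hypothetical structures,
# not any specific product's model). Each indexed document carries the ACL
# it had in its source system; results are filtered against the user's groups.

DOCUMENTS = [
    {"id": "doc1", "title": "Public handbook",    "allowed_groups": {"everyone"}},
    {"id": "doc2", "title": "HR salary overview", "allowed_groups": {"hr"}},
    {"id": "doc3", "title": "Project X design",   "allowed_groups": {"engineering", "management"}},
]

USER_GROUPS = {
    "alice": {"everyone", "engineering"},
    "bob":   {"everyone", "hr"},
}


def search(query, user):
    """Return only the matching documents the user is allowed to see."""
    groups = USER_GROUPS.get(user, {"everyone"})
    hits = [d for d in DOCUMENTS if query.lower() in d["title"].lower()]
    return [d for d in hits if d["allowed_groups"] & groups]


print(search("project", "alice"))  # alice sees the engineering document
print(search("project", "bob"))    # bob does not
```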

Last week I ran into exactly such a problem. In an environment where we had implemented Google’s solution (Google Search Appliance (GSA) + Adaptor for SharePoint), the adaptor (i.e. connector) turned out to be no longer available. Because the adaptor was unavailable, the GSA could no longer reach that source.

The result? All documents (4 million) were removed from the index. It takes about two weeks to re-acquire that content. You can imagine the consequences for the user experience.

It surprises me to see that every vendor of enterprise search solutions keeps wanting to reinvent the wheel because they think they can do it better or differently. The “not invented here” syndrome seems to prevail, instead of (re)using what others have already devised and building on top of it.

Of course I understand the commercial side. I just don’t understand how one (read: new vendors) wants to build a new solution without making use of the knowledge and solutions that already exist.

Even a donkey doesn’t stumble over the same stone twice, does it?

This also underlines the importance of involving an “enterprise search” expert when, as an organization, you want to look into implementing search solutions.
Search engine vendors often highlight only a few aspects of a complete solution.

StateofEnterpriseSearch.nl presents: Webinar

Last week I attended a webinar titled “The State of Enterprise Search“.

The webinar was organized by BA Insight.

In a kind of “round table” setting, well-known people in the field of enterprise search discussed topics introduced by the moderator. The participants were:

  • Martin White
  • Sue Feldman
  • Jeff Fried

The webinar was recorded and can be replayed at http://vimeo.com/78551770.

In recent weeks BA Insight has also delivered two reports: State of Search in the Enterprise: Part 1 & Part 2.

Enjoy listening and reading!

Enterprise Search: standing still or moving forward?

A lot is written about enterprise search, and new parties with a “brilliant” new solution keep entering this market.

The newcomers usually have a product that is based on, or derived from, the open source Lucene core.

It is striking that these solutions usually tackle only part of the total “information retrieval” problem. They are good at X or Y, but almost never at both X and Y.

It is also striking that the measurable principles of “precision and recall” (the yardstick for determining relevance) are ignored and the principle of “good enough“ is applied instead. The question, of course, is what “good enough” means.
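
For reference, precision and recall are easy to state; the sketch below computes both for a single result set (illustrative only, since real evaluations average over many queries and often use graded relevance judgments).

```python
# Precision and recall, the classic relevance measures referenced above.

def precision_recall(retrieved, relevant):
    """retrieved / relevant are sets of document IDs."""
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Example: 8 of the 10 returned documents are relevant, out of 20 relevant in total.
print(precision_recall(set(range(10)), set(range(2, 22))))  # (0.8, 0.4)
```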

Where have the “game-changing” vendors like Verity, Autonomy and Endeca gone? When Larry and Sergey were still playing with building blocks, these vendors already had solutions for security, connectivity to a great many different information systems, and distributed architectures. No better products have appeared since, only vendors that do things differently.

All vendors of “best of breed” search solutions have since been acquired by HP, IBM, Dassault, Oracle and Microsoft. They have broken the technology up and absorbed the pieces into their own integration or infrastructure offerings. Search has thus become a component of an existing product, but the “enterprise search” or “universal search” vision behind the original solutions has been lost.

Does the rise of cheap open-source-based products define the playing field of “Enterprise Search”? It is about big data and business intelligence on the one hand, and “one size fits all”, “good is good enough” on the other. The words have changed, but the problem of “findability” has still not been solved.

The newcomers tweet and blog about the funding they have raised, but not about big customers and implementations they have delivered. Why? Are there no big success stories to tell? Of course, I am not talking about making an intranet searchable, but about solving a large information problem in which real business value was obtained.

As we know, findability is about the quality of information as well as technology. Yet “search” is often seen as an infrastructure product (like e-mail) and not related to the management of the information that needs to be searchable (see also the Earley & Associates blog post “Building the Business case for enterprise search”).

The reality is that a new Autonomy or Endeca will not appear any time soon. It might even be impossible, because enterprise search is no longer a solution in itself. It has become a “hidden feature” of other products. It takes effort to find it, and even more effort to make it work the way the user wants.

What do you think? Is “Enterprise Search” dead, and do we need other, revolutionary solutions to make all information within an organization usable and findable?