Forrester recognizes HPE IDOL/Vertica as a leader in the latest (2017 Q2) “Cognitive” Search Vendor Evaluation

Last week I received e-mails from vendors like Attivio and Sinequa about the latest Forrester report “The Forrester Wave™: Cognitive Search and Knowledge Discovery Solutions“.

The enterprise search (yes, I still call it that) solutions of those vendors are placed in the “Leaders” quadrant.

Attivio has been around for some years now – about 10 years. Sinequa is a bit younger and started to gain traction about 5 years ago. At least, to my knowledge.

What I like is that HPE IDOL/Vertica is placed at the top of the leaders quadrant. It has been there for more than a decade – with a short absence because of the trouble HPE had in repositioning IDOL after the Autonomy acquisition.

Some six months ago I started working for a company (KnowledgePlaza) specialized in implementing HPE (Autonomy) IDOL products. Before that (2011-2016) I worked for a company that was mostly busy with the Google Search Appliance, and before that (2006-2011) for a company that mostly did Autonomy IDOL implementations (and, earlier still, Verity).

So I’m back in the saddle with HPE IDOL and I must say that I am still impressed by their offering. It is very complete and has seen continuous development over recent years. Because of that maturity, it is stable. It’s a complete suite of modules with everything you need to implement a sophisticated Enterprise search environment. A “Swiss army knife”, so to speak.

More info at “IDOL Unstructured Data Analytics: Enterprise Search & Knowledge Discovery | Micro Focus“: “Unified machine learning platform for enterprise search and big data analytics – text analytics, speech analytics, image analytics and video analytics.”

Furthermore, LucidWorks – with their Fusion offering, based on Solr – is somewhere in the middle of the Forrester wave/quadrant. Watch them, because this “Solr on steroids” offering is very useful or even necessary if you want to implement Solr as an enterprise search solution. Needless to say, my company also uses that product to fulfill “Cognitive Search and Knowledge Discovery” needs.

HPE IDOL (formerly Autonomy IDOL) is still alive and kicking

With all the rumble about Solr, Elasticsearch and other search vendors like Coveo and Attivio, one could easily forget about that long-existing behemoth in the (enterprise) search niche: HPE IDOL.

IDOL stands for “Intelligent Data Operating Layer”; it is a very sophisticated big data and unstructured text analytics platform that has been around for more than two decades.

HPE is still investing heavily in this technology, which consists of a very rich ecosystem of modules (a small query sketch follows the list):

  • connectors
  • classifiers
  • taxonomy generators
  • clustering engine
  • summarization
  • language detection
  • video and audio search
  • alerting (agents)
  • visualization (Business intelligence for human information (BIFHI))
  • DIH/DAH for distribution (scalability) and mirroring (availability) of content and queries
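To give a feel for how these modules are addressed: IDOL engines (and a DAH fronting them) expose an HTTP interface, the ACI API, that accepts actions such as Query. Below is a minimal Python sketch; the host, port and exact parameter spelling are assumptions based on a typical IDOL setup, so check them against your own installation.

```python
# Minimal sketch: querying an IDOL content engine (or a DAH fronting several
# mirrored engines) over its ACI HTTP interface. Host, port and parameter
# names are assumptions based on a typical setup.
import requests
from urllib.parse import quote_plus

text = quote_plus("information governance")
# Canonical ACI URL form: /action=<Action>&<Parameter>=<value>&...
url = f"http://idol-content:9000/action=Query&Text={text}&MaxResults=10&Print=Fields"

response = requests.get(url, timeout=30)
response.raise_for_status()
print(response.text)  # ACI responses are XML by default
```

Because a DAH answers the same ACI actions as a single engine, the client code stays identical when you scale out or mirror the content engines behind it.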

Recently (December 2016) HPE added machine learning and Natural Language Processing to its capabilities.

IDOL can be used for knowledge search, e-commerce search, customer self-service search and other use cases that require fast, accurate and relevant search.

Besides the on-premise solution, HPE also enabled the IDOL platform to be used in the cloud with a range of services: Haven OnDemand. With this platform, developers can quickly build search and data analytics applications. There are dozens of APIs available, among them (a sketch of one such call follows the list):

  • Speech to text
  • Sentiment analysis
  • Image/video recognition
  • Geo/Spatial search
  • Graph analysis
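To give an impression of how these services are used, here is a minimal Python sketch of a sentiment-analysis call. The endpoint and parameter names follow the pattern Haven OnDemand documented at the time, but treat them as assumptions and check the current documentation before relying on them.

```python
# Minimal sketch of a Haven OnDemand sentiment-analysis call. Endpoint and
# parameter names are assumptions based on the documented pattern at the time.
import requests

API_KEY = "your-api-key"  # placeholder
URL = "https://api.havenondemand.com/1/api/sync/analyzesentiment/v1"

resp = requests.post(
    URL,
    data={"text": "The new search interface is fantastic!", "apikey": API_KEY},
    timeout=30,
)
resp.raise_for_status()
result = resp.json()
# The response contains an aggregate sentiment plus per-phrase scores.
print(result.get("aggregate", result))
```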

So IDOL is still very much alive and kicking!

Looking for a specialist that can support you with first class search and text analytics based on HPE IDOL in the Netherlands? KnowledgePlaza Professional Services is a fully certified HPE Partner.

Unstructured data is still growing; search technology alone is not the solution

Today I received an e-mail from Computerworld about IBM Watson and “How to Put Watson to Work for Powerful Information and Insights”. It’s a whitepaper with info on IBM Watson Explorer (IBM rebranded all their products that have something to do with search and text analytics with the prefix “Watson”).

One of the first things mentioned in the whitepaper is “Unstructured data has been growing at 80%, year-over-year” (yeah, yeah… again the magical 80%).
Hmm… where did I hear that before? Oh yeah, that has been propagated by every vendor of enterprise search solutions for the last decade or more.

The mantra of vendors of search solutions is “no matter how much information/data/documents you have, our solution is capable of finding the most relevant information”.

I’ve been a “search consultant” for years now, and I know that getting the most relevant information on top is very difficult. The more information a company has, the harder it gets.
Just adding files from file systems, SharePoint, e-mail etc. to the search environment will not make all information findable. Sure, when you are looking for a specific document of which you know some very specific features, you can probably retrieve it. But when you are looking for information on a certain topic, you will be buried under all kinds of irrelevant, non-authoritative content.

So to get your employees or customers the right information, you have to take measures:

  1. Know the processes that need information from the search engine.
  2. Identify “important”/”authoritative” sources and content.
    Boost that kind of content (see the sketch after this list).
  3. Make it clear to the users what they can expect to find: what sources are indexed?
    Where can they get help?
  4. Clean, clean, clean.
    Yep… get rid of information that does not matter at all.
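To make point 2 concrete with Solr (the same idea exists in IDOL and other engines): you can lift authoritative content with a boost query. A minimal sketch, assuming a hypothetical intranet collection where an authoritative_b flag is set at index time; all names here are illustrative, not a prescription.

```python
# Minimal sketch of boosting authoritative content in Solr with an edismax
# boost query (bq). Collection and field names are hypothetical.
import requests

params = {
    "q": "travel expenses policy",
    "defType": "edismax",
    "qf": "title^3 body",             # search title and body, weight title up
    "bq": "authoritative_b:true^10",  # additive boost for authoritative docs
    "rows": 10,
    "wt": "json",
}
resp = requests.get("http://localhost:8983/solr/intranet/select",
                    params=params, timeout=30)
resp.raise_for_status()
for doc in resp.json()["response"]["docs"]:
    print(doc.get("title"))
```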

I could mention many more points, but this post is not intended to be a complete guide :).

One remark: using a search engine to just index as much content from as many sources as you can find in your company can be very, very useful. It gives an information manager insight into what content is available in which content system and what its quality is. That can be a first step in improving information governance: no hearsay, but pure data and analytics.

Do not trust vendors who say that the volume of information/content is not the problem, but that the product you are now using for search is.
The breeding ground for good findability is good information governance: delete outdated, irrelevant and non-authoritative content, and take care of good structuring of the content that matters!


Open source search thriving on Google Search Appliance withdrawal?

Last week I had my first encounter with a potential client that changed their policy on open source search because of a recent event.

They were in the middle of an RFI (request for information) to see what options there were for their enterprise search demands, when Google announced the end-of-life of its flagship enterprise search product: the Google Search Appliance.

This led them to think: “What if we choose a commercial or closed source product for our enterprise search solution and the vendor decides to discontinue it?”

The news from Google has gotten a lot of attention on the internet, through blog posts and tweets. Of course there are commercial vendors trying to step into this “gap” like Mindbreeze and SearchBlox.

I have seen this happen before, in the time of the “great enterprise search takeovers”. Remember HP and Autonomy, IBM and Vivisimo, Oracle and Endeca, Microsoft and FAST ESP?
At that time organizations also started wondering what would happen to their investments in these high-class, high-priced “pure search” solutions.

In the case of the potential client mentioned above, the GSA was on their list of possible solutions (especially because of the needed connectors and the “document preview” feature). Now it’s gone.

Because of this, they started to embrace the strength of the open source alternatives, like Elasticsearch and Solr. It’s even becoming a policy.
Surely, open source will take some effort to get all the required functionality up and running, and they will need an implementation partner. But… they will own every piece of software that is developed for them.

I wonder if there are other examples out there of companies switching to open source search solutions, like Apache Solr, because of this kind of unexpected “turn” by a commercial / closed source vendor.

Has Google unwittingly set the enterprise search world on the path of open source search solutions like Apache Solr or Elasticsearch?


Enterprise Search vs. E-Discovery from a solution point of view

Last week I was invited to an “Expert meeting E-Discovery”. I’ve been in the search business for many years and I regularly encounter the concept and practice of “E-Discovery” as well as “Enterprise search” (and E-commerce search, Search Based Applications, etc.).

So I decided to get some information about what people think about the difference between Enterprise search and E-Discovery.

Definition of E-Discovery (Wikipedia):

Electronic discovery (also e-discovery or ediscovery) refers to discovery in litigation or government investigations which deals with the exchange of information in electronic format (often referred to as electronically stored information or ESI). These data are subject to local rules and agreed-upon processes, and are often reviewed for privilege and relevance before being turned over to opposing counsel.

Definition of Enterprise search (Wikipedia):

Enterprise search is the practice of making content from multiple enterprise-type sources, such as databases and intranets, searchable to a defined audience.

When you look at the definitions, the difference is in the goal. E-Discovery deals with legal matters, gathering evidence; Enterprise search serves a general purpose, gathering answers or information to be used in some business process.
But one can also see the similarities. Both deal with digital information, multiple sources and a defined audience.

What could be seen as different is that, according to these definitions, E-Discovery does not refer to a technical solution that indexes all (possibly relevant) information and makes it searchable. Enterprise search is much closer to a technical solution.

So… not many differences there, but I am beginning to have a hunch about why people could see them as different. My quest continues.

I found two articles that are pretty clear about the differences; both are discussed below.

I think that the differences mentioned come from a conceptual view of E-Discovery vs. Enterprise search, not from a technical (solutions) point of view (and even on the conceptual level they are wrong). I also think that the authors of the articles compare the likes of the Google Search Appliance to specialized E-Discovery tools like ZyLAB. They simply gloss over the fact that there are a lot more solutions out there that do “Enterprise search” but are far more sophisticated than the Google Search Appliance.

Below I will get into the differences mentioned in those articles from a technical or solution point of view.

From “Enterprise Search vs. E-Discovery Search: Same or Different?“:

  1. Business objective is a key consideration
    “Recall vs. precision” (getting all the relevant information vs. getting the most relevant information)
    It is true that a typical Enterprise search implementation will focus on precision. To support efficient answering of common queries and to speed up information-driven processes in a company, precision is important.
    This does not mean that the products used for Enterprise search cannot deliver all relevant information for a query. Both HPE IDOL and Solr can retrieve all relevant information fast.
  2. Number of search queries matters
    “Simple vs. complex queries”
    Here a couple of keyword examples are given to illustrate how people use Enterprise search. I’ve been working with (intelligence) companies that use Enterprise search solutions (like HPE IDOL/Autonomy) with far more complex queries to get back all possibly relevant information.
    The complex queries that are illustrated can be handled by Solr easily (see the sketch after this list).
  3. The cost of relevancy
    “Transparent query expansion”
    For every search manager it is important to know why results show up for a specific query. This is needed to tune the engine and the results that are displayed to the users.
    Solr is open source, and that’s why the community invests heavily in making it transparent why results come up for a specific (complex) query.
    Furthermore, there are tools that can be used with Solr that can make E-Discovery even better. Think of the clustering engine Carrot2: it makes relations in information visible without even knowing up front that those relations exist.
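To illustrate points 2 and 3 with Solr: the query below combines a proximity phrase, boolean logic and a date range, and debugQuery=true makes Solr return an explanation of every score. A minimal sketch; the collection and field names are hypothetical.

```python
# Minimal sketch: a complex boolean/proximity/date-range query, with
# debugQuery=true so Solr explains how each document was scored.
# Collection and field names are hypothetical.
import requests

params = {
    "q": 'body:("wire transfer"~5 AND (fraud OR laundering)) '
         'AND date:[2015-01-01T00:00:00Z TO *]',
    "debugQuery": "true",  # adds per-document scoring explanations
    "rows": 5,
    "wt": "json",
}
resp = requests.get("http://localhost:8983/solr/ediscovery/select",
                    params=params, timeout=30)
resp.raise_for_status()
data = resp.json()
# "explain" maps each result id to the reasoning behind its score.
print(data["debug"]["explain"])
```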

From “3 Reasons Enterprise Search is not eDiscovery“:

  1. Lengthy deployment
    “All information for one audience” vs. “all information for everyone”
    For this… see the first bullet under the “Business case” section below.
    But also: an Enterprise search deployment can take some time because you have to find ways to get information out of certain content systems. Will this be easier when using an E-Discovery solution? Do they have ways to get content out of ALL content systems? If so… please share this with the world and let that vendor get into the Enterprise search business. They will have the “golden egg”!
  2. Misses key data sources
    E-Discovery vs. “intranet search”
    The whole promise of “Enterprise search” is to make all information in a company findable by all employees. The authors of the articles must have missed some information about this. Period.
  3. Not actionable
    “Viewing” vs. “follow-up”
    The platforms that make up a really good Enterprise search solution are designed to support many information needs. They can support many different search based applications (SBAs). E-Discovery could just as well be such a search based application. It has specific needs in formulating queries, exploring content, saving the results, annotating them and even recording queries together with their explanation.

Analysis

So when I look at the differences from my point of view (implementation and technical), I see four topics:

  • Business case
    The business case for an E-Discovery solution is clear: you have to implement/use this because you HAVE to. It’s a legal thing. The company has to give access to the data. Of course there is still a choice to do this manually, but if there is too much information, the cost of labour will exceed the cost of a technical solution.
    When we look at Enterprise search (all information within the company for all employees), no one will start implementing a technical solution without insight into the costs and benefits. Implementing a large (many sources, many documents, many users) Enterprise search solution is very costly.
  • Audience
    The audience (target group) for E-Discovery is the investigators who have to find out whether there is any relevant information concerning incrimination or exoneration in a legal case. This group is highly trained, and it can be assumed that they can work with complex queries, complex user interfaces, complex reporting tools, etc. The focus is on getting all relevant documents, no matter how hard it is to formulate the right queries and traverse the possible results.
    The audience for Enterprise search is “everyone”. This could be skilled information specialists, but also the people from marketing, R&D and other departments, just trying to find the right template, customer report, annual financial report or even the latest menu from the company restaurant.
    The user experience has to be carefully designed so that it is usable for a broad audience with different information needs. Sometimes the most relevant answer or document is OK, but in other use cases getting all the information on a topic is needed.
  • Security
    For E-Discovery in legal circumstances it’s simple: every piece of information has to be accessible, so no difficult questions about who can see what.
    In Enterprise search, security is a pain in the *ss: many different content systems, many different security mechanisms, and many different users that have different identities in different systems (see the sketch after this list).
  • Functionality
    To provide the right tools for an E-Discovery goal, a solution needs to take care of some specific demands. I am pretty sure that the search solutions I mentioned can take care of most of them. It’s all in the creation of the user interface and the supporting add-ons to make it happen.
    Although a typical Enterprise search implementation may not have all of this, the products used do offer the possibility of creating custom reports and actions (explain, store, etc.).
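As a sketch of how the security pain is usually handled: with “early binding”, every document is indexed together with the groups that may read it, and every query is wrapped in a filter derived from the current user’s groups. A minimal Solr sketch; the field name acl_groups_ss and the group values are hypothetical, and mapping user identities across systems is exactly the hard part this glosses over.

```python
# Minimal sketch of "early binding" security trimming in Solr: documents carry
# their allowed groups, and each query is filtered on the user's groups.
# Field name and group values are hypothetical.
import requests

def secure_search(query, user_groups):
    # Only documents tagged with at least one of the user's groups pass.
    acl_filter = "acl_groups_ss:(" + " OR ".join(f'"{g}"' for g in user_groups) + ")"
    params = {"q": query, "fq": acl_filter, "wt": "json"}
    resp = requests.get("http://localhost:8983/solr/intranet/select",
                        params=params, timeout=30)
    resp.raise_for_status()
    return resp.json()["response"]["docs"]

# A user in "sales" and "all-staff" only sees documents tagged with those groups.
docs = secure_search("quarterly forecast", ["sales", "all-staff"])
```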

Connectors?

What neither of the articles mentions is the complexity of getting all information out of all the systems that contain the content. Abstracting from the possible tools for E-Discovery or Enterprise search, the tools for connecting to many different content systems are probably the most essential part. When you cannot get information out of a content system, the most sophisticated search tool will not help you.
Enterprise search vendors are well aware of that. That’s why they invest so much in developing connectors for many content systems. There is no “one ring to rule them all” here. If there are E-Discovery vendors that have connectors to get all information from all content systems, I would urge them to get into the Enterprise search business.

Conclusion

My conclusion is that there are a couple of products/solutions that can fulfill both Enterprise search needs and E-Discovery needs. Specifically, I want to mention HPE IDOL (the former Autonomy suite) and Solr.
From a cost perspective, Solr (open source) can even be the best alternative to expensive E-Discovery tools. When combining Solr with solutions that build on top of it, like LucidWorks Fusion, there is even less to build yourself.

PS

I am only talking about two specific Enterprise search products because I want to make a point. I know that there are a lot more Enterprise search vendors/solutions that can fulfill E-Discovery needs.

Replacing a search appliance with… a search appliance?

With the news of the Google Search Appliance leaving the stage of (Enterprise) search solutions – of which there is still no record on the official Google for Work Blog – there are a couple of companies willing to fill the “gap”.

I think that a lot of people out there believe that the appliance model is why companies choose Google. I think that’s not the case.

A lot of people like Google when they use it to search the Internet. That’s why I hear a lot of “I want my enterprise search to be like Google!“. That’s pretty fair from a consumer perspective – every employee and employer are also consumers, right? We enterprise search consultants – and the search vendors – need to live up to the expectations. And we try to do so. We know that enterprise search is a different beast than web search, but still, it’s good having a company that sets the bar.

There are a few companies that deliver appliance models for search, namely Mindbreeze and Maxxcat. They are riding this wave, and they do deliver very good search functionality in the appliance model.

But… wait! Why did those customers of Google choose the Google Search Appliance? Did they want “Google in a Box”? I don’t think so. They wanted a “Google-like search experience”. The fact that it came in a yellow box was just “the way it was”. Now, I know that the “business” really liked it. It was kind of nifty, right? The fact is that in many cases IT was reluctant.

IT infrastructure has been “virtualized” for years now, and a hardware-based solution does not fit into that. IT wants fewer dedicated servers to provide functionality. They want every single server to be virtualized so that uptime/fail-over and performance can be monitored and tuned with the solutions that are already in place.

Bottom line? There are not many companies that choose an appliance because it is an appliance. They choose a solution and take for granted that it’s an appliance. IT is very reluctant about this.

I’ve been (yes, past tense) a Google Search Appliance consultant for years. I have seen those boxes do great things. But for anything that could not be configured in the (HTML) admin interface, one had to go back to Google Support (which is/was great, by the way!). There is no way for a search team to analyse bugs or change the configuration at a deeper level than the admin console.

So… If you own a Google Search Appliance, you have enough time to evaluate your search needs. Do this consciously. It may well be that there is a better solution out there, even open source nowadays.


Goodbye Google Search Appliance, we are going to the cloud!

The History

It was the year 2005 when Google decided that they could use their superior search to make information in enterprises/behind the firewall searchable.

That year Google released the Google Mini: a cute little blue server, pre-installed with the search software from Google.

The Mini could index up to 300,000 documents. The functionality was limited, but it was great at crawling web-based content, just like Google.com. The Mini was mainly used to index intranets and public websites. That was in the time before Google introduced Site Search as a product. The Mini did not have features like faceting or connectors to crawl documents from sources other than websites or databases.

Google must have realized that the Mini could not fulfill Enterprise search demands (many different content sources, the need for faceting, changing the relevance, coping with millions of documents, etc.), so they released the Google Search Appliance.

The first versions of the GSA were very similar to the Mini. They added some connectors, faceting, mirroring and APIs to manage the appliance.
One important feature was the ability to scale to millions of documents, distributed over several appliances. The limit on the number of documents one appliance could index was 10 million.
The proposition of the GSA shook up the enterprise search market. Management of the GSA was easy, and so enterprise search became easy. Or at least, so it seemed. “Google knows search and now it is bringing that knowledge to the enterprise. We can have search in our business as good as Google.com.“ NOT so fast: there is a big difference between search on the web and search in the enterprise (read “Differences Between Internet vs. Enterprise Search“).

In 2012 Google pulled the Mini from its offerings and focused on selling more GSAs and improving the Enterprise capabilities. I assume that the two were not that different at all, and there was a lot more money to be made with the GSA.

After that, more energy was put into improving the GSA. After version 6 (the Mini stopped at version 5) came version 7, with more connectors and features like wildcard search (truncation with ‘*’), entity recognition, document preview (Documill), etc. A minor detail is that the out-of-the-box search interface of the GSA was never improved; it still reflected Google.com as it was in 2005.

In recent years it became clear that Google didn’t know what to do with this anomaly in its cloud offerings. The attention dropped, employees were relocated to other divisions (mainly Google Apps and Cloud) and the implementation partners were left on their own when it came to sales support. There was not much improvement in adding features.

In early 2015, Google renewed its attention and dedicated more resources to the GSA again. It was clear (at that time) that the profits from the GSA were good and could even be better. Better sales support was promised to the partners (global partner meetings) and sales went slightly up. In 2015, version 7.4 was released with some small improvements but with a brand new connector framework (Plexi adaptors). Several technology partners invested in developing connectors to support this new model. A small detail was that the new connector framework relied heavily on crawling by the GSA, with the adaptors being more like a “proxy”; the old connector framework was pretty independent of the GSA, sending the full contents of documents to the GSA. (Because of the open source character of the connectors, other companies started to use them in their own offerings, like LucidWorks using the SharePoint connector.)

I’ve been working with the GSA for a long time, and I must say that the solution made a lot of customers happy. The GSA really is easy to administer, and its performance and stability are near perfect.

On Thursday February 4th, 2016, Google sent an e-mail to all GSA owners and partners stating that the GSA is “end-of-life”. Google will continue to offer support and renewals until 2019, but no further innovation on the product will be done. This came as a blow to the existing customers (who had invested a lot of money very recently) and the partners.

Google doesn’t have an alternative for enterprise search yet. It must be working on a cloud offering. It will certainly be able to search through Google Drive (duh…) and some cloud services like Salesforce, Dropbox, Box etc., since the data for those applications already resides in the cloud.

Also see the article “Why Google’s enterprise search is on a sunset march to the cloud“.

Observations

  • Google is a cloud company; it doesn’t like you to have information in on-premise or private cloud solutions.
    Supporting an on-premise solution is “strange” for Google.
  • Enterprise search is hard. Slapping an appliance onto intranets and websites doesn’t cut it.
    Enterprise search is not web search: there are so many other sources and different relevancy models.
  • The license model of the GSA runs into problems with a large number of documents/records.
    Let alone when you want to combine structured info from databases.
  • Delivering a search experience like Google.com in the enterprise is not possible out-of-the-box.
    Google.com has a lot of “application logic” and call-outs to other sources. What we see is not only the search engine working.
  • The GSA is a “relevancy machine”. It does not work well with structured content.
  • To be able to support enterprise search, a vendor needs many connectors to tap into many different content systems.
    Google supports 5 content sources out-of-the-box. Other connectors are delivered by partners and need additional investments/contracts.
  • To be able to support disparate content systems with different metadata models, the search engine needs metadata mapping functionality (see the sketch after this list).
    The GSA always relied on the quality of content and metadata in the indexed content systems. Reality is seldom that kind.
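To show what such metadata mapping amounts to: each source schema is mapped onto one unified search schema before indexing. A minimal sketch; all field names and the mapping itself are hypothetical.

```python
# Minimal sketch of metadata mapping: normalising per-source field names onto
# one unified search schema before indexing. All names are hypothetical.
FIELD_MAP = {
    "sharepoint": {"Title": "title", "Editor": "author", "Modified": "last_modified"},
    "filenet":    {"DocumentTitle": "title", "Creator": "author", "DateLastModified": "last_modified"},
}

def normalise(source, raw_fields):
    mapping = FIELD_MAP[source]
    return {unified: raw_fields[src] for src, unified in mapping.items() if src in raw_fields}

print(normalise("sharepoint", {"Title": "Travel policy", "Editor": "j.doe"}))
# -> {'title': 'Travel policy', 'author': 'j.doe'}
```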

Conclusion

Google has proven not to be an enterprise search solution provider. It tried with the Google Search Appliance, but it (sadly) failed. The GSA was a good product that fit well in many areas, but Google is a cloud company and does not have other on-premise solutions.
Google must have come to the conclusion that enterprise search is hard and that the investments don’t stand up to the profits. Google doesn’t publish revenue numbers on GSA deals, but it must be a small part of its revenue.

The GSA lacks some features that would make it “enterprise ready”, and the number of feature requests would give Google a workload of years to catch up with the current vendors.

Google is a cloud-born company that thinks in large volumes of users. Its offerings are all cloud-based and focus on millions of users each paying a small usage-based fee. When operating at that scale, minimal margins are OK because of the volume.
Enterprise search doesn’t work that way. The license model of the GSA (based on the number of documents) holds back opening up large volumes of documents (but that’s not only the case for the GSA; other search vendors have that model too).

Having said that, there are a couple of search vendors ready to step up and use Google’s retreat from the enterprise search market as their “golden egg”:

  • Mindbreeze
    Offers an Enterprise Search Appliance. They even offer a solution to migrate from the GSA to Mindbreeze Inspire.
    The 300+ connectors could be the reason to switch over.
  • SearchBlox
    A long-term competitor of the GSA. Offers a similar experience, but with more functionality and less cost.
  • LucidWorks Fusion
    From the commercial party behind Solr. Solr is the most widely supported open source search engine in the world, with a lot of features. Fusion adds connectors, manageability and index-time data processing to enable advanced search experiences.

This blog reflects my personal opinions and not those of my employer.

Enterprise Search – history repeats itself

Two decades ago there was one big vendor of search solutions for businesses: Verity. Verity delivered a solution for making all information within an organization both searchable and findable, independent of whatever source it came from. This kind of solution is also known as “Enterprise Search”.
Autonomy acquired Verity in the early 2000s, and a few years ago HP acquired Autonomy.

Since then, many vendors have entered the “enterprise search” market:

  • Coveo
  • Endeca (now Oracle)
  • Exalead (now Dassault Systèmes)
  • LucidWorks (Fusion)
  • and more

In my time as a search consultant I have implemented many solutions and followed the developments of various vendors, including new ones.

Every Enterprise search solution has the same points of attention:

  1. How do you get the information from the various systems into the index (crawling, feeding, connectors)?
  2. How do you make sure that users of the search solution can only find the results they are actually allowed to see (in line with the permissions that apply in the source system)? A small feeding sketch follows this list.
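A minimal sketch of what a connector has to deliver for both points: the document content plus the permissions read from the source system, pushed to the index so the engine can filter at query time. The Solr endpoint and field names are hypothetical.

```python
# Minimal sketch of the feeding side: a connector pushes a document to the
# index together with the permissions it read from the source system.
# Endpoint and field names are hypothetical.
import requests

doc = {
    "id": "sp://sites/finance/budget-2016.xlsx",
    "title": "Budget 2016",
    "body": "extracted text of the document",
    "acl_groups_ss": ["finance", "management"],  # permissions from the source
}
resp = requests.post("http://localhost:8983/solr/intranet/update?commit=true",
                     json=[doc], timeout=30)
resp.raise_for_status()
```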

The “old” solutions such as Autonomy have many connectors for hooking up information systems, complete with solutions for permissions, updates, scalability, availability, etc.

The “new” vendors run into the same problems that the “old” vendors have already solved. How do you determine which user may see which result? What if a source system is unavailable? Do you simply delete all content that is no longer reachable because a connector cannot get to it?

Last week I ran into exactly such a problem. In an environment where we had implemented the Google solution (Google Search Appliance (GSA) + Adaptor for SharePoint), the adaptor (= connector) turned out to be no longer available. Because the adaptor was unavailable, the GSA could no longer reach that source.

The result? All documents (4 million) were removed from the index. It takes about two weeks to re-acquire that content. You can imagine the consequences for the user experience.

It amazes me to see that vendors of Enterprise search solutions keep reinventing the wheel because they think they can do it better or differently. The “not invented here” syndrome seems to prevail, instead of (re)using what others have already come up with and building on that.

Of course I understand the commercial side. I just do not understand how new vendors want to build a new solution without using the knowledge and solutions that already exist.

Even a donkey doesn’t bump into the same stone twice, right?

This also underlines the importance of involving an “Enterprise search” expert when your organization wants to dive into implementing search solutions.
Search engine vendors often highlight only a few aspects of a total solution.

Enterprise Search adoption is held back by license costs

In my many years as an “Enterprise Search Solutions” consultant I have witnessed many successful and less successful implementations.

Mind you, I am talking about real “enterprise search” solutions here: making all information present within an organization searchable and usable for all employees:
http://en.wikipedia.org/wiki/Enterprise_search
So this is not about specific search solutions, such as those for call centers, R&D or websites.

Lately, I have been running into the problem of license costs when rolling out enterprise search solutions more and more often.

An organization decides to purchase a search product based on a certain “scope” or business case, and the choice of investment is based on that scope.
After the initial implementation – which is often successful – a demand for “more” arises: more sources, more documents, more users.

At that moment, however, the organization runs into the limits of the initial license. Commercial software licenses are based on the number of servers the software may run on, the number of CPUs that may be used, or the number of documents the index may contain.

And that is where it goes wrong. Despite the fact that the search solution has a lot of potential and could open up more information to more employees, the decision is made not to expand.

The reason is usually the costs involved: not the costs of consultants or developers, but the cost of the license upgrade needed to index more documents or to serve more users (read: queries).

The license model of the Google Search Appliance (GSA) is an example of this. That model is based solely on the number of documents that can be searched. The entry-level model is based on 500,000 documents. This seems a lot when we are talking about a website, but it quickly becomes far too little when we are talking about a file system, a DMS or databases.
The GSA has great potential when it comes to meeting the information needs of employees. Its relevance is very good and the initial configuration is not complex. However, when we are talking about all information and documents within an organization, we quickly get to millions of “documents”. The costs then run into the hundreds of thousands of euros, sometimes even millions. The same goes for vendors such as Exalead, HP/Autonomy and Oracle/Endeca.
For large organizations (more than 1,000 employees) this may still be justifiable. For the “mid-market” – companies between 50 and 500 employees – it quickly becomes unaffordable. Of course, we have to weigh this against the business case – how much more can I earn, how much can I save – which is very hard to make for “enterprise search”. The costs of consultants and developers are often a fraction of this.

Enterprise search – providing good, contextual results, making ALL company information searchable, integrating search into work processes – requires a lot of attention and effort from specialists. These specialists can provide solutions for complex search questions, and user interfaces aimed at optimally serving different processes.
However, these solutions can almost never reach the level needed to address truly company-wide problems, because of the license costs of the underlying enterprise search products.

To get around the license cost problem, we see organizations looking at open source solutions more and more often. These solutions are often very suitable for solving a specific problem: think of Big Data, Data Discovery, Search Based Applications and E-Commerce use cases.
Enterprise search, however, has other aspects that these open source solutions cannot address well. Think of enterprise security models and connectors for the various enterprise content management systems, let alone the very user-friendly administration environments that the commercial products offer. In that case a great deal of time and energy has to be put into the total solution, so that everything ends up being held together by custom work, without a solid foundation for future expansion and without a clear direction from the party (which party?) behind the open source product.

In my opinion, the commercial vendors of Enterprise search solutions should take a hard look at their license models. Do they really want to solve the problem of companies not being able to find and reuse their information, or do they want to earn as much as possible in the short term and accept that it quickly stops, because of the costs that an “Enterprise wide” solution brings with it?
The costs of a good solution should not lie in the product used – there are several that can do almost the same – and its licenses, but in the carefully considered development effort needed to make the total solution generate more value.
Commercial vendors could take a cue from the “subscription” model of LucidWorks (the commercial organization behind Solr), which is no longer based on the number of documents, servers or CPUs.

We as Enterprise Search consultants want to offer good solutions for companies of any size, but we are held back by the license costs of commercial products.


Google Search Appliance 7.2 released

We have just received the news that the next version of the Google Search Appliance (GSA) has been released: 7.2.
Below is a summary of the improvements.
GSA 7.2 builds on our market leadership and provides us with a platform in enabling rich applications for our customers. Data consumption and digitization is on an upward trend and we continue to unveil even greater insights through features in 7.2. Here are some highlights:
  • Sorting by metadata: Users can sort by author, date, price or any other attribute to quickly sift through a large amount of results.
  • Wildcard search: If you don’t know the right spelling or want to search for similar terms at once, just type a few characters and let GSA fill in the blanks.
  • New admin console: A redesigned interface makes managing GSA a cleaner, simpler experience for administrators.
  • Enhanced entity recognition: Now you can test and tweak your entities before indexing begins, ensuring that they work the way you want.
  • Easier connector building: A more scalable, flexible framework simplifies the process of developing and improving custom connectors.

Source: http://googleenterprise.blogspot.nl/2014/02/google-search-appliance-gets-update.html

A detailed analysis will follow in the coming weeks.