Yesterday*, I attended Microsoft’s Enterprise Search conference in London, where the future strategy for search was presented. Most of the time was spent on SharePoint Server 2007, which includes enterprise search features. The conference also included a demo of the next version of desktop search and an overall roadmap for Microsoft’s search plans, including for the web.
There has been a lot of confusion about Microsoft’s search efforts and what is available across various products and technologies. This post will cover the current portfolio and try to explain the differences between each product. Part 2 will cover plans for the next releases.
Current portfolio
At the conference, Microsoft’s current search portfolio was described as spanning three areas: web, desktop and enterprise. I would argue that there is also a fourth area, the operating system, and have included it here:
- Operating System: MS Search (free, available out-of-box with the OS)
- Enterprise: SharePoint Portal Server 2003 (full product installation, licences required)
- Web: MSN Search (free, available in browsers)
- Desktop: MSN Desktop Search (free download add-on for Windows XP and Windows Server 2000/2003)
MS Search is a basic full-text indexing engine and search interface that is provided with the operating system (OS) – Windows 2000, Windows XP and Windows Server 2003 – and within server products such as Exchange Server and SQL Server. This also includes Internet Information Services (IIS – Microsoft’s web server) and Windows SharePoint Services (WSS – Microsoft’s collaboration service), both included with Windows Server 2003. Full-text indexing means just that – files and pages are indexed based on their content. When you enter a query, the words are compared against the index and the matching results are returned. Order is determined simply by the number of times each word appears in the document, and exact matches are required. For example: query = “security”, results = documents containing the word “security”, ordered by the number of occurrences of “security” within each document. MS Search is limited to file formats that can be converted into raw text for indexing, and it only indexes content on the server it is installed on. For example, if you have two file servers running Windows Server 2003, MS Search will be installed on each server and you will have two separate indexes. To search for a document, you would have to run a separate query on each server.
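To make the exact-match, count-based ordering concrete, here is a minimal Python sketch of that style of full-text index. It is only an illustration of the behaviour described above, not MS Search’s actual implementation; the class name, method names and sample documents are all made up for the example.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and split it into plain words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class FullTextIndex:
    """Toy inverted index (hypothetical): word -> {document name: count}."""

    def __init__(self):
        self.postings = defaultdict(Counter)

    def add(self, name, text):
        """Index a document by counting every word it contains."""
        for word in tokenize(text):
            self.postings[word][name] += 1

    def search(self, query):
        """Return documents containing the exact query words,
        ordered by how many times those words occur."""
        scores = Counter()
        for word in tokenize(query):
            for name, count in self.postings.get(word, {}).items():
                scores[name] += count
        # Highest count first; ties broken alphabetically by name.
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        return [name for name, _ in ranked]


# Sample documents invented for the illustration.
index = FullTextIndex()
index.add("policy.txt", "Security policy: security reviews are mandatory.")
index.add("notes.txt", "Meeting notes on security.")
index.add("howto.txt", "Securing a server securely.")

print(index.search("security"))
```

Note that “howto.txt” is never returned for “security”, because “securing” and “securely” are not exact matches. And because each index only knows about the documents added to it, two servers would mean two such indexes and two separate queries.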
SharePoint Portal Server 2003 (SPPS) is Microsoft’s portal and collaboration server and the only product to include an enhanced search engine. It builds on and extends two OS services – Windows SharePoint Services (WSS – available as an add-on to Windows Server 2003, it provides the base collaboration services) and Microsoft Search (provides the base indexing engine). From an indexing and search perspective, SPPS is focused on enterprise (internal) content, specifically unstructured content (i.e. documents) as opposed to structured content (application data). It extends MS Search to support additional file formats (content is indexed in its native format instead of needing to be translated into raw text) and multiple content sources. For example, you can index file servers, web sites, Exchange messaging servers and Lotus Notes databases as well as SharePoint collaboration sites. This means SPPS provides a single index and search interface across multiple different servers and databases (whereas the basic MS Search can only index a single server). SPPS also includes a ranking algorithm to improve the relevance of results. Relevance is focused on content, analysing the similarities and differences between words within documents to determine their relevance to the query terms – this is based on Bayesian inference. (Internet search engines typically focus on web pages and hence benefit from a different form of relevance algorithm, now dominated by PageRank – relevance based on links between pages.) SPPS includes advanced capabilities such as thesaurus support (for example, query = “ie”, thesaurus replaces “ie” with “Internet Explorer” to improve the results returned) and stemming (e.g. query = “security”, search is expanded to include variations such as “securing”, “securely”, “securities”, etc.).
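The thesaurus and stemming ideas can be sketched in a few lines of Python. This is a rough illustration only: the thesaurus entry, the suffix list and the function names are assumptions made up for the example, and the suffix-stripping is far cruder than anything a real product would use.

```python
import re

# Hypothetical thesaurus: query term -> replacement phrase.
THESAURUS = {"ie": "internet explorer"}

# Crude suffix list, longest first; real stemmers are linguistically informed.
SUFFIXES = ("ities", "ity", "ing", "ely", "ies", "ed", "es", "s", "e", "y")


def words(text):
    """Lower-case the text and split it into plain words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def crude_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalise_query(query):
    """Apply thesaurus replacements, then reduce each word to its stem."""
    expanded = []
    for word in words(query):
        expanded.extend(THESAURUS.get(word, word).split())
    return [crude_stem(word) for word in expanded]


def matches(query, text):
    """True if every stemmed query term appears among the document's stems."""
    doc_stems = {crude_stem(word) for word in words(text)}
    return all(stem in doc_stems for stem in normalise_query(query))


print(normalise_query("ie"))
print(matches("security", "Securing the server"))
```

Here “security”, “securing”, “securely” and “securities” all reduce to the same stem, so a query for one finds documents containing the others, and “ie” is swapped for “Internet Explorer” before the search runs.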
MSN Search is, well, MSN Search – Microsoft’s Internet search engine, considered by most people to be inferior to the market leader, Google. I keep testing it from time to time (its appearance now looks remarkably similar to Google’s) and, 9 times out of 10, the wisdom of the crowd prevails. But I wouldn’t rule MS out of the fight just yet. Increasingly we will see web searches blurring with enterprise information (thanks to mash-ups, composite applications and social networks). And searching the web is still far from perfect – Google has also been slow to innovate while sites such as Technorati and Del.icio.us (acquired by Yahoo) embrace the potential of tagging, for example. We’ll look at this playground in more detail in the next post.
…and we come full circle back to the most common operating system – running the desktop…
MSN Desktop Search does what it says on the box – enables you to search content stored on your local machine. MSN Desktop Search differs from the basic MS Search provided in the OS in that it supports additional file formats, can also index your email, and includes relevance ranking algorithms to improve the order of results. It includes a preview pane to view documents without having to open their native application, and can be extended to index sources beyond the desktop, such as network folders. If that all sounds a little bit familiar, it’s interesting to note that the first beta of MSN Desktop Search appeared about one year after the launch of SharePoint Portal Server 2003… 🙂 The desktop is becoming a gateway to both enterprise (document- and application-centric content) and web (page-driven content) information. MSN Desktop Search contains methods that can be found in both SharePoint Portal Server 2003 and MSN Search. It is competing against two similar offerings from, no prizes for guessing, Google and Yahoo.
I’ll delve into these four areas in more detail in the next post, looking at future directions for search and when you would use each one.
(*OK, not actually yesterday, it took place back on the 5th June, I’ve just gotten round to cross posting here… still munching over whether to completely switch over…)