If you have a multi-language or multi-site installation in
Umbraco and want a site search using Examine,
you'll run into the issue that the indexes contain the results for
ALL of the sites, not just the site that the user is currently
on.
I've been working on a multi-language site recently and ran into
just this issue. Here's how I got round it and made a search that
can be included on all of the sites, with no changes needing to be
made.
First up, how can we limit the search? Handily, we can use the
path variable, which stores the path of the page in the Umbraco
content tree, in a format something like: -1,1060,1075,1230, where
-1 denotes the content root, and the rest of the numbers are the
nodes between the root and page that you're looking at.
In our user control that does the search, we can get the current
node, and rather than jumping back up the tree, we can just split
the path variable out and get the 2nd item in the array to get the
id of the site root node, like this:
var currentPage = umbraco.NodeFactory.Node.GetCurrent();
string parentId = currentPage.Path.Split(',')[1];
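The one-liner above assumes the path always has at least two segments. As a rough sketch, the same lookup can be wrapped in a small helper that checks this first. The SitePath class and GetSiteRootId method are hypothetical names used for illustration; they are not part of Umbraco.

```csharp
using System;

// Hypothetical helper (not part of Umbraco): pulls the site root id out of a
// content path such as "-1,1060,1075,1230".
public static class SitePath
{
    public static string GetSiteRootId(string path)
    {
        if (string.IsNullOrEmpty(path))
            throw new ArgumentException("Path must not be empty.", "path");

        // Segment 0 is the content root (-1); segment 1 is the site root node.
        string[] segments = path.Split(',');
        if (segments.Length < 2)
            throw new ArgumentException("Path has no site root segment: " + path, "path");

        return segments[1].Trim();
    }
}
```

In the user control you would then call SitePath.GetSiteRootId(currentPage.Path), which gives "1060" for the example path above.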
Now that we know the root node of the current site, how can we use it
with our search? Handily, you can just add the path to your index
settings file. The catch is that the path is stored in the index in a
comma-separated format, which is no good for searching: Examine
treats it as one big string, so searching for the root node on the
raw path will return no results. If you replace
the commas with spaces in the index, though, the numbers in the path are
treated like words, so you can search for your root node and
get back only pages that have the root node in their
path.
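To see why the commas matter, here is a small sketch that mimics the two behaviours with plain string handling. It is only an approximation: it splits on spaces, which is an assumption about how the indexed value ends up being broken into terms, not a copy of what Lucene's analyzers actually do. PathTerms and its methods are made-up names.

```csharp
using System;
using System.Linq;

// Hypothetical illustration only: treats a field value as a list of
// space-separated terms and looks for an exact term match.
public static class PathTerms
{
    // Splits a field value into space-separated terms.
    public static string[] Tokenize(string fieldValue)
    {
        return fieldValue.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // True when one of the terms is exactly the id we are searching for.
    public static bool ContainsTerm(string fieldValue, string id)
    {
        return Tokenize(fieldValue).Contains(id);
    }
}
```

With the raw value "-1,1060,1075,1230" the whole path is a single term, so ContainsTerm(rawPath, "1060") is false. After rawPath.Replace(",", " ") the same call is true, because "1060" is now a term of its own.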
So how to alter the index? Easy! You can plug into the Examine
events to alter the index as it's being written. Basically we want
to hook into the GatheringNodeData event, get the path field, replace the commas
with spaces, and then save it as a new field in the Examine index.
Here's an example of the code that we used in an ApplicationBase
class to hook into the event handler and make the changes:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using umbraco.BusinessLogic;
using Examine;

namespace MySite.UmbracoExtensions.EventHandlers
{
    public class CmsEvents : ApplicationBase
    {
        public CmsEvents()
        {
            //Add event to allow searching by site section
            var indexerSite = ExamineManager.Instance.IndexProviderCollection["SiteSearchIndexer"];
            indexerSite.GatheringNodeData += new EventHandler<IndexingNodeDataEventArgs>(SetSiteSearchFields);
        }

        //modifies the index field for the path variable, so that it can be searched properly
        void SetSiteSearchFields(object sender, IndexingNodeDataEventArgs e)
        {
            //nothing to do if the path isn't part of the index data
            if (!e.Fields.ContainsKey("path"))
                return;

            //grab the current data from the Fields collection
            var path = e.Fields["path"];

            //let's get rid of those commas!
            path = path.Replace(",", " ");

            //add as new field, as path seems to be a reserved word in Lucene
            e.Fields.Add("searchPath", path);
        }
    }
}
Obviously you'd need to change the "SiteSearchIndexer" part to
the name of your indexer to get it to work! You'll also need to
make sure that the path is included in your index (look at the
default indexes in your Examine config files for an example of
this).
Now all we need to do is make our Examine search look for the
root id in the "searchPath" field. Here's the finished code where
we get the root node, and use it in an example Examine search:
//do search
var searcher = ExamineManager.Instance.SearchProviderCollection["SiteSearchSearcher"];
var criteria = searcher.CreateSearchCriteria(UmbracoExamine.IndexTypes.Content);

//search on main fields (Search holds the term the user entered)
Examine.SearchCriteria.IBooleanOperation filter =
    criteria.GroupedOr(new string[] { "pageHeading", "pageContent", "navigationText" }, Search);

//only show results in the current path
var currentPage = umbraco.NodeFactory.Node.GetCurrent();
string parentId = currentPage.Path.Split(',')[1];
filter = filter.And().Field("searchPath", parentId);

//don't show hidden pages
filter = filter.Not().Field("umbracoNaviHide", "1");

var resultsTemp = searcher.Search(filter.Compile());
And now your search should only return results for pages in the
current site, not pages from ALL of the sites! Nice and easy to do,
and a good example of how easy it is to extend Umbraco with its
event model!
You can also use this technique to search a specific area of the
site, e.g. have a dropdown to filter the search by the News area
or the Events area. You could also have a single index for multiple
sites, allowing for a search that spans all of the sites as well.
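For the dropdown idea, one possible approach is to pick the id to search on before building the query: use the selected section's node id if one was chosen, otherwise fall back to the site root. This is a sketch only. SearchScope, ResolveScopeId and the selectedSectionId parameter are hypothetical, and the dropdown is assumed to post back a node id as a string.

```csharp
using System;

// Hypothetical helper: decides which node id to look for in "searchPath".
public static class SearchScope
{
    public static string ResolveScopeId(string currentPath, string selectedSectionId)
    {
        int sectionId;

        // Use the dropdown value only if it parses as a positive node id.
        if (int.TryParse(selectedSectionId, out sectionId) && sectionId > 0)
            return sectionId.ToString();

        // Otherwise fall back to the site root (2nd item in the path).
        return currentPath.Split(',')[1];
    }
}
```

The returned value would then be passed to the "searchPath" field in the search in place of parentId, so an empty dropdown selection still limits results to the current site.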