
All Posts From : December 2011

Limiting an Examine Search to the Current Site 2

After some feedback from Jeroen and Shannon about yesterday's blog post, I've had a look at another method for handling search on multiple sites, which works much better for sites that use multiple languages.

The issue is that different languages have different stop words, word stems and so on, so using the standard Examine analyzers on, say, French content means that the results won't be as accurate as they would be with a French-specific analyzer.

How do we do this? Firstly, we need to set up a specific index for each site, telling each one to start at the root node for the language. Set up the index set as you would normally, and then add the IndexParentId parameter to your declaration, like this:

<IndexSet  SetName="enSiteSearchIndexSet"  IndexPath="~/App_Data/TEMP/ExamineIndexes/enSiteSearch/"  IndexParentId="1090">

Once this is done, the index will ONLY index content beneath the parent node that you specified. You can then create an index for each site, allowing you to use different analyzers for each index if you want to.
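As a rough sketch of how that might look with two language sites, the fragment below declares one index set per site in ExamineIndex.config. The French set name, index path and both node IDs are made-up values for illustration; swap in the IDs of your own site root nodes.

```xml
<!-- Illustrative only: set names, paths and IndexParentId values are assumptions -->
<ExamineLuceneIndexSets>
  <!-- English site: only indexes content beneath node 1090 -->
  <IndexSet SetName="enSiteSearchIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/enSiteSearch/" IndexParentId="1090" />
  <!-- French site: only indexes content beneath node 2045 -->
  <IndexSet SetName="frSiteSearchIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/frSiteSearch/" IndexParentId="2045" />
</ExamineLuceneIndexSets>
```

Each index set then needs its own indexer and searcher in your Examine settings, which is where you would pick the analyzer for that language.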

If you prefix your searchers/indexes etc with the name of the root node, you can get that in your search code and use it to get the right searcher, so you'll never have to change the search code when adding new language sites (just add the new indexes etc to your Examine config).
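Here's a minimal sketch of that naming idea. It assumes a convention where the searcher is called the root node name followed by "SiteSearchSearcher" (so a root node named "en" maps to "enSiteSearchSearcher"); the class and method names are hypothetical helpers, not part of Umbraco or Examine.

```csharp
using System;

// Hypothetical helper: works out which searcher to use from the site root node.
public static class SiteSearchNaming
{
    // The second ID in an Umbraco path such as "-1,1090,1102" is the site root node.
    public static int RootIdFromPath(string path)
    {
        string[] ids = path.Split(',');
        if (ids.Length < 2)
            throw new ArgumentException("Path has no site root: " + path, "path");
        return int.Parse(ids[1]);
    }

    // Assumed convention: "<rootNodeName>SiteSearchSearcher".
    public static string SearcherNameFor(string rootNodeName)
    {
        if (string.IsNullOrEmpty(rootNodeName))
            throw new ArgumentException("Root node name is required.", "rootNodeName");
        return rootNodeName + "SiteSearchSearcher";
    }
}
```

In the search control you could then load the root node with new umbraco.NodeFactory.Node(rootId), pass its Name to SearcherNameFor, and use the result as the key into ExamineManager.Instance.SearchProviderCollection.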

Limiting an Examine Search to the Current Site

If you have a multi-language or multi-site installation in Umbraco where you might want to have a site search using Examine, you'll run into the issue that the indexes contain the results for ALL of the sites, not just the current site that the user is on.

I've been working on a multi-language site recently and ran into just this issue. Here's how I got round it and made a search that can be included on all of the sites, with no changes needing to be made.

First up, how can we limit the search? Handily, we can use the path variable, which stores the path of the page in the Umbraco content tree, in a format something like: -1,1060,1075,1230, where -1 denotes the content root, and the rest of the numbers are the nodes between the root and page that you're looking at.

In our user control that does the search, we can get the current node, and rather than jumping back up the tree, we can just split the path variable out and get the 2nd item in the array to get the id of the site root node, like this:

var currentPage = umbraco.NodeFactory.Node.GetCurrent();
string parentId = currentPage.Path.Split(',')[1];

Now we know the root node of the current site, how can we use it with our search? Handily, you can just add the path to your index settings file. However, the path gets stored in the index in a comma separated format, which is no good for searching, as Examine treats it as one big string, so searching for the root node on the raw path will return no results. However, if you were to replace the commas with spaces in the index, the numbers in the path would be treated like words, so you could search for your root node on it, and it would return only pages with the root node in their path.
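To make the difference concrete, here's a small standalone sketch. It is not Lucene: splitting on spaces is only a rough stand-in for how the analyzer breaks a field into terms, and the class and method names are made up for the example.

```csharp
using System;

// Hypothetical demo of why the comma-separated path can't be matched on a single ID.
public static class SearchPathDemo
{
    // Same transformation as described above: commas become spaces.
    public static string ToSearchPath(string path)
    {
        return path.Replace(",", " ");
    }

    // Rough stand-in for term matching: split on spaces and look for an exact ID.
    public static bool HasTerm(string fieldValue, string nodeId)
    {
        return Array.IndexOf(fieldValue.Split(' '), nodeId) >= 0;
    }
}
```

With the raw path "-1,1060,1075,1230" the whole value is one term, so looking for "1060" finds nothing; once the commas are replaced, "1060" is a term in its own right and the lookup succeeds.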

So how to alter the index? Easy! You can plug into the Examine events to alter the index as it's being written. Basically we want to hook into the event, get the path field, replace the commas with spaces, and then save it as a new field in the Examine index. Here's an example of the code that we used in an ApplicationBase class to hook into the event handler and make the changes:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using umbraco.BusinessLogic;
using Examine;

namespace MySite.UmbracoExtensions.EventHandlers
{
    public class CmsEvents : ApplicationBase
    {
        public CmsEvents()
        {
            //Add event to allow searching by site section
            var indexerSite = ExamineManager.Instance.IndexProviderCollection["SiteSearchIndexer"];
            indexerSite.GatheringNodeData += new EventHandler<IndexingNodeDataEventArgs>(SetSiteSearchFields);
        }

        //modifies the index field for the path variable, so that it can be searched properly
        void SetSiteSearchFields(object sender, IndexingNodeDataEventArgs e)
        {
            //grab the current data from the Fields collection
            var path = e.Fields["path"];

            //let's get rid of those commas!
            path = path.Replace(",", " ");

            //add as new field, as path seems to be a reserved word in Lucene
            e.Fields.Add("searchPath", path);
        }
    }
}

Obviously you'd need to change the "SiteSearchIndexer" part to the name of your indexer to get it to work! You'll also need to make sure that the path is included in your index (look at the default indexes in your Examine config files for an example of this).
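For reference, an index set that includes the path might look something like the sketch below. The set name and index path are assumptions, and the user fields are simply the ones used in the search code in this post; adjust them to match your own document types.

```xml
<!-- Illustrative only: SetName and IndexPath are assumptions -->
<ExamineLuceneIndexSets>
  <IndexSet SetName="SiteSearchIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/SiteSearch/">
    <IndexAttributeFields>
      <add Name="id" />
      <add Name="nodeName" />
      <!-- path must be indexed so the event handler can read it -->
      <add Name="path" />
    </IndexAttributeFields>
    <IndexUserFields>
      <add Name="pageHeading" />
      <add Name="pageContent" />
      <add Name="navigationText" />
      <add Name="umbracoNaviHide" />
    </IndexUserFields>
  </IndexSet>
</ExamineLuceneIndexSets>
```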

Now all we need to do is make our Examine search look for the root id in the "searchPath" field. Here's the finished code where we get the root node, and use it in an example Examine search:

//do search
var searcher = ExamineManager.Instance.SearchProviderCollection["SiteSearchSearcher"];

var criteria = searcher.CreateSearchCriteria(UmbracoExamine.IndexTypes.Content);

Examine.SearchCriteria.IBooleanOperation filter = null;

//search on main fields
filter = criteria.GroupedOr(new string[] { "pageHeading", "pageContent", "navigationText" }, Search);

//only show results in the current path
var currentPage = umbraco.NodeFactory.Node.GetCurrent();
string parentId = currentPage.Path.Split(',')[1];

filter.And().Field("searchPath", parentId);

//don't show hidden pages
filter
    .Not()
    .Field("umbracoNaviHide", "1");

var resultsTemp = searcher.Search(filter.Compile());

And now your search should only return results for pages in the current site, not pages from ALL of the sites! Nice and easy to do, and a good example of how easy it is to extend Umbraco with its event model!

You can also use this technique to search a specific area of the site, e.g. have a dropdown to filter the search by the News area, or Events area. You could also have a single index for multiple sites, allowing for a search that spanned all the sites as well.

Version 2.0.3 of AutoFolders Released

Those of you who have been using Umbraco a while are probably familiar with the excellent AutoFolders package, which allows you to have your content automatically sorted into date folders and alpha folders. The version that was up on the site until this morning was only compatible with the old schema.

A while back I persuaded the project owner Chris to let me contribute to the project so that I could update it to use the new schema and make it compatible with 4.5+.

I did that a while ago and we've been running it on a few sites for a few months now without issues so I've finally released it back into the wild for everyone else to use.

In addition to updating the code to work with the new schema, I also tweaked a few other things:

The content tree updates to reflect moves - the content tree on the left reloads to show the correct position of the content once it's been moved.

Problems with XML cache - there were a few minor issues introduced by some of the changes in 4.5 that caused the XML cache not to be correctly updated, these have been fixed.

Super exciting new folder provider - courtesy of Andrew Brigham, we now have a third default provider, the PropertyFolder provider. This allows you to specify a property of a document type, and have AutoFolders create folder structure based on that property. Say for example you have houses with a dropdown called "area", you can set the provider to group by the area property creating a site structure of /area/house. Example code is included in the config file that ships with 2.0.3!

The new version will ONLY work with the new schema, if you want to use AutoFolders on older sites, use version 2.0.2 instead! The DLL is built for .Net 3.5 so it should work on sites that aren't running 4.7 as well (we've tested it on 4.5, 4.5.2, 4.7 and 4.7.1).

I hope you guys like the update and that it allows you to use this excellent package on your newer sites!

:)

Head on over to Our and grab yourself a copy now!

 

Sometimes You Just Have To Design For IE6

Have you ever had a problem with one of those older, shittier browsers (you know the ones I mean) and you've asked for help on a forum, and been given the helpful answer of "Just tell the client to update their browser"? I know I have, and it's the sort of response that just makes me want to kick the person giving the answer square in the Balls. HARD.

Why? I'm guessing that most of the folks who make these statements either work for cool funky agencies that are all "web 2.0" and just do funky brochureware crap, or that they've only ever worked with fairly small companies. If you've ever worked with a large company (and I'm talking 1000's+ of staff, possibly spread worldwide here) you'll know what a laughable idea it is to try and get them all to update their browsers.

In large corporate IT, you tend to get everyone standardised on similar kit. Because of the cost of upgrading 1000's of PCs (in man hours as well as software licenses), a lot of these companies don't upgrade their software unless there's a compelling business reason to. Installing the latest version of SQL Server because it's 10x faster and can run on less hardware, saving the company X thousand a year in hosting costs, is a compelling reason for them to spend money. Tying up their IT department upgrading everyone to IE9, plus all the training and increased support times as clueless users call tech support because the buttons are in different places, just so you can have rounded boxes and drop shadows when you browse the web, is a far less convincing argument.

Another factor is that of Compliance. Big companies have to comply with all sorts of tedious regulations, especially if they work in the financial sector, which makes upgrading stuff even harder, as you have to go through innumerable audits to make sure that the new stuff is still compliant with internal policies and procedures.

Two of the largest companies I work with at the moment are pretty much all on Windows XP with IE6, with the exception of the upper echelons of management, who always seem to have the shiniest kit. It's not ideal, but it's how they operate.

As a web developer, I just have to live with this. I might not be happy about it, but if I want to work with the big boys, I have to accept that that's the way it is, and me asking them to upgrade their browsers isn't going to make it happen (no matter how much I'd like it to)!

Consequently, for most clients in this situation, I go for an approach of progressive enhancement (Google it for lots of exciting resources on the subject). We make sure that the site works nicely in IE6, i.e. everything looks as nice as we can get it, and then we have progressive updates for users with better browsers. That way the poor schmoes at corporate HQ get a perfectly functional website that is perfectly usable, but everyone else gets nice gradient fills and curved tab boxes!

So, next time you see someone talking about IE6 browser bugs, think before you tell them to update their browsers!