After some feedback from Jeroen and Shannon about yesterday's blog post, I've had a look at another method for searching across multiple sites, which works much better for sites that use multiple languages.
The issue is that different languages have different stop words, word stems and so on, so using the standard Examine analysers on, say, French content means that the results won't be as accurate as they would be with a French-specific analyser.
How do we do this? Firstly, we need to set up a specific index for each site, telling each one to start at the root node for that language. Set up the index set as you would normally, and then add the IndexParentId parameter to your declaration, like this:
<IndexSet SetName="enSiteSearchIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/enSiteSearch/" IndexParentId="1090">
Once this is done, the index will ONLY index content beneath the parent node that you specified. You can then create an index for each site, allowing you to use different analyzers for each index if you want to.
If you prefix your searchers, indexers and index sets with the name of the root node, you can read that name in your search code and use it to look up the right searcher, so you'll never have to change the search code when adding new language sites (just add the new indexes etc to your Examine config).
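As a rough sketch of what that might look like in ExamineSettings.config, here are two sites with prefixed providers and a different analyzer per index. The provider names, the "fr" index set and the French analyzer's type and assembly name are assumptions for illustration (the assembly name in particular depends on which Lucene.Net contrib build you have), so adjust them to match your own install:

```xml
<!-- ExamineSettings.config (fragment). Names below are hypothetical examples. -->
<ExamineIndexProviders>
  <providers>
    <!-- English site: standard analyzer -->
    <add name="enSiteSearchIndexer"
         type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"
         indexSet="enSiteSearchIndexSet"
         analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net" />
    <!-- French site: assumes a French analyzer is available in a contrib assembly -->
    <add name="frSiteSearchIndexer"
         type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"
         indexSet="frSiteSearchIndexSet"
         analyzer="Lucene.Net.Analysis.Fr.FrenchAnalyzer, Lucene.Net.Contrib.Analyzers" />
  </providers>
</ExamineIndexProviders>

<ExamineSearchProviders>
  <providers>
    <!-- Each searcher uses the same analyzer as its indexer -->
    <add name="enSiteSearchSearcher"
         type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
         indexSet="enSiteSearchIndexSet"
         analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net" />
    <add name="frSiteSearchSearcher"
         type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
         indexSet="frSiteSearchIndexSet"
         analyzer="Lucene.Net.Analysis.Fr.FrenchAnalyzer, Lucene.Net.Contrib.Analyzers" />
  </providers>
</ExamineSearchProviders>
```

With names like these, the search code only needs to build the searcher name from the root node's prefix ("en", "fr" and so on). It's worth keeping the indexer and searcher for a site on the same analyzer, otherwise the query terms may not be stemmed the same way as the indexed terms.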
Now, if anybody has fully working German and Polish analyzers, that would be great. We have a decent French one here, if anyone's interested.
@Stephan, I've been looking at the Snowball Contrib project on the Lucene.net site. I still need to see if I can get it working with Examine though. It supports word stemming for several languages (not sure how well each one is supported though).