Not a bug: Searching for special characters

Gene

Member
Users have noticed that searching for a word containing a special character returns only results with that exact character, when it should also return results for the character's English-keyboard equivalent (as Google does).

On our beer forum, for example, searching for Kölsch will NOT return results for Kolsch, and vice-versa. Just common sense.

I am using Enhanced Search with Elasticsearch, but I believe this behavior is the same with traditional search. If I'm wrong, please move this thread to Enhanced Search Bug Reports.
 
This is specific to Elasticsearch. It's really just something that we don't specify. MySQL is basically accent-insensitive everywhere, whereas Elasticsearch requires configuration to enable that.
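
To illustrate what accent-insensitive matching means in practice, here is a minimal sketch that folds a handful of accented Latin characters to ASCII before comparing two search terms. It is not how MySQL collations or Elasticsearch work internally: the `foldToAscii` and `termsMatch` functions and the small mapping table are hypothetical and exist only for illustration.

```php
<?php

// Hypothetical helper: fold a few accented Latin characters to ASCII.
// The mapping is deliberately tiny; real folding tables cover far more.
function foldToAscii($text)
{
    static $map = array(
        'ä' => 'a', 'ö' => 'o', 'ü' => 'u',
        'Ä' => 'A', 'Ö' => 'O', 'Ü' => 'U',
        'é' => 'e', 'è' => 'e', 'ñ' => 'n', 'ç' => 'c',
    );

    // strtr() with an array replaces each key with its value;
    // characters that are not in the map pass through unchanged.
    return strtr($text, $map);
}

// Hypothetical helper: compare two terms after folding and lowercasing.
function termsMatch($a, $b)
{
    return strtolower(foldToAscii($a)) === strtolower(foldToAscii($b));
}

var_dump(termsMatch('Kölsch', 'Kolsch'));  // bool(true)
var_dump(termsMatch('Kölsch', 'Pilsner')); // bool(false)
```

In this sketch, text with no entry in the map (Cyrillic, for instance) is returned untouched, so it only ever matches itself.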

Specifically, enabling the asciifolding filter is necessary: http://www.elasticsearch.org/guide/...urrent/analysis-asciifolding-tokenfilter.html

I'm not clear what effect this would have on non-Latin-based languages (CJK, Cyrillic). Using this would also involve setting up a custom analyzer, which can be done with code like this:

Code:
<?php

$fileDir = dirname(__FILE__);

require($fileDir . '/library/XenForo/Autoloader.php');
XenForo_Autoloader::getInstance()->setupAutoloader($fileDir . '/library');

XenForo_Application::initialize($fileDir . '/library', $fileDir);

// Custom default analyzer: standard tokenizer plus lowercasing, stop words,
// ASCII folding (so "ö" is indexed as "o") and English stemming.
$dsl = array();
$dsl['index']['analysis']['analyzer']['default'] = array(
    'type' => 'custom',
    'tokenizer' => 'standard',
    'filter' => array('standard', 'lowercase', 'stop', 'asciifolding', 'xf_snowball')
);
$dsl['index']['analysis']['filter']['xf_snowball'] = array(
    'type' => 'snowball',
    'language' => 'English'
);

$indexName = XenES_Api::getInstance()->getIndex();

// Analysis settings can only be changed while the index is closed.
XenES_Api::closeIndex($indexName);
Zend_Debug::dump(XenES_Api::updateSettings($indexName, $dsl));
XenES_Api::openIndex($indexName);

// Dump the resulting settings to confirm the analyzer was applied.
Zend_Debug::dump(XenES_Api::getSettings($indexName));

The index should then be rebuilt.

There are a lot of ES options that we don't expose -- there's an absolute ton of configuration that can be done. There may be value in exposing an option to control the analysis in a little more detail, though it is a fair bit more complex.
 
It's a one-off script. You put it in the root XF directory, run it and go from there. It's important that you don't later choose any options that delete the index, because the custom analyzer settings would be lost along with it. Note that the code is provided as-is and hasn't been tested.

To verify any changes and see how things are working, you're going to need to make queries against Elasticsearch directly (http://www.elastic.co/guide/en/elasticsearch/reference/current/index.html). This is something you'd need to learn about yourself, as these are really just options that apply to the core system, like Apache or MySQL config, and we can really only support what XFES exposes via the UI.
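
As a possible starting point for that kind of verification, Elasticsearch's `_analyze` API reports the tokens an analyzer produces for a given piece of text. The sketch below only builds the request and does not send it. The `buildAnalyzeRequest` function, the host `http://localhost:9200` and the index name `xenforo` are placeholder assumptions, and the JSON body form shown applies to newer Elasticsearch versions (older releases took `analyzer` and `text` as query-string parameters), so check the documentation for the version you run.

```php
<?php

// Hypothetical helper: build the URL and JSON body for an _analyze request.
// $host and $index are placeholders; substitute your own values.
function buildAnalyzeRequest($host, $index, $text, $analyzer = 'default')
{
    return array(
        'url'  => rtrim($host, '/') . '/' . rawurlencode($index) . '/_analyze',
        'body' => json_encode(array('analyzer' => $analyzer, 'text' => $text)),
    );
}

$request = buildAnalyzeRequest('http://localhost:9200', 'xenforo', 'Kölsch');

echo $request['url'], "\n";
echo $request['body'], "\n";
```

Sending that body to the URL (with curl, for example) should, if the custom analyzer is active and the index has been reopened, return tokens with the umlaut folded away.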
 