
Not a Bug Searching for special characters

Discussion in 'Enhanced Search Resolved Bugs' started by Gene, Mar 24, 2014.

  1. Gene

    Gene Member

    Users have been noticing that searching for a word with a special character only returns results with that exact character, when really it should be returning results for that character as well as its English-keyboard equivalent (like Google would).

    On our beer forum, for example, searching for Kölsch will NOT return results for Kolsch, and vice-versa. Just common sense.

    I am using Enhanced Search with Elasticsearch, but I believe this behavior is the same with traditional search. If I'm wrong, please move this thread to Enhanced Search Bug Reports.
     
  2. Mike

    Mike XenForo Developer Staff Member

    This is specific to Elasticsearch. It's really just something that we don't specify. MySQL is basically accent insensitive everywhere, whereas Elasticsearch requires configuration to enable that.

    Specifically, enabling the asciifolding filter is necessary: http://www.elasticsearch.org/guide/...urrent/analysis-asciifolding-tokenfilter.html I'm not clear what effect this would have on non-Latin based languages (CJK, Cyrillic). Using this would also involve setting up a custom analyzer, which can be done with code like this:

    Code:
    <?php
    
    // One-off script: run it from the XenForo root directory.
    $fileDir = dirname(__FILE__);
    
    // Bootstrap the XenForo autoloader and application.
    require($fileDir . '/library/XenForo/Autoloader.php');
    XenForo_Autoloader::getInstance()->setupAutoloader($fileDir . '/library');
    
    XenForo_Application::initialize($fileDir . '/library', $fileDir);
    
    // Custom default analyzer: the usual filter chain plus 'asciifolding'.
    $dsl = array();
    $dsl['index']['analysis']['analyzer']['default'] = array(
        'type' => 'custom',
        'tokenizer' => 'standard',
        'filter' => array('standard', 'lowercase', 'stop', 'asciifolding', 'xf_snowball')
    );
    // English snowball stemmer referenced by the analyzer above.
    $dsl['index']['analysis']['filter']['xf_snowball'] = array(
        'type' => 'snowball',
        'language' => 'English'
    );
    
    $indexName = XenES_Api::getInstance()->getIndex();
    
    // The index is closed while its analysis settings are updated, then reopened.
    XenES_Api::closeIndex($indexName);
    Zend_Debug::dump(XenES_Api::updateSettings($indexName, $dsl));
    XenES_Api::openIndex($indexName);
    
    // Dump the resulting settings to confirm the analyzer was applied.
    Zend_Debug::dump(XenES_Api::getSettings($indexName));

    The index should then be rebuilt.

    There are a lot of ES options that we don't expose -- there's an absolute ton of configuration that can be done. There may be value in exposing an option to control the analysis in a little more detail, though it is a fair bit more complex.
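
    To illustrate why folding makes the two spellings match, here is a minimal sketch of the idea in plain PHP. It is not how Elasticsearch implements asciifolding: the function name fold_to_ascii and its tiny hand-written character table are hypothetical, and the real filter covers far more characters.

```php
<?php
// Illustrative only: a small, hand-written folding table (an assumption for
// this example). Elasticsearch's asciifolding filter handles many more characters.
function fold_to_ascii($text)
{
    $map = array(
        'ö' => 'o', 'Ö' => 'O',
        'ü' => 'u', 'Ü' => 'U',
        'ä' => 'a', 'Ä' => 'A',
        'é' => 'e', 'É' => 'E',
    );

    // strtr() swaps each accented character for its ASCII equivalent;
    // strtolower() then stands in for the 'lowercase' filter.
    return strtolower(strtr($text, $map));
}

// Both spellings reduce to the same token, so a search for either one
// can match posts containing the other.
var_dump(fold_to_ascii('Kölsch') === fold_to_ascii('Kolsch')); // bool(true)
```

    Because the same analyzer runs at index time and at query time, the folded token is what gets compared on both sides, which is also why the index has to be rebuilt after changing the analyzer.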
     
  3. Gene

    Gene Member

  4. imthebest

    imthebest Formerly Super120

    @Mike, where to put that code snippet?
     
  5. imthebest

    imthebest Formerly Super120

    Please someone tell me where to put the code snippet posted by Mike.
     
  6. imthebest

    imthebest Formerly Super120

    @Mike could you please help?
     
  7. Mike

    Mike XenForo Developer Staff Member

    It's a one-off script. You put it in the root XF directory, run it and go from there. It's important that you don't later choose any options that delete the index. Note that the code is provided as-is and hasn't been tested.

    To verify any changes and how things are working, you're going to need to make queries against Elasticsearch (http://www.elastic.co/guide/en/elasticsearch/reference/current/index.html). This is something you'd need to learn about, as these are really just options that apply to the core system, like Apache or MySQL config, and we can really only support what XFES exposes via the UI.
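
    As a rough starting point, one way to check the analyzer is Elasticsearch's _analyze endpoint, which returns the tokens produced for a piece of text. The sketch below only builds the request rather than sending it. The helper name build_analyze_request, the host http://localhost:9200 and the index name my_xf_index are all assumptions for illustration, and the request format has changed between Elasticsearch versions, so check the documentation for the version you run.

```php
<?php
// Hypothetical helper: builds (but does not send) an _analyze request
// for the given index. Host and index name are placeholders.
function build_analyze_request($baseUrl, $indexName, $text)
{
    return array(
        'method' => 'POST',
        'url'    => rtrim($baseUrl, '/') . '/' . rawurlencode($indexName) . '/_analyze',
        // No analyzer is named here, so the index's default analyzer is used.
        'body'   => json_encode(array('text' => $text)),
    );
}

$request = build_analyze_request('http://localhost:9200', 'my_xf_index', 'Kölsch');

// Print the request so it can be sent with curl or any HTTP client.
echo $request['method'], ' ', $request['url'], "\n", $request['body'], "\n";
```

    If the folding is in effect, the tokens returned for Kölsch should no longer contain the accented character.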
     
