
[USER=65566]@Omar Bazavilvazo[/USER]

You can modify the following function in the class "XenForo_Input":

[php]

   /**
    * Cleans invalid characters out of a string, such as nulls, nbsp, \r, etc.
    * Characters may not strictly be invalid, but can cause confusion/bugs.
    *
    * @param string $string
    *
    * @return string
    */
   public static function cleanString($string)
   {
     // only cover the BMP as MySQL only supports that
     $string = preg_replace('/[\xF0-\xF7].../', '', $string);
     return strtr(strval($string), self::$_strClean);
   }

[/php]
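
To see what that regex actually removes, here is a minimal sketch. The function name stripNonBmp is hypothetical and only wraps the same preg_replace() call; the two sample characters are my own picks (one 3-byte character inside the BMP, one 4-byte character outside it):

[php]
<?php

// Hypothetical wrapper around the preg_replace() call used in cleanString().
function stripNonBmp($string)
{
    // A 4-byte UTF-8 sequence starts with a lead byte in 0xF0-0xF7,
    // followed by three continuation bytes (matched here by the three dots).
    return preg_replace('/[\xF0-\xF7].../', '', $string);
}

$bmp   = "\xE6\xBC\xA2";     // U+6F22, 3 bytes, inside the BMP: kept
$extra = "\xF0\xA0\x9C\x8E"; // U+2070E, 4 bytes, outside the BMP: removed

var_dump(stripNonBmp("a{$bmp}{$extra}b") === "a{$bmp}b"); // bool(true)
[/php]

So any character that needs 4 bytes in UTF-8 is dropped, which is why the extra Hanzi disappear.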

Regex info


Since there are fewer than 100 additional characters, you could leave the regex unmodified: replace these characters with a placeholder built from their Unicode code point before the regex runs, then restore them once it has finished:


Example:

[php]

   public static function cleanString($string)
   {
     // swap the allowed extra Hanzi for ASCII placeholders so the regex skips them
     $string = MyClass_Helper_ExtraHanzi::encodeExtraHanzi($string);

     // only cover the BMP as MySQL only supports that
     $string = preg_replace('/[\xF0-\xF7].../', '', $string);

     // turn the placeholders back into the original characters
     $string = MyClass_Helper_ExtraHanzi::decodeExtraHanzi($string);

     return strtr(strval($string), self::$_strClean);
   }

[/php]


Then use this kind of helper:

[php]

<?php

class MyClass_Helper_ExtraHanzi
{
   // Code points (hex) of the extra Hanzi that must survive cleanString()
   protected static $_extraHanziUnicodeTable = array(
     '2070E','20731','20779','20C53','20C78','20C96','20CCF','20CD5','20D15','20D7C',
     '20D7F','20E0E','20E0F','20E77','20E9D','20EA2','20ED7','20EF9','20EFA','20F2D',
     '20F2E','20F4C','20FB4','20FBC','20FEA','2105C','2106F','21075','21076','2107B',
     '210C1','210C9','211D9','220C7','227B5','22AD5','22B43','22BCA','22C51','22C55',
     '22CC2','22D08','22D4C','22D67','22EB3','23CB7','244D3','24DB8','24DEA','2512B',
     '26258','267CC','269F2','269FA','27A3E','2815D','28207','282E2','28CCA','28CCD',
     '28CD2','29D98');

   protected static $_extraHanziCharactersReplacementTable;
   protected static $_extraHanziCharactersCharsTable;

   // Builds (once) the list of ASCII placeholders, e.g. {u:2070E}
   public static function getExtraHanziRemplacementTable()
   {
     if(!self::$_extraHanziCharactersReplacementTable)
     {
       foreach(self::$_extraHanziUnicodeTable as $v)
       {
         self::$_extraHanziCharactersReplacementTable[] = '{u:'.$v.'}';
       }
     }

     return self::$_extraHanziCharactersReplacementTable;
   }

   // Builds (once) the list of the actual UTF-8 characters, in the same order
   public static function getExtraHanziCharsTable()
   {
     if(!self::$_extraHanziCharactersCharsTable)
     {
       foreach(self::$_extraHanziUnicodeTable as $v)
       {
         // The charset must be given explicitly: before PHP 5.4 the default is
         // ISO-8859-1, which cannot represent these characters
         self::$_extraHanziCharactersCharsTable[] = html_entity_decode("&#x{$v};", ENT_QUOTES, 'UTF-8');
       }
     }

     return self::$_extraHanziCharactersCharsTable;
   }

   // Characters => placeholders (call before the regex)
   public static function encodeExtraHanzi($string)
   {
     $extraHanziChars = self::getExtraHanziCharsTable();
     $extraReplacements = self::getExtraHanziRemplacementTable();

     return str_replace($extraHanziChars, $extraReplacements, $string);
   }

   // Placeholders => characters (call after the regex)
   public static function decodeExtraHanzi($string)
   {
     $extraHanziChars = self::getExtraHanziCharsTable();
     $extraReplacements = self::getExtraHanziRemplacementTable();

     return str_replace($extraReplacements, $extraHanziChars, $string);
   }
}

[/php]
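
If you want to check the round trip outside XenForo, here is a rough standalone sketch. It does not use the helper class: it rebuilds the same encode / strip / decode steps with a reduced table of two code points, and the test string (including the emoji U+1F600, which is not in the table) is just an assumption for the demo:

[php]
<?php

// Reduced, hypothetical table: two of the code points from the helper.
$codePoints = array('2070E', '20731');

$chars = array();
$placeholders = array();
foreach ($codePoints as $hex) {
    // Explicit UTF-8 so the numeric entity decodes to a 4-byte character.
    $chars[] = html_entity_decode("&#x{$hex};", ENT_QUOTES, 'UTF-8');
    $placeholders[] = '{u:' . $hex . '}';
}

$allowed = $chars[0];           // U+2070E, listed in the table
$other   = "\xF0\x9F\x98\x80";  // U+1F600, 4 bytes, not in the table

$input = "x{$allowed}y{$other}z";

// 1. encode: listed characters become ASCII placeholders
$encoded = str_replace($chars, $placeholders, $input);
// 2. strip: every remaining 4-byte sequence is removed
$stripped = preg_replace('/[\xF0-\xF7].../', '', $encoded);
// 3. decode: placeholders become the original characters again
$output = str_replace($placeholders, $chars, $stripped);

var_dump($encoded === "x{u:2070E}y{$other}z"); // bool(true)
var_dump($output === "x{$allowed}yz");         // bool(true)
[/php]

The listed Hanzi survives, the unlisted 4-byte character is still removed. Note that a post containing the literal text {u:2070E} would also be turned into the character on decode, so pick another placeholder format if that matters to you.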

