Fixed Invalid UTF8 sequence in truncated message(?)

Discussion in 'Resolved Bug Reports' started by Kent, Nov 19, 2013.

  1. Kent

    Kent Active Member

    Someone had the bright idea to make their "about" field a giant blob of stacking diacritics, which went over the hard-coded limit of 65535 characters.

    Stacking diacritics look like this, and can be posted fine when under the character limit:
    Code:
    ก็็็็็็็็็็็็็็็็็็็็กิิิิิิิิิิิิิิิิิิิิก้้้้้้้้้้้้้้้
    When submitting a message of only those characters repeated beyond the character limit, this error occurs:
    Code:
    Zend_Db_Statement_Mysqli_Exception: Mysqli statement execute error : Data too long for column 'about' at row 1 - library/Zend/Db/Statement/Mysqli.php:214
    When submitting the same message prefixed by a single-byte character, this error occurs:
    Code:
    Zend_Db_Statement_Mysqli_Exception: Mysqli statement execute error : Incorrect string value: '\xE0\xB8\x81\xE0\xB9\x87...' for column 'about' at row 1 - library/Zend/Db/Statement/Mysqli.php:214
    After poking around, it seems the TEXT max length is 65535 bytes, but XenForo is splitting the string by characters.
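The character/byte mismatch described above can be reproduced outside XenForo. The following is a minimal Python sketch, not XenForo or MySQL code; the helper names `char_and_byte_length` and `cut_is_valid_utf8` are hypothetical. It shows that each Thai character in the sample takes 3 bytes in UTF-8, and that a single-byte prefix shifts the 65535-byte boundary into the middle of a character. Whether such a mid-character cut is what actually produces the second error is an assumption, not something confirmed in this thread.

```python
# Hypothetical illustration only; none of these names come from XenForo.
TEXT_MAX_BYTES = 65535  # MySQL TEXT column limit, measured in bytes

# A Thai consonant plus one stacking diacritic, as in the sample above.
# Each of the two code points encodes to 3 bytes in UTF-8.
UNIT = "\u0e01\u0e47"


def char_and_byte_length(s):
    """Return (number of characters, number of UTF-8 bytes)."""
    return len(s), len(s.encode("utf-8"))


def cut_is_valid_utf8(s, max_bytes):
    """Return True if the first max_bytes bytes of s still decode as UTF-8."""
    try:
        s.encode("utf-8")[:max_bytes].decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


message = UNIT * 20000  # 40000 characters, well under a 65535-character limit

print(char_and_byte_length(message))                     # (40000, 120000)
print(cut_is_valid_utf8(message, TEXT_MAX_BYTES))        # True: 65535 is a multiple of 3
print(cut_is_valid_utf8("a" + message, TEXT_MAX_BYTES))  # False: the cut lands mid-character
```

The message passes a character-based check yet needs almost twice the bytes the column can hold, and the one-byte prefix is enough to turn a clean cut into an incomplete sequence.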
     
  2. Mike

    Mike XenForo Developer Staff Member

    So part of this was a miscalculation on our part, though it is a bit of an inconsistency within MySQL. When you say VARCHAR(255) on a UTF-8 column, it actually means 255 characters. However, when you have a text column, the limit (65KB, 16MB, etc.) is actually a byte limit and thus is affected by the variable length of UTF-8. Our count/length checks are done as characters across the board.

    The safest thing here, with respect to about and signatures, is to enforce a much lower limit. MySQL's UTF-8 normally only supports 3-byte characters, so that puts the worst-case limit at ~21000 characters, but I've changed it to limit to 20000. Furthermore, I'm also applying limits to the length of these fields to fit with the maximum message length.
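The worst-case arithmetic above can be checked with a short Python sketch. This is an illustration under the assumptions stated in the post (a 65535-byte TEXT column and at most 3 bytes per character), not the actual XenForo change; `worst_case_char_capacity` and `fits_in_text_column` are hypothetical names.

```python
# Hypothetical illustration only; not the actual XenForo fix.
TEXT_MAX_BYTES = 65535   # MySQL TEXT column limit, in bytes
MAX_BYTES_PER_CHAR = 3   # assumption taken from the post: 3-byte UTF-8 at most
CHAR_LIMIT = 20000       # the lower character limit chosen in the post


def worst_case_char_capacity(limit_bytes, bytes_per_char):
    """Characters guaranteed to fit if every character has the maximum width."""
    return limit_bytes // bytes_per_char


def fits_in_text_column(value, char_limit=CHAR_LIMIT, byte_limit=TEXT_MAX_BYTES):
    """Check the character limit and the encoded byte size together."""
    return len(value) <= char_limit and len(value.encode("utf-8")) <= byte_limit


print(worst_case_char_capacity(TEXT_MAX_BYTES, MAX_BYTES_PER_CHAR))  # 21845
print(CHAR_LIMIT * MAX_BYTES_PER_CHAR)                                # 60000
print(fits_in_text_column("\u0e01" * 20000))                          # True
print(fits_in_text_column("\u0e01" * 20001))                          # False
```

With 3-byte characters, 20000 characters need at most 60000 bytes, which leaves headroom under the 65535-byte limit.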
     
