Public service announcement: Check your MySQL backup!

Kier

XenForo developer
Staff member
Having moved servers today, we learned that our MySQL backups were subtly corrupted by an insidious issue that can be difficult to spot until it's too late.

All emoji and high-ascii characters in our database dumps were invalid.

This is due to a behaviour in mysqldump, which defaults the client character set to UTF8 instead of using UTF8MB4 when necessary.

To check if you are affected, review your MySQL dump SQL files and look for lines that look like this: /*!40101 SET character_set_client = utf8 */;

If you find one, your mysqldump command is not being told to use UTF8MB4, and the emoji data in your database dump will be corrupt.

To fix it, either add the argument --default-character-set=utf8mb4 to your mysqldump command, or else update the appropriate MySQL config file with the following:
Code:
[mysqldump]
default-character-set=utf8mb4
This public service update was brought to you by the letter 🦄
 
Thanks for heads up.
To fix it, either add the argument --default-character-set=utf8mb4 to your mysqldump command, or else update the appropriate MySQL config file with the following:
Code:
[mysqldump]
default-character-set=utf8mb4
Setting uft8mb4 as default, would bring interesting and potential issues ? If you have other non-xenforo mysql databases that don't use uft8mb4 but utf8 or non-utf8 too on the same server. You could end up messing up mysqldumps for those non-xenforo mysql databases ?

Setting --default-character-set=utf8mb4 command on mysqldump command line would seem safe in such cases
 
Last edited:
All the tables in my DB have their collations set as `utf8_general_ci`. Would this be as simple as converting all my tables to `utf8mb4_general_ci`?

Is there some code I can run to convert all my tables and columns to `utf8mb4_general_ci`?
 
What does this even mean? That only users who supported emojis or something other then UTF-8 are affected? How many users do you suppose this is affecting?
 
Simply, if you use mysqldump to take database backups, and you are using XF 2.0.x with emoji support enabled, then you need to check to make sure that your database backups are being exported as utf8mb4 rather than utf8.

If the database is being exported as utf8 then it's very possible that any content in your database which contains emoiji (that includes, but not limited to; posts, conversations, thread titles, usernames, user profiles) is subtly corrupted; namely emoji characters such as 🤪would appear as ? if you restored the database backup.
 
So if I run serveral databases in one web (which are being backed up together)

db1: Xenforo 1.5
db2: Xenforo 2.0
db3: Wordpress
db4: ...

and set

[mysqldump]
default-character-set=utf8mb4

db1/3/4.... will not be affected by this? Or could there be any corruption of the other databases?
 
If you’re backing up multiple databases then you’d be better to supply the required character set for each one.

Our warning here applies specifically to XF2 if it was a new install or upgraded from XF1 and you have converted your tables but you will need to check what the expected character set is for any other software. XF1 is fine with normal UTF8.
 
I upgraded this week from XF1 to XF2 and I did not convert it, so my XF2 is not emoji support enabled. This mean my database will not be effected by this mysqldump issue. Am I right?
 
Thanks for the heads up. Just fixed this issue.

I know this is off-topic but do I need to do the same thing for Wordpress and Mediawiki sites? I'm just wondering if my MySQL backups for them are affected by the same issue.
 
The default charset for newer WordPress installs at least is utf8mb4 so yes it could be susceptible.

However I can’t be certain so for that and Mediawiki you would really have to check with those to be certain.
 
I use MySQL Workbench to export backups and the default character set = utf8 , how do I alter it to export using utf8mb4 ?
 
do you have to do anything special to IMPORT a db exported with the utf8mb4 option?
If you exported using the command line parameter as given earlier (to prevent other DB's from having it forced on them also) then no, the exported data should define it in the text. This is a mysqldump segment frommy IPS site (since I don't run 2.x of XenForo currently)

Code:
CREATE TABLE `calendar_calendars` (
  `cal_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `cal_moderate` tinyint(1) NOT NULL DEFAULT 0,
  `cal_position` int(3) NOT NULL DEFAULT 0,
  `cal_title_seo` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `cal_comment_moderate` tinyint(1) NOT NULL DEFAULT 0,
  `cal_color` varchar(7) COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT 'Hex color code to represent this calendar',
  `cal_allow_comments` tinyint(1) unsigned NOT NULL DEFAULT 1,
  `cal_allow_reviews` tinyint(1) unsigned NOT NULL DEFAULT 0,
  `cal_review_moderate` tinyint(1) unsigned NOT NULL DEFAULT 0,
  `cal_club_id` bigint(20) unsigned DEFAULT NULL COMMENT 'The club ID if this calendar belongs to a club, or NULL',
  `cal_bitoptions` int(10) unsigned NOT NULL DEFAULT 0,
  `cal_calendar_bitoptions` int(10) unsigned NOT NULL DEFAULT 0,
  PRIMARY KEY (`cal_id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
If you notice, it has the utf8mb4 CHARSET set and the COLLATE as utf8mb4_unicode_ci.
 
If you exported using the command line parameter as given earlier (to prevent other DB's from having it forced on them also) then no, the exported data should define it in the text. This is a mysqldump segment frommy IPS site (since I don't run 2.x of XenForo currently)
Actually to be on safe side you'd want to do mysql import with

Code:
mysql --default-character-set=utf8mb4

as your may have uf8mb4 mysql table and field character set/collations but your mysql client connection my not be utf8mb4 based and mysql server default client might be a different character set and collation from utf8mb4 i.e. utf8
 
Back
Top Bottom