Public service announcement: Check your MySQL backup!

This is the command I use as a cron job for a daily rolling backup of my main XF database. Each day it replaces the day of the week as a new backup, and writes over the backup done the same day the previous week. Everything is automatically backed up once a day via my VPS provider (Linode), and once a week I download a copy to my local hard drive (for cloud and two other types of backup).

This is with Ubuntu 18.04.3.

(uncompressed)
mysqldump --opt --default-character-set=utf8mb4 -u'ADMINACCT' -p'PASSWORD' xenforo > /var/www/html/MYDOMAIN/mysql_backups/xenforo-backup-$(date +\%F).sql

(compressed)
mysqldump --opt --default-character-set=utf8mb4 -u'ADMINACCT' -p'PASSWORD' xenforo | gzip -c > /var/www/html/MYDOMAIN/mysql_backups/xenforo-backup-$(date +\%F).sql.gz

Works a charm.

ADMINACCT = your admin login account name
PASSWORD = your admin password
MYDOMAIN = domain where database operates

Here's the output.
Screen Shot 2020-02-19 at 3.49.14 PM.webp
 
Right,

My server is running cPanel 86.0.14 (MariaDB 10.3.22). Like some people, I tried doing a backup both with and without this added to my.cnf (and MySQL rebooted):
Code:
[mysqldump]
default-character-set=utf8mb4

Either way it adds in:

/*!40101 SET character_set_client = utf8 */;

When I check to see if the Emoji is there, it is, both ways! The collation is still utf8mb4_general_ci too.

1584537674277.png

So do you think it was a bug that is now fixed? And if so, what are the chances of this error happening again?

cheers,

Alex
 
I managed to do this when moving my forum over to Centminmod (thanks @eva2000 for a superfast LEMP stack)

All posts/threads/messages/whatever from a year back (when we did XF1.5->XF2 upgrade) had emojis showing as ?

I noticed this too late (2 days after migration) and wasn't able to simply roll back the database (because of new threads/posts etc)

Luckily I still had the old database up and running and could do the following "merge dump" on original server:
Code:
mysqldump --skip-add-drop-table --no-create-info --no-create-db --replace --default-character-set=utf8mb4 -u USERNAME -p DBNAME xf_post xf_thread xf_profile_post xf_profile_post_comment xf_poll_response xf_poll xf_conversation_message xf_unfurl_result > dump.sql

Then I copied the file over and ran this on the new server:
Code:
mysql --default-character-set=utf8mb4 -u USERNAME -p DBNAME < dump.sql

Which then replaced the previously wrongly imported data with correct encoding, while still allowing newly added threads/posts/messages stay intact.

Hope that helps someone who notices this too late! :)

I guess if you still have the original dump available, you could restore it in an identical environment and then do the above dump command to get a "merge dump", your mileage might vary :D
 
Having moved servers today, we learned that our MySQL backups were subtly corrupted by an insidious issue that can be difficult to spot until it's too late.

All emoji and high-ascii characters in our database dumps were invalid.

This is due to a behaviour in mysqldump, which defaults the client character set to UTF8 instead of using UTF8MB4 when necessary.

To check if you are affected, review your MySQL dump SQL files and look for lines that look like this: /*!40101 SET character_set_client = utf8 */;

If you find one, your mysqldump command is not being told to use UTF8MB4, and the emoji data in your database dump will be corrupt.

To fix it, either add the argument --default-character-set=utf8mb4 to your mysqldump command, or else update the appropriate MySQL config file with the following:
Code:
[mysqldump]
default-character-set=utf8mb4
This public service update was brought to you by the letter 🦄
Hi, Kier.

I recently got trapped by this issue.
I helped my friend migrating his site from one server to another server. Both server uses AAPanel.
AAPanel supports direct site migration (with databases and misc settings together).
However, it does not support keeping utf8mb4 data intact during transmission.
It appears that only utf8mb3-supported contents are transmitted intact.

Here's my question for XF 2.2.5:
By default, does XF 2.2.5 default system data (in MySQL) include utf8mb4-only characters (like emoji, high-ascii, etc.)?
// This excludes user-generated contents since Simplified Chinese ideographs are almost covered by utf8mb3, except some extremely-rarely-used ideoglyphs which are totally unnecessary to be used in his board.

Warm Regards,
 
I've had $config['fullUnicode'] = false; in my config from the start. I only need to be concerned about this if I decide to allow emojis, correct?
 
I've had $config['fullUnicode'] = false; in my config from the start. I only need to be concerned about this if I decide to allow emojis, correct?
Certain kanji glyphs in Japanese are not supported by UTF8mb3 at all, requiring UTF8mb4.

I really think that Oracle is responsible for totally disabling UTF8mb3 in future MySQL releases.
 
Having moved servers today, we learned that our MySQL backups were subtly corrupted by an insidious issue that can be difficult to spot until it's too late.

All emoji and high-ascii characters in our database dumps were invalid.

This is due to a behaviour in mysqldump, which defaults the client character set to UTF8 instead of using UTF8MB4 when necessary.

To check if you are affected, review your MySQL dump SQL files and look for lines that look like this: /*!40101 SET character_set_client = utf8 */;

If you find one, your mysqldump command is not being told to use UTF8MB4, and the emoji data in your database dump will be corrupt.

To fix it, either add the argument --default-character-set=utf8mb4 to your mysqldump command, or else update the appropriate MySQL config file with the following:
Code:
[mysqldump]
default-character-set=utf8mb4
This public service update was brought to you by the letter 🦄

If you look at this screenshot, am I to assume that I am safe from this issue since the database is already in utf8mb4?
You'll see it in the 'Collation' column.

F746D42F-A5EA-42BC-B769-1FB10EA92D8C_4_5005_c.webp
 
That only confirms you need to check your backups and follow the instructions to make sure they are correct.
 
That wasn't my question.
But it is the correct answer. Your screenshot only shows that your current DB is correct. It does not validate that any of your backups are correct unless that was from a VERY recent restore.
By default you have to export your database as specified in this thread. Failure to do so will result in the import not being in the correct format usually (unless you force ALL your DB to default to UTF8mb4 which would mean it should be fine).
That's why it's recommended to use --default-character-set=utf8mb4 if doing dumps and restores from the command line.

By your screenshot you apparently use PHPMyAdmin... I would not recommend relying on it for your DB dumps. It has a habit of timing out and you getting a corrupted export of your DB, especially when on a shared server environment.
 
But it is the correct answer. Your screenshot only shows that your current DB is correct. It does not validate that any of your backups are correct unless that was from a VERY recent restore.
By default you have to export your database as specified in this thread. Failure to do so will result in the import not being in the correct format usually (unless you force ALL your DB to default to UTF8mb4 which would mean it should be fine).
That's why it's recommended to use --default-character-set=utf8mb4 if doing dumps and restores from the command line.

By your screenshot you apparently use PHPMyAdmin... I would not recommend relying on it for your DB dumps. It has a habit of timing out and you getting a corrupted export of your DB, especially when on a shared server environment.

At least your answer made better sense to me.

Unfortunately, I have very little experience in dealing with databases, which is why I had inquired in the 1st place about my database.

Yes, you are correct that I do use phpMyAdmin. Doing command-line database backups is something that I am not used to and that's one area that I do not want to corrupt a database due to my lack of experience with databases.
 
Yes, you are correct that I do use phpMyAdmin. Doing command-line database backups is something that I am not used to and that's one area that I do not want to corrupt a database due to my lack of experience with databases.
Using mysqldump and the sample command line parameters shown in this thread will not harm your DB.
Where you can prang your DB is doing a mysql --default-character-set=utf8mb4 -u USERNAME -p FORUMDBNAME < yourbackup.sql
This will stomp your existing DB and not something you typically want to run unless your DB is for the live forum is corrupted or you have to restore from an older backup because of some issue. Using mysqldump simply dumps your existing DB, just reading it only and not doing any modifications to it.
You can always set up a test bed installation of XF and practice dumping and importing to that DB and it will not interfere with your live site.
 
Last edited:
Using mysqldump and the sample command line parameters shown in this thread will not harm your DB.
Where you can prang your DB is doing a mysql --default-character-set=utf8mb4 -u USERNAME -p FORUMDBNAME < yourbackup.sql
This will stomp your existing DB and not something you typically want to run unless your DB is for the live forum is corrupted or you have to restore from an older backup because of some issue. Using mysqldump simply dumps your existing DB, just reading it only and not doing any modifications to it.
You can always set up a test bed installation of XF and practice dumping and importing to that DB and it will not interfere with your live site.
To elaborate on this comment for readers who may still be leery of using the cli to perform db backups and restores - pay attention to the direction of the bracket character.

A bracket can be considered to be pointing in the direction that the command is going to influence change on.
mysqldump --default-character-set=utf8mb4 -u USERNAME -p MY_AWESOME_LIVE_FORUM > 2021-11-29_MY_FORUM_BACKUP.sql
is going to write your production data to a backup file in what is probably the desired result, vs
mysql --default-character-set=utf8mb4 -u USERNAME -p MY_AWESOME_LIVE_FORUM < 2020-11-01_AN_OLD_FORUM_BACKUP.sql
is gonna make for a bad day.

Yes, I know that --default-character-set is not a valid argument for mysql, so the command would never complete. This example is written purely for illustrative purposes.
 
Yes, I know that --default-character-set is not a valid argument for mysql, so the command would never complete. This example is written purely for illustrative purposes.
Depends on MySQL versions
Bash:
mysql --help | grep -C3 '\-\-default-character-set'
  --debug-check       Check memory and open file usage at exit.
  -T, --debug-info    Print some debug info at exit.
  -D, --database=name Database to use.
  --default-character-set=name
                      Set the default character set.
  --delimiter=name    Delimiter to be used.
  -e, --execute=name  Execute command and quit. (Disables --force and history
 
All the tables in my DB have their collations set as utf8_general_ci. Would this be as simple as converting all my tables to utf8mb4_general_ci?

Is there some code I can run to convert all my tables and columns to utf8mb4_general_ci?

No because the point of this warning is that mysqldump would likely ignore it and export the database as UTF8 which would cause the problems we’re talking about here.

But, even so, if you want to support emoji and utf8mb4 then you should let XF convert it for you using the instructions provided in the manual:

https://xenforo.com/xf2-docs/manual/unicode/


Bumping this thread up as I just saw it by accident (thanks Kier for stickying it).

I've started using XenForo "from scratch" since version 2.1 (if I recall well), I see emoji looks to be supported, but I've noticed that if I try to edit a message content from the xf_post table, emojis are replaced with a question mark.

I see my database is set to utf8_general_ci in the Operations tab in phpMyAdmin:

1648944565806.png

I'm on a Plesk shared hosting, MariaDB v10.5.15...

Should I change all tables collations to utf8mb4_general_ci or not (and eventually what's the best practice)?
 
Thank you Brogan 😁

I did see that already, so you are confirming that DB collation must be converted to utf8mb4_general_ci, correct?
 
Alright... I've run the conversion script on my testing board by SSH from the Plesk panel.
Before confirming I got a message saying that 2 tables were going to be converted, but at the end of the process I still see the same as before in phpMyAdmin:

I see my database is set to utf8_general_ci in the Operations tab in phpMyAdmin:

1648944565806.png

So how could I check if my live XF needs this conversion or not? :cautious:
 
Top Bottom