How do you properly do backup in live site?

Danny88


I'm wondering: if you have a large site with a MySQL database of around 100 GB, how do you get an accurate, consistent backup?

Logical MySQL dump inconsistencies

One of the biggest downsides to mysqldump is its lack of parallelism. The tool starts at the first database, alphabetically, and dumps each table within it, one at a time. Consider a transaction that inserts data into tables ‘alpha’ and ‘zeta’. Now the backup begins. When the backup reaches table ‘delta’, another transaction deletes from ‘zeta’. Your backup now has an inconsistency between ‘alpha’ and ‘zeta’.
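
As a hypothetical illustration of that scenario (database and user names made up): a dump that neither locks tables nor uses a consistent snapshot is exposed to exactly this skew.

  # each table is dumped one after another; writes that land in between
  # are captured for some tables and missed for others
  mysqldump --skip-lock-tables -u backup -p forum > forum.sql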

Direct copy from the /mysql folder
In my experience, rsync backups of the raw MySQL data directory (the .frm and .ibd files) never give accurate index numbers when I double-check in phpMyAdmin between the backup and the live database.

The only way I get accurate backup data is with MySQL replication, with the slave on a separate server, and from there uploading to third-party storage. For me this setup is costly and painful. I plan to downsize and am looking for a cheaper backup solution that still gives accurate data.
 
I have my server set to do a full backup four times a week, during my site's slowest times (for me, that's around 2 am Central). Then, when I want a site-specific backup (usually a few times a month), I'll temporarily set the forums offline in the admin panel, perform the backup, then re-open the forums. That said, my site's backup is under 2 GB, so it takes me less than 5 minutes to close the site, back it all up, and have it open again.

Those few times a month when my site is offline are also when I choose to do any server maintenance or add-on upgrades. Doing all of that, I can still usually have the site back up and running within 15 minutes. I also try to do these things when the site is as slow as possible, but I'm not staying awake until 2 am for it.
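
A rough sketch of that schedule as a cron entry (hypothetical paths and days; the backup script itself would look like the ones discussed later in the thread):

  # /etc/cron.d/forum-backup - full backup at 2 am, four nights a week
  0 2 * * 1,3,5,6  root  /usr/local/bin/backup-db.sh >> /var/log/backup-db.log 2>&1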
 
This addon will close your forum before the backup starts and re-open it automatically once done.

I'm not sure if that will work with a large database like yours.

Backing up my own XF2 forum database, with a raw size of 25 GB, takes only 90–100 seconds, so I don't bother closing the forum.
Just mysqldump wrapped in a simple bash script.
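
A minimal sketch of such a script (hypothetical database name, credentials file, and backup path; assumes InnoDB tables so --single-transaction gives a consistent dump without locking the forum):

  #!/usr/bin/env bash
  set -euo pipefail

  DB=forum                         # hypothetical database name
  DEST=/var/backups/mysql          # hypothetical backup directory
  STAMP=$(date +%F-%H%M)

  # --single-transaction: consistent snapshot for InnoDB without locking
  # --quick: stream rows instead of buffering whole tables in memory
  mysqldump --defaults-extra-file=/root/.my.backup.cnf \
            --single-transaction --quick --routines --triggers \
            "$DB" | gzip > "$DEST/$DB-$STAMP.sql.gz"

  # delete dumps older than 14 days
  find "$DEST" -name "$DB-*.sql.gz" -mtime +14 -delete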

Have you tried this?
 
Closing down the site for backups isn't the method I prefer. It really takes time and will eventually tarnish the site's reputation.
 
I use mydumper for my MariaDB database (around 50 GB). It works great, and restores are consistently working too.
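
A rough sketch of what that can look like (hypothetical database name, paths and thread count; exact flags vary a little between mydumper versions):

  # parallel, consistent dump of the 'forum' database with 4 worker threads
  mydumper --database=forum --outputdir=/var/backups/mydumper/forum \
           --threads=4 --compress

  # restore with the matching loader
  myloader --directory=/var/backups/mydumper/forum --database=forum \
           --overwrite-tables --threads=4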
 
I use ndbcluster as the storage engine, which supports hot backups (end users don't know anything is down... all tables are fully write capable during a backup).

Not really an option to close down sites or databases for any period of time. I do a daily full backup of all the databases (not just one specific site), and the backup spans about 5TB. The backups are deleted after 1 week, unless the backup was taken on the 1st of the month (those are kept for 1 year). Once a month the full backup is sent to an offsite location (offsite locations keep the last 2 months of backups). So at any given time I have 20 backups available (18 onsite and 2 offsite).

The number of times I've needed to restore something from backup has been zero though.

If you have the resources and technical know-how to use ndbcluster, I highly recommend it. My little database cluster can handle ~30M SQL reads per second while also doing ~10M SQL writes per second. Zero downtime for backups or even database software updates.

I believe the first versions of the ndbcluster storage engine were designed for call detail logging and call routing in the telecom industry, where you can't take down the system for an upgrade or backup even for a second.

Across all my sites and databases (XenForo, a couple WordPress and some others), InnoDB and MyISAM are not used. Everything is 100% ndbcluster.
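
For reference, a hot backup on NDB Cluster is started from the management client; a minimal sketch (the backup files land in each data node's BACKUP directory and are restored with ndb_restore):

  # trigger an online backup across all data nodes and wait for it to finish
  ndb_mgm -e "START BACKUP WAIT COMPLETED"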
 
There is also mysqlhotcopy. It also depends on whether you are using MySQL, MariaDB, or Postgres; MariaDB and Postgres have more advanced backup options as well.
 
I use ndbcluster as the storage engine, which supports hot backups (end users don't know anything is down... all tables are fully write capable during a backup).

Sounds great, but unfortunately I'm using MariaDB.
There is also mysqlhotcopy.

mysqlhotcopy seems deprecated. Anyway, I'll look into whether it can really produce an accurate backup without downtime.
 
The most common options to create a backup source (which then needs to be backed up by your actual backup solution) would be
  • replication secondary
  • mysqldump --single-transaction
  • Mariabackup (if using MariaDB), XtraBackup (if using MySQL or Percona Server) - see the sketch after this list
  • filesystem snapshot (LVM, ZFS - don't use BtrFS!)
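
Since the OP is on MariaDB, a minimal Mariabackup sketch (hypothetical target directory and credentials; the prepare step must run before the copy is usable for a restore):

  # physical hot backup of the running server into an empty target directory
  mariabackup --backup --target-dir=/var/backups/mariabackup/full \
              --user=backup --password=secret

  # apply the redo log so the copied files are consistent and restorable
  mariabackup --prepare --target-dir=/var/backups/mariabackup/full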
 
filesystem snapshot (LVM, ZFS - don't use BtrFS!)
I use ZFS and ship snapshots offsite (using zrepl, a wrapper around zfs send/receive), as well as periodically standing up an instance of MySQL targeting a clone of a snapshot and doing a logical dump from that.

It avoids replication delays, it doesn't touch the live server, and I've got a consistent copy from both the logical dump and the raw snapshots.
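
A rough sketch of the clone-and-dump part (hypothetical pool/dataset names and paths; assumes the whole MySQL datadir lives on one ZFS dataset so the snapshot is crash-consistent, and that InnoDB runs its normal crash recovery when the throwaway instance starts):

  # snapshot the live datadir and clone it to a scratch dataset
  zfs snapshot tank/mysql@nightly
  zfs clone tank/mysql@nightly tank/mysql-verify

  # start a throwaway mysqld against the clone (local socket only)
  mysqld --no-defaults --user=mysql --datadir=/tank/mysql-verify \
         --socket=/tmp/verify.sock --skip-networking &

  # take a logical dump from the clone
  mysqldump --socket=/tmp/verify.sock --single-transaction --all-databases \
            | gzip > /var/backups/verify-$(date +%F).sql.gz

  # (stop the temporary mysqld and 'zfs destroy' the clone afterwards)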
 
mysqldump --single-transaction
I didn't even know this was a thing, and I use mysqldump for all my backups. Guess I'll have to start throwing that option in there when I do it to see how that goes. Maybe I'll be able to start doing my backups without shutting the site down, even if it's just for a few minutes.
 
I didn't even know this was a thing, and I use mysqldump for all my backups.

If you're using logical backups such as mysqldump, consider using mydumper/myloader instead. Consistency, parallelism, and even with a single thread it still dumps faster than mysqldump. Plus, zstd compression support is now built in.

IIRC, for logical dumps of 4 big boards plus a dozen other smaller DBs, backup time dropped from 30-40 minutes to about 2.
 
I didn't even know this was a thing, and I use mysqldump for all my backups.

Yeah, it's one of the standard switches I always use when dumping.
 
