Incremental import

spitballing...

Incremental only works within the same import session.

Don't take responsibility for new content in XF causing collisions with new content from source. Rather warn the user not to "use" the XF forum before an incremental import and to take backups if they wish to use the forum between an import and subsequent incremental imports.

Separate out threads and posts into their own steps for the purpose of incremental imports.

This shouldn't be hard to implement if you don't take responsibility for collisions.

Example... a forum with 30 million posts runs the import for a few days on a live source to get most of the content. Then they turn off the source and incrementally import only the new stuff. Downtime is almost nothing. That way it doesn't matter that the web-based importer is slow.
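To make the idea concrete, here is a rough sketch of how the per-step bookkeeping might look. It is not how the XF importer works. It assumes the source uses auto-increment IDs, and `ImportSession`, `import_step` and the row layout are made-up names for illustration. Each step (threads, posts) remembers the highest source ID it has copied, and a later pass only picks up rows above that mark.

```python
from dataclasses import dataclass, field


@dataclass
class ImportSession:
    """Hypothetical per-session state: highest source ID imported per step."""
    high_water: dict = field(default_factory=dict)


def import_step(session, step, source_rows, target):
    """Copy rows whose source ID is above this step's high-water mark."""
    last_id = session.high_water.get(step, 0)
    new_rows = sorted(
        (row for row in source_rows if row["id"] > last_id),
        key=lambda row: row["id"],
    )
    for row in new_rows:
        target.setdefault(step, []).append(dict(row))
        session.high_water[step] = row["id"]
    return len(new_rows)


# Initial import while the source is still live.
source = {
    "threads": [{"id": 1, "title": "Welcome"}],
    "posts": [
        {"id": 1, "thread_id": 1, "body": "Hello"},
        {"id": 2, "thread_id": 1, "body": "Hi"},
    ],
}
session, target = ImportSession(), {}
first_pass = {
    step: import_step(session, step, source[step], target)
    for step in ("threads", "posts")
}

# New content arrives on the source before it is switched off.
source["threads"].append({"id": 2, "title": "Moving day"})
source["posts"].append({"id": 3, "thread_id": 2, "body": "We are moving"})
second_pass = {
    step: import_step(session, step, source[step], target)
    for step in ("threads", "posts")
}

print(first_pass)   # {'threads': 1, 'posts': 2}
print(second_pass)  # {'threads': 1, 'posts': 1}
```

The second pass only touches rows added since the first one, which is why the final downtime could be short even if the first pass took days. Keeping threads and posts as separate steps lets each one track its own mark.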
 
What about changes made to existing content?
For example, edited or deleted posts, deleted/merged/moved threads, banned members, deleted members, merged members, etc.

How would that be updated?
 
If I understand the suggestion correctly, the data would be imported while the forum stays open? Then the importer would have to ensure that already-imported data cannot be deleted or altered on the source system, so it would be temporarily locked. The question is which would be the better choice for the live system: source, target, or a mix with a switchover at a certain point.
 
What about changes made to existing content?
For example, edited or deleted posts, deleted/merged/moved threads, banned members, deleted members, merged members, etc.

How would that be updated?

It wouldn't. Major moderator actions on the source would be contraindicated during the import.
 
What good is a car that doesn't fly? The question ignores the intended use.

As stated above, there are specific usage requirements with an incremental import. Use only as indicated. For external use only. In case of accidental ingestion do not induce vomiting. Contact your doctor if you experience an erection lasting longer than 4 hours.
 
Hmm, I guess some people don't like my humor. I got reported. =\

I was trying to explain that you need to be aware of certain considerations and use the incremental importer as intended. To Brogan's point, the user needs to be warned that an incremental import will pull in new inserts but not updates to existing records, so updates on the source should be kept to a minimum. That means avoiding moderator actions during the initial import. If the initial import takes 3 days, warn your moderators not to make big changes during those 3 days, as the actions might have to be redone after the move.

So to answer the question, "What good is an incremental import if it doesn't do updates?"

The answer is... it doesn't do updates. That is a compromise that you need to be aware of so you can work around it. If you want to pull in changes to all previously imported records then you need to reinstall and do a full import again. No incremental. But if you want to minimize downtime for a huge import then you can use incremental which has some slight compromises (e.g. no updates), the impact of which can be minimized if you are aware of them.
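A small illustration of that compromise, again only a sketch with made-up names (`incremental_pass`, the post dicts) and assuming new rows are detected by ID alone: an edit made on the source after the initial import stays invisible to the incremental pass, while a new post comes across.

```python
def incremental_pass(source_posts, target_posts, last_id):
    """Copy only posts with an ID above last_id; return the new high-water mark."""
    for post in sorted(source_posts, key=lambda p: p["id"]):
        if post["id"] > last_id:
            target_posts[post["id"]] = dict(post)
            last_id = post["id"]
    return last_id


source_posts = [{"id": 1, "body": "original text"}]
target_posts = {}
mark = incremental_pass(source_posts, target_posts, 0)

# After the initial import: a moderator edits post 1 and a member adds post 2.
source_posts[0]["body"] = "edited by moderator"
source_posts.append({"id": 2, "body": "new reply"})
mark = incremental_pass(source_posts, target_posts, mark)

# Posts whose imported copy no longer matches the source.
source_by_id = {p["id"]: p for p in source_posts}
stale = [pid for pid, post in target_posts.items()
         if post["body"] != source_by_id[pid]["body"]]

print(target_posts[1]["body"])  # original text
print(stale)                    # [1]
```

Post 2 is imported, but post 1 keeps its pre-edit text on the target. That is the kind of change that would have to be redone by hand after the move.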
 