Some questions about data after import

dutchbb

Just finished a first import of my vB database for testing, and looking at everything I found some strange things that I'd like to understand.

First of all, the database sizes:
vBulletin: 4,8 GB
XenForo: 10,1 GB

Of course I wanted to find out why there's such a big difference. So I checked the post and thread tables:

vBulletin thread and post tables: 62,1 MB and 3,7 GB
XenForo thread and post tables: 40,2 MB and 2,0 GB

Is that possible, or did something go wrong during the import that caused a loss of data here?

Then, looking into why the XF database takes up so much more space, I saw that my vBulletin was actually using Fulltext search. But can it really be that I have a 6,8 GB search index in XF for roughly 5.5 million posts? That sounds high. If so, a fulltext option or something similar would be very welcome.
 
It's the result of rebuilding your search index.
 
And if I understand correctly, that means content has to be duplicated, with the benefit that the post and thread tables are not locked while searching? Any comment on the smaller post table?
 
Yes, content is duplicated for the benefit of speed and system responsiveness.

Regarding your smaller thread/post tables, is there a discrepancy in the number of records each holds?
 
5,550,357 records for vB post table
5,548,983 records for XF post table

262,295 records for vB thread table
259,467 records for XF thread table

So yeah it seems something is missing, although I doubt that would account for 1.7 GB. Unless ... some very large threads/posts are missing...
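To put numbers on that doubt, here is a rough back-of-the-envelope check using only the figures quoted in this thread. It assumes decimal gigabytes (1 GB = 10^9 bytes) and treats the reported table size as if it were spread evenly over the rows, which is a simplification since table size also includes indexes and overhead.

```python
# Figures quoted in the thread; decimal gigabytes are an assumption.
VB_POSTS, XF_POSTS = 5_550_357, 5_548_983
VB_THREADS, XF_THREADS = 262_295, 259_467
VB_POST_TABLE_GB, XF_POST_TABLE_GB = 3.7, 2.0


def missing(before: int, after: int) -> tuple[int, float]:
    """Return the absolute and percentage drop between two record counts."""
    diff = before - after
    return diff, 100.0 * diff / before


missing_posts, posts_pct = missing(VB_POSTS, XF_POSTS)
missing_threads, threads_pct = missing(VB_THREADS, XF_THREADS)

# Size gap between the two post tables, in bytes.
size_gap_bytes = (VB_POST_TABLE_GB - XF_POST_TABLE_GB) * 1e9

# Average bytes per row in the vBulletin post table (rough estimate).
avg_vb_post_bytes = VB_POST_TABLE_GB * 1e9 / VB_POSTS

# How large each missing post would have to be to explain the whole gap.
needed_per_missing_post = size_gap_bytes / missing_posts

print(f"missing posts:   {missing_posts} ({posts_pct:.3f}%)")
print(f"missing threads: {missing_threads} ({threads_pct:.3f}%)")
print(f"average vB post row: about {avg_vb_post_bytes:.0f} bytes")
print(f"each missing post would need about {needed_per_missing_post / 1e6:.2f} MB")
```

With these inputs the missing posts come to 1,374 and the missing threads to 2,828. Each missing post would have to be well over a megabyte to account for the gap, against an average of well under a kilobyte per row, which suggests the missing rows alone are unlikely to explain the size difference.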

I should also add that my browser crashed once during the import of posts/threads.
 
My guess is that those thread/post count differences represent threads and posts that have become orphaned over the life of your vBulletin, from browser crashes during deletes etc.
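One way to test that guess before re-importing is to count orphaned rows in the source database. The sketch below is only an illustration: it uses an in-memory SQLite database with a simplified, assumed schema (`thread.threadid`, `post.postid`, `post.threadid`), so table and column names may need adjusting for a real vBulletin install, and the same queries would normally be run against MySQL instead.

```python
import sqlite3


def count_orphans(conn: sqlite3.Connection) -> dict[str, int]:
    """Count posts whose thread is gone and threads that have no posts."""
    orphan_posts = conn.execute(
        "SELECT COUNT(*) FROM post p "
        "LEFT JOIN thread t ON t.threadid = p.threadid "
        "WHERE t.threadid IS NULL"
    ).fetchone()[0]
    empty_threads = conn.execute(
        "SELECT COUNT(*) FROM thread t "
        "WHERE NOT EXISTS (SELECT 1 FROM post p WHERE p.threadid = t.threadid)"
    ).fetchone()[0]
    return {"orphan_posts": orphan_posts, "empty_threads": empty_threads}


# Toy data standing in for a real forum database (hypothetical rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE thread (threadid INTEGER PRIMARY KEY);
    CREATE TABLE post (postid INTEGER PRIMARY KEY, threadid INTEGER);
    INSERT INTO thread (threadid) VALUES (1), (2), (3);
    INSERT INTO post (postid, threadid) VALUES (10, 1), (11, 1), (12, 2), (13, 99);
""")

result = count_orphans(conn)
print(result)
```

In the toy data, post 13 points at a thread that no longer exists and thread 3 has no posts, so each count is 1. If the counts from a real database are close to the import discrepancy, that would support the orphaned-records explanation.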

As for the browser crash during your import, there's not much we can do when the browser is asked to do a rapidly-refreshing task with long-loading pages like this. Browsers just tend to fall over from time to time. We have designed the system such that if the browser does have a crisis, you should be able to just hit refresh on the page where it died and it will continue where it left off.
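The "continue where it left off" behaviour can be pictured as a checkpointed batch loop. The following is a generic sketch of that pattern, not the importer's actual code: `run_import`, `flaky_import`, and the `checkpoint` dictionary are hypothetical names, and a real importer would persist its checkpoint somewhere durable rather than in memory.

```python
from typing import Callable, Iterable


def run_import(
    source_ids: Iterable[int],
    checkpoint: dict,
    import_one: Callable[[int], None],
    batch_size: int = 100,
) -> bool:
    """Import one batch of records after the checkpoint; return True when done."""
    last_done = checkpoint.get("last_id", 0)
    pending = sorted(i for i in source_ids if i > last_done)
    for record_id in pending[:batch_size]:
        import_one(record_id)
        # Record progress only after the record was imported successfully.
        checkpoint["last_id"] = record_id
    return len(pending) <= batch_size


imported: list[int] = []
checkpoint: dict = {}
crash_state = {"already_crashed": False}


def flaky_import(record_id: int) -> None:
    """Pretend to import a record, failing once on record 6."""
    if record_id == 6 and not crash_state["already_crashed"]:
        crash_state["already_crashed"] = True
        raise RuntimeError("simulated browser crash")
    imported.append(record_id)


ids = range(1, 11)
finished = False
while not finished:
    try:
        finished = run_import(ids, checkpoint, flaky_import, batch_size=4)
    except RuntimeError:
        pass  # the "hit refresh" step: call again with the same checkpoint

print(imported)
```

Because the checkpoint only advances after a record is imported, the retry resumes at the record that failed, so no record is skipped or imported twice in this sketch.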
 
Thanks for explaining. BTW the statistics on the main pages are different too. Anyway I'm going to try a new import and see if it happens again. Then it's probably what you say it is.

Just to be sure: there is no limit on the size of a post/thread (in number of lines) that the importer can import, right?
 
The only limit is disk space and time.
 