XF 1.5 Search function on new Xenforo install with phpbb3 import

#1
Hi All,

New here, so apologies if there is an obvious answer. I've recently moved our forum (classicroverforum.net) from phpBB3 to XenForo. The upgrade has been mostly successful, but the one main issue left with the forum itself is that the search function doesn't seem to be returning as many results as it should.

To illustrate, if I search for 'lt77' I get 20 results. If I go to the db and run:
Code:
SELECT * FROM `xf_thread` where title like "%LT77%"
I receive 92 results, and that only counts matching thread titles, not the posts inside the threads. So I guess that makes me wonder if the search index is working correctly, or if there is some setting I should be tweaking.

I don't know if it's a coincidence, but when I gave Google the sitemap with ~22000 pages in it, it only indexed about 900, which makes me think there could be something strange going on with the structure. Of course, it could be an unrelated problem...

All help very gratefully received,

Many thanks,

Rich
 
#2
Also:
Code:
SELECT * FROM `xf_search_index` where message like '%lt77%'
returns 686 results, which, if I'm understanding the db structure correctly, would be the correct number of results.
 

Brogan

XenForo moderator
Staff member
#5
I just did a couple of searches for lt77

For titles only it returned 51 results
For titles and post content it returned 65 results

Are all of the threads/forums accessible?
If there is no permission to view, no search results will be returned.

Are you able to find a thread in the database with lt77 in the title which isn't listed in the search results?
 

Mike

XenForo developer
Staff member
#6
Just to clarify, "%lt77%" is not 100% equivalent to "lt77" in the search as the tokenization is word-based. The former would match "lt77a" for example, but that is considered to be a different word. ("lt77*" should still match that particular example).
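To make the difference concrete, here is a rough sketch in Python. It is not how XenForo or MySQL actually tokenizes text; it assumes words are simply runs of letters, digits and underscores, and the helper names (`tokenize`, `like_match`, `word_match`) and sample titles are made up for illustration.

```python
import re

def tokenize(text):
    """Split text into lowercase words (assumed: runs of letters, digits, underscore)."""
    return [w for w in re.split(r"[^0-9a-z_]+", text.lower()) if w]

def like_match(text, needle):
    """Rough analogue of SQL LIKE '%needle%': a plain substring test."""
    return needle.lower() in text.lower()

def word_match(text, term):
    """Rough analogue of a word-based search; a trailing * means prefix match."""
    term = term.lower()
    words = tokenize(text)
    if term.endswith("*"):
        prefix = term[:-1]
        return any(w.startswith(prefix) for w in words)
    return term in words

# Made-up sample titles
titles = ["LT77 gearbox rebuild", "LT77a oil leak", "Fitting an LT77S"]

print([t for t in titles if like_match(t, "lt77")])   # substring: matches all three
print([t for t in titles if word_match(t, "lt77")])   # whole word: matches only the first
print([t for t in titles if word_match(t, "lt77*")])  # prefix: matches all three
```

So a LIKE count is an upper bound on what a whole-word search can return, and the gap depends on how often the term appears inside longer words.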
 
#8
When I search google and your site for lt77 I get 3790 results
Code:
site:classicroverforum.net lt77
However, when I try to visit the first result I get a 500 error
http://www.classicroverforum.net/viewtopic.php?f=1&t=19502
Hi,

Yes - Google still has the results from the phpBB version of the forum. Unfortunately I didn't realise I needed to match the IDs on import, and that opportunity has passed as people are using it. It does however show you the number of results we should be getting for that search phrase, at least in order of magnitude.

Thanks,

Rich
 
#9
Hi Mike,

Thanks for replying. I was just trying to give an approximation of the number of results. While I'm a C#/C++ developer by trade, generally in Qt and MS environments, my PHP/MySQL knowledge isn't that great, I'll admit (although that may be changing in the near future due to an acquisition at work...)

Any ideas gratefully received :)

Rich
 
#10
Hi Brogan,

Thanks for pointing me over here. Definitely the right place to be asking the question, considering the responses I'm getting. The threads are all enabled, so they should be working as far as I can see. There are some sections which are disabled, but that was due to a first aborted import, and the reimported versions are enabled. Could that be important?

Thanks,

Rich.
 

Mike

XenForo developer
Staff member
#11
You don't need to maintain IDs to redirect old URLs, though it does make it simpler. Without it, you need explicit redirection scripts. I haven't tried this one, but it should help: https://xenforo.com/community/resources/redirection-script-for-phpbb-3-0-x-without-seo-urls.2326/
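As a rough idea of what such a redirection script has to do (this is not the linked resource, which I haven't read), here is a sketch in Python. It assumes you have a mapping from old phpBB topic IDs to new XenForo thread IDs available from the import; `thread_map`, `redirect_target`, the new ID 4321 and the `/threads/<id>/` URL shape are all assumptions for illustration.

```python
from urllib.parse import urlparse, parse_qs

def redirect_target(old_url, thread_map, base="http://www.classicroverforum.net"):
    """Map an old viewtopic.php URL to a new thread URL, or None if it can't be mapped."""
    parsed = urlparse(old_url)
    if not parsed.path.endswith("viewtopic.php"):
        return None
    params = parse_qs(parsed.query)
    try:
        old_id = int(params["t"][0])  # phpBB topic ID from the ?t= parameter
    except (KeyError, ValueError):
        return None
    new_id = thread_map.get(old_id)
    if new_id is None:
        return None
    # Assumed URL shape for the new forum; adjust to the real route format.
    return f"{base}/threads/{new_id}/"

# Hypothetical mapping: old topic 19502 became new thread 4321
thread_map = {19502: 4321}
print(redirect_target(
    "http://www.classicroverforum.net/viewtopic.php?f=1&t=19502", thread_map))
```

A real script would answer with a 301 redirect to that target so Google carries the old URLs over rather than hitting an error.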

Ideally, we'd need to see some examples of searches that return(ed) something on phpBB that don't return in XF. I don't know how phpBB's search worked, but out of the box, XF uses MySQL's full text search which isn't great (but it's consistently available), notably due to high minimum word length requirements.

Looking at Google's results, the first page seems to generally match the XF results (as in, what's in Google is returned by XF). I did only get 28 results with full post searching and no results with title searching, so something is going a little odd. I know you've mentioned it, but try rebuilding the search index and choosing to empty the index first. We need to make sure the whole content is indexed to make sure nothing strange is going on.
 
#12
Brilliant - thanks for the tip on redirection! I'll work through that in the next day or so.

I'm going to see if I can run the old forum back up for a day or so just to get some search results. I believe phpBB and XenForo both use MySQL's full text search, so hopefully that will shed some light. I have the reindex running again after deleting the old index as you suggest. I'll let you know what happens :)

Thanks,

Rich
 
#13
Just done the rebuild and it's drastically improved matters: LT77 now returns 117 results. I think that number is still low, but it's a lot better. I'm going to let the masses try it and I'll report back and let you know how we get on. Thanks for your help :)

Rich
 
#15
OK - Immediate feedback is that the number of results is still not where it should be. I feel like it may not be indexing everything for the search correctly.

Can I check whether the indexing operation has created the correct records?

Thanks,

Rich.
 
#16
Interestingly, with this search reindex Google is now indexing 4000 out of about 22000 pages. Could something be wrong with the structure of the forum which is preventing these standard operations from behaving correctly?
 

Mike

XenForo developer
Staff member
#17
Google will take time to reindex pages. I don't think there's anything indicative of a problem. There's no relation between Google and the internal search.

You can see what's indexed in the xf_search_index table. There will be one row for each post as well as an additional row that represents just the thread itself.
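One way to check that is to compare the IDs you expect against what the index holds. The sketch below is Python over made-up data rather than a query against the real tables; it assumes each index row can be identified by a (content type, content ID) pair such as ("post", 101), and `missing_index_rows` is a hypothetical helper.

```python
def missing_index_rows(post_ids, thread_ids, indexed):
    """Return the (content_type, content_id) pairs that should be indexed but are not.

    post_ids / thread_ids: IDs that exist on the forum.
    indexed: pairs found in the search index (assumed shape, see above).
    """
    expected = {("post", pid) for pid in post_ids}
    expected |= {("thread", tid) for tid in thread_ids}  # one extra row per thread
    return sorted(expected - set(indexed))

# Made-up sample data: post 102 never made it into the index
posts = [101, 102, 103]
threads = [7]
indexed = {("post", 101), ("post", 103), ("thread", 7)}

print(missing_index_rows(posts, threads, indexed))  # [('post', 102)]
```

If the result is empty, every post and thread has its row, and any remaining gap in results is down to how words are matched rather than missing records.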

I'd definitely need to see example differences. I saw "M2C-33G" mentioned on your forum and that is likely down to the default 4 character minimum in MySQL and MySQL taking that as 2 separate, 3 character words.
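To show why a term like that finds nothing, here is a small Python sketch of the minimum word length effect. It only approximates MySQL's behaviour: it assumes words are runs of letters, digits and underscores, and that anything shorter than the minimum is skipped at index time. For MyISAM full text indexes the real limit is the `ft_min_word_len` server variable; `indexable_words` is a made-up helper.

```python
import re

MIN_WORD_LEN = 4  # the default minimum mentioned above; the real value is a server setting

def indexable_words(text, min_len=MIN_WORD_LEN):
    """Return the words from text that are long enough to be indexed."""
    words = [w for w in re.split(r"[^0-9a-z_]+", text.lower()) if w]
    return [w for w in words if len(w) >= min_len]

print(indexable_words("M2C-33G"))             # [] -> "m2c" and "33g" are both too short
print(indexable_words("LT77 gearbox"))        # ['lt77', 'gearbox']
print(indexable_words("M2C-33G", min_len=3))  # ['m2c', '33g'] with a lower minimum
```

Lowering the minimum on the server only takes effect for content indexed afterwards, so the full text index has to be rebuilt after changing it.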