Stuart Wright
Well-known member
In my suggestion for a better sidebar system here, I said that I think it's important to display similar threads in every thread.
I probably understated how important I think this is. If, like us, you have a bounce rate of around 70% then you *have got* to do something to keep people at your site once they have found you. Every time you don't, there is a lost opportunity to gain a new member. Big sites often have a huge bounce rate. Most people land on your forum via a search engine directly into a thread, do or don’t find the specific information they want and then click the Back button. If there was a block showing related content (similar threads, showcases, media gallery items etc.) then the visitor would be more likely to click one of those and stay at your site. This is potentially a very useful addition for retaining users, increasing impressions and reducing the bounce rate.
So the question here is how to return the most relevant set of search results based on the thread title.
The only addon I have found which does this is @Daniel Hood 's addon for [bd] Widgets, [XenMods] Similar Threads. I tested this but the search results were too generic and not similar to the threads I was viewing. Daniel is looking into how to make the similar threads search better, and I wanted to put the question out to the development community here at Xenforo, not because I don't think Daniel can do a great job, but because this functionality is so important that it should be in the core software and I think the idea will benefit from as much input as possible.
There is a Similar Threads addon by @AndyB which displays them when creating a new thread, the idea being to avoid posting threads similar to existing ones. He is searching on the thread title only, of course. Andy has included some options which help narrow down the search results:
- Limit search to threads in the same forum
- Exclude specific forums from the search
- Minimum word length - ignore words shorter than a specified number of characters
- A list of common words to exclude from the search
- Specific punctuation characters to ignore
- Support for multibyte characters - I guess this is an option in the search, but I don't know anything about it
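To make those options concrete, here is a rough Python sketch of how filters like these might be applied to a thread title before searching. The option names and values are made up for illustration and are not the addon's actual settings or code.

```python
# Hypothetical option values; the real addon's settings and defaults may differ.
COMMON_WORDS = {"what", "is", "the", "a", "an", "of", "for", "to", "in", "best"}
IGNORED_PUNCTUATION = "?!.,:;()[]\"'"
MIN_WORD_LENGTH = 3


def title_keywords(title, min_length=MIN_WORD_LENGTH, common_words=COMMON_WORDS):
    """Return the lowercase search terms left after applying the filters."""
    # Replace each ignored punctuation character with a space.
    cleaned = title.translate({ord(ch): " " for ch in IGNORED_PUNCTUATION})
    words = cleaned.lower().split()
    # Drop words that are too short or that appear in the common-word list.
    return [w for w in words if len(w) >= min_length and w not in common_words]


# With a minimum length of 3, nothing survives from this title.
print(title_keywords("What is the best LG TV?"))
# With the length filter effectively off, the two useful terms remain.
print(title_keywords("What is the best LG TV?", min_length=1))
```

Python's `str.lower()` and `str.split()` work on Unicode text, so multibyte characters need no special handling in this sketch; a database-backed search may behave differently.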
I'd like to run through my thoughts and then invite your feedback.
First, how would the search be influenced by the presence of Xenforo's Enhanced Search addon? I'm guessing a fair bit of added functionality is available in Elastic, but I'm unfamiliar with what this is, so I'm going to speak generally and assume that the search could be changed to take advantage of Elastic if it is installed.
Keyword matching
I have a problem with excluding words based on the number of characters. If, for example, we have a title of 'What is the best LG TV?' the two letter words LG and TV are the most important search terms for finding a similar thread.
But obviously it's important to give less weight to common words. This document http://www.elasticsearch.org/guide/...nce/current/query-dsl-common-terms-query.html implies that with Elastic, we can nail this. If Elastic figures out the common terms itself based on what's in the index, and then allows us to give them less weighting, then that's awesome. But if the Xenforo installation does not include Elastic, then I guess a list of common words would need to be entered in order to try to achieve a similar result.
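As a rough illustration of the "work out the common words from the index itself" idea, here is a small Python sketch that uses inverse document frequency: a word that appears in many titles gets a low weight, a rare one gets a high weight. This is a stand-in technique for discussion, not how Elastic implements its query, and the sample titles are invented.

```python
import math
from collections import Counter


def words(title):
    """Lowercase the title and split it into a set of words."""
    return set(title.lower().replace("?", " ").split())


def idf_weights(titles):
    """Weight each word by how rare it is across the given thread titles."""
    doc_freq = Counter()
    for title in titles:
        doc_freq.update(words(title))
    total = len(titles)
    # A word found in every title gets weight 0; rarer words score higher.
    return {w: math.log(total / df) for w, df in doc_freq.items()}


def similarity(title_a, title_b, weights):
    """Sum the weights of the words the two titles share."""
    return sum(weights.get(w, 0.0) for w in words(title_a) & words(title_b))


# Invented sample titles standing in for the search index.
titles = [
    "What is the best LG TV?",
    "What is the best holiday destination?",
    "What is the best phone?",
    "LG TV firmware update",
    "What is the best laptop?",
]
weights = idf_weights(titles)
for other in titles[1:]:
    print(round(similarity(titles[0], other, weights), 3), other)
```

With these titles, the two shared words "LG" and "TV" outweigh the four shared words "What is the best", without any word-length rule or hand-entered list.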
Forum
I think the weighting of the search results should be influenced by the forums they are in. So threads in the same forum should be given the most preference, then those in a child forum, sibling forum and parent forum. In this order? In any order?
Once the above is in place, I can't imagine a scenario where I would want to exclude the results from a specific forum. If I was looking for a similar thread to one in the OLED TVs forum, I can't imagine I would want any results from the Holidays forum, but if preference is given to threads based on the forum then hopefully that won't happen.
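One way to picture this is a multiplier chosen by how the candidate thread's forum relates to the current one. The Python sketch below assumes a made-up forum tree and made-up boost values; both the order and the numbers are exactly the part that is open for debate.

```python
# Hypothetical forum tree: forum id -> parent id (None for a top-level forum).
PARENT = {
    "tvs": None,
    "oled_tvs": "tvs",
    "lcd_tvs": "tvs",
    "oled_calibration": "oled_tvs",
    "holidays": None,
}

# Hypothetical multipliers applied to a thread's search score.
BOOST = {"same": 2.0, "child": 1.6, "sibling": 1.4, "parent": 1.2, "other": 1.0}


def forum_relation(current, candidate, parent=PARENT):
    """Describe how the candidate forum relates to the current forum."""
    if candidate == current:
        return "same"
    if parent.get(candidate) == current:
        return "child"
    current_parent = parent.get(current)
    # Top-level forums share no real parent, so they do not count as siblings.
    if current_parent is not None and parent.get(candidate) == current_parent:
        return "sibling"
    if current_parent == candidate:
        return "parent"
    return "other"


for forum in PARENT:
    relation = forum_relation("oled_tvs", forum)
    print(forum, relation, BOOST[relation])
```

Because unrelated forums keep a multiplier of 1.0 rather than being removed, a Holidays thread could still appear, but only if nothing closer matches.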
Characters
With regard to punctuation characters, the only one I can think of that would make any difference is the question mark. I'm thinking that if the thread is a question, then we'd want to match similar questions.
Thread date
Should any preference be given to more recent threads?
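If the answer is yes, one possible approach is a gentle decay that never drops below a floor, so an old but highly relevant thread is not buried. This Python sketch assumes a one-year half-life and a floor of 0.5; both numbers are arbitrary placeholders.

```python
from datetime import datetime, timedelta


def recency_multiplier(post_date, now, half_life_days=365.0, floor=0.5):
    """Return a score multiplier between floor and 1.0 based on thread age."""
    age_days = max((now - post_date).total_seconds() / 86400.0, 0.0)
    # The decaying part halves every half_life_days.
    decay = 0.5 ** (age_days / half_life_days)
    return floor + (1.0 - floor) * decay


now = datetime(2014, 6, 1)
for days in (0, 365, 3650):
    print(days, round(recency_multiplier(now - timedelta(days=days), now), 3))
```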
Prefix
If the thread has a prefix, then some preference should be given to threads with the same prefix.
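A sketch of how the prefix preference, along with the question-mark idea from the Characters section, might be layered on top of whatever score the title search produces. The `Thread` class, the field names and the boost values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Thread:
    title: str
    prefix: Optional[str]
    keyword_score: float  # assumed to come from the title search itself


def is_question(title):
    """Treat a title ending in a question mark as a question."""
    return title.rstrip().endswith("?")


def adjusted_score(current, candidate, prefix_boost=1.25, question_boost=1.1):
    """Raise the candidate's score when it shares a prefix or is also a question."""
    score = candidate.keyword_score
    if current.prefix is not None and candidate.prefix == current.prefix:
        score *= prefix_boost
    if is_question(current.title) and is_question(candidate.title):
        score *= question_boost
    return score


current = Thread("What is the best LG TV?", "Question", 0.0)
candidates = [
    Thread("LG TV firmware update", "News", 1.0),
    Thread("Which LG TV should I buy?", "Question", 1.0),
]
ranked = sorted(candidates, key=lambda t: adjusted_score(current, t), reverse=True)
for thread in ranked:
    print(round(adjusted_score(current, thread), 3), thread.title)
```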
Is there anything else which would help find the best set of search results?
Any input from you?
Thanks