ES 2.2 Are similar threads generated just from thread titles or all content?

Kilt

Active member
Just migrated from VB and wondering whether XFES, when generating similar threads, is looking just at titles, or all words in an OP, or all words in a thread, or something else.
 

Having now used XFES Similar Threads on my site for three months, I believe the feature looks at more than thread titles.

For example, we have a thread titled solely "Opitima Number 3", which is the name of a boat. The posts discuss whether Rustoleum or EZ-Poxy paints are effective for painting the boat. Similar Threads turns up other threads whose posts contain the words "Rustoleum" and "EZ-Poxy".

I've also noticed that if all other similarities seem to fail, Similar Threads will offer up completely irrelevant threads simply because they were started by the same person as the OP.

All in all, Similar Threads is a great feature to promote searches and to grab readers for seriatim thread surfing, and I recommend it highly.
 
The similar threads feature finds similarities by comparing both thread titles and first post message content.

Thanks for this amplification, Jeremy, but I'd like to flesh out how Similar Threads works in a bit more detail based on my experimentation.

First, to coin some terminology, I'll call the thread one is creating or looking at the "target" thread, and all the other threads that Similar Threads looks at are "candidate" threads. The candidate threads seem to be all the other threads on the entire site.

What I seem to see is that Similar Threads will look at the words in the title and first message of the target thread and look for matches in the words in the titles and first messages in all the candidate threads. That's quite a lot of work! Is all this consistent with what you are saying?
 
Yeah, basically. It's powered by Elasticsearch (in particular an MLT, or "More Like This", query), so it's not particularly slow. In very basic terms, Elasticsearch selects significant terms (keywords) from the title and message using data from the index, and returns other documents (threads) containing those terms. Not too different from a regular search, really, aside from the algorithmic selection of keywords.
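
For anyone curious what an MLT query roughly looks like, here is a minimal sketch in Python that only builds the request body. The index name `threads`, the field names `title` and `message`, the function name, and the tuning values are all assumptions for illustration; this is not the query XFES actually sends, and its real index mapping may differ. Only `more_like_this`, `fields`, `like`, `min_term_freq`, `min_doc_freq` and `max_query_terms` are standard Elasticsearch query DSL names.

```python
import json


def build_similar_threads_query(thread_id, index="threads",
                                fields=("title", "message"), size=5):
    """Build an Elasticsearch more_like_this request body (illustrative only).

    The index and field names are hypothetical; XFES may use different ones.
    """
    return {
        "size": size,  # how many similar threads to return
        "query": {
            "more_like_this": {
                # Fields whose text is analysed for significant terms.
                "fields": list(fields),
                # The "target" thread, referenced as an already-indexed document.
                "like": [{"_index": index, "_id": str(thread_id)}],
                # Illustrative tuning values, not XFES settings:
                "min_term_freq": 1,     # a term must occur at least once in the target
                "min_doc_freq": 2,      # ignore terms found in fewer than 2 documents
                "max_query_terms": 25,  # cap on the number of selected keywords
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent to the index's _search endpoint.
    print(json.dumps(build_similar_threads_query(12345), indent=2))
```

Elasticsearch does the keyword selection itself: it picks the most distinctive terms from the target's fields and runs them as an ordinary search against the candidate threads, which fits the Rustoleum and EZ-Poxy example earlier in the thread.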
 