ES vs. Sphinx

Gladius

Any direct comparisons so far? From what I've read, Sphinx is much more efficient in several (all?) aspects than ES so it makes me wonder why XF's gone with ES...
 
I don't know, most of us are rather sensitive about how much CPU and RAM resources are wasted and ES uses up a lot more of both from the results I've seen posted here (considerably more HD space as well). For our uses, the ES benefits are secondary to Sphinx's so I'd certainly prefer Sphinx integration over ES.
 
I don't know, most of us are rather sensitive about how much CPU and RAM resources are wasted and ES uses up a lot more of both from the results I've seen posted here (considerably more HD space as well). For our uses, the ES benefits are secondary to Sphinx's so I'd certainly prefer Sphinx integration over ES.


You will most likely want to have a private developer update Sphinx for you then.

XenForo chose Elasticsearch over Sphinx for good reason :)
 
Because it was easier? Honest question, because I just don't see how ES would beat Sphinx for the majority of big board users.

If you can't see the benefits then you need to do more research ;)

To list off just a handful.

ES has out-of-the-box support for clusters (which big boards are more likely to use). This is possible in Sphinx, however it is considerably more complex to set up.

Sphinx indexes documents with a large delay (deltas). ES has a default delay of one second.

ES has built-in "more like this" searches (document comparison). Sphinx doesn't.

ES is easier to set up and get running for end users.

ES is easier to integrate with as it uses JSON.
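To make the JSON point concrete, here is a minimal sketch of the two request bodies those items refer to: the refresh interval behind the "one second" delay, and a "more like this" query. It only builds the JSON and does not talk to a server. The field names `title` and `message` are assumptions for illustration, and the `like` parameter is the spelling used by recent Elasticsearch releases (older releases used `like_text`), so check it against the version you run.

```python
import json


def refresh_settings(interval="1s"):
    """Body for an index settings update; controls how soon new docs are searchable."""
    return {"index": {"refresh_interval": interval}}


def more_like_this_query(text, fields, min_term_freq=1):
    """Body for a search request that finds documents similar to `text`."""
    return {
        "query": {
            "more_like_this": {
                "fields": list(fields),          # hypothetical field names
                "like": text,                    # `like_text` on older releases
                "min_term_freq": min_term_freq,  # ignore terms rarer than this in `text`
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(refresh_settings(), indent=2))
    print(json.dumps(more_like_this_query("sphinx vs elasticsearch",
                                          ["title", "message"]), indent=2))
```

Either body would be sent over HTTP to the index's settings or search endpoint respectively.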
 
I'm not sure how doing more research would change the facts that I've stated in this thread (as reported by admins who've tried both Sphinx and ES), unless you dispute their validity? I get it that ES comes with more extras that may or may not be useful to some, but the core aspect that big board admins look at is performance in conjunction with resource utilization. Everything else is secondary to that under normal circumstances.
 
Because all you're looking at is resource usage and not functionality.

But as said, I would suggest if you want to use sphinx, you look at hiring a developer to update it for you.

Obviously only Kier or Mike can decide and comment on this, however I would be confident in saying XenForo won't be releasing a Sphinx search solution anytime soon. Their current search solution for big boards works, and works fantastically; their resources will be better spent developing the core product.
 
Either one would have been fine really... As far as the argument that Sphinx results are delayed, that's only the case if you set it up that way.

Realtime indexes work just fine and are even simpler to use than a JSON REST request: http://sphinxsearch.com/docs/current.html#rt-indexes

You can simply use the normal MySQL protocol to do a DELETE, INSERT or UPDATE to the Sphinx index.
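A rough sketch of what that looks like in practice: the statements below are plain SQL-style strings that any MySQL client can send to searchd's SphinxQL listener (conventionally port 9306). The index name `rt_posts` and the columns `title` and `message` are made up for illustration, and the `%s` placeholders assume a client library that interpolates parameters on the client side. Note that `REPLACE INTO` is the usual way to rewrite a document's text, since `UPDATE` in SphinxQL changes attributes only.

```python
import re

# Only allow plain identifiers for index/column names, since they are
# interpolated directly into the statement text.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _ident(name):
    if not _IDENT.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def rt_replace(index, doc_id, fields):
    """Build a REPLACE statement that inserts or overwrites one document."""
    cols = ["id"] + [_ident(c) for c in fields]
    placeholders = ", ".join(["%s"] * len(cols))
    sql = f"REPLACE INTO {_ident(index)} ({', '.join(cols)}) VALUES ({placeholders})"
    return sql, [doc_id] + list(fields.values())


def rt_delete(index, doc_id):
    """Build a DELETE statement that removes one document by id."""
    return f"DELETE FROM {_ident(index)} WHERE id = %s", [doc_id]


if __name__ == "__main__":
    # `rt_posts`, `title` and `message` are hypothetical names.
    print(rt_replace("rt_posts", 42, {"title": "ES vs. Sphinx", "message": "hello"}))
    print(rt_delete("rt_posts", 42))
```

Each `(sql, params)` pair can be handed to a MySQL client cursor's `execute` call once it is connected to the SphinxQL port.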

I'm not opposed to Elastic Search, just wanted to clarify some incorrect information being spread.
 
If I had to build it from scratch as of today, I probably would do it with Sphinx. Not because I think Sphinx is necessarily better, but only because I have more experience with it, so my development learning curve would be shorter.

As far as which I consider to be fundamentally better... I can't really give an educated opinion since I just don't have enough experience with Elastic Search (as a developer or as a user) to make an actual determination.

That being said, for XenForo, I plan on using Elastic Search since I don't have to build it or maintain it. And once I have a live site using it in the real world (and I've done whatever tuning I need to do), I'll be able to have a better opinion on it vs. Sphinx.

Bottom line is something based on Elastic Search OR Sphinx is going to be infinitely better than a FULLTEXT search engine based on MySQL.
 
Playing with a developer snapshot of Sphinx that has a prototype of real-time indexing. Have no idea if this will be released to the public, but I can't imagine it wouldn't be.
 
Thanks digitalpoint, I always appreciate your expert input. For your site, would you pick ES over Sphinx?
 
Okay... I have a *little* more experience with ES now... And the more I play with it, the more I like it. It does some things that I wish Sphinx would do... for example its ability to auto-shard and replicate to other nodes (servers) is pretty seamless and awesome (it's actually the easiest system of any sort I've dealt with for sharding/replicating... it more or less "magically" works just because it's on the same local network as other nodes... add a server, and the other nodes instantly distribute data to the new node as soon as it's online).

I didn't like how it was treating punctuation as part of a word by default, but that was easily enough fixed: http://xenforo.com/community/resources/change-analyzer-for-enhanced-search.643/

I also think indexing 1 and 2 letter words like it does by default is a rather pointless waste of disk space/memory... but that was also easily fixed in the config file.
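For anyone wanting to try the same thing, one way to skip very short words is a `length` token filter in the index's analysis settings. This is only a sketch and not necessarily the config change described above: the filter name `min_length_3` and analyzer name `forum_text` are invented for the example, while `standard`, `lowercase` and `length` are built-in Elasticsearch components.

```python
import json


def short_word_analysis(min_chars=3):
    """Index settings that drop tokens shorter than `min_chars` at index time."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # hypothetical filter name; `length` is the built-in filter type
                    "min_length_3": {"type": "length", "min": min_chars}
                },
                "analyzer": {
                    # hypothetical analyzer name
                    "forum_text": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "min_length_3"],
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(short_word_analysis(), indent=2))
```

The body would be supplied when creating the index, and the analyzer then has to be referenced from the field mappings before it has any effect.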

I haven't run benchmarks or loaded a ton of data into it yet... and also don't know how the memory requirements compare to Sphinx now that I stopped indexing words < 3 characters.

I ended up writing a module so I can monitor the ElasticSearch cluster from the XenForo admin home...

[Attached image: Image 2012.04.25 6:41:12 PM.png]


If the benchmarks end up being as good as the rest of it, I'll be sold on ES over Sphinx.
 
So I ended up making that little status block for ElasticSearch so you can click the nodes to get detailed info about that node... Might help with finding config issues fairly easily...

I ended up writing a module so I can monitor the ElasticSearch cluster from the XenForo admin home...



If the benchmarks end up being as good as the rest of it, I'll be sold on ES over Sphinx.


Would love to have these :o Any chance of releasing them ?
 