Any news on the Big Board search app yet, Kier?

I was under the impression that Elasticsearch/Lucene indexes consume a lot of memory.
Seeing as I've never used either of them on test data, I will defer that question to those who know much more about both products than I do... *cough* Shawn? Care to take a stab? *cough*
 
Sphinx is lighter weight than Elasticsearch (though not by much); however, Sphinx excels at simple queries, and its reliance on incremental indexes means there is a dead time between an item being posted and it being picked up by the index.

Elasticsearch gets around this, as items are indexed in real time, and it is much more flexible when it comes to advanced queries. Elasticsearch is also much easier to set up for distributed resources, and it allows similar documents to be compared and retrieved, which, afaik, is something Sphinx simply cannot do. I recall various other titbits of information, but it came down to this:

Elasticsearch is simply newer and more advanced: at the cost of a few more resources you gain a lot more functionality and deployability, not only as a server resource but as an integration one too.
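To give an idea of the "similar documents" bit: on current Elasticsearch versions it is exposed as a more_like_this query. A rough sketch (the host, index and field names here are just placeholders, not anything from the add-on):

```python
import json
import urllib.request

# Placeholder host, index and field name -- adjust to your own mapping.
ES_URL = "http://localhost:9200/posts/_search"

# more_like_this retrieves documents whose text resembles the given input --
# the "similar documents" feature mentioned above.
body = {
    "query": {
        "more_like_this": {
            "fields": ["message"],
            "like": "elasticsearch index memory usage",
            "min_term_freq": 1,
            "min_doc_freq": 1,
        }
    },
    "size": 10,
}

req = urllib.request.Request(
    ES_URL,
    data=json.dumps(body).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    for hit in json.load(resp)["hits"]["hits"]:
        print(hit["_score"], hit["_id"])
```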
 
Creepy.

About 2 minutes after making this post, I got this email:

My name is *snip* and I am your contact here at Sphinx. I hope you have experienced continued growth with your business and that Sphinx has played a successful role. Let me know what the current status of your Sphinx applications are, if you have had a chance to use our latest stable release of 2.0.3, and what we can do to help. Depending on your growth it may be time to look at one of our support / consulting offerings to keep or get your performance where it needs to be.
http://sphinxsearch.com/services/support/
http://sphinxsearch.com/services/consulting/
I look forward to working with you and appreciate your support of Sphinx.
Regards,
 
One thing I have noticed (and that may be due to pre-loading) is that although I have allocated enough memory to keep the entire index in RAM, Elasticsearch is still hitting the disk for initial searches of words; subsequent searches of the same words hit the index in RAM. To help pre-load the index into RAM, I now have a script running every 2 seconds searching for random words. It seems to work a treat. Even though I don't strictly need to do this, it bothers me knowing I have more RAM than the index consumes on disk, and I want to use it, considering I have allocated it.
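The gist is just a loop like this (a rough sketch rather than my exact script; the host, index and field names are placeholders):

```python
import json
import random
import time
import urllib.error
import urllib.request

# Placeholder host, index and field name -- adjust to your own setup.
ES_URL = "http://localhost:9200/posts/_search"

# Assumes a system word list exists (most Linux distros ship one).
with open("/usr/share/dict/words") as f:
    words = [w.strip() for w in f if w.strip()]

while True:
    word = random.choice(words)
    # A plain match query; size 0 because only the lookup matters, not the hits.
    body = {"query": {"match": {"message": word}}, "size": 0}
    req = urllib.request.Request(
        ES_URL,
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )
    try:
        urllib.request.urlopen(req).read()
    except urllib.error.URLError:
        pass  # a failed warm-up search is harmless, keep looping
    time.sleep(2)  # one random-word search every 2 seconds
```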
 
Are you part of some prerelease group? I can't find the app in the customer / purchase section of the site.
 

Could you not just run a dictionary through the search? Alternatively, have you tried loading the files into a ramdisk?
 
I am running a dictionary through the index. I much prefer that to messing around with a ramdisk; I don't want to have to copy stuff between the disk and the ramdisk on start/stop, etc. The script I have running is doing its job of pre-loading the index, much like Percona's option for pre-loading the cache (innodb_buffer_pool_restore_at_startup).
 

Well, it was more a case of "how low can we get the ms on a search?" as opposed to anything else :)

Unfortunately I only have my production server running at the moment, otherwise I'd test it myself.
 
The point is that some searches are instant (relatively speaking), but some can take a second due to the index being loaded off the disk. Even though I have mmapfs set as the default store type, it still needs the index pre-loaded into RAM. Having a script searching for random English words does that for me.

Edit:

I have my search configured to return up to 1000 matches. If the search word/term is pre-loaded, it is instant and uncannily fast...
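If anyone wants to measure this themselves, Elasticsearch reports a took time (in milliseconds) with every search response, which makes the cold-versus-warm difference easy to see. A rough sketch of a capped query, again with placeholder host, index and field names:

```python
import json
import urllib.request

# Placeholder host, index and field name -- adjust to your own setup.
ES_URL = "http://localhost:9200/posts/_search"

body = {
    "query": {"match": {"message": "elasticsearch"}},
    "size": 1000,  # return up to 1000 matches, as configured above
}

req = urllib.request.Request(
    ES_URL,
    data=json.dumps(body).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    result = json.load(resp)
    # 'took' is the server-side search time in milliseconds -- handy for
    # comparing a cold (disk-bound) search against a warm (in-RAM) one.
    print(f"{result['took']} ms, {len(result['hits']['hits'])} hits")
```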
 
Ah, nice.

Actually, I have noticed this: some search terms take 1-2 seconds to return, while others are instant. Care to share your script?
 
I don't understand that, other than installing the new add-on, I need to install the Elasticsearch software too.
Can someone explain this point?
 
Is there any chance we might get stemming support for other languages in the foreseeable future? That would be a really cool addition (although I'm definitely gonna use this if I migrate my big board to XF, with or without stemming).
 