That doesn't handle Unicode, it will give lots of false positives with complex BBCode, and you really don't want to be recounting every word in an entire thread on every page view.
My sites have multiple threads with over 1 million words counted by threadmarks.
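
To make those three points concrete, here is a rough Python sketch, not the code any add-on actually uses. It assumes a simplified tag pattern (a real BBCode parser handles nesting, code blocks, and literal brackets such as "[sic]", which this regex would wrongly strip), relies on Python's Unicode-aware `\w`, and keeps a running per-thread total that is adjusted when a post is saved instead of recounted on page view. The names `count_words` and `ThreadWordCounts` are made up for illustration.

```python
import re

# Assumption: tags look like [b], [/b], [url=...], [quote="name"].
# This is a simplification; a real parser is needed for complex BBCode.
BBCODE_TAG = re.compile(r"\[/?[A-Za-z][A-Za-z0-9]*(?:=[^\]]*)?\]")

# For str patterns in Python 3, \w matches Unicode letters and digits,
# so non-Latin scripts are counted too. Apostrophes and hyphens inside
# a word keep it as one word ("don't", "well-known").
WORD = re.compile(r"\w+(?:['’\-]\w+)*")


def count_words(text: str) -> int:
    """Count words in a post after removing BBCode tags."""
    stripped = BBCODE_TAG.sub(" ", text)
    return len(WORD.findall(stripped))


class ThreadWordCounts:
    """Hypothetical in-memory cache; a forum would store these in the database."""

    def __init__(self) -> None:
        self._post_counts: dict[tuple[int, int], int] = {}
        self._thread_totals: dict[int, int] = {}

    def save_post(self, thread_id: int, post_id: int, text: str) -> None:
        # Count once at save time and apply only the difference to the total.
        new = count_words(text)
        old = self._post_counts.get((thread_id, post_id), 0)
        self._post_counts[(thread_id, post_id)] = new
        self._thread_totals[thread_id] = (
            self._thread_totals.get(thread_id, 0) + new - old
        )

    def total(self, thread_id: int) -> int:
        # Page views read the stored total; nothing is recounted here.
        return self._thread_totals.get(thread_id, 0)
```

With this shape, a page view is a single lookup no matter how long the thread is, and editing a post only costs a recount of that one post.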