If you would like to allow only letters of the modern English alphabet, digits, and spaces in usernames, you can use the "Username match regular expression" field in the admin control panel:
Admin control panel -> Setup -> Options -> User registration
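
As a rough sketch of what such a pattern could look like: a character class restricted to ASCII letters, digits, and the space character. In the admin field this would probably be entered as something like /^[a-zA-Z0-9 ]+$/ (that assumes the field takes a delimited PCRE-style pattern, which you should check against your installation). The Python snippet below is only a way to try the same character class against sample names before saving it; the function name and test names are made up for illustration.

```python
import re

# Hypothetical pattern: ASCII letters, digits, and the space character only.
# fullmatch() is used so the whole name must match, not just a prefix.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9 ]+")


def is_allowed_username(name: str) -> bool:
    """Return True if every character is an ASCII letter, digit, or space."""
    return USERNAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    # Sample names chosen for illustration only.
    for candidate in ["John Smith 42", "Jörg", "user_name", ""]:
        print(repr(candidate), is_allowed_username(candidate))
```

Note that this rejects accented letters, underscores, and empty names, so it is stricter than many forums want by default.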
What is the easiest way to match non-ASCII characters in a regex? I would like to match all words individually in an input string, but the language may not be English, so I will need to match thing...