[should i bother to] research this:
I'm using the "tr" command to filter a random garbage string, but why does OS X have issues with locale in this context?
The BSD version of tr that ships with Mac OS X (Darwin), as of 10.5.6, exits
if it encounters a byte such as the copyright symbol (x'A9', ©), the left double quote (x'93', “),
or any of B8, D1, C0, CF, C2, D8, D4, with the message:
tr: Illegal byte sequence
and an exit status of 1.
This is easily corrected with:
export LC_ALL=C
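For anyone who would rather not export it for the whole session, here is a minimal sketch of the same fix scoped to a single tr call. The sample input is made up for illustration: printf emits the two bytes mentioned above in octal (\251 is x'A9', \223 is x'93'). On their own those bytes are not valid UTF-8, which is presumably why a tr running under en_US.UTF-8 rejects them, whereas in the C locale every byte counts as one character.

```shell
# Sample input (hypothetical): ASCII text with two stray high bytes mixed in.
# \251 = x'A9' and \223 = x'93', written in octal for printf.
# LC_ALL=C is set for this one tr process only; the rest of the shell is untouched.
filtered=$(printf 'abc\251def\223ghi\n' | LC_ALL=C tr -cd '[:print:]\n')

# In the C locale [:print:] covers only printable ASCII, so both high bytes are dropped.
printf '%s\n' "$filtered"   # prints "abcdefghi"
```

Prefixing the assignment to the command keeps the override local, so everything else in the session still runs under the usual UTF-8 locale.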
So...okay, I do that for a single instantiation of a Terminal session, or prefix it in a script; either way works fine. But why can't I just be lazy and set this globally for the system--why do I even have to bother? In other words, if my system locale is "en_US" anyway, isn't that the same as what "C" uses? (or I suppose "Objective C" since this is Apple)
Bah humbug. Sure don't seem like no Standard Distribution on here, Berkeley or otherwise.
Addendum
Okay I may have been a bit harsh in my ignorance <patting MacBook>
Apparently it is the status quo on POSIX systems [links below] that the environment variable "LC_ALL" is intentionally left blank, presumably so that the underlying system locale gets inherited as a default. In a new Terminal, this is what I got:
$ locale
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=
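As a rough sketch of why the blank LC_ALL matters: when LC_ALL is set, it takes precedence over both LANG and the individual LC_* categories, so setting it for one command is enough to change what that command sees. The variable name ctype_line below is just for illustration, and the exact quoting in the locale output may vary between systems, so the quotes are stripped before printing.

```shell
# Run locale with LANG left at a UTF-8 value but LC_ALL forced to C,
# then pull out the LC_CTYPE line to see which setting wins.
ctype_line=$(LANG=en_US.UTF-8 LC_ALL=C locale | grep '^LC_CTYPE=' | tr -d '"')

# LC_CTYPE should follow LC_ALL here, not LANG.
printf '%s\n' "$ctype_line"
```

With LC_ALL empty, as in the listing above, the categories fall back to LANG instead, which is how they all end up as en_US.UTF-8.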
I still don't fully get why this would be--especially with regard to what I was doing earlier--but oh well, that's probably enough wandering into Arcane Trivia(TM) territory for today.
[links below]
How Programs Set the Locale
http://www.gnu.org/software/libc/manual/html_node/Setting-the-Locale.html#Setting-the-Locale
C Locale
http://docs.oracle.com/cd/E23824_01/html/E26033/glmbx.html
Environment Variables (from the v2 spec--I'm not gonna register just to read the current v3; and anyway Oracle did it for me)
http://pubs.opengroup.org/onlinepubs/007908799/xbd/envvar.html