Found some good info on Unicode in regular expressions: Unicode Regular Expressions. Particularly some good doco about Unicode Categories and how to indicate them in regular expressions.
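To get a feel for what the categories are, here is a minimal Python sketch. It uses the standard library's `unicodedata.category` rather than regex syntax (regex flavours that support Unicode properties typically write these as `\p{L}`, `\p{Nd}` and so on). The helper name `chars_in_category` and the sample string are made up for illustration.

```python
import unicodedata


def chars_in_category(text, prefix):
    """Return the characters of text whose general category starts with prefix."""
    return [ch for ch in text if unicodedata.category(ch).startswith(prefix)]


sample = "Stra\u00dfe 42 \u20ac"  # "Straße 42 €"

letters = chars_in_category(sample, "L")   # any letter (Lu, Ll, ...)
digits = chars_in_category(sample, "Nd")   # decimal digits
symbols = chars_in_category(sample, "Sc")  # currency symbols

print(letters, digits, symbols)
```

Passing a one-letter prefix such as "L" matches the whole major class, while a two-letter code such as "Nd" matches a single category.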
So today I went looking for a font to replace ‘Inconsolata 14’ in my NetBeans IDE because it wasn’t supporting Unicode, and I found Best Unicode Fonts for Programmer, which led me to DejaVu Sans Mono.
While I was at it I changed my Konversation font from Ubuntu Mono 12 to DejaVu Sans Mono too.
My friends on #lobsters also recommended:
- Monaco (Apple font)
- JetBrains Mono
- Fira Mono (13pt)
- IBM Plex Mono
An excellent article on Unicode: Emoji under the hood.
I was getting an error like this:
/etc/cron.daily/etckeeper:
bzr: ERROR: exceptions.UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 34: ordinal not in range(128)

Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 853, in exception_to_return_code
    return the_callable(*args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 1055, in run_bzr
    ret = run(*run_argv)
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 661, in run_argv_aliases
    return self.run_direct(**all_cmd_args)
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 665, in run_direct
    return self._operation.run_simple(*args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/cleanup.py", line 122, in run_simple
    self.cleanups, self.func, *args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/cleanup.py", line 156, in _do_with_cleanups
    result = func(*args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/builtins.py", line 659, in run
    no_recurse, action=action, save=not dry_run)
  File "/usr/lib/python2.6/dist-packages/bzrlib/mutabletree.py", line 50, in tree_write_locked
    return unbound(self, *args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/mutabletree.py", line 521, in smart_add
    for subf in sorted(os.listdir(abspath)):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 34: ordinal not in range(128)

bzr 2.1.4 on python 2.6.5 (Linux-188.8.131.52-rscloud-x86_64-with-Ubuntu-10.04-lucid)
arguments: ['/usr/bin/bzr', 'add', '-q', '.']
encoding: 'ANSI_X3.4-1968', fsenc: 'ANSI_X3.4-1968', lang: None
plugins:
  bzrtools /usr/lib/python2.6/dist-packages/bzrlib/plugins/bzrtools [2.1.0]
  etckeeper /usr/lib/python2.6/dist-packages/bzrlib/plugins/etckeeper [unknown]
  launchpad /usr/lib/python2.6/dist-packages/bzrlib/plugins/launchpad [2.1.4]
  netrc_credential_store /usr/lib/python2.6/dist-packages/bzrlib/plugins/netrc_credential_store [2.1.4]
  news_merge /usr/lib/python2.6/dist-packages/bzrlib/plugins/news_merge [2.1.4]
*** Bazaar has encountered an internal error.
    This probably indicates a bug in Bazaar. You can help us fix it by filing a bug report at
    https://bugs.launchpad.net/bzr/+filebug
    including this traceback and a description of the problem.
etckeeper warning: bzr add failed
Committing to: /etc/
modified apache2/passwd.htdigest
modified apache2/sites-available/svn.jj5.net-ssl
Committed revision 87.
I’ve tried to fix it by adding the following as lines 2 and 3 in /etc/cron.daily/etckeeper:
export LANG=en_AU.UTF-8
export LANGUAGE=en_AU:en
Now I’ll wait a day or two and see if it worked…
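My reading of the traceback is that cron ran bzr with no locale set (note `lang: None` and the ASCII encoding `ANSI_X3.4-1968`), so a file name containing a non-ASCII byte could not be decoded. Here is a minimal sketch of that failure in Python 3 syntax (the original ran on Python 2.6). The file name bytes are hypothetical; I don't know which file under /etc actually triggered it.

```python
# Hypothetical file name bytes, as a directory listing might return them.
# 0xc4 0x8d is the UTF-8 encoding of "č" (U+010D).
raw_name = b"\xc4\x8d.conf"

# With an ASCII codec, any byte >= 0x80 is rejected.
try:
    raw_name.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# With a UTF-8 codec the same bytes decode cleanly.
utf8_name = raw_name.decode("utf-8")

print(ascii_error)
print(utf8_name)
```

The first line printed has the same shape as the bzr error: the 'ascii' codec can't decode byte 0xc4 because the ordinal is not in range(128).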
Reading about The utf8mb4 Character Set (4-Byte UTF-8 Unicode Encoding).
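The "4-byte" part matters because UTF-8 uses between one and four bytes per code point, and MySQL's older utf8 character set (utf8mb3) stores at most three. A small Python sketch of the byte lengths, with sample characters chosen just for illustration and a hypothetical helper `utf8_len`:

```python
def utf8_len(ch):
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


samples = {
    "A": "A",                   # U+0041, ASCII
    "e-acute": "\u00e9",        # U+00E9
    "euro": "\u20ac",           # U+20AC
    "grinning": "\U0001F600",   # U+1F600, outside the Basic Multilingual Plane
}

lengths = {name: utf8_len(ch) for name, ch in samples.items()}
print(lengths)
```

Anything outside the Basic Multilingual Plane, which includes most emoji, lands in the four-byte group and so needs utf8mb4.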
Bumped into this old article by Joel Spolsky: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!).
Today I learned about Punycode, which is a system for reversibly encoding Unicode in ASCII.
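Python ships a `punycode` codec, so the round trip is easy to try. This is a small sketch, not a full IDNA implementation; the label "bücher" is just the usual textbook example, and `.example` is a reserved placeholder domain.

```python
label = "b\u00fccher"  # "bücher"

# Encode the Unicode label to ASCII-only Punycode, then decode it back.
encoded = label.encode("punycode")
decoded = encoded.decode("punycode")

# In domain names each Punycode label gets an "xn--" prefix;
# the built-in "idna" codec does that for you.
domain = "b\u00fccher.example".encode("idna")

print(encoded, decoded, domain)
```

The ASCII letters are kept in order at the front and the non-ASCII ones are packed into the suffix after the last hyphen, which is what makes the encoding reversible.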