Tuesday, July 20, 2010

Wednesday, May 5, 2010

Kuten code to Unicode

So how can I get Unicode from this Ku-Ten code?
16-01

In this case, Ku is 16 and Ten is 01. The Ku-Ten system represents characters in a 94 by 94 matrix.
The Ku-Ten code is not identical to the JIS code; you need a mapping.
JIS code avoids the bytes 0x00-0x20 in the encoding, so the mapping is to add 0x20 to Ku and to Ten.
For 16-01, the JIS code is not 3601, because Ku-Ten is usually written in decimal. I always make this mistake and end up with the wrong character.
So the first step is to convert the pair to hex, 10-01, and then add 0x20 to each byte. The result, 3021, is the JIS code value.

perl -e 'printf("%x%x\n", 16+32, 1+32)'
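The same arithmetic as a small Python function, as a minimal sketch. The name kuten_to_jis is made up for illustration, and the range check assumes the 94 by 94 matrix described above.

```python
def kuten_to_jis(ku: int, ten: int) -> int:
    """Return the two-byte JIS code for a decimal Ku-Ten pair."""
    # Ku and Ten each index one side of the 94 by 94 matrix.
    if not (1 <= ku <= 94 and 1 <= ten <= 94):
        raise ValueError(f"Ku and Ten must be in 1..94, got {ku}-{ten}")
    # Adding 0x20 moves each byte out of the 0x00-0x20 range.
    return ((ku + 0x20) << 8) | (ten + 0x20)


if __name__ == "__main__":
    # Ku-Ten 16-01, written in decimal
    print(f"{kuten_to_jis(16, 1):04X}")  # prints 3021
```

Taking Ku and Ten as integers, not as a string, keeps the decimal-versus-hex confusion out of the function.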

Once you have the JIS code, a mapping to Unicode is available (JIS X 0213 to Unicode).
http://x0213.org/codetable/iso-2022-jp-2004-std.txt

3-3021 U+4E9C #

The prefix '3' identifies the plane; it is followed by the JIS code and then the Unicode code point.
At the top of the above page, there is an explanation of the planes.

## 0-XX ISO/IEC 646 IRV (designated by '1b 28 42')
## 3-XXXX JIS X 0213:2004 plane 1 (designated by '1b 24 28 51')
## 4-XXXX JIS X 0213:2000 plane 2 (designated by '1b 24 28 50')

Somehow, '3' means plane 1 in this table.
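Reading one line of the table could look roughly like this in Python. This is a sketch that only handles the simple "prefix-JIS, U+code point" shape of the lines quoted here; anything else (header comments, entries that do not map to a single code point) returns None. The names parse_mapping_line and PREFIX_TO_PLANE are hypothetical.

```python
import re

# Prefix digit to JIS X 0213 plane, as stated in the table header.
PREFIX_TO_PLANE = {"3": 1, "4": 2}

# Matches e.g. "3-3021 U+4E9C #"; the code point must be followed
# by whitespace or end of line, so other shapes are rejected.
LINE_RE = re.compile(r"^([0-9])-([0-9A-Fa-f]{4})\s+U\+([0-9A-Fa-f]{4,6})(?=\s|$)")


def parse_mapping_line(line: str):
    """Return (plane, jis_code, code_point), or None if not recognised."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    prefix, jis_hex, ucs_hex = match.groups()
    plane = PREFIX_TO_PLANE.get(prefix)
    if plane is None:
        return None  # e.g. the 0-XX ISO/IEC 646 rows
    return plane, int(jis_hex, 16), int(ucs_hex, 16)


if __name__ == "__main__":
    plane, jis, ucs = parse_mapping_line("3-3021 U+4E9C #")
    print(plane, f"{jis:04X}", f"U+{ucs:04X}")
```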
From Wikipedia:
Plane 1 is a superset of JIS X 0208 containing kanji sets level 1 to 3 and non-kanji characters such as Hiragana, Katakana (including letters used to write the Ainu language), Latin, Greek and Cyrillic alphabets, digits, symbols and so on. Plane 2 contains only level 4 kanji set. Total number of the defined characters is 11,233.


So common characters are in plane 1, but so are infrequently used characters like this one:
第3水準1-90-17
This is a Level 3 character but still in plane 1: Ku 90, Ten 17.
Its JIS code is 0x7A31 and its Unicode code point is U+7E11.
3-7A31 U+7E11 # [2000]
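Going from a plane-Ku-Ten string such as 1-90-17 to the key used in the table can be sketched like this. The function name menkuten_to_table_key is invented for the example; the plane-to-prefix mapping (plane 1 is '3', plane 2 is '4') is taken from the table header quoted above.

```python
# Table prefix for each JIS X 0213 plane, per the table header.
PLANE_TO_PREFIX = {1: "3", 2: "4"}


def menkuten_to_table_key(menkuten: str) -> str:
    """Turn a decimal 'plane-ku-ten' string into a table key like '3-7A31'."""
    plane, ku, ten = (int(part) for part in menkuten.split("-"))
    if plane not in PLANE_TO_PREFIX:
        raise ValueError(f"unknown plane: {plane}")
    if not (1 <= ku <= 94 and 1 <= ten <= 94):
        raise ValueError(f"Ku and Ten must be in 1..94, got {ku}-{ten}")
    # Add 0x20 to Ku and Ten, then print both bytes as hex.
    return f"{PLANE_TO_PREFIX[plane]}-{ku + 0x20:02X}{ten + 0x20:02X}"


if __name__ == "__main__":
    print(menkuten_to_table_key("1-90-17"))  # prints 3-7A31
```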


Some of the characters are outside the Unicode BMP (i.e. above U+FFFF).
3-776C U+247F1 # [2000] [Unicode3.1]
4-2177 U+20381 # [2000] [Unicode3.1]
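Code points above U+FFFF need a surrogate pair in UTF-16, which matters wherever strings are stored as 16-bit units. A minimal Python sketch of the standard UTF-16 calculation follows; the function name utf16_code_units is made up for illustration.

```python
def utf16_code_units(code_point: int) -> tuple:
    """Return the UTF-16 code units for one Unicode code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {code_point:#x}")
    if code_point <= 0xFFFF:
        # Inside the BMP: a single 16-bit unit.
        return (code_point,)
    # Outside the BMP: split the 20-bit offset into two 10-bit halves.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)
    low = 0xDC00 + (offset & 0x3FF)
    return (high, low)


if __name__ == "__main__":
    for cp in (0x7E11, 0x247F1, 0x20381):
        units = " ".join(f"{u:04X}" for u in utf16_code_units(cp))
        print(f"U+{cp:04X} -> {units}")
```

For the two entries above this gives D851 DFF1 for U+247F1 and D840 DF81 for U+20381.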

Wednesday, April 14, 2010

Version 1.6 is on the App Store.

Added encoding info to the per-character detail view; it shows other possible encodings for the character, such as Shift_JIS, Big5, ISO-8859 and Windows encodings.


Friday, April 9, 2010

Version 1.5 is on the App Store

Bug fix: avoid a crash when the stored user default font is not available on the device.
For example, an iPhone profile migrated to an iPad may contain a font name that is not available on that device.

Tuesday, February 9, 2010