Sunday, October 5, 2008

The Case of the Bogus Chinese

Recently in the Apple forums someone had a problem with his normal English text turning into Chinese when he applied QuickTime's Export > Text to Text > Text with Descriptors function. Examination of the output file showed the encoding was marked as 256, which means UTF-16, even though the text itself was just ASCII. So, for example, the two characters "th", with hexadecimal byte values 74 68, were being read as a single two-byte character, U+7468, or 瑨.
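The effect is easy to reproduce outside QuickTime. Here is a minimal Python sketch, not anything QuickTime itself does: the function names are made up for illustration, and it assumes big-endian byte order, which is what the 7468 reading above implies.

```python
def misread_ascii_as_utf16(text: str) -> str:
    """Decode the ASCII bytes of `text` as if they were big-endian UTF-16."""
    raw = text.encode("ascii")
    # UTF-16 consumes two bytes per code unit, so drop a trailing odd byte.
    raw = raw[: len(raw) - (len(raw) % 2)]
    return raw.decode("utf-16-be")


def recover_ascii(garbled: str) -> str:
    """Undo the misreading: turn the bogus characters back into ASCII bytes."""
    return garbled.encode("utf-16-be").decode("ascii")


if __name__ == "__main__":
    bogus = misread_ascii_as_utf16("th")
    print(bogus, hex(ord(bogus)))   # the single character with code point 0x7468
    print(recover_ascii(bogus))     # back to the original two letters
```

Every pair of ASCII letters lands in the range of CJK ideographs when read this way, which is why the result looks like Chinese rather than random symbols.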

From my testing, it appears that when the input text is UTF-16, QuickTime keeps the UTF-16 encoding marker in the descriptors but converts the text itself to ASCII. This must be a bug.
