Issue with native2unicode and windows-1252 encoding
Hi all,
I'm trying to encode some bytes into a character set using the windows-1252 encoding and I've checked that native2unicode
Answers (3)
Walter Roberson
14 Jan 2022
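source = char(0:511)                                     % characters with code points 0..511
bytes = unicode2native(source, 'windows-1252')           % encode them as Windows-1252 bytes
backport = char(bytes)                                   % reinterpret each byte value as a code point
whichdiffer = find(source(1:256) ~= backport(1:256) )    % 1-based indices, i.e. code point + 1
source(whichdiffer)                                      % original characters that changed
bytes(whichdiffer)                                       % the bytes they were encoded as
backport(whichdiffer)                                    % what those bytes look like read back as code points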
What this is telling us is that Unicode 129 to 141 are not represented in Windows 1252
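bytes2 = uint8(129:141)                                  % byte values 129..141
encodes_as = native2unicode(bytes2, 'windows-1252')      % decode them as Windows-1252
double(encodes_as)                                       % Unicode code points of the decoded characters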
Looks about right.
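A complementary check, as a rough sketch rather than anything definitive: decode a byte and encode it again to see whether it survives the round trip. Byte 128 is used as the concrete case because Windows-1252 assigns it to the euro sign (U+20AC, decimal 8364). The variable names are only illustrative, and the all-bytes part assumes each byte decodes to exactly one character.

```matlab
% Decode one Windows-1252 byte and encode it again.
b_in  = uint8(128);                                  % 0x80 is the euro sign in Windows-1252
ch    = native2unicode(b_in, 'windows-1252');        % byte -> character
cp    = double(ch)                                   % Unicode code point of the decoded character
b_out = unicode2native(ch, 'windows-1252')           % character -> byte

% Same idea for every byte value (assumes one character per byte);
% lists the byte values that do not come back unchanged.
all_in  = uint8(0:255);
all_out = unicode2native(native2unicode(all_in, 'windows-1252'), 'windows-1252');
changed = double(all_in(all_in ~= all_out))
```

The direction matters here: native2unicode goes from bytes to characters, unicode2native goes from characters to bytes.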
2 Comments
Walter Roberson
17 Jan 2022
Code point 26 (the SUB control character, 0x1A) is the standard value substituted for code points that cannot be represented.
https://en.m.wikipedia.org/wiki/Substitute_character
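One way to spot where substitution happened, as a minimal sketch: look for bytes equal to 26 at positions where the original character was not itself code point 26. The test string is made up for illustration, and the comparison assumes the encoder emits exactly one byte per character, which should hold for a single-byte encoding such as Windows-1252.

```matlab
% Hypothetical test string: pi (U+03C0) is not in Windows-1252,
% and the final character is a genuine SUB (code point 26).
str   = ['abc' char(960) 'def' char(26)];
bytes = unicode2native(str, 'windows-1252');

% Positions where a 26 appeared although the source character was not 26.
lost = find(bytes == 26 & double(str) ~= 26)
```

A genuine char(26) in the input also encodes to 26, so the bytes alone cannot tell the two cases apart; the comparison against the original string is what separates them.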
Borja Heriz
17 Jan 2022
1 Comment
Rik
17 Jan 2022
This is an answer, but it looks like a comment. Please use the comment sections to post comments. The order of answers can change, which will make reading back confusing.
Please post this as a comment and delete the answer.
When you do, I (or Walter) will post something along these lines:
Why do you think 153 and 156 are encoded as the same character? They are displayed as the same character, but that is probably due to a limitation in the display, as this could very well encode a control character without a proper symbol.
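To check this without relying on how the Command Window renders the glyphs, compare the numeric code points instead. This is only a sketch, and it assumes that 153 and 156 refer to the characters char(153) and char(156); if they were byte values decoded with native2unicode, the same comparison applies to the decoded characters.

```matlab
a = char(153);
b = char(156);

same     = (a == b)                     % logical comparison of the two characters
codes    = double([a b])                % numeric code points, independent of the font
hexcodes = dec2hex(double([a b]), 4)    % the same values in hexadecimal, one row per character
```

Two characters that look identical on screen are still different characters if their code points differ.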