Decoding UTF-8 emoji codes and special characters from Facebook data
Hi, I recently downloaded my Messenger data from Facebook in ".json" format.
This format was new to me, and it was quite interesting to load the file, play around with it, and turn it into a conversation.
The problem is with decoding the emojis. I have no idea about the format. It looks something like this:
"\u00f0\u009f\u0098\u0082 \u00f0\u009f\u0098\u0082", whereas the actual emoji I used is 😂 😂.
In MATLAB, as shown in the figure, it shows some rubbish: "ð ð".
After a long search on the internet, I learned that it is UTF-8. So I tried to read the file as UTF-8, following some answers from MATLAB Central:
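clear; clc
fname = 'message_keller.json';
fid = fopen(fname, 'rb');              % open the export for binary reading
raw = fread(fid, '*uint8')';           % read every byte as a uint8 row vector
str = native2unicode(raw,'UTF-8');     % interpret the bytes as UTF-8 text
fclose(fid);
val = jsondecode(str);                 % parse the JSON text into a struct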
But it still showed "ð ð".
The above link was the method I found for decoding, but that was for PowerShell.
Can anyone help me decode the Unicode so that it can be viewed in MATLAB and other software? (Currently I am planning to export the conversation to Excel.)
4 Comments
Guillaume
12 Oct 2018
I wanted the raw JSON, not the stuff you've parsed, when it is too late to get the right characters. You can just replace the confidential bits with xs or dots.
Or just provide the actual portion of the raw JSON that corresponds to an actual message, e.g., one of the
{"message":{"sender_name":"Don't care","timestamp_ms":whatever,"content":"this is what I need","type":"Generic"}}
sections.
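For reference, the escapes quoted in the question (f0 9f 98 82) match the four UTF-8 bytes of U+1F602, with each byte written as its own \u00XX escape. If the raw JSON really contains them in that form, which is an assumption until the raw file is posted, jsondecode will faithfully return one character per byte, and reading the file as UTF-8 cannot change that. A minimal sketch of turning those characters back into bytes and decoding them follows; the one-message JSON string and the variable names are hypothetical stand-ins, not taken from the actual export.

```matlab
% Hypothetical one-message JSON in the shape described above. The content
% value reuses the escape sequence quoted in the question.
raw = '{"message":{"sender_name":"x","content":"\u00f0\u009f\u0098\u0082 \u00f0\u009f\u0098\u0082"}}';

val = jsondecode(raw);
garbled = val.message.content;    % 9 characters, one per UTF-8 byte

% Assumption: every character in the field is really a byte (code <= 255).
% A field that already holds correctly decoded text would fail this check.
assert(all(double(garbled) <= 255), 'Field contains non-byte characters');

bytes = uint8(garbled);                   % characters back to raw bytes
fixed = native2unicode(bytes, 'UTF-8');   % decode the bytes as UTF-8
disp(fixed)
```

Whether the emoji then renders depends on the font used by the Command Window or by Excel; the stored characters should be correct either way. The same conversion would have to be applied to every text field (sender names as well as message content) if the whole export is encoded this way.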
Answers (0)