Decoding UTF-8 emoji codes and special characters from Facebook data
Hi, I recently downloaded my Messenger data from Facebook in ".json" format.
This format was new to me, and it was quite interesting to load the file, play around with it, and turn it into a conversation.
The problem is with decoding the emojis. I have no idea about the format. It looks something like this:
"\u00f0\u009f\u0098\u0082 \u00f0\u009f\u0098\u0082", whereas the actual emoji I used is 😂 😂.
In MATLAB, as shown in the figure, it displays garbage: "ð ð".
After a long search on the internet, I learned that this is UTF-8 encoding. So I tried to read the file as UTF-8, following some answers from MATLAB Central:
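clear; clc
fname = 'message_keller.json';
fid = fopen(fname, 'rb');             % open the export for binary reading
raw = fread(fid, '*uint8')';          % read every byte into a row vector
str = native2unicode(raw, 'UTF-8');   % interpret the file bytes as UTF-8 text
fclose(fid);
val = jsondecode(str);                % parse the JSON text into MATLAB data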
But it still showed "ð ð".
The link above describes the method I found for decoding, but it was for PowerShell.
Can anyone help me decode the Unicode so that it can be viewed in MATLAB and other software (I am currently planning to export the conversation to Excel)?
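One possible workaround, sketched under assumptions rather than confirmed against the real export: the escapes look like the individual UTF-8 bytes of each emoji written out as separate \u00XX code points (F0 9F 98 82 is the UTF-8 encoding of U+1F602, 😂). If that is the case, the file itself is plain ASCII, so reading it as UTF-8 changes nothing. Instead, the text returned by jsondecode would need its characters turned back into bytes and decoded as UTF-8 a second time. The sample string and the helper name fixText below are hypothetical, and the approach assumes every character in the field has a code point of 255 or less.

```matlab
% Inline sample standing in for one message of the export (hypothetical content).
sample = '{"sender_name":"Don''t care","content":"\u00f0\u009f\u0098\u0082 \u00f0\u009f\u0098\u0082"}';
msg = jsondecode(sample);

% After jsondecode, each char of msg.content carries one UTF-8 byte (0-255).
disp(double(msg.content))

% Hypothetical helper: convert the chars back to bytes, then decode as UTF-8.
% Assumption: no code point above 255 is present (uint8 would saturate it).
fixText = @(s) native2unicode(uint8(s), 'UTF-8');

fixed = fixText(msg.content);

% Each emoji is now a UTF-16 surrogate pair inside the MATLAB char array.
disp(double(fixed))
disp(fixed)
```

Whether the emoji actually renders in the Command Window depends on the font in use. The same helper could presumably be applied to each text field (for example sender_name and content) before exporting the conversation.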
4 Comments
Guillaume
on 12 Oct 2018
I wanted the raw json, not the stuff you've parsed when it is too late to get the right characters. You can just replace the confidential bits with xs or dots.
Or just provide the actual portion of the raw JSON that corresponds to an actual message, e.g., one of the
{"message":{"sender_name":"Don't care","timestamp_ms":whatever,"content":"this is what I need","type":"Generic"}}
sections.
Answers (0)