java - Emoji not encoding
I am retrieving Twitter tweets and attempting to save them to a flat file. I have the following code:
    String jsonString = new Gson().toJson(tweets);
    byte[] utf8JsonString = jsonString.getBytes("UTF-8");
    String utf8Json = new String(utf8JsonString, "UTF-8");
    System.out.println(utf8Json);

Output:
    ..."id":768260789744443392,"text":"#emojicity5 ?","source"...

The emoji (just after #emojicity5) appears as ?. I have attempted to encode using UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, and UTF-32LE, to no avail. The system is using JDK 1.6 and version 3.0.3 of Twitter4J. What am I missing here?
A String already contains Unicode, so there is no need to convert it to the same String again. Only when converting to or from a byte[] does one need to indicate the encoding of the bytes.
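To illustrate that point, here is a minimal sketch (the class name RoundTripDemo and the helper roundTrip are made up for this example, and U+1F600 stands in for whatever emoji the tweet actually contained). Encoding to UTF-8 and decoding again gives back an equal String, so the round trip in the question changes nothing. The emoji is only lost when the target charset cannot represent it, such as ISO-8859-1.

```java
import java.nio.charset.Charset;

class RoundTripDemo {
    // Encodes s to bytes in the named charset, then decodes with the same charset.
    static String roundTrip(String s, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        return new String(s.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        // U+1F600 written as a surrogate pair so the source file encoding does not matter.
        String text = "#emojicity5 \uD83D\uDE00";

        // UTF-8 can represent every code point: the round trip is a no-op.
        System.out.println(text.equals(roundTrip(text, "UTF-8")));

        // ISO-8859-1 cannot represent the emoji: it is replaced, so the strings differ.
        System.out.println(text.equals(roundTrip(text, "ISO-8859-1")));
    }
}
```

The first line printed is true and the second is false, which matches the ? seen in the question's output when a non-Unicode encoding is involved.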
However, the problem is that the console may not use a Unicode encoding such as UTF-8, and might not have the emoji in its fonts. So this is a problem of System.out.println: in this case System.out uses some other encoding that cannot represent the emoji, and a question mark is printed instead.
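Since the actual goal is a flat file rather than the console, a sketch like the following sidesteps System.out entirely by naming the encoding on the writer. The class and method names (Utf8FileDemo, writeUtf8, readUtf8) are invented for this example, and it sticks to constructs that should compile on JDK 1.6 (no try-with-resources).

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;

class Utf8FileDemo {
    // Writes text as UTF-8, regardless of the platform default encoding.
    static void writeUtf8(File file, String text) throws IOException {
        Writer out = new OutputStreamWriter(new FileOutputStream(file), "UTF-8");
        try {
            out.write(text);
        } finally {
            out.close();
        }
    }

    // Reads the whole file back, decoding it as UTF-8.
    static String readUtf8(File file) throws IOException {
        Reader in = new InputStreamReader(new FileInputStream(file), "UTF-8");
        try {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws IOException {
        String json = "{\"text\":\"#emojicity5 \uD83D\uDE00\"}";
        File file = File.createTempFile("tweets", ".json");
        writeUtf8(file, json);
        // Compare in memory instead of printing the emoji to the console.
        System.out.println(json.equals(readUtf8(file)));
    }
}
```

A FileWriter without an explicit charset would use the platform default encoding instead, which can reintroduce the same ? problem in the file.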
What you can do to check whether the emoji arrived is to dump the Unicode code points.
In Java 8:
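    // needs: import java.lang.Character.UnicodeBlock;

    // Print every code point outside Latin-1 with its Unicode name.
    jsonString.codePoints()
            .filter(cp -> cp >= 256)
            .forEach(cp -> {
                System.out.printf("U+%X = %s%n",
                    cp, Character.getName(cp));
            });

    // UnicodeBlock.of may return null, so call equals on the constant.
    // Note: EMOTICONS covers only U+1F600..U+1F64F.
    boolean containsEmoji(String s) {
        return s.codePoints().anyMatch(cp ->
            UnicodeBlock.EMOTICONS.equals(UnicodeBlock.of(cp)));
    }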
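The question mentions JDK 1.6, where String.codePoints() and lambdas are not available. A rough equivalent of the dump, using only codePointAt and Character.charCount, could look like the sketch below. The class and method names are invented for this example, and it returns the code points as text rather than printing their names, because Character.getName only arrived in Java 7.

```java
import java.util.ArrayList;
import java.util.List;

class CodePointDump {
    // Collects every code point at or above U+0100 as a "U+XXXX" string.
    static List<String> nonLatin1CodePoints(String s) {
        List<String> result = new ArrayList<String>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp >= 256) {
                result.add(String.format("U+%X", cp));
            }
            // Advance by 2 chars for a surrogate pair, 1 otherwise.
            i += Character.charCount(cp);
        }
        return result;
    }

    public static void main(String[] args) {
        // The output is plain ASCII, so it is safe to print on any console.
        System.out.println(nonLatin1CodePoints("#emojicity5 \uD83D\uDE00"));
    }
}
```

If the list comes back empty for a tweet that should contain an emoji, the character was already lost before the string reached this code. If it shows something like U+1F600, the data is fine and only the console output is at fault.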