Understanding Encodings on Android (1)

2015-11-12 14:53
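I have been studying your code recently and noticed that, in ffmpeg's id3v1.c, you added the function below to convert the artist name of an MP3 file from GBK to some other encoding. My understanding is that GBK ultimately has to be converted to UTF-8, otherwise Chinese text shows up garbled. But the conversion below is very simple and does not look like a real GBK-to-UTF-8 conversion. My question is: what encoding does this function actually convert GBK into? I would appreciate your help.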

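While playing audio in the media center recently, I found that Chinese artist names were displayed as garbled text after ffmpeg read the metadata. The main cause is that the artist name ffmpeg extracts is GBK-encoded and was handed directly to the Java layer through Android's NewStringUTF, so it was displayed incorrectly. The patch below converts the GBK bytes to UTF and then passes the result up to the Java layer.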

+static void convert_iso8859_to_string(const uint8_t *data, int size, char *s) {
+    int utf8len = 0;
+    int i;
+
+    for (i = 0; i < size; ++i) {
+        if (data[i] == '\0') {
+            size = i;
+            break;
+        } else if (data[i] < 0x80) {
+            ++utf8len;
+        } else {
+            utf8len += 2;
+        }
+    }
+
+    if (utf8len == size) {
+        // Only ASCII characters present.
+
+        memcpy(s, data, size);
+        s[size] = '\0';
+        return;
+    }
+
+    char *ptr = s;
+    for (i = 0; i < size; ++i) {
+        if (data[i] == '\0') {
+            break;
+        } else if (data[i] < 0x80) {
+            *ptr++ = data[i];
+        } else if (data[i] < 0xc0) {
+            *ptr++ = 0xc2;
+            *ptr++ = data[i];
+        } else {
+            *ptr++ = 0xc3;
+            *ptr++ = data[i] - 64;
+        }
+    }
+    *ptr = '\0';
+
+}
+
 static void get_string(AVFormatContext *s, const char *key,
                        const uint8_t *buf, int buf_size)
 {
     int i, c;
     char *q, str[512];
 
+    convert_iso8859_to_string(buf, buf_size, str);
+#if 0
     q = str;
     for(i = 0; i < buf_size; i++) {
         c = buf[i];
@@ -191,6 +234,7 @@ static void get_string(AVFormatContext *s, const char *key,
         *q++ = c;
     }
     *q = '\0';
+#endif
 
     if (*str)
         av_dict_set(&s->metadata, key, str, 0);
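
To get at the question of what the added function produces, here is a minimal sketch that applies the same byte mapping as convert_iso8859_to_string in the patch. Each input byte is read as one ISO-8859-1 code point and written out as one or two bytes of UTF-8. The helper name latin1_to_utf8 and the sample input are illustrative assumptions, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper with the same byte mapping as the patch's
 * convert_iso8859_to_string: one input byte = one ISO-8859-1 code point.
 * `out` must have room for 2 * size + 1 bytes. */
static void latin1_to_utf8(const uint8_t *data, int size, char *out)
{
    assert(data != NULL && out != NULL);
    char *p = out;
    for (int i = 0; i < size && data[i] != '\0'; ++i) {
        if (data[i] < 0x80) {
            *p++ = (char)data[i];                    /* ASCII is copied as-is */
        } else {
            *p++ = (char)(0xc0 | (data[i] >> 6));    /* lead byte: 0xc2 or 0xc3 */
            *p++ = (char)(0x80 | (data[i] & 0x3f));  /* continuation byte */
        }
    }
    *p = '\0';
}
```

For the GBK bytes 0xD6 0xD0 (the character 中), this mapping gives c3 96 c3 90, which is the UTF-8 for the two Latin-1 characters Ö and Ð. The UTF-8 encoding of 中 itself is e4 b8 ad. So the output is well-formed UTF-8, but it is the ISO-8859-1 reading of the bytes, and the Chinese text is not recovered by this step alone.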

UTF-8 encoding table

http://blog.csdn.net/qiaqia609/article/details/8069678

GBK encoding table

http://blog.csdn.net/qiaqia609/article/details/8069655

UTF-8 encoding reference table for Chinese characters

http://cache.baiducontent.com/c?m=9d78d513d98407fb4fece4741a16a671695797143ec0a11568a3e35cd424054e1d20a5f930236319ce802b3b58e85e5c9da06529614437b7ec99d515c0ffc97f6a957332211c864613d51bffcd17259621c45decaf1ce3bba66184aea589990b0d&p=9b3fc64ad4d015b708e29778065594&newp=8e6acc1487d512a05abd9b7d0b1da5231611d73f6590cf512496fe4b98&user=baidu&fm=sc&query=utf8+%B1%E0%C2%EB&qid=dcbcd21c000092cc&p1=5

Full-width characters and their Unicode code points

http://blog.csdn.net/lvwx369/article/details/39294265

Unicode code point table

http://cache.baiducontent.com/c?m=9f65cb4a8c8507ed4fece76310478a215915d7743ca080462482d45f93130a1c187ba7e070670d0fd4cf7b6c51ad4f0be0f53570345724bcccc98b41daea963f2fff7d722f42914066934eb8ca30619a77d54eacf259b1b5e743e2b9a5a2c854228d0f5e2bdda6dc4d00659b3ea745&p=8b2a975686cc40ad07f1cf351564&newp=8a769a47999611a059ef8a24565692695c16ed623e9885&user=baidu&fm=sc&query=%CC%EC+unicode+ccec&qid=b21a779500071717&p1=1

Differences between GBK, GB2312, and ISO-8859-1

http://blog.csdn.net/jerry_bj/article/details/5714745
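
The difference between GBK and ISO-8859-1 decides whether Chinese text survives the conversion. GBK is a multi-byte, table-driven encoding, so its bytes cannot be turned into UTF-8 by arithmetic on single bytes. As a rough sketch of a table-based conversion, the helper below uses the POSIX iconv API. The name gbk_to_utf8 is hypothetical, and it is an assumption that iconv and a GBK converter are available on the target platform, which is worth checking for a given Android NDK version.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper, not part of ffmpeg or the patch.
 * Converts in_len bytes of GBK text into NUL-terminated UTF-8 in `out`.
 * Returns the number of UTF-8 bytes written, or -1 on failure. */
static long gbk_to_utf8(const char *in, size_t in_len, char *out, size_t out_cap)
{
    assert(in != NULL && out != NULL);
    if (out_cap == 0)
        return -1;

    iconv_t cd = iconv_open("UTF-8", "GBK");  /* arguments are (to, from) */
    if (cd == (iconv_t)-1)
        return -1;                            /* GBK not supported here */

    char *src = (char *)in;                   /* iconv takes char **, input is not modified */
    char *dst = out;
    size_t src_left = in_len;
    size_t dst_left = out_cap - 1;            /* keep one byte for '\0' */

    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;                            /* invalid input or output buffer too small */

    *dst = '\0';
    return (long)(dst - out);
}
```

Where GBK support is present, calling this on the GBK bytes 0xD6 0xD0 should yield the three UTF-8 bytes e4 b8 ad, which is 中.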