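[Question] Chinese character encoding: is there any reference material I can read?
Board: Programming  Author: donkeychen (Bad_To_The_Bone)  Time: 2013/12/11 14:54  Score: 21 (21 push, 0 boo, 104 neutral)  Comments: 125, 10 participants  Thread: 1/1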
Hi everyone,
I'd like to ask about Chinese character encoding.
Take the two characters "複製" as an example.
I tested with Google search URLs.
Using %BD%C6%BB%73
gives the same results as
%E8%A4%87%E8%A3%BD
(
https://www.google.com.tw/search?
safe=off&rlz=1C1SAVA_enTW501TW501&espv=210&es_sm=93&q=%BD%C6%BB%73
) (URL not shortened)
(
https://www.google.com.tw/search?
safe=off&rlz=1C1SAVA_enTW501TW501&espv=210&es_sm=93&q=%E8%A4%87%E8%A3%BD
) (URL not shortened)
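For reference, the two query strings above are percent-encoded bytes, and they decode to different byte sequences: 4 bytes for the first URL and 6 for the second. The first matches the Big5 encoding of "複製" and the second is its UTF-8 form; that the search engine guesses the encoding of the query is an assumption here, not something the URLs themselves show. A minimal sketch of the decoding step, where `hex_value`, `percent_decode`, and `to_hex` are illustrative helper names:

```cpp
#include <string>

// Hypothetical helper: value of one hex digit, or -1 if it is not a hex digit.
int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decode %XX escapes into raw bytes; every other character is copied unchanged.
std::string percent_decode(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size()) {
            int hi = hex_value(in[i + 1]);
            int lo = hex_value(in[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out.push_back(static_cast<char>(hi * 16 + lo));
                i += 2;  // skip the two hex digits just consumed
                continue;
            }
        }
        out.push_back(in[i]);
    }
    return out;
}

// Render bytes as upper-case hex, the way a hex editor shows them.
std::string to_hex(const std::string& bytes) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char b : bytes) {
        out.push_back(digits[b >> 4]);
        out.push_back(digits[b & 0x0F]);
    }
    return out;
}
```

`to_hex(percent_decode("%BD%C6%BB%73"))` gives `BDC6BB73`, and `to_hex(percent_decode("%E8%A4%87%E8%A3%BD"))` gives `E8A487E8A3BD`.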
==========================================================
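I looked at a text file: if I type
"複製"
in PSPad on Windows XP, what I see in a hex editor is
BDC6BB73
I then used the hex editor to replace BDC6BB73 with E8A487E8A3BD,
but when I opened the file again it displayed as garbage:
"銴殴ˊ"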
===========================================================
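The garbled result of the hex edit above is what one would expect if the editor still reads the file as a double-byte code page: the six UTF-8 bytes get paired up as E8A4, 87E8, A3BD, giving three wrong characters instead of two right ones. A rough sketch of that pairing, assuming the Windows code page 950 rule that any byte from 0x81 to 0xFE is a lead byte; `split_double_byte` is a hypothetical name, and no character lookup is attempted:

```cpp
#include <string>
#include <vector>

// Split a byte string into the units a double-byte (Big5-style) reader would see.
// Assumption: a byte in 0x81..0xFE is a lead byte and consumes the byte after it.
std::vector<std::string> split_double_byte(const std::string& bytes) {
    std::vector<std::string> units;
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char b = static_cast<unsigned char>(bytes[i]);
        bool is_lead = (b >= 0x81 && b <= 0xFE);
        // A lead byte at the very end has nothing to pair with, so it stands alone.
        std::size_t len = (is_lead && i + 1 < bytes.size()) ? 2 : 1;
        units.push_back(bytes.substr(i, len));
        i += len;
    }
    return units;
}
```

With the bytes E8 A4 87 E8 A3 BD this yields three units, while BD C6 BB 73 yields two, which matches the two characters originally typed.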
Also, if I type
"%E8%A4%87%E8%A3%BD"
into the Google search box, I likewise get pages related to "複製":
複製狗狗技術進軍英國- Yahoo奇摩新聞
真的有辦法複製人腦嗎?IBM的複製人工智慧計畫大公開
===========================================================
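On the Browser board, most threads seem to be about garbled text,
and most answers amount to changing browser settings.
Does anyone here have experience in this area?
Is there a tool or web page, or a way to read the bytes in a program and convert them?
(Writing the program should not be hard, but I don't understand the conversion rules.)
Thanks.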
--
※ Origin: 批踢踢實業坊 (ptt.cc)
◆ From: 210.59.147.226
※ Edited: donkeychen, from: 210.59.147.226 (12/11 14:59)
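Hi,
I have a log file that I need for work.
All the Chinese text in the log shows up as garbage,
but I know for certain that this garbled spot should be the Chinese "複製".
I'd rather not spell out on the board what kind of (shady) work I'm doing,
so I only posted the part I thought might be the problem.
But if you don't mind,
you can go to
https://dl.dropboxusercontent.com/u/57491997/CoreLog.log
and download it.
Search for the text 1111 to find the problem spot.
I tried the approach MOONRAKER suggested on this file.
The bytes go from the original E8A487E8A3BD
to E98AB4EF8BACCB8A
It saved my originally mis-displayed (garbled) characters as UTF-8...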
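The two byte strings can be checked by decoding them as UTF-8 by hand. A minimal sketch, with `decode_utf8` as an illustrative name; it skips the overlong-form and surrogate checks a production decoder would need:

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Decode a UTF-8 byte string into Unicode code points.
// Throws std::runtime_error on a bad lead byte, a bad continuation byte,
// or a sequence that is cut off at the end of the input.
std::vector<std::uint32_t> decode_utf8(const std::string& bytes) {
    std::vector<std::uint32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes that follow
        if (lead < 0x80)              { cp = lead;        extra = 0; }
        else if ((lead >> 5) == 0x06) { cp = lead & 0x1F; extra = 1; }
        else if ((lead >> 4) == 0x0E) { cp = lead & 0x0F; extra = 2; }
        else if ((lead >> 3) == 0x1E) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (i + extra >= bytes.size())
            throw std::runtime_error("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);  // append the low 6 bits
        }
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

E8A487E8A3BD decodes to U+8907 and U+88FD, the code points of "複製". E98AB4EF8BACCB8A decodes to three code points, U+92B4, U+F2EC, and U+02CA, the last being the "ˊ" seen in the garbled display. That is consistent with the tool reading the UTF-8 bytes as Big5 and then saving the resulting wrong characters as UTF-8.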
※ Edited: donkeychen, from: 210.59.147.226 (12/11 17:40)
I just pronounce it the unsophisticated way: "叉" (chā)...
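At first I didn't realize that this is also valid UTF-8;
I thought it was a different encoding altogether.
It now looks like the problem occurs
when converting between string and wstring.
The code is below
(it uses boost)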
using namespace std;
using namespace boost;

wstring path = L"複製";
// http://ppt.cc/yK88
string path2 = boost::to_utf8(path);
// http://ppt.cc/eAm2
wstring path3;
path3.assign(path2.begin(), path2.end()); // to log
// http://ppt.cc/nQ7b
// My guess is that the line above is the problem
locale old_locale;
locale utf8_locale(old_locale,
    new boost::program_options::detail::utf8_codecvt_facet);
filesystem::wpath asPath(L"C:\\test.log");
wofstream logfile;
logfile.imbue(utf8_locale);
if (!logfile.is_open())
{
    logfile.open(asPath.external_file_string().c_str(),
                 ios::in |
                 ios::out |
                 ios::app |
                 ios::ate |
                 ios::binary);
}
logfile << path3;
logfile.close();
C:\test.log then contains a lovely
EFBFA8EFBEA4EFBE87EFBFA8EFBEA3EFBEBD
http://ppt.cc/wEJA
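The bytes in the log are consistent with the guess in the code comment. `assign(begin, end)` copies each UTF-8 byte into its own wchar_t, and where `char` is signed and `wchar_t` is 16 bits (as on Windows with MSVC, which is assumed here), 0xE8 is sign-extended to 0xFFE8, which a UTF-8 facet then writes out as EF BF A8. A small sketch that reproduces the arithmetic without Boost; `widen_signed`, `encode_utf8_bmp`, and `to_hex` are hypothetical helpers:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Widen each byte the way assign(begin, end) would when char is signed
// and wchar_t is 16 bits wide (both are assumptions about the platform).
std::vector<std::uint16_t> widen_signed(const std::string& bytes) {
    std::vector<std::uint16_t> out;
    for (char c : bytes) {
        signed char s = static_cast<signed char>(c);   // 0xE8 becomes -24
        out.push_back(static_cast<std::uint16_t>(s));  // -24 becomes 0xFFE8
    }
    return out;
}

// Encode 16-bit units as UTF-8, treating each unit as one BMP code point
// (surrogate pairs are not handled in this sketch).
std::string encode_utf8_bmp(const std::vector<std::uint16_t>& units) {
    std::string out;
    for (std::uint16_t u : units) {
        if (u < 0x80) {
            out.push_back(static_cast<char>(u));
        } else if (u < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (u >> 6)));
            out.push_back(static_cast<char>(0x80 | (u & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xE0 | (u >> 12)));
            out.push_back(static_cast<char>(0x80 | ((u >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (u & 0x3F)));
        }
    }
    return out;
}

// Render bytes as upper-case hex for comparison with a hex editor.
std::string to_hex(const std::string& bytes) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char b : bytes) {
        out.push_back(digits[b >> 4]);
        out.push_back(digits[b & 0x0F]);
    }
    return out;
}
```

Feeding the UTF-8 bytes E8 A4 87 E8 A3 BD through `widen_signed` and then `encode_utf8_bmp` gives EFBFA8EFBEA4EFBE87EFBFA8EFBEA3EFBEBD, the same 18 bytes found in the log. One possible fix, assuming the same Boost.Program_options header that provides `boost::to_utf8`, is to build the wide string with `boost::from_utf8(path2)` instead of `assign`, or simply to write `path` to the stream directly.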
※ Edited: donkeychen, from: 210.59.147.226 (12/13 10:12)