July 05, 2004
UTF-8 GB convert and coding problems::[Misc]

1] Change '&#XXXXX;' (decimal character references) to UTF-8:
perl -CO -p -e 's/&#(\d+);/pack("U", $1)/eg'
2] Change escaped bytes such as \xb?\xb? to raw GB2312:
perl -p -e 's/\\x([0-9A-Fa-f]{2})/pack("C", hex($1))/eg'
3] Change percent-encoded bytes such as %C2%D2%C2%D7 to raw GB2312:
perl -p -e 's/%([0-9A-Fa-f]{2})/pack("C", hex($1))/eg'
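For comparison, here is a rough Python sketch of the same three substitutions. It is not part of the original one-liners; the function names (decode_ncr, decode_hex_escapes, decode_percent) are made up for illustration, and it assumes the escaped input is plain ASCII apart from the escapes themselves.

```python
import re
from urllib.parse import unquote_to_bytes


def decode_ncr(text: str) -> str:
    """Replace decimal references such as &#20013; with the character itself."""
    return re.sub(r"&#(\d+);", lambda m: chr(int(m.group(1))), text)


def decode_hex_escapes(data: bytes) -> bytes:
    r"""Turn literal \xNN escapes into the raw bytes they stand for."""
    return re.sub(
        rb"\\x([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        data,
    )


def decode_percent(text: str) -> bytes:
    """Turn %NN escapes into raw bytes (no charset is assumed at this step)."""
    return unquote_to_bytes(text)


if __name__ == "__main__":
    print(decode_ncr("&#20013;"))  # the single character U+4E2D
    # The raw bytes only become text once decoded with the right charset.
    print(decode_hex_escapes(rb"\xd6\xd0").decode("gb2312"))
    print(decode_percent("%C2%D2%C2%D7").decode("gb2312"))
```

Like items 2] and 3], the last two functions return raw bytes; whether those bytes are GB2312 depends on what produced the escapes in the first place.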
Basically, jv-convert (part of GCJ) is the best tool I know for converting between encodings on UNIX/Linux:
jv-convert --from UTF-8 --to gb18030
Its help output:
jv-convert --help
Usage: jv-convert [OPTIONS] [INPUTFILE [OUTPUTFILE]]

Convert from one encoding to another.

   --encoding FROM
   --from FROM        use FROM as source encoding name
   --to TO            use TO as target encoding name
   -i FILE            read from FILE
   -o FILE            print output to FILE
   --reverse          swap FROM and TO encodings
   --help             print this help, then exit
   --version          print version number, then exit
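
Where jv-convert is not installed, the same UTF-8 to GB18030 conversion can be approximated with Python's built-in codecs. This is only a sketch, not a replacement for the tool: convert_encoding and convert_file are hypothetical helper names, and the defaults simply mirror the --from UTF-8 --to gb18030 example above.

```python
def convert_encoding(data: bytes, src: str = "utf-8", dst: str = "gb18030") -> bytes:
    """Decode bytes from the source encoding and re-encode them in the target."""
    return data.decode(src).encode(dst)


def convert_file(in_path: str, out_path: str,
                 src: str = "utf-8", dst: str = "gb18030") -> None:
    """Convert a whole file; reads it into memory, so meant for small files."""
    with open(in_path, "rb") as f:
        data = f.read()
    with open(out_path, "wb") as f:
        f.write(convert_encoding(data, src, dst))
```

Swapping the src and dst arguments gives the equivalent of --reverse. A character that the target encoding cannot represent raises UnicodeEncodeError rather than being silently dropped.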
