[imc-sf-active] utf-8
mark burdett
mark at indymedia.org
Thu Jul 28 12:14:03 PDT 2005
seems like now might be a good time to "force" all sf-active
sites & databases to be utf8, or at least to make it an easily
configurable option.
i changed radio.indymedia.org to be a utf-8 site; it was easy.
i wanted to change the db to utf8 as well, so that searches
and everything else work seamlessly, e.g.:
http://radio.indymedia.org/news/?keyword=peri%C3%B3dico
run a sql command for each table:
ALTER TABLE webcast MODIFY COLUMN article text CHARACTER SET utf8,
  MODIFY COLUMN summary text CHARACTER SET utf8 [etc.]
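if you have a lot of tables, the statement above can be generated
instead of typed by hand. this is only a rough sketch:
build_utf8_alter() is a made-up helper, not something that ships
with sf-active, and you have to supply the column list yourself.

```php
<?php
// hypothetical helper (not part of sf-active): builds one ALTER TABLE
// statement that switches the listed columns to utf8.
// $columns maps column name => column type, e.g. 'article' => 'text'.
function build_utf8_alter($table, array $columns)
{
    $parts = array();
    foreach ($columns as $name => $type) {
        $parts[] = "MODIFY COLUMN `$name` $type CHARACTER SET utf8";
    }
    return "ALTER TABLE `$table` " . implode(', ', $parts);
}

// table and column names are the ones from the example above
echo build_utf8_alter('webcast', array('article' => 'text', 'summary' => 'text')), "\n";
```

note that MODIFY COLUMN replaces the whole column definition, so if
a column is NOT NULL or has a DEFAULT, put that in the type string
too (e.g. 'varchar(255) NOT NULL') or it will be dropped.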
note: if your database holds anything other than latin1 data,
this probably won't work right and the text will get garbled.
there are various docs online to help you deal with this, so at
least make sure you back up the old database first!
you also might as well set the default character set to utf8
for each table and the database as a whole, so if something
is added in the future it will be utf8.
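for the defaults, something like the following would print the
statements to run. again just a sketch: build_default_charset_sql()
is a made-up helper, and 'sfactive' is a placeholder database name,
so substitute your own.

```php
<?php
// hypothetical helper (not part of sf-active): returns the statements
// that set utf8 as the default for the database and for each table.
function build_default_charset_sql($database, array $tables)
{
    $sql = array("ALTER DATABASE `$database` DEFAULT CHARACTER SET utf8");
    foreach ($tables as $table) {
        $sql[] = "ALTER TABLE `$table` DEFAULT CHARACTER SET utf8";
    }
    return $sql;
}

// 'sfactive' is a placeholder database name; 'webcast' is from the post
foreach (build_default_charset_sql('sfactive', array('webcast')) as $statement) {
    echo $statement, ";\n";
}
```

changing the default only affects columns added later; existing
columns still need the per-column change described earlier.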
add this to get_connection() in db_class.inc:
mysql_query('SET CHARACTER SET utf8');
and you could also add
mysql_query('SET NAMES utf8');
mysql_query("SET COLLATION_CONNECTION='utf8_general_ci'");
change the <meta> header for your web and admin pages to
charset=utf-8
put this in your httpd.conf:
AddDefaultCharset UTF-8
so your server is sending out the correct header.
remove any utf8_encode() calls from syndication classes. also
remove any translation being done for windows characters in posts.
this stuff will all be utf-8 now, so it's not necessary.
if you are aggregating feeds onto your site, remove any
utf8_decode() calls from these.
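to see why the leftover encode calls have to go: PHP's utf8_encode()
assumes its input is latin1, so running it over text that is already
utf-8 encodes it a second time. the sketch below uses a small
stand-in function, latin1_to_utf8() (a made-up name), that does the
same byte-level conversion, so it runs without relying on the
built-in.

```php
<?php
// stand-in for utf8_encode(): treats every input byte as a latin1
// character and writes it out as utf-8. hypothetical name, for
// illustration only.
function latin1_to_utf8($s)
{
    $out = '';
    $len = strlen($s);
    for ($i = 0; $i < $len; $i++) {
        $b = ord($s[$i]);
        if ($b < 0x80) {
            $out .= chr($b);                 // ascii passes through
        } else {
            $out .= chr(0xC0 | ($b >> 6))    // two-byte utf-8 sequence
                  . chr(0x80 | ($b & 0x3F));
        }
    }
    return $out;
}

$already_utf8 = "peri\xC3\xB3dico";          // the search keyword, in utf-8
$double = latin1_to_utf8($already_utf8);

echo bin2hex($already_utf8), "\n";           // 10 bytes
echo bin2hex($double), "\n";                 // 12 bytes: the accent is mangled
```

the second line of output is longer because the two bytes of the
accented character have each been re-encoded as a separate
character, which is the usual garbled-accent symptom.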
that's it. well of course there are some other considerations
re: multibyte text but this is basic functionality.
this stuff can easily be placed in "if" statements based on
sfactive.cfg but might be worth forcing utf8 on everyone at
some point...
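a rough sketch of what the "if" version could look like. the
$use_utf8 flag and charset_setup_queries() are made up for
illustration; sfactive.cfg has no such setting that i'm pointing
at, you would pick your own name for it.

```php
<?php
// hypothetical helper: returns the connection setup queries from this
// post when a (made-up) utf8 flag is switched on, and nothing otherwise.
function charset_setup_queries($use_utf8)
{
    if (!$use_utf8) {
        return array();
    }
    return array(
        'SET CHARACTER SET utf8',
        'SET NAMES utf8',
        "SET COLLATION_CONNECTION='utf8_general_ci'",
    );
}

// inside get_connection() each query would be passed to mysql_query();
// here they are only printed so the sketch runs without a database.
foreach (charset_setup_queries(true) as $query) {
    echo $query, "\n";
}
```

the same flag could then guard the <meta> charset and the
encode/decode calls, so a site can switch over in one place.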
--mark B.