Saturday, January 3, 2009

MySQL and Unicode

I just wanted to point out. MySQL STILL uses ISO-Latin-1 as the default encoding. LAME!!! I ran into again the problem today of having to set up my production database to use UTF-8. Not a difficult thing, but a hassle none the less. The thing is in these days everyone is using UTF-8, I mean i18n and l13n are two big deals. Some web programming frameworks make it a huge point of how well they handle the two.

Well anyway what to do? I had already created the database and tables, in the default ISO-Latin-1 encoding, and there was a few months of data in them. Therefore I had to start by fixing the encoding on the database and tables. Here are the commands that can make this happen:
alter table xxx character set = utf8;
alter database xxx character set = utf8;
Of course insert your table and database names. Now any new tables created in the database will have utf8 encoding. You probably WANT to make sure that everything in your database is UTF-8. The best way I know of doing this is my setting the proper values in the my.cnf file in /etc. If the file is not in /etc then you probably will need to create it. If someone else admins the machine and installed mysql there is a possibility also that the file is in another castle, er sorry location.

So crack open the my.cnf and insert this line:
default-character-set=utf8
Thats it, restart MySQL. To test this out create a new database, then a table in the database. Issue the 'show create table xxx' where xxx is the table name, you should see some text indicating it has UTF-8 as the encoding.

No comments: