Converting a Postgres Database to UTF8
February 27, 2011
I have a database that is struggling to store some of the data my users are entering. It's a default postgres database, which means it's in LATIN1 not UTF8. This post covers converting a database from LATIN1 to UTF8.
Dump Database
Use pg_dump to get the data from the database into a file.
pg_dump database_name -U owner_username > db.sql
Create new UTF8 Database
Next, we need to create the new database with UTF8 encoding. In my setup, this is done using the superuser login to postgres.
postgres=# create database database_utf8 with encoding 'UTF-8'; CREATE DATABASE postgres=# grant all on database database_utf8 to owner_username; GRANT postgres=#
Convert File
We now need to convert the database dump from latin1 to utf-8. Here's the command to do that:
iconv --from-code latin1 --to-code utf-8 db.sql > db-utf8.sql
We also need to change the client encoding in the file. Change the client_encoding line near the top from LATIN1 to UTF-8:
SET client_encoding = 'UTF-8';
Finally, connect to our new utf8 database and import the new file.
psql -U user user=> \c database_utf8 database_utf8=> \i db-utf8.sql