Nowadays it is a standard that all web sites and servers run at least with UTF-8 encoding as default, however you can find cases where it is LATIN1 or even ASCII.
The case we are dealing with today is a Debian server, version 5.0.3 (etch), which used encode LATIN1 as default encoding when installed and configured the Postgresql, and not let me create a database with UTF-8 encode, showing following error:
debian:~# su postgres -c 'createdb teste3 -E UTF-8' createdb: database creation failed: ERROR: encoding UTF8 does not match server's locale en_US DETAIL: The server's LC_CTYPE setting requires encoding LATIN1.
When Postgresql runs first time are created a cluster using default system encode, and after that, we can’t change cluster encoding. The default cluster are know as ‘main’.
In case of this Debian server, the default encode was configured to en_US, what for Debian means en_US.ISO-8859-1. By default in Debian Lenny, in my new instalations, system uses en_US.UTF-8 as default encoding, so, this server was probably an Debian Etch upgraded to Debian Lenny
After some research, I discovered that, after configure the server encode to use UTF-8, I can create a new cluster with UTF-8 support. See the walktrough:
- Creating a new cluster: Create a new cluster is a very simple operation:
debian:~# pg_createcluster 8.3 utf8_cluster Creating new cluster (configuration: /etc/postgresql/8.3/utf8_cluster, data: /var/lib/postgresql/8.3/utf8_cluster)... Moving configuration file /var/lib/postgresql/8.3/utf8_cluster/postgresql.conf to /etc/postgresql/8.3/utf8_cluster... Moving configuration file /var/lib/postgresql/8.3/utf8_cluster/pg_hba.conf to /etc/postgresql/8.3/utf8_cluster... Moving configuration file /var/lib/postgresql/8.3/utf8_cluster/pg_ident.conf to /etc/postgresql/8.3/utf8_cluster... Configuring postgresql.conf to use port 5433... debian:~# /etc/init.d/postgresql-8.3 restart
The cluster runs on first free port after 5432, unless that you specify. Do a Postgresql restart to take effect.
- Creating the user: Now, create the user to access database, I recommend that you use same username and password for user, and, set your authentication for trust on cluster’s pg_hba.conf:
debian:~# su postgres -c "createuser -P pyuser -p 5433" Enter password for new role: Enter it again: Shall the new role be a superuser? (y/n) n Shall the new role be allowed to create databases? (y/n) n Shall the new role be allowed to create more new roles? (y/n) n
- Populating new databases: Now you move databases to new cluster. First, you createdb and after run a pg_dump | psql to dump and populate database very fast, but be careful with ports:
debian:~# su postgres -c "createdb teste3 -e -p 5433 -O pyuser" CREATE DATABASE teste3 OWNER pyuser;
Now list the databases:
debian:~# su postgres -c "psql -l -p 5433" List of databases Name | Owner | Encoding -----------+----------+---------- postgres | postgres | UTF8 template0 | postgres | UTF8 template1 | postgres | UTF8 teste3 | pyuser | UTF8 (4 rows)
After create database, dump and populate:
debian:~# pg_dump -U pyuser teste3 | psql -U pyuser -p 5433 teste3
- Configuring your application: That’s are one small part, just change settings of your application to connect on port 5433, or what you choose.
Do each step after creation of database for each database that you need to move, and have all working with UTF-8.