How to migrate typo from mysql to postgresql
I almost always use postgresql when working on a rails application. I won’t list all the little reasons why, but a major reason is for transactional DDL statements, which means when I run a migration that fails, I don’t have to then go run a bunch of cleanup queries to get my database back to how it was before the migration was ran.
When I was setting up this instance of typo, I decided I’d go ahead and go with mysql since I didn’t plan to hack on typo very much. Long story short: I decided to migrate from mysql to postgresql. This howto was done with mysql 5.0.70, postgresql 8.3.5, and typo 5.1.3 It probably will work with any mysql 5+ and postgresql 8+.
In case anybody else out there might be interested in doing likewise, here’s how I did it. These steps will be for a production database, but the changes required for doing it to a development database should be obvious.
Step 0: Backup your data
You don’t really need to be told this, do you?
Step 1: Dump the data from mysql
run the following to dump the data.
mysqldump --compatible=postgresql --no-create-info -u root -p --skip-extended-insert --complete-insert --skip-opt typo > typo.dump
We are only dumping the data, hence the –no-create-info option.
Step 2: Create your postgresql database
You can do this however you see fit. I’ve included how I do it in case it’s useful:
CREATE USER typo_prod;
CREATE DATABASE typo_prod OWNER typo_prod ENCODING 'utf8';
\password typo_prodand enter the password you wish to use.
Step 3: Change your database.yml to use your new postgresql database
Again, do this however you want. Here’s my database.yml with passwords omitted:
defaults: &defaults
database: typo
adapter: postgresql
encoding: utf8
host: localhost
password:
development:
username: typo_dev
database: typo_dev
<<: *defaults
test:
username: typo_test
database: typo_test
<<: *defaults
production:
username: typo_prod
database: typo_prod
password:
host: salmon
<<: *defaults
Step 4: Create the schema in your new database
To do this we’ll run the db:migrate rake task
RAILS_ENV="production" rake db:migrate
Step 5: Fire up a rails console to fix stuff
Now we need to fire up a rails console to do a lot of necessary cleanup work before we can import our data
ruby script/console production
Once it’s ready to go, type (or more practically, copy pasta)
conn = ActiveRecord::Base.connectionWe’ll need this for a lot of the commands we have yet to run. You’ll keep this console open for the remainder of this howto. Any ruby code you see in this document will go into this console.
Step 6: Remove data created during the migrations
The typo migrations automatically add some default data, like some default pages/articles/blog. All of the data we want is in the dump we created earlier. Let’s delete all this stuff that’s in the way
conn.tables.each do |table|
conn.execute "delete from #{table}"
endStep 7: Temporarily change boolean columns to integers
mysqldump dumps it’s booleans as 0/1. These are interpreted by postgres as integers. It will not automatically cast these into booleans just because the column is boolean (I’m not sure why.) It’s too time consuming to go add casts to all of these 0/1’s, and a regular expression to use with sed would be far too complex to bother with since not all 1’s and 0’s in the dump correspond to boolean data.
So, we will temporarily change the boolean columns in our shiny new database to integers. Before we do this, we need to temporarily drop the defaults for these boolean columns because there won’t be an implicit cast from false/true to 0/1.
This code will build a couple of hashes to store which columns are booleans and what the defaults are.
bools = {}
defaults = {}
conn.tables.each do |table|
conn.columns(table).each do |col|
if col.type.to_s == "boolean"
(bools[table] ||= []) << col.name
(defaults[table] ||= {})[col.name] = col.default if !col.default.nil?
end
end
endhere’s the value of bools and defaults in my console after the above code:
#bools
{"resources"=>["itunes_metadata", "itunes_explicit"], "contents"=>["published",
"allow_pings", "allow_comments"], "users"=>["notify_via_email",
"notify_on_new_articles", "notify_on_comments", "notify_watch_my_articles",
"notify_via_jabber"], "feedback"=>["published", "status_confirmed"],
"categorizations"=>["is_primary"]}
#defaults
{"contents"=>{"published"=>false}, "feedback"=>{"published"=>false}}Let’s now temporarily drop the defaults
defaults.each_pair do |table,cols|
cols.each_key do |col|
conn.execute "alter table #{table} alter column #{col} DROP DEFAULT"
end
endNow let’s alter the column types for the columns in bools.
We’ll use a closure to run the alter statements, so that we can use it again later to alter them back to booleans.
change_to_type = proc {|to_type|
bools.each_pair do |table, cols|
cols.each do |col|
conn.execute "alter table #{table} alter column #{col} type #{to_type}
USING (#{col}::#{to_type});"
end
end
}
change_to_type.call :integer
Step 8: Load the data dump into the new database
Ah, finally. Let’s load the data. Back to a shell in a directory with the dump, run:
sed "s/\\\'/\'\'/g" typo.dump | sed "s/\\\r/\r/g" | sed "s/\\\n/\n/g" | psql -1 typo_prod
Pass whatever options you need to connect to psql as you normally would. The first sed converts all of the \’ to two consecutive ‘s, which is what psql expects. The next two calls to sed in the pipeline replace the escaped carriage returns and newlines with actual carriage returns and newlines, which is again what psql expects.
You may get a couple warnings, but hopefully no errors. The few warnings I received were inconsequential.
Step 9: Change the boolean columns back to boolean and restore the default columns
Back to our rails console. We now have the data in place and can change the columns back using our closure from earlier:
change_to_type.call :booleanAnd then restore the defaults we dropped:
defaults.each_pair do |table, cols|
cols.each_pair do |col, default|
conn.execute "alter table #{table} alter column #{col} SET DEFAULT #{default}"
end
endStep 10: Repair the sequences.
Another annoying aspect of postgresql is that inserting a value into a serial column doesn’t automatically advance the sequence to be ready to serve up an unused value. There will be a sequence called “#{table}idseq” for each table with an id column in the database.
We manually have to advance all of the sequences:
conn.tables.each do |table|
if conn.columns(table).detect{|i|i.name == "id"}
conn.execute "SELECT setval('#{table}_id_seq', (SELECT max(id) FROM #{table}))"
end
endConclusion
So that should do it. Restart your mongrel cluster (or whatever you are using to manage your rails server processes) and you should now be using your blog with a postgresql backend!
Posted in Ruby, Ruby on Rails, typo, PostgreSQL | 2 comments |
Trackbacks<
Use the following link to trackback from your own site:
http://nopugs.com/trackbacks?article_id=18
22/04/2009 at 13h06
It works for me. Thanks for this post.
04/06/2009 at 13h51
Hi, cool post. I have been wondering about this topic,so thanks for writing.