Django DB Migration Hiccup

I just ran into a minor issue that should have been very simple but took me a few moments to understand, so I wanted to quickly share the solution.

I've been running Watershed (my just-for-fun side project) in a very lazy debug mode for a small handful of users so far. That means DEBUG = True, and it also means the database was still SQLite. I'll fix DEBUG soon, when I set up proper separation between prod and dev. The issue here is something I ran into while migrating from SQLite to PostgreSQL.

The simplest way to change over to a new database engine, if you don't have anyone depending on uptime yet (I didn't!), is:

  1. Set up PostgreSQL (create the user and database, sort out permissions and socket settings, etc.)
  2. $ python manage.py dumpdata > data.json
  3. Change settings.py to point to your new db
  4. $ python manage.py migrate
  5. $ python manage.py loaddata data.json
  6. Redeploy
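
For reference, step 3 might look something like the sketch below. The database name, user, and environment variable names are made-up placeholders for illustration, not Watershed's real values.

```python
# settings.py (excerpt): a sketch of step 3 with placeholder values
import os

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        # Hypothetical names; use whatever you created in step 1.
        "NAME": os.environ.get("WATERSHED_DB_NAME", "watershed"),
        "USER": os.environ.get("WATERSHED_DB_USER", "watershed"),
        "PASSWORD": os.environ.get("WATERSHED_DB_PASSWORD", ""),
        # With the PostgreSQL backend, an empty HOST connects over a
        # Unix domain socket instead of TCP.
        "HOST": os.environ.get("WATERSHED_DB_HOST", ""),
        "PORT": os.environ.get("WATERSHED_DB_PORT", ""),
    }
}
```

The order of the steps matters: dumpdata reads from whichever database settings.py currently points at, so step 2 has to run before this change goes in.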

I ran into trouble on step 5. Watershed's backend relies heavily on Django REST framework, including its authtoken app for token authentication. loaddata hit a duplicate key error when it tried to import those tokens.

First I just tried running loaddata a couple more times, because why would a freshly migrated database already have any keys in any table? That didn't work. So I opened up data.json in vim just to poke around. It looked like perfectly reasonable data (which I knew had been running fine anyway!), but it still wouldn't load.

Next I decided to load in all the data except the authtoken rows. If it worked I could just focus on the authtoken data without the entire DB hanging in the balance. I used python -m json.tool data.json > pretty.json to format the dump and opened it with vim. I cut out the authtoken data and put it in a separate json file that could be loaded on its own, just in case I needed to be able to do that. With the tokens removed, pretty.json loaded flawlessly. So at least everything else looked fine.
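
I did the cutting by hand in vim, but the same split can be scripted. Here's a minimal sketch, assuming the dump is the usual dumpdata JSON (a list of records, each with a "model" key) and that the token rows are labelled authtoken.token. The function names and the authtoken.json filename are made up for this example.

```python
import json

# Label dumpdata uses for Django REST framework's Token model.
TOKEN_MODEL = "authtoken.token"


def split_fixture(records, model_label=TOKEN_MODEL):
    """Return (kept, removed): records for other models, and records for model_label."""
    kept, removed = [], []
    for record in records:
        target = removed if record.get("model") == model_label else kept
        target.append(record)
    return kept, removed


def split_fixture_file(src, kept_path, removed_path, model_label=TOKEN_MODEL):
    """Split the fixture at src into two loadable fixture files."""
    with open(src) as f:
        records = json.load(f)
    kept, removed = split_fixture(records, model_label)
    for path, rows in ((kept_path, kept), (removed_path, removed)):
        with open(path, "w") as f:
            json.dump(rows, f, indent=2)
    return len(kept), len(removed)
```

Calling split_fixture_file("data.json", "pretty.json", "authtoken.json") would leave two files that can each be passed to loaddata. If you already know which model to leave out, dumpdata's --exclude option can skip it at dump time instead.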

At this point it occurred to me to open the dbshell and poke around. The problem table, authtoken_token, already had a row for every existing user, each with a created timestamp matching the time of the migration. Now, this is probably the better default for most cases, and not a real issue you'd run into in a production setup. But this isn't that! My app relies on saving the authtoken to local storage in a Chrome extension for authenticated requests, and the extension isn't yet smart enough to ask the user to log in again if their token is rejected. So, it was easiest for me to drop those rows, load in my old tokens, and make a few notes about filling out the authentication management in the extension.
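
Dropping the rows and reloading the old tokens can also be done from python manage.py shell rather than dbshell. This is a rough sketch, not exactly what I ran: replace_tokens is a hypothetical helper, and it assumes the old tokens were saved to their own fixture file.

```python
def replace_tokens(fixture_path):
    """Delete the existing tokens, then load the old ones from a fixture.

    Assumes it is called inside `python manage.py shell`, so Django is
    already configured. Returns the number of objects deleted.
    """
    # Imported here so the function can be defined without Django set up.
    from django.core.management import call_command
    from django.db import transaction
    from rest_framework.authtoken.models import Token

    # One transaction: if loaddata fails, the delete is rolled back too.
    with transaction.atomic():
        deleted, _ = Token.objects.all().delete()
        call_command("loaddata", fixture_path)
    return deleted
```

Wrapping both steps in transaction.atomic() means a failed load doesn't leave users with no token at all.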

If I had planned better and foreseen this class of problems, I could have fixed the extension to account for that possibility before migrating the database. In fact, it's almost certainly bad to assume that the token will never change; in more critical apps than mine, you might even regularly invalidate those tokens and generate replacements. But for now, things are mostly working again, and we're several steps closer to a real (beta) release. The rest, as always, can wait.
