Atom Stores
November 25, 2005
I’ve had bad luck moving anything with Unicode characters from one MySQL database to another. MySQL 4.0.x latin1 → 4.1.x utf8 is no fun. So when I built a tool to publish my blog, I thought I’d try something else. Instead of generating pages from a database, it reads the entire site from an Atom store. Thanks to the feedparser module, the program is just 100 lines of Python.
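Here is a rough sketch of what "reading the site from an Atom store" can look like. It is not my actual program: to keep it runnable without extra packages it swaps feedparser for the standard library's xml.etree.ElementTree, and the inlined STORE, the read_entries and render_page names, and the example.org ids are all made up for illustration.

```python
import html
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom 1.0 namespace (RFC 4287)

# Hypothetical two-entry store, inlined so the sketch runs by itself.
# In practice this would be a text file kept under version control.
STORE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Blog</title>
  <id>tag:example.org,2005:blog</id>
  <updated>2005-11-25T00:00:00Z</updated>
  <entry>
    <id>tag:example.org,2005:first-post</id>
    <title>First Post</title>
    <updated>2005-11-01T00:00:00Z</updated>
    <content type="html">&lt;p&gt;Hello.&lt;/p&gt;</content>
  </entry>
  <entry>
    <id>tag:example.org,2005:atom-stores</id>
    <title>Atom Stores</title>
    <updated>2005-11-25T00:00:00Z</updated>
    <content type="html">&lt;p&gt;Caf\u00e9 notes.&lt;/p&gt;</content>
  </entry>
</feed>"""


def read_entries(xml_text):
    """Return one dict per atom:entry, newest first."""
    feed = ET.fromstring(xml_text)
    entries = []
    for entry in feed.findall(ATOM + "entry"):
        entries.append({
            "id": entry.findtext(ATOM + "id"),
            "title": entry.findtext(ATOM + "title"),
            "updated": entry.findtext(ATOM + "updated"),
            "content": entry.findtext(ATOM + "content", default=""),
        })
    # Timestamps that all use the same UTC "Z" form sort correctly as strings.
    entries.sort(key=lambda e: e["updated"], reverse=True)
    return entries


def render_page(entry):
    """Wrap one entry in a bare-bones HTML page (assumes type="html" content)."""
    title = html.escape(entry["title"])
    return (
        '<html><head><meta charset="utf-8"><title>{0}</title></head>'
        "<body><h1>{0}</h1>{1}</body></html>"
    ).format(title, entry["content"])


if __name__ == "__main__":
    for entry in read_entries(STORE):
        print(entry["updated"], entry["title"])
```

A real publisher would write each rendered page to disk and build index and archive pages from the same list. The point is only that the store is the single source of truth.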
Since the Atom store is a text file, I didn’t need to re-invent version control; I use Subversion. When I move to another publishing tool, I won’t need to wrestle my data into its schema, as long as the new tool understands Atom stores. Lots of tools and libraries already understand Atom, so programmers have big shoulders to stand on. Google Sitemaps and Google Base understand Atom, for example.
Future tools should of course manage the store, just as something like WordPress manages a database now. The pending Atom Publishing Protocol suggests a simple API that toolsets can implement to do this.
There’s a lot to say about what you could do with all of this structured (and tagged) information, but my motivation here was practical. I didn’t want to fix OLD_PASSWORD and don’t want to suffer through another Unicode migration. For every column in your database, simply:
ALTER TABLE myTable MODIFY myColumn BINARY(255);
ALTER TABLE myTable MODIFY myColumn VARCHAR(255) CHARACTER SET utf8;
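For the curious, here is roughly what that pair of statements does, sketched in Python rather than SQL. The trip through a binary type drops the column's character set label without touching the stored bytes, and the second statement relabels those same bytes as utf8. The sketch assumes the usual failure case, UTF-8 bytes sitting in a column declared latin1. Python's latin-1 codec stands in for MySQL's latin1, which is an approximation, and both function names are made up.

```python
def misread_as_latin1(text):
    """Show what UTF-8 bytes look like when the column claims to be latin1."""
    return text.encode("utf-8").decode("latin-1")


def reinterpret_as_utf8(garbled):
    """Go back to the raw bytes (the BINARY step), then read them as UTF-8."""
    raw = garbled.encode("latin-1")   # same bytes, no character set attached
    return raw.decode("utf-8")        # relabel the bytes as utf8


if __name__ == "__main__":
    original = "caf\u00e9 na\u00efve"
    garbled = misread_as_latin1(original)
    print(repr(garbled))
    print(repr(reinterpret_as_utf8(garbled)))
```

Nothing is converted at any point; only the label on the bytes changes. That is also why the trick goes wrong on a column whose bytes really are latin1.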
Most blog software will generate an RSS or Atom feed that you can use to get your content back. (Database dumps or generic XML exports are no good.) Once your data’s in a simple and flexible format, why should you put it back into MySQL?