How to do data portability

I’ve heard a lot about data portability conferences and workshops. I’ve even been criticized for not going to one that happened on the west coast while I was in the east earlier this month. I don’t plan to go to any of them, because I don’t see what’s accomplished by having public meetings about this stuff. People who control users’ data can accomplish a lot more by finding ways to give users the power to use it more effectively. Talking about principles of data portability only achieves talk. It gives people a sense of ownership over the talk, not the data, and people who give up ownership of the talk are just yielding the floor, not yielding any power over users.
The best way to achieve data portability is to just do it.
I know that sounds silly, or obvious, but there is so much pretending that there’s more to it that it has to be said.
If you want to accomplish something by talking, call up a friend who works at Netflix or Yahoo and ask them if they’ll let users move their movie rating data around. I’ve been asking about this for years. No one’s email addresses are involved. All I want is the power to give Netflix permission to read an XML file on yahoo.com that contains my movie rating data (assuming Yahoo goes first). Anyone can see how much power this would give Yahoo. Why don’t they do it? I honestly don’t know. If I were them, I would.
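To show how small this problem is, here is a minimal sketch in Python of both sides of the exchange. Neither Yahoo nor Netflix publishes a format like this; the `movieRatings` and `rating` element names, the attributes, and the two function names are all made up for illustration, and fetching the file over the web with the user's permission is left out.

```python
import xml.etree.ElementTree as ET


def export_ratings(user, ratings):
    """Publishing side: turn {title: stars} into an XML string.

    The element and attribute names are hypothetical, not a real format.
    """
    root = ET.Element("movieRatings", {"user": user})
    for title, stars in sorted(ratings.items()):
        ET.SubElement(root, "rating", {"title": title, "stars": str(stars)})
    return ET.tostring(root, encoding="unicode")


def import_ratings(xml_text):
    """Reading side: turn that XML back into {title: stars}."""
    root = ET.fromstring(xml_text)
    return {r.get("title"): int(r.get("stars")) for r in root.iter("rating")}


if __name__ == "__main__":
    xml_text = export_ratings("dave", {"Casablanca": 5, "Ishtar": 1})
    print(xml_text)
    print(import_ratings(xml_text))
```

The first function is what the service that goes first would run, the second is what the other service would run after the user points it at the file. Nothing here requires an agreement on principles, only on a few element names.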
Another example: if Twitter wanted to buy itself some time and growth, and give developers something exciting to do, they would store as much user profile data as they can off the twitter.com servers and on Amazon, in simple XML formats. They could use some of their ability to raise investment capital (which they have proven) to grow the human network while they patch up or rewrite their system software. The more data they can move off their outage-prone systems, the more the network can grow around them without being dependent on them. Amazon has proven they can keep their servers running. Leverage that.
The discussion about data portability so far has fixed on the hardest, most vexing technical, privacy and economic issues, the ones that probably don’t have a resolution. My advice is to instead pick a few relatively easy data portability problems and solve them. Flying around the world to go to conferences to talk about the hardest problems won’t actually achieve any data portability.
Update: Brad Feld argues for APIs. A few months ago I would have agreed, but today I don’t think an API is enough. As we’ve seen with Twitter, when the service goes down, there is no API and there is 100 percent lock-in. We need more. The most vital data must be stored off-site, so it doesn’t go away when the service goes down.
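Here is a rough sketch in Python of what the off-site copy buys a developer. The function `load_profile` and its two arguments are hypothetical: each argument is any callable that returns the user's data, one reading from the service's own API and one reading the copy stored elsewhere. This is an illustration of the idea, not how any real client works.

```python
def load_profile(fetch_primary, fetch_mirror):
    """Try the service first; if it is down, read the off-site copy.

    Both arguments are caller-supplied functions (hypothetical names).
    Network failures in Python's standard library are subclasses of
    OSError, so that is the only error treated as "the service is down".
    """
    try:
        return fetch_primary(), "primary"
    except OSError:
        return fetch_mirror(), "mirror"


if __name__ == "__main__":
    def service_is_down():
        raise ConnectionError("service unavailable")

    def off_site_copy():
        return "<profile name='dave'/>"

    print(load_profile(service_is_down, off_site_copy))
```

With only an API, the `except` branch has nowhere to go. With the data stored off-site, the outage becomes something an application can route around.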
