2010-05-21

Exporting Email From Myspace With Google Gears in Firefox on Linux

Myspace doesn't appear to have a way of natively allowing exporting messages.  What follows is a proof-of-concept kludged script to do so.

I desired to retrieve 50 email messages from my now defunct Myspace inbox.

I didn't have any messages in sub folders, nor did I care for the single message I had in my 'sent' folder.

For whatever reason, it doesn't appear that Sent messages are being saved in the Gears database, though it ought parse those in sub folders.

I make no effort to parse and scrape user images; however, doing so appears as though it would be trivial.

Of further note: for a user that has an account number of 12345678, a user id of jdoe, and a display name as "Johnny D0e LOLZ!!!!", Myspace appears to store another display name ("John Doe") that it shows under a user's picture in email; this name is not accessible via the Gears database, as far as I can tell.  (The example above is based on an actual Myspace 'friend')

Regarding Myspace's formatting, I found the Firefox add-on "SQLite Manager" exceptionally useful for advancing my understanding of how Myspace was saving the messages; just using sqlite's .dump function didn't make things like rowid clear for a sql-language amateur such as myself;  anyone attempting to perhaps pursue the creation of a real parser would likely benefit from the use of sqlite manager.

Please note, the CreatedData appears to have milliseconds tacked on; hence the weirdness I do with trimming.  My timezone is MDT; please adjust the +/- for hour accordingly.

This all is done in Firefox on Xubuntu, as Google Gears was not available for Google Chrome on Linux.

Steps:
  1. Install Google Gears: Google Gears
  2. Log in to Myspace and go to your inbox; obviously, give Gears permission to sync.  Wait until messages are saved to hdd.
  3. Created a directory under your home directory, perhaps "myspace_emails" or similar would be sensible.
  4. With your editor of choice (or cat >> scrapemyspace.sh for the lazy/efficient) create file and copy/paste my script below into it. Give it +x.
  5. Now simply run script (sh name_you_used.sh); if you wish to have all emails in ONE file, in addition to the separate ones created, do: script.sh >> allmyspaceemails.txt or whatever; this was what I needed, and why I used tee.  The dump to the sqlite3 file is for debugging, and contains the sql commands to rebuild the database (for example, in mysql...). Comment out if not required.
  6. Done!

If the location where your Gears data is stored changes, simply change the path variables at the top; also useful if you want to test, by copying the local database to your home directory.

The *.default may need to be changed (~/.mozilla/firefox/profiles.ini) for multiple users.

There is NO error checking in my script: if this makes you uneasy, please feel free to add some.

You may use this script however you wish, as long as you give me attribution, but the following disclaimer also applies:

This script is provided "AS IS", without warranty of any kind, stated or implied, including without limitation the warranties of merchantability, fitness for a particular purpose and non-infringement. The entire risk as to the quality and performance of this script is borne by you. Should the script prove defective, you and not me assume the entire cost of any service and repair. etc. etc. (If you don't understand what this script does, please find someone who does, and ask them if it's okay for you to run it, instead of trusting a random person on the intrawebs.)

Final disclaimer: I know less about bash scripting than I do about SQL. Beware.

That said, enjoy!

# Script to parse Google Gear sqlite3 db with Myspace Emails
# Copyright (C) 2010, Seth Stahlman. All rights reserved
#!/bin/bash
dbpath=~/.mozilla/firefox/*.default/Google*/messaging.myspace.com/http_80/
dbfile='myspace.messaging.database\#database'
db="${dbpath}${dbfile}"
cut='cut -d, -s'
sqlite3 $db ".dump" >> myspace_email.sqlite3.sql
authors=`sqlite3 -csv $db "select rowid, * from AuthorData_content;"`
echo "$authors" |
while read a
do
        uid=`echo $a | $cut -f1`
        dn=`echo $a | $cut -f2`
        un=`echo $a | $cut -f3`
        echo "Messages from $dn ($un) (Account ID: $uid)\n" | tee -a "${uid}_${un}"
        msgids=`sqlite3 -list -separator , $db "select MessageId, \
                datetime(substr(CreatedDate, 1, 10), 'unixepoch', '+1 hour') as date from MessageMetaData where AuthorId = $uid;"`
        echo "$msgids" |
        while read m
        do
                messageid=`echo $m | $cut -f1`
                messagedate=`echo $m | $cut -f2`
                subject=`sqlite3 -list $db "select c0Subject from MessageData_Content where rowid = $messageid;"`
                body=`sqlite3 -list $db "select c1Body from MessageData_Content where rowid = $messageid;"`
                echo "Message ID: $messageid\nDate: $messagedate\nSubject: $subject\n\n$body\n" | tee -a "${uid}_${un}"
        done
done