The emlx Format

Starting with OS X 10.4, Apple's Mail program stopped using the standard mbox format for messages and started using a proprietary one-file-per-message format. This format appears to make Spotlight's job easier, but it makes it harder to export messages from, especially without getting help from

I spent some time figuring out the file format. It has three parts:

1. The length of part 2, in bytes
2. The message itself
3. Message metadata

Part one is the length of the message itself, written in ASCII decimal and terminated by 0x0a.

Part two is the message itself, headers and body: exactly what should be written to a file in Maildir format. It does not contain mbox-style ">From" escaping.

Part three is an XML Property List. A sample metadata section looks like:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "">
<plist version="1.0">
    <string>"Redacted" &lt;redacted&gt;</string>
    <string>Re: Sierra County Realty</string>

Most of the sections are pretty self-explanatory and duplicate data already in the message, with a single exception: flags.
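The three-part layout described above can be sketched in a few lines of Python. This is only a minimal illustration, not the code from my tool: the function name `parse_emlx` is made up for this example, and it assumes the property list starts right after the message bytes (leading whitespace is stripped just in case).

```python
import plistlib


def parse_emlx(data: bytes):
    """Split raw .emlx bytes into (message_bytes, metadata_dict).

    Hypothetical helper for illustration only.
    """
    # Part 1: ASCII decimal length of the message, terminated by 0x0a.
    newline = data.index(b"\n")
    length = int(data[:newline].decode("ascii").strip())

    # Part 2: exactly `length` bytes of message (headers and body).
    start = newline + 1
    message = data[start:start + length]
    if len(message) != length:
        raise ValueError("file is shorter than the declared message length")

    # Part 3: whatever remains is the XML property list.
    # Assumption: stripping leading whitespace is harmless here.
    metadata = plistlib.loads(data[start + length:].lstrip())
    return message, metadata
```

Calling `parse_emlx(open(path, "rb").read())` should give back the message bytes, ready to be written into a Maildir, plus a dictionary holding the metadata, including the flags value.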

At this point I suppose I should explain the point of this whole endeavor. Thursday night my wife's hard drive died. Fortunately, with the help of rdiff-backup, we have a very recent backup of her user directory, and I found a good deal on a new hard drive. But her email is inaccessible. She uses POP3, so the only copy of most of her email is locked away within Apple Mail. I've been trying to convert her to IMAP for some time, and this is the perfect opportunity. What I want to do is convert Mail's mail store to Maildir and just plop it on the server. There is a tool to convert to mbox, and there are lots of ways to go from mbox to Maildir, but it's a GUI tool not well suited to automation (my wife has a lot of email folders). So, being the overly optimistic engineer that I am, I'm building a tool to do it myself.

Back to the point. A fellow going by jwz has learned the meaning of the flags field.

Bit(s)   Meaning
7        initial (no longer used)
10-15    attachment count
24       is junk
25       is not junk
26-28    font size delta
29       junk mail level recorded
30       highlight text in toc
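To make the bit positions concrete, here is a rough sketch of pulling those fields out of the flags integer. The function name `decode_flags` and the dictionary keys are invented for this example, and it assumes bit 0 is the least significant bit of the number. The font size delta is returned as a raw unsigned 3-bit value; whether it should be read as signed is not something I have confirmed.

```python
def decode_flags(flags: int) -> dict:
    """Decode the bit fields listed above from an .emlx flags integer.

    Hypothetical helper; assumes bit 0 is the least significant bit.
    """

    def bits(low: int, high: int) -> int:
        # Extract bits low..high (inclusive) as an unsigned integer.
        width = high - low + 1
        return (flags >> low) & ((1 << width) - 1)

    return {
        "initial": bool(bits(7, 7)),
        "attachment_count": bits(10, 15),
        "is_junk": bool(bits(24, 24)),
        "is_not_junk": bool(bits(25, 25)),
        "font_size_delta": bits(26, 28),  # raw value, sign handling unknown
        "junk_level_recorded": bool(bits(29, 29)),
        "highlight_text_in_toc": bool(bits(30, 30)),
    }
```

Only the bits from the table are decoded; anything else in the integer is ignored.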

That concludes the format of the .emlx files. For completeness, I should describe the higher-level folder format. ~/Library/Mail/ contains subdirectories named after your various mail accounts, and a subdirectory named Mailboxes containing the local folders. Each folder is represented by two directories. Suppose you have a folder named INBOX. Then there will be the directory INBOX.mbox/Messages (or .imapmbox), which contains the .emlx files, and the directory INBOX, which contains the subfolders of INBOX. A folder that contains only subfolders and no messages is represented by a directory with the .sbd extension. So, if INBOX contained no messages, it would be represented by a directory named INBOX.sbd.
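The folder layout above lends itself to a simple directory walk. The sketch below is one possible way to do it, not how my tool necessarily works: `find_message_dirs` is a name made up here, and the choice to join nested folder names with "/" and drop the .sbd suffix is my own assumption about what a sensible folder name looks like.

```python
import os


def find_message_dirs(mail_root):
    """Yield (folder_name, [emlx paths]) for each mailbox under mail_root.

    Hypothetical helper for illustration only.
    """
    for dirpath, dirnames, filenames in os.walk(mail_root):
        # Messages live in <folder>.mbox/Messages or <folder>.imapmbox/Messages.
        if os.path.basename(dirpath) != "Messages":
            continue
        stem, ext = os.path.splitext(os.path.dirname(dirpath))
        if ext not in (".mbox", ".imapmbox"):
            continue

        # Build a folder name relative to the root, dropping any .sbd suffix
        # on container directories that hold only subfolders.
        rel = os.path.relpath(stem, mail_root)
        parts = [p[:-4] if p.endswith(".sbd") else p for p in rel.split(os.sep)]
        folder = "/".join(parts)

        messages = sorted(
            os.path.join(dirpath, name)
            for name in filenames
            if name.endswith(".emlx")
        )
        yield folder, messages
```

Pointing this at an account directory or at Mailboxes should list every folder along with the .emlx files it holds, which is the input a converter needs.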

Finally, check out emlx2maildir, the fruits of this labor.