Documentation for the BeyondMail archiver (bmarchiver)

This program is intended as a possible solution to the problem of keeping messages stored in BeyondMail available after you migrate to another mail system. It takes a BeyondMail export file (created in BeyondMail using the File>Export selection on the menubar) and processes it to create a large number of files: one file for each message in your mailbox, and an additional file for each attachment. It also creates a number of html indexes to make it easier to review or locate specific archived messages. The files bmarchiver creates are all either html, text, or rich text (.rtf), so no special software is needed to use the archive once it has been created.

What you need to run this program

This program should run on any Win32 platform that has Python 2.0 installed (Python is available for Windows 95, 98, NT, and 2000; not sure about ME), but it has only been tested on Windows NT 4.0 and Windows 98.

Python doesn't yet have the ability to create a standalone executable for Windows, so Python itself must be installed on the machine. bmarchiver was tested using ActivePython 2.0.200 and 2.0.203, but it should work with any Python 2.0 or later version.

If you don't have Python installed on your computer, you can get ActivePython from ActiveState. The Python homepage is http://www.python.org.

How to run this program

To run the archiver, you need to open a command window and enter a command that looks like this:

bmarchiver.py <beyondmail exportfile> <output directory>

or, on a machine that doesn't understand the .py file type:

python bmarchiver.py <beyondmail exportfile> <output directory>

Running bmarchiver.py without arguments will print a reminder of what the arguments are supposed to be, along with a version number.
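
The argument handling described above could look roughly like the following. This is only a sketch, not the code in bmarchiver.py: the function names and the VERSION value are placeholders, and it is written for a current Python 3 rather than the Python 2.0 the program targets.

```python
import sys

# Placeholder: the real version string is not given in this document.
VERSION = "x.y"


def parse_args(argv):
    """Return (export_file, output_dir), or None if the argument count is wrong."""
    if len(argv) != 3:
        return None
    return argv[1], argv[2]


def usage():
    """Build the reminder text printed when the arguments are missing."""
    return ("bmarchiver version %s\n"
            "usage: bmarchiver.py <beyondmail exportfile> <output directory>"
            % VERSION)


if __name__ == "__main__":
    args = parse_args(sys.argv)
    if args is None:
        print(usage())
    else:
        print("would archive %s into %s" % args)
```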

All the files generated will exist in a tree beneath <output directory>.

It is safer if the <output directory> is empty or nearly so, because file name collisions with files that already exist there are not handled. The only collisions that are handled are the ones a single bmarchiver run might generate all by itself (attachments). You can always move the generated file tree someplace else after bmarchiver is finished, as all the links in the index files are relative.
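
This document doesn't say how bmarchiver resolves attachment name conflicts. One common approach, shown here purely as an illustration, is to append a counter to the file name until an unused name is found. The function name unique_path and the "_1", "_2" suffix scheme are my assumptions, and the code is for a current Python 3.

```python
import os


def unique_path(directory, filename):
    """Return a path inside `directory` that does not exist yet.

    If `filename` is already taken, try name_1.ext, name_2.ext, and so on.
    (Illustrative scheme only; not necessarily what bmarchiver does.)
    """
    base, ext = os.path.splitext(filename)
    candidate = os.path.join(directory, filename)
    counter = 1
    while os.path.exists(candidate):
        candidate = os.path.join(directory, "%s_%d%s" % (base, counter, ext))
        counter += 1
    return candidate
```

The same check could in principle be applied to every generated file, which would make a non-empty output directory safe to use.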

Bmarchiver creates several html indexes for the mail. If the export file was named "lovely.msg" and you selected "d:\bmarchive" as the output directory, then you would run:

bmarchiver.py lovely.msg d:\bmarchive

and bmarchiver would create the following files:

d:\bmarchive\lovely.html -- This is an index of all the messages in the export file, in the order they existed in the export file. This is usually chronological within folder.

d:\bmarchive\lovelyNameIndex.html -- This is an index of all the senders and recipients of mail in the export file. The links point at separate indexes for each sender/recipient.

d:\bmarchive\lovelyFolderIndex.html -- This is an index of all the store/folder combinations which existed in the export file.

d:\bmarchive\lovelyChronoIndex.html -- This is an index of all the months for which messages existed in the export file.

d:\bmarchive\lovely\lowerLevelIndexes\lovely<sender or recipient>.html -- indexes of the mail from or to each person

d:\bmarchive\lowerLevelIndexes\lovely<-store-folder->.html -- indexes of the mail by folder

d:\bmarchive\lowerLevelIndexes\lovely-<year>-<monthnumber>-.html -- indexes of the mail by month

d:\bmarchive\<storename>\<foldername> directories are used to hold the messages and attachments from each exported folder.
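
To make the naming scheme concrete, the four top-level index names listed above can be derived from the export file name and the output directory as in this sketch. The function name is hypothetical, the suffix list is taken from the example above, and ntpath is used only so that Windows path rules apply on any platform. This is Python 3 written for illustration, not an excerpt from bmarchiver.

```python
import ntpath  # Windows path handling, available on every platform


def top_level_indexes(export_file, output_dir):
    """Return the four top-level index paths for an export file."""
    # "lovely.msg" -> "lovely"
    stem = ntpath.splitext(ntpath.basename(export_file))[0]
    suffixes = ["", "NameIndex", "FolderIndex", "ChronoIndex"]
    return [ntpath.join(output_dir, stem + suffix + ".html")
            for suffix in suffixes]


for path in top_level_indexes("lovely.msg", "d:\\bmarchive"):
    print(path)
```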

Problems and limitations

This program has only been tested against BM 3.5 export files. It is very likely that it would work on BM 3.0 export files.

The program doesn't understand messages that were sent to multiple people in one gateway list (#IN[person1@foo.com,person2@bar.com]); that needs to be fixed. Nor does it understand nested gateways--no examples were available to work with or test against.
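
A possible starting point for the gateway list fix is sketched below. It is based only on the single example above, so the assumed syntax (a "#", a gateway name, then a comma-separated list in square brackets) may not cover everything BeyondMail produces, and nested gateways are deliberately not attempted. The names GATEWAY_RE and split_gateway are hypothetical, and the code is Python 3.

```python
import re

# Assumed shape, inferred from one example: #NAME[addr1,addr2,...]
GATEWAY_RE = re.compile(r"^#(\w+)\[(.*)\]$")


def split_gateway(address):
    """Split a gateway address into (gateway_name, [recipients]).

    Addresses that don't look like a gateway list are returned
    unchanged as (None, [address]).
    """
    address = address.strip()
    match = GATEWAY_RE.match(address)
    if match is None:
        return None, [address]
    gateway, body = match.groups()
    recipients = [part.strip() for part in body.split(",") if part.strip()]
    return gateway, recipients
```

Each recipient returned this way could then get its own entry in the name index.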

Doesn't know about anything but the default US BeyondMail date format (MM/DD/YYYY HH:MM AM|PM).
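
For reference, the default format maps onto standard strptime codes as shown here. This is a Python 3 sketch with hypothetical names, not bmarchiver's own parser; it assumes a 12-hour clock and an English-style AM/PM marker, and supporting other formats would mean making the format string configurable.

```python
from datetime import datetime

# MM/DD/YYYY HH:MM AM|PM expressed as strptime codes
BM_DATE_FORMAT = "%m/%d/%Y %I:%M %p"


def parse_bm_date(text):
    """Parse a date in the default US BeyondMail format."""
    return datetime.strptime(text.strip(), BM_DATE_FORMAT)


stamp = parse_bm_date("03/13/2000 02:30 PM")
# A year-month key like this could drive a by-month index.
print(stamp.strftime("%Y-%m"))
```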

Message formatting (but not attachment formatting) may be slightly odd because mixed text/rtf messages are converted to html, and (more often) because I can't duplicate the BeyondMail functionality that handles line wrapping based upon window size in a browser-displayed text file. This is an area that could use improvement, or a user option.
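
One way a line wrap option might work is to rewrap only the lines that run past a chosen column, leaving shorter lines alone. The sketch below uses Python 3's standard textwrap module; the function name and the default of 76 columns are assumptions, and this is not how bmarchiver currently handles message text.

```python
import textwrap


def wrap_message(text, max_column=76):
    """Wrap only the lines longer than max_column; keep the rest as-is."""
    wrapped = []
    for line in text.splitlines():
        if len(line) <= max_column:
            wrapped.append(line)  # short and blank lines pass through
        else:
            wrapped.extend(textwrap.wrap(line, width=max_column))
    return "\n".join(wrapped)
```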

Indexes (usually the main index) with more than a thousand messages or so can be a little slow to load.

The main index is really an index sorted by folder, but the folder breaks are not displayed--they probably should be visible somehow, but this seems like a marginal requirement with the folder indexes now available.

Index by name is sorted by initial letter of email address because there is too much variation in how email addresses are formatted to parse out the last name correctly--the hope is that this way most of the messages from the same people will at least be in nearby folders.

No link to attachments is created in message body--attachments must be located via an index. This could only be avoided if we converted all the messages from plain text to html, which isn't obviously a good idea.

Performance has not been optimized, except for attachment processing, which was previously a major bottleneck for large attachments. A test 153MB export file containing a number of large attachments and about 4000 messages takes about 5 minutes to process on a 500MHz PIII.

No options. There should be an initialization file which would let you choose things like:

Font

Suppress/Generate various indexes

Column widths and/or suppression

Message text handling (html or text, max column to start line wrap)

Input or output date formats
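
Such an initialization file does not exist yet. If it were added, reading it might look something like this Python 3 sketch using the standard configparser module. Every option name, the section name, and all of the defaults are hypothetical. Interpolation is switched off so that a date format containing "%" can be stored as-is.

```python
import configparser

# Hypothetical option names and defaults; bmarchiver has no options today.
DEFAULTS = {
    "font": "Arial",
    "name_index": "yes",
    "folder_index": "yes",
    "chrono_index": "yes",
    "message_format": "text",
    "wrap_column": "76",
    "input_date_format": "%m/%d/%Y %I:%M %p",
}


def load_options(ini_text):
    """Merge the settings in ini_text over DEFAULTS and return a dict."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_dict({"bmarchiver": DEFAULTS})
    parser.read_string(ini_text)
    section = parser["bmarchiver"]
    return {
        "font": section["font"],
        "name_index": section.getboolean("name_index"),
        "folder_index": section.getboolean("folder_index"),
        "chrono_index": section.getboolean("chrono_index"),
        "message_format": section["message_format"],
        "wrap_column": section.getint("wrap_column"),
        "input_date_format": section["input_date_format"],
    }


options = load_options("[bmarchiver]\nfont = Courier\nchrono_index = no\n")
print(options["font"], options["chrono_index"], options["wrap_column"])
```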

Unhandled exceptions just generate a raw Python traceback, which is not as nice as it could be (good for debugging though!)
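
A friendlier arrangement would keep the traceback for debugging but show the user a short message. The following is a minimal Python 3 sketch of that idea, not something bmarchiver does; the function name run and the log file name are made up for the example.

```python
import sys
import traceback


def run(main, log_path="bmarchiver_error.log"):
    """Call main(); on failure, log the traceback and print a short message.

    Returns 0 on success and 1 on failure.
    """
    try:
        main()
        return 0
    except Exception as exc:
        # Keep the full traceback on disk for debugging.
        with open(log_path, "w") as log:
            traceback.print_exc(file=log)
        sys.stderr.write("bmarchiver failed: %s (details in %s)\n"
                         % (exc, log_path))
        return 1
```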

The program has a simple-minded RTF->HTML converter embedded in it for handling the mixed text-and-rtf format BeyondMail sometimes uses to store forwarded messages. It is likely that it is sometimes too simple-minded. I haven't seen any specific problem, but it has only gotten limited testing because I don't have that many mixed mode messages.

Copyright and Licensing

Copyright (c) 2001 by Matt Wilbert

This program is made available for use under the GNU General Public License. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

Feedback

This program is still under development. If you find bugs or need features it doesn't have, you can let me know at mwilbert@alum.mit.edu.

Documentation last updated by Matt Wilbert 03/13/2000