Importing mbox files
Joerg Walther
2024-02-24 11:01:34 UTC
Archive.org has mbox files of old Usenet articles for download, e.g.
here: https://archive.org/details/usenet-de
Now, is there a way to import them into Forte Agent 6? I'm running Linux
Ubuntu (and Agent in Wine), but I could also use the old Windows 10 PC.

Thx for any hints.

-jw-
--
And now for something completely different...
Ralph Fox
2024-02-24 18:19:27 UTC
Post by Joerg Walther
Archive.org has mbox files of old Usenet articles for download, e.g.
here: https://archive.org/details/usenet-de
Now, is there a way to import them into Forte Agent 6? I'm running Linux
Ubuntu (and Agent in Wine), but I could also use the old Windows 10 PC.
Thx for any hints.
-jw-
File >> Import and Export >> Import Messages

Help >> Index >> Importing, messages >> Import Message File Format

Even though that help page says "RFC-822" at the top it is actually mbox,
the Unix message file format. Agent's import should handle most variants
of mbox.
--
Kind regards
Ralph Fox
🦊

Where Bees are, there is honey.
Joerg Walther
2024-02-25 15:41:34 UTC
Post by Ralph Fox
File >> Import and Export >> Import Messages
Thanks, I thought so but wanted to be sure. BUT: it failed twice. One
file does get imported, but as a single news article 200000 lines
long; for another one the importer says it doesn't contain any news.
Strangely enough, Claws Mail imports both without any problems. Could
these failures be due to the fact that I am running Agent on Wine? I will
give it a try on my old Windows PC later today.

-jw-
--
And now for something completely different...
Joerg Walther
2024-02-25 18:01:45 UTC
Post by Joerg Walther
Could
these failures be due to the fact that I am running Agent on Wine? I will
give it a try on my old Windows PC later today.
Same behaviour on regular Windows 10. Next I'm going to import the file
into Claws Mail, export it again, then try to import the result into
Agent and report back here.

-jw-
--
And now for something completely different...
Ralph Fox
2024-02-25 19:44:43 UTC
Post by Joerg Walther
Post by Ralph Fox
File >> Import and Export >> Import Messages
Thanks, I thought so but wanted to be sure. BUT: it failed twice. One
file does get imported, but as a single news article 200000 lines
long; for another one the importer says it doesn't contain any news.
Strangely enough, Claws Mail imports both without any problems. Could
these failures be due to the fact that I am running Agent on Wine? I will
give it a try on my old Windows PC later today.
Running on Wine will not cause a problem. Agent can import mbox files
in Wine.

If it fails, it is probably because the "From " separator lines are
not in the format which Agent expects.


Checking the mbox file "de.alt.comp.mbox" from Archive.org...
* The "From " separator lines do not contain a date & time.
* Agent expects the "From " separator lines to contain a date and
time because (a) they normally do, and (b) this allows Agent's
import to handle variants of the mbox file format.
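
Before editing anything, it can help to check how many separator lines a
file has and how many of them carry a time of day. The following is only a
rough sketch, not something Agent provides: the function name
separator_stats is made up for illustration, and "has a date & time" is
approximated by looking for an HH:MM:SS pattern. Body lines that happen to
start with "From " are counted too, so treat the numbers as an estimate.

```python
import re

# Heuristic: a separator line "has a date & time" if it contains HH:MM:SS.
TIME_OF_DAY = re.compile(rb"\d{2}:\d{2}:\d{2}")


def separator_stats(data: bytes) -> tuple[int, int]:
    """Return (lines starting with 'From ', how many of those contain a time)."""
    total = 0
    with_time = 0
    for line in data.splitlines():
        if line.startswith(b"From "):
            total += 1
            if TIME_OF_DAY.search(line):
                with_time += 1
    return total, with_time
```

If the second number is zero, or far smaller than the first, the file is
likely to need the separator fix described in this post.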


If you make the following search-and-replace in Notepad++, the mbox
file "de.alt.comp.mbox" will import into Agent. This changes the
"From " separator line into the format which Agent expects:

* Find what: ^From [-0-9][0-9]*$
* Replace with: From ???@??? Thu Jan 1 00:00:00 -0000 1970
* Search mode: (*) Regular expression
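
For files too large to edit comfortably in Notepad++, the same
search-and-replace can be scripted. This is a minimal sketch that reuses
the pattern and the replacement line given above unchanged; the function
names fix_separators and fix_file are made up for illustration. It works
on bytes so that articles in mixed character sets pass through untouched,
and it keeps each line's original line ending.

```python
import re

# Same pattern as the Notepad++ "Find what" field, applied line by line.
SEPARATOR = re.compile(rb"^From [-0-9][0-9]*$")
# Same text as the "Replace with" field; the date is a fixed placeholder.
REPLACEMENT = b"From ???@??? Thu Jan 1 00:00:00 -0000 1970"


def fix_separators(data: bytes) -> bytes:
    """Rewrite date-less 'From ' separator lines, leaving all other lines alone."""
    out = []
    for line in data.splitlines(keepends=True):
        body = line.rstrip(b"\r\n")
        ending = line[len(body):]  # preserve LF or CRLF as found
        if SEPARATOR.match(body):
            out.append(REPLACEMENT + ending)
        else:
            out.append(line)
    return b"".join(out)


def fix_file(src: str, dst: str) -> None:
    """Read the mbox at src and write the fixed copy to dst."""
    with open(src, "rb") as f:
        data = f.read()
    with open(dst, "wb") as f:
        f.write(fix_separators(data))
```

For example, fix_file("de.alt.comp.mbox", "de.alt.comp.fixed.mbox") would
write a fixed copy next to the original; the output file name is just an
example. The whole file is read into memory, which may be a problem for
very large archives.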



There is another problem which you will see after you import the mbox
file "de.alt.comp.mbox" into Agent.
* The messages will not have the correct dates in the message list.
The root cause is:
* The "Date:" headers in the mbox file are not in RFC822 format.
* The "Date:" headers look like something from the DejaNews archive,
where the time of day and timezone have been thrown away and
what is left is in some format other than RFC822.

I do not know whether all mbox files from Archive.org have this
problem. It may depend on whether the mbox contents originally came
from the DejaNews archive.

In principle someone could write a converter to convert these DejaNews
"Date:" headers to RFC822 format. This would require inventing a fake
time-of-day for each message.
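
A rough sketch of what such a converter could look like follows. The exact
layout of the DejaNews-style dates is not given in this thread, so the
input format "Date: YYYY/MM/DD" is purely an ASSUMPTION for illustration
and must be checked against the real file before use. The function name
convert_date_header is made up. The invented time of day is midnight, and
the zone is written as -0000, which marks the timezone as unknown.

```python
import re
from datetime import datetime
from email.utils import format_datetime

# ASSUMPTION: date-only headers look like "Date: 1996/03/14".
# Adjust this pattern to whatever the mbox file actually contains.
DATE_ONLY = re.compile(r"^Date: (\d{4})/(\d{1,2})/(\d{1,2})\s*$")


def convert_date_header(line: str) -> str:
    """Turn an assumed date-only 'Date:' header into RFC822-style form.

    The line is expected without its trailing newline. Anything that does
    not match the assumed pattern is returned unchanged.
    """
    m = DATE_ONLY.match(line)
    if m is None:
        return line
    year, month, day = (int(g) for g in m.groups())
    try:
        fake = datetime(year, month, day)  # fake time of day: 00:00:00
    except ValueError:
        return line  # not a valid calendar date, leave it alone
    # A naive datetime is formatted with the zone "-0000".
    return "Date: " + format_datetime(fake)
```

This should only be applied to lines in the header block of each message,
not to body text that happens to begin with "Date:".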
--
Kind regards
Ralph Fox
🦊

Fair words butter no parsnips.
Joerg Walther
2024-02-26 12:35:21 UTC
Post by Ralph Fox
Running on Wine will not cause a problem. Agent can import mbox files
in Wine.
If it fails, it is probably because the "From " separator lines are
not in the format which Agent expects.
I can confirm that: it produced the same error message on Windows as in
Wine. Importing the mbox into Claws Mail, exporting it again, and then
feeding the result to Agent finally worked, although it took a while.
Claws Mail somehow "repaired" the mbox, which was a bit bigger after the
export than before.

-jw-
--
And now for something completely different...