On Mon, 10 Oct 2022 12:50:40 -0700 (PDT), cjsmall wrote:
I'm using munpack -t to extract all the parts of a MIME email. It saves the unnamed text and HTML content as part1 and part2 respectively. However,
the HTML is getting badly mangled, since all sequences beginning with an
equal sign followed by a double quote (=") are being converted into what I think are Unicode characters. Below I'll use the ASCII representation of these characters
that you see if you pipe them through the less pager.
Given:
<div dir="auto">
<div class="gmail_quote" dir="auto">
<img src="cid:1838ef09576531465341"
The results:
dir<FA>uto">
<div class<FF>mail_quote">
<img src<FC>id:1838ef09576531465341"
I hope smarter people will show up and give you better help than I
can, but ...
First of all, about your examples: In the first line, the "<div "
seems to have disappeared. Did that really happen? In the second
line, the whole ' dir="auto"' seems to have disappeared. Did
that really happen?
Also, does this happen with all messages, or only with certain
messages? When I use mutt to pipe messages through munpack, I
don't see stuff like this. (Specifically, I run mutt interactively
in a text window, and after positioning the cursor on a particular
line in the message index, I type "|munpack -1".)
I'm so ignorant that I'm not even sure whether the problem is
occurring in mutt, munpack, or less. Have you used hexdump -C
to verify that the funny stuff is really in the file that munpack
writes?
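
If hexdump isn't handy, the same check can be sketched in a few lines of Python. This is only an illustration: the function name find_high_bytes is made up for this example, and the file name part2 is simply the name mentioned in the original question.

```python
def find_high_bytes(data: bytes) -> list[tuple[int, int]]:
    """Return (offset, byte value) for every byte outside 7-bit ASCII."""
    return [(offset, value) for offset, value in enumerate(data) if value >= 0x80]


# Hypothetical usage against the file munpack wrote (name taken from the question):
#
#     with open("part2", "rb") as handle:
#         for offset, value in find_high_bytes(handle.read()):
#             print(f"offset {offset}: 0x{value:02X}")
```

If this reports bytes such as 0xFA, 0xFF, or 0xFC at the spots where =" used to be, the damage is in the file itself and less is only displaying it.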
It looks to me as if some process -- either mutt or munpack -- is
trying to interpret its input as quoted-printable when the input
isn't actually encoded that way.
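
To show why a quoted-printable decoder would produce exactly this kind of damage, here is a minimal sketch of a lenient decoder. It is an assumption about the mechanism, not munpack's or mutt's actual code: the names hex_value and sloppy_qp_decode are invented for this example, and the key assumption is that the decoder treats any "=" plus two characters as a hex escape without checking that both characters really are hex digits. Soft line breaks are ignored to keep it short.

```python
HEX_DIGITS = b"0123456789ABCDEF"


def hex_value(byte: int) -> int:
    """Return 0-15 for a hex digit, or -1 for any other byte (no validation)."""
    return HEX_DIGITS.find(bytes([byte]).upper())


def sloppy_qp_decode(data: bytes) -> bytes:
    """Decode =XY escapes without checking that X and Y are hex digits.

    Hypothetical decoder for illustration only; a careful decoder would
    pass an invalid escape through unchanged instead of combining -1 values.
    """
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == ord("=") and i + 2 < len(data):
            high = hex_value(data[i + 1])
            low = hex_value(data[i + 2])
            # An invalid digit yields -1; truncating to one byte hides the error.
            out.append(((high << 4) | low) & 0xFF)
            i += 3
        else:
            out.append(data[i])
            i += 1
    return bytes(out)
```

Under that assumption, ="a decodes to byte FA, ="g to FF, and ="c to FC, with three input characters collapsing into one output byte each time. That matches the three bytes shown in the question, which makes a misapplied quoted-printable decode a plausible suspect, though it does not say which program is doing it.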
--
To email me, substitute nowhere->runbox, invalid->com.