• How to get get_body() to work? (about email)

    From Peng Yu@21:1/5 to All on Sat Mar 18 21:49:45 2023
    Hi,

    https://docs.python.org/3/library/email.parser.html

    It says "For MIME messages, the root object will return True from its is_multipart() method, and the subparts can be accessed via the
    payload manipulation methods, such as get_body(), iter_parts(), and
    walk()."

    But when I try the following code, get_body() is not found. How to get get_body() to work?

    $ python3 -c 'import email, sys; msg = email.message_from_string(sys.stdin.read()); print(msg.get_body())' <<< some_text
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    AttributeError: 'Message' object has no attribute 'get_body'

    --
    Regards,
    Peng

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stefan Ram@21:1/5 to Peng Yu on Sun Mar 19 17:09:35 2023
    Supersedes: <body-20230319174636@ram.dialup.fu-berlin.de>
    [Added "import email.policy"]

    Supersedes: <body-20230319175501@ram.dialup.fu-berlin.de>
    [changed "global_output" into "output"]

    Peng Yu <pengyu.ut@gmail.com> writes:
    > But when I try the following code, get_body() is not found. How to get
    > get_body() to work?

    Did you know that this post of mine here was posted to
    Usenet with a Python script I wrote?

    That Python script has a function to show the body of
    a post before posting. The post is contained in a file,
    so it reads the post from that file.

    I copy it here, maybe it can help some people to see
    how I do this.

    # Python 3.5

    import email
    import email.policy

    ...

    def showbody( file ): # lightly edited for posting on 2023-03-19
        output = ''
        msg = email.message_from_binary_file\
        ( file, policy=email.policy.default )
        for part in msg.walk():
            if part.get_content_maintype() == 'multipart':
                continue
            charset = part.get_content_charset()
            if charset is not None:
                output += part.get_content() + '\n'
        print( '\n' + output )
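
For anyone who wants to try the same walk() approach without a post file on disk, here is a minimal sketch. It assumes a made-up sample message held in memory; the function name collect_text_parts and the SAMPLE bytes are invented for illustration and are not part of the script above.

```python
import email
import email.policy
import io


def collect_text_parts(fp):
    """Return the content of every non-multipart part that declares a charset."""
    msg = email.message_from_binary_file(fp, policy=email.policy.default)
    chunks = []
    for part in msg.walk():
        # Container parts carry no text of their own; skip them.
        if part.get_content_maintype() == 'multipart':
            continue
        if part.get_content_charset() is not None:
            chunks.append(part.get_content())
    return '\n'.join(chunks)


# Hypothetical sample post, invented for this example.
SAMPLE = (
    b"From: someone@example.invalid\n"
    b"Subject: test\n"
    b"Content-Type: text/plain; charset=utf-8\n"
    b"\n"
    b"Hello, Usenet!\n"
)

body = collect_text_parts(io.BytesIO(SAMPLE))
print(body)
```

Returning the string instead of printing it inside the function makes the result easier to reuse, but that is only a design preference.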

  • From Jon Ribbens@21:1/5 to Stefan Ram on Sun Mar 19 17:20:24 2023
    On 2023-03-19, Stefan Ram <ram@zedat.fu-berlin.de> wrote:
    > Peng Yu <pengyu.ut@gmail.com> writes:
    >> But when I try the following code, get_body() is not found. How to get
    >> get_body() to work?
    >
    > Did you know that this post of mine here was posted to
    > Usenet with a Python script I wrote?
    >
    > That Python script has a function to show the body of
    > a post before posting. The post is contained in a file,
    > so it reads the post from that file.
    >
    > I copy it here, maybe it can help some people to see
    > how I do this.
    >
    > # Python 3.5
    >
    > import email
    >
    > ...
    >
    > def showbody( file ): # lightly edited for posting on 2023-03-19
    >     output = ''
    >     msg = email.message_from_binary_file\
    >     ( file, policy=email.policy.default )

    I wouldn't generally be pedantic about code style, but that's giving me
    painful convulsions. Backslashes for line continuations are generally
    considered a bad idea (as they mean that any whitespace after the
    backslash, which is often invisible, becomes significant). And not
    indenting the continuation line(s) is pretty shocking. Writing it as
    below is objectively better:

    msg = email.message_from_binary_file(
        file, policy=email.policy.default )

    (Also, I too find it annoying to have to avoid, but calling a local
    variable 'file' is somewhat suspect since it shadows the builtin.)

  • From Jon Ribbens@21:1/5 to Stefan Ram on Sun Mar 19 18:07:53 2023
    On 2023-03-19, Stefan Ram <ram@zedat.fu-berlin.de> wrote:
    > Jon Ribbens <jon+usenet@unequivocal.eu> writes:
    >> (Also, I too find it annoying to have to avoid, but calling a local
    >> variable 'file' is somewhat suspect since it shadows the builtin.)
    >
    > Thanks for your remarks, but I'm not aware
    > of such a predefined name "file"!

    Ah, apparently it got removed in Python 3, which is a bit odd as the
    last I heard it was added in Python 2.2 in order to achieve consistency
    with other types.

  • From Stefan Ram@21:1/5 to Jon Ribbens on Sun Mar 19 17:30:51 2023
    Jon Ribbens <jon+usenet@unequivocal.eu> writes:
    > (Also, I too find it annoying to have to avoid, but calling a local
    > variable 'file' is somewhat suspect since it shadows the builtin.)

    Thanks for your remarks, but I'm not aware
    of such a predefined name "file"!

    main.py

    # Python 3.9

    print( file )

    sys.stderr

    Traceback (most recent call last):
      File "main.py", line 1, in <module>
        print( file )
    NameError: name 'file' is not defined

  • From Thomas Passin@21:1/5 to Peng Yu on Sun Mar 19 13:34:00 2023
    On 3/18/2023 10:49 PM, Peng Yu wrote:
    > Hi,
    >
    > https://docs.python.org/3/library/email.parser.html
    >
    > It says "For MIME messages, the root object will return True from its
    > is_multipart() method, and the subparts can be accessed via the
    > payload manipulation methods, such as get_body(), iter_parts(), and
    > walk()."
    >
    > But when I try the following code, get_body() is not found. How to get
    > get_body() to work?
    >
    > $ python3 -c 'import email, sys; msg = email.message_from_string(sys.stdin.read()); print(msg.get_body())' <<< some_text
    > Traceback (most recent call last):
    >   File "<string>", line 1, in <module>
    > AttributeError: 'Message' object has no attribute 'get_body'

    A Message object does not have a get_body method, but an EmailMessage
    object does.

    In the Python 3.10 docs, just before the part you quoted, there is this sentence:

    "You can pass the parser a bytes, string or file object, and the parser
    will return to you the root EmailMessage instance of the object
    structure".

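    So if you want to use get_body(), you need the parser to hand back an
    EmailMessage, which it does only when given a modern policy such as
    email.policy.default. Called without a policy argument,
    email.message_from_string() falls back to the compat32 policy and
    returns a Message, not an EmailMessage.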
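
A minimal sketch of the difference, assuming a tiny made-up message in place of stdin (the RAW string is invented for illustration). The only change from the failing one-liner is the policy keyword:

```python
import email
import email.policy
from email.message import EmailMessage

# Hypothetical input standing in for what would be read from stdin.
RAW = "Subject: demo\n\nsome_text\n"

# Without a policy, the legacy compat32 policy is used and a Message comes back.
legacy = email.message_from_string(RAW)
# With a modern policy, the parser builds an EmailMessage instead.
modern = email.message_from_string(RAW, policy=email.policy.default)

print(type(legacy).__name__, hasattr(legacy, "get_body"))
print(type(modern).__name__, hasattr(modern, "get_body"))

# get_body() returns the best candidate part, or None if there is none.
body_part = modern.get_body(preferencelist=("plain",))
text = body_part.get_content() if body_part is not None else ""
print(text)
```

Since get_body() can return None, checking before calling get_content() is probably worth keeping in real code.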

    With a Message object, you could use get_payload(), which for a
    multipart message will give you a list of Message objects, one for each
    MIME part. Called on a part that is not a multipart, it will give you a
    string, which sounds like what you want to end up with.

    (see https://docs.python.org/3.10/library/email.compat32-message.html)
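
For the legacy route, here is a small sketch of what get_payload() returns on a Message parsed with the default compat32 policy. The two-part message and its boundary string "XX" are made up for this example:

```python
import email

# Hypothetical two-part MIME message, invented for illustration.
RAW = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XX"\n'
    "\n"
    "--XX\n"
    "Content-Type: text/plain; charset=us-ascii\n"
    "\n"
    "first part\n"
    "--XX\n"
    "Content-Type: text/plain; charset=us-ascii\n"
    "\n"
    "second part\n"
    "--XX--\n"
)

msg = email.message_from_string(RAW)  # compat32 policy, so a plain Message

# On a multipart message, get_payload() is a list of sub-Message objects.
parts = msg.get_payload()
print(msg.is_multipart(), len(parts))

# On a non-multipart part, get_payload() is the payload as a string.
texts = [p.get_payload() for p in parts if not p.is_multipart()]
for t in texts:
    print(t)
```

Note that get_payload() without decode=True does not undo any Content-Transfer-Encoding, so this only gives readable text for unencoded parts like the ones above.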

  • From Greg Ewing@21:1/5 to Jon Ribbens on Mon Mar 20 10:59:03 2023
    On 20/03/23 7:07 am, Jon Ribbens wrote:
    > Ah, apparently it got removed in Python 3, which is a bit odd as the
    > last I heard it was added in Python 2.2 in order to achieve consistency
    > with other types.

    As far as I remember, the file type came into existence
    with type/class unification, and "open" became an alias
    for the file type, so you could use open() and file()
    interchangeably.

    With the Unicode revolution in Python 3, file handling got
    a lot more complicated. Rather than a single file type,
    there are now a bunch of classes that handle low-level I/O,
    encoding/decoding, etc, and open() is a function again
    that builds the appropriate combination of underlying
    objects.

    --
    Greg

  • From Jon Ribbens@21:1/5 to Greg Ewing on Mon Mar 20 12:41:57 2023
    On 2023-03-19, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
    > On 20/03/23 7:07 am, Jon Ribbens wrote:
    >> Ah, apparently it got removed in Python 3, which is a bit odd as the
    >> last I heard it was added in Python 2.2 in order to achieve consistency
    >> with other types.
    >
    > As far as I remember, the file type came into existence
    > with type/class unification, and "open" became an alias
    > for the file type, so you could use open() and file()
    > interchangeably.
    >
    > With the Unicode revolution in Python 3, file handling got
    > a lot more complicated. Rather than a single file type,
    > there are now a bunch of classes that handle low-level I/O,
    > encoding/decoding, etc, and open() is a function again
    > that builds the appropriate combination of underlying
    > objects.

    This is true, however there does exist a base class which, according to
    the documentation, underlies all of the different IO classes - IOBase -
    so it might have been neater to make 'file' be an alias for that.
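
A quick way to see both points in Python 3 is the sketch below, which assumes a throwaway temporary file (the name demo.txt is arbitrary). It opens the same file in three modes and reports the class open() picked and whether it derives from io.IOBase:

```python
import io
import os
import tempfile

results = {}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("hello\n")

    # The same file, opened three ways, yields three different classes.
    for label, kwargs in [
        ("text", {"mode": "r", "encoding": "utf-8"}),
        ("binary buffered", {"mode": "rb"}),
        ("binary unbuffered", {"mode": "rb", "buffering": 0}),
    ]:
        with open(path, **kwargs) as fh:
            results[label] = (type(fh).__name__, isinstance(fh, io.IOBase))
            print(label, results[label])
```

All three objects are instances of io.IOBase, which fits the remark above that it sits underneath the different I/O classes.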
