• re.DOTALL

    From Stefan Ram@21:1/5 on Wed Jul 17 18:09:51 2024
    Below, I use [\s\S] to match each and every character.
    I can't seem to get the same effect using "re.DOTALL"!

    Yet the Python Library Reference says,

    |(Dot.) In the default mode, this matches any character except
    |a newline. If the DOTALL flag has been specified, this
    |matches any character including a newline.
    what the Python Library Reference says.

    main.py

    import re

    text = '''
    alpha
    <hr>
    gamma
    <hr>
    epsilon
    '''[ 1: -1 ]

    pattern = r'^.*?\n<hr.*?\n(.*)\n<hr.*$'

    output = re.sub( pattern.replace( r'.', r'[\s\S]' ), r'\1', text )
    print( output )

    print( '--' )

    output = re.sub( pattern, r'\1', text, re.DOTALL )
    print( output )

    stdout

    gamma
    --
    alpha
    <hr>
    gamma
    <hr>
    epsilon
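
The documentation passage quoted in this post can be checked on a tiny input. This is a minimal sketch, assuming a made-up two-line string `sample` that does not appear in the thread, with the flag passed by keyword so it cannot land in another parameter:

```python
import re

sample = "a\nb"  # hypothetical input, not taken from the post

# Default mode: '.' does not match the newline, so the match fails.
default_match = re.fullmatch(r"a.b", sample)

# With DOTALL given as flags=..., '.' matches the newline as well.
dotall_match = re.fullmatch(r"a.b", sample, flags=re.DOTALL)

print(default_match)                 # None
print(repr(dotall_match.group()))    # 'a\nb'
```

So the flag itself behaves as documented when it actually reaches the `flags` parameter.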

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stefan Ram@21:1/5 to Stefan Ram on Wed Jul 17 18:21:26 2024
    ram@zedat.fu-berlin.de (Stefan Ram) wrote or quoted:
    > I can't seem to get the same effect using "re.DOTALL"!

    PS: But (?s) works.
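
For reference, a minimal sketch of the inline-flag variant mentioned in the PS. It assumes the sample text contains the two `<hr>` lines that the pattern looks for; `(?s)` at the start of the pattern is the inline spelling of re.DOTALL, so no extra argument to re.sub is needed:

```python
import re

# Assumed sample text: three words separated by <hr> lines.
text = "alpha\n<hr>\ngamma\n<hr>\nepsilon"

# (?s) at the very start switches on DOTALL for the whole pattern.
pattern = r'(?s)^.*?\n<hr.*?\n(.*)\n<hr.*$'

output = re.sub(pattern, r'\1', text)
print(output)  # gamma
```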

  • From Lawrence D'Oliveiro@21:1/5 to Stefan Ram on Wed Jul 17 23:54:25 2024
    On 17 Jul 2024 18:09:51 GMT, Stefan Ram wrote:

    > Below, I use [\s\S] to match each and every character.
    > I can't seem to get the same effect using "re.DOTALL"!

    This might help clarify things:

    text = "alpha\n<hr>\ngamma\n<hr>\nepsilon"
    pattern = r'^(.*?)\n(<hr.*?)\n(.*)\n(<hr.*)$'

    re.search(pattern, text, re.DOTALL).groups()

    ('alpha', '<hr>', 'gamma', '<hr>\nepsilon')
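
One likely reading of why this works while the re.sub call in the first post does not (the thread itself does not spell it out): re.search takes flags as its third parameter, whereas re.sub is declared as sub(pattern, repl, string, count=0, flags=0), so a fourth positional argument is taken as count. A minimal sketch under that assumption, reusing the text from this reply and the pattern from the first post:

```python
import re

text = "alpha\n<hr>\ngamma\n<hr>\nepsilon"
pattern = r'^.*?\n<hr.*?\n(.*)\n<hr.*$'

# As a plain integer, re.DOTALL is 16; given as the fourth positional
# argument to re.sub it would be read as count=16, leaving flags at 0.
dotall_value = int(re.DOTALL)
print(dotall_value)  # 16

# Passing the flag by keyword makes '.' match newlines as intended.
fixed = re.sub(pattern, r'\1', text, flags=re.DOTALL)
print(fixed)  # gamma

# With no flag at all the pattern cannot match, so the text is returned
# unchanged, which matches the second half of the stdout in the first post.
unchanged = re.sub(pattern, r'\1', text)
print(unchanged == text)  # True
```

Writing `flags=re.DOTALL` explicitly avoids this mix-up in re.sub, re.subn and re.split, whose flags parameter is preceded by a numeric one.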
