From e4993fc56dd222c9b11fc96b1044e39f84e4544f Mon Sep 17 00:00:00 2001
From: Waylan Limberg
Date: Mon, 20 Sep 2010 14:45:52 -0400
Subject: Added the re.UNICODE flag to inlinepatterns.

Now all inlinepattern regex will match unicode characters when \w, \b, or \s
is used. Also updated docs to reflect change.
---
 docs/writing_extensions.txt | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/writing_extensions.txt b/docs/writing_extensions.txt
index 1300d55..2ecd4c9 100644
--- a/docs/writing_extensions.txt
+++ b/docs/writing_extensions.txt
@@ -80,9 +80,10 @@ Note that any regular expression returned by ``getCompiledRegExp`` must capture
 the whole block. Therefore, they should all start with ``r'^(.*?)'`` and end
 with ``r'(.*?)!'``. When using the default ``getCompiledRegExp()`` method
 provided in the ``Pattern`` you can pass in a regular expression without that
-and ``getCompiledRegExp`` will wrap your expression for you. This means that
-the first group of your match will be ``m.group(2)`` as ``m.group(1)`` will
-match everything before the pattern.
+and ``getCompiledRegExp`` will wrap your expression for you and set the
+`re.DOTALL` and `re.UNICODE` flags. This means that the first group of your
+match will be ``m.group(2)`` as ``m.group(1)`` will match everything before the
+pattern.
 
 For an example, consider this simplified emphasis pattern:
 
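
The group numbering described in the patched paragraph can be illustrated with a small standalone sketch. It does not import the library: ``wrap_inline_pattern`` is a hypothetical stand-in for the default ``getCompiledRegExp()`` wrapping, and it assumes the wrapper has the shape ``^(.*?)`` + pattern + ``(.*?)$`` with the two flags named in this commit. The emphasis regex and the sample text are likewise made up for illustration.

```python
import re


def wrap_inline_pattern(pattern):
    """Hypothetical stand-in for the default getCompiledRegExp() wrapping.

    Assumption: the wrapper adds a leading and a trailing catch-all group
    and sets re.DOTALL and re.UNICODE, as the patched docs describe.
    """
    return re.compile(r"^(.*?)%s(.*?)$" % pattern, re.DOTALL | re.UNICODE)


# Illustrative emphasis pattern with a single group of its own.
EMPHASIS_RE = r"\*(\w+)\*"

compiled = wrap_inline_pattern(EMPHASIS_RE)
m = compiled.match(u"caf\u00e9 *na\u00efve*\nnext line")

before = m.group(1)      # everything before the pattern
emphasized = m.group(2)  # first group of the original pattern
after = m.group(3)       # everything after the pattern (DOTALL spans the newline)
```

With ``re.UNICODE`` set, ``\w`` matches the non-ASCII letter in the emphasized word. On Python 3 that is already the default for ``str`` patterns, so the flag mainly matters on Python 2, which is what a 2010 commit would have targeted.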