From ef9a229ebeaf8173e9fd4e541de4d83e8678f649 Mon Sep 17 00:00:00 2001
From: Waylan Limberg
Date: Thu, 17 Nov 2011 22:43:02 -0500
Subject: Fixed #47. Improved HRProcessor.

Python's re module does not support atomic grouping, which was slowing
the HR regex down if a long HR ended with a non-HR char (causing the
regex to backtrack). Therefore, we have to simulate atomic grouping.
Fortunately, we only need to match end-of-line or end-of-string after
the atomic group here, so it was an easy case to simulate. Just remove
the '$' from the end of the regex and manually check using m.end().
The run method was refactored while I was at it, saving us from running
the regex twice for each HR.
---
 tests/misc/para-with-hr.html | 5 ++++-
 tests/misc/para-with-hr.txt  | 3 +++
 2 files changed, 7 insertions(+), 1 deletion(-)

(limited to 'tests')

diff --git a/tests/misc/para-with-hr.html b/tests/misc/para-with-hr.html
index 8569fec..7607449 100644
--- a/tests/misc/para-with-hr.html
+++ b/tests/misc/para-with-hr.html
@@ -1,3 +1,6 @@
 <p>Here is a paragraph, followed by a horizontal rule.</p>
 <hr />
-<p>Followed by another paragraph.</p>
\ No newline at end of file
+<p>Followed by another paragraph.</p>
+<p>Here is another paragraph, followed by:
+*** not an HR.
+Followed by more of the same paragraph.</p>
\ No newline at end of file
diff --git a/tests/misc/para-with-hr.txt b/tests/misc/para-with-hr.txt
index 20735fb..165bbe3 100644
--- a/tests/misc/para-with-hr.txt
+++ b/tests/misc/para-with-hr.txt
@@ -2,3 +2,6 @@ Here is a paragraph, followed by a horizontal rule.
 ***
 Followed by another paragraph.
 
+Here is another paragraph, followed by:
+*** not an HR.
+Followed by more of the same paragraph.
-- 
cgit v1.2.3
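
Note on the technique described in the commit message: this view of the
patch is limited to 'tests', so the regex change itself is not shown. The
following is only a minimal sketch of how "drop the trailing '$' and check
m.end() by hand" might look. The pattern and the names HR_RE and find_hr are
assumptions made for illustration, not code taken from this commit.

```python
import re

# Hypothetical HR pattern (an assumption, not the project's actual regex).
# It deliberately has no trailing '$': the regex only consumes what an
# atomic group would have consumed, i.e. the rule characters themselves.
HR_RE = re.compile(
    r'^[ ]{0,3}((-+[ ]{0,2}){3,}|(_+[ ]{0,2}){3,}|(\*+[ ]{0,2}){3,})[ ]*',
    re.MULTILINE,
)


def find_hr(block):
    """Return the match for the first real HR line in block, or None."""
    for m in HR_RE.finditer(block):
        # Manual stand-in for '$': the candidate must end at the end of
        # the string or directly before a newline.
        if m.end() == len(block) or block[m.end()] == '\n':
            return m
    return None
```

Because the end-of-line check happens outside the regex, a failed check
cannot send the engine back into the repeated groups to retry other ways of
splitting the same run of characters. With the two fixtures added by this
patch, find_hr would be expected to accept the bare "***" line and reject
"*** not an HR.".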