| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
The current implementation was wrong as it also percent encoded query strings
(which should be plus encoded) and calling urllib.quote on the path (and
urllib.quote_plus on the query string) assumes the url is not already encoded.
What if the document author pasted a url that was already encoded? She probably
did not intend for `%20` to become `%2520`. Or did she? It is now clear to me
why many implementations do nothing to urls. Just pass them through as-is. Too
bad if they are not valid HTML. HTML authors have to encode their own urls, so I
guess markdown authors have to as well.
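The double-encoding problem described above can be reproduced in a few lines. This is only an illustrative sketch: it uses Python 3's `urllib.parse` (the message refers to the Python 2 names `urllib.quote` and `urllib.quote_plus`), and the sample paths are made up.

```python
from urllib.parse import quote, quote_plus

# A path the author already percent-encoded by hand (made-up example).
already_encoded = "/my%20docs/read%20me.html"

# Quoting again escapes the '%' itself, so '%20' turns into '%2520'.
double_encoded = quote(already_encoded)
print(double_encoded)  # /my%2520docs/read%2520me.html

# The same call is correct only when the input is not yet encoded.
print(quote("/my docs/read me.html"))  # /my%20docs/read%20me.html

# Query string values use plus encoding for spaces instead.
print(quote_plus("two words"))  # two+words
```

Since the function cannot tell which of the two inputs it was given, passing the url through untouched is the only behavior that never corrupts an already-encoded url.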
|
|
|
|
|
|
| |
Leave all other chars prefaced by a backslash alone. Fixes #242.
Not sure why I thought that I needed to add another backslash.
Thanks for the report and the test case @mhubig.
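A minimal sketch of the intended behavior, assuming the set of escapable characters from the original Markdown syntax documentation. The names `ESCAPABLE`, `ESCAPE_RE` and `unescape` are hypothetical and are not the project's actual code.

```python
import re

# Characters the Markdown syntax documentation lists as backslash-escapable.
# (Assumed set, for illustration only.)
ESCAPABLE = r"\`*_{}[]()#+-.!"

# A backslash followed by any single character (except a newline).
ESCAPE_RE = re.compile(r"\\(.)")

def unescape(text):
    """Drop the backslash before escapable chars; leave every other pair alone."""
    def repl(m):
        ch = m.group(1)
        # Not escapable: return the backslash and the character unchanged.
        return ch if ch in ESCAPABLE else m.group(0)
    return ESCAPE_RE.sub(repl, text)

print(unescape(r"\*not emphasis\*"))  # *not emphasis*
print(unescape(r"C:\path"))           # C:\path
```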
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Partial fix for #183. This has the same effect on empty lines in code blocks
as not using the html processor at all (which was eating some of the missing
newlines as reported in issue #183).
By doing `rsplit('\n\n')` the third newline (in each set of three) always ends
up at the end of a block, rather than the beginning - which is less of an issue
for the html processor.
Also updated tests to indicate final intended output, although they do not fully
pass yet.
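The difference is easy to see on a small string with three consecutive newlines. `split` scans from the left and `rsplit` from the right, so the odd newline lands on opposite sides of the break (the sample text is made up):

```python
text = "line one\n\n\nline two"

# Scanning from the left: the third newline starts the second block.
left = text.split("\n\n")
print(left)   # ['line one', '\nline two']

# Scanning from the right: the third newline ends the first block.
right = text.rsplit("\n\n")
print(right)  # ['line one\n', 'line two']
```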
|
|
|
|
|
|
|
|
| |
Partial fix for #183. By preserving tabs at the start of empty lines in
code blocks, the parser will retain those empty lines. Still does not work
consistently if the tab is missing!? Not sure why.
Also added tests.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
homepage).
|
| |
|
| |
|
|
|
|
|
| |
Interestingly, the change to the misc/mismatched-tags test is in line with
PHP Markdown Extra's behavior but not markdown.pl, which produces invalid html.
|
| |
|
|
|
|
| |
atomic grouping, which was slowing the HR regex down if a long HR ended with a non HR char (causing the regex to backtrack). Therefore, we have to simulate atomic grouping. Fortunately, we only need to match end-of-line or end-of-string after the atomic group here, so it was an easy case to simulate. Just remove the '$' from the end of the regex and manually check using m.end(). The run method was refactored while I was at it, saving us from running the regex twice for each HR.
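A rough sketch of the technique follows. The pattern and the `is_hr` helper are hypothetical stand-ins, not the project's actual regex or method. The point is only that the pattern has no trailing `$` and the end-of-line test is done by hand on `m.end()`.

```python
import re

# Hypothetical HR pattern (not the project's actual regex). Note there is no
# trailing '$': the end-of-line check is done by hand below.
HR_RE = re.compile(r'^[ ]{0,3}((?:-[ ]?){3,}|(?:_[ ]?){3,}|(?:\*[ ]?){3,})[ ]*')

def is_hr(line):
    """Return True if `line` starts with an HR that runs to the end of the line."""
    m = HR_RE.match(line)
    if m is None:
        return False
    end = m.end()
    # Accept only if the greedy match stopped at end-of-string or at a newline.
    return end == len(line) or line[end] == '\n'

print(is_hr("* * *"))             # True
print(is_hr("-" * 5000 + "x"))    # False
```

Because the greedy match is accepted or rejected as a whole, the regex engine is never asked to retry shorter runs of HR characters when the line ends with something else.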
|
|
|
|
| |
tails. Tests included.
|
|
|
|
| |
Somehow we stopped checking for a single inline html element when swapping back in raw html. Added a test. Also patched a weird (invalid) comment test. Seeing as the input is not really a valid html comment, it doesn't matter what we do with it. I suppose we test it to make sure it doesn't break the parser. Actual output is not so important. As a side note, this has exposed a preexisting (unrelated) bug with the extra extension's handling of raw html. That test is failing following this fix.
|
|
|
|
| |
previous commits. This adds the missing tests.
|
|
|
|
| |
output xhtml. This fixes #9 among other bugs. The test suite even had bad tests that should have been failing. They also have been corrected.
|
|
|
|
| |
list item are now parsed correctly. One of those crazy weird edge cases that no one would ever test for, but is obvious once you see it.
|
| |
|
| |
|
| |
|
|
|
|
| |
default.
|
|
|
|
|
| |
Now the startindex is reset if consecutive unordered
lists are present (tests pass).
|
| |
|
|
|
|
| |
raw html parser. Fixed a related bug I found while debugging this as well. Also added tests for both.
|
|
|
|
| |
blockquote tag. Added lists8.txt and .html for test suite to test condition.
|
|
|
|
|
|
| |
it has a sublist. Previously, the test suite erroneously passed this condition
because there was an error in the expected '.html' output file. The expected
output has been corrected as well.
|
|
|
|
| |
observe rules for using p tags. Thanks to Gerry LaMontagne for the patch.
|
|
|
|
| |
Thanks for the report and preliminary work Gerry LaMontagne.
|
|
|
|
| |
that contained text that fit markdown's escaping syntax (i.e. <x\]>) was never unescaped. Now it is. Markdown probably shouldn't be escaping before removing rawhtml in the first place, but this will do for now.
|
|
|
|
| |
that the branch is merged.
|
|
|
|
| |
in everyone's site-packages. We just need to distribute them in the tarball for people to run before installing etc.
|
| |
|
|\ |
|
| | |
|
| |
| |
| |
| | |
shows the differences between the old and the new. Also left one test failing (insignificant whitespace only) to demonstrate what a failing test looks like.
|
| |
| |
| |
| | |
treatment of raw <div>s with multiple line breaks - they no longer automagically process their content as markdown. This matches other implementations. Finished rest of code for use by an extension - to be added later.
|
| |
| |
| |
| |
| |
| | |
now - allowing various arbitrary stuff (like x/html) to be included without breaking the rawhtml parser.
Although currently unused, the code also provides the parsed attributes as a dict. Should be useful for adding support for parsing markdown text within rawhtml in an extension.
|
| |
| |
| |
| | |
inside raw <pre> tags.
|
| |
| |
| |
| | |
wrapped in punctuation without spaces and will still be converted to emphasis (ie: '[_foo_]'). Test included. Thanks for the report seanh.
|
|/
|
|
| |
asterisk. Adjusted regex to eat all (one or more) of the spaces. While it may seem wrong (at least in the loose list case), this is how all other implementations work. And it solves a number of edge cases otherwise not accounted for in the list parser.
|
|
|
|
| |
30 and other related issues. Note that I went with php's behavior rather than perl's when we have three (ie.: *** or ___) without a closing three.
|
| |
|
|
|
|
| |
blockquote.
|
|
|
|
| |
Apparently different versions of ElementTree encode line breaks in attributes differently. Therefore, we just remove any such linebreaks as they are insignificant anyway.
|
| |
|
| |
|
|
|
|
| |
starts after the first line. Also updated corresponding test as it had an error and added more detail. All core tests pass now. On to extensions.
|