| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Add new InlineProcessor class that handles inline processing much better and allows for more flexibility. This adds new InlineProcessors that no longer utilize unnecessary pretext and posttext captures. New class can accept the buffer that is being worked on and manually process the text without regex and return new replacement bounds. This helps us to handle links in a better way and handle nested brackets and logic that is too much for regular expression. The refactor also allows image links to have links/paths with spaces like links. Ref #551, #613, #590, #161.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes #601. Merged in 6f87b32 from the md3 branch and did a lot of cleanup.
Changes include:
* Removed old docs build tool, templates, etc.
* Added MkDocs config file, etc.
* filename.txt => filename.md
* pythonhost.org/Markdown => Python-Markdown.github.io
* Markdown lint and other cleanup.
* Automate pages deployment in makefile with `mkdocs gh-deploy`
Assumes a git remote is set up named "pages". Do
git remote add pages https://github.com/Python-Markdown/Python-Markdown.github.io.git
... before running `make deploy` the first time.
|
|
|
|
|
|
|
| |
If both open and close was not found in first block, additional blocks
were evaluated without context of previous blocks. The algorithm needs
to evaluate a buffer with the left bracket present. So feed in all
items and get the right bracket, then adjust the data_index to be
relative to the last block. Fixes #452.
|
|
|
|
|
|
|
| |
HRProcessor tried to access a member variable after recursively calling
itself. In certain situations HRProcessor will try to access its
member variable containing its match, but it will not be the same match
that call in the stack expected. This is easily fixed by storing the
match locally *before* doing any recursive work.
|
|
|
|
|
| |
This aims to escape code in a more expected fashion. This handles
when backticks are escaped and when the escapes before backticks are
escaped.
|
|
|
|
|
| |
Don’t allow spaces in image links. This was also causing an issue
where any text following a space was treated as a title. Ref #484.
|
|
|
|
|
|
|
|
|
|
|
| |
The logic for the current regex for strong/em and em/strong was sound,
but the way it was implemented caused some unintended side effects.
Whether it is a quirk with regex in general or just with Python’s re
engine, I am not sure. Put basically `(\*|_){3}` causes issues with
nested bold/italic. So, allowing the group to be defined, and then
using the group number to specify the remaining sequential chars is a
better way that works more reliably `(\*|_)\2{2}. Test from issue #365
was also added to check for this case in the future.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This issue was discovered when dealing with nested inlines. In
treeprocessors.py it was incorrectly handling tails. In short, tail
elements were being inserted earlier than they were supposed to be.
In order to fix this, the insertion index should be incremented by 1 so
that when the tails are inserted into the parent, they will be just
after the child they came from.
Also added a test in nested-patterns to catch this issue.
|
|
|
|
|
|
|
|
|
| |
Fixes #253. Thanks to @facelessuser for the tests. Although I removed
a bunch of weird ones (even some that passed) from his PR (#342). For
the most part, there is no definitive way for those to be parsed. So
there is no point of testing for them. In most of those situations,
authors should be mixing underscores and astericks so it is clear
what is intended.
|
|
|
|
|
|
|
|
|
| |
markdown/inlinepatterns.py is now at 99% coverage.
I have no idea why the two remaining lines are not not covered.
I it not clear to me under what circumstances this two if statements
would ever evaluate to True. I'm inclined to just remove them, but perhaps
there is an edge case I'm missing. I'll take another look later.
|
|
|
|
| |
'https://pythonhosted.org/Markdown/'. The former redirects to the latter anyway. Might as well point to the actual destination.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes #257 and slightly alters comment parsing behavior.
Unlike self-closing tags, a comment can contain angle brackets between the
opening and closing tags. The greaterthan angle bracket at the end of the
first block should not be mistaken for closing the comment. Need to actually
check for a comment closing tag (`-->`). If one if not found, then the comment
keeps going (to the end of the document if nessecary) just like in HTML.
That last bit is a slight change from previous behavior, but should be
unsurprising as that's how broswers parse html comments. And as far as
I can tell, more implementations follow this behavior than any other. The
ones that don't seem to be all over the place.
|
|
|
|
|
|
|
|
|
|
|
| |
The current implementation was wrong as it also percent encoded query strings
(which should be plus encoded) and calling urllib.quote on the path (and
urllib.quote_plus on the query string) assumes the url is not already encoded.
What if the document author pasted a url that was already encoded? She probably
did not intend for `%20` to become `%2520`. Or did she? It is now clear to me
why many implementation do nothing to urls. Just pass them though as-is. To bad
if they are not valid HTML. HTML authors have to encodee their own urls, so I
guess markdown authors have to as well.
|
|
|
|
|
|
| |
Leave all other chars prefaced by a backslash alone. Fixes #242.
Not sure why I thought that I needed to add another backslash.
Thanks for the report and the test case @mhubig.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Partial fix for #183. This has the same effect on empty lines in code blocks
as not using the html processor at all (which was eating some of the missing
newlines as reported in issue #183).
By doing `rsplit('\n\n')` the third newline (in each set of three) always ends
up at the end of a block, rather than the begining - which it less of an issue
for the html processor.
Also updated tests to indicate final intended output, although they do not fully
pass yet.
|
|
|
|
|
|
|
|
| |
Partial fix for #183. By preserving tabs at the start of empty lines in
code blocks, the parser will retain those empty lines. Still does not work
consistantly if the tab is missing!? Not sure why.
Also added tests.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
homepage).
|
| |
|
| |
|
|
|
|
|
| |
Interestingly, the change to the misc/mismatched-tags test is inline with
PHP Markdown Extra's behavior but not markdown.pl, which produces invalid html.
|
| |
|
|
|
|
| |
atomic grouping, which was slowing the HR regex down if a long HR ended with a non HR char (casing the regex to backtrack). Therefore, we have to simulate atomic grouping. Fortunately, we only need to match end-of-line or end-of-string after the atomic group here, so it was an easy case to simulate. Just remove the '$' from the end of the regex and manualy check using m.end(). The run method was refactored while I was at it, saving us from running the regex twice for each HR.
|
|
|
|
| |
tails. Tests included.
|
|
|
|
| |
Somehow we stopped checking for a single inline html element when swapping back in raw html. Added a test. Also patched a weird (invalid) comment test. Seeing the input is not really a valid html comment - it doesn't matter what we do with it. I suppose we test it to make sure it doesn't break the parser. Actual output is not so important. As a side note, this has exposed a preexisting (unrelated) bug with the extra extension's handling of raw html. That test is failing following this fix.
|
|
|
|
| |
previous commits. This addes the missing tests.
|
|
|
|
| |
output xhtml. This fixes #9 among other bugs. The test suite even had bad tests that should have been failing. They also have been corrected.
|
|
|
|
| |
list item are now parsed correctly. One of those crazy wierd edge cases that no one would ever test for, but is obvious once you see it.
|
| |
|
| |
|
| |
|
|
|
|
| |
default.
|
|
|
|
|
| |
Now the startindex would be reset if continual unordered
lists are present (tests are passed).
|
| |
|
|
|
|
| |
raw html parser. Fixed a related but I found while debugging this as well. Also added tests for both.
|
|
|
|
| |
blockquote tag. Added lists8.txt and .html for test suite to test condition.
|
|
|
|
|
|
| |
it has a sublist. Previously, the test suite erroneously passed this condition
because there was an error in the expected '.html' output file. The expected
output has been corrected as well.
|
|
|
|
| |
observe rules for using p tags. Thanks to Gerry LaMontagne for the patch.
|
|
|
|
| |
Thanks for the report and preliminary work Gerry LaMontagne.
|
|
|
|
| |
that contained text that fit markdown's escaping syntax (i.e. <x\]>) was never unescaped. Now it is. Markdown probably shouldn't be escaping before removing rawhtml in the first place, but this will do for now.
|
|
|
|
| |
that the branch is merged.
|
|
|
|
| |
in everyones site-packages. We just need to distrubute them in the tarball for people to run before installing etc.
|
| |
|
|\ |
|