Commit messages
function. The URL is being encoded (with errors ignored) as an easy means of removing non-ASCII chars, but we have to re-encode it *before* running the regex on it, not after.
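A minimal sketch of that ordering, with an illustrative pattern standing in for the real regex:

    import re

    # Stand-in pattern; the point is only when it runs, not what it matches.
    NON_URL_CHARS = re.compile(r"[^A-Za-z0-9:/?#@!$&'()*+,;=._~%-]")

    def clean_url(url):
        # Encoding with errors='ignore' is just an easy way to drop
        # non-ASCII characters; it has to happen *before* the regex runs
        # so the regex only ever sees the reduced string.
        url = url.encode('ascii', 'ignore').decode('ascii')
        return NON_URL_CHARS.sub('', url)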
Unicode anyway.
framework. With a simple addition to our subclass (which we then make use of), this is an easy fix.
positional arguments back then.
handler. Also added a note to the docs. Anyone doing their own encoding of the output should be using it as well.
Those extra steps always bothered me as being unnecessary. Additionally, this
should make conversion to Python 3 easier. The 2to3 tool wasn't converting
the serializers properly and we were getting byte strings in the output.
Previously, this wasn't a major problem because the default serializer was
the xml serializer provided in the ElementTree standard lib. However, now
that we are using our own xhtml serializer, it must work smoothly in all
supported versions.

As a side note, I believe the thought was that we needed to do the encoding to
take advantage of the "xmlcharrefreplace" error handling. However, using the
example in the Python [docs](http://docs.python.org/howto/unicode.html#the-unicode-type):

    >>> u = unichr(40960) + u'abcd' + unichr(1972)
    >>> u.encode('utf-8', 'xmlcharrefreplace').decode('utf-8') == u
    True

There's no point in using the "xmlcharrefreplace" error handling if we just
convert back to the original Unicode anyway. Interestingly, the Python 3
standard lib is doing essentially what we are doing here, so I'm convinced this
is the right way to go.
Currently only automates testing (makes 2to3 testing easier). More features to come.
rawhtml blocks. All tests pass again.
tails. Tests included.
code blocks/spans. A better fix for #4: only the *text* from the header is carried over to the toc (without *any* inline formatting). Also refactored the extension to work better in tandem with the refactored headerid extension and the new attr_list extension.
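For illustration, the effect on the toc output is roughly (using the toc extension's md.toc attribute):

    import markdown

    md = markdown.Markdown(extensions=['toc'])
    md.convert('# A *fancy* header')
    # The generated entry links to the header using only its text,
    # i.e. the link text is "A fancy header" with no <em> inside it.
    print(md.toc)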
Force tidy's input and output to use UTF-8, encoding the text before passing it in and decoding it after.
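The shape of that change, sketched with a stand-in run_tidy callable rather than the actual tidy binding in use:

    def tidy_utf8(text, run_tidy):
        # run_tidy stands in for whatever tidy wrapper is in use; it is
        # assumed to be configured for UTF-8 input and output. The text is
        # encoded on the way in and decoded on the way out, so the rest of
        # the code keeps dealing in unicode.
        return run_tidy(text.encode('utf-8')).decode('utf-8')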
autogenerates ids. If you want to define your own, use the attr_list extension. Also expanded the HeaderId extension to use the same algorithm as the TOC extension (which is better) for slugifying the header text. Added config settings to override the default word separator and the slugify callable. Closes #16, as the reported bug is for a removed feature.
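The shared slugify step is roughly the following (a from-memory sketch, not the extension's exact code):

    import re
    import unicodedata

    def slugify(value, separator='-'):
        # Normalize to ASCII, strip punctuation, lowercase, then collapse
        # whitespace runs into the configured word separator.
        value = unicodedata.normalize('NFKD', value)
        value = value.encode('ascii', 'ignore').decode('ascii')
        value = re.sub(r'[^\w\s-]', '', value).strip().lower()
        return re.sub(r'[\s%s]+' % re.escape(separator), separator, value)

    # e.g. slugify('Some Header Text!')  ->  'some-header-text'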
Thanks to skurfer for the report and initial patch.
Somehow we stopped checking for a single inline HTML element when swapping raw HTML back in. Added a test. Also patched a weird (invalid) comment test. Seeing as the input is not really a valid HTML comment, it doesn't matter what we do with it. I suppose we test it to make sure it doesn't break the parser; the actual output is not so important. As a side note, this has exposed a preexisting (unrelated) bug in the extra extension's handling of raw HTML. That test is failing following this fix.
also respect encoding when reading from a user-provided file
xhtml1 serializers respectively.
previous commits. This adds the missing tests.
may be using them, so we should continue to support them. Also adjusted docs to encourage using keyword args only. However, if existing code was using positional args in previous versions, it should still work.
and allowing us to use it.
markdown.util.etree, not markdown.etree. This may be a backward-incompatible change for some extensions.
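Concretely, with the package layout this commit describes, extension code does:

    # old location (no longer available):
    # from markdown import etree

    # new location:
    from markdown.util import etree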
for some extensions. They should be importing from markdown.util.
defined the public methods.
and xhtml gets the key=value pair. We didn't need this prior to adding the attr_list ext.
allow language guessing to be disabled by passing a setting to CodeHilite - closes #24
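Assuming the setting is the guess_lang option (its name in later releases), usage looks something like:

    import markdown

    text = "    print('hi')"  # an indented code block with no language hint

    html = markdown.markdown(
        text,
        extensions=['codehilite'],
        # guess_lang=False stops CodeHilite from asking Pygments to guess
        # a language for unlabeled code blocks.
        extension_configs={'codehilite': {'guess_lang': False}},
    )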
closes #24
See issue #7. Also likely to become a replacement for the headerid extension (with a little more work; it needs a forceid option), which means it will also address issue #16. The extension works with some limited testing. Still needs tests and documentation. Currently breaks the toc extension, which should run after attr_list, not before.
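As a usage sketch (the exact markup in the output may differ slightly):

    import markdown

    src = '# My Header {: #custom-id .lead }'

    # attr_list lets a trailing curly-brace attribute list set the id,
    # classes, or arbitrary key="value" pairs on the generated element.
    print(markdown.markdown(src, extensions=['attr_list']))
    # something like: <h1 class="lead" id="custom-id">My Header</h1>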
output xhtml. This fixes #9 among other bugs. The test suite even had bad tests that should have been failing; they have also been corrected.
actually uses these? Anyway, we now match markdown.pl.
brackets in reference links. Now we do as well.
for pointing out the typo.
list item are now parsed correctly. One of those crazy weird edge cases that no one would ever test for, but is obvious once you see it.
now include smart handling of double underscores (not just single underscores). The new behavior may be called separately as the 'smart_strong' extension or as part of the 'extra' extension.
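For example (enabling the behavior through the 'extra' extension, as the commit notes):

    import markdown

    src = 'Text with double__underscore__words in it.'

    # With the smart behavior, double underscores inside a word are left
    # alone instead of being converted to <strong>.
    print(markdown.markdown(src, extensions=['extra']))
    # something like: <p>Text with double__underscore__words in it.</p>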
Have the HeaderId extension preserve periods when generating ids from headings
they are removed before escaping takes place. Related to issue #14.
appears that we are losing escaped backslashes (both in the href and in the link label) in the example given in issue 14.
New HTML5 block elements