Diffstat (limited to 'docs/extensions/api.md')
-rw-r--r-- | docs/extensions/api.md | 109 |
1 files changed, 108 insertions, 1 deletions
diff --git a/docs/extensions/api.md b/docs/extensions/api.md
index cba4ea7..ad0d254 100644
--- a/docs/extensions/api.md
+++ b/docs/extensions/api.md
@@ -48,6 +48,8 @@ class MyPreprocessor(Preprocessor):
 
 ## Inline Patterns {: #inlinepatterns }
 
+### Legacy
+
 Inline Patterns implement the inline HTML element syntax for Markdown such as
 `*emphasis*` or `[links](http://example.com)`. Pattern objects should be
 instances of classes that inherit from `markdown.inlinepatterns.Pattern` or
@@ -85,7 +87,7 @@ from markdown.util import etree
 class EmphasisPattern(Pattern):
     def handleMatch(self, m):
         el = etree.Element('em')
-        el.text = m.group(3)
+        el.text = m.group(2)
         return el
 ```
 
@@ -110,8 +112,113 @@ implemented with separate instances of the `SimpleTagPattern` listed below.
 Feel free to use or extend any of the Pattern classes found at
 `markdown.inlinepatterns`.
 
+### Future
+
+While users can still create plugins with the existing
+`markdown.inlinepatterns.Pattern`, a new, more flexible inline processor has
+been added which users are encouraged to migrate to. The new inline processor
+is found at `markdown.inlinepatterns.InlineProcessor`.
+
+The new processor is very similar to legacy with two major distinctions.
+
+1. Patterns no longer need to match the entire block, so patterns no longer
+    start with `r'^(.*?)'` and end with `r'(.*?)!'`. This was a huge
+    performance sink and this requirement has been removed. The returned match
+    object will only contain what is explicitly matched in the pattern, and
+    extension pattern groups now start with `m.group(1)`.
+
+2. The `handleMatch` method now takes an additional input called `data`,
+    which is the entire block under analysis, not just what is matched with
+    the specified pattern. The method also returns the element *and* the index
+    boundaries relative to `data` that the return element is replacing
+    (usually `m.start(0)` and `m.end(0)`).  If the boundaries are returned as
+    `None`, it is assumed that the match did not take place, and nothing will
+    be altered in `data`.
+
+If all you need is the same functionality as the legacy processor, you can do
+as shown below. Most of the time, simple regular expression processing is all
+you'll need.
+
+```python
+from markdown.inlinepatterns import InlineProcessor
+from markdown.util import etree
+
+# an oversimplified regex
+MYPATTERN = r'\*([^*]+)\*'
+
+class EmphasisPattern(InlineProcessor):
+    def handleMatch(self, m, data):
+        el = etree.Element('em')
+        el.text = m.group(1)
+        return el, m.start(0), m.end(0)
+
+# pass in pattern and create instance
+emphasis = EmphasisPattern(MYPATTERN)
+```
+
+But, the new processor allows you handle much more complex patterns that are
+too much for Python's Re to handle. For instance, to handle nested brackets in
+link patterns, the built-in link inline processor uses the following pattern to
+find where a link *might* start:
+
+```python
+LINK_RE = NOIMG + r'\['
+link = LinkInlineProcessor(LINK_RE, md_instance)
+```
+
+It then uses programmed logic to actually walk the string (`data`), starting at
+where the match started (`m.start(0)`). If for whatever reason, the text
+does not appear to be a link, it returns `None` for the start and end boundary
+in order to communicate to the parser that no match was found.
+
+```python
+    # Just a snippet of of the link's handleMatch
+    # method to illustrate new logic
+    def handleMatch(self, m, data):
+        text, index, handled = self.getText(data, m.end(0))
+
+        if not handled:
+            return None, None, None
+
+        href, title, index, handled = self.getLink(data, index)
+        if not handled:
+            return None, None, None
+
+        el = util.etree.Element("a")
+        el.text = text
+
+        el.set("href", href)
+
+        if title is not None:
+            el.set("title", title)
+
+        return el, m.start(0), index
+```
+
 ### Generic Pattern Classes
 
+Some example processors that are available.
+
+* **`SimpleTextInlineProcessor(pattern)`**:
+
+    Returns simple text of `group(2)` of a `pattern` and the start and end
+    position of the match.
+
+* **`SimpleTagInlineProcessor(pattern, tag)`**:
+
+    Returns an element of type "`tag`" with a text attribute of `group(3)`
+    of a `pattern`. `tag` should be a string of a HTML element (i.e.: 'em').
+    It also returns the start and end position of the match.
+
+* **`SubstituteTagInlineProcessor(pattern, tag)`**:
+
+    Returns an element of type "`tag`" with no children or text (i.e.: `br`)
+    and the start and end position of the match.
+
+A very small number of the basic legacy processors are still available to
+prevent breakage of 3rd party extensions during the transition period to the
+new processors. Three of the available processors are listed below.
+
 * **`SimpleTextPattern(pattern)`**:
 
     Returns simple text of `group(2)` of a `pattern`.
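
The `handleMatch(m, data)` contract that the added "Future" section describes (return the element plus the start and end indexes it replaces in `data`, or `None` boundaries to reject the match) can be illustrated with a minimal, standard-library-only sketch. This is not the library's implementation: `SketchInlineProcessor`, `EmphasisSketch`, and `apply_processor` are hypothetical names invented for this illustration, and the driver loop is only an assumption about how a parser might consume the returned boundaries.

```python
import re
import xml.etree.ElementTree as etree


class SketchInlineProcessor:
    """Hypothetical stand-in for the contract described in the patch.

    Not markdown.inlinepatterns.InlineProcessor; it only mirrors the idea
    that the pattern is searched as written, without wrapping groups.
    """

    def __init__(self, pattern):
        self.compiled_re = re.compile(pattern)

    def handleMatch(self, m, data):
        raise NotImplementedError


class EmphasisSketch(SketchInlineProcessor):
    def handleMatch(self, m, data):
        # Reject matches whose content is only whitespace by returning
        # None boundaries, which tells the driver to leave `data` alone.
        if not m.group(1).strip():
            return None, None, None
        el = etree.Element('em')
        el.text = m.group(1)  # groups start at 1: nothing is prepended
        return el, m.start(0), m.end(0)


def apply_processor(processor, data):
    """Hypothetical driver: split `data` into text and element pieces."""
    pieces = []
    pos = 0          # start of the text that has not been emitted yet
    search_from = 0  # where the next regex search begins
    while True:
        m = processor.compiled_re.search(data, search_from)
        if m is None:
            break
        el, start, end = processor.handleMatch(m, data)
        if start is None or end is None:
            # No match took place; always advance so the loop terminates.
            search_from = max(m.end(0), search_from + 1)
            continue
        pieces.append(data[pos:start])
        pieces.append(el)
        pos = search_from = end
    pieces.append(data[pos:])
    return pieces


pieces = apply_processor(EmphasisSketch(r'\*([^*]+)\*'), "a *b* c * * d")
print([p if isinstance(p, str) else (p.tag, p.text) for p in pieces])
# prints ['a ', ('em', 'b'), ' c * * d']
```

The second candidate, `* *`, is matched by the regular expression but rejected by `handleMatch`, so that stretch of `data` passes through unchanged. This is the same mechanism the link snippet in the patch relies on when the text after `[` turns out not to be a link.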