-rw-r--r-- docs/AUTHORS | 1
-rw-r--r-- docs/CHANGE_LOG | 6
-rw-r--r-- docs/release-2.0.2.txt | 9
-rw-r--r-- docs/using_as_module.txt | 8
-rw-r--r-- docs/writing_extensions.txt | 36
-rw-r--r-- markdown/__init__.py | 59
-rw-r--r-- markdown/commandline.py | 47
-rw-r--r-- markdown/extensions/codehilite.py | 90
-rw-r--r-- markdown/extensions/extra.py | 2
-rw-r--r-- markdown/extensions/fenced_code.py | 66
-rw-r--r-- markdown/extensions/footnotes.py | 24
-rw-r--r-- markdown/extensions/toc.py | 13
-rw-r--r-- markdown/inlinepatterns.py | 2
-rw-r--r-- markdown/preprocessors.py | 112
-rw-r--r-- markdown/tests/misc/div.html | 3
-rw-r--r-- markdown/tests/misc/html.html | 4
-rw-r--r-- markdown/tests/misc/html.txt | 4
-rw-r--r-- markdown/tests/misc/multi-line-tags.html | 3
-rw-r--r-- markdown/tests/misc/multiline-comments.html | 5
-rw-r--r-- markdown/treeprocessors.py | 35
-rwxr-xr-x setup.py | 76
-rw-r--r-- tests/extensions-x-extra/raw-html.html | 14
-rw-r--r-- tests/extensions-x-extra/raw-html.txt | 12
-rw-r--r-- tests/extensions-x-toc/nested.html | 16
-rw-r--r-- tests/extensions-x-toc/nested.txt | 9
-rw-r--r-- tests/extensions-x-toc/nested2.html | 14
-rw-r--r-- tests/extensions-x-toc/nested2.txt | 10
-rw-r--r-- tests/misc/raw_whitespace.html | 8
-rw-r--r-- tests/misc/raw_whitespace.txt | 10
-rw-r--r-- tests/misc/smart_em.html | 5
-rw-r--r-- tests/misc/smart_em.txt | 9
31 files changed, 489 insertions, 223 deletions
diff --git a/docs/AUTHORS b/docs/AUTHORS
index cfe2b34..2843b56 100644
--- a/docs/AUTHORS
+++ b/docs/AUTHORS
@@ -37,6 +37,7 @@ Daniel Krech
Steward Midwinter
Jack Miller
Neale Pickett
+Paul Stansifer
John Szakmeister
Malcolm Tredinnick
Ben Wilson
diff --git a/docs/CHANGE_LOG b/docs/CHANGE_LOG
index 1b4af45..e005ff8 100644
--- a/docs/CHANGE_LOG
+++ b/docs/CHANGE_LOG
@@ -1,6 +1,12 @@
PYTHON MARKDOWN CHANGELOG
=========================
+Sept 28, 2009: Released version 2.0.2-Final.
+
+May 20, 2009: Released version 2.0.1-Final.
+
+Mar 30, 2009: Released version 2.0-Final.
+
Mar 8, 2009: Release Candidate 2.0-rc-1.
Feb 2009: Added support for multi-level lists to new Blockprocessors.
diff --git a/docs/release-2.0.2.txt b/docs/release-2.0.2.txt
new file mode 100644
index 0000000..8ae9a3d
--- /dev/null
+++ b/docs/release-2.0.2.txt
@@ -0,0 +1,9 @@
+Python-Markdown 2.0.2 Release Notes
+===================================
+
+Python-Markdown 2.0.2 is a bug-fix release. No new features have been added.
+Most notably, the setup script has been updated to include a dependency on
+ElementTree on older versions of Python (< 2.5). There have also been a few
+fixes for minor parsing bugs in some edge cases. For a full list of changes,
+see the git log.
+
diff --git a/docs/using_as_module.txt b/docs/using_as_module.txt
index cfeb88d..130d0a7 100644
--- a/docs/using_as_module.txt
+++ b/docs/using_as_module.txt
@@ -20,13 +20,13 @@ string should work) and returns output as Unicode. Do not pass encoded strings
If your input is encoded, e.g. as UTF-8, it is your responsibility to decode
it. E.g.:
- input_file = codecs.open("some_file.txt", mode="r", encoding="utf8")
+ input_file = codecs.open("some_file.txt", mode="r", encoding="utf-8")
text = input_file.read()
html = markdown.markdown(text, extensions)
If you later want to write it to disk, you should encode it yourself:
- output_file = codecs.open("some_file.html", "w", encoding="utf8")
+ output_file = codecs.open("some_file.html", "w", encoding="utf-8")
output_file.write(html)
More Options
@@ -61,7 +61,7 @@ The ``Markdown`` class has the method ``convertFile`` which reads in a file and
writes out to a file-like-object:
md = markdown.Markdown()
- md.convertFile(input="in.txt", output="out.html", encoding="utf8")
+ md.convertFile(input="in.txt", output="out.html", encoding="utf-8")
The markdown module also includes a shortcut function ``markdownFromFile`` that
wraps the above method.
@@ -69,7 +69,7 @@ wraps the above method.
markdown.markdownFromFile(input="in.txt",
output="out.html",
extensions=[],
- encoding="utf8",
+ encoding="utf-8",
safe=False)
In either case, if the ``output`` keyword is passed a file name (i.e.:
diff --git a/docs/writing_extensions.txt b/docs/writing_extensions.txt
index 860c2ec..3aad74a 100644
--- a/docs/writing_extensions.txt
+++ b/docs/writing_extensions.txt
@@ -38,15 +38,15 @@ Preprocessors munge the source text before it is passed into the Markdown
core. This is an excellent place to clean up bad syntax, extract things the
parser may otherwise choke on and perhaps even store it for later retrieval.
-Preprocessors should inherit from ``markdown.Preprocessor`` and implement
-a ``run`` method with one argument ``lines``. The ``run`` method of each
-Preprocessor will be passed the entire source text as a list of Unicode strings.
-Each string will contain one line of text. The ``run`` method should return
-either that list, or an altered list of Unicode strings.
+Preprocessors should inherit from ``markdown.preprocessors.Preprocessor`` and
+implement a ``run`` method with one argument ``lines``. The ``run`` method of
+each Preprocessor will be passed the entire source text as a list of Unicode
+strings. Each string will contain one line of text. The ``run`` method should
+return either that list, or an altered list of Unicode strings.
A pseudo example:
- class MyPreprocessor(markdown.Preprocessor):
+ class MyPreprocessor(markdown.preprocessors.Preprocessor):
def run(self, lines):
new_lines = []
for line in lines:
@@ -61,9 +61,9 @@ A pseudo example:
Inline Patterns implement the inline HTML element syntax for Markdown such as
``*emphasis*`` or ``[links](http://example.com)``. Pattern objects should be
-instances of classes that inherit from ``markdown.Pattern`` or one of its
-children. Each pattern object uses a single regular expression and must have
-the following methods:
+instances of classes that inherit from ``markdown.inlinepatterns.Pattern`` or
+one of its children. Each pattern object uses a single regular expression and
+must have the following methods:
* **``getCompiledRegExp()``**:
@@ -84,7 +84,7 @@ match everything before the pattern.
For an example, consider this simplified emphasis pattern:
- class EmphasisPattern(markdown.Pattern):
+ class EmphasisPattern(markdown.inlinepatterns.Pattern):
def handleMatch(self, m):
el = markdown.etree.Element('em')
el.text = m.group(3)
@@ -135,13 +135,13 @@ core BlockParser. This is where additional manipulation of the tree takes
place. Additionally, the InlineProcessor is a Treeprocessor which steps through
the tree and runs the InlinePatterns on the text of each Element in the tree.
-A Treeprocessor should inherit from ``markdown.Treeprocessor``,
+A Treeprocessor should inherit from ``markdown.treeprocessors.Treeprocessor``,
over-ride the ``run`` method which takes one argument ``root`` (an ElementTree
object) and returns either that root element or a modified root element.
A pseudo example:
- class MyTreeprocessor(markdown.Treeprocessor):
+ class MyTreeprocessor(markdown.treeprocessors.Treeprocessor):
def run(self, root):
#do stuff
return my_modified_root
@@ -155,15 +155,15 @@ Postprocessors manipulate the document after the ElementTree has been
serialized into a string. Postprocessors should be used to work with the
text just before output.
-A Postprocessor should inherit from ``markdown.Postprocessor`` and
-over-ride the ``run`` method which takes one argument ``text`` and returns a
-Unicode string.
+A Postprocessor should inherit from ``markdown.postprocessors.Postprocessor``
+and over-ride the ``run`` method which takes one argument ``text`` and returns
+a Unicode string.
Postprocessors are run after the ElementTree has been serialized back into
Unicode text. For example, this may be an appropriate place to add a table of
contents to a document:
- class TocPostprocessor(markdown.Postprocessor):
+ class TocPostprocessor(markdown.postprocessors.Postprocessor):
def run(self, text):
return MYMARKERRE.sub(MyToc, text)
@@ -179,8 +179,8 @@ That Blockprocessor parses the block and adds it to the ElementTree. The
[[Definition Lists]] extension would be a good example of an extension that
adds/modifies Blockprocessors.
-A Blockprocessor should inherit from ``markdown.BlockProcessor`` and implement
-both the ``test`` and ``run`` methods.
+A Blockprocessor should inherit from ``markdown.blockprocessors.BlockProcessor``
+and implement both the ``test`` and ``run`` methods.
The ``test`` method is used by BlockParser to identify the type of block.
Therefore the ``test`` method must return a boolean value. If the test returns
diff --git a/markdown/__init__.py b/markdown/__init__.py
index 086fde9..26314f6 100644
--- a/markdown/__init__.py
+++ b/markdown/__init__.py
@@ -39,8 +39,8 @@ Copyright 2004 Manfred Stienstra (the original version)
License: BSD (see docs/LICENSE for details).
"""
-version = "2.0.1"
-version_info = (2,0,1, "Final")
+version = "2.0.3"
+version_info = (2,0,3, "Final")
import re
import codecs
@@ -182,7 +182,7 @@ class Markdown:
def __init__(self,
extensions=[],
extension_configs={},
- safe_mode = False,
+ safe_mode = False,
output_format=DEFAULT_OUTPUT_FORMAT):
"""
Creates a new Markdown instance.
@@ -200,12 +200,12 @@ class Markdown:
* "xhtml": Outputs latest supported version of XHTML (currently XHTML 1.1).
* "html4": Outputs HTML 4
* "html": Outputs latest supported version of HTML (currently HTML 4).
- Note that it is suggested that the more specific formats ("xhtml1"
+ Note that it is suggested that the more specific formats ("xhtml1"
and "html4") be used as "xhtml" or "html" may change in the future
- if it makes sense at that time.
+ if it makes sense at that time.
"""
-
+
self.safeMode = safe_mode
self.registeredExtensions = []
self.docType = ""
@@ -300,9 +300,9 @@ class Markdown:
# Map format keys to serializers
self.output_formats = {
- 'html' : html4.to_html_string,
+ 'html' : html4.to_html_string,
'html4' : html4.to_html_string,
- 'xhtml' : etree.tostring,
+ 'xhtml' : etree.tostring,
'xhtml1': etree.tostring,
}
@@ -327,12 +327,14 @@ class Markdown:
for ext in extensions:
if isinstance(ext, basestring):
ext = load_extension(ext, configs.get(ext, []))
- try:
- ext.extendMarkdown(self, globals())
- except AttributeError:
- message(ERROR, "Incorrect type! Extension '%s' is "
- "neither a string or an Extension." %(repr(ext)))
-
+ if isinstance(ext, Extension):
+ try:
+ ext.extendMarkdown(self, globals())
+ except NotImplementedError, e:
+ message(ERROR, e)
+ else:
+ message(ERROR, 'Extension "%s.%s" must be of type: "markdown.Extension".' \
+ % (ext.__class__.__module__, ext.__class__.__name__))
def registerExtension(self, extension):
""" This gets called by the extension """
@@ -346,7 +348,8 @@ class Markdown:
self.references.clear()
for extension in self.registeredExtensions:
- extension.reset()
+ if hasattr(extension, 'reset'):
+ extension.reset()
def set_output_format(self, format):
""" Set the output format for the class instance. """
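Review note: the extension-loading hunk above replaces duck typing (catching `AttributeError`) with an explicit `isinstance` check, and the base `extendMarkdown` now raises `NotImplementedError`. The following is a minimal standalone sketch of that contract in modern Python 3 syntax. The `Extension` class here is a simplified stand-in, and `register_all`, `GoodExtension`, and `LazyExtension` are hypothetical names used only for illustration; the patch itself reports errors through `message(ERROR, ...)` rather than returning a list.

```python
class Extension:
    """Simplified stand-in for markdown.Extension (illustration only)."""

    def extendMarkdown(self, md, md_globals):
        # The base class refuses to run: subclasses must override this method.
        raise NotImplementedError(
            'Extension "%s.%s" must define an "extendMarkdown" method.'
            % (self.__class__.__module__, self.__class__.__name__))


class GoodExtension(Extension):
    def extendMarkdown(self, md, md_globals):
        md.append("good")  # pretend to register something on the instance


class LazyExtension(Extension):
    pass  # forgot to override extendMarkdown


def register_all(md, extensions):
    """Hypothetical helper: apply each extension and collect error messages."""
    errors = []
    for ext in extensions:
        if isinstance(ext, Extension):
            try:
                ext.extendMarkdown(md, globals())
            except NotImplementedError as e:
                errors.append(str(e))
        else:
            # Not an Extension at all: report the offending type by name.
            errors.append(
                'Extension "%s.%s" must be of type: "markdown.Extension".'
                % (ext.__class__.__module__, ext.__class__.__name__))
    return errors


md = []
for problem in register_all(md, [GoodExtension(), LazyExtension(), object()]):
    print(problem)
```

The design point is that a wrong type and a missing override now produce two distinct messages, where the old `except AttributeError` branch could also swallow unrelated attribute errors raised inside a working extension.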
@@ -395,7 +398,7 @@ class Markdown:
root = newRoot
# Serialize _properly_. Strip top-level tags.
- output, length = codecs.utf_8_decode(self.serializer(root, encoding="utf8"))
+ output, length = codecs.utf_8_decode(self.serializer(root, encoding="utf-8"))
if self.stripTopLevelTags:
try:
start = output.index('<%s>'%DOC_TAG)+len(DOC_TAG)+2
@@ -429,7 +432,7 @@ class Markdown:
Keyword arguments:
- * input: Name of source text file.
+ * input: File object or path of file as string.
* output: Name of output file. Writes to stdout if `None`.
* encoding: Encoding of input and output files. Defaults to utf-8.
@@ -438,7 +441,10 @@ class Markdown:
encoding = encoding or "utf-8"
# Read the source
- input_file = codecs.open(input, mode="r", encoding=encoding)
+ if isinstance(input, basestring):
+ input_file = codecs.open(input, mode="r", encoding=encoding)
+ else:
+ input_file = input
text = input_file.read()
input_file.close()
text = text.lstrip(u'\ufeff') # remove the byte-order mark
@@ -447,7 +453,7 @@ class Markdown:
html = self.convert(text)
# Write to file or stdout
- if isinstance(output, (str, unicode)):
+ if isinstance(output, basestring):
output_file = codecs.open(output, "w", encoding=encoding)
output_file.write(html)
output_file.close()
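Review note: the `convertFile` hunks above let `input` be either a path or an already-open file object. A minimal Python 3 sketch of that dispatch follows; `read_source` is a hypothetical helper name, and the built-in `open` is used here where the patch uses `codecs.open` and `basestring`.

```python
import io


def read_source(source, encoding="utf-8"):
    """Return text from a path (str) or from an already-open text stream."""
    if isinstance(source, str):
        # A string is treated as a file path and opened with the given encoding.
        input_file = open(source, mode="r", encoding=encoding)
    else:
        # Anything else is assumed to be a file-like object that yields text.
        input_file = source
    text = input_file.read()
    input_file.close()
    return text.lstrip("\ufeff")  # remove the byte-order mark


print(read_source(io.StringIO("\ufeff# Title\n")))
```

Note that, as in the patch, a caller-supplied stream is closed after reading, so it cannot be reused afterwards.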
@@ -499,7 +505,8 @@ class Extension:
* md_globals: Global variables in the markdown module namespace.
"""
- pass
+ raise NotImplementedError, 'Extension "%s.%s" must define an "extendMarkdown"' \
+ 'method.' % (self.__class__.__module__, self.__class__.__name__)
def load_extension(ext_name, configs = []):
@@ -540,8 +547,8 @@ def load_extension(ext_name, configs = []):
# function called makeExtension()
try:
return module.makeExtension(configs.items())
- except AttributeError:
- message(CRITICAL, "Failed to initiate extension '%s'" % ext_name)
+ except AttributeError, e:
+ message(CRITICAL, "Failed to initiate extension '%s': %s" % (ext_name, e))
def load_extensions(ext_names):
@@ -582,15 +589,15 @@ def markdown(text,
* "xhtml": Outputs latest supported version of XHTML (currently XHTML 1.1).
* "html4": Outputs HTML 4
* "html": Outputs latest supported version of HTML (currently HTML 4).
- Note that it is suggested that the more specific formats ("xhtml1"
+ Note that it is suggested that the more specific formats ("xhtml1"
and "html4") be used as "xhtml" or "html" may change in the future
- if it makes sense at that time.
+ if it makes sense at that time.
Returns: An HTML document as a string.
"""
md = Markdown(extensions=load_extensions(extensions),
- safe_mode=safe_mode,
+ safe_mode=safe_mode,
output_format=output_format)
return md.convert(text)
@@ -602,7 +609,7 @@ def markdownFromFile(input = None,
safe_mode = False,
output_format = DEFAULT_OUTPUT_FORMAT):
"""Read markdown code from a file and write it to a file or a stream."""
- md = Markdown(extensions=load_extensions(extensions),
+ md = Markdown(extensions=load_extensions(extensions),
safe_mode=safe_mode,
output_format=output_format)
md.convertFile(input, output, encoding)
diff --git a/markdown/commandline.py b/markdown/commandline.py
index 1eedc6d..dce2b8a 100644
--- a/markdown/commandline.py
+++ b/markdown/commandline.py
@@ -2,47 +2,25 @@
COMMAND-LINE SPECIFIC STUFF
=============================================================================
-The rest of the code is specifically for handling the case where Python
-Markdown is called from the command line.
"""
import markdown
import sys
+import optparse
import logging
from logging import DEBUG, INFO, WARN, ERROR, CRITICAL
-EXECUTABLE_NAME_FOR_USAGE = "python markdown.py"
-""" The name used in the usage statement displayed for python versions < 2.3.
-(With python 2.3 and higher the usage statement is generated by optparse
-and uses the actual name of the executable called.) """
-
-OPTPARSE_WARNING = """
-Python 2.3 or higher required for advanced command line options.
-For lower versions of Python use:
-
- %s INPUT_FILE > OUTPUT_FILE
-
-""" % EXECUTABLE_NAME_FOR_USAGE
-
def parse_options():
"""
Define and parse `optparse` options for command-line usage.
"""
-
- try:
- optparse = __import__("optparse")
- except:
- if len(sys.argv) == 2:
- return {'input': sys.argv[1],
- 'output': None,
- 'safe': False,
- 'extensions': [],
- 'encoding': None }, CRITICAL
- else:
- print OPTPARSE_WARNING
- return None, None
-
- parser = optparse.OptionParser(usage="%prog INPUTFILE [options]")
+ usage = """%prog [options] [INPUTFILE]
+ (STDIN is assumed if no INPUTFILE is given)"""
+ desc = "A Python implementation of John Gruber's Markdown. " \
+ "http://www.freewisdom.org/projects/python-markdown/"
+ ver = "%%prog %s" % markdown.version
+
+ parser = optparse.OptionParser(usage=usage, description=desc, version=ver)
parser.add_option("-f", "--file", dest="filename", default=sys.stdout,
help="write output to OUTPUT_FILE",
metavar="OUTPUT_FILE")
@@ -56,10 +34,10 @@ def parse_options():
help="print info messages")
parser.add_option("-s", "--safe", dest="safe", default=False,
metavar="SAFE_MODE",
- help="safe mode ('replace', 'remove' or 'escape' user's HTML tag)")
+ help="'replace', 'remove' or 'escape' HTML tags in input")
parser.add_option("-o", "--output_format", dest="output_format",
default='xhtml1', metavar="OUTPUT_FORMAT",
- help="Format of output. One of 'xhtml1' (default) or 'html4'.")
+ help="'xhtml1' (default) or 'html4'.")
parser.add_option("--noisy",
action="store_const", const=DEBUG, dest="verbose",
help="print debug messages")
@@ -68,9 +46,8 @@ def parse_options():
(options, args) = parser.parse_args()
- if not len(args) == 1:
- parser.print_help()
- return None, None
+ if len(args) == 0:
+ input_file = sys.stdin
else:
input_file = args[0]
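Review note: with this hunk the command line no longer requires exactly one positional argument and falls back to STDIN when none is given. The sketch below reproduces only that behaviour with `optparse`. It is a reduced, assumption-laden version: the version string is a placeholder, only the `-f` option is kept, and the returned dictionary is a simplification of what the real `parse_options` returns.

```python
import optparse
import sys


def parse_options(argv):
    """Parse a reduced option set; read from STDIN when no INPUTFILE is given."""
    usage = ("%prog [options] [INPUTFILE]\n"
             "       (STDIN is assumed if no INPUTFILE is given)")
    # The version string below is a placeholder, not the library's version.
    parser = optparse.OptionParser(usage=usage, version="%prog 0.0-example")
    parser.add_option("-f", "--file", dest="filename", default=sys.stdout,
                      help="write output to OUTPUT_FILE",
                      metavar="OUTPUT_FILE")
    (options, args) = parser.parse_args(argv)

    if len(args) == 0:
        input_file = sys.stdin   # no positional argument: read from STDIN
    else:
        input_file = args[0]     # first positional argument is the input path
    return {"input": input_file, "output": options.filename}


print(parse_options(["-f", "out.html", "in.txt"]))
```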
diff --git a/markdown/extensions/codehilite.py b/markdown/extensions/codehilite.py
index c5d496b..b9e1760 100644
--- a/markdown/extensions/codehilite.py
+++ b/markdown/extensions/codehilite.py
@@ -10,9 +10,9 @@ Copyright 2006-2008 [Waylan Limberg](http://achinghead.com/).
Project website: <http://www.freewisdom.org/project/python-markdown/CodeHilite>
Contact: markdown@freewisdom.org
-
+
License: BSD (see ../docs/LICENSE for details)
-
+
Dependencies:
* [Python 2.3+](http://python.org/)
* [Markdown 2.0+](http://www.freewisdom.org/projects/python-markdown/)
@@ -38,41 +38,45 @@ class CodeHilite:
Basic Usage:
>>> code = CodeHilite(src = 'some text')
>>> html = code.hilite()
-
+
* src: Source string or any object with a .readline attribute.
-
+
* linenos: (Boolen) Turn line numbering 'on' or 'off' (off by default).
* css_class: Set class name of wrapper div ('codehilite' by default).
-
+
Low Level Usage:
>>> code = CodeHilite()
>>> code.src = 'some text' # String or anything with a .readline attr.
>>> code.linenos = True # True or False; Turns line numbering on or of.
>>> html = code.hilite()
-
+
"""
- def __init__(self, src=None, linenos=False, css_class="codehilite"):
+ def __init__(self, src=None, linenos=False, css_class="codehilite",
+ lang=None, style='default', noclasses=False):
self.src = src
- self.lang = None
+ self.lang = lang
self.linenos = linenos
self.css_class = css_class
+ self.style = style
+ self.noclasses = noclasses
def hilite(self):
"""
- Pass code to the [Pygments](http://pygments.pocoo.org/) highliter with
- optional line numbers. The output should then be styled with css to
- your liking. No styles are applied by default - only styling hooks
- (i.e.: <span class="k">).
+ Pass code to the [Pygments](http://pygments.pocoo.org/) highliter with
+ optional line numbers. The output should then be styled with css to
+ your liking. No styles are applied by default - only styling hooks
+ (i.e.: <span class="k">).
returns : A string of html.
-
+
"""
self.src = self.src.strip('\n')
-
- self._getLang()
+
+ if self.lang == None:
+ self._getLang()
try:
from pygments import highlight
@@ -96,8 +100,10 @@ class CodeHilite:
lexer = guess_lexer(self.src)
except ValueError:
lexer = TextLexer()
- formatter = HtmlFormatter(linenos=self.linenos,
- cssclass=self.css_class)
+ formatter = HtmlFormatter(linenos=self.linenos,
+ cssclass=self.css_class,
+ style=self.style,
+ noclasses=self.noclasses)
return highlight(self.src, lexer, formatter)
def _escape(self, txt):
@@ -114,8 +120,8 @@ class CodeHilite:
txt = txt.replace('\t', ' '*TAB_LENGTH)
txt = txt.replace(" "*4, "&nbsp; &nbsp; ")
txt = txt.replace(" "*3, "&nbsp; &nbsp;")
- txt = txt.replace(" "*2, "&nbsp; ")
-
+ txt = txt.replace(" "*2, "&nbsp; ")
+
# Add line numbers
lines = txt.splitlines()
txt = '<div class="codehilite"><pre><ol>\n'
@@ -126,31 +132,31 @@ class CodeHilite:
def _getLang(self):
- """
+ """
Determines language of a code block from shebang lines and whether said
line should be removed or left in place. If the sheband line contains a
path (even a single /) then it is assumed to be a real shebang lines and
- left alone. However, if no path is given (e.i.: #!python or :::python)
+ left alone. However, if no path is given (e.i.: #!python or :::python)
then it is assumed to be a mock shebang for language identifitation of a
- code fragment and removed from the code block prior to processing for
- code highlighting. When a mock shebang (e.i: #!python) is found, line
- numbering is turned on. When colons are found in place of a shebang
- (e.i.: :::python), line numbering is left in the current state - off
+ code fragment and removed from the code block prior to processing for
+ code highlighting. When a mock shebang (e.i: #!python) is found, line
+ numbering is turned on. When colons are found in place of a shebang
+ (e.i.: :::python), line numbering is left in the current state - off
by default.
-
+
"""
import re
-
+
#split text into lines
lines = self.src.split("\n")
#pull first line to examine
fl = lines.pop(0)
-
+
c = re.compile(r'''
(?:(?:::+)|(?P<shebang>[#]!)) # Shebang or 2 or more colons.
- (?P<path>(?:/\w+)*[/ ])? # Zero or 1 path
- (?P<lang>[\w+-]*) # The language
+ (?P<path>(?:/\w+)*[/ ])? # Zero or 1 path
+ (?P<lang>[\w+-]*) # The language
''', re.VERBOSE)
# search first line for shebang
m = c.search(fl)
@@ -169,7 +175,7 @@ class CodeHilite:
else:
# No match
lines.insert(0, fl)
-
+
self.src = "\n".join(lines).strip("\n")
@@ -184,14 +190,16 @@ class HiliteTreeprocessor(markdown.treeprocessors.Treeprocessor):
for block in blocks:
children = block.getchildren()
if len(children) == 1 and children[0].tag == 'code':
- code = CodeHilite(children[0].text,
+ code = CodeHilite(children[0].text,
linenos=self.config['force_linenos'][0],
- css_class=self.config['css_class'][0])
- placeholder = self.markdown.htmlStash.store(code.hilite(),
+ css_class=self.config['css_class'][0],
+ style=self.config['pygments_style'][0],
+ noclasses=self.config['noclasses'][0])
+ placeholder = self.markdown.htmlStash.store(code.hilite(),
safe=True)
# Clear codeblock in etree instance
block.clear()
- # Change to p element which will later
+ # Change to p element which will later
# be removed when inserting raw html
block.tag = 'p'
block.text = placeholder
@@ -204,19 +212,23 @@ class CodeHiliteExtension(markdown.Extension):
# define default configs
self.config = {
'force_linenos' : [False, "Force line numbers - Default: False"],
- 'css_class' : ["codehilite",
+ 'css_class' : ["codehilite",
"Set class name for wrapper <div> - Default: codehilite"],
+ 'pygments_style' : ['tango', 'Pygments HTML Formatter Style (Colorscheme) - Default: tango'],
+ 'noclasses': [False, 'Use inline styles instead of CSS classes - Default false']
}
-
+
# Override defaults with user settings
for key, value in configs:
- self.setConfig(key, value)
+ self.setConfig(key, value)
def extendMarkdown(self, md, md_globals):
""" Add HilitePostprocessor to Markdown instance. """
hiliter = HiliteTreeprocessor(md)
hiliter.config = self.config
- md.treeprocessors.add("hilite", hiliter, "_begin")
+ md.treeprocessors.add("hilite", hiliter, "_begin")
+
+ md.registerExtension(self)
def makeExtension(configs={}):
diff --git a/markdown/extensions/extra.py b/markdown/extensions/extra.py
index 4a2ffbf..e569029 100644
--- a/markdown/extensions/extra.py
+++ b/markdown/extensions/extra.py
@@ -44,6 +44,8 @@ class ExtraExtension(markdown.Extension):
def extendMarkdown(self, md, md_globals):
""" Register extension instances. """
md.registerExtensions(extensions, self.config)
+ # Turn on processing of markdown text within raw html
+ md.preprocessors['html_block'].markdown_in_raw = True
def makeExtension(configs={}):
return ExtraExtension(configs=dict(configs))
diff --git a/markdown/extensions/fenced_code.py b/markdown/extensions/fenced_code.py
index 307b1dc..2b03bbc 100644
--- a/markdown/extensions/fenced_code.py
+++ b/markdown/extensions/fenced_code.py
@@ -9,7 +9,7 @@ This extension adds Fenced Code Blocks to Python-Markdown.
>>> import markdown
>>> text = '''
... A paragraph before a fenced code block:
- ...
+ ...
... ~~~
... Fenced code block
... ~~~
@@ -22,14 +22,14 @@ Works with safe_mode also (we check this because we are using the HtmlStash):
>>> markdown.markdown(text, extensions=['fenced_code'], safe_mode='replace')
u'<p>A paragraph before a fenced code block:</p>\\n<pre><code>Fenced code block\\n</code></pre>'
-
+
Include tilde's in a code block and wrap with blank lines:
>>> text = '''
... ~~~~~~~~
- ...
+ ...
... ~~~~
- ...
+ ...
... ~~~~~~~~'''
>>> markdown.markdown(text, extensions=['fenced_code'])
u'<pre><code>\\n~~~~\\n\\n</code></pre>'
@@ -40,7 +40,7 @@ Multiple blocks and language tags:
... ~~~~{.python}
... block one
... ~~~~
- ...
+ ...
... ~~~~.html
... <p>block two</p>
... ~~~~'''
@@ -52,39 +52,63 @@ Copyright 2007-2008 [Waylan Limberg](http://achinghead.com/).
Project website: <http://www.freewisdom.org/project/python-markdown/Fenced__Code__Blocks>
Contact: markdown@freewisdom.org
-License: BSD (see ../docs/LICENSE for details)
+License: BSD (see ../docs/LICENSE for details)
Dependencies:
-* [Python 2.3+](http://python.org)
+* [Python 2.4+](http://python.org)
* [Markdown 2.0+](http://www.freewisdom.org/projects/python-markdown/)
+* [Pygments (optional)](http://pygments.org)
"""
import markdown, re
+from markdown.extensions.codehilite import CodeHilite, CodeHiliteExtension
# Global vars
FENCED_BLOCK_RE = re.compile( \
- r'(?P<fence>^~{3,})[ ]*(\{?\.(?P<lang>[a-zA-Z0-9_-]*)\}?)?[ ]*\n(?P<code>.*?)(?P=fence)[ ]*$',
+ r'(?P<fence>^~{3,})[ ]*(\{?\.(?P<lang>[a-zA-Z0-9_-]*)\}?)?[ ]*\n(?P<code>.*?)(?P=fence)[ ]*$',
re.MULTILINE|re.DOTALL
)
CODE_WRAP = '<pre><code%s>%s</code></pre>'
LANG_TAG = ' class="%s"'
-
class FencedCodeExtension(markdown.Extension):
def extendMarkdown(self, md, md_globals):
""" Add FencedBlockPreprocessor to the Markdown instance. """
+ md.registerExtension(self)
- md.preprocessors.add('fenced_code_block',
- FencedBlockPreprocessor(md),
+ md.preprocessors.add('fenced_code_block',
+ FencedBlockPreprocessor(md),
"_begin")
class FencedBlockPreprocessor(markdown.preprocessors.Preprocessor):
-
+
+ def __init__(self, md):
+ markdown.preprocessors.Preprocessor.__init__(self, md)
+
+ self.checked_for_codehilite = False
+ self.codehilite_conf = {}
+
+ def getConfig(self, key):
+ if key in self.config:
+ return self.config[key][0]
+ else:
+ return None
+
def run(self, lines):
""" Match and store Fenced Code Blocks in the HtmlStash. """
+
+ # Check for code hilite extension
+ if not self.checked_for_codehilite:
+ for ext in self.markdown.registeredExtensions:
+ if isinstance(ext, CodeHiliteExtension):
+ self.codehilite_conf = ext.config
+ break
+
+ self.checked_for_codehilite = True
+
text = "\n".join(lines)
while 1:
m = FENCED_BLOCK_RE.search(text)
@@ -92,7 +116,21 @@ class FencedBlockPreprocessor(markdown.preprocessors.Preprocessor):
lang = ''
if m.group('lang'):
lang = LANG_TAG % m.group('lang')
- code = CODE_WRAP % (lang, self._escape(m.group('code')))
+
+ # If config is not empty, then the codehighlite extension
+ # is enabled, so we call it to highlite the code
+ if self.codehilite_conf:
+ highliter = CodeHilite(m.group('code'),
+ linenos=self.codehilite_conf['force_linenos'][0],
+ css_class=self.codehilite_conf['css_class'][0],
+ style=self.codehilite_conf['pygments_style'][0],
+ lang=(m.group('lang') if m.group('lang') else None),
+ noclasses=self.codehilite_conf['noclasses'][0])
+
+ code = highliter.hilite()
+ else:
+ code = CODE_WRAP % (lang, self._escape(m.group('code')))
+
placeholder = self.markdown.htmlStash.store(code, safe=True)
text = '%s\n%s\n%s'% (text[:m.start()], placeholder, text[m.end():])
else:
@@ -109,7 +147,7 @@ class FencedBlockPreprocessor(markdown.preprocessors.Preprocessor):
def makeExtension(configs=None):
- return FencedCodeExtension()
+ return FencedCodeExtension(configs=configs)
if __name__ == "__main__":
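Review note: the fenced_code changes pass the captured `lang` group on to `CodeHilite` when the codehilite extension is registered. The sketch below exercises only the regular expression from the diff, to show what the `lang` and `code` groups contain; `find_fenced_blocks` is a hypothetical helper and no highlighting is performed.

```python
import re

# Same pattern as FENCED_BLOCK_RE in the diff above.
FENCED_BLOCK_RE = re.compile(
    r'(?P<fence>^~{3,})[ ]*(\{?\.(?P<lang>[a-zA-Z0-9_-]*)\}?)?[ ]*\n'
    r'(?P<code>.*?)(?P=fence)[ ]*$',
    re.MULTILINE | re.DOTALL)


def find_fenced_blocks(text):
    """Return a list of (lang, code) pairs; lang is None when no tag is given."""
    blocks = []
    for m in FENCED_BLOCK_RE.finditer(text):
        # Mirrors `m.group('lang') if m.group('lang') else None` in the patch.
        blocks.append((m.group('lang') or None, m.group('code')))
    return blocks


text = "~~~~{.python}\nblock one\n~~~~\n\n~~~~.html\n<p>block two</p>\n~~~~"
for lang, code in find_fenced_blocks(text):
    print(lang, repr(code))
```

Both tag spellings, `{.python}` and `.html`, end up in the same `lang` group, which is why the patch can hand it straight to the highlighter.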
diff --git a/markdown/extensions/footnotes.py b/markdown/extensions/footnotes.py
index 6dacab7..e1a9cda 100644
--- a/markdown/extensions/footnotes.py
+++ b/markdown/extensions/footnotes.py
@@ -38,11 +38,18 @@ class FootnoteExtension(markdown.Extension):
""" Setup configs. """
self.config = {'PLACE_MARKER':
["///Footnotes Go Here///",
- "The text string that marks where the footnotes go"]}
+ "The text string that marks where the footnotes go"],
+ 'UNIQUE_IDS':
+ [False,
+ "Avoid name collisions across "
+ "multiple calls to reset()."]}
for key, value in configs:
self.config[key][0] = value
-
+
+ # In multiple invocations, emit links that don't get tangled.
+ self.unique_prefix = 0
+
self.reset()
def extendMarkdown(self, md, md_globals):
@@ -66,8 +73,9 @@ class FootnoteExtension(markdown.Extension):
">amp_substitute")
def reset(self):
- """ Clear the footnotes on reset. """
+ """ Clear the footnotes on reset, and prepare for a distinct document. """
self.footnotes = markdown.odict.OrderedDict()
+ self.unique_prefix += 1
def findFootnotesPlaceholder(self, root):
""" Return ElementTree Element that contains Footnote placeholder. """
@@ -91,11 +99,17 @@ class FootnoteExtension(markdown.Extension):
def makeFootnoteId(self, id):
""" Return footnote link id. """
- return 'fn:%s' % id
+ if self.getConfig("UNIQUE_IDS"):
+ return 'fn:%d-%s' % (self.unique_prefix, id)
+ else:
+ return 'fn:%s' % id
def makeFootnoteRefId(self, id):
""" Return footnote back-link id. """
- return 'fnref:%s' % id
+ if self.getConfig("UNIQUE_IDS"):
+ return 'fnref:%d-%s' % (self.unique_prefix, id)
+ else:
+ return 'fnref:%s' % id
def makeFootnotesDiv(self, root):
""" Return div of footnotes as et Element. """
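Review note: the `UNIQUE_IDS` option prefixes footnote ids with a counter that is bumped on every `reset()`, so several documents rendered by one instance do not share anchors. A standalone sketch of that counter logic, with a hypothetical `FootnoteIds` class standing in for the extension:

```python
class FootnoteIds:
    """Hypothetical stand-in for the id logic in FootnoteExtension."""

    def __init__(self, unique_ids=False):
        self.unique_ids = unique_ids
        self.unique_prefix = 0
        self.reset()  # as in the patch, the constructor calls reset() once

    def reset(self):
        # Each reset marks the start of a distinct document.
        self.unique_prefix += 1

    def make_footnote_id(self, id):
        if self.unique_ids:
            return 'fn:%d-%s' % (self.unique_prefix, id)
        return 'fn:%s' % id


ids = FootnoteIds(unique_ids=True)
first = ids.make_footnote_id("note")
ids.reset()
second = ids.make_footnote_id("note")
print(first, second)
```

Because the constructor already calls `reset()`, the first document gets prefix 1 rather than 0.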
diff --git a/markdown/extensions/toc.py b/markdown/extensions/toc.py
index 1624ccf..fd2a86a 100644
--- a/markdown/extensions/toc.py
+++ b/markdown/extensions/toc.py
@@ -60,13 +60,9 @@ class TocTreeprocessor(markdown.treeprocessors.Treeprocessor):
if header_rgx.match(c.tag):
tag_level = int(c.tag[-1])
- # Regardless of how many levels we jumped
- # only one list should be created, since
- # empty lists containing lists are illegal.
-
- if tag_level < level:
+ while tag_level < level:
list_stack.pop()
- level = tag_level
+ level -= 1
if tag_level > level:
newlist = etree.Element("ul")
@@ -75,7 +71,10 @@ class TocTreeprocessor(markdown.treeprocessors.Treeprocessor):
else:
list_stack[-1].append(newlist)
list_stack.append(newlist)
- level = tag_level
+ if level == 0:
+ level = tag_level
+ else:
+ level += 1
# Do not override pre-existing ids
if not "id" in c.attrib:
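Review note: the toc.py hunks change a single `pop()` into a `while` loop, so that stepping back several heading levels at once (for example from `h3` to `h1`) closes every intermediate list. The sketch below isolates that stack logic on a plain list of `(level, title)` pairs. `build_toc` is a hypothetical function; the real treeprocessor walks the document tree and also creates links and ids, which are omitted here. It assumes headers start at their shallowest level.

```python
import xml.etree.ElementTree as etree


def build_toc(headers):
    """Build nested <ul> lists from (level, title) pairs using a list stack."""
    div = etree.Element("div")
    list_stack = [div]
    level = 0
    last_li = None
    for tag_level, title in headers:
        # Close one list per level we step back (the patched behaviour).
        while tag_level < level:
            list_stack.pop()
            level -= 1
        if tag_level > level:
            newlist = etree.Element("ul")
            if last_li is not None:
                last_li.append(newlist)      # nest under the previous item
            else:
                list_stack[-1].append(newlist)
            list_stack.append(newlist)
            if level == 0:
                level = tag_level
            else:
                level += 1
        last_li = etree.Element("li")
        last_li.text = title
        list_stack[-1].append(last_li)
    return div


toc = build_toc([(1, "A"), (2, "B"), (3, "C"), (1, "D")])
print(etree.tostring(toc, encoding="unicode"))
```

With the old single `pop()`, the final `D` entry in this input would have been appended to the second-level list instead of the top-level one.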
diff --git a/markdown/inlinepatterns.py b/markdown/inlinepatterns.py
index 331bead..917a9d3 100644
--- a/markdown/inlinepatterns.py
+++ b/markdown/inlinepatterns.py
@@ -69,7 +69,7 @@ STRONG_RE = r'(\*{2}|_{2})(.+?)\2' # **strong**
STRONG_EM_RE = r'(\*{3}|_{3})(.+?)\2' # ***strong***
if markdown.SMART_EMPHASIS:
- EMPHASIS_2_RE = r'(?<!\S)(_)(\S.+?)\2' # _emphasis_
+ EMPHASIS_2_RE = r'(?<!\w)(_)(\S.+?)\2(?!\w)' # _emphasis_
else:
EMPHASIS_2_RE = r'(_)(.+?)\2' # _emphasis_
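Review note: the smart-emphasis change swaps the whitespace lookbehind for word-boundary lookarounds on both sides. The sketch below compares the two patterns. It assumes that inline patterns are compiled inside a `^(.*?)...(.*?)$` wrapper with `re.DOTALL`, which is why `\2` refers to the underscore and the emphasized text is group 3 (consistent with `m.group(3)` in the documentation example); `emphasized` is a hypothetical helper.

```python
import re

OLD_EMPHASIS_2_RE = r'(?<!\S)(_)(\S.+?)\2'        # before the change
NEW_EMPHASIS_2_RE = r'(?<!\w)(_)(\S.+?)\2(?!\w)'  # after the change


def emphasized(pattern, text):
    """Return the emphasized text found by `pattern`, or None if no match."""
    # Assumed wrapper: group 1 is the prefix, so the pattern's own groups
    # start at 2 and the content lands in group 3.
    m = re.compile(r'^(.*?)%s(.*?)$' % pattern, re.DOTALL).match(text)
    return m.group(3) if m else None


for sample in ["_emphasis_", "a _bar_baz", "(_text_)"]:
    print(sample, emphasized(OLD_EMPHASIS_2_RE, sample),
          emphasized(NEW_EMPHASIS_2_RE, sample))
```

The new pattern rejects an underscore pair that runs straight into a following word character, and it accepts emphasis that directly follows punctuation such as an opening parenthesis.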
diff --git a/markdown/preprocessors.py b/markdown/preprocessors.py
index ef04cab..b199f0a 100644
--- a/markdown/preprocessors.py
+++ b/markdown/preprocessors.py
@@ -77,20 +77,53 @@ class HtmlBlockPreprocessor(Preprocessor):
"""Remove html blocks from the text and store them for later retrieval."""
right_tag_patterns = ["</%s>", "%s>"]
+ attrs_pattern = r"""
+ \s+(?P<attr>[^>"'/= ]+)=(?P<q>['"])(?P<value>.*?)(?P=q) # attr="value"
+ | # OR
+ \s+(?P<attr1>[^>"'/= ]+)=(?P<value1>[^> ]+) # attr=value
+ | # OR
+ \s+(?P<attr2>[^>"'/= ]+) # attr
+ """
+ left_tag_pattern = r'^\<(?P<tag>[^> ]+)(?P<attrs>(%s)*)\s*\/?\>?' % attrs_pattern
+ attrs_re = re.compile(attrs_pattern, re.VERBOSE)
+ left_tag_re = re.compile(left_tag_pattern, re.VERBOSE)
+ markdown_in_raw = False
def _get_left_tag(self, block):
- return block[1:].replace(">", " ", 1).split()[0].lower()
+ m = self.left_tag_re.match(block)
+ if m:
+ tag = m.group('tag')
+ raw_attrs = m.group('attrs')
+ attrs = {}
+ if raw_attrs:
+ for ma in self.attrs_re.finditer(raw_attrs):
+ if ma.group('attr'):
+ if ma.group('value'):
+ attrs[ma.group('attr').strip()] = ma.group('value')
+ else:
+ attrs[ma.group('attr').strip()] = ""
+ elif ma.group('attr1'):
+ if ma.group('value1'):
+ attrs[ma.group('attr1').strip()] = ma.group('value1')
+ else:
+ attrs[ma.group('attr1').strip()] = ""
+ elif ma.group('attr2'):
+ attrs[ma.group('attr2').strip()] = ""
+ return tag, len(m.group(0)), attrs
+ else:
+ tag = block[1:].replace(">", " ", 1).split()[0].lower()
+ return tag, len(tag)+2, {}
- def _get_right_tag(self, left_tag, block):
+ def _get_right_tag(self, left_tag, left_index, block):
for p in self.right_tag_patterns:
tag = p % left_tag
i = block.rfind(tag)
if i > 2:
- return tag.lstrip("<").rstrip(">"), i + len(p)-2 + len(left_tag)
- return block.rstrip()[-len(left_tag)-2:-1].lower(), len(block)
+ return tag.lstrip("<").rstrip(">"), i + len(p)-2 + left_index-2
+ return block.rstrip()[-left_index:-1].lower(), len(block)
def _equal_tags(self, left_tag, right_tag):
- if left_tag == 'div' or left_tag[0] in ['?', '@', '%']: # handle PHP, etc.
+ if left_tag[0] in ['?', '@', '%']: # handle PHP, etc.
return True
if ("/" + left_tag) == right_tag:
return True
@@ -113,7 +146,7 @@ class HtmlBlockPreprocessor(Preprocessor):
left_tag = ''
right_tag = ''
in_tag = False # flag
-
+
while text:
block = text[0]
if block.startswith("\n"):
@@ -125,13 +158,17 @@ class HtmlBlockPreprocessor(Preprocessor):
if not in_tag:
if block.startswith("<"):
- left_tag = self._get_left_tag(block)
- right_tag, data_index = self._get_right_tag(left_tag, block)
+ left_tag, left_index, attrs = self._get_left_tag(block)
+ right_tag, data_index = self._get_right_tag(left_tag,
+ left_index,
+ block)
if block[1] == "!":
# is a comment block
left_tag = "--"
- right_tag, data_index = self._get_right_tag(left_tag, block)
+ right_tag, data_index = self._get_right_tag(left_tag,
+ left_index,
+ block)
# keep checking conditions below and maybe just append
if data_index < len(block) \
@@ -147,13 +184,24 @@ class HtmlBlockPreprocessor(Preprocessor):
if self._is_oneliner(left_tag):
new_blocks.append(block.strip())
continue
-
+
if block.rstrip().endswith(">") \
and self._equal_tags(left_tag, right_tag):
- new_blocks.append(
- self.markdown.htmlStash.store(block.strip()))
+ if self.markdown_in_raw and 'markdown' in attrs.keys():
+ start = re.sub(r'\smarkdown(=[\'"]?[^> ]*[\'"]?)?',
+ '', block[:left_index])
+ end = block[-len(right_tag)-2:]
+ block = block[left_index:-len(right_tag)-2]
+ new_blocks.append(
+ self.markdown.htmlStash.store(start))
+ new_blocks.append(block)
+ new_blocks.append(
+ self.markdown.htmlStash.store(end))
+ else:
+ new_blocks.append(
+ self.markdown.htmlStash.store(block.strip()))
continue
- else: #if not block[1] == "!":
+ else:
# if is block level tag and is not complete
if markdown.isBlockLevel(left_tag) or left_tag == "--" \
@@ -169,19 +217,47 @@ class HtmlBlockPreprocessor(Preprocessor):
new_blocks.append(block)
else:
- items.append(block.strip())
+ items.append(block)
- right_tag, data_index = self._get_right_tag(left_tag, block)
+ right_tag, data_index = self._get_right_tag(left_tag,
+ left_index,
+ block)
if self._equal_tags(left_tag, right_tag):
# if find closing tag
in_tag = False
- new_blocks.append(
- self.markdown.htmlStash.store('\n\n'.join(items)))
+ if self.markdown_in_raw and 'markdown' in attrs.keys():
+ start = re.sub(r'\smarkdown(=[\'"]?[^> ]*[\'"]?)?',
+ '', items[0][:left_index])
+ items[0] = items[0][left_index:]
+ end = items[-1][-len(right_tag)-2:]
+ items[-1] = items[-1][:-len(right_tag)-2]
+ new_blocks.append(
+ self.markdown.htmlStash.store(start))
+ new_blocks.extend(items)
+ new_blocks.append(
+ self.markdown.htmlStash.store(end))
+ else:
+ new_blocks.append(
+ self.markdown.htmlStash.store('\n\n'.join(items)))
items = []
if items:
- new_blocks.append(self.markdown.htmlStash.store('\n\n'.join(items)))
+ if self.markdown_in_raw and 'markdown' in attrs.keys():
+ start = re.sub(r'\smarkdown(=[\'"]?[^> ]*[\'"]?)?',
+ '', items[0][:left_index])
+ items[0] = items[0][left_index:]
+ end = items[-1][-len(right_tag)-2:]
+ items[-1] = items[-1][:-len(right_tag)-2]
+ new_blocks.append(
+ self.markdown.htmlStash.store(start))
+ new_blocks.extend(items)
+ new_blocks.append(
+ self.markdown.htmlStash.store(end))
+ else:
+ new_blocks.append(
+ self.markdown.htmlStash.store('\n\n'.join(items)))
+ #new_blocks.append(self.markdown.htmlStash.store('\n\n'.join(items)))
new_blocks.append('\n')
new_text = "\n\n".join(new_blocks)
diff --git a/markdown/tests/misc/div.html b/markdown/tests/misc/div.html
index 7cd0d6d..7b68854 100644
--- a/markdown/tests/misc/div.html
+++ b/markdown/tests/misc/div.html
@@ -1,4 +1,5 @@
<div id="sidebar">
-<p><em>foo</em></p>
+ _foo_
+
</div>
\ No newline at end of file
diff --git a/markdown/tests/misc/html.html b/markdown/tests/misc/html.html
index 81ac5ee..cd6d4af 100644
--- a/markdown/tests/misc/html.html
+++ b/markdown/tests/misc/html.html
@@ -5,5 +5,9 @@
<p>Now some <arbitrary>arbitrary tags</arbitrary>.</p>
<div>More block level html.</div>
+<div class="foo bar" title="with 'quoted' text." valueless_attr weirdness="<i>foo</i>">
+Html with various attributes.
+</div>
+
<p>And of course <script>blah</script>.</p>
<p><a href="script&gt;stuff&lt;/script">this <script>link</a></p>
\ No newline at end of file
diff --git a/markdown/tests/misc/html.txt b/markdown/tests/misc/html.txt
index 3ac3ae0..c08fe1d 100644
--- a/markdown/tests/misc/html.txt
+++ b/markdown/tests/misc/html.txt
@@ -7,6 +7,10 @@ Now some <arbitrary>arbitrary tags</arbitrary>.
<div>More block level html.</div>
+<div class="foo bar" title="with 'quoted' text." valueless_attr weirdness="<i>foo</i>">
+Html with various attributes.
+</div>
+
And of course <script>blah</script>.
[this <script>link](<script>stuff</script>)
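Note on the `html.txt` case above: it exercises the attribute parsing added to `HtmlBlockPreprocessor._get_left_tag`. The following is a minimal standalone sketch, assuming the two patterns are copied as they appear in the `preprocessors.py` hunk. The helper name `parse_left_tag` and the upper-case constant names are hypothetical and not part of the library.

```python
import re

# Patterns copied from the preprocessors.py hunk. With re.VERBOSE, whitespace
# and "#" comments outside character classes are ignored by the regex engine.
ATTRS_PATTERN = r"""
    \s+(?P<attr>[^>"'/= ]+)=(?P<q>['"])(?P<value>.*?)(?P=q)   # attr="value"
    |                                                         # OR
    \s+(?P<attr1>[^>"'/= ]+)=(?P<value1>[^> ]+)               # attr=value
    |                                                         # OR
    \s+(?P<attr2>[^>"'/= ]+)                                  # attr
    """
LEFT_TAG_PATTERN = r'^\<(?P<tag>[^> ]+)(?P<attrs>(%s)*)\s*\/?\>?' % ATTRS_PATTERN
ATTRS_RE = re.compile(ATTRS_PATTERN, re.VERBOSE)
LEFT_TAG_RE = re.compile(LEFT_TAG_PATTERN, re.VERBOSE)


def parse_left_tag(block):
    """Hypothetical helper: return (tag, opening-tag length, attrs) or None."""
    m = LEFT_TAG_RE.match(block)
    if not m:
        return None
    attrs = {}
    for ma in ATTRS_RE.finditer(m.group('attrs') or ''):
        # Exactly one of the three alternatives matched; pick its name group.
        name = ma.group('attr') or ma.group('attr1') or ma.group('attr2')
        # Valueless attributes are stored with an empty string.
        value = ma.group('value') or ma.group('value1') or ''
        attrs[name.strip()] = value
    return m.group('tag'), len(m.group(0)), attrs


block = ('<div class="foo bar" title="with \'quoted\' text." '
         'valueless_attr weirdness="<i>foo</i>">\n'
         'Html with various attributes.\n</div>')
tag, left_index, attrs = parse_left_tag(block)
print(tag, attrs)
print(block[:left_index])  # the opening tag, including the closing ">"
```

The returned length plays the role of `left_index` in the patch: slicing the block at that offset separates the opening tag from its content, which is what the `markdown_in_raw` branch relies on.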
diff --git a/markdown/tests/misc/multi-line-tags.html b/markdown/tests/misc/multi-line-tags.html
index 763a050..784c1dd 100644
--- a/markdown/tests/misc/multi-line-tags.html
+++ b/markdown/tests/misc/multi-line-tags.html
@@ -1,4 +1,5 @@
<div>
-<p>asdf asdfasd</p>
+asdf asdfasd
+
</div>
\ No newline at end of file
diff --git a/markdown/tests/misc/multiline-comments.html b/markdown/tests/misc/multiline-comments.html
index 547ba0b..12f8cb5 100644
--- a/markdown/tests/misc/multiline-comments.html
+++ b/markdown/tests/misc/multiline-comments.html
@@ -2,7 +2,7 @@
foo
--->
+-->
<p>
@@ -12,5 +12,6 @@ foo
<div>
-<p>foo</p>
+foo
+
</div>
\ No newline at end of file
diff --git a/markdown/treeprocessors.py b/markdown/treeprocessors.py
index 1dc612a..ca3b02b 100644
--- a/markdown/treeprocessors.py
+++ b/markdown/treeprocessors.py
@@ -275,24 +275,25 @@ class InlineProcessor(Treeprocessor):
if child.getchildren():
stack.append(child)
- for element, lst in insertQueue:
- if element.text:
- element.text = \
- markdown.inlinepatterns.handleAttributes(element.text,
- element)
- i = 0
- for newChild in lst:
- # Processing attributes
- if newChild.tail:
- newChild.tail = \
- markdown.inlinepatterns.handleAttributes(newChild.tail,
+ if markdown.ENABLE_ATTRIBUTES:
+ for element, lst in insertQueue:
+ if element.text:
+ element.text = \
+ markdown.inlinepatterns.handleAttributes(element.text,
element)
- if newChild.text:
- newChild.text = \
- markdown.inlinepatterns.handleAttributes(newChild.text,
- newChild)
- element.insert(i, newChild)
- i += 1
+ i = 0
+ for newChild in lst:
+ # Processing attributes
+ if newChild.tail:
+ newChild.tail = \
+ markdown.inlinepatterns.handleAttributes(newChild.tail,
+ element)
+ if newChild.text:
+ newChild.text = \
+ markdown.inlinepatterns.handleAttributes(newChild.text,
+ newChild)
+ element.insert(i, newChild)
+ i += 1
return tree
diff --git a/setup.py b/setup.py
index cbfcaae..1224dd9 100755
--- a/setup.py
+++ b/setup.py
@@ -3,7 +3,8 @@
import sys, os
from distutils.core import setup
from distutils.command.install_scripts import install_scripts
-from markdown import version
+
+version = '2.0.3'
class md_install_scripts(install_scripts):
""" Customized install_scripts. Create markdown.bat for win32. """
@@ -23,39 +24,44 @@ class md_install_scripts(install_scripts):
except Exception, e:
print 'ERROR: Unable to create %s: %s' % (bat_path, e)
-setup(
- name = 'Markdown',
- version = version,
- url = 'http://www.freewisdom.org/projects/python-markdown',
- download_url = 'http://pypi.python.org/packages/source/M/Markdown/Markdown-%s.tar.gz'%version,
- description = "Python implementation of Markdown.",
- author = "Manfred Stienstra and Yuri takhteyev",
- author_email = "yuri [at] freewisdom.org",
- maintainer = "Waylan Limberg",
- maintainer_email = "waylan [at] gmail.com",
- license = "BSD License",
- packages = ['markdown', 'markdown.extensions', 'markdown.tests'],
- scripts = ['bin/markdown'],
- package_data = {'': ['tests/*/*.txt', 'tests/*/*.html', 'tests/*/*.cfg',
+data = dict(
+ name = 'Markdown',
+ version = version,
+ url = 'http://www.freewisdom.org/projects/python-markdown',
+ download_url = 'http://pypi.python.org/packages/source/M/Markdown/Markdown-%s.tar.gz' % version,
+ description = 'Python implementation of Markdown.',
+ author = 'Manfred Stienstra and Yuri takhteyev',
+ author_email = 'yuri [at] freewisdom.org',
+ maintainer = 'Waylan Limberg',
+ maintainer_email = 'waylan [at] gmail.com',
+ license = 'BSD License',
+ packages = ['markdown', 'markdown.extensions', 'markdown.tests'],
+ package_data = {'': ['tests/*/*.txt', 'tests/*/*.html', 'tests/*/*.cfg',
'tests/*/*/*.txt', 'tests/*/*/*.html', 'tests/*/*/*.cfg']},
- cmdclass = {'install_scripts': md_install_scripts},
- classifiers = ['Development Status :: 5 - Production/Stable',
- 'License :: OSI Approved :: BSD License',
- 'Operating System :: OS Independent',
- 'Programming Language :: Python',
- 'Programming Language :: Python :: 2',
- 'Programming Language :: Python :: 2.3',
- 'Programming Language :: Python :: 2.4',
- 'Programming Language :: Python :: 2.5',
- 'Programming Language :: Python :: 2.6',
- 'Programming Language :: Python :: 3',
- 'Programming Language :: Python :: 3.0',
- 'Topic :: Communications :: Email :: Filters',
- 'Topic :: Internet :: WWW/HTTP :: Dynamic Content :: CGI Tools/Libraries',
- 'Topic :: Internet :: WWW/HTTP :: Site Management',
- 'Topic :: Software Development :: Documentation',
- 'Topic :: Software Development :: Libraries :: Python Modules',
- 'Topic :: Text Processing :: Filters',
- 'Topic :: Text Processing :: Markup :: HTML',
- ],
+ scripts = ['bin/markdown'],
+ cmdclass = {'install_scripts': md_install_scripts},
+ classifiers = ['Development Status :: 5 - Production/Stable',
+ 'License :: OSI Approved :: BSD License',
+ 'Operating System :: OS Independent',
+ 'Programming Language :: Python',
+ 'Programming Language :: Python :: 2',
+ 'Programming Language :: Python :: 2.3',
+ 'Programming Language :: Python :: 2.4',
+ 'Programming Language :: Python :: 2.5',
+ 'Programming Language :: Python :: 2.6',
+ 'Programming Language :: Python :: 3',
+ 'Programming Language :: Python :: 3.0',
+ 'Topic :: Communications :: Email :: Filters',
+ 'Topic :: Internet :: WWW/HTTP :: Dynamic Content :: CGI Tools/Libraries',
+ 'Topic :: Internet :: WWW/HTTP :: Site Management',
+ 'Topic :: Software Development :: Documentation',
+ 'Topic :: Software Development :: Libraries :: Python Modules',
+ 'Topic :: Text Processing :: Filters',
+ 'Topic :: Text Processing :: Markup :: HTML',
+ ],
)
+
+if sys.version[:3] < '2.5':
+ data['install_requires'] = ['elementtree']
+
+setup(**data)
diff --git a/tests/extensions-x-extra/raw-html.html b/tests/extensions-x-extra/raw-html.html
new file mode 100644
index 0000000..b2a7c4d
--- /dev/null
+++ b/tests/extensions-x-extra/raw-html.html
@@ -0,0 +1,14 @@
+<div>
+
+<p><em>foo</em></p>
+</div>
+
+<div class="baz">
+
+<p><em>bar</em></p>
+</div>
+
+<div>
+
+<p><em>blah</em></p>
+</div>
\ No newline at end of file
diff --git a/tests/extensions-x-extra/raw-html.txt b/tests/extensions-x-extra/raw-html.txt
new file mode 100644
index 0000000..284fe0c
--- /dev/null
+++ b/tests/extensions-x-extra/raw-html.txt
@@ -0,0 +1,12 @@
+<div markdown="1">_foo_</div>
+
+<div markdown=1 class="baz">
+_bar_
+</div>
+
+<div markdown>
+
+_blah_
+
+</div>
+
diff --git a/tests/extensions-x-toc/nested.html b/tests/extensions-x-toc/nested.html
new file mode 100644
index 0000000..a8a1583
--- /dev/null
+++ b/tests/extensions-x-toc/nested.html
@@ -0,0 +1,16 @@
+<h1 id="header-a">Header A</h1>
+<h2 id="header-1">Header 1</h2>
+<h3 id="header-i">Header i</h3>
+<h1 id="header-b">Header B</h1>
+<div class="toc">
+<ul>
+<li><a href="#header-a">Header A</a><ul>
+<li><a href="#header-1">Header 1</a><ul>
+<li><a href="#header-i">Header i</a></li>
+</ul>
+</li>
+</ul>
+</li>
+<li><a href="#header-b">Header B</a></li>
+</ul>
+</div>
\ No newline at end of file
diff --git a/tests/extensions-x-toc/nested.txt b/tests/extensions-x-toc/nested.txt
new file mode 100644
index 0000000..9b515f9
--- /dev/null
+++ b/tests/extensions-x-toc/nested.txt
@@ -0,0 +1,9 @@
+# Header A
+
+## Header 1
+
+### Header i
+
+# Header B
+
+[TOC]
diff --git a/tests/extensions-x-toc/nested2.html b/tests/extensions-x-toc/nested2.html
new file mode 100644
index 0000000..bf87716
--- /dev/null
+++ b/tests/extensions-x-toc/nested2.html
@@ -0,0 +1,14 @@
+<div class="toc">
+<ul>
+<li><a href="#start-with-header-other-than-one">Start with header other than one.</a></li>
+<li><a href="#header-3">Header 3</a><ul>
+<li><a href="#header-4">Header 4</a></li>
+</ul>
+</li>
+<li><a href="#header-3_1">Header 3</a></li>
+</ul>
+</div>
+<h3 id="start-with-header-other-than-one">Start with header other than one.</h3>
+<h3 id="header-3">Header 3</h3>
+<h4 id="header-4">Header 4</h4>
+<h3 id="header-3_1">Header 3</h3>
\ No newline at end of file
diff --git a/tests/extensions-x-toc/nested2.txt b/tests/extensions-x-toc/nested2.txt
new file mode 100644
index 0000000..9db4d8c
--- /dev/null
+++ b/tests/extensions-x-toc/nested2.txt
@@ -0,0 +1,10 @@
+[TOC]
+
+### Start with header other than one.
+
+### Header 3
+
+#### Header 4
+
+### Header 3
+
diff --git a/tests/misc/raw_whitespace.html b/tests/misc/raw_whitespace.html
new file mode 100644
index 0000000..7a6f131
--- /dev/null
+++ b/tests/misc/raw_whitespace.html
@@ -0,0 +1,8 @@
+<p>Preserve whitespace in raw html</p>
+<pre>
+class Foo():
+ bar = 'bar'
+
+ def baz(self):
+ print self.bar
+</pre>
\ No newline at end of file
diff --git a/tests/misc/raw_whitespace.txt b/tests/misc/raw_whitespace.txt
new file mode 100644
index 0000000..bbc7cec
--- /dev/null
+++ b/tests/misc/raw_whitespace.txt
@@ -0,0 +1,10 @@
+Preserve whitespace in raw html
+
+<pre>
+class Foo():
+ bar = 'bar'
+
+ def baz(self):
+ print self.bar
+</pre>
+
diff --git a/tests/misc/smart_em.html b/tests/misc/smart_em.html
new file mode 100644
index 0000000..5683b25
--- /dev/null
+++ b/tests/misc/smart_em.html
@@ -0,0 +1,5 @@
+<p><em>emphasis</em></p>
+<p>this_is_not_emphasis</p>
+<p>[<em>punctuation with emphasis</em>]</p>
+<p>[<em>punctuation_with_emphasis</em>]</p>
+<p>[punctuation_without_emphasis]</p>
\ No newline at end of file
diff --git a/tests/misc/smart_em.txt b/tests/misc/smart_em.txt
new file mode 100644
index 0000000..3c56842
--- /dev/null
+++ b/tests/misc/smart_em.txt
@@ -0,0 +1,9 @@
+_emphasis_
+
+this_is_not_emphasis
+
+[_punctuation with emphasis_]
+
+[_punctuation_with_emphasis_]
+
+[punctuation_without_emphasis]
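
Note on the `smart_em` test pair above: it covers the `EMPHASIS_2_RE` change in `inlinepatterns.py`, where the lookbehind moves from `(?<!\S)` to `(?<!\w)` and a trailing `(?!\w)` is added. The sketch below compares the old and new patterns on the test inputs. It assumes the library wraps each inline pattern as `^(.*?)<pattern>(.*?)$` before compiling, which is why `\2` refers to the `(_)` group. The helper `smart_em` and the constant names are hypothetical.

```python
import re

OLD_EMPHASIS_2_RE = r'(?<!\S)(_)(\S.+?)\2'        # before this change
NEW_EMPHASIS_2_RE = r'(?<!\w)(_)(\S.+?)\2(?!\w)'  # after this change


def smart_em(text, pattern):
    """Hypothetical helper: return the emphasized text, or None if no match.

    Assumption: the pattern is wrapped with a leading and trailing group,
    so group 2 is the underscore and group 3 is the emphasized content.
    """
    wrapped = re.compile("^(.*?)%s(.*?)$" % pattern, re.DOTALL)
    m = wrapped.match(text)
    return m.group(3) if m else None


for line in ["_emphasis_",
             "this_is_not_emphasis",
             "[_punctuation with emphasis_]",
             "[_punctuation_with_emphasis_]",
             "[punctuation_without_emphasis]"]:
    print(repr(line), smart_em(line, OLD_EMPHASIS_2_RE),
          smart_em(line, NEW_EMPHASIS_2_RE))
```

Under that assumption, the old pattern rejects an underscore that directly follows punctuation such as `[`, while the new one only rejects underscores adjacent to word characters, which matches the expected output in `smart_em.html`.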