title: Library Reference prev_title: Installation prev_url: install.html next_title: Command Line next_url: cli.html Using Markdown as a Python Library ================================== First and foremost, Python-Markdown is intended to be a python library module used by various projects to convert Markdown syntax into HTML. The Basics ---------- To use markdown as a module: import markdown html = markdown.markdown(your_text_string) The Details ----------- Python-Markdown provides two public functions ([`markdown.markdown`](#markdown) and [`markdown.markdownFromFile`](#markdownFromFile)) both of which wrap the public class [`markdown.Markdown`](#Markdown). If you're processing one document at a time, these functions will serve your needs. However, if you need to process multiple documents, it may be advantageous to create a single instance of the `markdown.Markdown` class and pass multiple documents through it. If you do use a single instance though, make sure to call the `reset` method appropriately ([see below](#convert)). ### `markdown.markdown (text [, **kwargs])` {: #markdown } The following options are available on the `markdown.markdown` function: * __`text`__{: #text } (required): The source unicode string. !!! note "Important" Python-Markdown expects **Unicode** as input (although some simple ASCII strings *may* work) and returns output as Unicode. Do not pass encoded strings to it! If your input is encoded, (e.g. as UTF-8), it is your responsibility to decode it. For example: input_file = codecs.open("some_file.txt", mode="r", encoding="utf-8") text = input_file.read() html = markdown.markdown(text) If you want to write the output to disk, you *must* encode it yourself: output_file = codecs.open("some_file.html", "w", encoding="utf-8", errors="xmlcharrefreplace" ) output_file.write(html) * __`extensions`__{: #extensions }: A list of extensions. Python-Markdown provides an [API](extensions/api.html) for third parties to write extensions to the parser adding their own additions or changes to the syntax. A few commonly used extensions are shipped with the markdown library. See the [extension documentation](extensions/index.html) for a list of available extensions. The list of extensions may contain instances of extensions and/or strings of extension names. extensions=[MyExtension(), 'path.to.my.ext', 'extra'] !!! warning The prefered method is to pass in an instance of an extension. The other methods may be removed in a future version of Python-Markdown. When passing in extension instances, each class instance must be a subclass of `markdown.extensions.Extension` and any configuration options should be defined when initiating the class instance rather than using the [extension_configs](#extension_configs) keyword. For example: from markdown.extensions import Extension class MyExtension(Extension): # define your extension here... markdown.markdown(text, extensions=[MyExtension(option='value')]) If an extension name is provided as a string, the extension must be importable as a python module on your PYTHONPATH. Python's dot notation is supported. Therefore, to import the 'extra' extension, one could do `extensions=['markdown.extensions.extra']` However, if no dots are provided in the string (`extensions=['extra']`) Markdown will first look for the module `markdown.extensions.extra` (the built-in extension), then a module named `mdx_extra` ('mdx_' will be appended to the beginning of the string) at the root of your PYTHONPATH. When loading an extension by name (as a string), you may either pass in configuration settings to the extension using the [extension_configs](#extension_configs) keyword or by appending the settings to the name in the following format: extensions=['name(option1=value,option2=value)'] Note that there are no quotes or whitespace in the above format, which severely limits how it can be used. For more complex settings, it is suggested that an instance of a class be passed in or, if that's not possible, the [extension_configs](#extension_configs) keyword be used. !!! seealso "See Also" See the documentation of the [Extension API](extensions/api.html) for assistance in creating extensions. * __`extension_configs`__{: #extension_configs }: A dictionary of configuration settings for extensions. Any configuration settings will only be passed to extensions loaded by name (as a string). When loading extensions as class instances, pass the configuration settings directly to the class when initializing it. !!! Note The prefered method is to pass in an instance of an extension, which does not require use of the `extension_configs` keyword at all. See the [extensions](#extensions) keyword for details. The dictionary of configuration settings must be in the following format: extension_configs = {'extension_name_1': [ ('option_1', 'value_1'), ('option_2', 'value_2') ], 'extension_name_2': [ ('option_1', 'value_1') ] } See the documentation specific to the extension you are using for help in specifying configuration settings for that extension. * __`output_format`__{: #output_format }: Format of output. Supported formats are: * `"xhtml1"`: Outputs XHTML 1.x. **Default**. * `"xhtml5"`: Outputs XHTML style tags of HTML 5 * `"xhtml"`: Outputs latest supported version of XHTML (currently XHTML 1.1). * `"html4"`: Outputs HTML 4 * `"html5"`: Outputs HTML style tags of HTML 5 * `"html"`: Outputs latest supported version of HTML (currently HTML 4). The values can be in either lowercase or uppercase. !!! warning It is suggested that the more specific formats ("xhtml1", "html5", & "html4") be used as the more general formats ("xhtml" or "html") may change in the future if it makes sense at that time. * __`safe_mode`__{: #safe_mode }: Disallow raw html. If you are using Markdown on a web system which will transform text provided by untrusted users, you may want to use the "safe_mode" option which ensures that the user's HTML tags are either replaced, removed or escaped. (They can still create links using Markdown syntax.) The following values are accepted: * `False` (Default): Raw HTML is passed through unaltered. * `replace`: Replace all HTML blocks with the text assigned to `html_replacement_text` To maintain backward compatibility, setting `safe_mode=True` will have the same effect as `safe_mode='replace'`. To replace raw HTML with something other than the default, do: md = markdown.Markdown(safe_mode='replace', html_replacement_text='--RAW HTML NOT ALLOWED--') * `remove`: All raw HTML will be completely stripped from the text with no warning to the author. * `escape`: All raw HTML will be escaped and included in the document. For example, the following source: Foo bar. Will result in the following HTML:
Foo <b>bar</b>.
!!! Note "safe_mode" also alters the default value for the [`enable_attributes`](#enable_attributes) option. !!! seealso "See Also" HTML sanitizers (like [Bleach]) may provide a better solution for dealing with markdown text submitted by untrusted users. That way, both the HTML generated by Markdown and user submited raw HTML are fully sanitized. import markdown import bleach html = bleach.clean(markdown.markdown(evil_text)) [Bleach]: https://github.com/jsocol/bleach * __`html_replacement_text`__{: #html_replacement_text }: Text used when safe_mode is set to `replace`. Defaults to `[HTML_REMOVED]`. * __`tab_length`__{: #tab_length }: Length of tabs in the source. Default: 4 * __`enable_attributes`__{: #enable_attributes}: Enable the conversion of attributes. Defaults to `True`, unless [`safe_mode`](#safe_mode) is enabled, in which case the default is `False`. !!! Note `safe_mode` only overrides the default. If `enable_attributes` is explicitly set, the explicit value is used regardless of `safe_mode`. However, this could potentially allow an untrusted user to inject JavaScript into your documents. * __`smart_emphasis`__{: #smart_emphasis }: Treat `_connected_words_` intelligently Default: True * __`lazy_ol`__{: #lazy_ol }: Ignore number of first item of ordered lists. Default: True Given the following list: 4. Apples 5. Oranges 6. Pears By default markdown will ignore the fact the the first line started with item number "4" and the HTML list will start with a number "1". If `lazy_ol` is set to `False`, then markdown will output the following HTML: