title:      Library Reference
prev_title: Installation
prev_url:   install.html
next_title: Command Line
next_url:   cli.html


Using Markdown as a Python Library
==================================

First and foremost, Python-Markdown is intended to be a python library module
used by various projects to convert Markdown syntax into HTML.

The Basics
----------

To use markdown as a module:

    import markdown
    html = markdown.markdown(your_text_string)

The Details
-----------

Python-Markdown provides two public functions ([`markdown.markdown`](#markdown)
and [`markdown.markdownFromFile`](#markdownFromFile)) both of which wrap the 
public class [`markdown.Markdown`](#Markdown). If you're processing one 
document at a time, the functions will serve your needs. However, if you need 
to process multiple documents, it may be advantageous to create a single 
instance of the `markdown.Markdown` class and pass multiple documents through 
it.

### `markdown.markdown (text [, **kwargs])` {: #markdown }

The following options are available on the `markdown.markdown` function:

* __`text`__{: #text } (required): The source text string.

    Note that Python-Markdown expects **Unicode** as input (although
    a simple ASCII string may work) and returns output as Unicode.
    Do not pass encoded strings to it! If your input is encoded, (e.g. as 
    UTF-8), it is your responsibility to decode it.  For example:

        input_file = codecs.open("some_file.txt", mode="r", encoding="utf-8")
        text = input_file.read()
        html = markdown.markdown(text)

    If you want to write the output to disk, you must encode it yourself:

        output_file = codecs.open("some_file.html", "w", 
                                  encoding="utf-8", 
                                  errors="xmlcharrefreplace"
        )
        output_file.write(html)

* __`extensions`__{: #extensions }: A list of extensions.

    Python-Markdown provides an [API](extensions/api.html) for third parties to
    write extensions to the parser adding their own additions or changes to the
    syntax. A few commonly used extensions are shipped with the markdown 
    library. See the [extension documentation](extensions/index.html) for a 
    list of available extensions.

    The list of extensions may contain instances of extensions and/or strings 
    of extension names. 

        extensions=[MyExtension(), 'path.to.my.ext', 'extra']
    
    When passing in extension instances, each class instance must be a subclass
    of `markdown.extensions.Extension` and any configuration options should be 
    defined when initiating the class instance rather than using the 
    [extension_configs](#extension_configs) keyword. For example:

        from markdown.extensions import Extension
        class MyExtension(Extension):
            # define your extension here...
    
        markdown.markdown(text, extensions=[MyExtension(configs={'option': 'value'}))

    If an extension name is provided as a string, the extension must be 
    importable as a python module on your PYTHONPATH. Python's dot notation is 
    supported. Therefore, to import the 'extra' extension, one could do 
    `extensions=['markdown.extensions.extra']` However, if no dots are provided 
    in the string (`extensions=['extra']`) Markdown will first look for the 
    module `markdown.extensions.extra` (the built-in extension), then a module 
    named `mdx_extra` ('mdx_' will be appended to the beginning of the string) 
    at the root of your PYTHONPATH. 

    When loading an extension by name (as a sting), you may either pass in
    configuration settings to the extension using the 
    [extension_configs](#extension_configs) keyword or by appending the 
    settings to the name in the following format:

        extensions=['name(option1=value,option2=value)']
    
    Note that there are no quotes or whitespace in the above format, which 
    severely limits how it can be used. For more complex settings, it is 
    suggested that the [extension_configs](#extension_configs) keyword
    be used or an instance of a class be passed in.

    See the documentation of the [Extension API](extensions/api.html) for 
    assistance in creating extensions.

* __`extension_configs`__{: #extension_configs }: A dictionary of
  configuration settings for extensions.

    Any configuration settings will only be passed to extensions loaded by name 
    (as a string). When loading extensions as class instances, pass the 
    configuration settings directly to the class when initializing it. 
    
    The dictionary of configuration settings must be in the following format:

        extension_configs = {'extension_name_1':
                               [
                                  ('option_1', 'value_1'),
                                  ('option_2', 'value_2')
                               ],
                             'extension_name_2':
                               [
                                  ('option_1', 'value_1')
                               ]
                            }
    See the documentation specific to the extension you are using for help in 
    specifying configuration settings for that extension.

* __`output_format`__{: #output_format }: Format of output. 

    Supported formats are:

    * `"xhtml1"`: Outputs XHTML 1.x. **Default**.
    * `"xhtml5"`: Outputs XHTML style tags of HTML 5
    * `"xhtml"`: Outputs latest supported version of XHTML (currently XHTML 1.1).
    * `"html4"`: Outputs HTML 4
    * `"html5"`: Outputs HTML style tags of HTML 5
    * `"html"`: Outputs latest supported version of HTML (currently HTML 4).

    Note that it is suggested that the more specific formats ("xhtml1",
    "html5", & "html4") be used as "xhtml" or "html" may change in the future
    if it makes sense at that time. The values can either be lowercase or 
    uppercase.

* __`safe_mode`__{: #safe_mode }: Disallow raw html.

    If you are using Markdown on a web system which will transform text 
    provided by untrusted users, you may want to use the "safe_mode" 
    option which ensures that the user's HTML tags are either replaced, 
    removed or escaped. (They can still create links using Markdown syntax.)

    The following values are accepted:

    * `False` (Default): Raw HTML is passed through unaltered.

    * `replace`: Replace all HTML blocks with the text assigned to 
      `html_replacement_text` To maintain backward compatibility, setting 
      `safe_mode=True` will have the same effect as `safe_mode='replace'`.   

        To replace raw HTML with something other than the default, do:

            md = markdown.Markdown(safe_mode='replace', 
                               html_replacement_text='--RAW HTML NOT ALLOWED--')

    * `remove`: All raw HTML will be completely stripped from the text with
      no warning to the author.

    * `escape`: All raw HTML will be escaped and included in the document.

        For example, the following source:

            Foo <b>bar</b>.

        Will result in the following HTML:

            <p>Foo &lt;b&gt;bar&lt;/b&gt;.</p>

    Note that "safe_mode" also alters the default value for the 
    [`enable_attributes`](#enable_attributes) option.

* __`html_replacement_text`__{: #html_replacement_text }: Text used when 
  safe_mode is set to `replace`. Defaults to `[HTML_REMOVED]`.

* __`tab_length`__{: #tab_length }: Length of tabs in the source. Default: 4

* __`enable_attributes`__{: #enable_attributes}: Enable the conversion of 
  attributes. Defaults to `True`, unless [`safe_mode`](#safe_mode) is enabled, 
  in which case the default is `False`.

    Note that `safe_mode` only overrides the default. If `enable_attributes` 
    is explicitly set, the explicit value is used regardless of `safe_mode`.
    However, this could potentially allow an untrusted user to inject
    JavaScript into your documents.

* __`smart_emphasis`__{: #smart_emphasis }: Treat `_connected_words_` 
  intelligently Default: True

* __`lazy_ol`__{: #lazy_ol }: Ignore number of first item of ordered lists. 
  Default: True

    Given the following list:

        4. Apples
        5. Oranges
        6. Pears

    By default markdown will ignore the fact the the first line started 
    with item number "4" and the HTML list will start with a number "1".
    If `lazy_ol` is set to `True`, then markdown will output the following
    HTML:

        <ol>
          <li start="4">Apples</li>
          <li>Oranges</li>
          <li>Pears</li>
        </ol>


### `markdown.markdownFromFile (**kwargs)` {: #markdownFromFile }

With a few exceptions, `markdown.markdownFromFile` accepts the same options as 
`markdown.markdown`. It does **not** accept a `text` (or Unicode) string. 
Instead, it accepts the following required options:

* __`input`__{: #input } (required): The source text file.

    `input` may be set to one of three options:

    * a string which contains a path to a readable file on the file system,
    * a readable file-like object,
    * or `None` (default) which will read from `stdin`.

* __`output`__{: #output }: The target which output is written to.

    `output` may be set to one of three options:

    * a string which contains a path to a writable file on the file system,
    * a writable file-like object,
    * or `None` (default) which will write to `stdout`.

* __`encoding`__{: #encoding }: The encoding of the source text file. Defaults 
  to "utf-8". The same encoding will always be used for input and output. 
  The 'xmlcharrefreplace' error handler is used when encoding the output.

    **Note:** This is the only place that decoding and encoding of unicode
    takes place in Python-Markdown. If this rather naive solution does not
    meet your specific needs, it is suggested that you write your own code
    to handle your encoding/decoding needs.

### `markdown.Markdown ([**kwargs])` {: #Markdown }

The same options are available when initializing the `markdown.Markdown` class
as on the [`markdown.markdown`](#markdown) function, except that the class does
**not** accept a source text string on initialization. Rather, the source text
string must be passed to one of two instance methods:

* `Markdown.convert(source)`{: #convert }

    The `source` text must meet the same requirements as the [`text`](#text) 
    argument of the [`markdown.markdown`](#markdown) function.

    You should also use this method if you want to process multiple strings
    without creating a new instance of the class for each string.

        md = markdown.Markdown()
        html1 = md.convert(text1)
        html2 = md.convert(text2)

    Note that depending on which options and/or extensions are being used,
    the parser may need its state reset between each call to `convert`.

        html1 = md.convert(text1)
        md.reset()
        html2 = md.convert(text2)
    
    You can also change calls to `reset` together:
    
        html3 = md.reset().convert(text3)

* `Markdown.convertFile(**kwargs)`{: #convertFile }

    The arguments of this method are identical to the arguments of the same
    name on the `markdown.markdownFromFile` function ([`input`](#input), 
    [`output`](#output), and [`encoding`](#encoding)). As with the 
    [`convert`](#convert) method, this method should be used to 
    process multiple files without creating a new instance of the class for 
    each document. State may need to be `reset` between each call to 
    `convertFile` as is the case with `convert`.