diff options
Diffstat (limited to 'docs/reference.txt')
-rw-r--r-- | docs/reference.txt | 244 |
1 files changed, 244 insertions, 0 deletions
diff --git a/docs/reference.txt b/docs/reference.txt new file mode 100644 index 0000000..c1561f7 --- /dev/null +++ b/docs/reference.txt @@ -0,0 +1,244 @@ +title: Library Reference +prev_title: Installation +prev_url: install.html +next_title: Command Line +next_url: cli.html + + +Using Markdown as a Python Library +================================== + +First and foremost, Python-Markdown is intended to be a python library module +used by various projects to convert Markdown syntax into HTML. + +The Basics +---------- + +To use markdown as a module: + + import markdown + html = markdown.markdown(your_text_string) + +The Details +----------- + +Python-Markdown provides two public functions ([`markdown.markdown`](#markdown) +and [`markdown.markdownFromFile`](#markdownFromFile)) both of which wrap the +public class [`markdown.Markdown`](#Markdown). If you're processing one +document at a time, the functions will serve your needs. However, if you need +to process multiple documents, it may be advantageous to create a single +instance of the `markdown.Markdown` class and pass multiple documents through +it. + +### `markdown.markdown (text [, **kwargs])` {: #markdown } + +The following options are available on the `markdown.markdown` function: + +* __`text`__{: #text } (required): The source text string. + + Note that Python-Markdown expects **Unicode** as input (although + a simple ASCII string may work) and returns output as Unicode. + Do not pass encoded strings to it! If your input is encoded, (e.g. as + UTF-8), it is your responsibility to decode it. For example: + + input_file = codecs.open("some_file.txt", mode="r", encoding="utf-8") + text = input_file.read() + html = markdown.markdown(text) + + If you want to write the output to disk, you must encode it yourself: + + output_file = codecs.open("some_file.html", "w", + encoding="utf-8", + errors="xmlcharrefreplace" + ) + output_file.write(html) + +* __`extensions`__{: #extensions }: A list of extensions. + + Python-Markdown provides an API for third parties to write extensions to + the parser adding their own additions or changes to the syntax. A few + commonly used extensions are shipped with the markdown library. See + the [extension documentation](extensions/index.html) for a list of + available extensions. + + The list of extensions may contain instances of extensions or stings of + extension names. If an extension name is provided as a string, the + extension must be importable as a python module either within the + `markdown.extensions` package or on your PYTHONPATH with a name starting + with `mdx_`, followed by the name of the extension. Thus, + `extensions=['extra']` will first look for the module + `markdown.extensions.extra`, then a module named `mdx_extra`. + +* __`extension-configs`__{: #extension_configs }: A dictionary of + configuration settings for extensions. + + The dictionary must be of the following format: + + extension-configs = {'extension_name_1': + [ + ('option_1', 'value_1'), + ('option_2', 'value_2') + ], + 'extension_name_2': + [ + ('option_1', 'value_1') + ] + } + See the documentation specific to the extension you are using for help in + specifying configuration settings for that extension. + +* __`output_format`__{: #output_format }: Format of output. + + Supported formats are: + + * `"xhtml1"`: Outputs XHTML 1.x. **Default**. + * `"xhtml5"`: Outputs XHTML style tags of HTML 5 + * `"xhtml"`: Outputs latest supported version of XHTML (currently XHTML 1.1). + * `"html4"`: Outputs HTML 4 + * `"html5"`: Outputs HTML style tags of HTML 5 + * `"html"`: Outputs latest supported version of HTML (currently HTML 4). + + Note that it is suggested that the more specific formats ("xhtml1", + "html5", & "html4") be used as "xhtml" or "html" may change in the future + if it makes sense at that time. The values can either be lowercase or + uppercase. + +* __`safe_mode`__{: #safe_mode }: Disallow raw html. + + If you are using Markdown on a web system which will transform text + provided by untrusted users, you may want to use the "safe_mode" + option which ensures that the user's HTML tags are either replaced, + removed or escaped. (They can still create links using Markdown syntax.) + + The following values are accepted: + + * `False` (Default): Raw HTML is passed through unaltered. + + * `replace`: Replace all HTML blocks with the text assigned to + `html_replacement_text` To maintain backward compatibility, setting + `safe_mode=True` will have the same effect as `safe_mode='replace'`. + + To replace raw HTML with something other than the default, do: + + md = markdown.Markdown(safe_mode='replace', + html_replacement_text='--RAW HTML NOT ALLOWED--') + + * `remove`: All raw HTML will be completely stripped from the text with + no warning to the author. + + * `escape`: All raw HTML will be escaped and included in the document. + + For example, the following source: + + Foo <b>bar</b>. + + Will result in the following HTML: + + <p>Foo <b>bar</b>.</p> + + Note that "safe_mode" does not alter the `enable_attributes` option, which + could allow someone to inject javascript (i.e., `{@onclick=alert(1)}`). You + may also want to set `enable_attributes=False` when using "safe_mode". + +* __`html_replacement_text`__{: #html_replacement_text }: Text used when + safe_mode is set to `replace`. Defaults to `[HTML_REMOVED]`. + +* __`tab_length`__{: #tab_length }: Length of tabs in the source. Default: 4 + +* __`enable_attributes`__{: #enable_attributes}: Enable the conversion of + attributes. Default: True + +* __`smart_emphasis`__{: #smart_emphasis }: Treat `_connected_words_` + intelligently Default: True + +* __`lazy_ol`__{: #lazy_ol }: Ignore number of first item of ordered lists. + Default: True + + Given the following list: + + 4. Apples + 5. Oranges + 6. Pears + + By default markdown will ignore the fact the the first line started + with item number "4" and the HTML list will start with a number "1". + If `lazy_ol` is set to `True`, then markdown will output the following + HTML: + + <ol> + <li start="4">Apples</li> + <li>Oranges</li> + <li>Pears</li> + </ol> + + +### `markdown.markdownFromFile (**kwargs)` {: #markdownFromFile } + +With a few exceptions, `markdown.markdownFromFile` accepts the same options as +`markdown.markdown`. It does **not** accept a `text` (or Unicode) string. +Instead, it accepts the following required options: + +* __`input`__{: #input } (required): The source text file. + + `input` may be set to one of three options: + + * a string which contains a path to a readable file on the file system, + * a readable file-like object, + * or `None` (default) which will read from `stdin`. + +* __`output`__{: #output }: The target which output is written to. + + `output` may be set to one of three options: + + * a string which contains a path to a writable file on the file system, + * a writable file-like object, + * or `None` (default) which will write to `stdout`. + +* __`encoding`__{: #encoding }: The encoding of the source text file. Defaults + to "utf-8". The same encoding will always be used for input and output. + The 'xmlcharrefreplace' error handler is used when encoding the output. + + **Note:** This is the only place that decoding and encoding of unicode + takes place in Python-Markdown. If this rather naive solution does not + meet your specific needs, it is suggested that you write your own code + to handle your encoding/decoding needs. + +### `markdown.Markdown ([**kwargs])` {: #Markdown } + +The same options are available when initializing the `markdown.Markdown` class +as on the [`markdown.markdown`](#markdown) function, except that the class does +**not** accept a source text string on initialization. Rather, the source text +string must be passed to one of two instance methods: + +* `Markdown.convert(source)` {: #convert } + + The `source` text must meet the same requirements as the [`text`](#text) + argument of the [`markdown.markdown`](#markdown) function. + + You should also use this method if you want to process multiple strings + without creating a new instance of the class for each string. + + md = markdown.Markdown() + html1 = md.convert(text1) + html2 = md.convert(text2) + + Note that depending on which options and/or extensions are being used, + the parser may need its state reset between each call to `convert`. + + html1 = md.convert(text1) + md.reset() + html2 = md.convert(text2) + + You can also change calls to `reset` togeather: + + html3 = md.reset().convert(text3) + +* `Markdown.convertFile(**kwargs)` {: #convertFile } + + The arguments of this method are identical to the arguments of the same + name on the `markdown.markdownFromFile` function ([`input`](#input), + [`output`](#output), and [`encoding`](#encoding)). As with the + [`convert`](#convert) method, this method should be used to + process multiple files without creating a new instance of the class for + each document. State may need to be `reset` between each call to + `convertFile` as is the case with `convert`. |