Docma API Reference

docma

This is the primary docma API for compiling and rendering document templates.

Typical usage would be:

from docma import compile_template, render_template_to_pdf

template_src_dir = 'a/b/c'
template_location = 'my-template.zip'  # ... or a directory when experimenting
pdf_location = 'my-doc.pdf'
params = { ... }  # A Dict of parameters.

compile_template(template_src_dir, template_location)

pdf = render_template_to_pdf(template_location, params)

# We now have a pypdf PdfWriter object. Do with it what you will. e.g.
pdf.write(pdf_location)

docma.compile_template

compile_template(src_dir: str, tpkg: str) -> None

Compile a document source directory into a docma template package.

Parameters:

Name Type Description Default
src_dir str

Source directory.

required
tpkg str

Location of the compiled document template package.

required

docma.get_template_info

get_template_info(tpkg: PackageReader) -> dict[str, Any]

Get information about a document template package.

docma.read_template_version_info

read_template_version_info(
    tpkg: PackageReader,
) -> dict[str, Any]

Read version information from a magic file in a compiled template package.

docma.render_template_to_html

render_template_to_html(
    template_pkg_name: str, render_params: dict[str, Any]
) -> BeautifulSoup

Render a template to self-contained HTML.

Parameters:

Name Type Description Default
template_pkg_name str

Name of the ZIP file / directory containing the compiled template package.

required
render_params dict[str, Any]

Rendering parameters.

required

Returns:

Type Description
BeautifulSoup

A BeautifulSoup HTML structure.

docma.render_template_to_pdf

render_template_to_pdf(
    template_pkg_name: str,
    render_params: dict[str, Any],
    watermark: Sequence[str] = None,
    stamp: Sequence[str] = None,
    compression: int = 0,
) -> PdfWriter

Generate PDF output from a document template package.

Parameters:

Name Type Description Default
template_pkg_name str

Name of the ZIP file / directory containing the compiled template package.

required
render_params dict[str, Any]

Rendering parameters.

required
watermark Sequence[str]

A sequence of IDs of overlay documents to apply under the PDF.

None
stamp Sequence[str]

A sequence of IDs of overlay documents to apply over the PDF.

None
compression int

Compression level for PDF output 0..9.

0

Returns:

Type Description
PdfWriter

PDF output file as a PyPDF PdfWriter instance.

docma.safe_render_path

safe_render_path(
    path: str,
    context: DocmaRenderContext,
    *args: dict[str, Any]
) -> str

Safely render a file path.

Safety here means that each path component is individually rendered and we are very restrictive about what characters the rendered component can contain.

Parameters:

Name Type Description Default
path str

A file path string (e.g. /a/b/c)

required
context DocmaRenderContext

Document rendering context.

required
args dict[str, Any]

Additional rendering params.

()

Returns:

Type Description
str

A rendered file path.

Raises:

Type Description
ValueError

If the path contains any invalid characters.
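
The idea of rendering each path component separately can be illustrated with a small stand-alone sketch. This is not docma's implementation: `string.Template` stands in for Jinja rendering, and the function name and the permitted character set are assumptions made purely for illustration.

```python
import re
from string import Template

# Assumption: the real whitelist is not documented here; this one is illustrative.
_SAFE_COMPONENT = re.compile(r'[A-Za-z0-9_.-]+')


def safe_render_path_sketch(path: str, params: dict[str, str]) -> str:
    """Render each path component on its own, then validate the result."""
    rendered = []
    for component in path.split('/'):
        if not component:
            # Keep empty components so a leading '/' survives the round trip.
            rendered.append(component)
            continue
        value = Template(component).substitute(params)
        if value in ('.', '..') or not _SAFE_COMPONENT.fullmatch(value):
            raise ValueError(f'Bad path component: {value!r}')
        rendered.append(value)
    return '/'.join(rendered)


report_path = safe_render_path_sketch(
    '/reports/$year/$name.pdf', {'year': '2024', 'name': 'summary'}
)
```

Because each component is validated after rendering, a parameter value such as `../b` cannot introduce extra path levels; it fails the character check instead.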

docma.commands

CLI commands.

docma.commands.CliCommand

CliCommand(subparser)

Bases: ABC

CLI subcommand handler.

CLI subcommands should be declared as a subclass and also registered using the register decorator so that they are automatically discovered.

Subclasses may implement add_arguments and check_arguments methods and must implement the execute method.

Thus:

@CliCommand.register('command-name')
class Whatever(CliCommand):

    def add_arguments(self) -> None:
        self.argp.add_argument('--arg1', action='store')

    def execute(self, args: Namespace) -> None:
        print(f'The argument value for arg1 is {args.arg1}')

Initialize the command handler.

add_arguments

add_arguments()

Add arguments to the command handler.

check_arguments staticmethod

check_arguments(args: Namespace)

Validate arguments.

Parameters:

Name Type Description Default
args Namespace

The namespace containing the arguments.

required

Raises:

Type Description
ValueError

If the arguments are invalid.

execute abstractmethod staticmethod

execute(args: Namespace) -> None

Execute the CLI command with the specified arguments.

register classmethod

register(name: str) -> Callable

Register a CLI command handler class.

This is a decorator. Usage is:

@CliCommand.register('my_command')
class MyCommand(CliCommand):
    ...

The help for the command is taken from the first line of the docstring.

docma.compilers

Content compilers turn non-HTML content into HTML content as part of template compilation.

They accept a single bytes argument containing source content and return a string object containing HTML.

docma.compilers.compiler_for_file

compiler_for_file(file: Path) -> Callable

Get the handler for the specified file based on suffix.

docma.compilers.compiler_for_suffix

compiler_for_suffix(suffix: str) -> Callable

Get the handler for the specified format.

docma.compilers.content_compiler

content_compiler(*suffixes: str) -> Callable

Register a document compiler for the specified filename suffixes.

This is a decorator used like so:

@content_compiler('md')
def compile_markdown(src_data: bytes) -> str:
    ...

Parameters:

Name Type Description Default
suffixes str

File suffixes (without the dot) handled by the decorated function.

()

docma.data_providers

Data providers access external data sources (e.g. databases) as part of template rendering.

They are referenced in HTML content via a data source specification containing these elements:

  • type
  • location
  • query
  • target.

See DataSourceSpec for details on what the components mean.

Data providers must have the following signature:

Data Provider Signature

@data_provider(*src_type: str)
def whatever(
    data_src: DataSourceSpec,
    context: DocmaRenderContext,
    params: dict[str, Any],
    **kwargs,
) -> list[dict[str, Any]]:

Name Type Description
data_src DataSourceSpec The data source specification.
context DocmaRenderContext The template package. This allows the content generator to access files in the template, if required.
params dict[str, Any] The run-time rendering parameters provided during the render phase of document production.
kwargs Any This is a sponge for spare arguments. Some handlers don't require the params argument.

Data providers must return a list of dictionaries, where each dictionary contains one row of data. This is the format required by Altair-Vega and is also suitable for consumption in Jinja templates.

Data providers should generally raise a DocmaDataProviderError on failure.
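
The required return shape (a list of dictionaries, one per row) is the same shape that `csv.DictReader` yields. The following stand-alone sketch shows that shape only; it does not use the docma provider signature, and `rows_from_csv_sketch` is a hypothetical helper name.

```python
import csv
import io
from typing import Any


def rows_from_csv_sketch(csv_text: str) -> list[dict[str, Any]]:
    """Turn CSV text into one dict per row, keyed by the header names."""
    return list(csv.DictReader(io.StringIO(csv_text)))


rows = rows_from_csv_sketch('name,qty\nwidget,3\ngadget,5\n')
```

Note that values read this way are strings; any type conversion would be up to the provider or the consuming template.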

docma.data_providers.DataSourceSpec

DataSourceSpec(
    src_type: str,
    location: str,
    query: str = None,
    target: str = None,
)

Data source specifier.

The components are:

Parameters:

Name Type Description Default
src_type str

The data provider type (e.g. csv if the data comes from a CSV file). This controls the connection / access mechanism.

required
location str

The location where to find the data. For a file based source it would be the path to the file in the document template package.

required
query str

The query to execute on the data provider. This is required for database-like sources. It is not used for some types (e.g. data sourced from the rendering parameters).

None
target str

The position in the Vega-Lite specification where the data will be attached. This is a dot separated dictionary key sequence. If not provided, this defaults to data.values, which is the primary data location for a Vega-Lite specification.

None

When referenced in HTML, data source specifications are formatted as strings in the form <type>;<path>[;query[;target]]

Examples:

  • params;a.b.c

    Get data from key a->b->c in the rendering parameters and attach it at the default location in the chart specification.

  • file;data/x.csv;;data.values

    Get data from the CSV file data/x.csv and attach it in the chart specification at data->values (which happens to be the default). Note that the query is empty in this case.

  • file;data/x.sqlite;q.sql

    Run a query on the SQLite3 database and attach the data at the default location in the chart specification.

Initialise a data source specification.

from_string classmethod

from_string(s: str) -> DataSourceSpec

Initialise a data source specification from a string.

Parameters:

Name Type Description Default
s str

A string in the form <type>;<path>[;query[;target]]

required
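
Parsing the semicolon-separated form can be sketched as follows. `DataSourceSpecSketch` is a hypothetical stand-in for the real class; treating empty optional fields as None and rejecting more than four fields are assumptions, not documented behaviour.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DataSourceSpecSketch:
    src_type: str
    location: str
    query: str | None = None
    target: str | None = None

    @classmethod
    def from_string(cls, s: str) -> DataSourceSpecSketch:
        """Parse '<type>;<path>[;query[;target]]' into its components."""
        parts = [p.strip() for p in s.split(';')]
        if not 2 <= len(parts) <= 4:
            raise ValueError(f'Bad data source specification: {s!r}')
        # Pad the optional fields so unpacking always sees four values.
        parts += [''] * (4 - len(parts))
        src_type, location, query, target = parts
        if not src_type or not location:
            raise ValueError(f'Type and path are required: {s!r}')
        # Assumption: an empty optional field means "not provided".
        return cls(src_type, location, query or None, target or None)


spec = DataSourceSpecSketch.from_string('file;data/x.csv;;data.values')
```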

docma.data_providers.data_provider

data_provider(data_src_type: str) -> Callable

Register the data provider function for the specified data source type.

This is a decorator used like so:

@data_provider('postgres')
def postgres(
    data_src: DataSourceSpec,
    context: DocmaRenderContext,
    params: dict[str, Any],
    **kwargs,
) -> list[dict[str, Any]]:
    ...

Parameters:

Name Type Description Default
data_src_type str

Data source type.

required

docma.data_providers.load_data

load_data(
    data_src: DataSourceSpec,
    context: DocmaRenderContext,
    **kwargs
) -> list[dict[str, Any]]

Load a data set.

Parameters:

Name Type Description Default
data_src DataSourceSpec

Data source specifier.

required
context DocmaRenderContext

Document rendering context.

required
kwargs

Additional keyword arguments for the data loader.

{}

Returns:

Type Description
list[dict[str, Any]]

A list of dicts, each containing one row.

docma.exceptions

Docma exceptions.

docma.exceptions.DocmaCompileError

Bases: DocmaError

Content compiler error.

docma.exceptions.DocmaDataProviderError

Bases: DocmaError

Data provider error.

docma.exceptions.DocmaError

Bases: Exception

Base class for docma errors.

docma.exceptions.DocmaGeneratorError

Bases: DocmaError

Content generator error.

docma.exceptions.DocmaImportError

Bases: DocmaError

Content importer error.

docma.exceptions.DocmaInternalError

Bases: DocmaError

Internal error.

docma.exceptions.DocmaPackageError

Bases: DocmaError

Packaging and content related errors.

docma.exceptions.DocmaUrlFetchError

Bases: DocmaError

URL fetcher error.

docma.generators

Docma content generators produce dynamic content (e.g. charts) as part of template rendering.

They are activated in a HTML file in a docma template by invoking a URL in the form:

docma:<content-type>?option1=value1&option2=value2

Info

Do not use docma:// as that implies a netloc which is not used here.
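
The point about docma:// can be checked with Python's standard URL parser. This sketch only shows how such a URL splits into parts; the `swatch` options used here are made-up values, and docma's own parsing may differ.

```python
from urllib.parse import parse_qsl, urlsplit

# Correct form: no netloc, so the content type lands in the path.
good = urlsplit('docma:swatch?color=red&width=10')
content_type = good.path
options = dict(parse_qsl(good.query))

# With '//' the parser treats 'swatch' as a network location instead.
bad = urlsplit('docma://swatch?color=red&width=10')
```

Query values arrive as strings, which is one reason a validating model for the options is useful.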

Content generators have the following signature:

Content Generator Signature

@content_generator(content_type: str, validator: type[BaseModel])
def content_generator(options: type[BaseModel], context: DocmaRenderContext) -> dict[str, Any]:

The content_generator registration decorator has the following parameters:

Name Type Description
content_type str The content type. This is the first component of the URL after the docma scheme.
validator type[BaseModel] A class derived from a Pydantic BaseModel that will validate and hold the query parameters from the URL.

The content generator itself has the following parameters:

Name Type Description
options type[BaseModel] An instance of the Pydantic validator class for the generator.
context DocmaRenderContext The template package. This allows the content generator to access files in the template, if required.

Content generators return the value required of a Weasyprint URL fetcher. i.e. a dictionary containing (at least):

  • string: The bytes of the content (yes ... it says string but it's bytes).

  • mimetype: The MIME type of the content.

Content generators should raise a DocmaGeneratorError on failure.

Look at the swatch content generator as an example.

docma.generators.content_generator

content_generator(
    content_type: str, validator: type[BaseModel]
) -> Callable

Register the generator function for the specified content type.

This is a decorator used like so:

class WhateverOptions(BaseModel):
    param_a: str
    param_b: int

@content_generator('whatever', WhateverOptions)
def _(options: WhateverOptions, context: DocmaRenderContext) -> dict[str, Any]:
    ...

Parameters:

Name Type Description Default
content_type str

The content type. This is the first component of the URL after the docma scheme.

required
validator type[BaseModel]

A class derived from a Pydantic BaseModel that will hold the query parameters from the URL.

required

docma.generators.content_generator_for_type

content_generator_for_type(content_type: str) -> Callable

Get the generator function for the specified content type.

docma.importers

Content importers bring in external content when compiling a docma template.

See Document Imports.

They accept a single URL style argument and return bytes with the content. The scheme from the URL is used to select the appropriate import handler.

Importers have the following signature:

Importer Signature

@content_importer(*schemes: str)
def whatever(url: str, max_size: int = 0) -> bytes:

Name Type Description
url str The URL to import. The scheme has already been used to select the correct importer.
max_size int The maximum size of the imported content. If non-zero, the importer should do its best to determine the imported data size and throw a DocmaImportError if the data exceeds this size.

Importers should raise a DocmaImportError on failure.
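
Scheme-based dispatch with a size limit can be sketched as below. Everything here is hypothetical: the `mem` scheme, the in-memory store, and `ImportErrorSketch` (standing in for DocmaImportError) exist only to keep the example self-contained.

```python
from typing import Callable
from urllib.parse import urlsplit


class ImportErrorSketch(Exception):
    """Stand-in for DocmaImportError."""


_IMPORTERS: dict[str, Callable[..., bytes]] = {}
_FAKE_STORE = {'greeting': b'hello'}  # Made-up content source for the demo.


def content_importer_sketch(*schemes: str) -> Callable:
    """Register the decorated function for each URL scheme."""
    def decorate(func: Callable[..., bytes]) -> Callable[..., bytes]:
        for scheme in schemes:
            _IMPORTERS[scheme] = func
        return func
    return decorate


@content_importer_sketch('mem')
def import_from_memory(url: str, max_size: int = 0) -> bytes:
    data = _FAKE_STORE[urlsplit(url).path]
    # A max_size of 0 means no limit.
    if max_size and len(data) > max_size:
        raise ImportErrorSketch(f'{url}: larger than {max_size} bytes')
    return data


def import_content_sketch(url: str, max_size: int = 0) -> bytes:
    """Select the importer from the URL scheme and delegate to it."""
    scheme = urlsplit(url).scheme
    try:
        importer = _IMPORTERS[scheme]
    except KeyError:
        raise ImportErrorSketch(f'No importer for scheme {scheme!r}') from None
    return importer(url, max_size=max_size)


content = import_content_sketch('mem:greeting')
```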

docma.importers.content_importer

content_importer(*schemes: str) -> Callable

Register document importers for the specified URL schemes.

This is a decorator used like so:

@content_importer('http', 'https')
def http(url: str, max_size: int = 0) -> bytes:
    ...

docma.importers.import_content cached

import_content(url: str, max_size: int = 0) -> bytes

Import the content from the specified URL style source.

Parameters:

Name Type Description Default
url str

The URL from which to import.

required
max_size int

The maximum size in bytes of the object. Default is 0 (no limit).

0

docma.jinja

Core Jinja components for docma: plugins, extra filters etc.

docma.jinja.DocmaJinjaEnvironment

DocmaJinjaEnvironment(*args, **kwargs)

Bases: Environment

Jinja2 environment with some docma add-ons.

Prep a Jinja2 environment for use in docma.

docma.jinja.DocmaRenderContext dataclass

DocmaRenderContext(
    tpkg: PackageReader,
    params: dict[str, Any] = dict(),
    env: DocmaJinjaEnvironment = None,
)

Simple grouping construct for essential bits involved in rendering.

The render context provides the following attributes:

  • the package reader for the document template;
  • the rendering parameters; and
  • the Jinja environment

_

_(
    v: Iterable, *args: dict[str, Any], **kwargs
) -> list[str]

Render each element of an iterable to produce a list of strings.

Parameters:

Name Type Description Default
v Iterable

The iterable to render.

required
args dict[str, Any]

Additional dictionaries of parameters for rendering.

()
kwargs

Additional keyword parameters for rendering.

{}

render

render(v, *args, **kwargs)

Raise exception on unhandled types.

docma.jinja.NoLoader

Bases: BaseLoader

Jinja2 loader that prevents loading.

get_source

get_source(environment, template: str)

Block template loading.

docma.jinja.jext

jext(cls: type[Extension]) -> type[Extension]

Register Jinja extensions.

docma.jinja.jfunc

jfunc(name: str)

Register Jinja extra functions.

docma.lib

Utilities of one kind or another.

docma.lib.db

DB utilities.

get_paramstyle_from_conn

get_paramstyle_from_conn(conn) -> str

Hack to get the DBAPI paramstyle from a database connection in a driver independent way.

DBAPI 2.0 implementations show very little consistency. Dodgy as. Works for a few of the common ones.

docma.lib.html

HTML utilities.

html_append

html_append(
    html1: BeautifulSoup, html2: BeautifulSoup
) -> None

Append head and body of a HTML document into another HTML document.

docma.lib.http

HTTP utilities.

get_url

get_url(
    url: str, max_size: int = 0, timeout: int = HTTP_TIMEOUT
) -> bytes

Fetch a URL and return its contents.

Results are cached, based only on the URL.

Parameters:

Name Type Description Default
url str

The URL to fetch.

required
max_size int

The maximum allowed size of the object being fetched. If 0, no limit is applied.

0
timeout int

Timeout in seconds on HTTP operations

HTTP_TIMEOUT

Returns:

Type Description
bytes

The object being fetched as bytes.

docma.lib.jsonschema

JSON schema utilities.

JsonSchemaBuiltinsResolver

JsonSchemaBuiltinsResolver(
    base: FormatChecker | None = None,
)

Bases: PluginResolver

Adapts builtin jsonschema.FormatChecker checkers into boolean-returning callables.

This is required because we are using our own plugin registration and lookup system and it just registers the callable implementing the plugin. The JSONschema version of this for format checkers registers the callable as well as exceptions it may raise that indicate a negative format match. We need to adapt that style to our own plugin style.
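
The adaptation described above amounts to wrapping a checker that signals failure by raising into one that returns a boolean. A minimal sketch, assuming the checker is supplied together with the exception types that mean "no match"; `as_boolean_checker` and `looks_like_int` are hypothetical names.

```python
from typing import Any, Callable


def as_boolean_checker(
    func: Callable[[Any], Any],
    raises: tuple[type[BaseException], ...] = (),
) -> Callable[[Any], bool]:
    """Wrap func so the listed exceptions become a False result."""
    def check(value: Any) -> bool:
        try:
            return bool(func(value))
        except raises:
            return False
    return check


def looks_like_int(value: Any) -> bool:
    int(value)  # Raises ValueError when the value is not an integer string.
    return True


is_int = as_boolean_checker(looks_like_int, raises=(ValueError,))
```

Exceptions not listed in `raises` still propagate, so genuine bugs in a checker are not silently reported as a failed format match.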

Create a new PluginResolver instance.

Parameters:

Name Type Description Default
base FormatChecker | None

A base FormatChecker instance from which to obtain the checkers. Defaults to a clean FormatChecker instance.

None
resolve
resolve(name: str) -> PluginType | None

Resolve a plugin name into a callable or return None.

PluginFormatChecker

PluginFormatChecker(resolvers: Sequence[PluginResolver])

Bases: FormatChecker

A JSONschema FormatChecker that uses PluginRouter to look up format checkers.

Expects all format plugins to be boolean-returning callables. i.e. they must not use an exception to flag that the value does not comply with the format.

Parameters:

Name Type Description Default
resolvers Sequence[PluginResolver]

An iterable of PluginResolver instances. Plugin lookups will try each resolver in turn.

required

Create a new PluginResolver instance.

check
check(instance: Any, format: str) -> None

Check whether the instance conforms to the given format.

format_checker

format_checker(*args, **kwargs) -> PluginType

Mark a callable as a format checker suitable for Jinja or JSONschema.

docma.lib.logging

Logging related stuff.

ColourLogHandler

ColourLogHandler(colour: bool = True)

Bases: Handler

Basic stream handler that writes to stderr with colours for log levels.

Allow colour to be enabled or disabled.

emit
emit(record: LogRecord) -> None

Print the record to stderr with some colour enhancement.

get_log_level

get_log_level(s: str) -> int

Convert string log level to the corresponding integer log level.

Parameters:

Name Type Description Default
s str

A string version of a log level (e.g. 'error', 'info'). Case is not significant.

required

Returns:

Type Description
int

The numeric logLevel equivalent.

Raises:

Type Description
ValueError

If the supplied string cannot be converted.
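
One way to get this behaviour with the standard library is sketched below. It relies on `logging.getLevelName` returning an integer for a known level name; whether docma does it this way is not stated, and the function name is hypothetical.

```python
import logging


def get_log_level_sketch(s: str) -> int:
    """Map a case-insensitive level name to its numeric value."""
    level = logging.getLevelName(s.strip().upper())
    # Unknown names come back as a string such as 'Level NOPE'.
    if not isinstance(level, int):
        raise ValueError(f'Bad log level: {s}')
    return level
```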

setup_logging

setup_logging(
    level: str,
    colour: bool = True,
    name: str | None = None,
    prefix: str | None = None,
) -> None

Set up logging.

Parameters:

Name Type Description Default
level str

Logging level. The string format of a level (e.g. 'debug').

required
colour bool

If True and logging to the terminal, colourise messages for different logging levels. Default True.

True
name str | None

The name of the logger to configure. If None, configure the root logger.

None
prefix str | None

Messages are prefixed by this string (with colon+space appended). Default None.

None

Raises:

Type Description
ValueError

If an invalid log level or syslog facility is specified.

docma.lib.metadata

Document metadata management for docma.

DocumentMetadata

DocumentMetadata(**kwargs: Any)

Bases: MutableMapping

Document metadata manager and format converter.

This provides an output format agnostic container for document metadata (title, subject, author etc.). It can provide a metadata structure in the format required for either PDF or HTML.

Initialize metadata.

as_dict
as_dict(format: str = None) -> dict[str, Any]

Return a dictionary with all attributes.

Parameters:

Name Type Description Default
format str

Adjust metadata for the specified format. Allowed values are None, html and pdf. PDF has names like /Author whereas the HTML convention is author. The PDF convention for list items (e.g. /Keywords) is to join them with semicolons; the HTML convention is commas.

None
normalise_attr_name staticmethod
normalise_attr_name(attr_name: str) -> str

Normalise attribute name to handle PDF and plain variants.

normalise_attr_value classmethod
normalise_attr_value(attr_value: Any) -> list | str

Normalise attribute value to preserve lists but everything else is string.

to_pdf_name staticmethod
to_pdf_name(name: str) -> str

Convert a metadata attribute name to PDF metadata style.

docma.lib.misc

Miscellaneous utilities.

StoreNameValuePair

Bases: Action

Store argparse values from options of the form --option name=value.

The destination (self.dest) will be created as a dict {name: value}. This allows multiple name-value pairs to be set for the same option.

Usage is:

argparser.add_argument('-x', metavar='key=value', action=StoreNameValuePair)

or

argparser.add_argument('-x', metavar='key=value ...', action=StoreNameValuePair, nargs='+')

chunks

chunks(
    s: str | bytes, size: int = CHUNK_SIZE
) -> Iterator[str | bytes]

Yield successive chunks from a string.
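
Chunking by slicing works the same way for str and bytes. A minimal sketch; the default size of 4 is arbitrary and is not the library's CHUNK_SIZE.

```python
from __future__ import annotations

from typing import Iterator


def chunks_sketch(s: str | bytes, size: int = 4) -> Iterator[str | bytes]:
    """Yield consecutive slices of s, each at most size long."""
    if size < 1:
        raise ValueError('size must be positive')
    for start in range(0, len(s), size):
        yield s[start:start + size]


text_chunks = list(chunks_sketch('abcdefghij', 4))
byte_chunks = list(chunks_sketch(b'abcde', 2))
```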

css_id

css_id(s: str) -> str

Convert a string to a valid CSS identifier (e.g. for classes or IDs).

datetime_pdf_format

datetime_pdf_format(dt: datetime = None) -> str

Convert a timezone aware datetime object into PDF format.

See: https://www.verypdf.com/pdfinfoeditor/pdf-date-format.htm

Parameters:

Name Type Description Default
dt datetime

Timezone aware datetime object. Defaults to the current time in UTC.

None
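
A PDF date string has the general shape D:YYYYMMDDHHmmSS followed by a UTC offset. The sketch below produces the variant with apostrophes after both the hour and minute offset fields; it is an illustration under that assumption and may differ in detail from what docma emits.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone


def datetime_pdf_format_sketch(dt: datetime | None = None) -> str:
    """Format a timezone aware datetime as a PDF date string."""
    dt = dt or datetime.now(timezone.utc)
    offset = dt.utcoffset()
    if offset is None:
        raise ValueError('A timezone aware datetime is required')
    total_minutes = int(offset.total_seconds() // 60)
    sign = '+' if total_minutes >= 0 else '-'
    hours, minutes = divmod(abs(total_minutes), 60)
    return f"D:{dt:%Y%m%d%H%M%S}{sign}{hours:02d}'{minutes:02d}'"


stamp = datetime_pdf_format_sketch(
    datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone(timedelta(hours=10)))
)
```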

deep_update_dict

deep_update_dict(*d: dict | None) -> dict

Deep update dictionaries into the first one.

Parameters:

Name Type Description Default
d dict | None

Dictionaries to merge.

()

Returns:

Type Description
dict

The first dict in the sequence containing the merged dictionary.
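
A recursive merge of this kind might look like the following sketch. Skipping None arguments and replacing non-dict values outright are assumptions about the semantics, and the function name is hypothetical.

```python
from __future__ import annotations


def deep_update_dict_sketch(*d: dict | None) -> dict:
    """Merge later dicts into the first one, recursing into nested dicts."""
    dicts = [x for x in d if x is not None]  # Assumption: None is ignored.
    if not dicts:
        return {}
    target = dicts[0]
    for source in dicts[1:]:
        for key, value in source.items():
            if isinstance(value, dict) and isinstance(target.get(key), dict):
                deep_update_dict_sketch(target[key], value)
            else:
                target[key] = value
    return target


base = {'x': {'p': 1}, 'y': 1}
merged = deep_update_dict_sketch(base, None, {'x': {'q': 2}, 'y': 3})
```

The first dictionary is modified in place and returned, so `merged` and `base` are the same object here.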

dot_dict_get

dot_dict_get(d: dict[str, Any], key: str) -> Any

Access a dict element based on a hierarchical dot separated key.

All the components of the key must be present in the dict or it's a KeyError.

Parameters:

Name Type Description Default
d dict[str, Any]

The dict to be accessed.

required
key str

The compound key in the form a.b.c....

required

dot_dict_set

dot_dict_set(
    d: dict[str, Any], key: str, value: Any
) -> dict[str, Any]

Set a dict element based on a hierarchical dot separated key.

All the parent components of the key must be present in the dict or it's a KeyError.

Parameters:

Name Type Description Default
d dict[str, Any]

The dict to be modified

required
key str

The compound key in the form a.b.c....

required
value Any

The value to set.

required

Returns:

Type Description
dict[str, Any]

The dict.
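
Dot separated access for both reading and writing can be sketched as a simple walk over the key components. These are hypothetical re-implementations for illustration, using data.values (the default chart data target mentioned earlier) as the example key.

```python
from typing import Any


def dot_dict_get_sketch(d: dict[str, Any], key: str) -> Any:
    """Follow each component of 'a.b.c' down through nested dicts."""
    node: Any = d
    for part in key.split('.'):
        node = node[part]  # KeyError if any component is missing.
    return node


def dot_dict_set_sketch(d: dict[str, Any], key: str, value: Any) -> dict[str, Any]:
    """Set the final component; all parent components must already exist."""
    *parents, last = key.split('.')
    node: Any = d
    for part in parents:
        node = node[part]  # KeyError if a parent is missing.
    node[last] = value
    return d


chart = {'mark': 'bar', 'data': {}}
dot_dict_set_sketch(chart, 'data.values', [{'a': 1}])
values = dot_dict_get_sketch(chart, 'data.values')
```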

env_config cached

env_config(prefix: str, group: str) -> dict[str, str]

Read configuration from environment variables and dotenv.

This will return a dictionary of all values <prefix>_<group>_* read from the environment plus all values <group>_* read from .env. Environment vars take precedence. The <prefix>/<group> components are removed from the names. All keys are converted to lowercase.

Parameters:

Name Type Description Default
prefix str

A prefix for values read from environment variables. An underscore is added. This is not used for variables read from .env.

required
group str

A prefix that identifies a group of variables to be fetched.

required

Returns:

Type Description
dict[str, str]

A dictionary of values read from environment variables.
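
The precedence and name-stripping rules can be illustrated without touching a real .env file. In this sketch the two sources are passed in as mappings; the extra `environ` and `dotenv` parameters, the case-insensitive matching, and the function name are all assumptions made for the example.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def env_config_sketch(
    prefix: str,
    group: str,
    environ: Mapping[str, str] | None = None,
    dotenv: Mapping[str, str] | None = None,
) -> dict[str, str]:
    """Collect <group>_* from dotenv, then overlay <prefix>_<group>_* from env."""
    environ = os.environ if environ is None else environ
    dotenv = {} if dotenv is None else dotenv
    config: dict[str, str] = {}

    dotenv_lead = f'{group}_'.upper()
    for name, value in dotenv.items():
        if name.upper().startswith(dotenv_lead):
            config[name[len(dotenv_lead):].lower()] = value

    # Environment variables are applied last so they take precedence.
    env_lead = f'{prefix}_{group}_'.upper()
    for name, value in environ.items():
        if name.upper().startswith(env_lead):
            config[name[len(env_lead):].lower()] = value
    return config


cfg = env_config_sketch(
    'DOCMA',
    'DB',
    environ={'DOCMA_DB_HOST': 'from-env', 'OTHER': 'ignored'},
    dotenv={'DB_HOST': 'from-dotenv', 'DB_PORT': '5432'},
)
```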

flatten_iterable

flatten_iterable(iterable: Iterable) -> list

Flatten a nested iterable into a single level list.
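
A recursive flatten can be sketched as follows. Treating str and bytes as atomic values rather than iterables of characters is an assumption, as is the function name.

```python
from collections.abc import Iterable


def flatten_iterable_sketch(iterable: Iterable) -> list:
    """Flatten arbitrarily nested iterables into one list."""
    flat: list = []
    for item in iterable:
        # Assumption: strings and bytes are leaves, not containers.
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            flat.extend(flatten_iterable_sketch(item))
        else:
            flat.append(item)
    return flat


flat = flatten_iterable_sketch([1, [2, (3, [4])], 'ab'])
```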

html_to_pdf

html_to_pdf(
    html_src: str,
    url_fetcher: Callable = None,
    font_config: FontConfiguration = None,
) -> PdfReader

Convert HTML source to PDF.

Unfortunately, weasyprint's document construct is fairly opaque and limited, with no real support for document composition or manipulation. So we basically use weasyprint to generate a PDF, write it out, then re-read it into a PyPDF PdfReader.

This is all done in memory. Should be ok in the context of this application but could add temp files if needed.

Parameters:

Name Type Description Default
html_src str

HTML source to convert (actual HTML -- not a file name).

required
url_fetcher Callable

Custom URL fetcher for WeasyPrint.

None
font_config FontConfiguration

WeasyPrint font configuration for @font-face rules.

None

Returns:

Type Description
PdfReader

PyPDF PDF reader.

load_font cached

load_font(name: str, size: int) -> ImageFont

Load a truetype font or the default font.

path_matches

path_matches(path: Path, patterns: Iterable[str]) -> bool

Check if a path matches any in a list of glob patterns.

str2bool

str2bool(s: str | bool) -> bool

Convert a string to a boolean.

This is a (case insensitive) semantic conversion.

'true', 't', 'yes', 'y', non-zero int as str --> True
'false', 'f', 'no', 'n', zero as str --> False

Parameters:

Name Type Description Default
s str | bool

A boolean or a string representing a boolean. Whitespace is stripped. Boolean values are passed back unchanged.

required

Returns:

Type Description
bool

A boolean derived from the input value.

Raises:

Type Description
ValueError

If the value cannot be converted.
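
The conversion rules listed above can be sketched directly. This is a hypothetical re-implementation for illustration, not the library source.

```python
from __future__ import annotations


def str2bool_sketch(s: str | bool) -> bool:
    """Semantic, case-insensitive conversion of a string to a boolean."""
    if isinstance(s, bool):
        return s  # Booleans pass through unchanged.
    text = s.strip().lower()
    if text in ('true', 't', 'yes', 'y'):
        return True
    if text in ('false', 'f', 'no', 'n'):
        return False
    try:
        return int(text) != 0  # Non-zero integer strings are True.
    except ValueError:
        raise ValueError(f'Cannot convert to bool: {s!r}') from None
```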

docma.lib.packager

Utils to package files into a collection.

DirPackageReader

DirPackageReader(path: Path)

Bases: PackageReader

Create a directory package reader.

exists
exists(file: Path | str) -> bool

Check if a file exists in the package.

is_dir
is_dir(file: Path | str) -> bool

Check if path is a directory.

namelist
namelist(base: Path | str = None) -> Iterator[Path]

Get an iterator over file names under the specified base directory.

read_bytes
read_bytes(file: Path | str) -> bytes

Read a binary file from the package.

read_text
read_text(file: Path | str) -> str

Read a text file from the package.

DirPackageWriter

DirPackageWriter(path: Path)

Bases: PackageWriter

Package files into a directory.

Create a directory package writer.

add_file
add_file(src: Path, dst: Path | str) -> Path

Add a file to the package.

exists
exists(file: Path | str) -> bool

Check if a file exists in the package.

write_bytes
write_bytes(content: bytes, file: Path | str) -> Path

Create a new file in the package with the specified content.

write_string
write_string(content: str, file: Path | str) -> Path

Create a new file in the package with the specified content.

PackageReader

PackageReader(path: Path)

Bases: ABC, BaseLoader

Access package files.

Create package reader.

exists abstractmethod
exists(file: Path | str) -> bool

Check if a file exists in the package.

get_source
get_source(environment, template: str)

Allow jinja2 to use this class as a custom loader.

is_dir abstractmethod
is_dir(file: Path | str) -> bool

Check if path is a directory.

namelist abstractmethod
namelist(base: Path | str = None) -> Iterator[Path]

Get an iterator over file names under the specified base directory.

new staticmethod
new(path: Path | str) -> PackageReader

Create a new package reader.

read_bytes abstractmethod
read_bytes(file: Path | str) -> bytes

Read a binary file from the package.

Parameters:

Name Type Description Default
file Path | str

The path of the file to be read, relative to the package root.

required

Returns:

Type Description
bytes

The content of the file.

read_text abstractmethod
read_text(file: Path | str) -> str

Read a text file from the package.

Parameters:

Name Type Description Default
file Path | str

The path of the file to be read, relative to the package root.

required

Returns:

Type Description
str

The content of the file.

PackageWriter

PackageWriter(path: Path)

Bases: ABC

Package files up.

Create packager.

add_file abstractmethod
add_file(src: Path, dst: Path | str)

Create a new file in the package from the specified file.

Parameters:

Name Type Description Default
src Path

The path to the source file.

required
dst Path | str

The name of the file to be created. This must be a relative path.

required
exists abstractmethod
exists(file: Path | str) -> bool

Check if a file exists in the package.

new staticmethod
new(path: Path | str) -> PackageWriter

Create a new package writer.

write_bytes abstractmethod
write_bytes(content: bytes, file: Path | str) -> Path

Create a new file in the package with the specified content.

Parameters:

Name Type Description Default
content bytes

The content to be added.

required
file Path | str

The name of the file to be created. This must be a relative path.

required

Returns:

Type Description
Path

The relative path to the created file.

write_string abstractmethod
write_string(content: str, file: Path | str)

Create a new file in the package with the specified content.

Parameters:

Name Type Description Default
content str

The content to be added.

required
file Path | str

The name of the file to be created. This must be a relative path.

required

Returns:

Type Description

The relative path to the created file.

ZipPackageReader

ZipPackageReader(path: Path)

Bases: PackageReader

Create a ZIP package reader.

close
close()

Close the zip file.

exists
exists(file: Path | str) -> bool

Check if a file exists in the package.

is_dir
is_dir(file: Path | str) -> bool

Check if path is a directory.

namelist
namelist(base: Path | str = None) -> Iterator[Path]

Get an iterator over file names under the specified base directory.

read_bytes
read_bytes(file: Path | str) -> bytes

Read a binary file from the package.

read_text
read_text(file: Path | str) -> str

Read a text file from the package.

ZipPackageWriter

ZipPackageWriter(path: Path)

Bases: PackageWriter

Package files into zip file.

Create a zip packager.

add_file
add_file(src: Path, dst: Path | str) -> Path

Add a file to the zip file.

close
close()

Close the zip file.

exists
exists(file: Path | str) -> bool

Check if a file exists in the package.

write_bytes
write_bytes(content: bytes, file: Path | str) -> Path

Create a new file in the ZIP file with the specified content.

write_string
write_string(content: str, file: Path | str) -> Path

Create a new file in the ZIP file with the specified content.

docma.lib.path

File path based utilities.

relative_path

relative_path(root: Path, path: Path | str) -> Path

Ensure a relative path is contained within a root path.

This is a bit crude but is intended to make sure a path doesn't stray outside a given root using relative path trickery.

Parameters:

Name Type Description Default
root Path

Root path.

required
path Path | str

A relative path.

required

Returns:

Type Description
Path

A resolved relative path with respect to root.

Raises:

Type Description
ValueError

If path is not a relative path or not contained within root.
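
A containment check of this sort can be built on `Path.resolve()` and `Path.relative_to()`. The sketch below is one possible approach under that assumption; the function name is hypothetical and the real implementation may be cruder or stricter.

```python
from __future__ import annotations

from pathlib import Path


def relative_path_sketch(root: Path, path: Path | str) -> Path:
    """Return path relative to root, refusing anything that escapes root."""
    path = Path(path)
    if path.is_absolute():
        raise ValueError(f'{path} is not a relative path')
    root = root.resolve()
    # Resolving collapses any '..' components before the containment test.
    full = (root / path).resolve()
    try:
        return full.relative_to(root)
    except ValueError:
        raise ValueError(f'{path} is outside {root}') from None


inside = relative_path_sketch(Path('/tmp/root'), 'a/../b/c.txt')
```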

walkpath

walkpath(path: Path | Path) -> Iterator[Path | ZipPath]

Walk a directory and yield all files in it.

docma.lib.plugin

Generic plugin mechanism for frameworks that lookup plugins using a dict.

e.g. Jinja filters and tests, and JSONschema format checkers use dicts to hold their plugins.

The dictionary object that the framework normally uses for looking up plugins gets replaced by a PluginRouter object. This appears as a dictionary that returns a callable that implements the plugin.

The plugin router must be initialised with one or more PluginResolver instances. A resolver maps a plugin name to the callable that implements it, or None if it is not known to the resolver. The PluginRouter instance tries each of its resolvers in turn until it either finds the plugin or exhausts all resolvers.

Resolvers don't have to worry about caching. PluginRouter does that.

In the case of Jinja filters, this gets used like so ...

from jinja2 import Environment
from docma.lib.plugin import MappingResolver, PackageResolver, PluginRouter

e = Environment()
e.filters = PluginRouter(
    [
        MappingResolver(e.filters),  # Preserve Jinja built-in filters
        PackageResolver('docma.plugins.jinja_filters'),
    ],
)
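The routing idea itself (try each resolver in turn, cache the first hit) can be sketched without docma. Everything in this block is hypothetical: SketchRouter is not the real PluginRouter, and resolvers are modelled as plain callables rather than PluginResolver instances.

```python
from __future__ import annotations

from collections.abc import Iterator, MutableMapping
from typing import Callable, Optional

Resolver = Callable[[str], Optional[Callable]]


class SketchRouter(MutableMapping):
    """Dict-like lookup that delegates misses to a chain of resolvers."""

    def __init__(self, resolvers: list[Resolver]) -> None:
        self._resolvers = list(resolvers)
        self._cache: dict[str, Callable] = {}

    def __getitem__(self, key: str) -> Callable:
        if key not in self._cache:
            for resolve in self._resolvers:
                plugin = resolve(key)
                if plugin is not None:
                    self._cache[key] = plugin  # Later lookups skip resolvers
                    break
            else:
                raise KeyError(key)
        return self._cache[key]

    def __setitem__(self, key: str, value: Callable) -> None:
        self._cache[key] = value

    def __delitem__(self, key: str) -> None:
        del self._cache[key]

    def __iter__(self) -> Iterator[str]:
        return iter(self._cache)

    def __len__(self) -> int:
        return len(self._cache)
```

Because the object satisfies the MutableMapping interface, a framework that expects a plain dict of plugins can use it unchanged.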

MappingResolver

MappingResolver(mapping: Mapping[str, PluginType])

Bases: PluginResolver

Resolver that looks up plugins from a plain dictionary.

In the case of Jinja filters, for example, this can be used to hold Jinja's factory-fitted filters. e.g.

env = jinja2.Environment()
env.filters = PluginRouter([MappingResolver(env.filters)])

Create a MappingResolver instance.

resolve
resolve(name: str) -> PluginType | None

Return the plugin if present, else None.

PackageResolver

PackageResolver(package: str, plugin_types: str | set[str])

Bases: PluginResolver

Resolve plugins from a Python package hierarchy.

  • Top-level plugins (directly in the base package) are loaded eagerly.
  • Category plugins (subpackages) are loaded lazily when first accessed.

Plugins are organised as a standard Python package hierarchy, with the plugin callables (typically functions) marked with a decorator, plugin_marker(). The package hierarchy (excluding the final Python filename) is used to form the prefix for the plugin name. The last component of the name is specified in the decorator.

The plugin_decorator annotates the plugin with two attributes:

  • _plugin_types: A set of strings that is used to determine if a resolver instance should notice the plugin.
  • _plugin_names: A tuple of strings containing the trailing component of the plugin names.
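A decorator that attaches these two attributes could look roughly like the sketch below. The name mark_plugin and the "jinja_filter" type string are assumptions for illustration, not docma's actual decorator or values, and the formatter body is illustrative only.

```python
from __future__ import annotations

from collections.abc import Callable, Iterable


def mark_plugin(plugin_types: str | Iterable[str], name: str, *aliases: str) -> Callable:
    """Hypothetical marker: annotate a callable with plugin metadata."""
    types = {plugin_types} if isinstance(plugin_types, str) else set(plugin_types)

    def decorator(func: Callable) -> Callable:
        func._plugin_types = types  # Which resolvers should notice it
        func._plugin_names = (name, *aliases)  # Trailing name components
        return func

    return decorator


@mark_plugin("jinja_filter", "abn")
def format_abn(value: str) -> str:
    """Group the digits of an 11 digit identifier as 2-3-3-3."""
    digits = "".join(ch for ch in value if ch.isdigit())
    return f"{digits[:2]} {digits[2:5]} {digits[5:8]} {digits[8:]}"
```

The decorated function is returned unchanged apart from the two attributes, so it remains directly callable.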

This is a sample hierarchy for a plugin package.

    .
    ├── __init__.py
    └── filters                 (Base for all plugins)
        ├── __init__.py
        ├── whatever.py         (Plugins with no category)
        └── au                  (Base for all "au.*" plugins)
            ├── __init__.py
            ├── company_ids.py  (Can contain multiple plugins, e.g. "abn", "acn")
            └── tax             (Base for all "au.tax.*" plugins)
                ├── __init__.py
                └── tax.py      (Can contain multiple plugins, e.g. "tfn")

In this example, we are defining plugins named au.abn and au.tax.tfn. The "category" here is au.

Filters at the top level (e.g. in whatever.py) are preloaded. The nested plugins are lazy loaded at the category level when invoked.

Create a PackageResolver instance.

Parameters:

Name Type Description Default
package str

A Python package name (e.g. jinja.filters).

required
plugin_types str | set[str]

A set of strings indicating the type of plugins to load.

required
_load_category
_load_category(category: str) -> None

Import and register all plugins in a category package.

_load_top_level
_load_top_level() -> None

Load plugins from modules directly under the package (no category).

_module_prefix_from_name
_module_prefix_from_name(module_name: str) -> str

Return the namespace prefix for plugins in a module based on its name.

For example:

  • Base package: 'docma.plugins.filters'
  • Module name: 'docma.plugins.filters.au.tax.tax'
  • Result: 'au.tax'

The module filename (last component) is not included in the prefix.
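The prefix rule can be expressed in a few lines. This is a sketch only: module_prefix is a hypothetical free function that takes the base package explicitly, whereas the documented method presumably reads it from the resolver instance.

```python
def module_prefix(base_package: str, module_name: str) -> str:
    """Return the plugin namespace prefix for a module below base_package."""
    if not module_name.startswith(base_package + "."):
        raise ValueError(f"{module_name} is not inside {base_package}")
    relative = module_name[len(base_package) + 1:]
    # Drop the final component (the module filename); keep the packages.
    return ".".join(relative.split(".")[:-1])
```

Under this rule a module directly in the base package, such as 'docma.plugins.filters.whatever', yields an empty prefix.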

_register_module_plugins
_register_module_plugins(
    module: ModuleType, prefix: str
) -> None

Find and register plugin callables from a module.

resolve
resolve(name: str) -> PluginType | None

Resolve a plugin name into a callable or return None.

Plugin

Container for plugin related functionality.

plugin staticmethod
plugin(
    plugin_types: str | Iterable[str],
    name: str,
    *aliases: str,
    deprecation: str | None = None
) -> PluginType

Decorate a plugin so we know its name and to which families it belongs.

Parameters:

Name Type Description Default
plugin_types str | Iterable[str]

Indicates the families to which this plugin belongs.

required
name str

Plugin name.

required
aliases str

Alternative names. No need to use this for case variations as these are supported natively.

()
deprecation str | None

Optional deprecation warning.

None

PluginError

Bases: Exception

Exception to raise when a plugin or plugin category cannot be loaded.

PluginResolver

Bases: ABC

Base class for plugin resolvers.

resolve abstractmethod
resolve(name: str) -> PluginType | None

Resolve a plugin name into a callable or return None.

PluginRouter

PluginRouter(resolvers: Sequence[PluginResolver])

Bases: MutableMapping[str, PluginType]

A dict-like object for dict lookup that delegates resolution to resolvers.

Parameters:

Name Type Description Default
resolvers Sequence[PluginResolver]

An iterable of PluginResolver instances. Plugin lookups will try each resolver in turn.

required

Create a new PluginRouter instance.

_canonical staticmethod
_canonical(key: str) -> str

Return canonical lowercase key for lookup.

_deprecation_check
_deprecation_check(
    key: str, plugin: PluginType
) -> PluginType

Raise a deprecation warning for deprecated plugins.

Parameters:

Name Type Description Default
key str

The key used to resolve the plugin.

required
plugin PluginType

The plugin to check.

required

Returns:

Type Description
PluginType

The plugin (unmodified).
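A check of this shape can be sketched with the standard warnings module. How docma actually stores the deprecation message is not documented here, so the _deprecation attribute below is purely an assumption, as is the function name.

```python
import warnings
from typing import Callable


def deprecation_check(key: str, plugin: Callable) -> Callable:
    """Warn if the plugin carries a deprecation message, then return it."""
    # Hypothetical attribute; the real storage mechanism may differ.
    message = getattr(plugin, "_deprecation", None)
    if message:
        warnings.warn(f"{key}: {message}", DeprecationWarning, stacklevel=2)
    return plugin
```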

get
get(key: str, default=None) -> PluginType

Get a plugin by name.

jfilter

jfilter(*args, **kwargs) -> PluginType

Mark a callable as a Jinja filter.

jtest

jtest(*args, **kwargs) -> PluginType

Mark a callable as a Jinja test.

docma.lib.query

Manage query specifications.

DocmaQuerySpecification

Bases: BaseModel

Model for managing query specifications.

row_checker cached property
row_checker: Validator

Row validator for data coming from the database.

_ classmethod
_(value: dict[str, Any]) -> dict[str, Any]

Validate row schema.

check_row
check_row(data: dict[str, Any])

Validate a row of data from the database against the schema.

fetch_from_cursor
fetch_from_cursor(cursor) -> list[dict[str, Any]]

Consume the data from a cursor on which a query has been executed.

Parameters:

Name Type Description Default
cursor

DBAPI 2.0 cursor. The query must have already been executed.

required
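Turning an executed DBAPI 2.0 cursor into a list of dicts is commonly done with cursor.description, as in the sketch below. It uses sqlite3 so that it runs standalone. rows_as_dicts is a hypothetical name, and any row validation docma may perform is not shown.

```python
import sqlite3
from typing import Any


def rows_as_dicts(cursor) -> list[dict[str, Any]]:
    """Consume an executed cursor, keying each row by column name."""
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("SELECT 1 AS id, 'widget' AS name")  # Must run before fetching
rows = rows_as_dicts(cursor)
```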
prepare_query
prepare_query(
    context: DocmaRenderContext,
    params: dict[str, Any] = None,
    paramstyle: str = "format",
) -> tuple[str, list] | tuple[str, dict[str, Any]]

Prepare a query from a query specification.

Parameters:

Name Type Description Default
context DocmaRenderContext

Document rendering context.

required
params dict[str, Any]

Additional rendering parameters.

None
paramstyle str

DBAPI 2.0 paramstyle.

'format'

Returns:

Type Description
tuple[str, list] | tuple[str, dict[str, Any]]

For named param styles, a tuple: (rendered query text, dict of rendered param values). For positional param styles, a tuple: (rendered query text, list of rendered query parameters). The rationale for this structure is that the query and param values get fed directly to a DBAPI 2.0 cursor.execute(), so they need to be in the right format for the respective paramstyle.
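The two return shapes correspond to how a DBAPI 2.0 driver expects its arguments. The sketch below uses sqlite3, which accepts the positional "qmark" style and the "named" style (it does not support the "format" style). The tuples are written by hand here, standing in for what prepare_query() would return.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO item VALUES (?, ?)", [(1, "pen"), (2, "ink")])

# Positional style: query text plus a list of values.
positional = ("SELECT name FROM item WHERE id = ?", [2])
# Named style: query text plus a dict of values.
named = ("SELECT name FROM item WHERE id = :id", {"id": 2})

# Either tuple unpacks straight into execute().
results = [conn.execute(query, values).fetchall() for query, values in (positional, named)]
```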

QueryOptions

Bases: BaseModel

Configuration options for a query.

QueryParameter

Bases: BaseModel

Model for managing query parameters.

QueryParameterType

Bases: Enum

Allow for some basic type casting in query parameters.

docma.url_fetchers

URL fetchers are used during the compile phase to inject template content referenced by URLs.

See Dynamic Content Generation for more information.

If a given URL scheme does not have a custom fetcher, the default URL fetcher is used.

URL fetchers have the following signature:

URL Fetcher Signature

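from typing import Any
from urllib.parse import ParseResult

from docma.url_fetchers import url_fetcher

# Name the function anything other than url_fetcher to avoid shadowing the decorator.
@url_fetcher(*schemes)
def my_fetcher(purl: ParseResult, context: DocmaRenderContext) -> dict[str, Any]:
    ...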
Name Type Description
purl urllib.parse.ParseResult The parsed URL.
context DocmaRenderContext The template package. This allows the content generator to access files in the template, if required.

URL fetchers return the value required of a Weasyprint URL fetcher. i.e. a dictionary containing (at least):

  • string: The bytes of the content (yes ... it says string but it's bytes).

  • mimetype: The MIME type of the content.

URL fetchers should raise a DocmaUrlFetchError on failure.
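The body of a fetcher might look like the following sketch. It is deliberately left undecorated and free of docma imports so that it runs standalone. The hello scheme and the function name are hypothetical, and in a real fetcher the function would be registered with @url_fetcher and would raise DocmaUrlFetchError on failure.

```python
from typing import Any
from urllib.parse import ParseResult, parse_qs, urlparse


def hello_fetcher(purl: ParseResult, context: Any) -> dict[str, Any]:
    """Hypothetical fetcher for URLs such as hello://world?name=docma."""
    name = parse_qs(purl.query).get("name", [purl.netloc or "world"])[0]
    return {
        "string": f"<p>Hello {name}</p>".encode("utf-8"),  # Bytes, not str
        "mimetype": "text/html",
    }


result = hello_fetcher(urlparse("hello://world?name=docma"), context=None)
```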

docma.url_fetchers.get_url_fetcher_for_scheme

get_url_fetcher_for_scheme(scheme: str) -> Callable

Get URL fetcher for the specified URL scheme.

docma.url_fetchers.url_fetcher

url_fetcher(*schemes: str) -> Callable

Register a URL fetcher for the specified URL schemes.
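Registration by scheme with a default fallback is a common decorator-registry pattern, sketched below. All names here (register_fetcher, fetcher_for_scheme, default_fetcher, the "demo" schemes) are hypothetical stand-ins and do not reflect docma's internals.

```python
from __future__ import annotations

from typing import Any, Callable

_FETCHERS: dict[str, Callable] = {}


def register_fetcher(*schemes: str) -> Callable:
    """Decorator factory: register one function for several URL schemes."""
    def decorator(func: Callable) -> Callable:
        for scheme in schemes:
            _FETCHERS[scheme.lower()] = func
        return func
    return decorator


def default_fetcher(purl: Any, context: Any) -> dict[str, Any]:
    """Fallback used when no custom fetcher is registered."""
    return {"string": b"", "mimetype": "application/octet-stream"}


def fetcher_for_scheme(scheme: str) -> Callable:
    """Look up the fetcher for a scheme, falling back to the default."""
    return _FETCHERS.get(scheme.lower(), default_fetcher)


@register_fetcher("demo", "demo2")
def demo_fetcher(purl: Any, context: Any) -> dict[str, Any]:
    return {"string": b"demo", "mimetype": "text/plain"}
```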