deigma

A type-safe templating library for python

δεῖγμᾰ • (deîgmă) n (genitive δείγμᾰτος); third declension Pronunciation: IPA(key): /dêːŋ.ma/ (DHEEG-mah, THEEG-mah)

specimen, sample pattern

Installation

For now, install deigma directly from GitHub:

pip install git+https://github.com/getml/deigma

Getting started

The template decorator lets you inject templates into dataclasses:

Hello, world!

from deigma import template

@template(
    """
    Hello, {{ name }}!
    """
)
class HelloTemplate:
    name: str

Rendering templates

Templates are rendered automatically whenever they are cast to str. It doesn't matter whether this happens explicitly or implicitly:

str(HelloTemplate(name="world"))
# 'Hello, world!'

f"{HelloTemplate(name='world')}"
# 'Hello, world!'

print(HelloTemplate(name="world"))
# Hello, world!
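
The behavior above follows from Python's string protocol rather than from anything template-specific. As a minimal sketch (Greeting is a hypothetical stand-in that renders with str.format, not a deigma template), any class that defines __str__ is picked up by str(), f-strings and print() alike, while its repr stays the dataclass repr:

```python
from dataclasses import dataclass


@dataclass
class Greeting:
    # Hypothetical stand-in for a template class; not part of deigma.
    name: str

    def __str__(self) -> str:
        # Rendering happens here, so every string conversion triggers it.
        return "Hello, {name}!".format(name=self.name)


greeting = Greeting(name="world")

explicit = str(greeting)   # explicit cast
implicit = f"{greeting}"   # implicit cast: format() falls back to str()
shown = repr(greeting)     # the generated dataclass repr is left untouched
```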

Note that, as templates are just types (dataclasses, to be precise), the constructor returns an instance of the template, i.e. a template with all variables bound to the values passed to the constructor:

HelloTemplate(name="world")
# HelloTemplate(name='world')

Of course, you can also bind templates to variables:

hello_world = HelloTemplate(name="world")
hello_world
# HelloTemplate(name='world')

To render the bound template, you just cast it to a string:

str(hello_world)
# 'Hello, world!'

Defining template sources

Inline

You can supply the template source inline. source is the only positional argument:

@template(
    """
    Hello, {{ name }}!
    """
)
class HelloTemplate:
    name: str

You can also supply the template source as a keyword argument:

@template(source="Hello, {{ name }}!")
class HelloTemplate:
    name: str

Note

If you supply your source as a multiline string, all lines should have the same amount of leading whitespace. Otherwise the template cleanup (carried out with the standard library's inspect.cleandoc) will not strip the indentation as intended.

@template(
    source=(
        """
        Hello, {{ name }}!
        """
    )
)
class HelloTemplate:
    name: str
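
To see why the note above matters, here is a small sketch of what inspect.cleandoc does to evenly and unevenly indented sources. It only exercises the standard library function; how deigma calls it internally is not shown here.

```python
from inspect import cleandoc

# Every line shares the same 4-space indent, so it is stripped completely.
even = cleandoc(
    """
    Hello, {{ name }}!
    """
)

# The second line is indented less, so only the common 2 spaces are removed
# and the first line keeps 2 stray spaces.
uneven = cleandoc("\n    Hello,\n  {{ name }}!\n")
```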

In a separate file

You can also load the template source from a file:

{# hello_template.jinja #}
Hello, {{ name }}!

@template(path="hello_template.jinja")
class HelloTemplate:
    name: str

Note

Be careful not to mix up the source and path arguments, as both can be strings.

Features

Type-safe templating

Static type checking

As templates are just dataclasses, you get all the benefits of static type checking.

HelloTemplate(nme="world")
             # ^--- squiggly underline here

HelloTemplate(name=1)
             # ^--- squiggly underline here

Template definition time validation

Mismatches between template variables and fields are caught at import time:

from deigma import template

@template(
    """
    Hello, {{ name }}!
    """
)
class HelloTemplate:
    nam: str
# ValueError: Template variables mismatch. Template fields must match variables in source:
# 
# fields on type: {'nam'}, variables in source: {'name'}
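
The check above can be pictured as a set comparison between the dataclass fields and the variables found in the source. The sketch below is an assumption about the general idea, not deigma's implementation: check_fields_match_source is a hypothetical helper, and it uses a simple regular expression that only recognizes plain {{ name }} placeholders instead of parsing the template with Jinja.

```python
import re
from dataclasses import dataclass, fields

# Simplification: matches the first identifier after "{{" only.
_VARIABLE = re.compile(r"\{\{\s*([A-Za-z_]\w*)")


def check_fields_match_source(cls: type, source: str) -> None:
    """Hypothetical helper: raise if fields and template variables differ."""
    field_names = {f.name for f in fields(cls)}
    variables = set(_VARIABLE.findall(source))
    if field_names != variables:
        raise ValueError(
            f"fields on type: {field_names}, variables in source: {variables}"
        )


@dataclass
class Good:
    name: str


@dataclass
class Typo:
    nam: str


check_fields_match_source(Good, "Hello, {{ name }}!")  # passes silently

try:
    check_fields_match_source(Typo, "Hello, {{ name }}!")
except ValueError as error:
    mismatch_message = str(error)
```

Because the comparison only needs the class and the source, it can run when the decorator is applied, which is why a typo surfaces at import time rather than at render time.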

Runtime validation

Deigma uses pydantic under the hood to validate the data passed to the template at runtime. This ensures that the data passed to the template is always valid. This also means you can use all pydantic features (like field constraints) in your templates:

from pydantic import Field

@template(
    """
    Hello, {{ name }}!
    """
)
class HelloTemplate:
    name: str = Field(min_length=5)

print(HelloTemplate(name="world"))
# Hello, world!

print(HelloTemplate(name="Li"))
# ValidationError: 1 validation error for HelloTemplate
# name
#   String should have at least 5 characters [type=string_too_short, input_value='Li', input_type=str]
#     For further information visit https://errors.pydantic.dev/2.10/v/string_too_short

Serialization

With deigma, the idea is to keep serialization logic out of your business logic and also out of your template sources. Having the serialization logic inside your template sources results in:

  • templates being harder to read and understand
  • templates being harder to maintain
  • templates being harder to test
  • templates being harder to reuse
  • ...

So, instead of doing this:

@template(
    """
    {{ user | tojson }}
    """
)
class RawUserTemplate:
    user: dict[str, str]

You can do this:

@template(
    """
    {{ user }}
    """,
    serialize=serialize_json,
)
class UserTemplate:
    user: User

The idea is that in template sources, you just reference objects. The representation of these objects is then handled by the template engine. This way, you can keep your templates clean and readable. As serializers are just injected as a dependency into the template, you can easily change the serialization behavior for all templates by changing the serializer in the template decorator. This also makes it straightforward to write custom serializers (see below).

Advanced usage

SerializationProxy

By default, for rendering, deigma passes down proxies to the compiled (jinja) template. Proxies try to mimic the behavior of the original object as closely as possible, but carry some special features that make them particularly useful for serialization:

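  1. Field serializers are applied before accessing a field. This allows for something like this:

from dataclasses import dataclass
from typing import Annotated

from pydantic import PlainSerializer

SQLKeywordName = Annotated[str, PlainSerializer(lambda keyword: keyword.upper())]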


@dataclass
class SQLKeyword:
    name: SQLKeywordName
    description: str


@template(
    """
    # SQL Keywords
    {% for keyword in keywords %}
    - {{ keyword.name }}: {{ keyword.description }}
    {% endfor %}
    """
)
class SQLKeywordListingTemplate:
    keywords: list[SQLKeyword]


keywords = [SQLKeyword(name="select", description="The select clause ..."), ...]

print(SQLKeywordListingTemplate(keywords=keywords))
# # SQL Keywords
#
# - SELECT: The select clause ...
# ...

Without the proxy, the field serializer would get lost at runtime when accessing the field (in this case name) in the template. With the proxy, the serializer is applied before accessing the field, so the field is already serialized when it is accessed in the template.

But field serializers are even applied when rendering the compound object natively:

@template("{{ keywords }}")
class LiteralSQLKeywordListingTemplate:
    keywords: list[SQLKeyword]

print(LiteralSQLKeywordListingTemplate(keywords=keywords))
# [{'name': 'SELECT', 'description': 'The select clause ...'}, ...]

  2. Pydantic CoreSchemas (i.e. serializers and validators) are built once and cached afterwards. This makes constructing TypeAdapters in custom serializers very efficient. Generally, the CoreSchema is already known at template definition time, so SerializationProxy can receive a schema (via a TypeAdapter) when it is built. See Template lifecycle for more details.

  3. For further performance improvements, SerializationProxy also caches the serialized data itself at template instantiation time, and uses this data if the underlying object is immutable. With this, serialization at runtime effectively becomes a dictionary lookup.
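
The proxy idea can be pictured with a small stand-in. The sketch below is not deigma's SerializationProxy: FieldSerializingProxy and its serializers mapping are hypothetical names, and no pydantic schema is involved. It only illustrates applying a field serializer on attribute access and caching the result.

```python
from dataclasses import dataclass
from typing import Any, Callable


class FieldSerializingProxy:
    """Hypothetical stand-in; the real proxy is driven by pydantic schemas."""

    def __init__(self, target: Any, serializers: dict[str, Callable[[Any], Any]]) -> None:
        self._target = target
        self._serializers = serializers
        self._cache: dict[str, Any] = {}

    def __getattr__(self, name: str) -> Any:
        # __getattr__ only runs for names not found on the proxy itself,
        # which here means the fields of the wrapped object.
        if name.startswith("_"):
            raise AttributeError(name)
        if name not in self._cache:
            value = getattr(self._target, name)
            serializer = self._serializers.get(name)
            self._cache[name] = serializer(value) if serializer else value
        return self._cache[name]


@dataclass(frozen=True)
class SQLKeyword:
    name: str
    description: str


calls = []


def upper(value: str) -> str:
    calls.append(value)  # record each real serializer invocation
    return value.upper()


proxy = FieldSerializingProxy(
    SQLKeyword(name="select", description="The select clause ..."),
    serializers={"name": upper},
)
first = proxy.name
second = proxy.name  # served from the cache; upper() is not called again
```

Caching like this is only safe when the wrapped object cannot change, which is why the sketch uses a frozen dataclass.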

Note

SerializationProxy is a highly experimental feature and might break in subtle ways. Currently, deigma defaults to using them (we might swap this default in the future). You can opt out of this globally by setting DEIGMA_USE_PROXY=0 in your environment, or by setting deigma.template.USE_PROXY = False. You can also opt out on the template level by supplying the argument use_proxy=False to the template decorator. With SerializationProxy disabled, we pass down the already serialized data directly to the template. In this case type information is lost. If you encounter any issues, please report them.

Template lifecycle

TODO

With SerializationProxy

Without SerializationProxy

Custom serialization

By default, template variables are serialized using str. You can inject serializers into templates in two ways.

  1. On the template level, by passing a serialize function to the template decorator:

import json
from functools import partial
from typing import TypedDict

class User(TypedDict):
    first_name: str
    last_name: str

@template(
    """
    {{ user }}
    """,
    serialize=partial(json.dumps, indent=2),
)
class UserTemplate:
    user: User

print(UserTemplate(user=User(first_name="Li", last_name="Si")))
# {
#   "first_name": "Li",
#   "last_name": "Si"
# }

Template serializers are applied by injecting the serializer into the Jinja environment and using it as a filter applied last on all template variable occurrences (auto_serialize, see AutoSerializeExtension for details). This means serializers are applied lazily inside the template environment at the template variable level whenever a variable is interpolated. So whenever a {{ var }} is encountered, the serializer is automatically applied to var before rendering.

This ensures consistent application of serializers and eliminates the need to apply them manually. It also simplifies writing custom serializers (see below) and allows centralized control of serialization. You can easily change the serializer for all variables in a template by modifying the template's serialize attribute. For custom field-level serialization, see below.

Note that since templates are rendered through jinja2, you can manually apply additional filters if necessary. However, this is not recommended, as it can lead to inconsistent serialization behavior, break separation of concerns, compromise static analysis capabilities, and make your code harder to understand.
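
As a rough illustration of "the serializer is applied to every interpolated variable", the sketch below uses a tiny regex-based renderer instead of Jinja. The render function and its signature are hypothetical and far simpler than a Jinja extension; the point is only that the serializer is injected once and applied at each placeholder, so the template source never mentions it.

```python
import json
import re
from functools import partial
from typing import Any, Callable

# Simplification: only plain `{{ name }}` placeholders are supported.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(source: str, context: dict[str, Any], serialize: Callable[[Any], str] = str) -> str:
    """Hypothetical renderer: apply `serialize` to every interpolated variable."""

    def substitute(match: re.Match) -> str:
        return serialize(context[match.group(1)])

    return _PLACEHOLDER.sub(substitute, source)


user = {"first_name": "Li", "last_name": "Si"}

# Same source, different injected serializers.
as_str = render("{{ user }}", {"user": user})
as_json = render("{{ user }}", {"user": user}, serialize=partial(json.dumps, indent=2))
```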

  2. On the field level, by leveraging pydantic's serialization capabilities:

  • Using field_serializer:

from typing import Annotated

from pydantic import PlainSerializer, field_serializer

@template(
    """
    {{ user }}
    """
)
class UserTemplate:
    user: User
    
    @field_serializer("user")
    def inline_user(self, user: User) -> str:
        # User is a TypedDict, so its values are accessed by key
        return f"{user['first_name']} {user['last_name']}"

print(UserTemplate(user=User(first_name="Li", last_name="Si")))
# Li Si

  • Using PlainSerializer:

UserInline = Annotated[User, PlainSerializer(lambda user: f"{user['first_name']} {user['last_name']}")]

@template(
    """
    {{ user }}
    """
)
class UserTemplate:
    user: UserInline

print(UserTemplate(user=User(first_name="Li", last_name="Si")))
# Li Si

Built-in serializers

Deigma comes with a set of built-in serialization functions that you can use out of the box:

  • serialize_str: Serializes objects to strings
  • serialize_repr: Serializes objects to their repr
  • serialize_json: Serializes objects to JSON
  • serialize_json_schema: Serializes objects to JSON schema
  • serialize_md_json: Serializes objects to JSON, wrapped in markdown code fences
  • serialize_md_json_schema: Serializes objects to JSON schema, wrapped in markdown code fences

Writing custom serializers

Serializers are simple by design and are applied at the template variable level, so writing one requires no special knowledge or API. A serializer is just a function that takes a Serializable (a data structure that pydantic's serialization machinery can handle) and returns a string. Here's the generic signature of a serializer:

def serialize(obj: Serializeable) -> str:
    ...
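
As a concrete instance of that signature, here is a minimal sketch of a custom serializer built only on the standard library. The name serialize_compact_json is hypothetical, and typing.Any stands in for deigma's own serializable type; such a function would be handed to the decorator through the serialize argument shown earlier.

```python
import json
from typing import Any


def serialize_compact_json(obj: Any) -> str:
    """Hypothetical serializer: JSON without whitespace, keys sorted."""
    return json.dumps(obj, separators=(",", ":"), sort_keys=True)


compact = serialize_compact_json({"last_name": "Si", "first_name": "Li"})
```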

Replacing template instance data

As bound template instances are just dataclasses, you can replace their fields using the replace function from the dataclasses module:

from dataclasses import replace

hello_monde = replace(hello_world, name="Monde")
hello_monde
# HelloTemplate(name='Monde')

str(hello_monde)
# 'Hello, Monde!'

For convenience, deigma re-exports the replace function, which can save you an extra import line:

from deigma import template, replace
# ...

Replacing template type data

You can also replace data on the template type. For this, use the with_ function:

from deigma import with_

BonjourTemplate = with_(HelloTemplate, source="Bonjour, {{ name }}!")
print(BonjourTemplate(name="Monde"))
# Bonjour, Monde!

with_(BonjourTemplate, serialize=serialize_json)(name="Monde")
# '"Bonjour, Monde!"'

Contributing

Development
