API Reference¶

Convert data structures to native language literal syntax.

class literalizer.CommentConfig(prefix: str, suffix: str)¶

Configuration for language comment syntax.

prefix: str¶

suffix: str¶

class literalizer.DateFormatConfig(formatter: ~collections.abc.Callable[[~datetime.date], str], preamble_lines: tuple[str, ...] = (), type_produced: type = <class 'datetime.date'>)¶

Configuration for a single date format.

formatter: Callable[[date], str]¶

preamble_lines: tuple[str, ...] = ()¶

type_produced¶: alias of date

class literalizer.DatetimeFormatConfig(formatter: ~collections.abc.Callable[[~datetime.datetime], str], preamble_lines: tuple[str, ...] = (), type_produced: type = <class 'datetime.datetime'>)¶

Configuration for a single datetime format.

formatter: Callable[[datetime], str]¶

preamble_lines: tuple[str, ...] = ()¶

type_produced¶: alias of datetime

class literalizer.DictFormatConfig(open_fn: Callable[[dict[str, Value]], str], close: str, format_entry: Callable[[str, Value, str], str], empty_dict: str | None, preamble_lines: tuple[str, ...], narrowed_open: str | None)¶

Configuration for dict formatting.

close: str¶

empty_dict: str | None¶

format_entry: Callable[[str, Value, str], str]¶

narrowed_open: str | None¶

open_fn: Callable[[dict[str, Value]], str]¶

preamble_lines: tuple[str, ...]¶

class literalizer.Language(*args, **kwargs)¶

Protocol describing how a language formats scalar literals and sequences.

Predefined instances for common languages are available as module-level constants in literalizer.languages (e.g. PYTHON, JAVASCRIPT). To support additional languages or override defaults, write a class that provides all the required attributes.

property bytes_formats: type[Enum]¶: Enum class whose members list the bytes formats this language supports.

comment_config: CommentConfig¶: Configuration for the language’s comment syntax.

property comment_format: Enum¶: The comment format chosen for this language instance.

property comment_formats: type[Enum]¶: Enum class whose members list the comment formats this language supports.

compute_body_preamble: Callable[[frozenset[type], Value], tuple[str, ...]]¶

Computes body-preamble lines based on which types are present in the data. Most languages build this from scalar_body_preamble; Haskell overrides it to compose the data Val declaration and imports dynamically.

The second argument is the original data value, allowing implementations to inspect actual values when needed (e.g. to determine whether datetime microsecond-precision imports are required).
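To illustrate why the data value is passed alongside the type set, the sketch below shows a hypothetical callable with the same two-argument shape. It is not the library's implementation: the function names and the placeholder preamble string are assumptions made for this example, and only plain dicts and lists are walked.

```python
import datetime
from typing import Any

# Placeholder line; a real language would supply its own import or declaration.
MICROSECOND_LINES: tuple[str, ...] = ("-- lines needed for microsecond precision",)


def has_microseconds(data: Any) -> bool:
    """Return True if any datetime nested in data has non-zero microseconds."""
    if isinstance(data, datetime.datetime):
        return data.microsecond != 0
    if isinstance(data, dict):
        return any(has_microseconds(value) for value in data.values())
    if isinstance(data, list):
        return any(has_microseconds(value) for value in data)
    return False


def example_compute_body_preamble(
    types_present: frozenset[type], data: Any
) -> tuple[str, ...]:
    """Hypothetical callable matching Callable[[frozenset[type], Value], tuple[str, ...]]."""
    # The type set gives a cheap first check; the data gives the exact answer.
    if datetime.datetime not in types_present:
        return ()
    return MICROSECOND_LINES if has_microseconds(data) else ()
```

The type set alone cannot distinguish a datetime with microseconds from one without, which is the kind of case the second argument exists for.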

property date_formats: type[Enum]¶: Enum class whose members list the date formats this language supports.

property datetime_formats: type[Enum]¶: Enum class whose members list the datetime formats this language supports.

property declaration_style: Enum¶: The declaration style chosen for this language instance.

property declaration_styles: type[Enum]¶: Enum class whose members list the declaration style options this language supports.

property dict_entry_style: Enum¶: The dict entry style chosen for this language instance.

property dict_entry_styles: type[Enum]¶: Enum class whose members list the dict entry style options this language supports.

property dict_format: Enum¶: The dict/map format chosen for this language instance.

dict_format_config: DictFormatConfig¶: Configuration for dict formatting.

property dict_formats: type[Enum]¶: Enum class whose members list the dict/map format options this language supports.

element_separator: str¶: The separator placed between elements in inline sequences.

extension: str¶: The file extension for this language, including the leading dot.

false_literal: str¶: The literal representing false/False.

property float_format: Enum¶: The float format chosen for this language instance.

property float_formats: type[Enum]¶: Enum class whose members list the float format options this language supports.

property format_bytes: Callable[[bytes], str]¶: Callable that formats a bytes value as a string literal.

property format_date: Callable[[date], str]¶: Callable that formats a datetime.date as a string literal.

property format_datetime: Callable[[datetime], str]¶: Callable that formats a datetime.datetime as a string literal.

property format_float: Callable[[float], str]¶: Callable that formats a float value as a literal.

property format_integer: Callable[[int], str]¶: Callable that formats an int value as a literal.

property format_ordered_map_entry: Callable[[str, Value, str], str]¶: Callable that formats one ordered-map entry.

property format_sequence_entry: Callable[[Value, str], str]¶: Callable that formats a sequence entry.

property format_set_entry: Callable[[Value, str], str]¶: Callable that formats a set entry.

property format_string: Callable[[str], str]¶: Callable that formats a string value as a quoted literal.

property format_variable_assignment: Callable[[str, str, Value], str]¶

Callable that formats an assignment to an existing variable.

Called as format_variable_assignment(name, value, data) where name is the variable name, value is the already-formatted literal value, and data is the original parsed data structure.

property format_variable_declaration: Callable[[str, str, Value], str]¶

Callable that formats a new variable declaration.

Called as format_variable_declaration(name, value, data) where name is the variable name, value is the already-formatted literal value, and data is the original parsed data structure.
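As a rough illustration of the (name, value, data) call shape, the two functions below are hypothetical JavaScript-style formatters. Their names and the trailing semicolon are assumptions for this sketch, not the output of any built-in language.

```python
from typing import Any


def js_style_declaration(name: str, value: str, data: Any) -> str:
    """Format a new variable declaration; value is already a formatted literal."""
    # data is accepted to match the documented call shape but is unused here.
    return f"const {name} = {value};"


def js_style_assignment(name: str, value: str, data: Any) -> str:
    """Format an assignment to a variable that already exists."""
    return f"{name} = {value};"


print(js_style_declaration("x", "[1, 2, 3]", [1, 2, 3]))
print(js_style_assignment("x", "[1, 2, 3]", [1, 2, 3]))
```

A language whose declaration syntax depends on the data (for example, one that needs a type annotation) could inspect the third argument instead of ignoring it.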

indent: str¶: The indentation step for elements inside delimiters in multi-line structures (e.g. "    " for a 4-space indent).

indent_closing_delimiter: bool¶: Whether to indent the closing delimiter of multi-line structures by one indent step.

property integer_format: Enum¶: The integer format chosen for this language instance.

property integer_formats: type[Enum]¶: Enum class whose members list the integer format options this language supports.

property line_ending: Enum¶: The line ending option chosen for this language instance.

property line_endings: type[Enum]¶: Enum class whose members list the line ending options this language supports.

null_literal: str¶: The literal representing null/None.

property numeric_separator: Enum¶: The numeric separator option chosen for this language instance.

property numeric_separators: type[Enum]¶: Enum class whose members list the numeric separator options this language supports.

ordered_map_format_config: OrderedMapFormatConfig¶: Configuration for ordered-map formatting.

property pygments_name: str | None¶

The Pygments lexer short name for syntax highlighting.

None if Pygments does not support this language.

scalar_body_preamble: dict[type, tuple[str, ...]]¶

Maps Python scalar types to body-preamble lines that are prepended to the generated code.

Most languages leave this empty. Haskell uses it for typeclass instance definitions.

scalar_preamble: dict[type, tuple[str, ...]]¶: Maps Python scalar types to the preamble lines required when that type appears in the data. For example, a language that needs import datetime when dates are present would include {datetime.date: ("import datetime",)}.
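The mapping above can be read as a lookup keyed by the types found in the data. The sketch below shows one way such a lookup might work; the helper name preamble_for is hypothetical, and the library's own collection logic may differ.

```python
import datetime

# The example mapping from the description above.
scalar_preamble: dict[type, tuple[str, ...]] = {
    datetime.date: ("import datetime",),
}


def preamble_for(types_present: frozenset[type]) -> tuple[str, ...]:
    """Collect preamble lines for the types present, without duplicates."""
    lines: list[str] = []
    for scalar_type, type_lines in scalar_preamble.items():
        if scalar_type not in types_present:
            continue
        for line in type_lines:
            if line not in lines:
                lines.append(line)
    return tuple(lines)


print(preamble_for(frozenset({int, str})))
print(preamble_for(frozenset({datetime.date, str})))
```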

property sequence_format: SequenceFormat¶: The sequence format chosen for this language instance.

sequence_format_config: SequenceFormatConfig¶: Configuration for the chosen sequence format.

property sequence_formats: type[Enum]¶: Enum class whose members list the sequence formats this language supports.

property sequence_open: Callable[[list[Value]], str]¶

Callable that returns the opening delimiter for a sequence.

Receives the list of items about to be formatted, so the delimiter can depend on the element types when needed. For a fixed delimiter use fixed_sequence_open().

property set_format: Enum¶: The set format chosen for this language instance.

set_format_config: SetFormatConfig¶: Configuration for the chosen set format.

property set_formats: type[Enum]¶: Enum class whose members list the set formats this language supports.

skip_null_dict_values: bool¶: Whether to omit dict entries whose value is None.

special_float_preamble: tuple[str, ...]¶: Preamble lines added only when special float values (inf, -inf, nan) appear in the data. Most languages set this to (). Languages whose special-float literals require imports (e.g. Go needs import "math") populate this field so the import is only emitted when actually needed.
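The "only when actually needed" condition amounts to checking the data for inf, -inf, or nan. The sketch below is a hypothetical version of that check, using the Go import from the description as the example preamble; the function names are assumptions.

```python
import math
from typing import Any

# Example value taken from the description above.
GO_STYLE_SPECIAL_FLOAT_PREAMBLE: tuple[str, ...] = ('import "math"',)


def contains_special_float(data: Any) -> bool:
    """Return True if data contains inf, -inf, or nan at any nesting depth."""
    if isinstance(data, float):
        return math.isinf(data) or math.isnan(data)
    if isinstance(data, dict):
        return any(contains_special_float(value) for value in data.values())
    if isinstance(data, (list, tuple)):
        return any(contains_special_float(value) for value in data)
    return False


def special_float_lines(data: Any) -> tuple[str, ...]:
    """Emit the preamble only when a special float is present."""
    return GO_STYLE_SPECIAL_FLOAT_PREAMBLE if contains_special_float(data) else ()
```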

static_body_preamble: Sequence[str]¶: Lines that are always prepended to the generated code, regardless of what types appear in the data. Appears after the header preamble but before the code body. Use an empty sequence when none are needed.

static_preamble: Sequence[str]¶: Lines (imports, package declarations, etc.) that are always emitted before the generated code, regardless of what types appear in the data. Use an empty sequence when none are needed.

property string_format: Enum¶: The string format chosen for this language instance.

property string_formats: type[Enum]¶: Enum class whose members list the string format options this language supports.

supports_collection_comments: bool¶

Whether the language supports comments inside collection initializers.

When False, YAML comments on collection elements are emitted as standalone comment lines immediately before the collection (or before the variable declaration when a variable name is supplied) rather than being placed inside the {...} block.

supports_scalar_before_comments: bool¶

Whether the language supports a line comment between the assignment operator and the value on the next line.

For example, in JavaScript, const x = // note\n42; is valid because the parser continues the incomplete expression past the line comment. In Python, x = # note\n42 is a syntax error because the comment runs to the end of the line, leaving the assignment with no right-hand side.

When False, YAML comments that appear before a scalar value are emitted as standalone comment lines immediately before the variable declaration rather than between the = and the value.
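The Python case can be checked directly with the built-in compile(). The helper name is_valid_python is introduced only for this sketch.

```python
def is_valid_python(source: str) -> bool:
    """Return True if source compiles as a Python module."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# A comment between "=" and the value leaves the assignment incomplete.
print(is_valid_python("x = # note\n42"))  # False

# Moving the comment to its own line before the statement is valid.
print(is_valid_python("# note\nx = 42"))  # True
```

The second form corresponds to the fallback placement described above: the comment becomes a standalone line before the declaration.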

supports_scalar_inline_comments: bool¶

Whether the language supports a trailing line comment on a scalar value without breaking surrounding syntax.

For example, in JavaScript const x = 42 // note is valid because no closing token follows on the same line. In C ((_CVal){.i = 42 // note}); is a syntax error because the // comment consumes the closing });.

When False, YAML inline comments on scalar values are emitted as standalone comment lines immediately before the variable declaration rather than being appended after the value.

property trailing_comma: Enum¶: The trailing comma option chosen for this language instance.

trailing_comma_config: TrailingCommaConfig¶

Configuration for trailing-comma behavior.

Trailing commas are only added to collection formats that support them. See TrailingCommaConfig for details.

property trailing_commas: type[Enum]¶: Enum class whose members list the trailing comma options this language supports.

true_literal: str¶: The literal representing true/True.

type_hint_collection_preamble_lines: Callable[[frozenset[type]], tuple[str, ...]]¶

Callable that receives the set of collection types that have empty instances in the data and returns preamble lines needed for type-hint annotations.

Most languages return () unconditionally; Python uses this to emit from typing import Any only when the specific empty collection types present actually require it.

property variable_type_hints: Enum¶: The variable type hint option chosen for this language instance.

property variable_type_hints_formats: type[Enum]¶: Enum class whose members list the variable type hint options this language supports.

class literalizer.LanguageCls¶

Metaclass that declares the nested format Enum class attributes.

Language classes use metaclass=LanguageCls so that downstream code can write dict[str, LanguageCls] and access cls.DateFormats, cls.SequenceFormats, etc. without cast or type: ignore.

BytesFormats: type[Enum]¶

CommentFormats: type[Enum]¶

DateFormats: type[Enum]¶

DatetimeFormats: type[Enum]¶

DeclarationStyles: type[Enum]¶

DictEntryStyles: type[Enum]¶

DictFormats: type[Enum]¶

EmptyDictKey: type[Enum]¶

FloatFormats: type[Enum]¶

IntegerFormats: type[Enum]¶

LineEndings: type[Enum]¶

NumericLiteralSuffixes: type[Enum]¶

NumericSeparators: type[Enum]¶

SequenceFormats: type[Enum]¶

SetFormats: type[Enum]¶

StringFormats: type[Enum]¶

TrailingCommas: type[Enum]¶

VariableTypeHints: type[Enum]¶

extension: str¶

pygments_name: str | None¶

supports_default_dict_key_type: bool¶

supports_default_dict_value_type: bool¶

supports_default_ordered_map_value_type: bool¶

supports_default_sequence_element_type: bool¶

supports_default_set_element_type: bool¶

supports_non_printable_ascii_dict_keys: bool¶

class literalizer.LiteralizeResult(code: str, preamble: tuple[str, ...], body_preamble: tuple[str, ...])¶

Result of converting data to a native language literal.

property bare_code: str¶

The literal text without body_preamble prepended.

Identical to code when body_preamble is empty.

body_preamble: tuple[str, ...]¶

Type-definition lines (e.g. F#’s type Val = …, Haskell’s data Val = … and typeclass instances) that are prepended to code. Empty when none are needed.

Use bare_code to obtain the literal text without these lines.

code: str¶

The formatted literal text.

When a language defines scalar_body_preamble entries (e.g. Haskell typeclass instances), those lines are prepended to the code so they appear in the correct structural position.

preamble: tuple[str, ...]¶: Lines (imports, package declarations, etc.) that must precede the generated code. Empty when none are needed.
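Since preamble is returned separately from code, the caller decides how to combine them. The sketch below shows one possible approach, using plain values in place of a real LiteralizeResult. The helper name assemble_source and the blank line between the two parts are choices made for this example.

```python
def assemble_source(preamble: tuple[str, ...], code: str) -> str:
    """Join preamble lines and generated code into one source text."""
    if not preamble:
        return code
    # A blank line separates imports from the generated code.
    return "\n".join(preamble) + "\n\n" + code


print(assemble_source(("import datetime",), "x = datetime.date(2024, 1, 2)"))
```

Keeping the preamble separate is useful when several generated literals end up in the same file: the caller can merge and deduplicate the import lines before writing them once at the top.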

class literalizer.OrderedMapFormatConfig(open_str: str, close: str, preamble_lines: tuple[str, ...])¶

Configuration for ordered-map formatting.

close: str¶

open_str: str¶

preamble_lines: tuple[str, ...]¶

class literalizer.SequenceFormatConfig(sequence_open: Callable[[list[Value]], str], close: str, supports_heterogeneity: bool, single_element_trailing_comma: bool, supports_trailing_comma: bool, empty_sequence: str | None, preamble_lines: tuple[str, ...], format_entry: Callable[[Value, str], str], typed_opener_fallback: str | None, uses_typed_literal_for_scalars: bool, requires_uniform_record_shapes: bool)¶

Configuration for a single sequence format.

close: str¶

empty_sequence: str | None¶

format_entry: Callable[[Value, str], str]¶

preamble_lines: tuple[str, ...]¶

requires_uniform_record_shapes: bool¶

sequence_open: Callable[[list[Value]], str]¶

single_element_trailing_comma: bool¶

supports_heterogeneity: bool¶

supports_trailing_comma: bool¶

typed_opener_fallback: str | None¶

uses_typed_literal_for_scalars: bool¶

class literalizer.SetFormatConfig(set_open: Callable[[list[Value]], str], close: str, empty_set: str | None, preamble_lines: tuple[str, ...], set_opener_template: str)¶

Configuration for a single set format.

close: str¶

empty_set: str | None¶

preamble_lines: tuple[str, ...]¶

set_open: Callable[[list[Value]], str]¶

set_opener_template: str¶

with_typed_opener(*, type_to_opener: Callable[[type | ListType | DictType], str | None], fallback: str) → SetFormatConfig¶

Return a copy with set_open replaced by a typed opener.

The type_to_opener callable is used to infer the opener from the element type. When inference fails, fallback is used instead.
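The inference-with-fallback behavior can be sketched as follows. This is a simplified stand-in, not the library's code: it handles only plain Python element types (not ListType or DictType), assumes inference fails for empty or mixed-type inputs, and uses made-up opener strings.

```python
from typing import Any, Callable, Optional


def make_typed_open(
    *, type_to_opener: Callable[[type], Optional[str]], fallback: str
) -> Callable[[list[Any]], str]:
    """Return an opener that depends on the element type, with a fallback."""

    def set_open(items: list[Any]) -> str:
        element_types = {type(item) for item in items}
        # Inference is attempted only when all elements share one type.
        if len(element_types) == 1:
            opener = type_to_opener(element_types.pop())
            if opener is not None:
                return opener
        return fallback

    return set_open


def example_type_to_opener(element_type: type) -> Optional[str]:
    """Hypothetical mapping from element type to a typed opener."""
    return {int: "setOf<Int>(", str: "setOf<String>("}.get(element_type)


opener = make_typed_open(type_to_opener=example_type_to_opener, fallback="setOf(")
print(opener([1, 2]))
print(opener([1, "a"]))
```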

class literalizer.TrailingCommaConfig(multiline_trailing_comma: bool)¶

Configuration for trailing-comma behavior.

When multiline_trailing_comma is True, trailing commas are added to multiline collections where the chosen format supports them. Some sequence formats (e.g. Java’s List.of()) do not support trailing commas; in those cases the trailing comma is omitted regardless of this setting.

multiline_trailing_comma: bool¶
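The rule described above reduces to a conjunction: a trailing comma is emitted only when the setting is on and the format supports it. The renderer below is a hypothetical illustration of that rule, not the library's formatting code.

```python
def render_multiline(
    items: list[str],
    *,
    open_str: str,
    close: str,
    indent: str,
    multiline_trailing_comma: bool,
    supports_trailing_comma: bool,
) -> str:
    """Render already-formatted items as a multi-line collection."""
    use_trailing = multiline_trailing_comma and supports_trailing_comma
    lines = [open_str]
    for index, item in enumerate(items):
        is_last = index == len(items) - 1
        # Separators always follow non-final items; the final comma is optional.
        comma = "," if (not is_last or use_trailing) else ""
        lines.append(f"{indent}{item}{comma}")
    lines.append(close)
    return "\n".join(lines)


print(render_multiline(["1", "2"], open_str="[", close="]", indent="    ",
                       multiline_trailing_comma=True, supports_trailing_comma=True))
print(render_multiline(["1", "2"], open_str="List.of(", close=")", indent="    ",
                       multiline_trailing_comma=True, supports_trailing_comma=False))
```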

literalizer.fixed_dict_open(*, open_str: str) → Callable[[dict[str, Value]], str]¶

Return a dict_open callable that always returns open_str.

Use this as dict_open when the opening delimiter for dicts is a fixed string that does not depend on the dict contents.

Example: fixed_dict_open(open_str="{")({"a": 1}) -> "{".

literalizer.fixed_sequence_open(*, open_str: str) → Callable[[list[Value]], str]¶

Return a sequence_open callable that always returns open_str.

Use this as sequence_open when the opening delimiter for sequences is a fixed string that does not depend on the sequence contents.

Example: fixed_sequence_open(open_str="[")([1, 2, 3]) -> "[".

literalizer.fixed_set_open(*, open_str: str) → Callable[[list[Value]], str]¶

Return a set_open callable that always returns open_str.

Use this as set_open when the opening delimiter for sets is a fixed string that does not depend on the set contents.

Example: fixed_set_open(open_str="{")([1, 2, 3]) -> "{".
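The three fixed_*_open helpers share one shape: each takes a string and returns a callable that ignores its argument. The stand-in below, named fixed_open_sketch for this example, reproduces the documented behavior with a closure. It is an illustration of the pattern, not the library's source.

```python
from typing import Any, Callable


def fixed_open_sketch(*, open_str: str) -> Callable[[Any], str]:
    """Return a callable that always returns open_str."""

    def opener(_contents: Any) -> str:
        # The contents are ignored; the delimiter never varies.
        return open_str

    return opener


print(fixed_open_sketch(open_str="{")({"a": 1}))
print(fixed_open_sketch(open_str="[")([1, 2, 3]))
```

Returning a callable rather than a bare string lets fixed and content-dependent openers share one interface.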

literalizer.literalize_json(*, json_string: str, language: Language, pre_indent_level: int, include_delimiters: bool, variable_name: str | None, new_variable: bool, error_on_coercion: bool) → LiteralizeResult¶

Convert a JSON string to native language literal text.

Parameters:

json_string – A JSON string representing a scalar, array, or object.
language – A Language instance describing how to format literals. Use one of the built-in constants (e.g. PYTHON, GO) or provide your own.
pre_indent_level – Number of indent steps to prepend to every output line. For example, 2 with a 4-space indent produces an 8-space margin. Defaults to 0.
include_delimiters – If True, include the collection delimiters ([ … ] for arrays, { … } for dicts).
variable_name – If given, wrap the output in a variable declaration using the language’s format_variable_declaration or format_variable_assignment callable.
new_variable – If True (the default), use format_variable_declaration (e.g. const x = in JavaScript). If False, use format_variable_assignment (e.g. x =). Only relevant when variable_name is given.
error_on_coercion – If True, raise HeterogeneousCoercionError instead of silently coercing heterogeneous scalar collections to strings. Only has an effect when the language’s sequence format does not support heterogeneity.

Raises:

JSONParseError – If json_string is not valid JSON.
HeterogeneousCoercionError – If error_on_coercion is True and the data contains heterogeneous scalar collections that would be coerced.
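The pre_indent_level arithmetic can be made concrete with a short sketch. The helper apply_pre_indent is hypothetical and works on already-generated text; it only illustrates how a level and an indent step combine into a margin.

```python
def apply_pre_indent(code: str, *, indent: str, pre_indent_level: int) -> str:
    """Prepend pre_indent_level indent steps to every line of code."""
    margin = indent * pre_indent_level
    return "\n".join(margin + line for line in code.split("\n"))


literal = "[\n    1,\n    2,\n]"
# Level 2 with a 4-space indent gives an 8-space margin.
print(apply_pre_indent(literal, indent="    ", pre_indent_level=2))
```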

literalizer.literalize_yaml(*, yaml_string: str, language: Language, pre_indent_level: int, include_delimiters: bool, variable_name: str | None, new_variable: bool, error_on_coercion: bool) → LiteralizeResult¶

Convert a YAML string to native language literal text.

YAML comments are preserved in the output using the target language’s comment syntax. The comment prefix is read from the comment_prefix attribute of language (defaulting to "#" when the attribute is absent).

Parameters:

yaml_string – A YAML string representing a scalar, sequence, or mapping.
language – A Language instance describing how to format literals. Use one of the built-in constants (e.g. PYTHON, GO) or provide your own.
pre_indent_level – Number of indent steps to prepend to every output line. For example, 2 with a 4-space indent produces an 8-space margin. Defaults to 0.
include_delimiters – If True, include the collection delimiters ([ … ] for arrays, { … } for dicts).
variable_name – If given, wrap the output in a variable declaration using the language’s format_variable_declaration or format_variable_assignment callable.
new_variable – If True (the default), use format_variable_declaration (e.g. const x = in JavaScript). If False, use format_variable_assignment (e.g. x =). Only relevant when variable_name is given.
error_on_coercion – If True, raise HeterogeneousCoercionError instead of silently coercing heterogeneous scalar collections to strings. Only has an effect when the language’s sequence format does not support heterogeneity.

Raises:

YAMLParseError – If yaml_string is not valid YAML.
HeterogeneousCoercionError – If error_on_coercion is True and the data contains heterogeneous scalar collections that would be coerced.
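The interaction between error_on_coercion and a format's heterogeneity support can be sketched as a three-way decision. Everything here is a simplified assumption: the function and exception names are stand-ins, "heterogeneous" is taken to mean more than one Python type, and coercion is modeled with str().

```python
from typing import Any


class CoercionErrorSketch(Exception):
    """Stand-in for HeterogeneousCoercionError in this example."""


def coerce_scalars(
    items: list[Any], *, supports_heterogeneity: bool, error_on_coercion: bool
) -> list[Any]:
    """Return items unchanged, coerced to strings, or raise."""
    is_heterogeneous = len({type(item) for item in items}) > 1
    # Formats that allow mixed types never need coercion.
    if supports_heterogeneity or not is_heterogeneous:
        return list(items)
    if error_on_coercion:
        raise CoercionErrorSketch("mixed scalar types would be coerced to strings")
    return [str(item) for item in items]


print(coerce_scalars([1, "a"], supports_heterogeneity=True, error_on_coercion=True))
print(coerce_scalars([1, "a"], supports_heterogeneity=False, error_on_coercion=False))
```

Passing error_on_coercion=True is the safer choice when the generated literal must keep the original types, since silent coercion changes the meaning of the data.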

Exceptions¶

Exceptions raised by literalizer.

exception literalizer.exceptions.HeterogeneousCoercionError¶: Raised when a collection contains heterogeneous scalar types and the language would coerce them to strings, but the caller opted to receive an error instead.

exception literalizer.exceptions.InvalidDictKeyError¶

Raised when a dict key cannot be represented in the target language.

This includes empty-string keys and keys containing characters that the language’s label syntax does not support (e.g. control characters in Dhall backtick-quoted labels).

exception literalizer.exceptions.JSONParseError¶: Raised when a JSON string cannot be parsed.

exception literalizer.exceptions.NullInCollectionError¶: Raised when a collection contains null elements and the chosen format does not support them (e.g. Java’s List.of()).

exception literalizer.exceptions.ParseError¶: Raised when input cannot be parsed into a data structure.

exception literalizer.exceptions.YAMLParseError¶: Raised when a YAML string cannot be parsed.