Template Configurations for Alchemist¶
Template configurations in Alchemist allow you to extend or modify out-of-the-box conversion results using custom conversion templates. Templates are text files that use the Jinja templating engine to generate the desired output based on the provided configuration settings.
Understanding Templates¶
To use a template you need two things:
- A template configuration in the Alchemist configuration file.
- The template text, which may either be stored in a file or provided inline in the configuration.
The template text is processed by the Jinja templating engine. Alchemist provides a number of built-in variables and functions that can be used in the template:
- `node` - the node matched and being replaced by the template.
- `match` - if match patterns are used, this dictionary holds each value captured by a pattern (key matching the capture key).
- `render` - a function that will defer back to Alchemist to convert a node. Usually used to convert children of the matched node.
- `get_node_by_id` - a function that will return an AST node by its ID.
- `now` - current datetime in UTC.
- `none_rendered_str` - a special object that should be used to indicate that a node should not be rendered.
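For illustration, here is a minimal inline-template sketch combining these built-ins. Note that the `arguments` attribute and the `func` capture key are assumptions for this sketch, not part of any real dialect's AST:

```jinja
{# hypothetical: re-emit a call, converting each child via render() #}
{{ match["func"] }}({%- for arg in node.arguments -%}
{{ render(arg) }}{{ ", " if not loop.last else "" }}
{%- endfor -%})
```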
For a full reference on Jinja syntax, see the Jinja documentation.
A target dialect may define additional template features. For more information, refer to the dialect docs.
Template Config Options¶
| Option | Description |
|---|---|
| `template` | Name of the Jinja template file to use. If no extension is given, all of the following will be tried: `sql`, `py`, `yml`, `yaml`, `jinja`. Exactly one of `template` or `inline` must be set, except for the `skip` output type, for which neither is necessary. |
| `inline` | Inline template string to use. If `inline` is not empty, `template` must not be set. |
| `output_type` | Type of output to generate. For supported output types see Output Types. |
| `output_relative_path` | Relative path for the output file. Can be a `Path`, `str`, or `None`. If `None`, the template name + ID is used. Can also include `{var_name}` placeholders, which are substituted with a number of format variables, including all node properties and `source_relative_path`. This option is ignored for output types other than `file`. |
| `match_nodes` | List of configs used to match the current node by type and/or attribute values. Uses the same syntax as Node Filter Configs. If any of the configs includes this node (or none of them excludes it), it is considered a match. |
| `match_ids` | Deprecated! Use `match_nodes` instead. List of node content or regular IDs to match (`content_id` and `id` attributes). |
| `match_paths` | List of XPath match patterns. These patterns define which nodes should be processed by the template. For more information on how to write XPath patterns and examples, see the XPath Match Patterns section. |
| `match_patterns` | List of tree match patterns. For more information on how to write match patterns and examples, see the Match Patterns section. |
| `tags` | List of template tags. These tags are used to apply template filters. |
| `model_output_type` | One of the supported JSON model names when `output_type` is set to `MODEL`. The list of models depends on the dialect; refer to the dialect docs. |
Example Configuration¶
```yaml
converter:
  template_configs:
    - template: full_program
      output_type: "file"
      match_nodes:
        - include: "e237a0654a8e79ad04cc8050ea0cf235acb2ebaeeaca1a2681a6f2d199659fa6"
      output_relative_path: "{source_relative_path}/{source_file_name}.py"
    - template: intnx_runtime
      output_type: "snippet"
      match_paths:
        - SASSQLProcedure//SASCallExpression
      match_patterns:
        - '(SASCallExpression @func_name="INTNX" @arguments=[() -> interval, () -> start_from, () -> increment, () -> alignment])'
        - '(SASCallExpression @func_name="INTNX" @arguments=[() -> interval, () -> start_from, () -> increment])'
      tags:
        - "runtime"
    - inline: "SPLIT_PART(COL_NAME, '/', 2)"
      match_nodes:
        - include: "id_of_scan_expr"
    - inline: "some valid regex expression"
      match_nodes:
        - include: "id_of_prxchange_expr"
```
XPath Match Patterns¶
XPath match patterns are used to define which nodes in the AST should be processed by the template. XPath patterns in Alchemist are similar to XML XPath and consist of a sequence of paths separated by slashes. Each path has the format `@parent_field_name[index]type_name_or_pattern`.
- The slash is optional at the start of the XPath. If omitted, it is the same as prefixing the rest of the path with `//`.
- `//` is a wildcard (anywhere) that matches any path.
- `@parent_field_name` and `index` are optional.
- `type_name_or_pattern` is optional, except for the last path in the XPath. It may be a type name or a single node pattern (see below).
XPath Examples¶
Note: the following examples do not use real AST node types, which depend on the dialect. To get the real node types you can use the AST Explorer.
- `CallExpr/@arguments[0]Id` matches the first argument of a call expression at any level in the tree, which is of type `Id`.
- `(CallExpr @func_name="math.*")/@arguments[*]Id` is similar to the above, but matches any call expression whose function name matches the pattern `math.*`.
- `@arguments[*]CallExpr` matches any `CallExpr` as long as it is an argument (stored in a field with that name) of a parent expression.
- `CallExpr` matches any call expression at any level in the tree.
Match Patterns¶
Match patterns are used for two main purposes:
- To match a particular node in the AST based on its attributes and children.
- To extract values from within the matched node subtree for use in the conversion template.
Patterns are written using a syntax similar to a cross between S-expressions and regex. To understand the syntax and usage, let's explore some examples of valid patterns.
Pattern Syntax¶
In match patterns, the following syntax is used:
- Parentheses `()` are used to match an AST node. What's inside these brackets defines a node pattern. The minimal possible pattern is `()`, which will match any node.
- The opening parenthesis may be followed by one or more node types (class names) separated by `|`. Each class name will match the type itself and any subclasses. If two or more classes are specified, as in `(Class1 | Class2)`, this works as an `OR`.
- The rest of the content within the parentheses defines attributes and children. To match an attribute by value and/or capture its value, use the following general syntax: `@attr_name [= VALUE] [-> capture_key]`, where `= VALUE` and `-> capture_key` are both optional.
  - `VALUE` may be one of the following:
    - A nested node pattern.
    - A plain string or regex that will match against the string representation of the attribute. Regex uses Python syntax.
    - A `None` keyword, which will match when the value is, well, you guessed it - None.
    - A variable `$capture_key`. This will match the value against a previously captured value. E.g. `(SASCallExpression @arguments=[(*) -> left, $left])` will match any function call that has two identical arguments.
    - A sequence of values in `[]` brackets, including the empty sequence `[]`. Within the sequence you can define multiple values, capture keys, and even use regex-like quantifiers, like so: `[VALUE -> capture_key, ANOTHER_0_OR_MORE_TIMES* -> another_capture_key, ()* -> tail]`. The `capture_key` is optional.
  - `-> capture_key` is used to capture the value of the attribute. The value is available in the pattern itself via variables (see above), as well as in the template match dictionary as `match["capture_key"]`. If you want to capture multiple values into a list, use `+`, e.g. `(@args=[() -> +args, () -> mid_arg, () -> +args])` will capture the first and last arguments into the `args` list.
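Since string values in patterns use Python regex syntax, their behavior can be checked with plain Python. This is ordinary `re` usage, not Alchemist code, and the assumption that matches are anchored at the start of the string is an illustration only:

```python
import re

# Python-syntax regex, as used for string VALUEs in match patterns:
# "math.*" matches names that begin with "math" ("." is any character).
pattern = re.compile("math.*")

print(bool(pattern.match("math.floor")))   # True: starts with "math"
print(bool(pattern.match("cmath.sqrt")))   # False: anchored at the start
```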
Pattern Examples¶
As an example of typical usage, let's see how a pattern can be used to find and convert all SAS macro calls named `GET_VIPS` with two or more arguments. The following pattern will match such calls:
'(SASMacroCallExpression @arguments=[() -> call_type, () -> name_col, ()* -> tail_args] @func_name=(SASCodeName @identifier="GET_VIPS"))'
And may then be used in a template like so:
```jinja
udf_vips_{{match["call_type"].value.value | lower}}({{match["name_col"].value.value | lower}},
{%- for arg in match["tail_args"] -%}
{{- render(arg.value)}}{{ ", " if not loop.last else "" }}
{%- endfor -%}
)
```
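For a hypothetical source call `%GET_VIPS(ADD, CUST_NM, COL1, COL2)` (an invented input; the actual rendering of the tail arguments depends on what the dialect's `render` produces), the captures would be `call_type=ADD`, `name_col=CUST_NM`, and `tail_args=[COL1, COL2]`, and the template would emit something along the lines of:

```sql
udf_vips_add(cust_nm, COL1, COL2)
```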
Output Types¶
Output types define how the output of the template is handled. The following output types are supported by all dialects:
- `skip` - the matching node will be skipped from conversion. This has the same effect as an excluding node filter, but allows using XPath and pattern matching.
- `snippet` - the resulting string is inserted into the source file at the location of the matched node.
- `line` - like `snippet`, but the resulting string is inserted on a new line.
- `chunk` - same as `line`, except for the way it is represented in the side-by-side view in the UI. Each chunk is a separate block in the view.
- `file` - the resulting string becomes one of the resulting converted outputs and is written to a file according to the path specification. The file path is defined by the `output_relative_path` field in the template config. The path is relative to the output path set in the configuration.
- `model` - the result is treated as a JSON representation of a model, which may then be used by the specific dialect. This type should only be used for dialects that operate on models, such as the `prophecy` dialect. For more information, refer to the dialect docs.
In addition to these, dialects may define their own output types. Consult the dialect documentation for more information.
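As a sketch, a `skip` entry needs neither `template` nor `inline`; the node type below is hypothetical:

```yaml
converter:
  template_configs:
    # drop matched nodes from the conversion entirely
    - output_type: "skip"
      match_paths:
        - SomeUnsupportedStatement   # hypothetical node type
```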
Usage examples¶
Altering the table in addition to the out-of-the-box conversion results¶
Alchemist doesn't handle permissions automatically. Here is an example of how a template allows for customer-specific customization.
```yaml
converter:
  template_configs:
    - template: template.jinja
      match_patterns:
        - |
          (SASSQLCreateAsSelectStatement
            @dataset=(SASDatasetExpression @dataset_name=(SASCodeQualifiedDatasetRef
              @source=(SASCodeName @identifier -> out_schema)
              @dataset_name=(SASCodeName @identifier -> out_table)
            ))
          )
      output_type: "line"
      tags:
        - "py_spark"
        - "py_var"
```
The content of the `template.jinja` file:

```jinja
{{ render(node, no_templates=True) }}
spark.sql(f"ALTER TABLE {% raw %}{{% endraw %}{{match["out_schema"]}}{% raw %}}{% endraw %}.{{match["out_table"]}} SET OWNER TO `{OWNER_GROUP}`")
```
This will add the following statement at the end of rendering the node:
spark.sql(f"ALTER TABLE {out_schema}.out_table SET OWNER TO `{OWNER_GROUP}`")