Conversion Templates¶
Templates in Alchemist allow you to extend or modify out-of-the-box conversion results. Templates consist of a pattern (the "what") and Jinja text files (the "how") combined to generate the desired output based on the provided configuration settings.
Understanding Templates¶
To use a template, you need two things:
- A template configuration in the Alchemist configuration file.
- A template text. It may either be stored in a file or provided inline in the configuration.
The template text is processed by the Jinja templating engine. Alchemist provides a number of built-in variables and functions that can be used in the template:
- The node matched and being replaced by a template is always available in the template as
node
. - If node patterns are used, the
match
dictionary will have each value captured by a pattern (key matching the capture key). render
- a function that will defer back to Alchemist to convert a node. Usually used to convert children of the matched node.get_node_by_id
- a function that will return an AST node by its ID.now
- current datetime in UTC.none_rendered_str
- a special object that should be used to indicate that a node should not be rendered.
For a full reference on Jinja syntax, see the Jinja documentation.
A target dialect may define additional template features. For more information, refer to the dialects documentation.
Templates in Alchemist UI¶
Both the code pattern and Jinja text can be managed in the Alchemist UI. The main interface for managing templates is in the side panel on the Code & AST Search pages (1).
If your config has some templates defined, they will be available in the side panel, or you can add a new one by clicking the "Add Template" (2) button (currently visible on the AST Search page only).
Here is a rundown of the available controls:
- (3) - allows editing the Jinja inline template text or the name of the template file. This will only be available for standard, non-dialect specific templates.
- (4) - opens additional controls:
- (5) - changes the template name (
name
key in the config). - (6) - opens the template pattern in AST Search (from any page) - see below.
- (7) - deletes the template. This change has an immediate effect on the current runtime config (see below), but will not change the config file! If you want to save the changes, use the "Save As" button on the Config page to export the updated config.
- (5) - changes the template name (
- (8) - enables or disables the template in the current runtime config. Disabled templates are not used during re-conversion. This will only have an effect when prototyping in the UI, as all templates defined in the configuration file are always used when running Alchemist conversion from the console.
Testing & Saving Patterns¶
The AST Search page allows testing, updating the patterns used in the templates, or creating new templates with the defined patterns.
- (1) - allows opening the template pattern for review and editing.
- (2) - editors for patterns based on node filters, XPath, and node patterns.
- (3) - save (update) the pattern of the current template, or save it as a new template.
- (4) - run the pattern search on the current AST. This will validate the syntax and highlight all nodes that match the pattern in the AST view (use "Global Search" to search across all source files).
- (5) - preview the match dictionary for the selected node. This will show all values captured by the pattern in the current node.
Prototyping Templates¶
When you are working on a new template, you may want to test it before committing it to your production Alchemist config.
Our UI provides a way to prototype templates. This is the typical workflow:
- Run conversion on 1-2 source files with UI (
-u
flag in the CLI). - Open the AST Search page to create or modify the pattern, and test it to make sure the correct nodes are matched. If you need to capture some values, use node patterns and review the resulting match dictionary (5).
- Edit the template text in the UI (3).
- Switch to the "Code" page and use the "Reconvert" button to see the effect of the template on conversion results.
- Enable/disable templates in the side panel (8) to see before/after results.
Template Config Options¶
Option | Description |
---|---|
name | Optional name of the template. Used for identification in UI and error messages. |
template | Name of the Jinja template file to use. If no extension is given, all of the following will be tried: sql , py , yml , yaml , jinja . Only one of template or inline must be set, except for the skip output_type - for which neither is necessary. |
inline | Inline template string to use. If inline is not empty, template must not be set. |
output_type | Type of output to generate. For supported output types see output types. |
output_relative_path | Relative path for the output file. Can be a Path , str , or None . If None , the template name + ID is used. Can also include {var_name} placeholders, which are substituted with a number of format variables, including all node properties and source_relative_path . This option is ignored for output types other than file . |
match_nodes | List of configs used to match the current node by type and/or attribute values. It uses the same syntax as Node Filter Configs. If any of the configs include this node (or none of them exclude it) - it is considered a match. |
match_ids | Deprecated! Use match_nodes instead. List of node Content or regular IDs to match (content_id and id attributes). |
match_paths | List of XPath patterns. These patterns define which nodes should be processed by the template. For more information on how to write XPath patterns and examples, see the XPath Patterns section. |
match_patterns | List of tree node patterns. For more information on how to write node patterns and examples, see the Node Patterns section. |
tags | List of template tags. These tags are used to apply template filters. |
model_output_type | One of the supported JSON model names when output_type is set to MODEL . List of models depend on the dialect, refer to the dialects docs. |
Example Configuration¶
converter:
template_configs:
- template: full_program
output_type: "file"
match_nodes:
- include: "e237a0654a8e79ad04cc8050ea0cf235acb2ebaeeaca1a2681a6f2d199659fa6"
output_relative_path: "{source_relative_path}/{source_file_name}.py"
- template: intnx_runtime
output_type: "snippet"
match_paths:
- SASSQLProcedure//SASCallExpression
match_patterns:
- '(SASCallExpression @func_name="INTNX" @arguments=[() -> interval, () -> start_from, () -> increment, () -> alignment])'
- '(SASCallExpression @func_name="INTNX" @arguments=[() -> interval, () -> start_from, () -> increment])'
tags:
- "runtime"
- inline: "SPLIT_PART(COL_NAME, '/', 2)"
match_nodes:
- include: "id_of_scan_expr"
- name: "my custom name for the template"
inline: "REGEXP_EXTRACT(column_name, '\s+42', 1)"
match_nodes:
- include: "id_of_prxchange_expr"
XPath Patterns¶
XPath patterns are used to define which nodes in the AST should be processed by the template. XPath patterns in Alchemist are similar to XML XPath and consist of a sequence of paths. Each path is in the format @parent_field_name[index]type_name_or_pattern
and is separated by a slash.
- The slash is optional at the start of the XPath. If omitted, it is the same as using
//the rest of the path
. //
is a wildcard (anywhere) that matches any path.@parent_field_name
andindex
are optional.type_name_or_pattern
is optional, except for the last path in the XPath. It may be a type name or a a single node pattern (see below).
XPath Examples¶
Note: the following examples do not use real AST node types, which depend on the dialect. To get the real node tpyes you can use AST Explorer
CallExpr/@arguments[0]Id
matches the first argument of a call expression at any level in the tree, which is of typeId
.(CallExpr @func_name="math.*")/@arguments[*]Id
similar to the above, but matches any call expression whose function name matches the patternmath.*
.@arguments[*]CallExpr
matches any CallExpr as long as it is an argument (stored in a field with the said name) of a parent expression.CallExpr
matches any call expression at any level in the tree.
Node Patterns¶
Node patterns are used for two main purposes:
- To match a particular node in the AST based on its attributes and children.
- To extract values from within the matched node subtree for use in the conversion template.
Patterns are written using a syntax similar to a cross between S-expressions and regex. To understand the syntax and usage, let's explore some examples of valid patterns.
Pattern Syntax¶
In node patterns, the following syntax is used:
- Parentheses () are used to match an AST Node. What's inside these brackets define a node pattern. The minimal possible pattern is
()
, which will match any node. - The left bracket may be followed by one or more node types (class names) separated by
|
. Each class name will match the type itself and any subclasses. If 2+ classes specified via(Class1 | Class2)
this works asOR
- The rest of the content within the parentheses defines attributes and children. To match an attribute by value and/or capture it's value use the following general syntax
@attr_name [= VALUE] [-> capture_key]
where= VALUE
andcapture_key
are both optional.- VALUE may be one of the following
- a nested node pattern
- A plain string or regex that will match against string represnetaion of the attribute. Regex uses Python syntax
- A
None
keyword, that will match when the value is, well, you guessed it - None. - A variable
$capture_key
. This will match value against a previously captured value. E.g.(SASCallExpression @arguments=[(*) -> left, $left])
will match any function call that has two identical arguments - A sequence of values in
[]
brackets, including empty sequence[]
. Within the sequence you can define multiple values, capture_keys and even use regex-like quantifiers, like so:[VALUE -> capture_key, ANOTHER_0_OR_MORE_TIMES* -> another_capture_key, ()* -> tail]
.capture_key
is optional.
-> capture_key
is used to capture the value of the attribute. The value is available in the pattern itself via variables (see above) as well as as in template match dictionarymatch["capture_key"]
. If you want to capture multiple values into a list, use+
, e.g.(@args=[() -> +args, () -> mid_arg, () -> +args])
will first and last arguments intoargs
list.
- VALUE may be one of the following
Pattern Examples¶
As an example of a typical usage, let's see how pattern can be used to find and convert all SAS macro calls named GET_VIPS
with 2 or more arguments. The following pattern will match such calls:
'(SASMacroCallExpression @arguments=[() -> call_type, () -> name_col, ()* -> tail_args] @func_name=(SASCodeName @identifier="GET_VIPS"))'
And may then be used in a template like so:
udf_vips_{{match["call_type"].value.value | lower}}({{match["name_col"].value.value | lower}},
{%- for arg in match["tail_args"] -%}
{{- render(arg.value)}}{{ ", " if not loop.last else "" }}
{%- endfor -%}
)
Output Types¶
Output types define how the output of the template is handled. The following output types are supported by all dialects:
skip
- the matching node will be skipped from conversion. This has the same effect as an excluding node filter, but allows using xpath and pattern matching.snippet
- The resulting string is inserted into the source file at the location of the matched node.line
- like snippet, but the resulting string is inserted on a new line.chunk
- same as line, except for the way it is represented in the side by side view in UI. Each chunk is a separate block in the view.file
- The resulting string becomes one of the reulsting converted outputs and is written to a file according to the path specification. The file path is defined by theoutput_relative_path
field in the template config. The path is relative to the output path set in the configuration.model
- the result is treated as a json representation of a model, which may then be used by the specific dialect. This type should only be used for dialects that operate on models, such as theprophecy
dialect. For more information, refer dialects docs.
In addition to these, dialects may define their own output types. Consult the dialect documentation for more information.
Usage examples¶
Altering the table in addition to the out-of-the-box conversion results¶
Alchemist doesn't handle permissions automatically. Here is an example of how using the template allows for customer-specific customization.
converter:
template_configs:
- template: template.jinja
match_patterns:
- |
(SASSQLCreateAsSelectStatement
@dataset=(SASDatasetExpression @dataset_name=(SASCodeQualifiedDatasetRef
@source=(SASCodeName @identifier -> out_schema)
@dataset_name=(SASCodeName @identifier -> out_table)
))
)
output_type: "line"
tags:
- "py_spark"
- "py_var"
The content of the template.jinja
file:
{{ render(node, no_templates=True) }}
spark.sql(f"ALTER TABLE {% raw %}{{% endraw %}{{match["out_schema"]}}{% raw %}}{% endraw %}.{{match["out_table"]}} SET OWNER TO `{OWNER_GROUP}`")
This will add the following statement at the end of rendering the node:
spark.sql(f"ALTER TABLE {out_schema}.out_table SET OWNER TO `{OWNER_GROUP}`")