Conversion Templates

Templates in Alchemist allow you to extend or modify out-of-the-box conversion results. A template consists of a pattern (the "what") and Jinja text (the "how"), which are combined to generate the desired output based on the provided configuration settings.

Understanding Templates

To use a template, you need two things:

A template configuration in the Alchemist configuration file.
A template text. It may either be stored in a file or provided inline in the configuration.

The template text is processed by the Jinja templating engine. Alchemist provides a number of built-in variables and functions that can be used in the template:

The node matched and being replaced by a template is always available in the template as node.
If node patterns are used, the match dictionary holds each value captured by a pattern, keyed by its capture key.
render - a function that will defer back to Alchemist to convert a node. Usually used to convert children of the matched node.
get_node_by_id - a function that will return an AST node by its ID.
now - current datetime in UTC.
none_rendered_str - a special object that should be used to indicate that a node should not be rendered.

For a full reference on Jinja syntax, see the Jinja documentation.

A target dialect may define additional template features. For more information, refer to the dialects documentation.
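
As a rough illustration of how these names come together, here is a minimal sketch of a template config whose inline Jinja text uses node and render. The node type CallExpr, the template name, and the trailing comment text are placeholders rather than real Alchemist names. The config keys are described under Template Config Options, and render(node, no_templates=True) is the same call used in the usage example at the end of this page.

```yaml
converter:
  template_configs:
    - name: "mark converted calls"    # hypothetical name
      match_paths:
        - CallExpr                    # placeholder node type; real types depend on the dialect
      output_type: "snippet"
      # `node` is the matched node. render() hands it back to Alchemist for
      # conversion; no_templates=True is assumed to keep this template from
      # being applied to the same node again.
      inline: "{{ render(node, no_templates=True) }} /* reviewed */"
```

The intent of this sketch is to keep the default conversion of the node and append a marker to it, which is usually a safer starting point than replacing the output outright.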

Templates in Alchemist UI

Both the code pattern and Jinja text can be managed in the Alchemist UI. The main interface for managing templates is in the side panel on the Code & AST Search pages (1).

[Screenshot: Template UI]

If your config has some templates defined, they will be available in the side panel, or you can add a new one by clicking the "Add Template" (2) button (currently visible on the AST Search page only).

Here is a rundown of the available controls:

(3) - allows editing the Jinja inline template text or the name of the template file. This is only available for standard, non-dialect-specific templates.
(4) - opens additional controls:
- (5) - changes the template name (name key in the config).
- (6) - opens the template pattern in AST Search (from any page) - see below.
- (7) - deletes the template. This change has an immediate effect on the current runtime config (see below), but will not change the config file! If you want to save the changes, use the "Save As" button on the Config page to export the updated config.
(8) - enables or disables the template in the current runtime config. Disabled templates are not used during re-conversion. This will only have an effect when prototyping in the UI, as all templates defined in the configuration file are always used when running Alchemist conversion from the console.

Testing & Saving Patterns

The AST Search page lets you test and update the patterns used in templates, or create new templates from the patterns you define.

(1) - allows opening the template pattern for review and editing.
(2) - editors for patterns based on node filters, XPath, and node patterns.
(3) - save (update) the pattern of the current template, or save it as a new template.
(4) - run the pattern search on the current AST. This will validate the syntax and highlight all nodes that match the pattern in the AST view (use "Global Search" to search across all source files).
(5) - preview the match dictionary for the selected node. This will show all values captured by the pattern in the current node.

Prototyping Templates

When you are working on a new template, you may want to test it before committing it to your production Alchemist config.

Our UI provides a way to prototype templates. This is the typical workflow:

Run conversion on 1-2 source files with the UI enabled (-u flag in the CLI).
Open the AST Search page to create or modify the pattern, and test it to make sure the correct nodes are matched. If you need to capture some values, use node patterns and review the resulting match dictionary (5).
Edit the template text in the UI (3).
Switch to the "Code" page and use the "Reconvert" button to see the effect of the template on conversion results.
Enable/disable templates in the side panel (8) to see before/after results.

Template Config Options

Option	Description
name	Optional name of the template. Used for identification in UI and error messages.
template	Name of the Jinja template file to use. If no extension is given, all of the following will be tried: `sql`, `py`, `yml`, `yaml`, `jinja`. Exactly one of `template` or `inline` must be set, except for the `skip` output type, for which neither is necessary.
inline	Inline template string to use. If inline is not empty, `template` must not be set.
output_type	Type of output to generate. For the supported values, see Output Types.
output_relative_path	Relative path for the output file. Can be a `Path`, `str`, or `None`. If `None`, the template name + ID is used. Can also include `{var_name}` placeholders, which are substituted with a number of format variables, including all node properties and `source_relative_path`. This option is ignored for output types other than `file`.
match_nodes	List of configs used to match the current node by type and/or attribute values. It uses the same syntax as Node Filter Configs. A node is considered a match if any of the configs includes it (or if none of them excludes it).
match_ids	Deprecated! Use `match_nodes` instead. List of node Content or regular IDs to match (`content_id` and `id` attributes).
match_paths	List of XPath patterns. These patterns define which nodes should be processed by the template. For more information on how to write XPath patterns and examples, see the XPath Patterns section.
match_patterns	List of tree node patterns. For more information on how to write node patterns and examples, see the Node Patterns section.
tags	List of template tags. These tags are used to apply template filters.
model_output_type	One of the supported JSON model names when `output_type` is set to `MODEL`. The list of models depends on the dialect; refer to the dialects docs.

Example Configuration

converter:
  template_configs:
    - template: full_program
      output_type: "file"
      match_nodes:
        - include: "e237a0654a8e79ad04cc8050ea0cf235acb2ebaeeaca1a2681a6f2d199659fa6"
      output_relative_path: "{source_relative_path}/{source_file_name}.py"
    - template: intnx_runtime
      output_type: "snippet"
      match_paths:
        - SASSQLProcedure//SASCallExpression
      match_patterns:
        - '(SASCallExpression @func_name="INTNX" @arguments=[() -> interval, () -> start_from, () -> increment, () -> alignment])'
        - '(SASCallExpression @func_name="INTNX" @arguments=[() -> interval, () -> start_from, () -> increment])'
      tags:
        - "runtime"
    - inline: "SPLIT_PART(COL_NAME, '/', 2)"
      match_nodes:
        - include: "id_of_scan_expr"
    - name: "my custom name for the template"
      # Backslash doubled: "\s" is not a valid escape in a double-quoted YAML string.
      inline: "REGEXP_EXTRACT(column_name, '\\s+42', 1)"
      match_nodes:
        - include: "id_of_prxchange_expr"
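
Because a backslash is an escape character inside a double-quoted YAML string, regex-heavy inline templates can be awkward to quote. One option, sketched below, is a YAML literal block scalar, where backslashes and quotes are kept exactly as written. This is plain YAML behaviour rather than anything Alchemist-specific. The template name is hypothetical, and the node ID is the same placeholder used in the example above.

```yaml
converter:
  template_configs:
    - name: "regex extract via block scalar"   # hypothetical name
      # "|-" starts a literal block scalar and strips the final newline,
      # so the template text is exactly the single line below.
      inline: |-
        REGEXP_EXTRACT(column_name, '\s+42', 1)
      match_nodes:
        - include: "id_of_prxchange_expr"
```

The same form also suits multi-line inline templates, at the cost of having to watch the indentation of the block.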

XPath Patterns

XPath patterns are used to define which nodes in the AST should be processed by the template. XPath patterns in Alchemist are similar to XML XPath and consist of a sequence of paths. Each path is in the format @parent_field_name[index]type_name_or_pattern and is separated by a slash.

The slash is optional at the start of the XPath. Omitting it is the same as writing // followed by the rest of the path.
// is a wildcard (anywhere) that matches any path.
@parent_field_name and index are optional.
type_name_or_pattern is optional, except for the last path in the XPath. It may be a type name or a single node pattern (see below).

XPath Examples

Note: the following examples do not use real AST node types, which depend on the dialect. To get the real node types, you can use AST Explorer.

CallExpr/@arguments[0]Id matches the first argument of a call expression at any level in the tree, provided that argument is of type Id.
(CallExpr @func_name="math.*")/@arguments[*]Id is similar to the above, but matches an Id argument at any position, and only in call expressions whose function name matches the pattern math.*.
@arguments[*]CallExpr matches any CallExpr as long as it is an argument (stored in a field with the said name) of a parent expression.
CallExpr matches any call expression at any level in the tree.
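
In a config file, XPaths like these go under match_paths. The sketch below reuses the placeholder node types from the examples above and sets output_type to skip, so that no template text is needed and the focus stays on matching. The template names are hypothetical. One YAML detail worth noting: an unquoted scalar cannot start with @, so an XPath that begins with a field name has to be quoted, and quoting every XPath is a simple habit.

```yaml
converter:
  template_configs:
    - name: "skip Id arguments of math calls"   # hypothetical name
      output_type: "skip"
      match_paths:
        # Single quotes keep the inner double quotes literal.
        - '(CallExpr @func_name="math.*")/@arguments[*]Id'
    - name: "skip nested calls"                 # hypothetical name
      output_type: "skip"
      match_paths:
        # Must be quoted: YAML reserves a leading "@".
        - '@arguments[*]CallExpr'
```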

Node Patterns

Node patterns are used for two main purposes:

To match a particular node in the AST based on its attributes and children.
To extract values from within the matched node subtree for use in the conversion template.

Patterns are written using a syntax similar to a cross between S-expressions and regex. To understand the syntax and usage, let's explore some examples of valid patterns.

Pattern Syntax

In node patterns, the following syntax is used:

Parentheses () are used to match an AST Node. What's inside these brackets defines a node pattern. The minimal possible pattern is (), which will match any node.
The left bracket may be followed by one or more node types (class names) separated by |. Each class name will match the type itself and any subclasses. If two or more classes are specified, as in (Class1 | Class2), they are combined with OR.
The rest of the content within the parentheses defines attributes and children. To match an attribute by value and/or capture its value, use the general syntax @attr_name [= VALUE] [-> capture_key], where = VALUE and -> capture_key are both optional.
- VALUE may be one of the following:
  - a nested node pattern
  - A plain string or regex that will match against the string representation of the attribute. Regex uses Python syntax.
  - The None keyword, which will match when the value is, well, you guessed it - None.
  - A variable $capture_key. This will match the value against a previously captured value. E.g. (SASCallExpression @arguments=[(*) -> left, $left]) will match any function call that has two identical arguments.
  - A sequence of values in [] brackets, including empty sequence []. Within the sequence you can define multiple values, capture_keys and even use regex-like quantifiers, like so: [VALUE -> capture_key, ANOTHER_0_OR_MORE_TIMES* -> another_capture_key, ()* -> tail]. capture_key is optional.
- -> capture_key is used to capture the value of the attribute. The value is available in the pattern itself via variables (see above) as well as in the template match dictionary as match["capture_key"]. If you want to capture multiple values into a list, use +, e.g. (@args=[() -> +args, () -> mid_arg, () -> +args]) will capture the first and last arguments into the args list.
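
To show how such patterns sit in a config, here is a small sketch that places two of the forms above under match_patterns. Both entries use output_type skip so that no template text is required. The first pattern is taken verbatim from the list above. In the second, Class1, Class2 and attr are placeholders, and the exact spelling of the None comparison is an assumption based on the syntax description rather than a tested pattern. The template names are hypothetical.

```yaml
converter:
  template_configs:
    # Variable reference: calls whose two arguments are identical.
    - name: "skip calls with identical arguments"   # hypothetical name
      output_type: "skip"
      match_patterns:
        - '(SASCallExpression @arguments=[(*) -> left, $left])'
    # Type alternatives combined with a None value check.
    - name: "skip nodes with unset attr"            # hypothetical name
      output_type: "skip"
      match_patterns:
        - '(Class1 | Class2 @attr=None)'            # placeholder types and attribute
```

Running each pattern in AST Search before saving it is the quickest way to confirm that the syntax is accepted and that the intended nodes are highlighted.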

Pattern Examples

As an example of typical usage, let's see how a pattern can be used to find and convert all SAS macro calls named GET_VIPS with 2 or more arguments. The following pattern will match such calls:

'(SASMacroCallExpression @arguments=[() -> call_type, () -> name_col, ()* -> tail_args] @func_name=(SASCodeName @identifier="GET_VIPS"))'

And may then be used in a template like so:

udf_vips_{{match["call_type"].value.value | lower}}({{match["name_col"].value.value | lower}},
{%- for arg in match["tail_args"] -%}
    {{- render(arg.value)}}{{ ", " if not loop.last else "" }}
{%- endfor -%}
)
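
To connect the two pieces, the pattern and the template text are referenced from the same template config entry. A minimal sketch follows, assuming the Jinja text above is saved in a file named get_vips.jinja (a hypothetical name) and that snippet is the appropriate output type for replacing a call expression in place:

```yaml
converter:
  template_configs:
    - name: "GET_VIPS macro call"   # hypothetical name
      # No extension given, so the documented list (sql, py, yml, yaml, jinja)
      # is tried; get_vips.jinja is an assumed file name.
      template: get_vips
      output_type: "snippet"
      match_patterns:
        - '(SASMacroCallExpression @arguments=[() -> call_type, () -> name_col, ()* -> tail_args] @func_name=(SASCodeName @identifier="GET_VIPS"))'
```

The capture keys in the pattern (call_type, name_col, tail_args) are the same keys the template reads from the match dictionary, so renaming one requires updating the other.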

Output Types

Output types define how the output of the template is handled. The following output types are supported by all dialects:

skip - the matching node is excluded from conversion. This has the same effect as an excluding node filter, but allows using XPath and pattern matching.
snippet - the resulting string is inserted into the source file at the location of the matched node.
line - like snippet, but the resulting string is inserted on a new line.
chunk - same as line, except for the way it is represented in the side-by-side view in the UI. Each chunk is a separate block in the view.
file - the resulting string becomes one of the converted outputs and is written to a file according to the path specification. The file path is defined by the output_relative_path field in the template config and is relative to the output path set in the configuration.
model - the result is treated as a JSON representation of a model, which may then be used by the specific dialect. This type should only be used for dialects that operate on models, such as the prophecy dialect. For more information, refer to the dialects docs.

In addition to these, dialects may define their own output types. Consult the dialect documentation for more information.

Usage Examples

Altering the table in addition to the out-of-the-box conversion results

Alchemist doesn't handle permissions automatically. Here is an example of how using the template allows for customer-specific customization.

converter:
  template_configs:
    - template: template.jinja
      match_patterns:
        - |
          (SASSQLCreateAsSelectStatement
            @dataset=(SASDatasetExpression @dataset_name=(SASCodeQualifiedDatasetRef
              @source=(SASCodeName @identifier -> out_schema)
              @dataset_name=(SASCodeName @identifier -> out_table)
            ))
          )

      output_type: "line"
      tags:
        - "py_spark"
        - "py_var"

The content of the template.jinja file:

{{ render(node, no_templates=True) }}

spark.sql(f"ALTER TABLE {% raw %}{{% endraw %}{{match["out_schema"]}}{% raw %}}{% endraw %}.{{match["out_table"]}} SET OWNER TO `{OWNER_GROUP}`")

This appends a statement of the following form after the rendered node (out_schema and out_table stand in for the captured values):

spark.sql(f"ALTER TABLE {out_schema}.out_table SET OWNER TO `{OWNER_GROUP}`")