Extra Template Features in Databricks Dialect

The Databricks dialect provides additional template output types.

Output Types

  • sas_dbr_notebook_header - This template is specifically designed for SAS DI Jobs, EG Flows, or SAS Programs and allows you to define a custom notebook header with arbitrary code, provided as either a Jinja file or an inline string.

    converter:
      user_template_paths:
        - "templates"
    
      template_configs:
        # Custom imports from a Jinja file for SAS DI Jobs
        - template: imports.jinja
          match_patterns:
            - (SASMetaJob)
          output_type: sas_dbr_notebook_header
    
        # Custom imports from an inline string for SAS EG Flows
        - inline: |
            from migration.helpers import *
            from pyspark.sql.functions import *
          match_patterns:
            - (SASEGProcessFlowContainer)
          output_type: sas_dbr_notebook_header
    
        # Custom imports from an inline string for SAS Programs
        - inline: from migration.helpers import *
          match_patterns:
            - (SASProgram)
          output_type: sas_dbr_notebook_header
    
  • sas_dbr_tr_func - This template is specifically designed for SAS DI Transforms and allows you to replace a transform with a Python function.

    Here is an example of the config.yaml:

    converter:
      template_configs:
      - match_nodes:
        - attr: sas_meta_id
          include:
          - A5JCJVA0.BW000753
          type: SASMetaTransformBase
        output_type: sas_dbr_tr_func
        template: 1_A5JCJVA0.BW000753.py
      user_template_paths:
      - templates
    

    Here is an example of the content for the sas_dbr_tr_func template file:

    _alc_template_metadata_yaml = """
    func_name: step5_Join
    in_ds_sas_meta_ids:
    - A5JCJVA0.BU0004QD
    - A5JCJVA0.BU0004QM
    out_ds_sas_meta_ids:
    - A5JCJVA0.BU0004QL
    pass_vars_arg: false
    extra_args: []
    step: 5
    """
    
    
    def step5_Join(df_input1, df_input2):
        # Code of the transform goes here; it must define df_output
        return df_output
    

    The _alc_template_metadata_yaml is a special variable that contains metadata for the template. It is used by Alchemist to generate the appropriate function call within the notebook code and is not included in the generated notebook. Currently, it contains the following fields:

    • func_name: The name of the function that will be called in the notebook's main job function. This should match the function name in the template code.
    • in_ds_sas_meta_ids: A list of SAS dataset meta ids that will be passed as dataframes to the function. The order of the ids should match the order of the function's arguments.
    • out_ds_sas_meta_ids: A list of SAS dataset meta ids that will be returned as dataframes by the function. The order of the ids should match the order of the function's return values.
    • pass_vars_arg: Specifies whether the vars argument should be passed to the function. If passed, it is placed after the dataframe arguments.
    • extra_args: A list of additional argument names to be passed to the function last, in the specified order.
    • step: The step number of the transform function. Currently used for informational purposes only.
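
    The argument ordering described by these fields can be illustrated with a small sketch. It is not Alchemist's implementation: build_call and the df_names mapping (SAS dataset meta id to notebook dataframe variable name) are hypothetical helpers introduced only for illustration, and it assumes the vars argument is passed under the literal name vars.

```python
def build_call(metadata, df_names):
    """Render the call line implied by the metadata fields (illustrative only).

    metadata: dict holding the fields of _alc_template_metadata_yaml.
    df_names: hypothetical mapping from SAS dataset meta id to the
              dataframe variable name used in the notebook.
    """
    # Input dataframes come first, in the order of in_ds_sas_meta_ids.
    args = [df_names[i] for i in metadata["in_ds_sas_meta_ids"]]
    # The vars argument, if requested, follows the dataframe arguments
    # (assumption: it is passed under the name "vars").
    if metadata.get("pass_vars_arg", False):
        args.append("vars")
    # Extra arguments go last, in the specified order.
    args.extend(metadata.get("extra_args", []))
    # Return values correspond to out_ds_sas_meta_ids, in order.
    outs = [df_names[i] for i in metadata["out_ds_sas_meta_ids"]]
    call = f"{metadata['func_name']}({', '.join(args)})"
    return f"{', '.join(outs)} = {call}" if outs else call


# Metadata from the example template above, written as a plain dict.
metadata = {
    "func_name": "step5_Join",
    "in_ds_sas_meta_ids": ["A5JCJVA0.BU0004QD", "A5JCJVA0.BU0004QM"],
    "out_ds_sas_meta_ids": ["A5JCJVA0.BU0004QL"],
    "pass_vars_arg": False,
    "extra_args": [],
    "step": 5,
}

# Hypothetical variable names for the datasets involved.
df_names = {
    "A5JCJVA0.BU0004QD": "df_input1",
    "A5JCJVA0.BU0004QM": "df_input2",
    "A5JCJVA0.BU0004QL": "df_output",
}

call_line = build_call(metadata, df_names)
print(call_line)  # prints: df_output = step5_Join(df_input1, df_input2)
```

    With pass_vars_arg set to true and a non-empty extra_args list, the same sketch appends vars and then the extra argument names after the dataframe arguments, so the template function's signature would need to accept them in that order.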