Databricks Dialect Configuration

The Databricks Dialect Configuration extends the base converter config and allows you to configure additional aspects of the conversion process. The following sections describe the available configuration classes and their fields.

Databricks Converter Config

| Field | Description | Default Value |
|-------|-------------|---------------|
| file_path_map | File path mapping (see below) | Empty dictionary |
| group_nodes_into_paragraphs | Whether to merge similar consecutive statements into a single paragraph. | true |
| render_markdown_headers | Whether Markdown header-based structure should be included in produced notebooks. | true |
| render_all_source_code | Whether the original SAS source code should be included in produced notebooks. | true |
| spark_conf_ansi_enabled | Whether to generate code that assumes ANSI SQL mode is enabled in Spark (see docs). | true |
| sas | SAS-specific config (see below) | Default SAS config |
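
For illustration, these fields might be set in the configuration file as shown below. This is a sketch that assumes the same converter: YAML layout as the libref_to_schema example later on this page; check your actual configuration file for the exact nesting:

converter:
  group_nodes_into_paragraphs: true   # merge similar consecutive statements
  render_markdown_headers: true       # keep Markdown header-based structure in notebooks
  render_all_source_code: false       # omit the original SAS source from produced notebooks
  spark_conf_ansi_enabled: true       # generated code assumes ANSI SQL mode in Spark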

File Path Map

File path mapping converts a source file location to a new cloud location. The resulting path is always a POSIX path.

Each mapping entry specifies a prefix of the full path as it appears in the source, and the target prefix that should replace it.

The longest matching prefix wins. If no prefix matches, the original path is used unchanged (which is probably not what you want).

Example:

  • for the path C:\User\any_local_dir\file.xlsx
  • the mapping can be {"C:\\User\\": "dbfs:/mnt/User/"}
  • and the final path will be dbfs:/mnt/User/any_local_dir/file.xlsx
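
Expressed in the configuration file, the same mapping (extended with a second, longer prefix to illustrate the longest-match rule) might look as follows. This is a sketch: the file_path_map field name comes from the table above, but its exact placement under converter: is an assumption:

converter:
  file_path_map:
    # In double-quoted YAML strings, "\\" is a single backslash, so this key is C:\User\
    "C:\\User\\": "dbfs:/mnt/User/"
    "C:\\User\\reports\\": "dbfs:/mnt/Reports/"

With this mapping, a source path such as C:\User\reports\q1.xlsx would resolve to dbfs:/mnt/Reports/q1.xlsx because the longer prefix wins, while C:\User\data\q1.xlsx would fall back to dbfs:/mnt/User/data/q1.xlsx.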

Databricks SAS Specific Converter Config

| Field | Description | Default Value |
|-------|-------------|---------------|
| year_cutoff | SAS YEARCUTOFF option value (see docs) | 40 |
| libref_to_schema | Mapping of SAS librefs to Spark schemas | Empty dictionary |
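
For instance, year_cutoff could be overridden alongside the other SAS options. A sketch assuming the same YAML layout as the example below, and assuming the option takes a two-digit year as its default of 40 suggests:

converter:
  sas:
    year_cutoff: 50   # two-digit years 50-99 read as 19xx, 00-49 as 20xx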

Example

Here's an example of how you can define libref_to_schema in the configuration file:

converter:
  sas:
    libref_to_schema:
      libref1: spark_schema1
      libref2: "{spark_schema_var}"

In this example, libref1 will be converted to spark_schema1, and libref2 will be converted to the literal {spark_schema_var}. The latter form is intended for use in f-strings, so the variable spark_schema_var must be defined in the output code.