Analyzing Files with Alchemist¶
Have you tried the tutorial?
If you haven't already, start with the tutorial to get a quick overview of how to use Alchemist. The tutorial will guide you through the process of setting up Alchemist, preparing the project, analyzing your sources, and generating reports.
To analyze your files using Alchemist, follow these steps. You can start with a single file, a folder, or even the root folder containing all your sources. The output will be generated in the format specified by the -f option.
Organize your project space
We recommend sticking to the standard usage pattern. Read it before proceeding.
Analyzing a Single File or Folder¶
- Open the command prompt or terminal.
- Run the following command to analyze a single file or folder:
alchemist analyze src -p project_name -f xlsx -o /path/to/output/folder path/to/a/file_or_folder
  - Replace project_name with a unique name for your analysis project.
  - -f xlsx specifies the output format as an Excel file. You can change this to another format, for example parquet:
alchemist analyze src -p project_name -f parquet -o /path/to/output/folder path/to/a/file_or_folder
Note
There is no need to append .exe to the alchemist command on Windows. It should work as shown.
- If the analysis is successful for a single file, you can proceed to analyze a folder or the root folder containing all your sources.
Analyzing Huge Source Trees in Batches¶
If you have tens of thousands of files or more, it is unlikely that your machine can handle all of them at once. In this case, you will have to analyze them in batches. Here is how to do it right.
Prepare Source Files¶
- Create a new folder for source files.
- Copy source files to this folder. Do not copy any data, only source code and other supported files (e.g. SAS EG projects .egp or SAS Metadata exports .spk).
  - See detailed instructions on how to prepare .spk files in dialects documentation.
Retain original folder structure for sources
The way your code and other sources are organized on the server may provide important insights and help identify owners. Thus we highly recommend retaining the original folder structure inside the src folder.
For more details see how to prepare folder structure.
- You must have permissions to write files to this folder. If you experience issues on macOS, check troubleshooting tips below.
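Once the sources are copied, counting the files in each subfolder can help decide where to draw batch boundaries. The following is a minimal sketch in POSIX shell, not part of Alchemist: the function name count_files_per_folder is hypothetical, and it assumes the immediate subfolders of the source folder are sensible batch candidates.

```shell
#!/bin/sh
# Hypothetical helper (not an Alchemist command).
# Prints "<file count> <subfolder>" for each immediate subfolder of a source root.
count_files_per_folder() {
    root="$1"
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue    # skip when the glob matched nothing
        # Count regular files recursively; tr strips padding some wc versions add.
        count=$(find "$dir" -type f | wc -l | tr -d ' ')
        printf '%s %s\n' "$count" "${dir%/}"
    done
}

# Usage: count_files_per_folder src
```

Subfolders with very large counts are candidates for further splitting; small ones can be grouped under a common parent folder and analyzed together.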
Run Analyzer¶
Run the following command changing the subfolder and the numeric suffix in project name for each batch:
alchemist analyze src -p project_name_BATCH_N -f parquet -o /path/to/output/folder src/path/to/folder/batch_N
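Repeating the command above by hand for every batch is error-prone, so a small loop can help. The sketch below is one possible wrapper in POSIX shell. The function name analyze_batches and the DRY_RUN variable are hypothetical, and it assumes that alchemist exits with a non-zero status when a run fails, which this page does not confirm. It uses only the options shown above.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of Alchemist).
# Arguments: <project prefix> <output folder> <batch folder>...
# With DRY_RUN=1 the commands are printed instead of executed.
analyze_batches() {
    prefix="$1"
    out="$2"
    shift 2
    n=1
    for batch in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            printf '%s\n' "alchemist analyze src -p ${prefix}_BATCH_${n} -f parquet -o $out $batch"
        else
            # Assumption: a non-zero exit status means the batch failed.
            alchemist analyze src -p "${prefix}_BATCH_${n}" -f parquet -o "$out" "$batch" || return 1
        fi
        n=$((n + 1))    # numeric suffix keeps each project name unique
    done
}

# Usage (paths are placeholders):
# DRY_RUN=1; analyze_batches project_name /path/to/output/folder src/path/to/folder/batch_1 src/path/to/folder/batch_2
```

Running it once with DRY_RUN=1 shows the exact commands before any analysis starts.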
If You Can't Run Alchemist from the Source Tree Root Folder¶
If for some reason you can't run Alchemist from the root folder of your source tree, e.g. if the sources are on a network drive, you can use the --source-root-path option to specify the root path of your source tree. This way, Alchemist will be able to determine the relative paths of the sources correctly.
alchemist analyze src -p project_name_BATCH_N -f parquet -o /path/to/output/folder --source-root-path /path/to/source/tree/root /path/to/source/tree/root/path/to/folder/batch_N
Advanced Analysis Using DDS Export Feature¶
Minimum License Requirements
The export of detailed data via this feature is exclusively available with the unlimited analyzer bundle (see our features page). Without this bundle, the data is obfuscated to remove any potentially sensitive information and then encrypted. This data can only be analyzed by the Alchemist team or an authorized partner.
Alchemist provides a robust and extensive set of datamarts, offering insights into your code and environment. However, the potential for information extraction and use cases extends beyond the standard reports.
Here are a few examples of what you can achieve:
- Identify usage of plain passwords: Locate instances where plain passwords are used, helping you to enhance your security measures.
- Build custom data and program lineage: Utilize extra company-specific knowledge that can't be inferred from the code automatically to create custom data and program lineage.
- Enhance employee toolkit: Extract structured information that reflects actual business processes and logic from code, and feed it into your private GenAI assistant knowledge base.
- Find specific function usage: Identify all places where a particular function is used with specific parameters. For example, you may have insecure, outdated, or banned custom library functions that you'd like to identify to assess risks and prioritize migration.
- And more: The possibilities are vast and can be tailored to your specific needs.
You can access all of this information via the export-dds command. This command gives you access to everything that Alchemist has parsed from the code. It will create tables with the most detailed information down to every statement, variable, function, and even column used, as well as the links between them.
While Alchemist comes with a packaged configuration that exports most common data, a custom configuration can be created upon request to extend its capabilities.
Run alchemist export-dds -h for usage instructions.
Run DDS Export¶
Run the following command to export the detailed data for each batch:
alchemist export-dds -p project_name_BATCH_N -o /path/to/output/dds/folder src/path/to/folder/batch_N
Run Analyzer¶
Run the following command to analyze all exported batches at once:
alchemist analyze dds -p project_name -f parquet -o /path/to/output/analyze/folder /path/to/output/dds/folder
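The two steps above (one export per batch, then a single analysis over all exports) can be chained in a script. The following is a sketch under assumptions: the helper names run and dds_export_and_analyze and the DRY_RUN variable are hypothetical, and the script assumes alchemist returns a non-zero exit status on failure. Only the commands and options shown on this page are used.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Alchemist).

# Execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Arguments: <project name> <dds output folder> <analysis output folder> <batch folder>...
dds_export_and_analyze() {
    project="$1"
    dds_out="$2"
    analyze_out="$3"
    shift 3
    n=1
    for batch in "$@"; do
        # One export per batch, each under its own project name.
        run alchemist export-dds -p "${project}_BATCH_${n}" -o "$dds_out" "$batch" || return 1
        n=$((n + 1))
    done
    # A single analysis over everything exported to the DDS folder.
    run alchemist analyze dds -p "$project" -f parquet -o "$analyze_out" "$dds_out"
}

# Usage (paths are placeholders):
# dds_export_and_analyze project_name /path/to/output/dds/folder /path/to/output/analyze/folder src/path/to/folder/batch_1
```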
Troubleshooting¶
If you encounter issues, follow these tips:
Installation issues¶
If you can't install Alchemist, please contact us.
Invalid value for '-o' / '--output-path': Directory is not readable¶
In macOS Catalina and later versions, restricted folders include Documents, Desktop, and Downloads, among others. To solve this issue, you need to grant Terminal access to these folders through privacy settings.
Granting Terminal Access
Follow these steps to allow access to the folder:
- Open System Preferences on your Mac.
- Go to Security & Privacy.
- Select the Privacy tab.
- Scroll down to Files and Folders or Full Disk Access (depending on your macOS version).
- Look for the Terminal application (or the terminal you are using, such as iTerm) in the list.
- Check the box next to it to grant access. If the application is not listed, you can add it by clicking the plus sign (+) and locating it in Finder.

Source skipped due to errors: Alchemist may report various errors during execution. Most of them provide the information needed to resolve the issue, e.g. an incorrect configuration or an invalid input file. However, some source files may be skipped due to unexpected errors. If some files were skipped, they were most likely not supported by Alchemist; this does not affect the results for the remaining files. Please report such files and we'll work on adding support for them.
App freezes¶
In rare cases, some source files may freeze the process. Try to identify these files, remove them from the sources, and run the operation again. If you can't identify the problematic files, please contact us.
Execution crashes or takes too much time¶
If you have hundreds or even thousands of files, you may run out of memory or the output may be too large. In this case, you can:
- Run in smaller chunks, i.e. run alchemist multiple times, each time providing a different set of input files or folders. Remember to use a different project name after the -p option on each run.
- Use a different output format, such as -f parquet.
Command not found¶
If typing alchemist doesn't work (program not found), make sure you've finished our tutorial. If it still doesn't work, use the full path to the executable. For example, on Windows (the quotes are needed because the path contains spaces):
```
"C:\Program Files (x86)\Alchemist\Alchemist\alchemist.exe" analyze src -p project_name -f xlsx path/to/a/file_or_folder
```
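On macOS, Linux, or another POSIX shell, a script can check whether alchemist is on the PATH before falling back to a full path. The sketch below is an illustration, not part of Alchemist: the function name resolve_alchemist is hypothetical, and the fallback path in the usage comment is a placeholder for wherever the executable is installed on your machine.

```shell
#!/bin/sh
# Hypothetical helper (not part of Alchemist).
# Prints the alchemist executable to use: the one found on PATH if present,
# otherwise the fallback path given as the first argument.
resolve_alchemist() {
    fallback="${1:-}"
    if command -v alchemist >/dev/null 2>&1; then
        command -v alchemist
    elif [ -n "$fallback" ] && [ -x "$fallback" ]; then
        printf '%s\n' "$fallback"
    else
        echo "alchemist not found; pass the full path to the executable" >&2
        return 1
    fi
}

# Usage (the path is a placeholder):
# ALCHEMIST=$(resolve_alchemist "/full/path/to/alchemist") || exit 1
# "$ALCHEMIST" -h
```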
Remember to consult the built-in help (alchemist -h or alchemist <command> -h) for the most up-to-date information on using the Alchemist CLI. For more information on how to use Alchemist, refer to the usage page.