Skip to main content
Version: v2.6

Application Overview

To find the Process Mining App, login to your Databricks workspace and navigate to Compute ยป Apps. Clicking on the app's name will take you to a page where you can find the app's URL. Click on the URL to open the app.

Please note that only users who have been granted permission can see and use the app.

App overview
App overview
App details
App details

All Scenarios Pageโ€‹

The Databricks Process Mining App starts with the All Scenario page. Here you can view all the available process mining scenarios, create new process scenarios or edit and delete existing ones. Each scenario contains all the necessary details for mining your data, including data source definitions within Databricks and the process mining parameters. You can review and edit existing scenarios, delete them, or add new ones as needed.

The "โœŽ" button takes you to the Data Source page, where you can start editing an existing Process Scenario.

The "โฟ" button takes you to the Data Source page, where you can view the Process Scenario configuration. You cannot make any changes to the Process Scenario in this mode.

The "๐Ÿ—‘๏ธ" button allows you to delete an existing Process Scenario. You must confirm your intention.

All Scenarios
All Scenarios

Create a Process Scenarioโ€‹

The "โž•ย Create a Process Scenario" button takes you to a new page.

Create a Process Scenario
Create a Process Scenario
Create a Process Scenario
Create a Process Scenario

Here you enter the Process scenario short name and Process scenario description of the new Process Scenario.

You can also decide whether you want to create an OCPM Process Scenario.

Delta Lake time travel is not supported at the moment. See What is Delta Lake time travel?.

Data Sources Pageโ€‹

Data sources are the starting point for process mining.

Each data source must adhere to a defined data structure, and our validation process ensures that everything is formatted correctly before moving forward. If you encounter any validation issues while configuring the data sources, see Troubleshooting Data Sources Validation for help.

Buttons
Buttons

Select an Event Logโ€‹

An event log is the central and mandatory input for process mining. It is a table that contains case and event information. To further enrich the event log with your domain specific data, you can also add additional Case and Event Dimensions.

The "โž• Select Table as Event Log" button allows you to reference a table that you can access as an event log.

note

The Select a Table dialog displays only those tables for which you have both SELECT and MANAGE privileges. In addition, you must have at least USE CATALOG and USE SCHEMA permissions to view and select a table.

Select an Event Log
Select an Event Log

Click the "๐Ÿ—‘๏ธ" button to remove an existing reference which then allows the selection of a different Event Log.

Data Source page
Data Source page

Dimensionsโ€‹

The "โž• Select Table as Case Dimensions" and "โž• Select Table as Event Dimensions" buttons allow you to reference tables that you can access as additional case and event dimensions respectively.

Click the "๐Ÿ—‘๏ธ" button to remove an existing reference which then allows the selection of different Case or Event Dimensions.

Passthrough Tablesโ€‹

The "โž• Select Table to pass through" buttons allow you to reference tables that are passed through the Databricks Process Mining App as-is, meaning a view is created for each passthrough table.

Click the "๐Ÿ—‘๏ธ" button to remove an existing reference which then allows the selection of a different Passthrough Table.

Grouping Pageโ€‹

Grouping page
Grouping page

By defining groups, you can combine related activities to get a better overview of your processes in the process analyzer. Groups can be defined in a hierarchy by configuring one group as the parent of another.

To add a new group, click the "โž•" button.

Fill out the Name field with the name of the group.

The Parent Group field lets you define a parent group. Parent groups are optional. A group cannot reference itself as its parent.

Use the Activities field to define which activities belong to your group.

  • Click on the "ห…" to open a list of activities to choose from.
  • Select as many activities as you like, but each activity can only belong to one group.
  • Click the "X" next to an activity name to delete it.

To edit an existing group, simply change the field values as required.

To delete an existing group, click the "๐Ÿ—‘๏ธ" button next to the group.

Subprocess Leadtime Pageโ€‹

Subprocess Leadtime page
Subprocess Leadtime page

Define subprocesses to calculate lead times and other time-related measures are calculated for partial processes.

To understand why this is useful, take a look at the Subprocess Leadtime ยป Use Cases.

Add a New Subprocessโ€‹

To add a new subprocess, click the "โž•" button and fill out the fields appropriately.

The Subprocess field lets you define a name for your subprocess.

Use the Start Activity field to select all activities that mark the start of the subprocess.

The Include Start Activity check mark below defines if the duration of the activity will be included in or excluded from the calculated lead time.

Sometimes you may have more than one activity marked as the Start Activity show up in a process variant - either because two different start activities appear or a single start activity is repeated. The Min/Max Start Activity field is then used to define if the first or last matching activity will be used for lead time calculation. Min uses the first matching activity (resulting in a longer process leadtime), while Max uses the last matching activity (resulting in a shorter process leadtime).

Use the End Activity field to select all activities that mark the end of the subprocess.

The Include End Activity check mark below defines, if the duration of the activity will be included in or excluded from the calculated lead time.

By configuring the Min/Max End Activity field for the End Activitiy you can define if the first or last matching activity will be used for lead time calculation. Here, the resulting impact is the opposite of that of the Start Activity. Min uses the first matching activity (resulting in a shorter subprocess leadtime), Max uses the last matching activity (resulting in a longer subprocess leadtime).

The Target Time [d] and Target Time Operator fields define the desired lead time for the subprocess. They are used to calculate whether the lead time has been missed.

End of Process Pageโ€‹

End of Process page
End of Process page

By default, the mpmX app shows leadtimes and other information from an average of all cases, but this can sometimes be misleading.

  • For example, cases that are not finished will have shorter leadtimes just because they have not reached the end.
  • The number of process variants (specific path that each case takes) will be greater if you include both open and closed cases, than if you only counted closed cases.

On the End of Process page you can define when a process is considered to be completed.

  • If the End of process condition is set to None, all processes are considered to be completed.
  • Select using Activity to then select an activity which marks a process as completed.
    End of Process - using Activity
    End of Process - using Activity
  • Select using Custom Field to define more complex conditions.
    • On the left, you can define which field of the activity log should be used.
    • In the middle select the appropriate operator (= or IN).
    • On the right, define the value (in case of =) or the comma-separated list of values which mark a process as completed.
      End of Process - using Custom Field
      End of Process - using Custom Field

Time Travel Pageโ€‹

info

Delta Lake time travel is not supported at the moment. See What is Delta Lake time travel?.

Additional Customizations Pageโ€‹

On the Additional Customizations page you can configure some rework and automation settings as well as some miscellaneous settings.

Additional Customizations page
Additional Customizations page

Reworkโ€‹

A rework event is any activity that indicates an unexpected change, such as a Purchase Order being adjusted or deleted.

By selecting a Rework definition option you can define which kinds of events should be taken into account when calculating rework related measures. Options are

  • <empty> - rework related measures will not be calculated
  • ReworkEvent - events are considered if they are defined as rework (see Rework Event Expression (by activity type) below)
  • RepeatedEvent - events are considered if they are repeated in a loop
  • ReworkAndRepeatedEvent - events are considered if they are defined as rework AND are repeated
  • ReworkOrRepeatedEvent - events are considered if they are defined as rework OR are repeated

An event is regarded as rework if its activity type matches the Rework Event Expression (by activity type).

Automationโ€‹

  • The Automation limit [%] indicates the percentage of automated events that a case needs to have to be labeled as an automated case.
  • An event is regarded as automated if its user name field matches the Automated Event Expression (by user name).

Miscellaneousโ€‹

  • If Reduce timestamps? is checked, the number of distinct timestamps is reduced by flooring seconds and milliseconds to minutes. The lead and process times are not influenced, just the final timestamp output format is shortened to minutes.
  • If Suppress loops in process paths? is checked, process variants are considered equal if they only differ in how often an activity occurs in succession. All repetitions are kept in the event log but there will be fewer distinct variants.
  • Check Run pareto analysis to enable Pareto/ABC Analysis.

Conformance Checking Pageโ€‹

Conformance Checking is an optional module in mpmX and will unlock the analysis in the Conformance sheet.

On the Conformance Checking page you can define your Happy Paths, or ideal process paths, as well as configure other mining related settings.

Conformance Checking
Conformance Checking

Conformance Checking Parametersโ€‹

  • Use the Only finished cases? field to control whether only finished cases should be regarded or not.
  • The Optimization Potential Threshold field defines which cases will be analyzed for process governance optimization potential by comparing the happy path fitness of the case to the specified threshold.

Happy Pathsโ€‹

This is where you define your happy paths.

To add a new happy path, click the "โž•" button and fill out the fields appropriately.

  • The Process Path field contains the ideal process paths.
  • Use the Description field to describe what the happy path represents.
  • The Condition field is used to limit the cases the happy path is checked against.
  • If Active is checked, then this happy path will be used in the process mining.

To edit an existing happy path, simply change the field values as required.

To delete an existing happy path, click the "๐Ÿ—‘๏ธ" button next to the happy path.

Task Execution and History Pageโ€‹

Task and Execution
Task and Execution

Task Execution SQL Warehouseโ€‹

The "Overwrite default SQL warehouse" button allows you specify a SQL warehouse that is used solely by the process mining tasks. You can specify a different SQL warehouse for each scenario, enabling you to select the most suitable SQL warehouse size for your dataset.

Clicking the "Use default SQL warehouse" button resets the SQL warehouse reference so that the default SQL warehouse is used by the process mining tasks once again.

The "Change SQL warehouse" button allows you to update the referenced SQL warehouse alongside with your evolving dataset.

Task Execution Schedulingโ€‹

  • This section allows you to configure a schedule for regular automatic process mining runs by defining a Cron expression and a time zone. To understand how a cron expression is structured, please refer to the Quartz Cron Syntax documentation. Please note that a timezone specification is required in addition to the Quartz Cron Expression.

Ad-hoc Task Executionโ€‹

  • The Start Mining Task button allows you to manually start a process mining task.

Mining Summary of last executionโ€‹

  • This section shows the basic KPIs of the process model after a successful mining run.

Task Execution Historyโ€‹

The Task Execution History section displays the status of the most recent process mining runs.

  • The Start (UTC) and End (UTC) columns contain the start and end timestamps of the mining run
  • Duration (hh:mm:ss) contains the total time of the mining run.
  • The States column contains one of the following values:
    • Running - The process mining task is running.
    • Failed - The process mining task has failed. Error details can be found in the Error Mesage column.
    • Succeeded - The process mining task was successful.
    • Scheduled - The process mining task is scheduled to run at the displayed Start (UTC) time.
  • An Error Message will appear if anything went wrong.

Permissions Pageโ€‹

Permissions page
Permissions page

The Permissions page allows you to manage access control for the process scenario. You can grant or revoke permissions for Databricks users, groups, and service principals at the scenario level.

To grant permissions, click the "Grant" button and fill out the required fields:

  • The Principals field allows you to select the group, user, or service principal to grant permissions to
  • The Privileges field lets you choose between CAN USE (data consumer access) or CAN MANAGE (administrative access)
Permissions Grant
Permissions Grant

Once you have configured the permission details, click the "Confirm" button to apply the changes.

To revoke existing permissions, select them using the checkbox next to the permission entry you want to remove, then click the "Revoke" button. You will be prompted to confirm the revocation.

Permissions Revoke
Permissions Revoke
info

For detailed information about the different permission levels and their capabilities, see Security ยป Permission Levels.

App Settings Pageโ€‹

This page allows you to configure all operational parameters for the app.

App Settings
App Settings

Default SQL Warehouseโ€‹

The "Change default SQL warehouse" button allows you to change the referenced SQL warehouse. It is used as a fallback for the process mining task if a scenario specifies no other SQL warehouse. We recommend that you select a warehouse of size 2X-Small and overwrite the process mining warehouse per scenario with a SQL warehouse of an appropriate size for your dataset.

See also: Installation and Update ยป Compute

Permissionsโ€‹

The Permissions section allows you to manage access control for the application. You can grant or revoke permissions for Databricks users, groups, and service principals at the application level.

To grant permissions, click the "Grant" button and fill out the required fields:

  • The Principals field allows you to select the group, user, or service principal to grant permissions to
  • The Privileges field lets you only choose CAN MANAGE (full administrative access) for now

Once you have configured the permission details, click the "Confirm" button to apply the changes.

Permissions Grant
Permissions Grant

To revoke existing permissions, select them using the checkbox next to the permission entry you want to remove, then click the "Revoke" button. You will be prompted to confirm the revocation.

Permissions Revoke
Permissions Revoke
info

For detailed information about the different permission levels and their capabilities, see Security ยป Permission Levels.