Extract Phase
In the extract phase, the TaskApp with the 01qvdbuilder scenario is used. The app extracts tables from a source database, such as SAP, and stores them in a dedicated folder.
- Tables may already have been extracted but are not yet in .qvd format, or are split across multiple files for sizing reasons.
- Files that are not ready for further modeling, because they are dispersed or not in .qvd format, are stored in 01_Extract/extract and then converted by the TaskApp into .qvd files, which are stored under 01_Extract/transformed.
QVD is a native Qlik format that can only be written and read by Qlik Sense® or QlikView. The file format is optimized for speed when reading data from a script, but is still very compact. For more information, see the Qlik Sense® help.
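As a minimal sketch of the round trip (the table and library path are placeholders, not part of the mpmX folder structure), storing and reloading a table as a .qvd file looks like this in a Qlik load script:

```
// Store a loaded table as an optimized .qvd file (placeholder names)
STORE MyTable INTO [lib://mpmX_data/MyTable.qvd] (qvd);

// Reload it later; a plain LOAD * from a .qvd runs as an optimized (fast) load
MyTable:
LOAD * FROM [lib://mpmX_data/MyTable.qvd] (qvd);
```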
Use the new variables to store and load the tables:
LET mpmXvData = 'lib://mpmX_data/';
LET mpmXvETLFolder = '$(mpmXvData)process_analytic_models/$(mvProcessDescription)/';
Example
To create the event “Create Purchase Order” from an SAP database, we need to gather all related tables.
An event has three key pieces of information that are required for process mining:
- CaseID
- event timestamp
- event definition
The CaseID is shared by all events that belong to the same case, while the timestamp is unique for each event.
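Conceptually, each event record combines exactly these three pieces. A sketch with illustrative (hypothetical) field names:

```
// One row per event (field names are illustrative, not from the mpmX model)
Events:
LOAD
    %CaseID,                                // which case the event belongs to
    EventTimestamp,                         // when the event happened
    'Create Purchase Order' as EventName    // which event it is
FROM [lib://mpmX_data/events.qvd] (qvd);
```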
Think about how to define an event and which tables deliver the required information (e.g. timestamp or CaseID). We need to define a suitable CaseID. In this example, the CaseID of a purchase process can be defined as follows:
- CaseID: InvoiceNr_Client_CompanyCode_FiscalYear
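In the load script, such a composite CaseID is built by concatenating the fields with a separator. A sketch, assuming illustrative field names from the invoice table:

```
// Hypothetical field names; adjust to the fields of the actual source table
LOAD
    InvoiceNr & '_' & Client & '_' & CompanyCode & '_' & FiscalYear as %CaseID;
```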
Next, we need to extract the tables where the fields that are required to create the Case ID and to get the timestamps are located. A CaseID can combine fields from multiple tables. For the event “Create Purchase Order” we use these tables:
- CDHDR: change log, stores change document header data.
- CDPOS: change log, stores change document item data.
- RSEG: stores Document Item: Incoming Invoice data.
As different tables can contain the same field names (e.g. CLIENT, or MANDANT in German, in the script below), we qualify the fields of each table with the function below to avoid synthetic keys and circular references.
MW_qualifyFieldNames('TableName','Qualifier','Unqualify')
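For example, the call below qualifies all fields of the CDHDR table with the prefix CDHDR, while fields matching the pattern '%*' (the key fields) stay unqualified. This is a sketch; the exact renaming (e.g. the separator between qualifier and field name) depends on the helper's implementation:

```
// Prefix every field of CDHDR, except keys matching '%*'
call MW_qualifyFieldNames('CDHDR','CDHDR','%*');
// e.g. USERNAME becomes CDHDR.USERNAME, while %CDHDR_PKEY keeps its name
```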
Then, the extracted table is stored as a .qvd file.
Finally, the table is dropped to free RAM for better performance.
Example: extraction of the CDHDR table
// Name the table and then insert the load script with the data connection dialog on the right side
LET mpmXvData = 'lib://mpmX_data/';
LET mpmXvETLFolder = '$(mpmXvData)process_analytic_models/$(mvProcessDescription)/';
SET mvDateFilter = '2018-07-01';
SET mvYearFilter = '2018';
sub mpmX_execute
CDHDR:
LOAD MANDANT & '_' & OBJECTCLAS & '_' & OBJECTID & '_' & CHANGENR as %CDHDR_PKEY,
timestamp(date(UDATE) + time(UTIME)) as AETST,
MANDANT,
OBJECTCLAS,
OBJECTID,
CHANGENR,
USERNAME,
UDATE,
UTIME,
TCODE,
PLANCHNGNR,
ACT_CHNGNO,
WAS_PLANND,
CHANGE_IND,
LANGU,
VERSION,
_DATAAGING;
SELECT MANDANT,
OBJECTCLAS,
OBJECTID,
CHANGENR,
USERNAME,
UDATE,
UTIME,
TCODE,
PLANCHNGNR,
ACT_CHNGNO,
WAS_PLANND,
CHANGE_IND,
LANGU,
VERSION,
_DATAAGING
FROM CDHDR
WHERE OBJECTCLAS EQ 'EINKBELEG' AND UDATE GE '$(mvDateFilter)';
// Qualify the table fields: MW_qualifyFieldNames('TableName','Qualifier','Unqualify')
call MW_qualifyFieldNames('CDHDR','CDHDR', '%*');
// The qvd is stored in the respective folder structure (defined in section main)
Store CDHDR into [$(mpmXvETLFolder)01_Extract/transformed/CDHDR_EINKBELEG.qvd] (qvd);
// Table is dropped to clear RAM (performance)
Drop Table CDHDR;
end sub