What We Learn
Basic hub / satellite functionality, and the difference between green Party-Place-Thing- and blue Description-classes. We see how satellites and PiT-tables behave and how the view layer handles multiple sources and history data.
Preparations – The DS_DemoData Database
Locate the simulated source system data from the PersMaster person register and the ProjMaster Project Management software in the DS_DemoData database. For this first tutorial, only consider Person and Gender data from PersMaster and Person data from ProjMaster. Also, pretend that PersMaster’s Person data does not have the email and Home phone attributes, we will add those to the model as part of the next tutorial.
Inspect the data and get to know it.
Modeling
Model the classes Person and Gender like below. Refer to this to match the colors.
You are now building a Raw Model, so the modeled classes should closely correspond to the structure of the source data that you will be loading. As the structure of this tutorial data is normalized and makes sense, the Conceptual model will closely resemble the imported table structure.
For first-time Visual Paradigm users, here are step-by-step instructions how to get to this point, and beyond. Ellie users, see here for details.
Now proceed with the following steps in the modeling tool:
Do this… | …and this happens |
---|---|
Set both classes’ Implement metadata parameter to True | DW objects will be generated for the classes.
Note: ALWAYS do this for all classes you want to deploy, even if not specifically mentioned. |
Set Person.Person Id as Business key. | The Person class will get a hub. References to the Person class will be implemented as hashes in link tables. |
Set the Person.Date of birth attribute’s Rate of change to Never. | Mapping a source table’s column to this attribute will result in that column’s value being loaded into a satellite containing data that should never change. The default value for Rate of change is Low. |
Set the Person.Gender relation end’s Rate of change to Never. | As the Person -> Gender “soft reference” is implemented in a satellite, the Gender code will always be mapped to a satellite marked as containing never-changing data. |
Set Gender.Code as Business key. | The Gender class will get a hub. Blue Description classes work logically as lookup tables, so references to Gender will be code-based, from satellites. In the view layer, however, a runtime generated hash will be used as a consistent way of referencing views within the view layer. |
Now export the model to the working directory.
Mapping
At this point, there exists no DemoDW database to do anything in. In order to let D# Engine do all the hard work, we can use mapping files to create the empty tables in the Staging Area where the source data will be available for the hashing process. The tables will be installed during our first full DW install. To make this happen, we will need to create new mapping files for each simulated source system. Use D♯ Engine’s Develop.Mappings.New Mapping File command to create one file for PersMaster and one for ProjMaster. After this, the files will be open in the D♯ Engine UI.
Next, add all the columns to be mapped to their respective mapping file, one file at a time. See here for a more detailed explanation.
For reference, mapping file structure is described here.
Omit the email and phone number columns for now.
At this point you should also mark the primary key columns with an “x” and the foreign key columns with a dot-notation syntax to the referenced primary key column. This is not always necessary, but highly recommended, since D# Engine may need this information in more complex situations where it derives the source table relationships from the Conceptual model’s association data. Besides, you also get a source ER model with references if you do this, so it will be worth it.
Your completed 01_Mappings_PersMaster.csv mapping file will look like this:
Note that RAW_Person.ID is a purely technical surrogate key, and as such it is not interesting, so it has not been mapped and will not end up in the DW.
Now do the mapping for the ProjMaster Person data, the second “source system”, using the same approach. The end result should look like this:
Notice that not all attributes will always be present in all sources where a class is loaded from, and the mapping files reflect this as well. Also, in time, Person (and other classes) may gain new attributes when new source systems are added.
Save the mapping files.
Generating the DW
Refresh D♯ Engine. If the model and mappings are in sync, D# Engine will proceed to generate the Data Vault implementation. If not, you will get a list of issues to correct.
When the DW has been generated inside D# Engine, you can double-click the Server.DemoDW.dwRaw-schema in the Server Model pane and take a look at the entire schema structure. Verify that you have the following Raw Vault structure (click image to enlarge).
Install the DW
Run the Full DW Install command. You will get a hierarchical structure containing the deployment process. Press the Deploy button, and D# Engine will by default generate the SQL code, save it as a file, and then execute it. SQL Server Management Studio should start with the generated file opened, so you should just need to execute the generated code by pressing the Execute button. If not, open the file manually from the Workspace\Output\SQL folder. Alternatively, you can in D# Engine click on SQL -> SQL Mode -> Set Copy to Clipboard, which copies all SQL output to the clipboard instead of writing it to a file. In this case, you can just paste the SQL wherever you think it should go.
After a successful install, do some surfing in the newly created database structures. Browse the SQL code for:
- the different levels of orchestration procedures
- hash procedures
- error procedures
- load procedures for each table type
- the view layer in the dwBV-schema
Run Tutorial Scripts
Run each of the tutorial script commands from the Help -> Tutorials -> Intro Course -> Person Tutorial menu, and inspect the results.
Script | Source data | Main points of interest |
---|---|---|
Step 1: Load some base data | Two source systems, 3 persons in each, 1 person in both. | The log has been populated. Person and Gender related hubs and satellites all contain data. The views publish the data correctly. |
Step 2: Source data updated and deleted | Danielson changed to Danielsson. Eriksson has been removed. |
The satellite contains two versions of Danielsson’s name. The Point-in-Time table reflects the change. The K_Person view shows the latest name version for Danielsson, and Eriksson is present, as is correct. |