What We Learn
We can automatically apply a source-aware algorithm for deletion detection to mark data as deleted.
Scenario
We want to start soft-deleting Projects. In a full-load situation where the same class may be loaded from several source systems, we can activate the enhanced deletion detection mechanism simply by setting the corresponding parameter to true for the class in question.
In short, if State tracking has been turned on:
- Algorithm off: “Don’t mark any of the class’s hashes as deleted if any source system for that class has zero rows to load in this batch. If every source has some data to load, mark as deleted all hashes that are missing from all of the source material.”
- Algorithm on: “Mark as deleted those hashes that no longer exist in any of the sources they have previously been loaded from, but leave untouched those hashes that come only from sources with zero rows to load in this batch.”
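The two rules above can be sketched in a few lines of Python. This is only an illustration of the wording, not the product's implementation: the function names, the `history` mapping (hash to the sources it has been loaded from before) and the `batch` mapping (source to the hashes present in this batch) are all hypothetical. The sketch also assumes that, with the algorithm on, a hash is checked only against sources that actually have rows in this batch.

```python
from typing import Dict, Set

# Hypothetical structures, introduced only for this illustration:
History = Dict[str, Set[str]]  # hash -> sources it has been loaded from before
Batch = Dict[str, Set[str]]    # source -> hashes present in this batch


def deleted_algorithm_off(history: History, batch: Batch) -> Set[str]:
    """Algorithm off: all-or-nothing check across every source of the class."""
    # If any source has zero rows in this batch, mark nothing as deleted.
    if any(len(rows) == 0 for rows in batch.values()):
        return set()
    # Otherwise mark every known hash that is missing from all source material.
    present = set().union(*batch.values())
    return {h for h in history if h not in present}


def deleted_algorithm_on(history: History, batch: Batch) -> Set[str]:
    """Algorithm on: source-aware check, evaluated per hash."""
    deleted = set()
    for h, sources in history.items():
        # Sources this hash came from before that have rows in this batch.
        active = [s for s in sources if batch.get(s)]
        if not active:
            # The hash comes only from sources with zero rows: leave it alone.
            continue
        if all(h not in batch[s] for s in active):
            deleted.add(h)
    return deleted


history = {"p1": {"A"}, "p2": {"A", "B"}, "p3": {"B"}}
batch = {"A": {"p2"}, "B": set()}  # source B delivers zero rows

print(deleted_algorithm_off(history, batch))  # set()
print(deleted_algorithm_on(history, batch))   # {'p1'}
```

In this example source B delivers nothing, so the simple rule refuses to mark anything. The source-aware rule still marks `p1`, which is missing from its only source A, and leaves `p3` alone because it has only ever come from B.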
Modeling
Soft delete requires the State tracking method parameter to be set to Simple for the class we want to soft delete. We’ll just do this:
do this… | …and this will happen |
---|---|
For the Project class, set the State tracking method parameter value to Simple. | The loading mechanism will start tracking how previously loaded hashes disappear and reappear in the source data. |
Deploy the Changes And Try It Out
Deploy the Project class, then run the test script. During the script, when required, the deletion detection parameter can be set like this:
Script | Source data | Main points of interest |
---|---|---|
Step 1: Go through functionality | Several batches with source data being deleted | Run the script step by step and familiarize yourself with the algorithm’s behaviour. |