
Memberships

Data Innovators Exchange

Public • 322 • Free

2 contributions to Data Innovators Exchange
Do you have a Deletion/Masking Strategy?
I am wondering whether you have a data deletion or data masking strategy in your projects for cases where you have to delete or mask (personal) data. If so, how do you find all the places where you have to delete or mask?

We usually tag data as personal or not in the metadata (in Data Vault, for example, when defining the Satellites), so this could serve as the basis. With column-level lineage you could then figure out where the data comes from (in case you have a PSA) and where it goes (Business Vault - Information Mart). Based on this, a procedure can do the deleting/masking/NULLing ...

What do you think about this, and do you have a tool or mechanism that does it?

Btw, what would you prefer?
1) NULL the values,
2) remove the whole row in the personal Satellite (but then you have to consider re-creating the PITs, as some pointers to the Satellites no longer exist), or
3) mask it with a static value (not simple hashing, of course), which also shows that there was something before and differentiates it from "normal" NULLs?

Thanks for your thoughts!
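To make the idea above more concrete, here is a minimal Python sketch of a metadata-driven erase procedure. Everything in it is an assumption for illustration only: the table names, the `PERSONAL_COLUMNS` and `LINEAGE` dictionaries, the `hub_hash_key` column and the `MASK_VALUE` marker are hypothetical, the lineage is simplified to table level instead of column level, and rows are plain dictionaries rather than database records.

```python
from typing import Any

# Hypothetical metadata: Satellite name -> columns tagged as personal.
PERSONAL_COLUMNS: dict[str, set[str]] = {
    "sat_customer_personal": {"first_name", "email"},
}

# Hypothetical lineage, simplified to table level: source -> direct targets.
LINEAGE: dict[str, list[str]] = {
    "psa_customer": ["sat_customer_personal"],
    "sat_customer_personal": ["bv_customer"],
    "bv_customer": ["im_customer_dim"],
}

MASK_VALUE = "***MASKED***"  # static marker, assumed never to occur in real data


def downstream(table: str) -> list[str]:
    """Return every table reachable from `table` by following the lineage."""
    seen: list[str] = []
    queue = [table]
    while queue:
        current = queue.pop(0)
        for target in LINEAGE.get(current, []):
            if target not in seen:
                seen.append(target)
                queue.append(target)
    return seen


def erase(table: str, rows: list[dict[str, Any]],
          hub_hash_key: str, strategy: str) -> list[dict[str, Any]]:
    """Return a copy of `rows` with the personal data of one key erased.

    strategy is "null" (option 1), "delete_row" (option 2) or "mask" (option 3).
    """
    if strategy not in ("null", "delete_row", "mask"):
        raise ValueError(f"unknown strategy: {strategy}")
    personal = PERSONAL_COLUMNS.get(table, set())
    result = []
    for row in rows:
        if row["hub_hash_key"] != hub_hash_key:
            result.append(dict(row))  # other keys stay untouched
        elif strategy == "delete_row":
            continue  # PITs pointing at this row would need to be rebuilt
        else:
            replacement = None if strategy == "null" else MASK_VALUE
            result.append({col: (replacement if col in personal else val)
                           for col, val in row.items()})
    return result
```

In this sketch, `downstream("sat_customer_personal")` lists the Business Vault and Information Mart objects that would also need attention, and `erase` applies one of the three options to a single Satellite. A real procedure would run the equivalent statements inside the database.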
1 like • Oct 11
I like the approach of not deleting the whole record (Option 1), so that you can still distinguish between "not delivered by the source" and "deleted". However, the question then arises of how to distinguish between "delivered with null values" and "deleted" without adding additional information from the deletion mechanism. Curious to see how you implemented the deletion process!
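The three cases can be told apart if the deletion mechanism is allowed to leave a small marker behind. The Python sketch below illustrates that variant only; it does add information, so it is not an answer to the "without additional information" part of the question. The `is_erased` flag and the `classify` function are hypothetical names, and a missing row is represented as `None`.

```python
from typing import Any, Optional


def classify(row: Optional[dict[str, Any]], column: str) -> str:
    """Classify one attribute of a Satellite row.

    Assumes the (hypothetical) erase procedure sets `is_erased` to True
    on every row it has NULLed.
    """
    if row is None:
        return "not delivered by the source"
    if row.get("is_erased"):
        return "deleted"
    if row[column] is None:
        return "delivered with null value"
    return "delivered"
```

Without such a flag, a row that was NULLed by the erase procedure and a row that arrived with NULLs from the source look identical, which is the ambiguity raised above.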
Datavault Builder Insights
Hi automation experts,

Over the last few weeks, I've had the opportunity to work with the automation tool Datavault Builder. It's a new tool for me, but it looks really promising. I would like to share some insights into the features that I particularly liked.

🔑 Finding the business key
An important step when developing an integration flow in Data Vault modeling is defining the business keys. The business key has to be unique, as it identifies the object. Sometimes this is not as easy as it seems. Datavault Builder supports you here with its Data Viewer, which gives visual feedback on whether the chosen combination of columns results in a unique identification. Additionally, a heatmap helps you find a proper business key. An even faster way to check whether your composed key is unique is to use the uniqueness check while creating a new hub load.

🌐 Using the working canvas
As your Data Warehouse grows over time, it can be really challenging to keep track of your data model. The working canvas of Datavault Builder can be used to display only parts of the model or to extend the model. By double-clicking a Hub, you can load everything related to that element. This interactive canvas lets you browse through the whole model step by step.

🏁 End-to-end automation
Many tools are able to generate the Raw Data Vault automatically, but the challenge is to automate the Business Vault as well. In Datavault Builder's business objects layer, you can prepare a denormalized output based on the Raw Data Vault. The Business Object generator takes away the work of manually joining the Data Vault elements and generates an as-of-now view, on top of which the business logic can be applied in the business rules layer. PITs are also created implicitly when you create a business object of SCD type 2. In this process, the relevant fields of the objects used are automatically added to the PIT table.
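For readers who want to see what a uniqueness check on a composed business key boils down to, here is a small Python sketch. It is not how Datavault Builder implements its check; the function names and the sample column names are assumptions made for illustration, and the rows are plain dictionaries.

```python
from collections import Counter
from typing import Any, Iterable, Sequence


def duplicate_keys(rows: Iterable[dict[str, Any]],
                   key_columns: Sequence[str]) -> dict[tuple, int]:
    """Return each key combination that occurs more than once, with its count."""
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


def is_unique(rows: Iterable[dict[str, Any]], key_columns: Sequence[str]) -> bool:
    """True if the chosen columns identify every row uniquely."""
    return not duplicate_keys(rows, key_columns)


# Hypothetical staging rows: the customer number alone is not unique,
# but together with the source system it is.
staging = [
    {"customer_no": 100, "source_system": "CRM"},
    {"customer_no": 100, "source_system": "ERP"},
    {"customer_no": 200, "source_system": "CRM"},
]
print(is_unique(staging, ["customer_no"]))
print(is_unique(staging, ["customer_no", "source_system"]))
```

Listing the duplicated combinations, rather than returning only a yes/no answer, makes it easier to see which additional column might complete the key.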
Julian Brunner
@julian-brunner-7531
Data Engineer/Consultant with focus on Data Vault 2.0 and Automation Tools

Active 6d ago
Joined Jul 31, 2024