Memberships

Data Innovators Exchange

Public • 322 • Free

16 contributions to Data Innovators Exchange
Agile with Link tables
Hoping I can get some feedback on an issue we are really struggling with. We are trying to build our Data Vault by adding small pieces at a time, which works great with hubs and sats but breaks down when you get to links. We have some idea of the full build, but we need to work on projects without having the entire picture. How do we handle adding new keys to an existing Link? Is it better to just add a new object? But then what about all the relationship history that has already been collected?
4
4
New comment 6d ago
3 likes • 17d
Hi @Stefanie Culley, if you don't change the structure of the Link (same keys, same grain, same meaning), I would simply add the new sources to the already existing Link table. The "high-water-mark" (represented by the load date timestamp) should consider the record source in the group by while loading the Link. This keeps the already loaded history, and you will also load the history from the new source, if it exists in a Data Lake for example. With an additional Effectivity Satellite you can still differentiate between the different sources when it comes to source-separated interpretations.
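For illustration, here is a rough sketch of that loading pattern. All table and column names (stg_customer_order, link_customer_order, hk_*, record_source, load_date) are made up for the example, not taken from any real model:

```sql
-- Sketch: incremental Link load with a high-water-mark per record source,
-- so a newly added source can back-load its full history without touching
-- what the existing sources already loaded.
INSERT INTO link_customer_order (hk_customer_order, hk_customer, hk_order, load_date, record_source)
SELECT DISTINCT
       stg.hk_customer_order,
       stg.hk_customer,
       stg.hk_order,
       stg.load_date,
       stg.record_source
FROM stg_customer_order AS stg
LEFT JOIN (
    -- the "group by record source" from the comment: one high-water-mark per source
    SELECT record_source, MAX(load_date) AS hwm_load_date
    FROM link_customer_order
    GROUP BY record_source
) AS hwm
  ON hwm.record_source = stg.record_source
WHERE (hwm.hwm_load_date IS NULL              -- brand-new source: load its whole history
       OR stg.load_date > hwm.hwm_load_date)  -- known source: only newer deltas
  AND NOT EXISTS (                            -- Link stays insert-only: one row per relationship
      SELECT 1
      FROM link_customer_order AS l
      WHERE l.hk_customer_order = stg.hk_customer_order
  );
-- In a real load you would additionally pick one row per link hash key within the batch.
```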
How to connect multiple dbt core projects
dbt Cloud has "dbt Mesh" to connect your dbt projects. In the Core version there is no native dbt functionality for that, but there is a Python package called "dbt-loom", developed by Nicholas Yager, to connect your dbt projects. We have had it in place for almost a year and it works really well. This functionality is crucial when implementing a Data Mesh paradigm. Have a nice weekend! https://github.com/nicholasyager/dbt-loom
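Roughly, this is what it looks like from the downstream project's side once dbt-loom is installed and pointed at the upstream project's manifest. The project and model names below are made up, and the exact configuration belongs in the dbt-loom README:

```sql
-- models/marts/customer_orders.sql in the downstream dbt project (illustrative name).
-- With dbt-loom loading the upstream manifest, the two-argument ref() can resolve
-- models across project boundaries; 'upstream_project' and 'dim_customers' are
-- assumed names, not something defined in this post.
select
    c.customer_hk,
    c.customer_name,
    o.order_hk,
    o.order_date
from {{ ref('upstream_project', 'dim_customers') }} as c
join {{ ref('orders') }} as o    -- a local model in this project
    on o.customer_hk = c.customer_hk
```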
5
2
New comment Oct 21
1 like • Oct 21
Not that I am aware of (for the core version). But yes, the interfaces should be documented, at least in a data catalog tool.
Do you have a Deletion/Masking Strategy?
I am wondering whether you have a data deletion or data masking strategy in your projects for cases where you have to delete/mask (personal) data. If yes, how do you find all the places where you have to delete/mask?

The place where we usually tag data (as personal or not) is the metadata (in Data Vault, for example: when defining the Satellites), so this could be used as the basis. Using column-level lineage, you could then figure out where the data is coming from (in case you have a PSA) and where it goes to (Business Vault, Information Mart). Based on this, a procedure can do the deleting/masking/NULLing. What do you think about this, and do you have a tool/mechanism which does that?

By the way, which would you prefer?
1) NULL the values,
2) remove the whole row in the personal Satellite (but then you have to consider re-creating the PITs, as some pointers to the Satellites no longer exist), or
3) mask it with a static value (not simple hashing, of course), also to see that there was something before and to differentiate it from "normal" NULLs (a rough sketch of this option follows below)?

Thanks for your thoughts!
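A minimal sketch of option 3, assuming a hypothetical personal Satellite sat_customer_pii keyed by hk_customer and a hypothetical deletion_requests table that collects the affected keys:

```sql
-- Sketch: overwrite personal attributes with a static mask value instead of NULLing,
-- so masked rows stay distinguishable from "normal" NULLs.
-- Table and column names are assumptions for the example only.
UPDATE sat_customer_pii
SET    first_name = '***MASKED***',
       last_name  = '***MASKED***',
       email      = '***MASKED***'
WHERE  hk_customer IN (
           SELECT hk_customer
           FROM   deletion_requests
           WHERE  status = 'approved'
       );
-- Note: if the Satellite carries a hash diff over these columns, it would need to be
-- recalculated or excluded from comparisons after masking.
```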
6
2
New comment Oct 11
Data Innovator Community Meetups
To all Data Innovators,
Where do you want to see the next Data Innovator events? Due to the huge success of the community meetups in London and Munich, we want to keep them coming. At these events we got very good feedback on the focus of the workshops, so we are going to keep that focus and stay with the one-day format:
- Success story
- 2 workshops
- Panel discussion
The team is currently brainstorming locations and would love to hear your thoughts on this 🙂 So please let us know in the comments where you want a Data Innovators event to take place, and we will do our best to make it happen 🙂 Thank you for your input!
6
9
New comment Oct 10
0 likes • Oct 9
Vegas!
I got my ticket for Data Dreamland! You too?
Excited to be presenting alongside @Christof Wenzeritt at Data Dreamland! Our main topic is Data Mesh and how to implement it. It offers an effective approach for managing large-scale data, but the transition isn't always smooth sailing. Join us as we delve into the challenges of designing ownership structures within a Data Mesh framework. We'll explore how to achieve the optimal balance between technical expertise and organizational design. Let's share our knowledge and navigate the exciting world of Data Mesh together! Get your ticket here: https://scalefr.ee/d7goui
8
2
New comment Sep 12
Marc Winkelmann
4
83 points to level up
@marc-winkelmann-2004
Hi, my name is Marc and I implement Data Platforms with a focus on Data Vault 2.0. Looking forward to chatting with you :)

Active 11d ago
Joined Jun 27, 2024
Hanover, Germany