Memberships

Data Innovators Exchange

Public • 322 • Free

16 contributions to Data Innovators Exchange
Agile with Link tables
Hoping I can get some feedback on an issue we are really struggling with. We are trying to grow our Data Vault by adding small pieces at a time, which works great with Hubs and Satellites but breaks down when you get to Links. We have some idea of the full build, but we need to work on projects without having the entire picture. How do we handle adding new keys to an existing Link? Is it better to just add a new object? But then what about all the relationship history that was already collected?
4
4
New comment 7d ago
3 likes • 18d
Hi @Stefanie Culley, if you don't change the structure of the Link (same keys, same grain, same meaning), I would simply add the new sources to the already existing Link table. The "high-water mark" (represented by the load date timestamp) should consider the record source in the GROUP BY while loading the Link. This keeps the already loaded history, and you will also load the history from the new source, if it exists in a Data Lake for example. With an additional Effectivity Satellite you can still differentiate between the different sources when it comes to source-separated interpretations.
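A minimal SQL sketch of the loading pattern described above, assuming a two-key Link and a staging table; all object and column names (stg_customer_order, link_customer_order, hk_*) are illustrative placeholders, not taken from the thread.

```sql
-- Sketch only: Link load whose high-water mark is computed per record source,
-- so an existing source only loads deltas while a newly added source
-- (no high-water-mark row yet) can back-load its full history.
INSERT INTO link_customer_order (hk_customer_order, hk_customer, hk_order, record_source, load_date)
SELECT DISTINCT
    stg.hk_customer_order,
    stg.hk_customer,
    stg.hk_order,
    stg.record_source,
    stg.load_date
FROM stg_customer_order AS stg
LEFT JOIN (
    -- high-water mark grouped by record source
    SELECT record_source, MAX(load_date) AS max_load_date
    FROM link_customer_order
    GROUP BY record_source
) AS hwm
    ON hwm.record_source = stg.record_source
WHERE (hwm.max_load_date IS NULL OR stg.load_date > hwm.max_load_date)
  -- keep the Link insert-only and unique per relationship hash key
  AND NOT EXISTS (
      SELECT 1
      FROM link_customer_order AS l
      WHERE l.hk_customer_order = stg.hk_customer_order
  );
```

An Effectivity Satellite hanging off the same Link can then carry the per-source validity if source-separated interpretation is needed.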
How to connect multiple dbt core projects
dbt Cloud has "dbt Mesh" to connect your dbt projects. In the Core version there is no native dbt functionality for that, but there is a Python package called "dbt-loom", developed by Nicholas Yager, to connect your dbt projects. We have had it in place for almost a year and it works really well. This functionality is crucial when implementing a Data Mesh paradigm. Have a nice weekend! https://github.com/nicholasyager/dbt-loom
5
2
New comment Oct 21
1 like • Oct 21
Not that I am aware of (for the core version). But yes, the interfaces should be documented, at least in a data catalog tool.
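For reference, a minimal sketch of what the dbt-loom configuration mentioned above could look like: a dbt_loom.config.yml placed in the downstream project. The project name and path are placeholders, and the exact fields may differ between dbt-loom versions, so check the linked repository for the current format.

```yaml
# dbt_loom.config.yml -- sketch only; project name and path are placeholders.
# Each entry points dbt-loom at the manifest of an upstream dbt Core project,
# so its public models can be referenced from this project.
manifests:
  - name: upstream_project        # name used when resolving cross-project refs
    type: file                    # read the manifest from the local filesystem
    config:
      path: ../upstream_project/target/manifest.json
```

Upstream models can then typically be referenced from the downstream project with dbt's two-argument ref, e.g. ref('upstream_project', 'some_model').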
Do you have a Deletion/Masking Strategy?
I am wondering whether you have a data deletion or data masking strategy in your projects for cases where you have to delete/mask (personal) data. If yes, how do you find all the places where you have to delete/mask? The place where we tag data (as personal or not) is usually the metadata (example in Data Vault: when defining the Satellites), so this could be used as the basis. With column-level lineage you could figure out where the data is coming from (in case you have a PSA) and where the data goes to (Business Vault, Information Mart). Based on this, a procedure can do the deleting/masking/NULLing. What do you think about this, and do you have a tool/mechanism that does that? By the way, what would you prefer? 1) NULL the values, 2) remove the whole row in the personal Satellite (but then you have to consider re-creating the PITs, as some pointers to the Satellites no longer exist), or 3) mask it with a static value (not simple hashing of course), also to see that there was something before and to differentiate from "normal" NULLs? Thanks for your thoughts!
6
2
New comment Oct 11
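To make the options in the post above concrete, here is a minimal SQL sketch of option 3 (masking with a static value) for a hypothetical personal-data Satellite; all table and column names (sat_customer_pii, deletion_requests, ...) are illustrative only.

```sql
-- Sketch of option 3: overwrite personal attributes with a static marker value,
-- so masked rows remain distinguishable from "normal" NULLs.
-- Option 1 would set the columns to NULL instead; option 2 would delete the
-- rows and require rebuilding the PITs. All names below are placeholders.
UPDATE sat_customer_pii
SET customer_name = '***MASKED***',
    email_address = '***MASKED***'
WHERE hk_customer IN (
    SELECT hk_customer
    FROM deletion_requests   -- hypothetical driving table of delete/mask requests
);
```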
Data Innovator Community Meetups
To all Data Innovators: where do you want to see the next Data Innovators events? Due to the huge success of the community meetups in London and Munich, we want to keep them coming. At these events we got very good feedback on the focus of the workshops, so we are going to keep that focus and stay with the 1-day format:
- Success story
- 2 workshops
- Panel discussion
The team is currently brainstorming locations and would love to hear your thoughts on this 🙂 So please let us know in the comments where you want a Data Innovators event to take place and we will do our best to make it happen 🙂 Thank you for your input!
6
9
New comment Oct 10
0 likes • Oct 9
Vegas!
I got my ticket for the Data Dreamland! You also?
Excited to be presenting alongside @Christof Wenzeritt at Data Dreamland! Our main topic is Data Mesh and how to implement one. It offers an effective approach for managing large-scale data, but the transition isn't always smooth sailing. Join us as we delve into the challenges of designing ownership structures within a Data Mesh framework. We'll explore how to achieve the optimal balance between technical expertise and organizational design. Let's share our knowledge and navigate the exciting world of Data Mesh together! Get your ticket here: https://scalefr.ee/d7goui
8
2
New comment Sep 12
Marc Winkelmann
4
83 points to level up
@marc-winkelmann-2004
Hi, my name is Marc and I implement Data Platforms with a focus on Data Vault 2.0. Looking forward to chatting with you :)

Active 12d ago
Joined Jun 27, 2024
Hanover, Germany