
Memberships

Data Innovators Exchange


6 contributions to Data Innovators Exchange
Seeking tips on Data Mesh & Fabric Organization
"I'm currently diving deeper into the concepts of Data Mesh and Data Fabric, especially from an organizational and strategic perspective. Does anyone have good resources or reading recommendations on this topic? I'm particularly interested in the organizational aspects but also open to resources that focus more on the architectural approaches. Thanks in advance!
4 likes • 26d
There is an open-source project on data contracts that covers not only metadata specifications but also transformations and examples for a testable interface: https://datacontract.com/
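To make that concrete, a minimal contract in the style of the Data Contract Specification from that site could look roughly like this (the orders model, its fields, and the owner are made-up examples, not from the project):

```yaml
# Minimal data contract sketch in the style of datacontract.com
# (the "orders" model and all values below are illustrative)
dataContractSpecification: 1.1.0
id: urn:datacontract:sales:orders
info:
  title: Orders
  version: 1.0.0
  owner: sales-domain-team
models:
  orders:
    type: table
    description: One row per confirmed order.
    fields:
      order_id:
        type: string
        required: true
        unique: true
      order_ts:
        type: timestamp
        required: true
      total_amount:
        type: decimal
```

The same file can also carry quality checks and service levels, which is what makes the interface testable rather than just documented.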
How to connect multiple dbt core projects
dbt Cloud has "dbt mesh" to connect your dbt projects. In the core version there is no native dbt functionality for that, but there is a Python package called "dbt-loom", developed by Nicholas Yager, that connects your dbt projects. We have had it in place for almost a year and it works really well. This functionality is crucial when implementing a Data Mesh paradigm. Have a nice weekend! https://github.com/nicholasyager/dbt-loom
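For anyone trying this out: dbt-loom is configured through a small YAML file in the downstream project that points at the manifests of the upstream projects. A sketch, with placeholder names and paths (check the dbt-loom README for the exact options):

```yaml
# dbt_loom.config.yml in the downstream dbt project
# (project name and path are placeholders)
manifests:
  - name: upstream_project            # should match the upstream dbt project name
    type: file                        # remote backends are also supported
    config:
      path: ../upstream_project/target/manifest.json
```

With the upstream manifest injected, its models can be referenced with dbt's two-argument ref, e.g. {{ ref('upstream_project', 'some_model') }}, much like dbt Cloud's cross-project references.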
0 likes • Oct 19
I kind of get the criticism that data mesh levels at the data warehouse: that there are just too many undocumented interfaces within a data warehouse. The data mesh people come from domain-driven design and microservices, and in that world undocumented interfaces don't make any sense. But if you have a monolith like SAP, you need to have all those interfaces. So I like this connection. Thank you. Now comes my question: how do you document the new interfaces? Does dbt-loom offer any support for this?
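Not an answer from the thread, but one option in plain dbt core is to document the cross-project interface on the upstream side with model access levels and enforced contracts in the model YAML; whether dbt-loom evaluates this metadata is worth checking in its README. A sketch with made-up model and column names:

```yaml
# schema.yml in the upstream project (illustrative names)
models:
  - name: orders
    description: "Public interface of the sales domain"
    access: public              # marks the model as an intentional interface
    config:
      contract:
        enforced: true          # dbt checks column names and data types at build time
    columns:
      - name: order_id
        data_type: varchar
        description: "Business key of the order"
      - name: total_amount
        data_type: decimal(18,2)
        description: "Order total"
```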
Data Vault Modeling: Where are you doing it?
At some point in every Data Vault project, the source data is analyzed and a Data Vault 2.0 model is drafted. I have seen teams draft their Data Vault model in a lot of different places: Excel sheets, drawing tools like Draw.io or Miro.com, dedicated modeling tools, directly inside their automation tools, or with pen and paper. In my experience, the three most important things to consider when deciding where to model are:
- Must fit into the tool stack: if your automation tool offers Data Vault modeling, do it in there
- Ease of use: simplicity is important to streamline initial modeling, changes, and additions
- Persistent and centralized: all components of the whole model should be stored in a central place, to help other modelers see what has already been done
Where are you currently designing your Data Vault model? What are your experiences? Let me know!
2 likes • Aug 15
@Tim Kirschke that is more complicated. I usually model a business object model, like the one in Mermaid, and then map the sources to it. Very much like John Giles suggested: talk to the business, look at the sources. I gave a presentation with a more detailed view on this kind of work (early vs. late integration), detailing what John did and standardizing it further. You could also have seen that kind of generation in the Innovator solution we built ages ago. The rest is more or less generated...
1 like • Aug 28
@Jaroslaw Syrokosz ChatGPT can do it, too. I once gave it the whole description of Willibald and asked for a data model in Mermaid. It worked, although it missed a lot of the model...
Architecture: Local optima versus maintainability
Hello, I am new to this forum. I love data modelling, Data Vault, and architecture, so I thought I'd make my introduction with some thoughts on architecture (just because someone compared data modelling to pineapple on pizza: not for everyone. And yes, I disagree).

A data warehouse has an architecture and, ideally, exactly one location for each activity. The activity is carried out for all sources at this point. This reduces the maintenance effort, because you don't have to know the data intimately or remember all the details; it is sufficient to visualise the activity in order to determine where the necessary corrections need to be made.

Data Vault also comes with an architecture. A good architecture includes configuration options for the interfaces, the Business Vault, the Data Mart and, if you are particularly adventurous, special Data Vault patterns. And no matter how good the architecture is, there are often 'local optima' during development: if we only deviate from the warehouse guidelines for this one small activity, we save ourselves a lot of work. And the exception for this source is small. Practically non-existent. Everyone can remember it. Two years later there are exceptions for every data source, and you have to know the data in detail again to be able to make changes successfully. The maintenance effort increases. After five years at the latest, only a few people are still able to make changes, and everyone is talking about a redesign. All the small gains in speed are forgotten.

This does not mean that the architecture must not change. In fact, it should change. The only question is: will everyone benefit, or just this one data source? If everyone benefits, the change should be made for everyone; only then is it a universal change. If you only change things for specific situations, you end up with many individual customised solutions, special solutions that need to be known before the existing implementation can be adapted.

Maintaining an architecture requires a lot of effort. It is the only way to ensure that processes and the allocation of tasks to individual areas stay clear. This effort is worth it, as it brings disproportionately high gains in maintainability and in the onboarding of new or temporary employees, and it enables quick or temporary switches between teams in the DWH.
2 likes • Aug 14
The less a colleague has to think before they can start changing things, the lower the maintenance cost. In our line of work you always have to think a lot before taking action, but that should be because of the content, not because of the environment. And, again: change things if they are not good, but make the change work for everyone. That will help productivity as well.
1 like • Aug 21
@Volker Nürnberg that is right. However, in the case of business rules, automation is still lacking. And especially around the differentiation of hard vs. soft rules, a.k.a. early vs. late integration, there is a lot that can go wrong...
Comparing 3 Types of Data Modelling
What are the pros and cons in your experience? 📊 **Normalized Data Modeling** ⭐ **Star Schema** 🔗 **Data Vault** Michael Kahn breaks down the differences between these 3 approaches. His channel has racked up over 4M views, and he is in the Top 5 online educators for data pipeline engineering for small teams to deliver better analytics, faster. 👉 [Watch and Comment Now!] #DataModeling #DataEngineering #BigData #Analytics #TechTalk
5 likes • Aug 6
The versus part is strange, but it seems this video is one of a longer series in which he also compares the architectures behind the approaches, and then it's 3NF warehouse vs. star schema vs. Data Vault. This video is very high-level. Funny thing: when I build a data warehouse I use all 3 modelling techniques. A 3NF model of the core warehouse for the general picture and for talking to the business, Data Vault for the core warehouse itself, and a star schema for the data mart...
Michael Müller
@michael-muller-2499
Data Vault Professional with both certificates, Board of Directors Deutschsprachige Data Vault User Group (DDVUG e.V.)

Joined Jul 9, 2024