Memberships

Data Innovators Exchange

Public • 322 • Free

4 contributions to Data Innovators Exchange
A novices guide to Data Vault
Thanks to @Dan Linstedt for this great overview video. It was the inspiration for this new page on perplexity.ai, which provides a deep dive into Data Vault. Check it out here: https://www.perplexity.ai/page/the-power-of-data-vault-d4CxnNfmRRmAJOwfGxm1Fw I'd love to hear your comments on how well, or otherwise, LLMs are able to collate and interpret presentations like this one. FYI, this used the new Llama 405 billion parameter model to do its thing. Thanks to Wherescape for sponsoring this post. @Melissa Zuro
9
6
New comment Aug 23
A novices guide to Data Vault
2 likes • Aug 15
Thanks, @Dan Linstedt, great intro.
"Jedi" test?
Hello everyone. I have one question that has accompanied my whole DV journey. Maybe I have misunderstood, but I have been under the belief that the test in which you must be able to reconstruct every data extract from the RDV layer is called the Jedi test. Is that correct? And why is the test called that?
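For readers unfamiliar with the requirement described in the question, here is a minimal sketch of what "reconstructing every data extract" from a raw vault could look like. It is an illustration only, not a reference implementation: the customer rows, the in-memory `hub` and `satellite` structures, and the function names are all hypothetical, and deletions from the source are deliberately not handled.

```python
from datetime import date

# Hypothetical source extracts, keyed by load date.
# Each row is (customer_id, name, city); all values are made up.
extracts = {
    date(2024, 8, 1): [("C1", "Ann", "Perth"), ("C2", "Bob", "Sydney")],
    date(2024, 8, 2): [("C1", "Ann", "Darwin"), ("C2", "Bob", "Sydney")],
}

hub = {}        # business key -> load date on which the key was first seen
satellite = []  # (business key, load date, name, city), one row per change


def load(load_date, rows):
    """Load one extract: register new keys, store attributes only when they change."""
    for key, name, city in rows:
        hub.setdefault(key, load_date)
        history = [r for r in satellite if r[0] == key]
        latest = max(history, key=lambda r: r[1], default=None)
        if latest is None or latest[2:] != (name, city):
            satellite.append((key, load_date, name, city))


def reconstruct(as_of):
    """Rebuild the extract as it looked on the given load date."""
    rows = []
    for key, first_seen in hub.items():
        if first_seen > as_of:
            continue  # key did not exist yet on that date
        history = [r for r in satellite if r[0] == key and r[1] <= as_of]
        _, _, name, city = max(history, key=lambda r: r[1])
        rows.append((key, name, city))
    return sorted(rows)


for load_date in sorted(extracts):
    load(load_date, extracts[load_date])

# The check itself: every original extract can be rebuilt from hub + satellite.
for load_date, rows in extracts.items():
    assert reconstruct(load_date) == sorted(rows)
```

The satellite stores only changed rows, yet each day's extract can still be rebuilt by taking the latest satellite row at or before the requested load date.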
7
8
New comment Aug 17
9 likes • Aug 6
Funny your question is, unfamiliar to me is that name.
Comparing 3 Types of Data Modelling
What are the pros and cons in your experience? 📊 **Normalized Data Modeling** ⭐ **Star Schema** 🔗 **Data Vault** Michael Kahn breaks down the differences between these 3 approaches. His channel has racked up over 4m views and he is in the Top 5 online educators for data pipeline engineering for small teams to deliver better analytics, faster. 👉 [Watch and Comment Now!] #DataModeling #DataEngineering #BigData #Analytics #TechTalk
7
7
New comment Aug 9
Comparing 3 Types of Data Modelling
7 likes • Aug 6
1. Normalisation is designed to avoid *update* anomalies, so its raison d'être is OLTP systems. It is therefore not well suited to 'downstream data consumption' like warehousing and analytics.
2. Star Schema is designed for high-performance analytics within a defined scope. It's the job of the ETL processes to manage update anomalies. It's an analytics- and BI-focussed modelling language.
3. Data Vault is neither of these but enables both. It is the hub in the middle, ever watching what is going on in the source system and acting as the audit record, essentially capturing a portion of the corporate memory that may be overwritten by source systems. Design your DV model with star schema entities and, hey presto, you can rebuild your star schema repo at will, or even virtualise it, hardware permitting.

So they're designed for different purposes and should not be considered alternatives:
1. Normalised models for your OLTP apps
2. DV for your hub
3. Star schema for analytics and BI

with fit-for-purpose data flows from 1 to 2 to 3.
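The "rebuild your star schema at will" point can be sketched in a few lines. This is a toy illustration under stated assumptions, not a production pattern: the table names (`hub_customer`, `sat_customer`, `link_order`, `sat_order`), the sample values, and the surrogate-key scheme are all hypothetical.

```python
# Hypothetical raw vault contents; every name and value here is illustrative.
hub_customer = ["C1", "C2"]                              # business keys
sat_customer = [                                         # (key, load_date, city)
    ("C1", "2024-08-01", "Perth"),
    ("C1", "2024-08-02", "Darwin"),
    ("C2", "2024-08-01", "Sydney"),
]
link_order = [("O1", "C1"), ("O2", "C1"), ("O3", "C2")]  # (order key, customer key)
sat_order = {"O1": 10.0, "O2": 25.0, "O3": 7.5}          # order key -> amount


def build_dim_customer():
    """Derive a current-state customer dimension from hub + satellite."""
    dim = {}
    for surrogate, key in enumerate(sorted(hub_customer), start=1):
        history = [r for r in sat_customer if r[0] == key]
        latest = max(history, key=lambda r: r[1])  # ISO dates sort correctly as strings
        dim[key] = {"customer_sk": surrogate, "customer_id": key, "city": latest[2]}
    return dim


def build_fact_orders(dim):
    """Derive an order fact table that references the dimension's surrogate keys."""
    return [
        {
            "order_id": order,
            "customer_sk": dim[customer]["customer_sk"],
            "amount": sat_order[order],
        }
        for order, customer in link_order
    ]


dim_customer = build_dim_customer()
fact_orders = build_fact_orders(dim_customer)

# Both tables are derived purely from the vault, so dropping and rebuilding
# them gives the same result.
assert build_dim_customer() == dim_customer
assert build_fact_orders(dim_customer) == fact_orders
```

Because the star schema is computed from the vault rather than maintained by hand, the same functions could in principle back a view instead of a physical table, which is the "virtualise" option mentioned above.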
Data Mesh vs Data Vault?
Do you agree or disagree with Perplexity.ai's analysis?

1. Ownership and Structure
Data Mesh is fundamentally about decentralising data ownership and treating data as a product. In this approach, data is managed by cross-functional, domain-specific teams, each responsible for their own data products. This decentralization aims to eliminate bottlenecks and empower teams to innovate independently. On the other hand, Data Vault is a centralised data modelling methodology designed for building scalable and flexible data warehouses. It organizes data into three core components: Hubs, Links, and Satellites, which help maintain historical tracking and ensure data integrity.

2. Use Cases and Flexibility
Data Mesh is best suited for large-scale, domain-diverse organisations that need to democratise data ownership and processing. It promotes agility and adaptability by allowing each domain to manage its data independently, adhering to company-wide standards. Conversely, Data Vault excels in environments with complex, evolving data landscapes. Its modular design allows for easy integration of new systems and changes, making it ideal for companies that require a robust, adaptable data warehousing solution.

3. Governance and Complexity
Governance in Data Mesh is federated, meaning that while each domain has autonomy, there are overarching standards to ensure consistency and compliance. This approach can introduce organizational and cultural complexity due to the shift towards decentralization. In contrast, Data Vault focuses on technical governance through its standardized modeling techniques, ensuring data lineage, auditability, and resilience. While Data Vault can be technically complex due to its specific modeling requirements, it provides a structured approach to managing data integration and historical tracking.

Agree or disagree? Let us know your point of view.

Citations:
[1] https://atlan.com/data-mesh-vs-data-vault/
Poll
1 member has voted
6
18
New comment Aug 17
Data Mesh vs Data Vault?
3 likes • Aug 2
@Samuel Williams Firstly, don't accept what an AI throws up to you. It is a very poor teacher and it can't (yet) be trusted. You need prior knowledge of the topics to validate what it's saying. I sometimes use AI to help me with first drafts of what I'm writing, but you need to know when it's wrong.

Secondly, in this case, learn the fundamentals of both ideas and then reconcile them:
- Data Vault: There are some excellent (and very detailed) books on the topic. Start with Dan Linstedt and Michael Olschimke, and Patrick Cuba; there are others, such as Kent Graziano and John Giles. Although you can't (unfortunately) get all of the knowledge from books, they are a great place to start, and cheap. The certification process does impart further useful knowledge, but that costs $$. This will explain how to combine the DV approach within a Data Mesh.
- Mesh: Read the original book that started it all: Dehghani, Zhamak. "Data Mesh".

With this knowledge you will be able to critique various opinions such as those cited by your AI.

Unfortunately, your prompt contains a logical fallacy: the two are not alternatives to one another. Your question is like asking for the differences between your car and the destination. A mesh describes a destination; DV describes how you might arrive at your destination. A Mesh is one of many destinations. A DV can get you to a Lake, a Lakehouse, and a Warehouse. It can also get you to a Mesh, which could be a collection of all the above styles of data platform, each of which may be built using DV or not.
2 likes • Aug 6
@Samuel Williams It's an interesting AI experiment. Given the range of differing, sometimes incompatible and competing, opinions, how does it respond? Does it reflect the most popular (dominant, loudest, or prevalent) opinion? I doubt it would reason over the alternatives and provide a new opinion that is the best of all of them, with compelling arguments. It is only an LLM after all.
Russell Searle
3
41 points to level up
@russell-searle-7000
information architect

Active 63d ago
Joined Jul 9, 2024