Happy Monday everyone! For those of you exploring metadata-driven architectures (which I think is quite a lot of you!)... here are some ideas for you:
As a quick recap: the metadata-driven data pipeline is a technique commonly used in data engineering. Rather than explicitly declaring the source and destination for a Copy Data activity (for example), we design our pipelines so that the Source and Destination can be passed in dynamically. This means we can store the details of the Source/Destination connections in another location, which is read at execution time. This brings a lot of benefits: scalability, maintainability, and many more.
However, the point of this post is to start a discussion about how and where you can store such metadata. The two most common places you see metadata stored (in a Microsoft environment) are:
- In structured tables (like the Data Warehouse)
- In a JSON File (perhaps in your Lakehouse Files area).
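For concreteness, a metadata file for a single copy operation might look something like this (the field names and connection IDs here are purely illustrative, not a fixed schema):

```json
{
  "source": {
    "type": "SqlServer",
    "connectionId": "src-sql-conn",
    "schema": "dbo",
    "table": "Customers"
  },
  "destination": {
    "type": "LakehouseTable",
    "connectionId": "lh-conn",
    "table": "customers_raw"
  }
}
```

The same structure works equally well as a row in a warehouse config table, with one column per field.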
However, I'd like to throw in a third option for discussion: storing your metadata in a Notebook (and passing it into your pipeline using mssparkutils.notebook.exit()).
Pros of this approach:
- Makes your configuration trackable in version control (which is not possible with the previous two methods)
Cons:
- Maybe more difficult to read, if you have quite a few Key/Value pairs
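To sketch the Notebook option: the config lives as a plain dict in the notebook, gets serialised to JSON (notebook.exit() can only return a string), and is handed back to the pipeline. All names below are illustrative assumptions, not a fixed schema:

```python
import json

# Connection metadata for one copy operation, defined directly in the
# notebook cell -- this is the part that version control now tracks.
pipeline_config = {
    "source": {
        "type": "SqlServer",
        "connectionId": "src-sql-conn",
        "schema": "dbo",
        "table": "Customers",
    },
    "destination": {
        "type": "LakehouseTable",
        "connectionId": "lh-conn",
        "table": "customers_raw",
    },
}

# Serialise to a JSON string, since notebook.exit() returns a string
config_json = json.dumps(pipeline_config)

# In a Fabric/Synapse notebook you would then hand this back to the
# calling pipeline (commented out here so the sketch runs anywhere):
# mssparkutils.notebook.exit(config_json)
```

On the pipeline side, the Notebook activity's exit value can then be parsed and fed into the Copy Data activity's dynamic Source/Destination parameters.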
Thoughts? Where are you storing your metadata at the moment?