7 Comments
Oct 24 · Liked by Daniel Beach

As an old-school data person, I'm a big fan of Kimball, mainly for having had amazing success delivering data on a repeatable, Lego-like framework.

That said, way before data lakes and clouds, most MPP data warehouse databases did not support primary keys and foreign keys, for performance reasons. They allowed you to define them in the DDL for documentation purposes, and some BI / ETL tools even used that metadata to assist the tools in coding. Kimball even included de-duplication checking within the recommended subsystems for an EDW framework. The more things change, the more things stay the same. ;)

I miss real, real databases. And I do love Kimball (and Ross, and Adamson, and Reeves).

"move on with life"

But what if I can't?! Hashes are far too clever for me at the moment. It's good to know data lakes have things covered, though.

All I read is: they do not support primary keys, but you need them, so let's solve it with homemade code instead… brilliant, right? …Ehhh, no.

<Data Vault has entered the chat>

But it doesn't explain how this data lake solves this issue?

author

The question is then ... was there an issue to solve in the first place? Data Lakes don't solve this problem, at least not yet. The user has to manually solve it.
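
For readers wondering what "manually solve it" might look like, here is a minimal sketch, not the post's actual method. It assumes rows are plain dictionaries and that the business key is some chosen set of columns. The names `hash_key`, `find_duplicate_keys`, and the sample columns are hypothetical, made up for illustration.

```python
import hashlib
from collections import Counter


def hash_key(row, key_columns):
    """Build a deterministic surrogate key by hashing the chosen columns."""
    # Join the column values with a separator so ("ab", "c") != ("a", "bc").
    joined = "||".join(str(row[col]) for col in key_columns)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def find_duplicate_keys(rows, key_columns):
    """Return {hashed_key: count} for every key that appears more than once."""
    counts = Counter(hash_key(row, key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical sample data: the first and third rows share a business key.
rows = [
    {"customer_id": 1, "order_date": "2020-01-01", "amount": 10.0},
    {"customer_id": 2, "order_date": "2020-01-01", "amount": 25.0},
    {"customer_id": 1, "order_date": "2020-01-01", "amount": 10.0},
]

duplicates = find_duplicate_keys(rows, ["customer_id", "order_date"])
if duplicates:
    print(f"Found {len(duplicates)} duplicated key(s); fix before loading.")
```

Nothing here is enforced by the storage layer. The check only helps if the pipeline runs it before every write, which is exactly the "homemade code" trade-off discussed above.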