The final screenshot shows reading and inspecting the table in code. Is that also via Daft? What have you found to be the best way to manage querying and rolling back versions of Delta tables?
Yes, via Daft. I manage multiple Delta Lake tables in the 300TB+ range, and I've never had to roll back versions; I go hardcore up front on data quality and reliable pipelines to avoid ever having to do such things. Beyond that, use MERGE statements to create idempotent pipelines.
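For reference, a minimal sketch of such an idempotent upsert using the deltalake (delta-rs) Python package; the table path and column names are hypothetical, and the exact merge builder API may vary by library version:

```python
import pyarrow as pa
from deltalake import DeltaTable

# Hypothetical table location; a real S3 path would also need credentials.
dt = DeltaTable("s3://my-bucket/events")

# New batch of records. Re-running this block with the same batch leaves
# the table unchanged, which is what makes the pipeline idempotent.
batch = pa.table({"id": [1, 2], "value": ["a", "b"]})

(
    dt.merge(
        source=batch,
        predicate="target.id = source.id",
        source_alias="source",
        target_alias="target",
    )
    .when_matched_update_all()      # existing ids: overwrite with new values
    .when_not_matched_insert_all()  # new ids: insert as fresh rows
    .execute()
)
```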
I have limited experience with Delta Lake tables, and certainly not at that 300TB scale, but compared to raw storage, isn't one of the advertised benefits the ability to keep history? What would be the use of history if not to query or restore an earlier version?
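For anyone curious, history does enable exactly that. A hedged sketch with the deltalake (delta-rs) package, since I'm not sure Daft itself exposes version pinning; the table path and version numbers are hypothetical:

```python
from deltalake import DeltaTable

# Time travel: pin a read to an earlier version of the table.
dt = DeltaTable("s3://my-bucket/events", version=5)
old_snapshot = dt.to_pyarrow_table()

# Inspect the commit history behind that time travel.
dt = DeltaTable("s3://my-bucket/events")
for commit in dt.history(limit=3):
    print(commit.get("version"), commit.get("operation"))

# Restore the live table to the earlier version if ever needed.
dt.restore(5)
```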
Great read, but you don't need to (and absolutely should not) embed AWS credentials into your container at build time. If you need credentials at that stage, use Docker build secrets instead.
The AWS SDKs will automatically pick up credentials inside your container provided you attach an IAM role (e.g. an EC2 instance profile or ECS task role). If running locally, you can inject credentials at run time via a .env file.
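A minimal sketch of that runtime-credentials approach; the bucket name is hypothetical. boto3 resolves credentials through its default provider chain: environment variables first, then shared config, then the attached IAM role.

```python
import boto3
from dotenv import load_dotenv  # pip install python-dotenv

# Local development only: populate AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY
# from a .env file that is never baked into the image.
load_dotenv(".env")

# No keys in code or in the image; on AWS the attached IAM role supplies them.
s3 = boto3.client("s3")
print(s3.list_objects_v2(Bucket="my-bucket", MaxKeys=1))
```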
Good stuff…nice and clean and “real time baby!”
Awesome article! It helped me better understand how to work with an image!