2 Comments
Vincent gautier

Thank you for this effort, Daniel! I cloned the project, added endpoint_url to the S3 client, and was able to run some sanity checks on Iceberg tables on MinIO and IBM COS (S3-compatible) object storage. This will be very helpful in my future lakehouse adventures.

Neural Foundry

This is brilliant - finally someone is addressing the 'dark matter' of data lakes! Everyone obsesses over query engines but ignores the foundational health issues that kill performance. The health score approach (0.0-1.0) is genius because it gives executives a simple metric while giving engineers actionable recommendations. The integration plans with Datadog and CloudWatch make perfect sense - observability shouldn't be siloed between infra and data. Question: how does Drainage handle the eventual consistency of S3? When you're scanning thousands of objects, do you ever run into issues where manifest files haven't propagated yet? The Rust choice was smart - I've seen similar tools in Python that choke on large tables. Really excited to try this on our Iceberg tables!
