Lance file format. Parquet killer?
time will tell
Who doesn’t love to play with something new? At least new to us. The Lance file format has been around since … well, I’m not sure, but the PRs in the GitHub repo start around 2022. Young in digital years. Just a little fella.
There isn’t really anything to do but jump in the deep end when it comes to file formats. It’s hard to imagine anything unseating Parquet, the proverbial Atlas carrying Iceberg and Delta Lake on its shoulders. But Parquet started from nothing too.
I’m not going to waste my time with nonsensical deeply technical questions about how this Lance file format works. No one cares.
What do we care about?
What big-name Data Engineering frameworks support Lance?
How well does it work with S3?
Is the performance at least on par with Parquet?
So today I hope we can leave with a little bit of knowledge about the Lance file format, its use cases, and whether it’s worth your trouble to play around with it.




