Your OOM example is running in in-memory mode, so you're simultaneously cutting yourself off from using disk and constraining your memory limit. From the docs: "If DuckDB is running in in-memory mode, it cannot use disk to offload data if it does not fit into main memory." (https://duckdb.org/docs/guides/performance/how_to_tune_workloads#spilling-to-disk)
The Delta Lake example is uninformed. Delta Lake support comes through a DuckDB extension that clearly states its current limitations (no write support, for example). There's zero promise or expectation that the Python deltalake library will now accept an object from the duckdb Python library. The actual details can be found here (https://duckdb.org/2024/06/10/delta.html). You'll see that the example in this article and what was actually released are drastically different things.
Thanks for this. I was hoping to find a comment like this.
According to those docs, it appears that setting the temp directory should allow spilling to disk even when DuckDB is not connected to a persistent database file, although the wording is somewhat ambiguous. In any case, it should be simple enough for OP to repeat the test with a persistent database file, if he hasn't given up on DuckDB entirely at this point...