Thanks for taking the time to test the shiny new tool. We need people like you who are skeptical of enterprise blog posts and test things yourself on your own platform. I'm sure they're happy with the feedback, and more people are now aware of the new integration.
Awesome article, thoroughly enjoyed it. Thank you
I'd be interested in a comparison without any index. You don't always have the right index(es) at hand, and indexes come with some overhead of their own. There could be huge value in supporting analytical workloads without needing any index.
Thanks for trying it! As we said, indexes aren't supported, so to keep the full-scan comparison fair we didn't create an index. In practice that's not entirely realistic: you would have indexes, but they may not support your analytical queries, and they were usually created for transactional workloads. We could have been clearer about this in the blog announcement; that's good feedback.
FYI - We've updated both the blog and YouTube video to better explain why we didn't use indexes. ☝️✅
I really had a good first impression of DuckDB, but later I compared the two using the IMDB dataset with indexes (min-max for DuckDB, B-tree for Postgres). My queries only involved the "=" operator, and I used the default settings for both databases. Not only did Postgres finish the queries faster, it also used much less memory than DuckDB, even though I used the optimized DuckDB storage format they advertise. So yeah, I will definitely use Postgres over DuckDB.
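One plausible reason for the result above: min-max metadata can only skip a block of rows when the target value falls outside that block's [min, max] range, so on unordered data an equality predicate prunes almost nothing, while a B-tree goes straight to the matching rows. A minimal Python sketch of that effect (illustrative only, not DuckDB's or Postgres's actual implementation; the zone size, row count, and data distribution are made-up assumptions):

```python
import bisect
import random

random.seed(0)
N = 1_000_000
ZONE = 2048  # rows per min-max zone (illustrative granularity)

# Unordered data: values are randomly distributed across the table.
values = [random.randrange(N) for _ in range(N)]

# Build a min-max "zone map": one (min, max) pair per block of ZONE rows.
zones = [(min(values[i:i + ZONE]), max(values[i:i + ZONE]))
         for i in range(0, N, ZONE)]

target = values[123_456]  # an equality predicate: WHERE col = target

# Zone-map pruning: we must scan every block whose [min, max] range
# could contain the target. With random data, that is nearly all of them.
scanned = sum(ZONE for lo, hi in zones if lo <= target <= hi)

# B-tree-like lookup: binary search over sorted keys, O(log N) probes.
sorted_vals = sorted(values)
idx = bisect.bisect_left(sorted_vals, target)

print(f"zone-map scan touched ~{scanned} of {N} rows")
print(f"sorted lookup found the target in ~{N.bit_length()} probes")
```

On data like this, the zone map skips almost no blocks, so the equality query degenerates into a near-full scan, while the sorted lookup stays logarithmic. Min-max metadata shines when data is clustered or sorted on the filtered column, which point lookups on random keys are not.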
Super interesting. So as I understand it, without columnar storage DuckDB has to do a full table scan reading all columns on each query, which makes it slow compared to the native Postgres engine, which can take advantage of indexes.
I wonder why 100M rows is so much slower than 50M rows for both engines.