DuckDB is amazing for any sort of fast data analysis when the data is small enough to fit on your laptop.
Recently at work I've been using it to analyse the Claude Code sessions of every engineer at our company (which we upload to S3), and it's been extremely helpful for finding gaps in devex and for getting clear metrics to back up the impact of fixing them.
It's also been really useful for getting metrics on Claude skills usage and then diving into use cases by looking at the transcripts.
Other engineers who had never touched DuckDB were impressed by how easy it is for AI agents to write queries on our dataset.
Like SQLite, DuckDB is underappreciated as a production database. You can totally run it on servers or even "serverless" and do some heavy data transformations or, with the right server size, work with large-scale datasets (up to a TB compressed seems fine).
Agreed. In addition, DuckDB works quite well for data that is too big to fit in memory or on the machine DuckDB runs on (predicate pushdown, out-of-core processing, …).
>Recently at work I've been using it to analyse the Claude code sessions of every engineer at our company (that we upload to S3) and it's been extremely helpful to help us find gaps in devex and have clear metrics to back up the impact of fixing them
Nice! How do you set things up so that your engineers' Claude Code sessions upload to S3? Thanks in advance for the help.
Can you please expand on the Claude analysis part? What exactly did you analyse, and what outcome did it help with?
>> DuckDB is amazing for any sort of fast data analysis when the data is small enough that it can fit on your laptop
I agree, and the dirty (not so) secret that big data providers like Snowflake try to hide is that the majority of your work is not big data and WILL fit on your local machine. My last company was spending $2M/yr on a contract with Snowflake, and another million between Fivetran and Matillion. Of the 1200 clients using analytics, maybe 2 had enough data to warrant "infinite scalability", and a dozen wanted Snowflake because they already had corporate warehouses in Snowflake (they probably didn't need it either). It turns out the Extract and Load could be handled by bog-standard C# code and a bunch of SQL, while almost everyone was better off with a DuckDB database running locally, often in the browser. You've probably heard of YAGNI (You Ain't Gonna Need It), and it applies even more to "Big Data". #SmallDataConvert