data_ders · 2025-05-28T16:39:37 1748450377

dbt Labs employee here -- happy to answer any questions!

data_ders · 2025-05-27T16:16:49 1748362609

the manifesto [1] is the most interesting thing. I agree that DuckDB has the largest potential to disrupt the current order with Iceberg.

However, this mostly reads to me as a thought experiment:

> what if the backend service of an Iceberg catalog was just a SQL database?

The manifesto says that maintaining a data lake catalog is easier, which I agree with in theory. s3-files-as-information-schema presents real challenges!

But, what I most want to know is what's the end-user benefit?

What does someone get with this if they're already using Apache Polaris or Lakekeeper as their Iceberg REST catalog?

[1]: https://ducklake.select/manifesto/

peterboncz · 2025-05-27T18:53:08 1748371988

https://x.com/peterabcz/status/1927402100922683628

it adds for users the following features to a data lake:
- multi-statement & multi-table transactions
- SQL views
- delta queries
- encryption
- low latency: no S3 metadata & inlining: store small inserts in-catalog

and more!

tishj · 2025-05-27T19:41:27 1748374887

One thing to add to this: Snapshots can be retained (though rewritten) even through compaction

In a format like Iceberg, compaction deletes the buildup of many small add/delete files, and as a consequence you lose the ability to time travel to those earlier states.

With DuckLake's ability to refer to parts of parquet files, we can preserve the ability to time travel, even after deleting the old parquet files

ryanworl · 2025-05-27T22:22:03 1748384523

Does this trick preclude the ability to sort your data within a partition? You wouldn’t be able to rely on the row IDs being sequential anymore to be able to just refer to a prefix of them within a newly created file.

anentropic · 2025-05-27T16:31:05 1748363465

they say it's faster for one thing - can resolve all metadata in a single query instead of multiple HTTP requests

data_ders · 2025-05-12T13:31:12 1747056672

I hope he gets the help and relief he needs. In some ways a brain with a penchant for software causes more trouble during mental health crises than one without. Especially if they’re into crypto.

pc86 · 2025-05-12T13:54:30 1747058070

Software engineers are just normal people. Our brains are not "special." This (objectively incorrect) mental model of software devs is very harmful if you want to actually understand things going on around you.

MyOutfitIsVague · 2025-05-12T14:34:43 1747060483

People are attracted to things based on their personality traits. The average software engineer is not the same as the average person, because the average person is not attracted to software engineering.

In my observations, the average software engineer is more likely to be persnickety, caught up in small details, and obsessive than the average person. We're more likely to split hairs than most people would care to, and proper labeling and categorization is much more important to us than to the average person.

It's not that software engineers are somehow magical or special, it's that there is a selection bias to become a software engineer. It's extra not special, because the same thing happens for virtually any specialized field. There are famous stereotypes about the personalities of psychologists, for instance.

pavel_lishin · 2025-05-12T15:08:27 1747062507

> In my observations, the average software engineer is more likely to be persnickety, caught up in small details, and obsessive than the average person. We're more likely to split hairs than most people would care to, and proper labeling and categorization is much more important to us than to the average person.

It sounds like a fancy way of saying that people on the spectrum are more likely to become software engineers than some other arbitrary person.

MyOutfitIsVague · 2025-05-12T18:09:39 1747073379

It had occurred to me, but I didn't want to say it that way because I'm not a psychological professional and it seems out of my area of expertise to make that assumption based on my passive observations.

xvector · 2025-05-12T14:13:30 1747059210

Nah. I assume you have friends inside and outside of software?

Software people are just different from the normal person. Way different. I couldn't put my two friend groups together.

Are they "special?" Maybe, maybe not.

pavel_lishin · 2025-05-12T14:32:49 1747060369

I have software friends and non-software friends. There is no particular correlation. This is some weird flavor of biological essentialism going on here.

watwut · 2025-05-12T14:29:01 1747060141

No, we are not different than normal people.

data_ders · 2025-04-25T19:45:59 1745610359

Hell yeah Pontiac Vibe! My 2008 is at 308k! I’ll drive it into the ground

stantaylor · 2025-04-25T19:52:42 1745610762

My 30-year-old daughter is still driving the Toyota version, the Matrix, also 2008, that we bought in about 2013. She loves the thing. If she didn't have it, I'm sure I would still be driving it.

I find it hilarious that it's a limited-edition M Theory model. It has a badge glued to the dash that says "1926 of 5000." For a Toyota econobox.

thederf · 2025-04-25T20:56:57 1745614617

Niice, giving me hope! My '06 is showing its age, but I hope it's got another 100k in her!

data_ders · 2025-04-10T17:31:05 1744306265

> how so many organizations end up investing humongous amounts of effort rolling their own databases from scratch because none of the off-the-shelf solutions meet all their requirements. But in most of these cases, it's because some of the "requirements" were actually "nice-to-haves" and they could have gotten by fine with an off-the-shelf database, but they talked themselves into building one from scratch.

I love the term "arbitrary uniqueness" for this too. Like how different are your needs, really?

data_ders · 2025-02-19T21:22:22 1740000142

Thought an update on where we're at post-acquisition of SDF would be interesting to this group!

data_ders · 2025-02-19T02:22:22 1739931742

conda user for 10 years and uv skeptic for 18 months.

I get it! I loved my long-lived curated conda envs.

I finally tried uv to manage an environment and it’s got me hooked. That a project's dependencies can be so declarative and separated from the venv really sings for me! No more meticulous tracking of an env.yml or requirements.txt, just `uv add` and `uv sync` and that’s it! I just don’t think about it anymore

synparb · 2025-02-19T03:03:12 1739934192

I'm also a longtime conda user and have recently switched to pixi (https://pixi.sh/), which gives a very similar experience for conda packages (and uses uv under the hood if you want to mix dependencies from PyPI). It's been great and also has a `pixi global`, similar to `pipx` etc., that makes it easy to grab general tools like ripgrep, ruff etc. and make them widely available, but still managed.

data_ders · 2025-02-19T03:08:42 1739934522

whoa! TIL thanks will check it out

data_ders · 2025-02-16T19:11:10 1739733070

Dope project!

FYI, your LLC page has “queries” spelled “qurries”. Not sure if it’s intentional or not though.

vonadz · 2025-02-16T19:30:36 1739734236

Thanks for catching that! Will get it fixed.

data_ders · 2025-02-14T04:24:32 1739507072

Great story and interesting product!

Reminds me of NextMv [1]; loved their episode on SWE daily. Can anyone compare them to this and how they’re doing?

[1]: https://www.nextmv.io/

data_ders · 2025-02-10T00:35:23 1739147723

Heavy recommend the biography of Stewart Brand that’s quoted throughout!