Please be careful. I'd love to adopt Dagger, but the UI, in comparison to GHA, is just not a value add. I'd hate for y'all to go the AI route that Arc did... and lose all your users. There is A LOT to CICD, which can be profitable. I think there are still a lot more features needed before it's compelling, and I would worry Agentic AI will lead you to a hyper-configurable, muddled message.
Thank you. Yes, I worry about muddling the message. We are looking for a way to communicate more clearly on the fundamentals, then layer use cases on top. It is the curse of all general-purpose platforms (we had the same problem with Docker).
The risk of muddling is limited to the marketing, though. It's the exact same product powering both use cases; we would not even consider this expansion if that weren't the case.
For example, Dagger Cloud implements a complete tracing suite (based on OTEL). Customers use it for observability of their builds and tests. Well, it turns out you can use the exact same tracing product for observability of AI agents too. And it turns out that observability is a huge unresolved problem for AI agents! The reason is that, fundamentally, AI agents work exactly like complicated builds: the LLM is building its state, one transformation at a time, and sometimes it has side effects along the way via tool calling. That is exactly what Dagger was built for.
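To make that concrete, here is a minimal sketch (plain OpenTelemetry in Python, not Dagger's actual API) of why the two look identical from a tracing point of view: each agent step is a span, and each tool call is a child span carrying the side effect, exactly like build steps.

    # Minimal sketch, NOT Dagger's API: tracing an agent loop with plain
    # OpenTelemetry. call_llm / run_tool are hypothetical stand-ins.
    from opentelemetry import trace

    tracer = trace.get_tracer("agent")

    def run_agent(task):
        with tracer.start_as_current_span("agent.run") as run:
            run.set_attribute("agent.task", task)
            state = task
            for step in range(10):
                with tracer.start_as_current_span("agent.step") as s:
                    s.set_attribute("agent.step", step)
                    action = call_llm(state)        # state transformation
                    if action.tool:
                        # side effect via tool calling, recorded as its own span
                        with tracer.start_as_current_span("tool." + action.tool):
                            state = run_tool(action)
                    if action.done:
                        return state

Swap "agent" for "build" and "tool" for "compile step" and the trace tree has the same shape, which is why the same observability product covers both.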
So, although we are still struggling to explain this reality to the market, it is actually true that the Dagger platform can run both CI and AI workflows, because they are built on the same fundamentals.
Hmmmm... so I think the crux of the matter is this: you need to clearly articulate why your platform (for both containers and agents) is really helpful for handling cases where there are both state and side effects
I can understand what you're trying to say, but because I don't have clear "examples" at hand which show me why, in practice, handling such cases is problematic and why your platform makes that smooth, I don't "immediately" see the value-added
For me right now, the biggest "value-added" that I perceive from your platform is just the "CI/CD as code", a bit the same as say Pulumi vs Terraform
But I don't see clearly the other differences that you mention (eg observability is nice, but it's more "sugar" on top, not a big thing)
I have the feeling that indeed the clean handling of "state" vs "side-effects" (and what it implies for caching / retries / etc) is probably the real value here, but I fail to perceive it clearly (mostly because I probably don't (or don't yet) have those issues in my build pipelines)
If you were to give a few examples / an ELI5 of this, it would probably help convert more people (eg: I would definitely adopt a "clean by default" way of doing things if I knew it would help me down the road when some new complex-to-handle use case inevitably pops up)
Yes, but we're working on a converter from DDL SQL to pgroll JSON.
The reason for JSON is that pgroll migrations are "higher level". For example, let's say that you are adding a new unique column that should infer its data from an existing column (e.g. split `name` into `first_name` and `last_name`). The pgroll migration contains not only the info that new columns are added, but also how to backfill the data.
The sql2pgroll converter creates the higher-level migration files, but leaves placeholders for the "up" / "down" data migrations.
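Roughly, the migration for that example would look something like this (a sketch from memory - check the pgroll docs for the exact field names):

    {
      "name": "02_split_name",
      "operations": [
        {
          "add_column": {
            "table": "users",
            "column": { "name": "first_name", "type": "text", "nullable": false },
            "up": "split_part(name, ' ', 1)"
          }
        }
      ]
    }

The "up" expression is the backfill: it tells pgroll how to populate the new column from the existing data, which is exactly the part a plain DDL statement can't express, and it's what the converter leaves as a placeholder for you to fill in.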
I completely understand the need for a higher-level language above SQL, but straight JSON is a deal-breaker. It's not just the lack of comments; it's also that editors won't apply SQL syntax highlighting to SQL embedded in JSON strings, which is what helps people catch trivial typos. A configuration language that allows importing a file as a string would let users write the SQL in files with .sql extensions and get syntax highlighting.
pgroll is written in Go, so if you were to accept configuration written in CUE, you would get the best of all worlds: comments, schema validation, and a way to keep the SQL in separate .sql files.
Ha ha, we're not wedded to JSON; I was only explaining why a .sql file is not quite enough. Using some JSON equivalent should be fine, thanks for pointing out the issue.
default values! Since type hints are *hints*, it is difficult to set default values for complicated types. For instance, if you have lists, dicts, or sets in the type signature, it is difficult and non-standard without a library like pydantic. This becomes even more problematic when you start using more complicated data structures. The configuration in this library starts to show the problems: https://koxudaxi.github.io/datamodel-code-generator/custom_t...
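As a concrete illustration of the gap (a minimal sketch, assuming pydantic is installed): with a plain dataclass you have to reach for default_factory for every mutable default, while pydantic just handles it.

    from dataclasses import dataclass, field
    from pydantic import BaseModel

    # Plain dataclass: a bare `tags: list[str] = []` raises an error,
    # so every mutable default needs the default_factory dance.
    @dataclass
    class JobConfig:
        name: str
        tags: list[str] = field(default_factory=list)
        env: dict[str, str] = field(default_factory=dict)

    # pydantic: mutable defaults just work (each instance gets its own copy),
    # and you get validation of the values on top.
    class JobModel(BaseModel):
        name: str
        tags: list[str] = []
        env: dict[str, str] = {}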
The issue is very much the lack of a standard across the entire language, rather than it not being possible.
We've migrated our pyenv + poetry application - it's pretty complex, with data pipeline flows and APIs. The only issue we had was loading .env files - we had done some custom env var scripting as a workaround for an AWS issue, and that was hard to migrate over. However, once that was done (and it was due to a bad initial implementation outside of poetry), moving from poetry to uv was rock solid. No issues and it just worked. I was surprised.
I would really encourage you to think more on this. Good enough for you... is not really the goal. We want standard tools that help with packages and virtual environments that can scale to an organization of 2 python devs... all the way up to hundreds or thousands of devs. Otherwise, it fragments the ecosystem and encourages bugs and difficult documentation that prevents the language from continuing to evolve effectively.
It would be great if that tool existed, but it doesn’t seem to right now. I can appreciate the instinct to improve packaging, but from an occasional Python developer’s perspective things are getting worse. I published a few packages before the pandemic that had compiled extensions. I tried to do the same at my new job and got so lost in the new tools, I eventually just gave up.
One of Python’s great strengths is the belief that there should be one obvious right way to do things. This lack of unity in the packaging environment is ruining my zen.
From an ecosystem point of view, I think all language-specific package managers are crap. Instead of fragmenting the wider ecosystem along language boundaries, we should be looking more for language-independent solutions. Nix is cool, as are buck2/bazel.
Normally I’d agree with you, but they’re talking about the built in tooling. That isn’t going away anytime soon. If it’s good enough for their purposes they should continue using it, especially with no clear consensus on a “better way”.
I think this is a great take. I'm on board with the HCL hate - it really reminds me of shoe-horning a templating solution into a complex, code-level concern - but just removing YAML is not enough. My one caveat to this is that a lot of the "platform engineering" products out there right now seem to want to abstract away all concerns behind a UI, which is just replacing YAML problems with UI problems. What we really need are open-source, cross-language SDKs that allow self-service by developers using the tools they already know, with more specialized folks able to color those settings with additional configuration in their own areas. For instance, an SDK that lets developers say RAM is light, medium, heavy, etc. in whatever language they operate in; followed by a review of the SDK usage and some additional configuration layered on by the ops folks who monitor the entire company's spend and define what light/medium/heavy means. Too many of the platform engineering "solutions" seem to be about vendor lock-in.
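To sketch the shape of what I mean (completely hypothetical names, not any existing product):

    from enum import Enum

    # Dev-facing vocabulary: app teams only ever pick a size.
    class Ram(Enum):
        LIGHT = "light"
        MEDIUM = "medium"
        HEAVY = "heavy"

    # Ops-owned mapping, reviewed and tuned by the folks watching spend.
    RAM_LIMITS_MIB = {Ram.LIGHT: 512, Ram.MEDIUM: 2048, Ram.HEAVY: 8192}

    def deploy(service: str, ram: Ram = Ram.LIGHT) -> dict:
        # Render whatever the platform actually needs (k8s manifest, etc.).
        return {"service": service, "memory_mib": RAM_LIMITS_MIB[ram]}

    print(deploy("checkout-api", Ram.MEDIUM))

The app developer only ever touches the enum; the mapping (and the review of it) belongs to the platform team, with no UI or vendor DSL in between.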
BUT importantly... it REALLY does not work well for lots of teams. For me, this setup has caused production outages multiple times across multiple teams. Maybe the core Python ecosystem should learn from and adopt ideas from other ecosystems that have figured out complex deployment in a much easier way.
I've seen people use that setup but not freeze the dependencies, and so hit errors in production that didn't exist in development and waste days trying to figure out what was going on.
I've seen people use this setup and then struggle to deploy in different environments, especially when a dependency updates and no longer works correctly on a particular device, or where there are differences in behaviour or packaging or something between two different machines.
I've seen people accidentally install packages locally and not add them to the requirements file (especially when they're less experienced with Python), and cause outages when the application crashes on startup.
I've seen people freeze the dependency list and then have excess dependencies floating around because they couldn't differentiate between dependencies that were being used, and dependencies that were previously transitively installed and no longer needed. This doesn't necessarily cause outages, but does slow everything down over time, either in continual package maintenance or in downloading excess packages.
Most of the time, when I've seen teams use this sort of "simple" packaging process, they end up writing a bunch of scripts to facilitate it (because it's rarely so simple in practice). I have seen these scripts fail in almost every possible way. Often this happens in a development environment or before production, but I've seen production issues here as well.
To be clear, I think there are some situations where .venv and requirements.txt really are all you need. But I don't think going down that route removes complexity or makes things easier. Instead, it means you need to manage that essential complexity yourself. There are sometimes advantages to that, and reasons why it might make sense to take that option, but they are relatively rare. And given that pip/venv are right now the most official way of handling packaging in Python, that raises a massive red flag for the entire ecosystem.
The problem mentioned with pyenv is that people accidentally develop/test on the wrong version of python itself. But that's specific to pyenv, and I don't actually see where the article discusses problems with venv. So again: What exact steps would a team take using just pip+virtualenv or pip+conda (the comment you responded to didn't mention pyenv or venv) that would lead to production outages?
It feels like you've determined there's nothing wrong with pyenv, pip, and virtualenv so any issues brought up, you will reject.
If that's not the case, here's the issue - someone used pyenv and did not specify the exact Python version - I believe we were on 3.9, prod was on 3.9.11, and the current Python version was 3.9.12. There was a downstream package that had an OS dependency - I believe it was pandas - that conflicted and needed to be updated locally to work with 3.9.12. This broke and raised an error in production that was not reproducible locally - and when you deploy on AWS, reproducing can be a pain in the butt. I'm sure that if the data scientist had used perfect pyenv, virtualenv, and pip commands, we would have caught this. However, they're very complicated - especially for people who focus on math - so requiring full knowledge of these tools is unrealistic for most data scientists.
> It feels like you've determined there's nothing wrong with pyenv, pip, and virtualenv so any issues brought up, you will reject.
Alternatively, I'm rejecting your claims because you keep making them and then not providing evidence. Now that you've actually described the problem, I can agree that that's a footgun, and pyenv should start to strongly discourage setting a global version in much the same way that pip has started to protect against people using `sudo pip install` to trash their systems.
I REALLY want to use Mojo... but they still don't have basic backwards-compatibility features like lambdas done. I worry Mojo won't get off the ground because they're too focused on hardware performance and aren't building enough features to justify the switch... to get that hardware performance.
I wish there were fixes to devcontainers before adding Copilot. I really want declarative, repeatable builds that are easily used in both Codespaces AND Actions. I know all of the functionality is theoretically there in devcontainer.json, but it is so manual and confusing to configure that any time I've done it, I use it for 2 weeks and then just go back to developing locally because I don't have time to keep it up. ESPECIALLY if you're deploying to AWS and also want to use alternative package managers like poetry, uv, yarn, jsr, etc.
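For reference, the kind of thing I keep trying to maintain looks roughly like this (a sketch from memory - the image and feature IDs are approximate, so check them against the devcontainers spec):

    {
      "name": "my-app",
      "image": "mcr.microsoft.com/devcontainers/python:3.12",
      "features": {
        "ghcr.io/devcontainers/features/aws-cli:1": {},
        "ghcr.io/devcontainers/features/node:1": {}
      },
      "postCreateCommand": "pip install uv && uv sync"
    }

In theory the same file drives Codespaces and (via the devcontainers/ci action) Actions; in practice it's the keeping-it-current part that I never have time for.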
I love Ruff, and I am so excited to see Python ecosystem developers tackling some really big, core, table-stakes problems with Python. Especially now that it is being used beyond scripting and has become foundational to lots of apps.