YAML is fine for what it is: a markup language. I have no problem with it being ...

LeonM · 2025-03-20T09:57:34 1742464654

> CI is not "configured", it is coded.

Finally! I have always struggled to explain to others why YAML is OK-ish as a language but never seems to work well for the things people try to do with it, especially stuff that needs to run commands, such as CI.

> How does the resulting YAML look like? How do you run this stuff locally? How do you debug this? Just don't go there.

Agreed. GitHub Actions, or any remote CI runner for that matter, makes the problem even worse. The whole cycle of pushing CI code, waiting 10 minutes while praying for it to work, still getting an error, trying to figure out the mistake, fixing one subtle syntax error, then pushing the code again in the hope that it works this time is just a terrible workflow. Massive waste of time.

> You will not be able to avoid YAML completely, obviously, but use it the way it was originally intended to.

Even for configurations YAML remains a pain, unfortunately. It could have been great for configs, but in my experience the whole strict whitespace (tabs-vs-spaces) part ruined it. It isn't a problem when you work from an IDE that protects you from accidentally using tabs (also, auto-formatting for the win!) but when you have to write YAML configuration (for example: Netplan) on a remote server using just an editor it quickly becomes a game of whack-a-mole.

motorest · 2025-03-20T11:20:17 1742469617

> Especially stuff that needs to run commands, such as CI.

I don't understand what problem you could possibly be experiencing. What exactly do you find hard about running commands in, say, GitLab CICD?

cmsj · 2025-03-20T11:56:43 1742471803

So, I'm not interested in the debate about the correctness (or otherwise) of yaml as a declarative programming language, but I will say this...

iterating a GitHub Actions workflow is a gigantic pain in the ass. Capturing all of the important logic in a script/makefile/whatever means I can iterate it locally way faster and then all I need github to do is provision an environment and call my scripts in the order I require.

motorest · 2025-03-20T12:00:19 1742472019

> iterating a GitHub Actions workflow is a gigantic pain in the ass. Capturing all of the important logic in a script/makefile/whatever means I can iterate it locally way faster and then all I need github to do is provision an environment and call my scripts in the order I require.

What's wrong with this?

https://docs.github.com/en/actions/writing-workflows/choosin...

kbolino · 2025-03-20T13:27:43 1742477263

When it gets realistic, with conditions, variable substitutions, etc., it ends up being 20 steps in a language that isn't shell but is calling shell over and over again, and can't be run outside of CI. Whereas, if you just wrote one shell script, it could've done all of those things in one language and been runnable locally too.
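
A minimal sketch of that idea, written in Python rather than shell purely for illustration: the conditions and the variable substitution live in one ordinary script that plans the same commands locally and in CI. The `BRANCH` and `IMAGE_TAG` variables and the `make` targets are hypothetical placeholders, not taken from any real pipeline.

```python
import os
from string import Template


def plan_steps(env):
    """Return the shell commands to run for this environment (hypothetical steps)."""
    branch = env.get("BRANCH", "local")
    tag = env.get("IMAGE_TAG", "dev")

    steps = ["make build", "make test"]
    # A condition written in the language itself, not in workflow expression syntax.
    if branch == "main":
        # Variable substitution done by the language, not by a CI templating layer.
        steps.append(Template("make publish TAG=$tag").substitute(tag=tag))
    return steps


if __name__ == "__main__":
    # Print the plan for the current environment; works the same on a laptop or a runner.
    for command in plan_steps(os.environ):
        print(command)
```

Because the plan is a plain function of the environment, it can be inspected locally before anything is pushed.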

motorest · 2025-03-21T04:19:50 1742530790

> When it gets realistic, with conditions, variable substitutions, etc.,

What exactly do you find hard in writing your own scripts with a scripting language? Surely you are not a software developer who feels conditionals and variable substitutions are hard.

> it ends up being 20 steps in a language that isn't shell but is calling shell over and over again, and can't be run outside of CI.

Why are you writing your CICD scripts in a way that you cannot run them outside of a CICD pipeline? I mean, you're writing them yourself, aren't you? Why are you failing to meet your own requirements?

If you have a requirement to run your own scripts outside of a pipeline, how come you're not writing them like that? It's CICD 101 that those scripts should be runnable outside of the pipeline. From your description, you're failing to even follow the most basic recommendations and best practices. Why?

That doesn't sound like a YAML problem, does it?

kbolino · 2025-03-21T15:45:29 1742571929

This is not about YAML in some general or abstract sense, it is about a YAML-based domain-specific language. If you think this is just about YAML, you are hyper-focused on the wrong detail.

In order to use this domain-specific language properly, you first must learn it, and learning YAML is but a small part of that. Moreover, it is not immediately obvious that, once you know it, you actually want to avoid it. But you can't avoid it entirely, because it is the core language of the CI/CD platform. And you can't know how to avoid it effectively until you have spent some time just using it directly. Simplicity comes from tearing away what is unnecessary, but to discern necessary from unnecessary requires judgment gained by experience. There is no world in which this knowledge transfers immediately, frictionlessly, and losslessly.

Furthermore, there is a lot that GitHub (replace with platform of choice) could have done to make this better. They largely have no incentive to do so, because platform lock-in isn't a bad thing to the platform owner, and it's a nontrivial amount of work on their part, just as it is a nontrivial amount of work on your part to learn and use their platform in a way that doesn't lock you into it.

maratc · 2025-03-20T16:25:18 1742487918

Q: How do you determine what date it was 180 days ago?

A: Easy! You just spin up a Kubernetes pod with an Alpine image, map a couple of files inside, run a bash script of "date" with some parameters, redirect output to a mapped file, and then read the resulting file. That's all. Here's a YAML for you. Configuration, baby!

(based on actual events)
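
For contrast, the same question in a general-purpose language is a single subtraction. This is only a sketch using Python's standard library; `days_ago` is a hypothetical helper name.

```python
from datetime import date, timedelta


def days_ago(n, today=None):
    """Return the calendar date n days before `today` (defaults to the current date)."""
    if today is None:
        today = date.today()
    return today - timedelta(days=n)


# Prints the date 180 days before the day the script is run.
print(days_ago(180))
```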

jiggawatts · 2025-03-21T08:02:24 1742544144

At first I assumed you were kidding, then I realised that sadly… you probably weren’t.

maratc · 2025-03-21T10:19:16 1742552356

I wasn’t. This goes to show that when all you have is a YAML hammer, every problem has to look like a YAML-able nail. Still there would be people who would say I’m “blaming my tools” and “everything is covered in chapter 1 of yaml for dummies.”

tom_ · 2025-03-20T15:44:32 1742485472

Nothing significant on the face of it and I think that's pretty much exactly what's being suggested: don't have anything particularly interesting in the .yml file, just the bare minimum plus some small number of uncomplicated script invocations to install dependencies and actually do the build.

(Iterating even on this stuff by waiting for the runner is still annoying though. You need to commit to the repo, push, and wait. Hence the suggestion of having scripts that you can also run locally, so you can test changes locally when you're iterating on them. This isn't any kind of guarantee, but it's far less annoying to do (say) 15 iterations locally followed by the inevitable extra 3 remotely than it is having to do all 18 remotely, waiting for the runner each time then debugging it by staring at the output logs. Even assuming you'd be able to get away with as few as 15 given that you don't have proper access to the machine.)

bastardoperator · 2025-03-20T16:27:40 1742488060

But GitHub recommends that, so if people don't follow best practices, and then complain when the docs are clear, who's at fault? The person writing against a system they don't understand because they haven't read the docs or the people who recommend what you're professing in the docs?

kbolino · 2025-03-21T15:30:48 1742571048

> But GitHub recommends that

Where?

bastardoperator · 2025-03-21T15:34:49 1742571289

Here:

https://docs.github.com/en/actions/writing-workflows/choosin...

kbolino · 2025-03-21T15:50:11 1742572211

Nothing in that document states anything remotely like the antecedent of "that", which was:

> don't have anything particularly interesting in the .yml file, just the bare minimum plus some small number of uncomplicated script invocations to install dependencies and actually do the build

It is a very basic "how to" with no recommendations.

Moreover, they directly illustrate a bad practice:

      - name: Run the scripts
        run: |
          ./my-script.sh
          ./my-other-script.sh

This is not running two scripts, this is running a shell command that invokes two scripts, and has no error handling if the first one fails. If that's the behavior you want, fine, but then put it in one shell script, not two. What am I supposed to do with this locally? If the first shell script fails, do I need to fix it, or do I just proceed on to the second one?
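
For local runs, one way to make the failure behaviour explicit is a small wrapper that stops at the first command that fails. This is a sketch in Python; `run_in_order` is a hypothetical helper, not something from the GitHub documentation.

```python
import subprocess
import sys


def run_in_order(commands):
    """Run each command in turn; stop and return the exit code of the first failure."""
    for command in commands:
        result = subprocess.run(command)
        if result.returncode != 0:
            # Report which command failed and skip everything after it.
            print(f"stopped: {command!r} exited with {result.returncode}", file=sys.stderr)
            return result.returncode
    return 0
```

Calling `run_in_order([["./my-script.sh"], ["./my-other-script.sh"]])` would run the second script only if the first one exits with status 0, which answers the "do I just proceed?" question in code rather than by convention.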

bastardoperator · 2025-03-21T16:24:47 1742574287

It does, even if you don't like it. You can put your logic in a script and execute that. That is what is being conveyed here in a blisteringly simple fashion. You could also make it one script, or two or three, you could even break those out into steps.

This is invoking a shell and that's how shells typically work, one command at a time. Would it make you feel better if they added && or used a step like they also recommend to split these out? You can put the error handling in your script if need be, that's on you or the reader, most CI agents only understand true/false or in this case $?.

Nobody said they want that behavior, they're showing you the behavior. They actually show you the best practice behavior first, not sure if you didn't read that or are purposely omitting it. In fact, the portion you highlight is talking about permissions, not making suggestions.

      - name: Run a script
        run: ./my-script.sh
      - name: Run another script
        run: ./my-other-script.sh

kbolino · 2025-03-21T16:42:28 1742575348

This is not a discussion about what's possible, it's a discussion about what's best. You can write your own opinion here, and it seems like we're in violent agreement, but that doesn't make our opinion GitHub's opinion.

That page is just one small part of a much larger reference document, and it doesn't seem opinionated at all to me. Plus there are dozens of other examples elsewhere in the same reference that are not simple invocations of one shell script and nowhere are you admonished not to do things that way.

bastardoperator · 2025-03-21T17:09:46 1742576986

And they show those patterns first. You had to take an example that is clearly about script permissions and misrepresent it. Yeah, it's not opinionated, it's fact. That's how it works...

kbolino · 2025-03-21T17:18:34 1742577514

At best, we are talking past each other. At worst, you are misreading everything I write to play gotcha games. Whatever, I'm glad you were able to figure out exactly the right things to do from a first read of a large and complex document that doesn't say anything of the sort. As for the rest of us mere mortals, we're stuck figuring these things out by trial and error, or even worse, having to pick up the pieces from somebody else's left-behind mistakes.

tom_ · 2025-03-22T02:00:59 1742608859

It's the reference manual. It's just a list of things you can do. If you like this specific thing, and think this should be the main way you express your build process, great. I think that too. Meanwhile with GitHub Actions you can also do this big pile of shit that the manual also describes: https://docs.github.com/en/actions/writing-workflows/choosin...

motorest · 2025-03-20T11:10:47 1742469047

> However, CI is not "configured", it is coded.

No, it really isn't. I'll clarify why.

Pretty much all pipeline services share the same architecture pattern:

* A pipeline run is comprised of one or more build jobs,

* Pipeline runs are triggered by external events,

* Build jobs have contexts and can output artifacts,

* Build jobs are grouped into stages,

* Stages are organized as a directed graph,

* Transitions between stages in the directed graph are governed by a set of rules, some supported by default (e.g., if a job fails then the stage fails) complemented by custom rules (manual or automatic approvals, API tests, baking periods, etc).

This is the textbook scenario ideal for DSLs. You already are bound to an architecture pattern, thus there is no point in reinventing the wheel each time. Just specify your stages and which jobs run as part of each stage, manage artifacts and promotion logic, and you're done.

You do not need to take my word for it. Take a look at GitLab CICD for a pipeline with build, test, and delivery stages. See what a mess you will put together if you support the same feature set with whatever scripting language you choose. There is no discussion or debate.
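
To make the "stages as a directed graph" point from the list above concrete, here is a small sketch that derives an execution order from stage dependencies. It uses Python's standard `graphlib` module (Python 3.9+); the stage names are hypothetical, and it covers only ordering, not artifacts, approvals, or promotion rules.

```python
from graphlib import TopologicalSorter

# Hypothetical stages; each maps to the set of stages that must finish first.
STAGES = {
    "build": set(),
    "test": {"build"},
    "deliver": {"test"},
}


def stage_order(stages):
    """Return one valid execution order for the stage graph."""
    return list(TopologicalSorter(stages).static_order())


print(stage_order(STAGES))
```

Whether this part belongs in a DSL or in code is exactly what the thread disagrees about; the sketch only shows what the graph itself amounts to.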

baq · 2025-03-20T11:38:21 1742470701

I can’t understand how you can say DSL and YAML in the same sentence and say it’s fine. YAML is a serialization format. A bad DSL would be a welcome improvement over GHA pipelines in YAML. You’re fundamentally confusing concepts here, you want to restrict flexibility (I agree with that btw) by using a simplistic language, but what it actually does is increase complexity of the code comprising the pipeline with zero hard restrictions.

maratc · 2025-03-20T16:16:16 1742487376

> * Stages are organized as a directed graph

The problem starts when that graph cannot be determined in advance and needs to be computed at runtime. It's a bit better when it's possible to compute that graph as a first step, and it's a lot worse when one needs to do a couple of stages before being able to compute the next elements of the graph. The graph computation is terrible enough in e.g. Groovy, but having to do it in YAML is absolutely horrendous.

> Take a look at GitLab CICD for a pipeline with build, test, and delivery stage

Yeah, if your workflow fits in a kindergarten example of "build, test, and delivery", then yeah, it's YAML all the way baby. Not everyone is so fortunate.

duped · 2025-03-20T16:15:37 1742487337

It's funny how you say this is the textbook scenario ideal for DSLs and I see it as the textbook scenario ideal for a real programming language. Organizing stages as a DAG with "transition ruled by a set of rules" is bonkers, I know how to write code with conditional logic and subroutine calls, give that to me.

Wrapping it in a DSL encoded as YAML has zero benefit other than it being easier for a team with weak design skills to implement and harder for users to migrate off of.

deng · 2025-03-20T12:05:11 1742472311

We don't disagree here. There are tools which support you in doing this, and I mentioned a few of them in my post (Make, Just, doit, mage). There are many more. I also think that re-inventing these tools is a waste of time, but it is still better than shoehorning this into YAML. You seem to think YAML is some kind of DSL for pipelines. It really is not.

Lammy · 2025-03-20T20:21:05 1742502065

> YAML is fine for what it is: a markup language.

Pardon my pedantry, but the meaning of YAML's name was changed from the original “Yet Another Markup Language” to “YAML Ain't Markup Language” in a 2002 draft spec because YAML is, in fact, not a markup language :)

Compare:

https://yaml.org/spec/history/2001-12-10.html

https://yaml.org/spec/history/2002-04-07.html

adolph · 2025-03-20T15:17:25 1742483845

> However, CI is not "configured", it is coded. . . . YAML was continuously extended to deal with that, so it developed into much more than just "markup", but it grew into this terrible chimera.

Brings to mind the classic "Kingdom of Nouns" [0] parable, which I read to my kid just last week. The multi-line "run" nodes in GitHub Actions give me the heebie-jeebies, like how MUMPS data validation was maintained in metadata of VA-Fileman [1].

0. https://steve-yegge.blogspot.com/2006/03/execution-in-kingdo...

1. https://www.hardhats.org/fileman/pm/gfs_frm.htm

bastardoperator · 2025-03-20T16:19:18 1742487558

I've created multiple actions, reusable, composite, along with multiple Jenkins plugins and CircleCI Orbs. I disagree, code your actions, your Jenkins plugins, your orbs, whatever. Those are just code wrappers that expose configuration via YAML or Pipeline DSL. Agreed, coding in YAML is pretty bad, but ultimately it's a choice.

I will take the Actions path 100% of the time. Building your own action is so insanely simple it makes me wonder if the people complaining about YAML understand the tooling, because it's entirely avoidable. It also coincides with top comments about coding your own CI: if you're just "using" YAML you're barely scratching the surface.

ruuda · 2025-03-20T23:38:47 1742513927

If you have to deal with tools that need to be configured with yaml, give https://rcl-lang.org/ a try! It can be "coded" to avoid duplication, with real variables, functions, and loops. It can show you the result that it evaluates to. It can do debug tracing, and it has a built-in build command to generate files like GitHub Actions workflows from an RCL file.
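
The general technique of generating configuration from code can be sketched without any particular tool. The following is not RCL and says nothing about its syntax; it is a plain Python illustration that builds a workflow-shaped structure with a loop and prints it as JSON, which YAML 1.2 was designed to accept as a subset. The `make_workflow` helper and the script names are hypothetical, and whether a given CI service accepts the output unchanged is an assumption not verified here.

```python
import json


def make_workflow(scripts):
    """Build a workflow-shaped dict with one step per script (illustrative only)."""
    steps = [{"name": f"Run {script}", "run": f"./{script}"} for script in scripts]
    return {
        "name": "ci",
        "on": ["push"],
        "jobs": {"build": {"runs-on": "ubuntu-latest", "steps": steps}},
    }


# The loop above replaces hand-copied step blocks; the result can be inspected before use.
print(json.dumps(make_workflow(["my-script.sh", "my-other-script.sh"]), indent=2))
```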