You seem to know a lot about this area. I don't, but I've heard that deep learning models are black boxes whose behaviour is hard to explain? If you work in a "mission critical" field you'd have to explain all the math behind the model. Say, in healthcare, finance, aviation, etc.
Also, the "big math brains" you're talking about probably read all the books you're shutting down. I'd say their big math brains are the reason we have LLMs today.
> deferring all testing and linting to the CI process is an anti-pattern
I'm confused as to why this is an anti-pattern? My understanding is that the CI pipeline should run unit tests and linting for every commit. But at the same time, developers should run their tests before pushing code.
Of course the CI should always run them, but normally that should just be a confirmation/safeguard.
I've seen too many cases where the devs wouldn't even run the code locally. They would push it and expect the CI to do all the work. That's how you get shitty CI that is always broken.
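For what it's worth, the "run it before you push" habit is easy to automate with a local hook. Here's a minimal sketch of a `pre-push` hook in Python, assuming a project that uses ruff and pytest; both commands are placeholders for whatever your project actually runs:

```python
#!/usr/bin/env python3
"""Sketch of a .git/hooks/pre-push hook that runs lint and tests locally
before anything reaches CI. Assumes a Python project using ruff and pytest;
swap in whatever commands your project actually uses."""
import subprocess
import sys

CHECKS = [
    ["ruff", "check", "."],   # lint (assumed tool)
    ["pytest", "-q"],         # unit tests (assumed tool)
]

for cmd in CHECKS:
    result = subprocess.run(cmd)
    if result.returncode != 0:
        print(f"pre-push: '{' '.join(cmd)}' failed; push aborted.", file=sys.stderr)
        sys.exit(1)

sys.exit(0)
```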
That’s why you use branches though. You can break the CI on your own branch as much as you want, it’s nobody’s business. But a broken CI on a dev branch MUST prevent merging to a release branch.
If you allow devs to push directly to the release branch, thereby breaking the CI, you're absolutely doing it wrong.
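On GitHub, one rough sketch of enforcing that rule is branch protection with required status checks on the release branch. Something like the following; the org, repo, branch and check names are all placeholders, and this is only meant to illustrate the idea:

```python
"""Minimal sketch of enforcing "green CI before merge" on a release branch via
GitHub's branch-protection REST API. Names are placeholders; a real setup would
be tuned to your checks and review policy."""
import os
import requests

def protect_branch(owner, repo, branch, required_checks):
    url = f"https://api.github.com/repos/{owner}/{repo}/branches/{branch}/protection"
    resp = requests.put(
        url,
        headers={
            "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
            "Accept": "application/vnd.github+json",
        },
        json={
            # Require these CI checks to pass, on a branch that is up to date.
            "required_status_checks": {"strict": True, "contexts": required_checks},
            "enforce_admins": True,
            "required_pull_request_reviews": None,
            "restrictions": None,
        },
    )
    resp.raise_for_status()

# e.g. protect_branch("my-org", "my-repo", "release", ["ci/tests", "ci/lint"])
```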
Well, I believe "absolutely doing it wrong" is a bit strongly worded.
Of course you can do it like you said but that means longer feedback loops in general. If the team wants to integrate more often and reduce feedback loops then that model evolves.
I'll give you an example.
In the team I mentioned in my top comment we were initially using a branching model with master, releases/*, hotfixes/*, dev, and features/*, which gradually evolved into master, dev, features/*, and finally ended up as just master and features/*. One important note: for small changes/fixes that needed to get deployed quickly, nobody would bother with a branch; they would just push to master.
This allowed us multiple production deploys per day per developer with no risk. That's why I said I don't get the point of the article: you can absolutely get those short feedback loops and continuous integration if you want them, you just need to set up the process that way.
> Of course you can do it like you said but that means longer feedback loops in general. If the team wants to integrate more often and reduce feedback loops then that model evolves.
Longer feedback loops =/= long feedback loops. You can definitely wait 5 to 10 minutes if it means doing it right.
> One important note: for small changes/fixes that needed to get deployed quickly, nobody would bother with a branch; they would just push to master.
From my experience, the 1-2 line fixes are the ones that benefit the most from automated CI, because you're doing them in a rush. In my team just last week a junior dev asked us to review their PR quickly because it was just 2 lines, and it didn't even compile. We told them to be more careful in the future, but in the end it didn't impact anything. It couldn't possibly have impacted anything thanks to CI; it just makes it impossible to fuck up too recklessly.
Not that I don't see the value proposed by TBD, but I think you can have >90% of said value and none of the downsides using a well thought out branching strategy.
TBD doesn't mean you have a red CI on the main branch. Of course it is always green on main. It means you have short-lived feature branches and rely on runtime checks for feature gating. A broken main will halt TBD. You are mischaracterizing what TBD is.
> Depending on the team size, and the rate of commits, short-lived feature branches are used for code-review and build checking (CI). [...] Very small teams may commit direct to the trunk.
From the perspective of the dev, if my local CI isn't worth anything towards a merge and upstream CI is gospel, why run locally? If I'm reasonably certain that the two jobs are duplicative in output, it could be seen as wasted time, especially if I have a PM hounding for features. I don't call that lazy, I call that a trade-off. (coming from someone who is constantly running tests locally before pushing upstream)
Isn't that slower and less efficient? Usually the CI has to run a full build from scratch before it can run the tests, but locally for me it's going to be an incremental build that takes a second or two. I can also run the subset of tests that I know are affected by my changes to get fast and reasonably reliable feedback vs waiting for CI to run the whole test suite.
Linters and tests should pass before code is rebased onto master. YOLOing code onto master is an antipattern because if your code breaks the build then you are holding up everyone else from being able to make changes until yours gets reverted. If linters and tests are already passing there is a very good chance the rebase won't break anything.
You can go a step further. Instead of merging anything yourself, you tell the CI to merge it. Then the CI can make a "merge-test" branch, run all the tests on it, and if they pass, ff-merge it to master for real. No need for "good chance" or rebasing just to keep up with master.
It does take some extra work though, because GH and others don't really support this out of the box.
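For illustration, a rough sketch of that flow driven by plain git commands might look like this; the branch names, remote and test command are assumptions, and a real bot would need locking, cleanup and error reporting on top:

```python
"""Rough sketch of the "let CI do the merge" flow described above, driven by
plain git commands. Branch names, test command, and remote are assumptions."""
import subprocess

def run(*cmd):
    # Fail loudly if any git/test step fails.
    subprocess.run(cmd, check=True)

def ci_merge(feature_branch, test_cmd=("pytest", "-q"), remote="origin"):
    run("git", "fetch", remote)
    # Build a throwaway merge-test branch on top of current master.
    run("git", "checkout", "-B", "merge-test", f"{remote}/master")
    run("git", "merge", "--no-ff", f"{remote}/{feature_branch}")
    # Run the full test suite on the merged result.
    run(*test_cmd)
    # Tests passed: fast-forward master to the tested merge commit.
    run("git", "checkout", "-B", "master", f"{remote}/master")
    run("git", "merge", "--ff-only", "merge-test")
    run("git", "push", remote, "master")
```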
GitLab does support this with "Merged Results pipelines"[0]. We use them extensively alongside their merge train functionality to sequentialize everything, and it's fantastic.
Isn’t this the way everyone works? Write code, run some tests/linting locally, then create a PR to master and the CI server runs all the tests and reports pass/fail. Changes without a pass can’t be merged to master.
The extra step is that once the change is accepted, the process of rebasing or merging it into master is done by a bot that checks whether it would break master before proceeding to do so.
My issue with this approach is that it becomes tricky to scale, since you can only have one job running at a time. Allowing master to potentially break scales better, because you can run a job for each commit on master which hasn't been evaluated. Technically you could make that approach work by rebasing onto the last commit being tested instead of onto master, but this adds extra complexity which I don't think standard tooling can easily handle.
https://zuul-ci.org/ and some other systems solve it with optimistic merges. If there's already a merge job running, the next one assumes it will succeed, and tests against master + the first change + itself merged together. If anything breaks, the optimistic merges are dropped from the queue and everything restarts from the second change onward. OpenStack uses it and it works pretty well as long as merges don't typically fail.
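A toy sketch of that queue bookkeeping, ignoring the fact that real systems run the test jobs in parallel, and with run_tests standing in for an actual CI job:

```python
"""Toy sketch of the optimistic ("speculative") merge queue idea described
above. run_tests() is a stand-in for a real CI job; only the queue
bookkeeping is the point here."""
from collections import deque

def process_queue(master, queue, run_tests):
    """master: list of merged changes; queue: pending changes in order.
    run_tests(state) -> bool, where state = master + speculative changes."""
    pending = deque(queue)
    while pending:
        # Test each pending change on top of master plus every change
        # ahead of it in the queue, assuming those will all pass.
        batch = list(pending)
        results = []
        speculative = list(master)
        for change in batch:
            speculative = speculative + [change]
            results.append((change, run_tests(speculative)))
        # Merge changes up to the first failure; drop the failing one and
        # re-test everything behind it (now without it) on the next pass.
        for change, ok in results:
            if ok:
                master.append(change)
                pending.popleft()
            else:
                pending.popleft()   # drop the failing change
                break               # restart testing for the rest
    return master
```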
This can result in a broken master if there were new commits added to master since the pull request was submitted (but before it is merged).
The solution is to require that every PR be rebased onto / synced with master before it can be merged. GitHub has an option for enforcing this. The downside is that this often results in lots of re-running of tests.