> All those big changes introduce commits that make git bisect generally slower.
Bisection search is log2(n) so doubling the number of commits should only add one more bisection step, yes?
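Quick sanity check with nothing but the standard library (the commit counts are arbitrary):

    import math

    # Bisection needs roughly ceil(log2(n)) build-and-test steps, so
    # doubling the commit count adds exactly one step.
    for n in (100, 500, 1000):
        print(n, math.ceil(math.log2(n)), "->", 2 * n, math.ceil(math.log2(2 * n)))
    # 100: 7 -> 200: 8;  500: 9 -> 1000: 10;  1000: 10 -> 2000: 11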
> Which might be awful if you also have some C code to recompile at every step of bisecting.
That reminds me, I've got to try out ccache (https://ccache.dev/) for my project. My full compile is one minute, but the three files that take longest to compile rarely change.
Perhaps the poster meant that the contents of the commits themselves make bisection slower? By touching a lot of files unnecessarily, incremental build systems have to do more work than otherwise.
Could be? I've not heard of that problem, but I don't know everything.
I'm not used to Python code (which is all that black touches) as being notably slow to build, nor am I used to incremental build systems for Python byte compilation.
And I expect that in a two-developer project big enough for things to be slow, most of the files will be semantically unchanged, only swapping back and forth between two re-blackened syntactic representations, so wouldn't a caching build system be able to cache both forms?
(NB: I said "caching build system" because an incremental build system that expects commits in linear time order wouldn't be much help during bisection, which jumps back and forth through the history.)
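To illustrate what I mean by "caching", here's a toy sketch rather than a real build system: key each compiled artifact by a hash of its source, so flipping a file back and forth between its two formattings keeps hitting the cache. The cache directory name is made up.

    import hashlib
    import py_compile
    from pathlib import Path

    CACHE = Path(".toy-build-cache")   # hypothetical cache location
    CACHE.mkdir(exist_ok=True)

    def cached_compile(src: Path) -> Path:
        # Key the .pyc by the file's content hash, not its mtime, so both
        # re-blackened variants of the same file stay warm in the cache.
        digest = hashlib.sha256(src.read_bytes()).hexdigest()
        out = CACHE / f"{digest}.pyc"
        if not out.exists():
            py_compile.compile(str(src), cfile=str(out), doraise=True)
        return out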
Only files that change because the rules changed, which shouldn't be frequent. I don't remember black changing radically since its creation; do you have an example of some widespread syntax change?
1) You don't have one black commit for every non-black commit, do you? The general best practice is to do what kuu suggested and have a specific black version as part of the development environment, with a pre-commit hook to ensure no random formatting gets introduced (a minimal sketch of such a hook follows below).
2) Assuming 500 commits in your bisection range, that's, what, about 9 compilations you'll need to do, so roughly 9 hours to run. Even with a black commit after every human commit, that's, yes, 1 hour more, but it's also only about 11% longer.
Even with only 10 non-black commits and 10 black commits, your average bisect time will only increase from 3.6 hours to 4.6 hours, or 30% longer.
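Regarding 1), a minimal sketch of such a hook, in case it helps. This is a plain git hook written in Python; it assumes black is installed in the project environment, and it checks the working-tree copies of the staged files, which is good enough for illustration. Most teams would use the pre-commit framework instead.

    #!/usr/bin/env python3
    # Save as .git/hooks/pre-commit and mark it executable.
    import subprocess
    import sys

    # Staged Python files (added, copied, or modified).
    staged = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    ).stdout.split()
    py_files = [f for f in staged if f.endswith(".py")]

    if py_files:
        # `black --check` exits non-zero if any file would be reformatted.
        result = subprocess.run([sys.executable, "-m", "black", "--check", *py_files])
        if result.returncode != 0:
            print("Commit blocked: run black on the files listed above first.")
            sys.exit(1)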
I'm curious to know what project you are on with 1-hour build times and a regular need for hours-long bisection searches, but no common Python dev environment with a specific black version. Are you using ccache and/or distributed builds? If not, isn't there a clear economic justification for improving your build environment? I mean, I assume developers need to build and test before each commit, which means each commit is already taking an hour. Isn't that a waste of expensive developer time?
And, I assume it's not the black formatting changes which result in hour-long builds. If they do, could you explain how?
As I said in other comments, if you try to force contributors to reproduce your local setup exactly, you will be left with no contributors. Which is why you set up a CI to run the tests… because people most likely won't.
As for build times, it was an extreme example. But even an extra step taking 5 extra minutes is very annoying to me…
> if you try to force contributors to reproduce exactly your local setup, you will be left with no contributors. ...
That's not been my experience. To the contrary, having a requirements.txt means your contributors are more likely to have a working environment; otherwise your package depends on package X, a contributor has a 5-year-old buggy version of X without realizing it, and your program does the wrong thing.
In any case, your argument only makes sense if no one on the project uses black or another code formatter. Even if you alone use it, odds are good that most of your collaborators' commits will need to be reformatted.
> .. an extra step taking 5 extra minutes ...
How do black reformatting changes cause an extra 5 minutes? What Python code base with only a couple of contributors and no need for a requirements.txt takes 5+ minutes to byte-compile and package the Python code, and why?
Adding 5 minutes to your build means your bisections are taking at least an hour, so it seems like focusing on black changes is the wrong place to look.
None of your comments mention running the full test suite, only build.
When I've used bisection, I've always had a targeted test that I was trying to fail, not the entire test suite. This is because the test suite at the time of that commit wasn't good enough to detect this failure. Otherwise it would have failed with that commit.
Instead, a new failure mode is detected, an automated test is developed for it, and that test is used to probe the history and identify the offending commit.
Why are your bisections doing the full suite?
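Concretely, the workflow I mean is a small targeted probe handed to `git bisect run`, so each bisection step runs one focused check rather than the whole suite. The module and function names below are invented for the example.

    #!/usr/bin/env python3
    # probe_regression.py -- exit 0 on good commits, 1 on bad ones.
    #
    # Usage:
    #   git bisect start <bad-rev> <good-rev>
    #   git bisect run python3 probe_regression.py
    import sys

    # Hypothetical: the function whose behaviour regressed.
    from mypackage.parser import parse_header

    def main() -> int:
        # The one input that exposes the bug; the rest of the suite passes.
        result = parse_header("Content-Length: 42")
        return 0 if result == ("Content-Length", "42") else 1

    if __name__ == "__main__":
        sys.exit(main())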
> Black reformatting causes more steps in bisecting
Yes, of course it does. But it's log2(n).
The worst-case analysis I did assumed there was a black commit after every human commit. This is a bad practice. You should be using black as a pre-commit hook, in which case only your new collaborator's commits will cause re-formats. And once they are onboard, you can explain how to use requirements.txt in a virtualenv.
If only 10% of the commits are black reformattings (which is still high!), then a bisection over 100 human commits (plus 10 black commits) goes from about 6.64 test runs to 6.78, which with a 5-minute test suite takes roughly 42 extra seconds.
If it's 3% then your bisection time goes up by 13 seconds.
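Showing my work (plain log2 estimates of the bisection steps; exact rounding shifts these by a second or so):

    import math

    SUITE = 5 * 60  # seconds per bisection step, i.e. one test-suite run

    for black in (10, 3):
        extra_steps = math.log2(100 + black) - math.log2(100)
        print(black, round(extra_steps * SUITE))
    # roughly 41 s of extra wall-clock for 10 black commits, 13 s for 3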
If you are so worried about 13 seconds per bisection then how much time have you spent reducing the 5 minute test suite time? I presume you run your test suite before every commit, yes? Because if not, and you're letting CI run the test suite, then you're likely introducing more commits to fix breaking tests than you would have added via black, and taking the mental task switch hit of losing then regaining focus.
> This is a bad practice. You should be using black as a pre-commit hook
I would reject such commits in review.
A human might add one or two items to a list and black might decide it's now too long, and make 1 line into 10 lines.
Now I have to manually compare the list item by item to figure out what has changed.
So I normally require formatting to be done in a separate commit, because I don't want to review the larger-than-necessary diffs that come from doing it within the same commit.
> A human might add one or two items to a list and black might decide it's now too long, and make 1 line into 10 lines.
A human might add one or two items to a list, decide it's now too long, and make 1 line into 10 lines.
Including the same hypothetical first contributor you mentioned earlier, who you think will find using requirements.txt too big a barrier to entry.
Onboarding occurs either way.
I get that you don't like using black - and that's fine! I don't use black on my project either.
But it seems like you're trying to find some other reason to reject black, and constructing hypotheticals that don't make any sense.
Just say you don't like black's choices, and leave it at that.
> I get that you don't like using black - and that's fine! I don't use black on my project either.
Well according to this comment, it's because we are noobs: "the people that disagree just haven't figured out that they're wrong yet"
> But it seems like you're trying to find some other reason to reject black, and constructing hypotheticals that don't make any sense.
After the n-th time I have to set the entire virtualenv up again on a different branch just to re-run black and isort to backport a fix to a release branch… it does get tiring.
I presume most people here just do websites and don't really have a product released to customers who pay to have old versions supported for years, so black changing syntax is a one-time event rather than a continuous source of daily pain.
But it seems the commentators here don't have the experience to know there might be a use case they didn't think of.