I always find committing to LLVM very nerve-racking, because of the post-commit CI testing. LLVM supports so many architectures that more often than not something I write will fail on one of them. And the only way for me to find out is to commit it, wait for a buildbot to fail (which can take a few hours, during which I really can't leave my computer lest I leave trunk broken on some buildbot, which is a big faux pas), revert it, and then figure out what went wrong. I'm hoping that at some point this will be improved, such that I can run the whole buildbot army on my commit before putting it on trunk.
I get the feeling that there are parts of the community that feel the same way. I'm hoping that the planned move to github will naturally cascade into pre-commit checks.
A big problem is having enough hardware to get CI throughput one or two orders of magnitude above what it is now. Unfortunately that's not an easy thing considering how "heavy" it is to build and test the full toolchain.
Rust uses https://github.com/rust-community/bors, which maintains a linear queue of PRs; each PR can only land after being rebased onto the commits ahead of it in the queue and subsequently passing tests.
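Conceptually it's not much more than a loop that tests each PR on top of whatever has already landed. Here's a toy Python sketch of that idea (hypothetical, not bors' actual code; the "repo" is just a list of commit labels so the sketch runs standalone):

```python
# Toy sketch of a bors-style linear merge queue; not bors' actual code.

def run_tests(tree):
    """Stand-in for a full build-and-test run; here, pretend everything passes."""
    return True

def process_queue(prs, trunk):
    """Land PRs strictly one at a time, each tested on top of everything before it."""
    for pr in prs:
        candidate = trunk + [pr]      # conceptually: the PR rebased onto current trunk
        if run_tests(candidate):      # test the exact tree that would become trunk
            trunk = candidate         # only then does it land
        else:
            print(f"rejecting {pr}: tests failed on the rebased tree")
    return trunk

if __name__ == "__main__":
    print(process_queue(["pr-1", "pr-2"], ["initial-commit"]))
```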
dlang uses several. One notable one is the "autotester" written by Brad Roberts. It's built to ensure that changes which break the D compiler (as caught by the test suite) never land as commits.
Most of the .NET repos are structured to use inner and outer loop testing with Jenkins. Most tests on most architectures are run in the inner loop, which are kicked off in parallel as soon as you make a pull request to one of the .NET repositories on Github.
Some repositories, like CoreCLR, have outer loop testing that runs on a separate schedule (nightly, I think), but those tests are far less likely to break and are more devoted to finding rare and difficult-to-compute edge cases.
Interesting to see screenshots of LCOV. I'm hoping to get an intern to work on test coverage this summer, and I wondered whether LCOV is still current. Looks like the latest release is from December 2016.
I can say that for C and C++, the compilation is very often parallelized at the translation unit (file) level, by starting multiple instances of the compiler either locally or over a network with something like distcc.
This is simple and effective enough that there wouldn't be much gain in parallelizing the compilers: all the cores are already busy most of the time.
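As a rough illustration of that setup (a sketch assuming gcc is on the PATH and a directory of .c files; this is just the make -j / distcc idea in miniature, not what either tool actually does):

```python
# Minimal sketch of translation-unit-level parallelism: one compiler process
# per .c file, run concurrently. The compiler itself stays single-threaded;
# the parallelism comes from running many instances at once.

import glob
import subprocess
from concurrent.futures import ThreadPoolExecutor

def compile_one(source):
    obj = source.replace(".c", ".o")
    subprocess.run(["gcc", "-c", source, "-o", obj], check=True)
    return obj

sources = glob.glob("*.c")
with ThreadPoolExecutor(max_workers=8) as pool:
    objects = list(pool.map(compile_one, sources))
print(objects)
```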
Certainly the Roslyn C# compiler is highly parallel. All files are parsed in parallel, then all classes are bound (semantically analyzed) in parallel, then the IL serialization phase is sequential.
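Schematically, that phase structure looks something like the following (a Python sketch of the shape of the pipeline, not Roslyn's actual code; the three phase functions are just stubs):

```python
# Sketch of the phase structure described above: parallel parse, parallel bind,
# sequential IL emission. The phase functions are hypothetical stand-ins.

from concurrent.futures import ThreadPoolExecutor

def parse_file(path):      return f"syntax-tree({path})"
def bind_class(tree):      return f"bound({tree})"
def emit_il(bound_class):  return len(bound_class)   # pretend this appends to the output image

files = ["a.cs", "b.cs", "c.cs"]
with ThreadPoolExecutor() as pool:
    trees = list(pool.map(parse_file, files))    # phase 1: parse in parallel
    bound = list(pool.map(bind_class, trees))    # phase 2: bind in parallel

image_size = sum(emit_il(b) for b in bound)      # phase 3: serialization is sequential
print(image_size)
```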
I wouldn't say that's what most people mean by parallel, but in that case I think you're better off building a layer on top of the compiler for that.
For instance, provided deterministic compilation, you could keep a networked cache of compiled libraries that would be delivered as needed.
Trying to be network-parallel at any finer level is probably a waste of time -- network and (de)serialization overhead would eat away all the advantages.
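A minimal sketch of that cache idea, assuming deterministic compilation so a hash of the source plus flags fully identifies the output (hypothetical; real tools in this space such as ccache or Bazel's remote cache are far more careful):

```python
# Sketch of a networked cache for deterministic compilation: the key is a hash
# of the source text plus the compiler flags, so identical inputs can be served
# from the cache instead of being recompiled. The in-memory dict stands in for
# a networked key/value store.

import hashlib

class CompileCache:
    def __init__(self):
        self.store = {}

    def key(self, source_text, flags):
        h = hashlib.sha256()
        h.update(source_text.encode())
        h.update(" ".join(flags).encode())
        return h.hexdigest()

    def get_or_build(self, source_text, flags, compile_fn):
        k = self.key(source_text, flags)
        if k not in self.store:               # miss: compile once, then everyone reuses it
            self.store[k] = compile_fn(source_text, flags)
        return self.store[k]

cache = CompileCache()
obj = cache.get_or_build("int main(){}", ["-O2"], lambda s, f: b"fake-object-code")
print(len(obj))
```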
It's its size that makes it difficult to move. Some major ecosystem pieces are designed around the svn infrastructure. When the will to make a change arrived, it seemed natural to migrate not just to a different VCS but to a different host. And this seemed to spawn a new debate: monorepo vs multi-repo. [still open AFAIK]
At the recent 2016 US Dev Conf, there was a consensus to move to git and that the new host would be github.
Really subjective IMO part: In general, there are tons of really smart folks working on really awesome stuff in LLVM+clang+etc. There's a handful of folks also focusing on the general "plumbing" software within and among those projects. The meta-plumbing job of the dev infrastructure is "kinda interesting" to several folks who want to improve the way the project is developed. But "kinda interesting" doesn't pay the bills, so it's a second (or nth) responsibility for the folks volunteering to work on it. Add to that the "no good deed goes unpunished" rule (they'll get the responsibility and blame after making a sweeping change), and any such change will require extreme patience and caution.
Yes! Such work is all done on a volunteer basis, when we find time for it on top of the "real work" (bug fixes, features, ...). Infrastructure work isn't always rewarding; mostly you just get the upset people yelling at you :)
For a project like LLVM, it just doesn't matter too much. git-svn or plain svn works pretty well for most people.
Certainly it matters, and it'll move eventually, but I'd rather see time spent on better testing tools than on a "better" VCS.
When I moved GCC from CVS to SVN, it made life a bit easier, but it wasn't a revolutionary change.
Which is funny, considering how often people argue about VCS systems.
Before we moved to Git, Roslyn was on TFS, which was basically Perforce/CVS/SVN.
You're absolutely right that the distinction among the former VCSs is minimal. However, Git offers value that was transformative compared to them. Namely:
1. Git allows you to easily switch between multiple work items while keeping track of the work done in each item.
2. Git allows you to easily integrate with people who have significantly diverged clones of your tree without too much trouble.
3. Git allows you to easily work offline.
(1) is definitely the largest benefit, but was mitigated with tools like g5 when I was at Google. However, the Google gravity well has its own drawbacks.
(2) is very important if you want to host rapid release schedules with divergence of features. It's especially useful if you want to have long stabilization periods and low-risk bug fixes delivered in parallel to multiple customers.
(3) is pretty self-explanatory, but most people underestimate how much downtime their VCS has. I'd bet that, for most people, it's significantly less than five nines of availability. Not only is that wasted time, it's frustrating because it's usually consecutive and removes entire working days at random.
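To put rough numbers on the availability claim (plain arithmetic, nothing VCS-specific):

```python
# Downtime per year implied by each availability level ("number of nines").
minutes_per_year = 365 * 24 * 60

for nines in range(2, 6):
    availability = 1 - 10 ** -nines
    downtime = minutes_per_year * (1 - availability)
    print(f"{nines} nines ({availability:.3%} up): ~{downtime:.0f} minutes of downtime/year")
```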
I take it you haven't actually used the tool that was mentioned in the comment you replied to, namely git-svn? My use of svn to interface with projects using Subversion has essentially entirely been replaced by git-svn, and I can say it is essentially impossible for someone who has used it to not realize that offline work, at least, now works like git. Taking a step back: at some point what you run on the server is just a storage format; unless you used some of the more advanced Subversion features (at which point you might actually like using it), it generally maps pretty directly to git semantics, and essentially all other functionality differences are mere porcelain.
Not a big surprise, considering that SVN is "CVS done right".
Compare that to Git's author: "Subversion has been the most pointless project ever started. There is no way to do cvs right."
Git and Hg (+ the many tools that surround them: GitHub, Bitbucket, Gerrit, GitLab, etc.) have a model that makes community contribution far easier than CVS and SVN.
The community contribution concepts in git are great, but it is confusing to then mention GitHub: their modus operandi is to provide tooling that makes things easier which are only hard if you insist on misusing git as if it were Subversion, for example by having a single centralized repository with multiple committers, requiring complex and annoying access control and public key management. If someone had built tooling like GitHub around Subversion, and then encouraged use of svk (note the "k"; this was a replacement client for Subversion that supported offline operation and had better merging support, but which worked with any svn server), things would have felt much more reasonable before. The irony is that if you follow the actual git workflow used by Linus for Linux (where everyone has their own repository, rather than at best their own branch and at worst trying to share master), you shouldn't even need any of that for git :/
Yes, and the centralized version comes from a tree that, in the Linux workflow, only one person can modify. You submit patches via email or pull requests (literal ones, to pull from a repository); you don't share commit bits on a centralized repository.
Until semi-recently, Git[1] wouldn't let you do a shallow checkout and still do useful things. For a large project, for most purposes, downloading all of history is pointless and immensely wasteful. SVN handles that just fine, and people who want git locally can use git-svn.
(edit: LLVM is surprisingly small, actually - a git clone comes in at just under 900MB. For more painful examples, though, see repos that commit(ted) binaries, or the scale of Android's repos)
[1]: AFAIK Mercurial still has no built-in support, though extensions exist. Which is probably the right choice for Mercurial.
>LLVM is surprisingly small, actually - a git clone comes in at just under 900MB
That's a little bit on the small side, but it's still very manageable.
For comparison, Linux's .git folder comes in at 1.3GB on my computer, and LibreOffice's repo, which has git history going back to the year 2000, weighs some 3.6 GB.
I can happily say that I haven't had any performance or space problems dealing with either of those full repos, even on my fairly weak laptop.