I love to see people reimplementing existing tools on their own, because I find that to be a great way to learn more about those tools. I started on a Git implementation in Rust as well, though I haven't worked on it in a while: https://github.com/avik-das/gitters
Just reading source code is a whole different experience than actually implementing something yourself. Implementing something pretty much forces you to "be right" in your understanding, while reading can be anything from "really studying" to skimming.
Completely agree, though sometimes you HAVE to go through the existing source when you get something wrong. I've implemented libraries based on specs and white papers, and even then there's vagueness, and holes that only show up in practice.
Love to see things like this all the same, as they tend to solidify protocols/specifications. That can, of course, cut both ways.
I haven't even looked at the git source code. My implementation was based mostly on the official docs. The docs include a section on the internals: https://git-scm.com/book/en/v1/Git-Internals
I haven’t read it yet, but I’m lucky to call the author a friend. I read all of his tweets about it while he was building it. I expect it to be of extremely high quality. I’ve got a lot of respect for his abilities.
The only reason I haven’t read it yet is that I think it deserves a lot of attention and I haven’t made time yet.
Still undecided as to which language to work in. I wanted to use Rust, but the one part I'm unsure about is where tree-like data structures will need to be implemented.
I guess that's tough to do in Rust? I'll give it a try.
I personally like to work on one thing at a time. If you're really trying to learn about how Git works, you should, in my opinion, use a language you already know well. That way, you're focused on learning one thing, not two things at once. So I would recommend using the language that you know best.
Not everyone agrees with me, of course.
As I said, I haven't read the book yet, so I can't tell you how easy/difficult it will be in Rust. The author is learning Rust now, incidentally. But if you need a graph (which you probably will), I'd advise you to use something like https://crates.io/crates/petgraph instead of building one yourself. The difficulty is in writing graphs, not using them.
Graphs are harder if you try to use pointers to identify neighbors. In Rust, a pointer is not merely an index into memory; it is a statement and guarantee about how that memory will be accessed. These semantics are more than a graph structure needs.
To "get around" this (but see the next paragraph), we usually associate each node to a simple identifier such as an integer (usize), and store all nodes into a Vec. Neighbors are indexed by this identifier rather than a pointer. (Indirection? Maybe. But I'm pretty sure most processors support a base+offset mode just as easily as a direct reference mode.)
Honestly, I think this is good practice even outside of Rust. If you're using pointers, and you want to associate some extra data to a node that isn't part of its graph structure -- a label, say, or some other structure that you're modeling the relationships of -- there's no good way to add that data without modifying the definition of a node. A pointer indexes a _single_ point in memory. An abstract identifier may index any number of points in memory -- just create another table containing the new information you want to associate.
> To "get around" this (but see the next paragraph), we usually associate each node to a simple identifier such as an integer (usize), and store all nodes into a Vec. [...]
> Honestly, I think this is good practice even outside of Rust.
It is, and this way of structuring code is used a lot in game dev, where it's called the Entity Component System pattern, or ECS for short. ECS extends a bit beyond what you described above, but fundamentally your description falls in line with it.
I am indeed alluding to the principles of ECS! I didn’t want to beat readers over the head with what might be perceived as “a whole new way to architect your software!!1!”, but this is the next step in that direction.
There are benefits, but they come with downsides: node ids can point to deleted nodes, and collecting unused nodes is up to you. If you want to avoid fragmentation, you need to recycle nodes, and possibly node ids as well. These are things a garbage collector handles for you.
It's not all that different from a database, though.
> It's not all that different from a database, though.
Indeed! I'm tempted to coin a variant of Greenspun's Tenth Rule: any sufficiently complicated program contains an ad-hoc, informally specified, bug-ridden, incomplete implementation of a database engine.
> There are benefits but they come with downsides
Downsides relative to what? The downsides listed are all in common with traditional pointer-based references, so I would argue that garbage collection is rather orthogonal to the question of storing indices instead of pointers. Any allocation comes out of a memory arena of some kind, be it an explicit vector of slots or the implicitly-defined standard heap. The tools for solving these problems are the same in all cases.
Certainly, Rust's references avoid all of the problems you listed. Rust pointers essentially embed the semantics of a garbage collector at compile-time [0], in the domain where ownership patterns can be strictly verified. But nodes within a graph are already a poor fit for the ownership model -- the entity that "owns" a node is really the graph itself, not its neighbors -- so that's the level of granularity at which Rust's references are useful. You need something else within the scope of the graph.
EDIT: On reflection, you might be referring to a language like Java which has an ambient global garbage collector. Indeed, using indices instead of pointers means you're on your own -- you've allocated the memory arena through the standard means, but then you take on the responsibility of managing that memory yourself. This is a fair criticism! Purely in my experience, data modeled as a loose graph of directly-related objects is a lot more difficult to understand and maintain than data modeled indirectly using some form of identifier -- mostly because of the effects I mentioned in my earlier post on associating new information to an entity.
Yes, that's what I meant. Cleaning up unreferenced nodes (should you want to do that) would require some kind of mini-garbage collection algorithm. And indeed, git has a gc command to do this, even though reference counting would work for a DAG. It's doable, but it's not what I'd call simple.
But if you know through some other means the exact time when a node should be deleted, you can delete it at that time, and anyone following a soft reference will find that it's no longer there, which may be a way of catching a bug. This is how both databases and entity component systems work. But it does mean that resolving a reference can fail, and you have to handle that somehow.
That's absolutely correct. And your emphasis on this is on point, since DAGs are relatively easy to implement in Rust: you just need reference counting (instead of the plain `Box` you'd use for a tree), whereas general-purpose graphs are harder to implement (and can lead to run-time bugs if node deletions aren't managed properly).
It's pretty good, I'd recommend it. I haven't finished the book yet, but what I've read thus far has been good.
Something I wish I'd known (although it wouldn't have changed my decision to purchase) is that it's not an exploration-style book (i.e. "Let's cat this file and find out what it contains and why"); it's more of an explanation (i.e. "When I cat this file, it outputs XYZ, which means ABC, which I know from my research of the git source"). So the author isn't taking you along on their research, but rather coming back to you after the research is done to explain their findings from the ground up.
This means early chapters have a lot of, "You'll just have to trust me XYZ means ABC." But this is also understandable given the complexity of git; there isn't really a square one.
I also would have preferred the author use something like Python instead of Ruby for the reference implementation. IMO Python is a little more ubiquitous and easier to install/set up than Ruby. Ruby also leaves Windows devs at a disadvantage. But that's just me being pedantic.
>It's so much fun, but not that practical for scalable websites.
A Git-based KV store has a somewhat different purpose than regular KV storage. It's intended for communication between entities running in parallel, a sort of transactional memory.
It's not intended for storing users' data.
Not sure, but the idea is that you could not only read and write, but write in parallel, with keys merged according to the merge rule you've provided.
Strongly recommend adopting a standard FOSS license before lots of people add commits and it becomes a big mess to clear up the licensing situation later.
Also, not having a license file isn't a messy situation; it means "this project is protected under Berne Convention copyright": the author is the sole holder of all rights to the code, and any use that is not explicitly allowed is a copyright infringement (unless it's fair use).
The author doesn't hold copyright on code other people submitted unless they explicitly give ownership via a CLA or similar. A licence would make it clearer, or at least a lot easier for other people to consume.
Exactly, but since it's a learning project, it's not obvious it's supposed to receive and accept code contributions anyway (so far it hasn't received any; the only two commits are fixes to typos in the README).
That being said, it would be nice of the author to put the code under a permissive license to allow other people to play with his code too (at the moment, even forking it is a copyright infringement…).
Yes, you're right, GitHub forking is allowed. Pulling the code onto your computer, changing it, and contributing changes to your fork is still illegal, though.
"Any User-Generated Content you post publicly, including issues, comments, and contributions to other Users' repositories, may be viewed by others. By setting your repositories to be viewed publicly, you agree to allow others to view and 'fork' your repositories (this means that others may make their own copies of Content from your repositories in repositories they control)."
(Crucially, it doesn't require an open-source license, though.)
Second: even without that, there's such a thing as an implied license:
Pretty much all the TOS says is there's an implicit reproduction license (other users can see & fork the work) and possibly broadcast (the fork itself has the visibility of the original). Not adaptation, not use, not exploitation, …
And that license grant is solely through github as a service, it's unclear that a local clone is even permitted.
> And that license grant is solely through github as a service, it's unclear that a local clone is even permitted.
That license grant was added specifically to keep GitHub itself legally watertight (AIUI), so it makes sense that it doesn't extend to users' rights. Look, but don't touch.
That's... somewhat true. My main objection is to "the author is the only one holding every rights on the code and every use that is not explicitly allowed is a copyright infringement (unless it's fair use)".
The ToS doesn't say there's an implicit reproduction license, though; it says there's an explicit reproduction license.
The other licenses can still be argued to be implicit. For instance, you have a decent argument that local clones are an implicit license – GitHub provides a "Clone or download" button directly on the repo page, and it's one of the main use cases of GitHub. (Other arguments exist.)
Thanks for the clarification on the "implicit license" point; I glossed over that a bit quickly. I should definitely have said "every use that is not explicitly or implicitly allowed is a copyright infringement".
Your first point doesn't really add much, though, since it falls under the "explicitly allowed" part of my comment.
Overall, my whole point still stands: if anyone went on GitHub, downloaded the project, and did anything with it that went beyond fair use, that would be a copyright infringement, because neither the author nor GitHub granted them any permission to do so.
It isn't called that and doesn't offer the same amount of protection everywhere, but the Berne Convention itself includes copyright exceptions [1] that I lumped under the broad "fair use" phrase.
Not the OP, but it seems your linked bit of the Berne Convention just says that countries are allowed to legislate exceptions. The only hardcoded exception is the short-citation one.
I guess it depends on what you call "use". You can read the code, for sure, and probably even save it to your disk (it may depend on your jurisdiction, though).
But can you compile it? I'm not sure… better ask your lawyer. And what about running the compiled binary? I don't think you're allowed to do that.
That's how copyleft licenses work. Base copyright law is more restrictive and possessing an unlicensed copy can be infringing in certain countries. This is why public domain is sometimes problematic.
> No, you copied on your hard drive something that was offered to you for free on github.
It wasn't offered for free local reproduction, since that right was not explicitly granted, and GitHub's license grant doesn't grant it either (as far as my reading goes). Though the country you're in may have a private-copy exception, in which case I think you'd be in the clear (depending on the specifics of that exception).
That's not how copyright law works. The particulars vary by country, but the share-everything internet culture doesn't automatically grant permission to make copies.
Since you sound knowledgeable: if I go and, without any contract, "donate" some of my code to the repo, what becomes of the rights to that code (the patch I contributed)?
You retain the copyright of the patch and can re-use it somewhere else under a different licence.
And in turn, the project you submitted it to cannot re-license that patched section of code (e.g. make it GPL-licensed) without your permission, as it does not belong to them.
One issue is under the Berne baseline, even given github's license grant[0], there is no license to make adaptation, arrangement or derivative work. So it's unclear that the patch would even be legal to start with, in the sense that it's either a modification / adaptation of the work or a derivative of it.
Yes, the patch would be illegal, but the original author still wouldn't have all the rights to it. And it could also be illegal if the original author distributed the amended code elsewhere.
Edit: I edited my comment to say "could" instead of "would" because the original author could argue that the author of the patch implicitly gave him the right to redistribute the patch by contributing it to a public repository. I'm not sure it would stand in court, but I'd say it would have a non-zero chance of success.
But if the author, who initially claimed the project was a learning project, decided to use it commercially, he clearly wouldn't be allowed to use the patch (and again, it could be different if the patch author willingly contributed to a commercial product).
Git derives patches from files on disk, so a patch inherits the file's license. Logically you commit files, or even the whole source tree; a patch is an optimization detail used to rebuild the source tree at a given point, so it's a matter of what license the source tree has at that point.
This isn't really relevant to the legal status of a change (i.e. patch) to a proprietary work where the source code is public.
The answer is that making the change is already usually copyright infringement (though I think some countries have a concept of private copies being exempt from these types of restrictions). But redistribution of your patch would definitely be copyright infringement because a license to create a derived work was not given to you -- and patches are by definition derived works.
Actually, any high-performance GC'd language would be fine too, because latency is a non-issue for long-running git operations (you won't notice if your git clone pauses for 100ms, whereas you will notice if your UI does). Throughput of malloc() and GC'd languages tends to be similar when latency isn't a concern.
Performance for higher-level languages is usually great insofar as you're able to essentially write C code in that higher-level language. When the language's limitations keep you from writing the C code you want to write, performance usually suffers. In Java's case, the lack of value types and stack allocation can be a major performance hindrance. Boxing is also a problem, although, as the mailing list post notes, it's easily overcome via manual specialization.
I was speaking more to stability. Rust is designed to be an incredibly safe language without sacrificing any performance; that seems like a good match for a version-control system.
IIRC Rust's safety is provided by affine types; all languages with affine or linear types can provide the same guarantees. Clean and Mercury both come to mind off the top of my head (IIRC Clean had "Concurrent" in its name at one point), and I think there are both Haskell and F# variants with either affine or linear types.
In addition there are many other solutions to safe parallelism and/or concurrency, some of which don't require a type system at all; Erlang is famous for safe concurrency and is dynamically typed.
Lastly, there's good old fashioned multiprocessing which can be safe just by not sharing memory.
There is no single feature that is new in Rust, but it has a relatively unique set of features in the non-GC language world; ATS is the only other one coming to mind, though I'm sure there are some niche ones.
I love this combination in Rust because latency-sensitive operations are notoriously hard to achieve in GC'd languages. Lisp was able to be an operating system because nobody needed to run Quake at 100fps on a Lisp machine. With GC you can pick latency or throughput, but you can't reliably get both without coding around the GC.
This does mean that, when I consider things Rust is particularly good at, latency-sensitive applications stand out; that's not to say it's bad at non-latency-sensitive applications, just that one has a lot more choices when latency is a non-issue.
To be honest, I wish there were a movement for something like a "literate changelog", where the changelog was properly linked with something like a blog post and with specific repo versions. I guess pull requests sort of take that role… but sometimes PR diffs aren't the same as annotated code in a blog post.
Can somebody tell me what's behind the recent rewrite-all-the-things-in-Rust craze? I get that it can have some benefits in terms of security, but rewriting so many things just for the sake of it seems a bit excessive.
I understand some of these are very likely for educational purposes (like this one and others; it's good for getting more familiar with the language), but it still seems to be a bit of a strange trend (especially since people who don't need to learn are doing it, seemingly just because "yay rust").
I support RIIR 100%. It's much easier for me to contribute to Rust projects. It's not just the language (though that's definitely part of it). I don't need to worry about shit like "how do I build this project" or "my code might run on platforms x and y, but I'm not sure about platform z". With C/C++, the build process can be very complicated: missing headers, missing binaries. With Rust, I just run `cargo build`.
Furthermore, the resulting code just feels sturdy. I can also expose it through a C interface so it can be used from the likes of Python.
Yeah, I just spent a couple of hours configuring CMake when I was porting a Windows app to Linux, and it was a pain tracking down all of the dependencies. I had a similar app in Rust that worked with a `cargo build` out of the box (actually, my friend rewrote my Rust version in C++ because I was lax in updating it).
Say what you want about Rust vs. C/C++, but you can't tell me that Rust's build process isn't easy. In fact, building for other platforms is pretty trivial: just `rustup target add <target>` and you can target pretty much any common platform and many uncommon ones, and those targets get updated with everything else. It's so nice that I have to convince myself not to use it as the way to distribute CLI apps (`cargo install <tool>`).
Actually, there are a lot of great reasons to rewrite everything in every language. Git is an especially good piece of software to implement everywhere because it's relatively stable and it's pretty useful.
As for actual reasons, one good example is so you can keep your dependencies in the language, using the language package manager. For Go nobody even questions that this is worth it; it enables painless cross compiling and completely static, libc-free binaries. For Rust that may not be a thing, but you do at least get the benefits that you could integrate Git functionality without having to hack around in porcelain.
This one here is a learning experience by its own description, but I would suggest people stop complaining about "rewriting everything" in $LANGUAGE. The opposite complaint is often cited as a reason not to use a language (that, for example, basic programs haven't already been ported). If we did build an alternate world with feature parity, unit testing, and optimizations in a memory-safe language, I doubt many people would still be complaining about the strange trend of rewriting things.
That said, even when using those bindings, I've had to drop down to wrapping the porcelain sometimes. I'd love to have a native rust git implementation. It'd be easier to hack on, and to abuse for the type of git interactions I've written.
And it'd be one less external dependency to worry about when cross-compiling. I love how easy it is to install things from source in Rust, and it's pretty easy to add a flag to make the program use your CPU's special instruction set to make it even faster.
That's a good point; you're right that it's helpful to be able to interface with such a ubiquitous program natively in a language of choice. I did see a mention on the page of deploying as a crate and using in another program; that seems very convenient.
> stop complaining
Not a complaint; more a question as to why there's a specific move around rust. I appreciate the reply; that's exactly the kind of response I was looking for.
In addition to what has already been said, I think there's a lot of interest in reimplementing command-line tools in Rust because there are a lot of useful support libraries for TUI development. Because of the lower barrier to entry (no fussing with build/linker issues to use convenience libraries), you're seeing an increase in people trying things just to try them.
For me, dealing with something written in Rust is less painful than dealing with something with bindings for Rust, and honestly I'd consider rewriting something for that reason. And that's the case for nearly any language.
For example, I rewrote `tar` in JavaScript because I wanted to use it in the browser, and I didn't want to fiddle with trying to compile the existing project to JS. It took me a weekend and it worked pretty well for the project I needed it for. That project has since died (completely redesigned), but the tar stuff still works.
These days I'm getting lazy, so for something like git, I'll often exec out to the CLI app instead of fiddling with bindings, especially if it's not a performance-critical part of my app. However, I'd definitely look for a rewrite first and clean bindings second to use as a lib, especially if the rewrite had a suitable license (anything not copyleft).
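For what it's worth, a minimal sketch of that exec-out approach (assuming `git` is on the PATH and the process runs inside a repository):

```rust
use std::process::Command;

// Shell out to the git CLI instead of linking against bindings.
fn current_branch() -> Option<String> {
    let output = Command::new("git")
        .args(["rev-parse", "--abbrev-ref", "HEAD"])
        .output()
        .ok()?; // git not installed, or couldn't be spawned

    if output.status.success() {
        Some(String::from_utf8_lossy(&output.stdout).trim().to_string())
    } else {
        None // e.g. not inside a git repository
    }
}

fn main() {
    match current_branch() {
        Some(branch) => println!("on branch {}", branch),
        None => println!("not in a git repository"),
    }
}
```

The obvious trade-offs: you pay a process spawn per call and have to parse text output, but you inherit the CLI's battle-tested behavior for free.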
I would like to see some sources like this that are language agnostic that give you the tools needed to implement your own popular tool. For example, where could I look to find a written description of the way git works from the ground up? Kind of like a "guide to implementing X" type of thing, but without code.
Rust is pretty darn portable as it is, though. It runs on all the major platforms. Are you thinking it's assembly or something? Rust has to run everywhere Firefox is compiled for, which is a lot of platforms.
Maybe it won't run on your toaster, but if you're making git commits from your toaster, you've got other issues.
I agree with the sentiment behind your comment. Rust is very portable, but it isn't portable to all platforms. Obviously it works on the big ones like Windows, macOS, most variants of Linux, BSD, etc., but it doesn't on Alpine Linux. IIRC there was an issue compiling it without glibc (which Alpine lacks).
Edit - apparently rustc can now be linked to musl instead of glibc in nightly. Cool!
Compiling most Rust programs with musl is fine, and available on stable. But compiling the Rust compiler itself with musl had some issues; these were worked out very recently, so it hasn't totally ridden the release trains to stable yet. https://github.com/rust-lang/rust/issues/59302
Eh, why not? Seems like a chicken-and-egg thing. If there's no Rust support for some platform, fewer people will want to write things in it because they won't be able to run them where they want. But the less Rust software there is, the less interest anyone would have in writing build tools for less popular platforms.
Break out of these patterns by writing useful software in it that people would like to have on less popular platforms, so more people feel motivated to build and maintain build tools for it.
If you really needed to run Rust code on some platform LLVM doesn’t natively support, it does allow you to compile to C. Presumably you could then compile that using whatever C compiler works for your platform.
Although Rust is not as portable as C, going through these hoops would mean that, modulo codegen bugs, the generated C code should still be as memory-safe as the original Rust code.
You can run Rust code on microcontrollers where 32 MB of RAM is far beyond the amount available. You won't be able to compile on that platform, but you can certainly target it (though git might not fit).