A Memory Safe Implementation of the Network Time Protocol (memorysafety.org)
101 points by SGran on Oct 11, 2022 | 78 comments


I'm really looking forward to the client implementation. I worked on a Rust NTP server at a previous job. It was genuinely faster than ntpd or chrony, which is a meaningful benefit when you're talking about something that sends out data about clocks and time.

Unfortunately the server is relatively easy to build. The client, however, is where a LOT of the intelligence and difficulty lies.


Assuming a pure network client, there's not that much to build if you make simplifying assumptions, and build off the work of others.

You need something to get a list of servers, and maybe update the list over time (if you're using a DNS name).

For each server, you need to poll to accumulate a list of delays and offsets. If you want to make the delay precise, you can try to get the NIC to timestamp the NTP packets. Once you've got sufficient samples from a server, you can use a regression to approximate the time offset and frequency offset for that server.
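The regression step above can be sketched as a plain least-squares fit. This is a hypothetical helper (the name and the sample shape are my assumptions, not from any real implementation):

```python
# Hypothetical helper: given (local_time, measured_offset) samples for one
# server, an ordinary least-squares fit estimates the frequency error
# (slope) and the offset at the mean sample time (intercept).
def fit_clock(samples):
    """samples: list of (t, offset) pairs, both in seconds."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_o = sum(o for _, o in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    cov = sum((t - mean_t) * (o - mean_o) for t, o in samples)
    freq_error = cov / var_t  # seconds of error per second (x 1e6 for ppm)
    return mean_o, freq_error  # (offset at mean sample time, frequency error)
```

For example, samples whose offset grows by 100 microseconds every 10 seconds fit to a frequency error of 10 ppm.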

If you've got more than one server, you can choose one to follow (à la the ntp reference implementation) or merge the data from some selection. This is kind of tricky, but there are many examples to choose from. Some implementations use metrics from multiple servers to try to estimate how much of the delay is asymmetric and use that to enhance the clock precision; but the reference implementation didn't do that and most people didn't mind.

Once you've figured out which time and frequency offsets to apply, send them to the OS via settimeofday and ntp_adjtime. This is a bit trickier if you don't want to step the clock, but you can calculate a frequency offset to slew the clock in the direction you want over a time you find acceptable. Reference ntp drops old measurements after an adjustment, but you can also scale them by your adjustments and keep using them.
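The slew math above is simple arithmetic. A hedged sketch (the function name and the 500 ppm clamp are assumptions; 500 ppm is the usual kernel limit, and ntp_adjtime expects the result in its own scaled-ppm units):

```python
# Turn a measured offset into a frequency adjustment that slews it away
# over a chosen window, clamped to the typical +/-500 ppm kernel limit.
def slew_ppm(offset_s, window_s, max_ppm=500.0):
    ppm = offset_s / window_s * 1e6
    return max(-max_ppm, min(max_ppm, ppm))
```

Slewing away a 120 ms offset over an hour needs about 33 ppm; anything the clamp rejects has to be corrected over a longer window (or stepped).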

You might want some feedback loop to adjust your polling rates, but it works either way.

None of this code needs to be particularly fast either. Ideally, very little in between timestamp and send, and receive and timestamp (NIC timestamping helps tremendously here, if that's possible); and you also want to have a minimum of delay during offset operations too. But the protocol parsing, and offset calculations can take as long as you like.
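The per-exchange offset calculation mentioned above is the standard NTP four-timestamp formula (the function name here is mine):

```python
# Standard NTP on-wire calculation:
# t1 = client transmit, t2 = server receive, t3 = server transmit,
# t4 = client receive, all in seconds.
def ntp_offset_delay(t1, t2, t3, t4):
    offset = ((t2 - t1) + (t3 - t4)) / 2  # how far the client clock is behind
    delay = (t4 - t1) - (t3 - t2)         # round-trip network delay
    return offset, delay
```

The offset is exact only when the path delay is symmetric, which is why the asymmetry estimation mentioned earlier can improve precision.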

I've got a one-shot Erlang ntp client in 108 lines of Erlang, and 50 lines of nif (which is mostly unpacking the arguments) for an unpublished hobby os. It's not perfect, but it does pretty ok. If it ran continuously, it would probably meet or beat reference ntpd. Reference ntpd does a whole lot more cool stuff, of course.


The tricky part isn't the protocol, it's all of the interfaces with weirdo hardware clocks. Even just parsing GPGGA messages from a serial port can be tricky when you're trying to keep the timing tight.


There is immense value in replacing simple NTP deployments, which don't interface with weirdo hardware clocks, with a memory-safe alternative; those simple deployments dwarf the weird ones. It is fine (good, even) for there to be multiple viable implementations of NTP, fit for different purposes.


This work is the NTP client, with server promised later. https://github.com/memorysafety/ntpd-rs


Did your work get published anywhere on what you did to outperform chrony? By any chance was it clockwork.io?


> Another benefit of Rust is that we can use its standard library and package ecosystem, so our NTP implementation is much smaller (hence easier to validate) than the alternatives

It might be easier to validate the code in their repo, but I feel like they are ignoring the effort that would be needed to validate all of the very large number of dependencies.


What is the most fascinating thing you learned when you read ntpd's configure script? What is the most interesting thing you learned reading glibc?

Were you at all concerned when you discovered that the sources come from HTTP-only servers and only have unsigned MD5s for checksums?

Did you find the support for HP-UX distracting?


If their major concern is memory unsafety, it's a lot easier. Most dependencies don't use any unsafe code; instead, there are usually just a few libraries pulled in across them that do. One of the best parts of auditing Rust (for memory unsafety) is that you can just "grep for unsafe" and know exactly where to start.


Most of the dependencies I see listed in this project are upstanding, household-name crates. Personally I'd feel more confident using those (which have many other eyes on them) than maintaining custom in-house implementations of complex (but standard) building blocks.


There is a bunch of well-funded work to tackle validating various aspects of Rust, the standard library, and the ecosystem. For example, RustBelt, led by Derek Dreyer: https://plv.mpi-sws.org/rustbelt/


I don't see the point. It's like creating yet another NTP implementation while other well-known implementations are known to be working well and safely on billions of devices.

It is easier to report a security issue to ntpd or chrony instead of creating a new one.


The overwhelming majority of NTP deployments (by device count) don't benefit from any of the complexity or flexibility of chrony and ntpd, but do suffer from the memory-unsafety of those programs, and pass that unsafety on to the rest of us. The case for a memory-safe 80%-use-case NTP server is very strong.


> the memory-unsafety of those programs

How do you know they are unsafe? Have they been audited and memory un-safety been found? I don't understand why the Rust community automatically assumes there are always memory safety issues with C, for example. I get that it's possible to write unsafe programs in C, but not all programs are automatically unsafe just because they are written in C.

I need to do more research, because I don't actually trust that just writing a program in Rust will result in total memory safety, so these are actually questions I want answers to, not just me trying to attack Rust or anything. Thanks!


It's not just the Rust community. I don't especially like Rust, but I fully buy into the argument that code written in memory-unsafe languages is materially less safe than code written in memory-safe ones. There are plenty of memory-safe options, and rewriting software to be memory safe --- especially when there's a clear, simple common case to seize on --- is a positive step for Internet safety.


I totally get that, but my stance on the matter is that some of the software we're talking about is just as memory safe as a new Rust rewrite because it doesn't do anything unsafe, but the rewrite could introduce other bugs and differences that could break things.

I would say I don't stand on the side of "rewrite nothing", but I'm more of a realist here, in that we absolutely cannot "rewrite everything" perfectly in a memory safe language, and we should first determine if a particular tool should be rewritten in a memory safe language by doing some analysis and testing on that tool.

Certainly, even though I know no Rust and am not an expert in memory safety, I would say that in the future we should try not to write totally new software in memory unsafe languages, but I'm not everyone so I can't make that rule and ensure it sticks.


Your stance, that audited C code is "just as memory safe" as new Rust, is wildly outside of the mainstream of software security. You're entitled to your opinions, but you're unlikely to find many qualified people to dive into a debate as unproductive as that. Plenty of carefully audited C code has later been found to have terrible flaws, and likely will again in the future.


Well then I guess "mainstream of software security" needs to do a better marketing job to explain to dumb programmers like me why I should be using memory safe programming languages!

Seriously, I would love to see a succinct explanation for the "rewrite everything" philosophy, but it seems religiously dogmatic to me so far and has done the opposite of convince me to use Rust (or another memory-safe language).

On the other hand, the "statically typed language" community has done a great job of convincing me that I should be using statically typed programming languages by showing many examples of where typing would help me in my day to day work, and now I like using Go a lot and I avoid tons of issues I had with Python in the past.


No, they don't. Not in this case, at least. They can just build better software, and it will get adopted. You're writing Go already, so I'm not sure why anyone would want to burn time in pointless debates about this stuff. Carry on! You're already doing it right.


I wanted to point you at this, which is example zero for why rewriting in "memory safe language number X" isn't always the best path: https://news.ycombinator.com/item?id=33171028

I'm definitely not debating that using better languages is better, but what I am saying is that some tools written in C have been effectively tested in the real-world by being on billions of machines and being used millions of times per day. I am not totally sure, but I think if there were major issues with the current NTP implementations we would probably have found them by now? Maybe not! But, in any case, rewrites need to be more carefully considered, planned, and executed than just some people I don't know writing a new NTP in Rust and stating it's fine to use for 80% of cases.

Thanks for having a nice discussion with me, I think I am a bit more convinced that we need to rewrite some stuff, but perhaps also more convinced that we need to do a better job of picking what to rewrite and why!


Yes, I understand the point you're making that sufficiently audited C code is as safe as Rust. No, it isn't. Where that sufficiently-audited C code is feasible to replace, it will all be replaced. Browsers may take a decade or two, but C/C++-language NTP servers had better head for the bunkers and hope for some global cataclysm to halt all progress in the industry.


> the point you're making that sufficiently audited C code is as safe as Rust

The point I am actually trying to make is that memory safety is not the only consideration for security and safety. If you make a memory safe NTP and there turns out to be bad logic inside that code that does something allowing a security hole, then that's just as bad, if not worse, than using the decades-old software that did not have that bad logic, because that software has been tested in the real world day in day out for DECADES.

I trust (somewhat) that Rust isn't going to allow unsafe memory access. I don't trust that Rust programmers can write code with no bugs that's better than a tool that has been refined over generations of programmers. Memory safe programming is a tech solution to a human problem (human imperfection).

Again, new projects should be written in a memory safe language, but that's a whole other topic, IMO.


The claim isn't that memory-safe languages foreclose on all security vulnerabilities; only the worst of them. As someone who has at various times had the job of carefully reviewing large C codebases for vulnerabilities, I'll attest: whatever the rest of the vulnerabilities may be, you spend most of your time trying (and failing) to hunt down all the memory corruption problems. Your confidence in the decades of review C programs have had, in all but a very few cases, is probably misplaced.


I’m not even saying reviews, I’m saying that code has been tested in “the arena” and been proven billions of times.

I used to work in incoming quality inspection, I know human reviews are at best 85 percent accurate. I’m arguing that decades of hammering on those old as hell programs have done the reviews for you.


I've spent most of my career, which started around 1995, as a security researcher. I think you'd be surprised by how poor a job "the arena" does at sussing out complicated memory lifecycle problems in C and C++ code. Some of the patterns we now look for to find exploitable conditions aren't even all that old; much of that "arena time" was spent not even looking for those problems. Security teams at places like Google subject C/C++ code to fuzzing at boggling scales, and people still find memory safety vulnerabilities. Shit's hard. It's better just not to have this problem in the first place.

I'm not stamping my feet demanding you track down a Rust web browser. But the thread started with someone asking what the point of a Rust NTP daemon was. I think that's been amply answered now.


Ah and here it is. You're just very, very blatantly taking your biases from writing C and applying them to Rust. Do you just... Not believe every single blog post where people say "it was a bit harder to write, but it basically does what you want once you get it to compile"?

Are you just ignoring the NVME kernel driver dev who said literally the same thing about a quick naive implementation that is nearly as performant without optimization as the one that has been in tree and tuned for years?

This insistence on naysaying and defending a categorically indefensible language is baffling to me. How many major companies and products have to move to Rust and adopt it? Do y'all really convince yourselves that everyone is just on some hype train? Is this what it's like to not be able to inspect a technology and disambiguate hype from something real?


Incidentally, I just happened across the thread you linked to (I didn't bother to follow the link before; it just isn't germane to the point I'm here to make).

Importing drama from a random thread into a Show HN is pretty rude, and you shouldn't do it again.

Not to me! I'm happy with the discussion here, and will participate as long as you keep providing opportunities to point out (read: preen about) other problems with memory-unsafe software.

But it's super rude to the person who submitted their code to "Show HN". The rules on "Show HN" aren't the same as the rules for the rest of the site, because people are vulnerable when you're sharing new work. This particular person wasn't submitting their "rm with trash" program as part of an argument against memory-unsafe software. For all we know, they just like Rust, which is a legitimate reason to write in Rust. Further, approximately nobody in the industry is worried about whether "rm" is implemented in C or Rust. Your dig had nothing to do with "rmt" and everything to do with wanting to score points on this thread.

You might consider apologizing to the "rmt" person.


I have considered it and chosen not to apologize, but I’ll delete my comment because it was just scoring points in my view, I felt really smug there. However, that example is still a good one for what I’m trying to say here. Rewrites are way harder than people think.


Fair enough! I didn't realize you were in the deletion window, or I'd have written a less† sanctimonious comment.

† slightly


I deserved that so no worries.


The irony of you calling others dogmatic in this conversation is a cue for me to stop reading it, I guess.


What you're doing is just FUD bordering on some kind of weird concern trolling.

Gonna blow your mind when I tell you every single coreutil on my system is built with Rust and moreover nearly every program on my system was BUILT with that Rust-built coreutils.

Amazingly, emulating some existing program behavior in Rust is easier than writing safe C! Who would've guessed? (besides everyone else watching and betting on Rust for the past 8 years)



>How do you know they are unsafe? Have they been audited and memory un-safety been found?

Security vulnerabilities caused by memory safety errors can be an indication.

The number of CVEs doesn't necessarily indicate the number of errors in the code, or whether or not something is secure, since there are a lot of factors at play. A project with many CVEs could be a good sign since it means people are actively looking for and reporting issues for example. Alternatively a project with few CVEs might have a ton of hidden bugs.

It's not a perfect measure for sure, but it can at least prove that there are real world memory safety bugs that can have disastrous effects if left unfixed.

https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=ntpd

https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=chrony


For those who did not click the links: they list 11 CVEs for chrony and 68 for ntpd.


FWIW, from those 11 chrony CVEs:

- 8 were found within the project itself (mostly by me)

- none are memory-safety issues in the NTP-specific code

- the last memory-safety issue is from 2015 and it was in the custom management protocol (exploitable only by authenticated users), which was since then greatly simplified and made stateless

The project now has an excellent fuzzing coverage, it was audited, and I'm quite confident there are no remotely reachable memory-safety issues. I'll buy you a drink if you find one :).

NTP as a network protocol is extremely simple. There is no complex data, almost everything has a constant length. A minimal server+client implementation can be written in a few hundred lines of code. I wrote one in Rust, but the reason was server performance, not security.
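As an illustration of how simple the fixed-length wire format is, here is a hedged sketch parsing the 48-byte NTPv4 header (field layout per RFC 5905; the function name and the returned dict are my own choices):

```python
import struct

NTP_EPOCH_OFFSET = 2208988800  # seconds between the NTP (1900) and Unix (1970) epochs

def parse_ntp_packet(data):
    """Parse the fixed 48-byte NTPv4 header (extension fields ignored)."""
    if len(data) < 48:
        raise ValueError("short NTP packet")
    (lvm, stratum, poll, precision, root_delay, root_disp, ref_id,
     ref_ts, origin_ts, recv_ts, tx_ts) = struct.unpack("!BBbbII4sQQQQ", data[:48])
    return {
        "leap": lvm >> 6,
        "version": (lvm >> 3) & 7,
        "mode": lvm & 7,
        "stratum": stratum,
        # timestamps are 32.32 fixed-point seconds since 1900
        "transmit_time": tx_ts / 2**32 - NTP_EPOCH_OFFSET,
    }
```

Everything is at a fixed offset, so there is no length-driven parsing to get wrong, which is part of why the protocol side is the easy bit.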

Most of the complexity related to NTP is on the client side, in the processing of the measurements NTP provides. I don't think the language matters much here. However, if major operating systems start switching to services written in Rust, I don't see a reason why chrony couldn't be rewritten in Rust, first the small parts related to networking and later everything.


Thanks for actually showing me some concrete examples of my wrongness instead of just attacking my character like a few others here (not my main thread here, that was a great discussion).


This is so cringe, to have this conversation constantly crop up in Rust safety discussions when it's built on such an obviously falsifiable premise.

What C project is safe, exactly? You imply that the default is churning out safe C, so what projects with "competent" coders are producing vulnerability-free C code bases? I'll not be holding my breath, because every time, there's never an answer. My favorite person who spouts this off has multiple projects that have segfaulted from C memory-related bugs.

Maybe, like, validate the basics of Rust functionality instead of constantly casting doubt and aspersions on something while offering really weak defenses of what you just happen to be conditioned to.


I'm just trying to understand why I would want to use Rust. You Rust people constantly act like dickheads anytime anyone asks questions like I did. If you all are so goddamn sure that it's the best way, then how about not acting like a total asshole to me and point me to some things I can read that will convert me?

You folks are the reason no one likes Rust people and Rust adoption is lower than it should be; try being human for once.


With all due respect, I don't know what I could possibly point you to. Rust is just a better language than C or C++. It eliminates many classes of bugs that are pervasive in those two languages. It's a true 10x advancement over mainstream languages of the past. It also has been getting plenty of traction all across the industry, from Facebook and Amazon to small startups and open source projects like the Linux kernel.


The overwhelming majority of NTP deployments should probably not be running NTP in the first place.


Why do you think so? Once you need to sync with some external source occasionally anyway, why not just run ntp continuously and be actually in sync all the time? (or why would it be better not to do that?)


NTP is insecure, and it fundamentally solves a different problem than the "I need a precise to a second timestamp to validate certificates and update my RTC" need of the majority of devices.


> NTP is insecure

That's vague. What do you mean specifically? Hostile nodes joining the pools? Any issues with the protocol? Something else?


Yes, the base NTP protocol is unauthenticated UDP. So, that's pretty insecure.

Properly configured, with sufficient upstream time servers, etc... it's still pretty robust against DoS attacks and evil maid attacks, so you'll have to do some work to trick clients into following your fake NTP server. And it will be hard to hide what you're doing while you do it.

It took a while, but I think we've actually solved that security problem with NTS. Now we just have to get the vendors and the community to support and deploy NTS widely.


The blog says the example use is certificate expiry checking. I think this only needs ~1 minute precision. You could get by with a weekly cron job.


That assumes your clock is running at a reasonable rate, and that the clock speed doesn't vary, etc....

The more your clock is skewed, the more often you need to check to see how badly it's skewed and apply whatever corrections are necessary.
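As a back-of-the-envelope illustration of that trade-off (the helper and the numbers are hypothetical):

```python
# How long can a clock with a given drift rate free-run before it
# accumulates more than the allowed error?
def max_sync_interval(drift_ppm, tolerance_s):
    return tolerance_s / (drift_ppm / 1e6)  # seconds
```

A typical 50 ppm crystal takes roughly two weeks to accumulate a minute of error, so the weekly cron job suggested upthread is plausible; a badly skewed 500 ppm clock gets there in under a day and a half.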

That said, I'm a big fan of running SNTP as a client, where you don't need the full power of the NTP protocol. There's no sense running a full-blown NTP client on most systems.

I like the idea of memory safety, but that's just one category of potential security vulnerabilities. If Rust can solve for that and also give me other benefits like higher speed, then I'm all for it. But you have to give me a lot of good reasons, or a hell of a good single reason, if you want me to toss out everything and replace it with Rust. And IMO, memory safety alone is not a good enough reason for most use cases.


LE certs are normally cycled days ahead, so you need only +/-1 day precision for that specific purpose. But the question was: if you need to sync from time to time anyway, why is staying in sync using ntp a thing we wouldn't want?


Out of curiosity, would you ever support rewriting an existing memory-unsafe program in a memory-safe language?


Wellp, so much for the need for NTPsec.


[flagged]


> a rewrite for the sake of safety is not sufficient imo

I would disagree; I think a rewrite for the sake of safety, especially when we are talking about a piece of server software that deals with untrusted clients, is often worth it.

Certainly it isn't always: rewriting huge, complex pieces of software for the sake of anything (including safety) may just not be justifiable. Like, sure, maybe the Linux kernel would benefit from a rewrite in Rust, but I don't think that would be a good idea.

NTP is a fairly small protocol, and at least the server portion -- where I'd be the most worried about memory safety issues -- can be implemented in Rust without too much difficulty. As I understand it, NTP clients can be a bit more complicated. But I still expect they're much less complicated than many other types of clients.

> Are they going to maintain that forever and push Linux distro and the industry to move to that new solution?

Why not? That's how open source Linux software works. Someone builds it, and if there's enough interest, it gets adopted, and attracts new contributors and maintainers over time. If there isn't enough interest, it withers away.

That may not be your idea of the best use of your time (and I might agree), but who are we to tell others what to do with their time?


Please don't comment about the voting on comments. It never does any good, and it makes boring reading.

https://news.ycombinator.com/newsguidelines.html


They are justifying the rewrite by saying that it is memory safe. I don't think it's a bad goal, but if they don't do the work to maintain and package these for distros it won't make it very far.

However, I'm always surprised by how many people are fanatics of Rust, and they do contribute a lot to packaging, at least for Gentoo, which is my daily driver. I could see a lot of these taking off if it does show a reduction in attack surface and it works seamlessly as a replacement for existing solutions, hopefully not forcing a new config on everyone.


Doesn't Rust just panic and abort on runtime memory errors?

So, we maybe get memory security, but we seemingly do nothing for DoS. Isn't that a large attack vector for NTP, given its primary utility in other protocols?


Rust the language doesn’t know anything about heap allocation, so in a strict sense, no.

The standard library provides a number of APIs that abort on allocation failures, yes. There are currently nightly-only APIs for some of them to return Result instead, and they’ll hit stable eventually. You could also not use them if you don’t want that behavior.


No, Rust avoids most memory errors by proving at compile time that they don't exist. That's why it forces you to write code with lifetime annotations and such, so that that proof becomes feasible.

I say "most" because it also allows you to get out the footguns, meaning you can sidestep this proof mechanism. But you do that by declaring a block/function as "unsafe", so if you do find a memory bug, you know exactly where to start your search.


That's right, DoS in Rust is still a thing you can have. But it's no worse than in memory unsafe languages, since memory unsafety can also lead to DoS (and in a much worse way - there are methods for managing panics, managing segfaults is much harder).


> Doesn't Rust just panic and abort on runtime memory errors?

Technically yes, if you directly access an index that's out of bounds, but collections like Vec and slices have the get and get_mut methods, which let you try to retrieve an element (or subslice). If the index is in bounds you get Some with a reference to the element (or slice), and if not you get None, which can be handled rather than panicking.

Using .get isn't even slower than direct indexing, because direct indexing performs the same bounds check and simply panics on failure (it behaves like .get(i).unwrap()).


NTP takes in untrusted input from the net, does complex processing on it, and requires the ability to do something administrative on your system (adjust the time).

This gives NTP a lot of attack surface and makes it a likely vector for compromise.

Rewriting NTP in any language safer than C/C++ is likely to be a good idea.


That drives home the point that NTP would likely benefit from privilege separation too. One component talks to the network; another component adjusts the time. The second component would enforce gradual skewing only. (Or a time jump once in early boot, per policy. It could enforce the max size of that jump too, implement a policy where the machine boots into repair mode if it was more than 5 minutes off the desired time, etc.)

A very simple one-way data channel between the components would make attacking the second component near impossible.

With that, and an attacker that completely pwns component 1: At worst they could start a very gradual skew, which could then be picked up by monitoring before it causes much of an issue.
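The policy side of that second component could be as small as this sketch (all names and limits here are hypothetical, not from any real implementation; the request is treated as attacker-controlled):

```python
MAX_SLEW_PPM = 500.0     # never skew faster than this
MAX_BOOT_STEP_S = 300.0  # one bounded step allowed in early boot

def privileged_adjust(requested_offset_s, early_boot=False):
    """What the privileged component would actually apply, given a request
    arriving over the one-way channel from the network-facing component."""
    if early_boot:
        if abs(requested_offset_s) > MAX_BOOT_STEP_S:
            return ("repair_mode", 0.0)      # suspiciously far off: refuse and flag
        return ("step", requested_offset_s)  # one-time jump within policy
    ppm = requested_offset_s * 1e6 / 3600.0  # spread the correction over an hour
    return ("slew", max(-MAX_SLEW_PPM, min(MAX_SLEW_PPM, ppm)))
```

Even with component 1 fully compromised, the worst an attacker can request is a bounded slew, which monitoring can catch long before it matters.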


It's a networked, privileged process. If my goal were "try to ensure that more of my OS is memory safe" I'd probably start somewhere similar.

A lot of this post is specifically addressing the justification for this work so idk, I'd suggest responding to that directly.


There's no reason to believe there isn't some vulnerability lurking in chrony, which is a highly privileged process. These are exactly the starting points I would hope for.

On the other hand, the statement that they did not study chrony because they couldn't understand it does not exactly fill me with confidence.


To be fair, chrony, at least on my system, is not running as root. I'm not sure what mechanism it uses to set the system time, whether it's capabilities, or a setuid helper program, but I think it's safe to assume that, if compromised, the only malicious thing that it could do would be to set my system time to something incorrect. Which isn't nothing, but also isn't much, either.


People find vulnerabilities in old software all the time.


expat, sigh. openssl, sigh again.


It's unclear why Rust was chosen over OCaml or Haskell.


It's pretty obvious, actually. They want memory safety and predictable, good performance. Rust is the go-to language for that in systems development today. Go is a much better candidate than the ones you mention, but there are legitimate reasons to want to avoid GC.

Additionally, the intersection of people who are interested in working on low level system daemons and people who prefer Haskell/OCaml must be pretty small compared to Rust.


[flagged]


I didn't downvote your original post, but I just think these "why did they use language X instead of Y?" questions are kinda boring and tiresome, for the most part.

If your goal is memory safety, Rust is a good choice. If you additionally want something mainstream (such that it's unlikely that you'll have trouble finding contributors or future maintainers), Rust is also a good choice, and Haskell and OCaml probably aren't. And who knows, perhaps the people building this were just familiar with Rust, but not Haskell or OCaml.

But really, "It's unclear why X was chosen over Y or Z" is just not interesting. If the article doesn't say why, then we're just speculating. And in this particular case, I think the answer is probably pretty simple, obvious, and boring, anyway; I don't think it's "unclear" at all.


The idea that Rust is more mainstream than Haskell or OCaml sounds pretty crazy. I like the language but it's much younger and you don't see established companies using it the same way.


I'm pretty sure all FAANGs have used Rust in prod by now (maybe not Apple).


Apple does use Rust in production. Netflix is the only FAANG I’m unsure of if they use Rust anywhere or not. Three of the five companies are platinum members of the foundation, even.


There are stirrings of Rust module support in the FreeBSD kernel, which is heavily used at Netflix, so it's possible there'll be some Rust over there too, soon.


Rust inherits mainstream-ness by having an execution model similar to mainstream languages like C++.


Performance being important in system development is pretty much a given, and doesn't necessarily even need to be mentioned. You could argue that systems software is well defined as software where security, performance, and stability are more important than all other concerns.

I think the fact that your comment didn't elaborate is why you're being downvoted, not for asking questions. It can be interpreted as a question, but it can also easily be interpreted as a shallow dismissal of their choice of Rust, which is a fairly common thing on HN. If you had laid out some details on why the choice puzzled you, your comment would probably have been interpreted more charitably.


Performance is especially important when dealing with time syncing.


I think consistency is more important than actual execution time. Python will certainly have more latency responding to time queries, but it will be consistent and likely indistinguishable from a faster program.


This is a good point, and using Rust makes a lot of sense then since you avoid GC messing with runtime consistency.


Rust is mainstream in a way that OCaml or Haskell will never be.



