Performance of Rust and Dart in Sudoku Solving

yaakov34 · on April 6, 2013

A small personal peeve: it doesn't make sense to say that Sudoku is NP-hard, as the linked post does (or NP-complete). First of all, this benchmark (as far as I can tell) uses only traditional 9x9 puzzles, of which there is a finite number. Problems with a finite input space are trivial from the point of view of complexity theory; in order to talk about any kind of completeness or hardness results, you need to generalize Sudoku by defining NxN problems in some way.

Moreover, once you define NxN Sudoku (the generalization is easy), you need to pick the problem for which you want to prove a hardness result. The decision problem is typically defined as "given an instance of a problem, determine if there is a solution". For Sudoku, you would need to decide if a partial grid can be completed to a full grid, i.e. if the given puzzle is solvable. This has been proven to be NP-complete. However, this is not how the puzzle is usually done, including in this benchmark: you are given a puzzle which is known to have a solution, and the solution is known to be unique. You then need to find the solution. It's been shown that such a Sudoku problem is equivalent to the unambiguous satisfiability problem (given a logical formula which has a unique variable assignment that satisfies it, find that assignment). Now the unambiguous satisfiability problem is not known to be NP-complete when phrased as a decision problem.

Anyway, don't just throw around terms like "NP-hard". :)

EDIT: edited "unique satisfiability" to "unambiguous satisfiability". I believe this is the correct terminology for the "promise" version of the problem which I described, and to which Sudoku is equivalent. (The unique satisfiability problem is "decide if a logical formula has exactly one solution", where you are not told if it has more than one, none, or just one).

attractivechaos · on April 6, 2013

Finding and understanding computational algorithms is part of my daily work. I know what time complexity means. By saying NP-hard, I mean an NxN sudoku cannot be solved in time polynomial to N.

Most sophisticated sudoku solving algorithms, including mine, do not assume the grid has a solution or one solution. They test and give all possible solutions. I am really trying to solve an NPC problem.

That said, your comment still tells me something I do not know - you seems to say that given a sudoku that is known to have one solution, there may be polynomial algorithms. This reminds me of someone who left a comment in my blog, saying that if the solution is unique, at each step there are at most two choices with the exact cover matrix. He couldn't prove that, but his program run happily on tens of thousands of Sudokus without problems. This does not necessarily solve sudoku in polynomial time, but may make solvers faster for unique sudokus.

yaakov34 · on April 6, 2013

Yeah, I have no problem with that if it's viewed as an NxN problem. I do have a slight tic about this since I've heard "NP-hard" misused so often, including by one guy at a former workplace whose program needed to test all 6 permutations of something-or-other to discover the optimum one, and that was "NP-complete", you see.

About the complexity of Sudoku which is known to have a unique solution: I didn't say there is a polynomial algorithm for it. It's equivalent to the Unambiguous SAT problem, and Valiant and Vazirani proved that it's in some sense almost as hard as the SAT problem itself. You can't call it NP-complete (because that's not known), but if there is a polynomial algorithm for it, then there is a probabilistic polynomial algorithm for every problem in NP; and since you can run a probabilistic algorithm as many times as you want and get independent results, you'd be able to solve every decision problem in NP with immense certainty (although less than 1).

About efficient algorithms for Sudoku, I think that the crux here is average case complexity (defined in some way over the sort of samples you are using). It's not clear to me that average case complexity for Sudoku is high, at least for the kind of puzzles we know how to generate. Certainly solvers based on backtracking seem to handle them very well. It's very hard to prove results about that, though. In any case, if someone has a polynomial time algorithm for Unambiguous SAT, or something equivalent to it, that would pretty amazing (but I doubt it - the average time might be very good, though).

attractivechaos · on April 6, 2013

Thanks for these info. I appreciate.

ot · on April 6, 2013

This. Most of the time when people say "NP-hard" they actually mean "no algorithm substantially better than an exhaustive search is known".

currymesurprise · on April 6, 2013

The Valiant-Vazirani theorem says that both Unique and Unambiguous SAT are NP-hard under randomized reductions.

While this is not NP-hard under the usual definition, modern complexity theorists believe derandomization to be possible to a degree similar to their belief that P is not NP.

pcwalton · on April 6, 2013

I added some optimizations to the Rust compiler on my branch, added inlining attributes to match the C version, removed I/O (we have performance problems there that are under active development) and reduced the integer sizes to 32 bit to match the C version more closely (see [1]) and got the time down to within 50%:

C version: 0m1.305s

Rust version: 0m1.871s

I think the remaining time may be attributed to the fact that you used smaller integer widths for the various internal arrays in the C version and larger integer widths in the Rust version. Remember that Rust `int` types are the size of a register (like Go 1.1), so they're 64 bit on x86-64. You can use `i8`, `i16`, etc. for smaller integer widths.

Bounds checks may also have some effect; we should add methods to allow them to be avoided.

Finally, thanks for trying out Rust. Do you mind if we add this benchmark to our test suite?

Edit: Updated the gist to use smaller integer widths, like the C version. Now Rust is within 33% of the C version:

C version: 0m1.205s

Rust version: 0m1.613s

[1]: https://gist.github.com/pcwalton/5327090

robotmay · on April 6, 2013

I'm really excited by Rust because it seems to be one of the most sensible language designs I've seen in a while. Go is nice but has some pretty odd syntax in places. I'm just waiting on Rust to hit 1.0 before I start playing with it more seriously.

Symmetry · on April 6, 2013

Eh, the only real surprise in Go's syntax is the odd ordering of type and variable names. That is, I think, a welcome change for those of us who sometimes have a hard time remembering if *a[] is an array of pointers or a pointer to an array. There wasn't anything I couldn't puzzle out without looking it up on the basic syntax side of things.

Rust, by contrast, introduces more new syntax for things like it's various sorts of pointers. And there are other things that look new to me too, thought that might be due to me not knowing ML. They're there for a good reason, but they add to the learning curve.

pjmlp · on April 6, 2013

A nice ML influenced system programming language, that even has generics, who would guess that?!

zura · on April 6, 2013

And deterministic memory management, which is quite rare these days... :)

But to be honest, skimming through Rust samples, I find its syntax somewhat noisy. It feels ad-hoc. Is there any document about justification of its syntax elements?

pcwalton · on April 6, 2013

We actually have partaken in long, long debates about the syntax, and the current syntax is one that everyone seemed to be OK with, modulo a few compromises here and there.

Usually people who say Rust looks like line noise are concerned about `@` and `~`. These are there for a reason: they're the Rust versions of the smart pointers `shared_ptr` and `unique_ptr` in C++. Unlike in C++, you have to use them: there is no other way to allocate memory. So `shared_ptr` and `unique_ptr` would be all over the codebase in Rust code if we didn't abbreviate them, and making them one character was the easiest way to do that.

I actually think that one of the reasons few people write exception-safe code in C++ is that `shared_ptr` and `unique_ptr` (especially if you don't use `using namespace std;`) are so long that `new` and `delete` end up being more convenient…

neilk · on April 6, 2013

Unlike most other language designers today, Rust's designer was more concerned about getting the concepts right than the syntax. So the syntax came late in the process, and as far as I know they did get inspiration from various sources. Possibly that is what bugs you.

I don't think there is a single document about syntax choices. It's evolved a lot in the past three years.

pjmlp · on April 6, 2013

> And deterministic memory management, which is quite rare these days... :)

Memory management should be automatic, either by reference counting or GC.

I think it is a generation thing until mainstream OS adopt such kind of system programming languages.

There are a few OS with such system programming languages, but it only counts when the likes of Apple, Microsoft and Google adopt such languages.

Objective-C with ARC, C++11, C++/CX are already steps in that direction.

>But to be honest, skimming through Rust samples, I find its syntax somewhat noisy. It feels ad-hoc. Is there any document about justification of its syntax elements?

My only issue is the Perl like prefixes for pointer types. I think it pollutes a bit the ML language influence.

On the other hand, I am starting to get used to it.

Symmetry · on April 6, 2013

Rust's memory management is both (relatively) deterministic and automatic, in that it's easy to figure out exactly when objects are being destroyed if you care but you don't have to do anything yourself to ensure that they're destroyed properly. This is in contrast to C or C++ where you have deterministic destruction but you have to clean up things you have to clean up things on the heap yourself, or Go or Java where you can't be sure at all when the garbage collector is going to harvest something.

pjmlp · on April 6, 2013

> This is in contrast to C or C++ where you have deterministic destruction but you have to clean up things you have to clean up things on the heap yourself, or Go or Java where you can't be sure at all when the garbage collector is going to harvest something.

This is a false belief many C and C++ developers have.

If you use the standard malloc()/free() or new/delete pairs, you are only certain at which point the memory is marked as released by the C or C++ runtime library.

At that point in time the memory can still be marked as in use for the OS and only be released at a later point, just like GC does.

This is one of the reasons why HPC makes use of special allocators instead of relying in the standard implementations.

In the end, this is no different than doing performance measurements to optimize the GC behaviour in GC enabled systems languages.

zura · on April 6, 2013

Besides pure memory, there are other types of resources as well, which are deterministically destroyed/closed/freed (automatically, in C++ RAII usage).

Also, in C/C++ your performance critical loop/execution is not interrupted by harvesting GC passing nearby.

AFAIK, one of the main reason people come up with custom allocators is that the system new/delete (malloc/free) are expensive to call. E.g. it is much faster to pseudo-allocate memory from system pre-allocated static memory, in your app.

pjmlp · on April 6, 2013

> Besides pure memory, there are other types of resources as well, which are deterministically destroyed/closed/freed (automatically, in C++ RAII usage).

In reference counting languages, the destroy method, callback, whatever it is named, takes care of this.

In GC languages, there is usually scope, defer, try, with, using, or whatever it might be called.

> Also, in C/C++ your performance critical loop/execution is not interrupted by harvesting GC passing nearby.

Code in a way that no GC is triggered in those sections, quite easy to track down with profilers.

Not able to do that? Just surround the code block with a gc.disable()/gc.enable() or similar.

> AFAIK, one of the main reason people come up with custom allocators is that the system new/delete (malloc/free) are expensive to call. E.g. it is much faster to pseudo-allocate memory from system pre-allocated static memory, in your app.

Which funny enough is slower than in languages with automatic memory management, because the memory runtime just do a pointer increment when allocating.

coldtea · on April 6, 2013

>There are a few OS with such system programming languages, but it only counts when the likes of Apple, Microsoft and Google adopt such languages.

There are a few OSs written with everything. It only counts when pragmatic, useful OSs are written with a language. Most of those OSs are unusable, slow, proof of concepts.

pjmlp · on April 6, 2013

Native Oberon and AOS were used for real work as desktop workstations at Zurich's Technical University.

Sadly no one in the industry decided to invest on them.

coldtea · on April 6, 2013

Sounds like natural selection.

PuerkitoBio · on April 6, 2013

Selection, yes. Natural? Do you somehow imply that technology that "wins" mass adoption is "better"? Or that it gains mass adoption based on rational analysis of its "value"? There are quite a few unnatural forces at play, here...

coldtea · on April 6, 2013

>Selection, yes. Natural? Do you somehow imply that technology that "wins" mass adoption is "better"?

No, just that it's more fit.

Which is the exact same thing natural selection in nature implies. An animal that spreads is not "better" -- it's just more fit for it's environment.

For an OS "more fit" can mean: faster, consuming less memory and with more control over it, usable in more situations and more hardware, cheaper to run, leveraging existing libraries, etc. It doesn't have to be "better" as in "less prone to crash", "safer" etc.

The parent mentioned SUN experimenting with a Java OS. What a joke would that be, given that SUN's experiments with a similarly needy application (a Java web browser) ended in utter failure, with a slow as molasses outcome.

Sure, it would run better with the resources we have now. But a C OS like linux also runs much better with the resources we have now -- so the gap remains.

It's not like we have exhausted the need for speed in an OS (on the contrary). It's not also like, apart from MS Windows of old, we have much problems with the core OS crashing or having security issues.

In fact, I haven't seen a kernel panic on OS X for like 3-4 years. And my Linux machines don't seem to have any kernel problems either -- except sometimes with the graphics drivers.

So, no, I don't think we're at the point where a GC OS would make sense for actual use.

ARC, an example the parent gives, fine as it might be, doesn't cover all cases in Objective-C. Tons of performance critical, lower level stuff still happens in C-land, and needs manual resource management, it's just transparent to the Cocoa level.

pjmlp · on April 6, 2013

No, it just means so far Apple and Microsoft were not that interested into doing that. I don't count with the commercial UNIX vendors, except Sun, because those only care about C and C++, mainly.

But this is going to change slowly anyway.

Sun did experiment using Java in the Solaris kernel, they just didn't went that far due to what happen to them.

Apple first played with GC in Objective-C, and now you have ARC both in Mac OS X and iOS.

Microsoft played with Singularity, only allowed .NET applications on Windows Phone 7.x and Windows (Phone) 8 uses C++/CX, which uses a reference counting runtime (WinRT). While C is considered legacy for Microsoft (official version).

C++11 has reference counting libraries and a GC API for compiler implementers.

Android has Dalvik and the native applications are actually shared objected loaded into Dalvik. Contrary to what some people think the NDK does not allow for full development of C and C++ code, except for a few restricted system APIs.

Sailfish and Blackberry, Qt makes use of C++ with reference counting and JavaScript.

FirefoxOS and ChromeOS are anyway only browsed based.

If there is natural selection in this, it might just be that manual memory management is going to be restricted to the same set as Assembly. Interacting with hardware or for special cases of code optimization.

So a systems programming language using either GC or reference counting, with the ability of doing manual memory management inside unsafe regions, might eventually be the future.

4ad · on April 6, 2013

You're complaining about Go syntax, which is one of the most readable languages out there but you like rust which looks like line noise?[1]

Sometimes I have a hard time convincing myself people don't post troll comments on HN.

[1]https://github.com/mozilla/rust/blob/master/src/libsyntax/pa...

coldtea · on April 6, 2013

>You're complaining about Go syntax, which is one of the most readable languages out there but you like rust which looks like line noise? Sometimes I have a hard time convincing myself people don't post troll comments on HN.

And then, as an example of "bad Rust syntax" you link to the "lexer.rs"?

As if a lexer in any language is a good example of it's everyday syntax?

How about the very first example they give:

http://www.rust-lang.org/

>My first thought was that it looked like CSS o.O

Because the top part of the source he linked to has just struct definitions, which is what CSS was designed like anyway. Would struct definitions in most common languages, like modern C or Go look any different? (including an {} object definition in JS)

octo_t · on April 6, 2013

Lexers are notoriously complex and difficult to read. The one in Go[1] is also really difficult to grok.

[1] - https://code.google.com/p/go/source/browse/src/pkg/go/scanne...

pcwalton · on April 6, 2013

That file is very ugly (it's one of the oldest files in the repository) and does not reflect the current programming style. The indentation is all over the place, the type names are not CamelCased, there isn't enough use of methods—`str::len()` is ugly compared to `"foo".len()`—and so on.

(Of course, this isn't an excuse: we sorely need to refactor the compiler.)

kyrias · on April 6, 2013

My first thought was that it looked like CSS o.O

And somehow felt a bit like VB

blinkingled · on April 6, 2013

Better stats than time using Linux perf utility. Looks like if you can exclude Java's startup time they are all pretty much very competitive. Clang SVN wins and Go 1.1 native build beats gccgo-4.8. Hmm.

* java version "1.7.0_17" - 485,284,671 cycles, 0.133074106 (0.0577s excl. startup) seconds time elapsed, 3.64% of all branches missed

* clang svn -O2 - 49,227,286 cycles, 0.029045632 seconds time elapsed, 6.70% of all branches missed

* gcc-4.8 -O2 -flto - 56,989,391 cycles, 0.029750578 seconds time elapsed, 6.48% of all branches missed

* gccgo-4.8 -O2 -flto - 114,583,934 cycles, 0.061428897 seconds time elapsed, 2.96% of all branches missed

* Go 1.1 Beta - 109,278,081 cycles, 0.048572937 seconds time elapsed, 2.62% of all branches missed

gillianseed · on April 7, 2013

Why are you using -O2 instead of -O3?

blinkingled · on April 8, 2013

-O2 seems to be the most commonly used optimization level. That's what most of the distro software and the kernel etc uses. In the past except for maybe a handful things, O3 was an overkill. (O3/Ofast remind me of my Gentoo days! :)

damien · on April 7, 2013

Or -Ofast

jeremyjh · on April 6, 2013

I'm really excited about Rust's performance here, considering how young the compiler and its runtime are. It has to get much faster but that seems likely to be possible.

SeanDav · on April 6, 2013

Since asm.js is also flavour of the month and not a million miles away from Rust and Dart in terms of target application, would be great to see this added.

azakai · on April 6, 2013

Results on my machine, normalized to C

Native (C): 1.00 Firefox/SpiderMonkey: 1.38 Chrome/v8: 1.97

Firefox and Chrome are running on asm.js code from compiling the C source.

edit: link: https://gist.github.com/kripken/5327216#file-sudoku-asm-js

hamax · on April 6, 2013

My results with newer versions of gcc and go:

gcc 4.7.2 o3 0.765s

java openjdk 7 1.066s

java openjdk 6 1.118s

go 1.1b 1.653s

rust 0.6 opt 3 1.659s

go 1.0.2 2.127s

dart 0.4.4.4 2.384s

pypy PyPy 1.9.0 5.094s

pypy PyPy 2.0b 5.319s

(Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz)

Edit: Added java

melling · on April 6, 2013

It looks like Go 1.1 might be faster than Java. That might help convince enterprise developers to adopt it.

jfb · on April 6, 2013

No way would a microbenchmark ever change any responsible person's mind about tooling. There are a lot of reasons (some of them nearly respectable) to use Java; "it's faster than <X>" isn't one of them.

darksaints · on April 6, 2013

Don't be silly. Java is extensively used for speed. It is an extremely fast language with a few caveats (such as startup/warmup time).

melling · on April 6, 2013

Of course, I didn't mean to imply that any microbenchmark would change someone's mind. It's kind of absurd to think that's what I meant. Go should approach C/C++ in performance and often Java is promoted as about 2x slower than these two languages. Fast compilation and linking is another big plus. The formatting consistency is another. It's not a verbose language either.

hamax · on April 6, 2013

Added java to the list. It's still faster then go.

elteto · on April 6, 2013

Just curious, are you factoring in the VM warmup time for Java?

raphael_kimmig · on April 6, 2013

I've tried it with go tip and go 1.0.

T(go tip) / T(go 1.0.3) = 0.8

Thus one might expect go to take third place after clang and gcc. But keep in mind that e.g. java isn't the newest version either...

kyrias · on April 6, 2013

Just tried it myself with the C version and with 1.0.3,

  C:  ./a.out < sudoku.txt      0.07s user 0.00s system 94% cpu 0.077 total
  Go: ./sudoku_v1 < sudoku.txt  0.06s user 0.01s system 88% cpu 0.079 total

Which seems a bit interesting...

acjohnson55 · on April 6, 2013

Man, PyPy got destroyed, let alone CPython! As a Python lover, that's pretty disheartening. I'm not an expert in many of these languages, but it would seem as though it should be possible to build a Python interpreter as fast as V8, even without using Numpy to accelerate the math. Anybody have any thoughts as to why PyPy is such a poor performer here?

RivieraKid · on April 6, 2013

Python just 8x slower than C and 4x slower than Java? That's actually pretty good result for Python.

cdavid · on April 6, 2013

I think LuaJIT is the outlier, not python. All other implementations got lots of resources to make them fast, except for LuaJIT.

copx · on April 6, 2013

PyPy simply isn't at the same level as V8 and LuaJIT yet.

V8 is a professional, full-time project employing some of the best programmers money can buy. And LuaJIT.. well LuaJIT is written by a fucking genius 10x programmer. Let me quote him:

" The reason why X is slow, is because X's implementation is slow, unoptimized or untuned. Language design just influences how hard it is to make up for it. There are no excuses. " Source: http://www.reddit.com/user/mikemike (that's his reddit account)

Python (and Ruby for that matter) could reach a comparable performance level if people with the necessary skills invested the necessary amount of effort.

cdavid · on April 6, 2013

Indeed, that's what I meant, even though my phrasing was unclear. Thanks for clarifying.

Symmetry · on April 6, 2013

That's actually an impressively good result. All that wonderful expressive power that Python has comes at a price in implementation complexity, and there are a lot of different cases that the JIT needs to handle with Python. Lua, by contrast, is pretty much Scheme in ALGOL's clothing and is ridiculously simple inside.

jayflux · on April 6, 2013

Python is a High Level language, it was never designed with the intention of being as fast as C or Java or other lower level languages.

As someone else pointed out the V8 has had a lot of backing and full time development, where as pypy, Cpython hasn't had as much. Pypy is still showing massive improvements despite being the slowest on that list.

lern_too_spel · on April 6, 2013

Java 1.6 is more than 6 years old now too. I half expected to see a Turbo Pascal 4 benchmark mixed in there before I realized I was joking.

codygman · on April 6, 2013

Update the Go example to use 1.1 :D

idupree · on April 6, 2013

Y'know,

That GCC is six years old.[1] Go, Rust, Dart and V8 didn't even exist when it was released. Maybe update that if anything?

[1] http://gcc.gnu.org/releases.html

danieldk · on April 6, 2013

And since it's a Mac, it is probably llvm-gcc (GCC with an LLVM backend).

A colleague/friend who is developing neural network software recently switched from gcc 4.2.1 to 4.5 (which is still old) and got a 50% speed increase without touching any code.

igouy · on April 6, 2013

It's been just 2 days since go1.1beta1 was released, the other language implementations seem more out-of-date than that.

reneky · on April 6, 2013

There are more low-hanging performance fruits in the immature languages.

That said, of course all should be up to date.

igouy · on April 6, 2013

See danieldk's comment -- https://news.ycombinator.com/item?id=5502984

codygman · on April 6, 2013

That doesn't negate the fact that I want to see the performance of go1.1 :D

Truthfully though, I didn't notice the others were out of date. I bet OP will update them all :P

SiVal · on April 6, 2013

That's the first thing I thought, too. I'd like to see both Go 1.1 and Go 1.0.3 for a comparison (on this benchmark). I'm hoping that before long we'll be able to count on Go being within a factor of 2 of plain C by most benchmarks.

And I'd like to see CPython 3.3 in addition to PyPy.

But it's quite interesting as is. Thanks.

shocks · on April 6, 2013

Obligatory programming language shootout link: http://benchmarksgame.alioth.debian.org/

igouy · on April 6, 2013

The project name changed 6 years ago.

"The Virginia Tech shooting in April 2007 once again pushed gun violence into the media headlines. There was no wish to be associated with or to trivialise the slaughter behind the phrase shootout so the project was renamed back on 20th April 2007 - The Computer Language Benchmarks Game."

http://benchmarksgame.alioth.debian.org/dont-jump-to-conclus...

shocks · on April 6, 2013

That is ridiculous.

igouy · on April 6, 2013

What is ridiculous? Understanding what a shootout is?

http://en.wikipedia.org/wiki/Shootout

shocks · on April 6, 2013

Changing the name of a programming competition to not cause some kind of offence is ridiculous in my book.

I understand what a shootout is. Do you understand what a programming competition is? I'll give you a clue, it does not involve murder nor does it advocate gun crime. Anyone that can construe a bad message from a programming competition called a 'shootout' is as bad as feminists claiming sciences lack of progress in the field of fluid mechanics is sexism.

This kind of behaviour only encourages the ridiculous "politically correct" world we live in today.

igouy · on April 7, 2013

> Do you understand what a programming competition is?

Is the benchmarks game listed somewhere as a programming competition?

http://en.wikipedia.org/wiki/Competitive_programming#Notable...

RivieraKid · on April 6, 2013

AFAIK it's not obligatory.

igouy · on April 6, 2013

And was already referenced in the blog -- "Other statically typed languages are about twice as slow, which broadly agrees with Computer Benchmark Games."

(Although I suppose that's some-kind-of appeal to authority?)

dom96 · on April 6, 2013

It would be nice to see a Nimrod benchmark.