(1) Even in that benchmark, Haskell is over twice as slow as Rust on average.

(2) The Haskell code is nowhere near idiomatic.

It is still useful to know that Haskell can be quite fast if necessary, but it likely does not reflect what your actual experience using Haskell would be.




> (2) The Haskell code is nowhere near idiomatic

Firstly, this isn't meant as a personal attack. Secondly, I've heard this said more than a few times lately and am genuinely curious:

Do you know which Haskell examples are idiomatic, which aren't, and why?

I feel like in some cases it's just parroting someone else's argument.


Just in the first two examples on that page, you'll find allocaArray, mallocByteString, writeAlu, unsafeUseAsCString, unsafeIndex, withForeignPtr, unsafeDrop, and more. You really don't have to know Haskell very well (and I'm not claiming to be an expert by any stretch) to understand that this is far from idiomatic in a lazily evaluated, pure functional language. No parroting required.
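
For the curious, here is a rough sketch (mine, not the benchmark's actual code) of the pointer-walking style those names imply, next to the idiomatic one-liner it replaces:

    {-# LANGUAGE BangPatterns #-}
    import qualified Data.ByteString as B
    import Data.ByteString.Internal (toForeignPtr)
    import Foreign.ForeignPtr (withForeignPtr)
    import Foreign.Ptr (Ptr, plusPtr)
    import Foreign.Storable (peekByteOff)
    import Data.Word (Word8)

    -- Benchmark-style: walk the ByteString's raw pointer in IO.
    sumBytesLowLevel :: B.ByteString -> IO Int
    sumBytesLowLevel bs = withForeignPtr fp $ \base ->
        go (base `plusPtr` off) 0 0
      where
        (fp, off, len) = toForeignPtr bs
        go :: Ptr Word8 -> Int -> Int -> IO Int
        go p !i !acc
          | i >= len  = return acc
          | otherwise = do
              w <- peekByteOff p i :: IO Word8
              go p (i + 1) (acc + fromIntegral w)

    -- The idiomatic one-liner it competes with:
    sumBytes :: B.ByteString -> Int
    sumBytes = B.foldl' (\acc w -> acc + fromIntegral w) 0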


You should know it at least a tiny bit though, or else you lump perfectly "idiomatic" functions in with low level functions that aren't often used and pretend they're a problem. How is indexing an array without a bounds check somehow counter to lazy evaluation or functional programming? It is the exact same thing as indexing it normally, just without checking the length.


How is indexing an array OK with functional programming of the level of purity Haskell strives for (not to mention without bounds checking)?

In "Real World Haskell" they advise you to forgo of arrays, and even has a chapter "Life without arrays", saying "arrays and hash tables are often used as collections indexed by key, and in Haskell we use trees for that purpose".


>How is it not? That is a complete non-sequitur and makes no sense. What sort of definition of functional programming are you using that you think it restricts what data types you are allowed to use?

The normal definition of functional programming, in which not all data structures used in imperative programming are idiomatic (or even considered "functional").

For starters, mutable data structures would be considered not purely functional. E.g:

"Most books on data structures assume an imperative language such as C or C++. However, data structures for these languages do not always translate well to functional languages such as Standard ML, Haskell, or Scheme. This book describes data structures from the point of view of functional languages, with examples, and presents design techniques that allow programmers to develop their own functional data structures" [1]

[1] http://www.cambridge.org/us/academic/subjects/computer-scien...
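
To make Okasaki's point concrete, here is a minimal sketch (not from the book) of a classic purely functional queue: every "update" returns a new version, and old versions remain valid:

    -- Push onto the back list; pop from the front list, reversing the
    -- back list into the front when the front runs dry.
    data Queue a = Queue [a] [a]

    emptyQ :: Queue a
    emptyQ = Queue [] []

    push :: a -> Queue a -> Queue a
    push x (Queue front back) = Queue front (x : back)

    pop :: Queue a -> Maybe (a, Queue a)
    pop (Queue [] [])       = Nothing
    pop (Queue [] back)     = pop (Queue (reverse back) [])
    pop (Queue (x:xs) back) = Just (x, Queue xs back)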


>The normal definition of functional programming, in which not all data structures used in imperative programming are idiomatic

I can't find any reference anywhere to any sort of definition like that, or even one that mentions restricting data structures or memory layout at all. The only "normal" definitions I can find are "functions are first class", which is a pretty weak definition and doesn't really exclude much these days, and "the return values of functions depend only on the functions' arguments", which is what everyone in the Haskell world thinks of as "functional programming". Neither of these definitions excludes arrays.

>For starters, mutable data structures would be considered not purely functional

A data structure cannot be functional or imperative; only what you do with it can. Mutation is perfectly fine if it is localized (remember, the function's return value needs to depend only on its arguments, nothing more).
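
A minimal sketch of that localized-mutation point (my own example, assuming the vector package): the function below mutates an array internally, but runST guarantees none of the mutation can be observed from outside, so the result depends only on the argument:

    import Control.Monad.ST (runST)
    import qualified Data.Vector.Unboxed as V
    import qualified Data.Vector.Unboxed.Mutable as MV

    -- Reverse by in-place swaps on a private mutable copy; pure
    -- from the outside despite the mutation on the inside.
    reverseInPlace :: V.Vector Int -> V.Vector Int
    reverseInPlace v = runST $ do
        mv <- V.thaw v                 -- private mutable copy
        let n = MV.length mv
        mapM_ (\i -> MV.swap mv i (n - 1 - i)) [0 .. n `div` 2 - 1]
        V.freeze mv                    -- back to an immutable vector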


If you look at the context you can see that they are saying that using immutable arrays has a performance hit compared to mutable ones. That doesn't mean you shouldn't use arrays if they suit your purpose.


>How is indexing an array OK with functional programming of the level of purity Haskell strives for (not to mention without bounds checking)?

How is it not? That is a complete non-sequitur and makes no sense. What sort of definition of functional programming are you using that you think it restricts what data types you are allowed to use? Do you think you aren't allowed to use lists in C?

>In "Real World Haskell" they advise you to forgo of arrays

You might want to read it instead of misrepresenting something out of context. You might also wish to consider that many of the haskell shootout submissions came from the authors of RWH.


The point is that it is unsafe. You don't need to do that in idiomatic Rust code, because the bounds-check-free indexing is encapsulated inside the iterator library. So to re-frame it: fast, idiomatic Rust code is safer and easier to write than fast, idiomatic Haskell code.


>The point is that it is unsafe

No, the point was that it is "far from idiomatic". Did you read the post I was replying to?


>Do you know which Haskell examples are idiomatic, which aren't, and why?

Why wouldn't he?

With a little familiarity with programming at large and Haskell syntax, you can learn to discern the two cases even if you're not a Haskell programmer.


A glance at the source quickly reveals an unsafe implementation.


I didn't see an import for unsafe or foreign in:

fannkuch-redux, pidigits, binary-trees


fannkuch-redux heavily utilizes unsafe operations on a mutable vector.

binary-trees uses artificial strictness for the sake of the benchmark.

Nine of the ten Haskell implementations are demonstrations of writing C in Haskell (without reaching C's performance).


> binary-trees uses artificial strictness for the sake of the benchmark

Using strict evaluation when appropriate is most certainly idiomatic Haskell code.
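
For example (a toy of my own, not the benchmark code), a strict accumulator via a bang pattern is entirely ordinary Haskell; without the bang the loop would build a chain of thunks instead of a number:

    {-# LANGUAGE BangPatterns #-}

    -- Sum 1..n with a strict accumulator; the bang forces acc at
    -- each step, so the loop runs in constant space.
    sumTo :: Int -> Int
    sumTo n = go 1 0
      where
        go i !acc
          | i > n     = acc
          | otherwise = go (i + 1) (acc + i)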

> Nine of the ten Haskell implementations are demonstrations of writing C in Haskell (without reaching C's performance).

This (the "C in Haskell" accusation) is debatable, though I'm not sure there's much to be gained even given a totally successful discussion where we both communicate our points 100% effectively.


There probably isn't a point to it, as we're likely trying to argue different points.

I posit that using Haskell in such a way is a disservice to the language. If you want to code in such a way, why use Haskell?

Haskell is plenty fast as is and doesn't need to resort to unsafe code. What's the point?


Mutability and strictness aren't a disservice to Haskell. They're both considered to be things which are available as choices and should be taken when needed. In particular, the entire ST monad is all about mutability and is very well behaved.

Furthermore, even if you write nearly everything in the IO monad, you'll still gain a lot in Haskell from the type system and from easily peeling out small, meaningful pure segments.
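
A toy illustration of that last point (my own sketch, not anyone's real program): the entry point lives in IO, but the actual logic sits in a small pure function the type system can vouch for:

    -- Pure core: trivially testable, no IO anywhere in its type.
    summarize :: [Int] -> (Int, Int, Double)
    summarize xs = (minimum xs, maximum xs, mean)
      where mean = fromIntegral (sum xs) / fromIntegral (length xs)

    -- Thin IO shell: read numbers from stdin, one per line, and
    -- report (assumes at least one line of input).
    main :: IO ()
    main = do
        nums <- map read . lines <$> getContents
        print (summarize nums)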


>fannkuch-redux heavily utilizes unsafe operations on a mutable vector.

They are just not bounds checked. That is not "writing C in Haskell".

>binary-trees uses artificial strictness for the sake of the benchmark

What on earth is "artificial strictness"? Forcing evaluation when you need something to be evaluated immediately is not artificial, not unidiomatic, and not "writing C in Haskell".

>Nine of the ten Haskell implementations are demonstrations of writing C in Haskell

And they end up shorter than Rust? The idea that writing lower level code is "writing C in Haskell" is nonsense; the ability to write more verbose but faster code is not harmful, it is useful. And the fact that lower level Haskell is still shorter than Rust, yet you want to act like it is some sort of horrible thing, makes it hard to believe that you are really concerned about the horrors of writing slightly more verbose code.


The Rust code is both safer and faster at the cost of more keystrokes.

I'm not certain of the point of writing unsafe code in Haskell when there are better tools for the job. Haskell is fast enough as is, and using it to hammer screws just reeks of the wrong tool for the job.


As an aside, if you're interested in safer and faster, I'm benchmarking some of the ATS2 code examples[0] which were created for (but are not yet in[1][2]) the computer language benchmarks game. It seems to beat C and C++ in some (many?) cases and creates TINY binaries. I was trying to benchmark all the examples but got tired of doing it ;) Here are a couple anyway:

Pidigits:

    $ patscc -I/home/cody/sources/ATS-Postiats-contrib/contrib -pipe -O3 -fomit-frame-pointer -march=native pidigits.dats -o bin/pidigits.ats_run -lgmp
    $ gcc -pipe -Wall -O3 -fomit-frame-pointer -march=native  pidigits.c -o bin/pidigits.gcc_run -lgmp
    $ time ./bin/pidigits.gcc_run 10000 > /dev/null
    
    real	0m0.969s
    user	0m0.963s
    sys	0m0.004s
    $ time ./bin/pidigits.ats_run 10000 > /dev/null
    
    real	0m0.972s
    user	0m0.968s
    sys	0m0.004s
    $ ls -larth bin/pidigits.*
    -rwxrwxr-x 1 cody cody 15K Dec 13 18:59 bin/pidigits.ats_run
    -rwxrwxr-x 1 cody cody 14K Dec 13 18:59 bin/pidigits.gcc_run

k-nucleotide:

    $ g++ -c -pipe -O3 -fomit-frame-pointer -march=native -std=c++0x k-nucleotide_gpp3.c++ 
    $ patscc -DATS_MEMALLOC_LIBC -pipe -O3 -fomit-frame-pointer -march=native -std=c99 k-nucleotide.dats 
    $ time ./k-nucleotide_gpp3 < ~/Downloads/knucleotide-input.txt 
    
    real	0m0.177s
    user	0m0.300s
    sys	0m0.055s
    $ time ./k-nucleotide < ~/Downloads/knucleotide-input.txt 
    
    real	0m0.056s
    user	0m0.036s
    sys	0m0.020s
    $ ls -larth k-nucleotide k-nucleotide_gpp3
    -rwxrwxr-x 1 cody cody 38K Dec 13 19:07 k-nucleotide
    -rwxrwxr-x 1 cody cody 96K Dec 13 19:07 k-nucleotide_gpp3


[0]: https://github.com/githwxi/ATS-Postiats-contrib/tree/master/...

[1]: https://alioth.debian.org/forum/forum.php?thread_id=14942&fo...

[2]: https://groups.google.com/forum/#!topic/ats-lang-users/QdwKp...


I don't know much about ATS, but you've given me reason to investigate. Thank you!


Your comment says absolutely nothing, and that appears to be the goal. What is the point of writing any code in any language? It is a benchmark; the point of it is to write the fastest code you can. You aren't obligated to write lower level code in Haskell; nobody has a gun to your head. So why do you pretend it is a problem, and claim other languages are "better tools" for that job despite Haskell clearly being very good at that job?


>>It is a benchmark; the point of it is to write the fastest code you can.<<

The point of it is to show the performance differences, but that doesn't exclude a program like --

http://benchmarksgame.alioth.debian.org/u64q/program.php?tes...

-- or like --

http://benchmarksgame.alioth.debian.org/u64q/program.php?tes...

etc


It doesn't exclude "idiomatic" versions, but the data is presented so as to make them useless. How do I compare "idiomatic" rust to "idiomatic" haskell? All you can get is "here's the fastest versions from each language". If you want to compare any other version you have to do it on an individual basis, one at a time. So there's absolutely no reason for people to bother adding "idiomatic" versions.


"Idiomatic" in whose opinion? "Idiomatic" is a slogan not a well-defined property.

The reason for people to bother adding "idiomatic" versions, is that other people really do work through looking at the source code "on an individual basis, one at a time."


This exact thread came up last year when we started talking about the shootout and Haskell. And it's always the same people who bring up the shootout and defend the existing entries.

Please stop bringing up the shootout (esp when haskell is mentioned) and dragging everyone through this conversation again.

Benchmarks are great to debate but this one is causing more harm than good; let's talk about other benchmarks.


>>Please stop bringing up the shootout…<<

I didn't!

Your accusation is completely wrong!

(And it hasn't been called that for over 7 years.)


>"Idiomatic" in whose opinion?

That's the point I'm making. People constantly dismiss the results they don't like as "unidiomatic" and embrace the ones they do as "idiomatic".

>is that other people really do work through looking at the source code "on an individual basis, one at a time."

Except that people don't bother adding them, so your reasoning for why isn't relevant. If you want to claim the shootout is useful for this, you need to allow for grouping benchmarks together to be presented against the "standard" ones. Let Bob and Sally submit "Bob and Sally version of Foo benchmarks" and let me compare them against Foo and against other languages.


>>Except that people don't bother adding them …<<

I already provided links to two programs and you're still saying that people don't bother adding them.


Do you think that is contradictory for some reason?


>I feel like in the case of some it's just parroting someone elses argument.

It usually is. The reality is that, like most such memes, there's a grain of truth at the center, but most of the people repeating it don't know where the truth ends and the exaggeration begins.

Regex-dna is using low level ByteString operations to squeeze out a bit of extra performance. This makes it quite a bit longer and would rarely be done in practice since the gain is small.

Fasta again does some low level ByteString operations, but not nearly as significant as regex-dna. Not that bad.

Others like fannkuch-redux are just using "unsafe" vector functions. All that is doing is leaving out bounds checks. That makes a minuscule difference in performance most of the time, but the way the shootout works, there's only incentive to completely optimize for speed, so you get 1% optimizations you wouldn't normally see. I'd hardly call it unidiomatic though, you just normally use "read" instead of "unsafeRead", etc. In some cases I would say the unsafe calls are idiomatic. If you are going to be doing an operation repeatedly in a loop, doing the bounds checking once before looping and using the non-bounds checked calls in the loop is pretty standard.
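
Concretely, the pattern is this small (a sketch of mine using the vector package, not the actual benchmark source): check lengths once up front, then index without per-element checks inside the loop; swapping V.unsafeIndex back to (V.!) changes nothing else:

    {-# LANGUAGE BangPatterns #-}
    import qualified Data.Vector.Unboxed as V

    -- One bounds check before the loop; V.unsafeIndex skips the
    -- per-element check that (V.!) would perform.
    dot :: V.Vector Double -> V.Vector Double -> Double
    dot a b
      | V.length a /= V.length b = error "dot: length mismatch"
      | otherwise                = go 0 0
      where
        n = V.length a
        go !i !acc
          | i >= n    = acc
          | otherwise = go (i + 1)
                           (acc + V.unsafeIndex a i * V.unsafeIndex b i)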

Pidigits is normal Haskell (although you could argue the lack of spacing isn't idiomatic, but that hardly matters).

I stopped there since the rest fall further behind Rust, but looking them over, there's still nothing outrageous, just more low level ByteString usage. Overall none of them are that far from idiomatic, none of them do anything particularly crazy, and I think the speed gains from most of the optimizations they are doing are fairly small. I find it particularly odd that people act like the ability to write lower level code that is faster is a problem with Haskell, especially given that the "low level" Haskell versions are often still shorter and no harder to read than the "idiomatic" Rust versions.


>I'd hardly call it unidiomatic though, you just normally use "read" instead of "unsafeRead", etc.

Isn't "idiomatic" defined by what you "normally" use?


Does using a less commonly used function for its intended purpose make the code unidiomatic?


If the goal of using the less commonly used functions is performance, then it probably does.

I mean: it's not like the specific problem domain needed a "less commonly used function" (e.g. you need a function to do X, where X is something that you rarely need to perform).

Rather it's: "we need to code A, but we will use less commonly used functions instead of what we'd normally use for A, just to get more performance".

Plus, it's not like you merely exchange function f with g, while all other factors stay the same. The choice of those "less commonly used functions" also affects other aspects of the program's design (e.g. making it into a more imperative style, going for unsafe, opting for mutability, etc).

So the interesting question to call it idiomatic or not, for me, is:

"If ultimate performance wasn't a factor, would a Haskell programmer write this program in the same way?"


>Plus, it's not like you merely exchange function f with g, while all other factors stay the same

Yes, it is exactly like that. That was the point. It is literally changing out the name of a function. It has absolutely no effect on the rest of the code. Which is why the code is idiomatic by any reasonable definition.


If the only differences between the functions are the names and the performance characteristics, then why even bother having the slower versions? There has to be some difference somewhere, right?


Are you serious? Read the thread.


OK, so perhaps I'll grant you that unsafe indexing is idiomatic. The Rust code is still safer and more robust against buffer overflows in the long term. Also, isn't regex-dna using libpcre? That's a C library, no? Rust's regex library is pure Rust, with no unsafe code.

I love Haskell, but I still don't think it occupies the same space as Rust.


> I love Haskell, but I still don't think it occupies the same space as Rust.

My opinion is that Haskell can overlap, but something that occupies the same space as Rust would be Ivory[0] or ATS[1].

[0]: http://hackage.haskell.org/package/ivory

[1]: http://www.ats-lang.org/


>Also, isn't regex-dna using libpcre?

It is in virtually every language on the shootout. That's one of the reasons the shootout is so terrible. A regex benchmark doesn't tell you much of anything unless the task itself is to write the regex engine.

>I love Haskell, but I still don't think it occupies the same space as Rust.

Neither do I. I am simply correcting this weird meme that "Haskell is really slow and the shootout shows you have to write insane code to be fast", when the whole point of the shootout is to write the fastest code you can, the Haskell code is pretty ordinary and still significantly better than most other languages, and an "idiomatic" version is not 10 times slower, it is 10% slower.


> A regex benchmark doesn't tell you much of anything…

And yet, the programs don't all perform the same.

One regex task doesn't tell you much of anything because so much can be different with a different task. Then again, people are usually surprised by V8 and Irregexp.

GHC has been updated so many times since the last programs were contributed that I hope the code would look better if it were written to use the latest, greatest Haskell.


>And yet, the programs don't all perform the same.

Because they don't all use pcre. Duh?

>GHC has been updated so many times since the last programs were contributed that I hope the code would look better if it were written to use the latest, greatest Haskell.

Why on earth would the code change because of a new compiler release? Do people rewrite the C one every time a new GCC is released? And the code looks fine; what "better" do you want?


>>Because they don't all use pcre. Duh?<<

"Duh?" because you haven't looked at the code?

http://benchmarksgame.alioth.debian.org/u64q/program.php?tes...

http://benchmarksgame.alioth.debian.org/u64q/program.php?tes...

>>Why on earth would the code change because of a new compiler release?<<

New GHC releases (and libraries) have in the past provided new ways to write the code.

>>what "better" do you want?<<

Any "better" that 6 or 7 releases has provided.


>"Duh?" because you haven't looked at the code?

Me: "Not all use pcre" You: "Two that use pcre"

What on earth is that supposed to show?

>Any "better" that 6 or 7 releases has provided.

You have some weird ideas about haskell/ghc.


>>What on earth is that supposed to show?<<

Those two that both use pcre don't perform the same.

>>You have some weird ideas about haskell/ghc.<<

If I have, I got them from seeing programs being re-written when new stuff became available.


You got them from seeing someone write a faster version. The part where you thought it had anything to do with "new stuff" is entirely your own imagination.



