
Cross-compiling to different targets with the `create-exe` command is a very intriguing idea.

> In Wasmer 3.0 we used the power of Zig for doing cross-compilation from the C glue code into other machines.

> This made almost trivial to generate a [binary] for macOS from Linux (as an example).

> So by default, if you are cross-compiling we try to use zig cc instead of cc so we can easily cross compile from one machine to the other with no extra dependencies.

https://wasmer.io/posts/wasm-as-universal-binary-format-part...

> Using the wasmer compiler we compile all WASI packages published to WAPM to a native executable for all available platforms, so that you don't need to ship a complete WASM runtime to run your wasm files.

https://wasmer.io/posts/wasm-as-universal-binary-format-part...



One thing to note is that performance is still at the level of normal Wasmer execution, and thus inferior to natively compiled code in most instances.


> thus inferior to natively compiled code in most instances

Do you have a source for this claim? Using a JIT with AOT binaries (which is what this seems to be) can sometimes be very beneficial. It's like doing PGO without the manual work of doing PGO properly.

I'm sure the JIT can make poor decisions sometimes, but I would want to see a comprehensive set of benchmarks, like The Benchmarks Game and TechEmpower showing this. Benchmarks aren't always reflective of reality, but good ones are usually more insightful than random opinions.

The only Wasmer benchmarks I could readily find were from three years ago, and that was before Wasmer 1.0, let alone 3.0.

EDIT: Reading more... maybe `create-exe` doesn't include the JIT at all, so it is just making an AOT binary from the source WASM? In which case, benchmarks are just as necessary to understand how it affects performance compared to a normally-compiled native binary.


Here is an extremely unscientific benchmark using the Takeuchi function and Trealla Prolog:

   $ make wasm
   $ wasmer create-exe tpl.wasm -o tpl-wasm
   $ time ./tpl-wasm --consult < tak.pl -g 'time(run)' --ns
   '<https://josd.github.io/eye/ns#tak>'([34,13,8],13).
      % Time elapsed 1.39s
   real    0m1.403s
   user    0m1.333s
   sys     0m0.070s

   $ time tpl tak.pl -g 'time(run)' --ns
   '<https://josd.github.io/eye/ns#tak>'([34,13,8],13).
      % Time elapsed 0.448s
   real    0m0.473s
   user    0m0.463s
   sys     0m0.010s

WASM is about 3x slower. In general I've found it to be 2-3x slower, at least for my use cases. I also tried the LLVM compiler instead of Cranelift but it was slightly slower.


In my tests (simple Rust benchmarks) it was about 30% slower. A large part of the 1.4s is probably startup time, so it's not that bad either.


Oh yeah, it's definitely not bad. I'm quite happy with it. I imagine I'm hitting on some corner cases here that aren't really indicative of the average user. Thanks for the comparison.

My reading of the numbers is that the startup overhead is about 13ms in my example, quite fast. Shout out to Wizer for greatly reducing the startup time.
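For anyone who wants to check the arithmetic, here is a minimal sketch in C using only the timings quoted above. The function and constant names are made up for illustration; "outside the query" just means wall-clock time minus the elapsed time the program reports for itself, which is one way to read "startup overhead" and not necessarily the only one.

```c
/* Timings copied from the two runs quoted in this thread (seconds). */
static const double WASM_ELAPSED   = 1.39;   /* "% Time elapsed" from tpl-wasm */
static const double WASM_REAL      = 1.403;  /* "real" from time ./tpl-wasm    */
static const double NATIVE_ELAPSED = 0.448;  /* "% Time elapsed" from tpl      */
static const double NATIVE_REAL    = 0.473;  /* "real" from time tpl           */

/* How many times slower the create-exe binary ran the timed query. */
double slowdown(double wasm_s, double native_s) {
    return wasm_s / native_s;
}

/* Milliseconds spent outside the timed query (startup, loading, exit). */
double outside_query_ms(double real_s, double elapsed_s) {
    return (real_s - elapsed_s) * 1000.0;
}
```

slowdown(WASM_ELAPSED, NATIVE_ELAPSED) comes out at roughly 3.1, and outside_query_ms(WASM_REAL, WASM_ELAPSED) at roughly 13 ms. The same subtraction on the native run gives roughly 25 ms, so the gap between "real" and "elapsed" is small in both cases.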


Certainly interesting, but I'm not sure how well the performance of a prolog interpreter maps to other use cases.

That benchmark looks too short to be a useful measure of how a JIT influences performance; otherwise I would ask how regular Wasmer does in that benchmark too (since it seems like create-exe doesn't include a JIT).


Wasm also can't do computed goto for interpreters, so it's at a disadvantage for this benchmark in general
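For readers who haven't seen the two dispatch styles side by side, here is a minimal sketch in C. The three-instruction bytecode and the function names are hypothetical and have nothing to do with Trealla's actual code. The second function relies on the GNU C "labels as values" extension, which GCC and Clang accept but which is not standard C.

```c
#include <stddef.h>

/* Hypothetical three-instruction bytecode, for illustration only. */
enum { OP_INC, OP_DEC, OP_HALT };

/* Portable dispatch: a switch inside a loop. Every instruction goes
   back through the same dispatch point at the top of the loop. */
int run_switch(const unsigned char *code) {
    int acc = 0;
    for (size_t pc = 0;; pc++) {
        switch (code[pc]) {
        case OP_INC:  acc++; break;
        case OP_DEC:  acc--; break;
        case OP_HALT: return acc;
        default:      return acc;  /* unknown opcode: stop */
        }
    }
}

#if defined(__GNUC__)
/* Computed-goto dispatch: each handler jumps straight to the next
   handler through a table of label addresses. This sketch assumes the
   bytecode is valid and ends with OP_HALT; it does no bounds checks. */
int run_goto(const unsigned char *code) {
    static void *const targets[] = { &&op_inc, &&op_dec, &&op_halt };
    int acc = 0;
    size_t pc = 0;
#define DISPATCH() goto *targets[code[pc++]]
    DISPATCH();
op_inc:  acc++; DISPATCH();
op_dec:  acc--; DISPATCH();
op_halt: return acc;
#undef DISPATCH
}
#endif
```

Both functions compute the same result for the same program. The point of the second form is that each handler ends in its own indirect jump, which native CPUs can predict separately. Wasm only has structured control flow, so a compiler targeting it has to express that jump some other way, typically as a single table-driven branch.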


Is the interpreter used here built with computed goto?


Switch tables tend to be computed goto based


Is the interpreter we're talking about built this way?


Honestly I can't really say (I just ported it to WASM :-)), but here's more or less the start of the query loop: https://github.com/guregu/trealla/blob/main/src/query.c#L173...

I believe it's a bytecode but I know that function pointers are involved at least with the built-in predicates (see predicates.c).


Even if you write if...else a smart compiler will turn it into a switch table.


Could you show an example? A Godbolt link would be nice!
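Not a Godbolt link, but here is a minimal sketch of the kind of code in question, with made-up function names. The first function is an if/else chain over small, dense integer constants; the second is the same mapping written as a switch. Whether a given compiler actually lowers the chain to a jump table (or something else entirely) depends on the compiler, its version, and the optimization flags, so treat this as something to compile and inspect, not as proof either way.

```c
/* If/else chain comparing one value against consecutive constants. */
int apply_chain(int op, int x) {
    if (op == 0)      return x + 1;
    else if (op == 1) return x - 1;
    else if (op == 2) return x * 2;
    else if (op == 3) return x / 2;
    else if (op == 4) return -x;
    else              return x;
}

/* The same mapping written as a switch, for comparison. */
int apply_switch(int op, int x) {
    switch (op) {
    case 0:  return x + 1;
    case 1:  return x - 1;
    case 2:  return x * 2;
    case 3:  return x / 2;
    case 4:  return -x;
    default: return x;
    }
}
```

One way to check is to build both with optimizations on (for example `-O2 -S` with GCC or Clang) and compare the assembly generated for the two functions.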


I guess it's safe to assume that if they're not providing benchmarks claiming it's faster than native, then it's slower (and that's ok)


I don't commonly see mainstream programming languages, compilers, or runtimes providing comparative benchmarks for themselves these days, so I don't consider that a safe assumption at all. It would certainly be nice if Wasmer did provide some benchmarks.


There are lies, damned lies, statistics, and benchmarks, but I do take your point. It does give an idea, but is the main goal of the project speed or portability (genuine question)?


Portability is generally less useful the more performance it sacrifices.

I also think of WASM as a tool that could provide a useful security boundary.


Which should be mostly fine unless you are doing some particularly slow things or need hardware acceleration that is difficult to get via wasm. For lots of code, your CPU is going to be idling while it runs. Wasm is fine for that type of code. It will idle slightly less. You won't notice the difference.



