I love Swift very much, but every time I look at the disassembly view in Xcode while debugging, I lose faith in it bit by bit. With my (rather limited) knowledge of what a C or C++ program compiles into, I have some expectations of what I'll see in Swift's case, but the reality ends up being orders of magnitude more complex. Orders of magnitude is no exaggeration. For example, this:
(myObject as! SomeProtocol).someMethod()
translates into hundreds of executed instructions and a bunch of nested calls that somehow end up in objc_msgSend (!), even though none of the objects on that line have anything to do with NSObject. And that's not even counting ARC's atomic retain/release operations, etc.
For one thing, Swift is hardly ready for application domains like audio, video or games. No doubt it can make the development process so much faster and safer, but also less performant by exactly that amount. Swift is beautiful, surprisingly powerful and non-trivial (something you typically don't expect from a corporate language, given the examples of Java and C#), but the run-time costs of its power and beauty are a bit too high for my taste. A bit disappointing, to be honest.
> For one thing, Swift is hardly ready for application domains like audio, video or games. No doubt it can make the development process so much faster and safer, but also less performant by exactly that amount.
I've done quite a bit of experimentation with the performance characteristics of Swift, and I think that's a slight mischaracterization of the situation.
For instance, I built a toy data-driven ECS implementation in Swift to see just what kind of performance could be squeezed out of Swift, and it was possible to achieve quite impressive performance, more in the neighborhood of C/C++ than a managed language, especially when dipping into the unsafe portion of the language for critical sections.
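To make "dipping into the unsafe portion" concrete, here's a minimal sketch of the kind of layout I mean (the names are hypothetical, not from my actual project): components as plain structs in contiguous arrays, with the hot loop going through unsafe buffer pointers so there is no ARC traffic in the way:

```
struct Position { var x: Float = 0, y: Float = 0 }
struct Velocity { var dx: Float = 1, dy: Float = 1 }

var positions = ContiguousArray(repeating: Position(), count: 100_000)
let velocities = ContiguousArray(repeating: Velocity(), count: 100_000)

func integrate(dt: Float) {
    positions.withUnsafeMutableBufferPointer { pos in
        velocities.withUnsafeBufferPointer { vel in
            // Pure value types: no retain/release inside the hot loop.
            for i in 0..<pos.count {
                pos[i].x += vel[i].dx * dt
                pos[i].y += vel[i].dy * dt
            }
        }
    }
}
```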
But it's a double-edged sword: while it's possible to write high-performance Swift code, it's really only possible through profiling. I was hoping to discover a rules-based approach (i.e. a set of performance third rails to avoid), and while there were some takeaways, it was extremely difficult to predict what would incur a high performance penalty.
Currently it seems like the main limiting factor in Swift is ARC: it uses atomic operations to ensure thread-safe reference counts, and this, like any use of synchronization, is very expensive. The ARC penalty can be largely avoided by avoiding reference types, and there also seems to be a lot of potential for improving its performance as discussed in this thread:
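To illustrate the "avoid reference types" point with a toy (hypothetical) example: the same payload as a class vs. a struct behaves very differently in a hot loop, because only the class version has a reference count to maintain:

```
final class NodeClass { var value: Double = 0 }  // reference type, ARC-managed
struct NodeStruct { var value: Double = 0 }      // value type, no refcount

let classNodes = (0..<100_000).map { _ in NodeClass() }
let structNodes = [NodeStruct](repeating: NodeStruct(), count: 100_000)

// The class loop can emit atomic swift_retain/swift_release pairs per element
// (unless the optimizer manages to remove them); the struct loop cannot,
// because there is simply no reference count to touch.
func sumClasses() -> Double { classNodes.reduce(0) { $0 + $1.value } }
func sumStructs() -> Double { structNodes.reduce(0) { $0 + $1.value } }
```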
> Currently it seems like the main limiting factor in Swift is ARC: it uses atomic operations to ensure thread-safe reference counts
This is exactly what Rust avoids by having both Arc and plain-vanilla Rc. Plus reference counts are only updated when the ownership situation changes, not for any reads/writes to the object.
Rust also backs up this design with the Send and Sync traits, which statically prevent programmers from, say, accidentally sending an Rc<T> between threads when they really should have used an Arc<T> instead.
Now I'm curious: what is the difference between automatic ref counting and "vanilla" ref counting? And of these two, where does C++'s shared_ptr fit?
ARC as in "Atomic Reference Counting". ARC uses atomic operations to increment and decrement reference counts. That means these operations must be synchronized between threads. synchronization between threads/cores tends to be an expensive operation.
This is required when reference-counted objects are shared between threads. Otherwise, one thread might release an object at the same time another thread is trying to increment its reference count. It's just overkill for objects which are only ever referenced from a single thread.
It's an overloaded acronym to be sure. Atomic reference counting is a familiar concept in systems programming languages like C++ and Rust. It just so happens that Apple's automatic reference counting is also atomic.
It's a bit less confusing in practice - the full types are std::rc::Rc and std::sync::Arc (where std::sync is all the multithreading stuff, and you have to actually use that name to get access to Arc in your code), and both are well documented (including spelling out the acronym):
...I could see this causing merry hell when trying to do advanced interop between Swift and Rust, though, and it's admittedly probably going to be a minor stumbling block for Apple-first devs. (I managed to avoid confusion, but I just port to Apple targets; they're not my bread and butter.)
> GP: For one thing, Swift is hardly ready for application domains like audio, video or games.
> For instance, I built a toy data-driven ECS implementation in Swift to see just what kind of performance could be squeezed out of Swift, and it was possible to achieve quite impressive performance
I also have a pure-Swift ECS game engine [0] where I haven't had to worry about performance yet. It's meant to be 2D-only, and I haven't really put it to the test with truly complex 2D games - massive worlds with terrain deformation like Terraria (which was/is done in C#, if I'm not mistaken) or Lemmings - and in fact it's probably very sloppy. But I was surprised to see it handling 3000+ sprites on screen at 60 FPS, on an iPhone X.
- They were all distinct objects; SpriteKit sprites with GameplayKit components.
- Each entity was executing a couple components every frame.
- The components were checking other components in their entity to find the touch location and rotate their sprite towards it (see the sketch after this list).
- Everything was reference types with multiple levels of inheritance, including generics.
- It was all Swift code and Apple APIs.
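For the curious, that rotate-toward-touch logic was roughly of the following shape. This is a simplified sketch with made-up names (PointAtTouchComponent, SpriteComponent), not the engine's actual code:

```
import SpriteKit
import GameplayKit

// Owns the sprite node so sibling components can find it.
final class SpriteComponent: GKComponent {
    let node: SKSpriteNode
    init(node: SKSpriteNode) {
        self.node = node
        super.init()
    }
    required init?(coder: NSCoder) { fatalError("not implemented") }
}

// Per-frame component: rotates the entity's sprite towards the last touch.
final class PointAtTouchComponent: GKComponent {
    // Assumed to be updated elsewhere, e.g. from touchesMoved(_:with:).
    var touchLocation: CGPoint = .zero

    override func update(deltaTime seconds: TimeInterval) {
        guard let node = entity?.component(ofType: SpriteComponent.self)?.node else { return }
        let dx = touchLocation.x - node.position.x
        let dy = touchLocation.y - node.position.y
        node.zRotation = atan2(dy, dx)
    }
}
```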
Is that impressive? I'm a newb at all this, but given Swift's reputation for high overhead that's perpetuated by comments like GP's, I thought it was good enough for my current and planned purposes.
And performance can only improve as Swift becomes more efficient in future versions (as it previously has). If/when I ever run into a point where Swift is the problem, I could interop with ObjC/C/C++.
SwiftUI and Combine have also given me renewed hope for what can be achieved with pure Swift.
I actually spend more time fighting Apple's bugs than Swift performance issues. :)
> translates into hundreds of executed instructions
My guess is that this would also be true under Rust, as soon as you start using some pretty common facilities such as Rc and RefCell. (Swift does essentially the same things under the hood.)
That said, "hundreds of executed instructions" are literally not a concern with present-day hardware; the bottleneck is elsewhere, especially wrt. limited memory bandwidth (as we push frequencies and core counts higher, even on "low-range" hardware), so it's far more important to just use memory-efficient data representations, and avoid things like obligate GC whenever possible - and Rust is especially good at this.
> "hundreds of executed instructions" are literally not a concern with present-day hardware
Depends on the context. I have that line in a very tight loop in a CoreAudio callback that's executed on a high-priority thread. It should produce audio uninterrupted, as fast as possible, because the app also has a UI that should be kept responsive. The last thing I want to see is objc_msgSend() in that loop. Of course I know I will remove all protocols from that part of the app and lose some of the "beauty" but then what's the point of even writing this in Swift?
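For what it's worth, the shape of the problem and of the workaround looks roughly like this sketch (hypothetical names, not my actual code): keep the cast out of the per-sample loop, or better, take a generic parameter so the compiler can specialize and dispatch statically:

```
import Darwin  // sinf

protocol Oscillator { func next() -> Float }

final class Sine: Oscillator {
    private var phase: Float = 0
    private let increment: Float = 2 * .pi * 440 / 44_100
    func next() -> Float {
        defer { phase += increment }
        return sinf(phase)
    }
}

// The problematic shape: a dynamic cast plus a witness-table call per sample.
func renderSlow(_ source: Any, into buffer: UnsafeMutableBufferPointer<Float>) {
    for i in buffer.indices {
        buffer[i] = (source as! Oscillator).next()
    }
}

// The cheaper shape: once specialized for a concrete type, the call can be
// dispatched directly and even inlined.
func renderFast<O: Oscillator>(_ source: O, into buffer: UnsafeMutableBufferPointer<Float>) {
    for i in buffer.indices {
        buffer[i] = source.next()
    }
}
```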
For most applications Swift is good enough most of the time. No, it's excellent. I absolutely love how tidy and clever your Swift code can be. There may be a few things you wish were improved, but every language update brings some nice improvements, as if someone is reading your mind. The language is evolving and is very dynamic in this regard.
However, it is not a replacement for C or C++, as we were led to believe. And now that the linked article also explains the costs of ABI stability (even the simplest structs introduce indirection at dylib boundaries!), I realize I should rewrite my audio app in mixed Swift + C.
> Of course I know I will remove all protocols from that part of the app and lose some of the "beauty"
Protocols/traits/interfaces are just indirection - we all know that indirect calls are expensive. Fixing this need not be a loss in "beauty" if the language design makes direct calls idiomatic enough.
> And now that the linked article also explains the costs of ABI stability
I definitely agree about this, though. ABI stability, and especially ABI resilience, have big pitfalls if used by default, without a proper understanding of where the drawbacks can arise. They are nowhere near "zero cost"!
> Protocols/traits/interfaces are just indirection
They are indeed. Look at how C++ handles multiple inheritance, for example: literally a few extra instructions for each method call, not more than that. Swift's cost for a protocol method call and for typecasting seems too high in comparison, and I haven't even tried this across dylibs yet.
> literally a few extra instructions for each method call, not more than that.
Yup, C++ does this by building lightweight RTTI info into the vtable. Swift expands on this trick by using broadly similar RTTI info to basically reverse excess monomorphization of generic code. (Rust could be made to support very similar things, but this would require some work on fancy type-system features, e.g. const generics, trait-associated constants, etc.)
>That said, "hundreds of executed instructions" are literally not a concern with present-day hardware
People really need to stop saying this and stop accepting it as a "truth". It only applies in _some_ applications, and even there it stops applying once you want to do it many times over and over again.
It is "truth" in many cases. On a reasonably high-frequency, high-core count chip, instructions are almost free once you've managed to saturate your memory bandwidth. (Of course, that assumes that the code itself is "hot" enough that it's in cache somewhere, but this is the common case.)
The `objc_msgSend()` call you're observing in this case is likely a call to NSObject's `conformsToProtocol:` [1]. Are you absolutely certain that NSObject is not involved anywhere in your class hierarchy?
> For one thing, Swift is hardly ready for application domains like audio, video or games.
I am curious: is the above based purely on what you said first (the hundreds of generated instructions), or do you have some other evidence for it?
I know nothing about audio processing, but isn't the bulk of the work done inside the highly optimized Core Audio libs, such that Swift wouldn't have a big impact here?
I am pretty sure SpriteKit/SceneKit/ARKit work fine with Swift.
And as a counterexample: the most widely used platform for mobile games is Unity, and most Unity games implement their important stuff in fully managed C# (which, among many other performance issues, has a fairly intrusive garbage collector).
Yeah, there's a move away from C# towards a Burst-compiled unmanaged subset, but it hasn't happened yet. And yes, Unity itself is C++, but all your game code is still in Mono/C#, and calling into the engine doesn't make all that go away. There are still plenty of tight loops in managed code.
In short - a lot of mobile game developers seem happy to sacrifice bare metal performance if they get something back in return.
As long as you use the standard AU components, which themselves are written in C, you should be fine. However, just one step outside the standard functionality - e.g. you want to process or generate the audio sample stream yourself in Swift - is where it can become troublesome. I profiled my audio processing loops and saw the bottlenecks in some runtime library functions that deal with Swift protocols. Like I said in the other comment, I will remove protocols from that part of my code and lose much of its "swifty-ness", but then why would I even write it in Swift?
I think another way to view this is: while Swift can be performant, code that is idiomatic in practically any other language may be utterly wrong in Swift.
```
var many_sets: [String: Set<Int>] = ["a": []]
var tmp_set = many_sets["a"]!  // copies the Set out of the dictionary
tmp_set.insert(1)              // mutates the copy, not the stored value
many_sets["a"] = tmp_set       // copies it back
```
vs.
```
var many_sets: [String: Set<Int>] = ["a": []]
many_sets["a"]?.insert(1)  // mutates the stored Set in place
```
The performance is entirely different (you are making a full copy of the Set in the first example). Prior to Swift 5, you would potentially have had to remove the set from the dictionary first in order to make sure there were no unintentional copies.
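For reference, here is what both idioms look like (a sketch; the `default:` subscript form is SE-0165 and, to my understanding, mutates the stored value in place):

```
var many_sets: [String: Set<Int>] = ["a": []]

// Swift 4.2/5+: mutate the stored Set in place through the subscript;
// the `default:` form also sidesteps the optional.
many_sets["a", default: []].insert(1)

// The older workaround: take the value out so its buffer is uniquely
// referenced, mutate it, and put it back.
if var s = many_sets.removeValue(forKey: "a") {
    s.insert(2)
    many_sets["a"] = s
}
```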
While the examples are contrived to some degree, I think at least a few new Swift programmers would look up something in a dictionary, pass the value into a function thinking it's a reference, and then, when they realize it isn't being changed in the dictionary, set the value in the dictionary after the function returns, like:
```
// Hypothetical stand-in for a function that returns a modified copy:
func process(set: Set<Int>) -> Set<Int> { var s = set; s.insert(1); return s }

var many_sets: [String: Set<Int>] = ["a": []]
let changed_set = process(set: many_sets["a"]!)  // the Set is passed by value
many_sets["a"] = changed_set                     // write the modified copy back
```
It is "easy" to understand what is happening when you know Swift's collections are value types and about copy on write and value vs reference semantics, but it is also an easy performance issue.
Furthermore, when web framework benchmarks like https://www.techempower.com/benchmarks/#section=data-r18&hw=... show Java's Netty vs. Swift NIO (which is based on the architecture of Netty), I think it indicates that you cannot just port code and expect anywhere near the same performance in Swift.
Yes, collections within collections (or any structs, for that matter) are another thing in Swift that you ignore at first, until you discover some side effects and realize how horribly inefficient your code might have been so far. But to be fair, you are not protected from similar inefficiencies even in C, where structs can be copied without you fully realizing the associated costs, especially if the struct was declared by someone else, not you. And I like how C++ protects itself in this area: define the constructors that you think are most suitable for the type.
I really wish Swift moved a notch towards C++ in some areas, especially in letting the designer of a type define its usage in very precise terms. Is it copyable? Is this method abstract? Maybe also struct deinit, etc.
It does seem to me to be moving in that direction - not at the type level, but at the member level.
Property wrappers already allow some interesting possibilities for customizing the storage and usage of particular member variables, and there was a thread today about exposing the memory locations of reference-type members, which would unlock a lot of optimization opportunities:
I'm not sure whether Swift can ever really get there with respect to performance, given the foundational decisions regarding ARC and copy-on-write. But I would love a language with Swift's type system and sensibilities and a bit more control over how memory is handled.
For future reference, code blocks here on HN use Markdown syntax (that is, you indent them four spaces), not GitHub-Flavored Markdown syntax (triple backquotes).
FYI, that's a pretty expensive line of code. The runtime has to search myObject's type metadata and protocol conformance records to find the conformance, then it has to create a new SomeProtocol existential container, copy myObject into the new container (potentially incurring retain/release traffic), use a witness table to dynamically call the method, and finally destroy the existential container. Dynamic casts are slow; if you can restructure your code to avoid the cast, then it won't have to do a bunch of that extra work.
Yes, even with all that in mind, the complexity I see in the generated code still far exceeds my intuitive expectations. Of course I'll end up removing protocols from the critical parts of my code, but like I said in the other comments, then what's the point of writing those parts in Swift? Protocols are a core part of the language: they are offered as a substitute for multiple inheritance and even as the only idiomatic way to enforce abstract methods; they are elegant and look cheap, except they are not!
The really expensive part here is not the use of a protocol, it’s the downcast (which isn’t really idiomatic Swift). Static dispatch is always faster than dynamic dispatch/polymorphism, but protocols are usually reasonably efficient (even more so if you can use generics instead of existentials).
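Concretely, the restructuring might look something like this (SomeProtocol and someMethod as in the original line; the rest is illustrative):

```
protocol SomeProtocol { func someMethod() }

// Expensive: runtime conformance lookup, existential box, witness-table call.
func callViaCast(_ myObject: Any) {
    (myObject as! SomeProtocol).someMethod()
}

// Cheaper: the conformance is known at compile time; after specialization
// the call can be dispatched directly, with no box and no runtime lookup.
func callViaGeneric<T: SomeProtocol>(_ myObject: T) {
    myObject.someMethod()
}
```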