High Performance Numeric Programming with Swift: Explorations and Reflections

Author

Jeremy Howard

Published

January 10, 2019

Over the past few weeks I’ve been working on building some numeric programming libraries for Swift. But wait, isn’t Swift just what iOS programmers use for building apps? Not any more! Nowadays Swift runs on Linux and Mac, and can be used for web applications, command line tools, and nearly anything else you can think of.

Using Swift for numeric programming, such as training machine learning models, is not an area that many people are working on. There’s very little information around on the topic. But after a few weeks of research and experimentation I’ve managed to create a couple of libraries that can achieve the same speed as carefully optimized vectorized C code, whilst being concise and easy to use. In this article, I’ll take you through this journey and show you what I’ve learned about how to use Swift effectively for numeric programming. I will include examples mainly from my BaseMath library, which provides generic math functions for Float and Double, and optimized versions for various collections of them. (Along the way, I’ll have plenty to say, both positive and negative, about both Swift and other languages; if you’re someone who has a deep emotional connection to your favorite programming language and doesn’t like to see any criticism of it, you might want to skip this post!)

In a future post I’ll also show how to get additional speed and functionality by interfacing with Intel’s Performance Libraries for C.

Background

Generally around the new year I try to experiment with a new language or framework. One approach that’s worked particularly well for me is to look at what the people that built some of my favorite languages, books, and libraries are doing now. This approach led me to being a very early user of Delphi, Typescript, and C# (Anders Hejlsberg, after I used his Turbo Pascal), Perl (Larry Wall, after I used rn), JQuery (John Resig, after I read Modern Javascript), and more. So when I learnt that Chris Lattner (who wrote the wonderful LLVM) is creating a new deep learning framework called Swift for Tensorflow (which I’ll shorten to S4TF from here), I decided that I should take a look.

Note that S4TF is not just a boring Swift wrapper for Tensorflow! It’s the first serious effort I’ve seen to incorporate differentiable programming deep into the heart of a widely used language. I’m hoping that S4TF will give us a language and framework that, for the first time, treats differentiable programming as a first-class citizen of the programming world, and will allow us to do things like:

  • Write custom GPU kernels in Swift
  • Provide compile-time checks for named tensor axis name and size matching
  • Differentiate any arbitrary code, whilst also providing vectorized and fused implementations automatically.

These things are not available in S4TF, at least as yet (in fact, it’s such early days for the project that almost none of the deep learning functionality works yet). But I fully expect them to happen eventually, and when that happens, I’m confident that differentiable programming will be a far better experience in Swift than in any other language.

I was lucky enough to bump into Chris at a recent conference, and when I told him about my interest in S4TF, he was kind enough to offer to help me get started with Swift. I’ve always found that who I work with matters much more to my productivity and happiness than what I work on, so that was another excellent reason to spend time on this project. Chris has been terrifically helpful, and he’s super-nice as well, so thanks, Chris!

About Swift

Swift is a general-purpose, multi-paradigm, compiled programming language. It was started by Chris Lattner while he was at Apple, and supported many concepts from Objective-C (the main language used for programming for Apple devices). Chris described the language to me as “syntax sugar for LLVM”, since it maps so closely to many of the ideas in that compiler framework.

I’ve been coding for around 30 years, and in that time have used dozens of languages (and have even contributed to some). I always hope that when I start looking at a new language there will be some mind-opening new ideas to find, and Swift definitely doesn’t disappoint. Swift tries to be expressive, flexible, concise, safe, easy to use, and fast. Most languages compromise significantly in at least one of these areas. Here’s my personal view of some languages that I’ve used and enjoyed, but all of which have limitations I’ve found frustrating at times:

  • Python: Slow at runtime, poor support for parallel processing (but very easy to use)
  • C, C++: hard to use (and C++ is slow at compile time), but fast and (for C++) expressive
  • Javascript: Unsafe (unless you use Typescript); somewhat slow (but easy to use and flexible)
  • Julia: Poor support for general purpose programming, but fast and expressive for numeric programming. (Edit: this may be a bit unfair to Julia; it’s come a long way since I last looked at it!)
  • Java: verbose (but getting better, particularly if you use Kotlin), less flexible (due to JVM issues), somewhat slow (but overall a language that has many useful application areas)
  • C# and F#: perhaps the fewest compromises of any major programming language, but still requires installation of a runtime, limited flexibility due to garbage collection, and difficulties making code really fast (except on Windows, where you can interface via C++/CLI)

I’d say that Swift actually does a pretty good job of avoiding any major compromises (possibly Rust does too; I haven’t used it seriously so can’t make an informed comment). It’s not the best at any of the areas I’ve mentioned, but it’s not too far off either. I don’t know of another single language that can make that claim (but note that it also has its downsides, which I’ll address in the last section of this article). I’ll look briefly at each in turn:

  • Concise: Here’s how to create a new array b that adds 2 to every element of a: let b=a.map {$0+2}. Here, {$0+2} is an anonymous function, with $0 being the automatic name for the first parameter (you can optionally add names and types if you like). The type of b is inferred automatically. As you can see, there’s a lot we’ve done with just a small amount of code!
  • Expressive: The above line of code works not just for arrays, but for any object that supports certain operations (as defined by Sequence in the Swift standard library). You can also add support for Sequence to any of your objects, and even add it to existing Swift types or types in other libraries. As soon as you do so, those objects will get this functionality for free.
  • Flexible: There’s not much that Swift can’t do. You can use it for mobile apps, desktop apps, server code, and even systems programming. It works well for parallel computing, and also can handle (somewhat) small-memory devices.
  • Safe: Swift has a fairly strong type system, which does a good job of noticing when you’ve done something that’s not going to work. It has good support for optional values, without making your code verbose. But when you need extra speed or flexibility, there are generally ways to bypass Swift’s checks.
  • Fast: Swift avoids the things that can make a language slow; e.g. it doesn’t use garbage collection, allows you to use value types nearly everywhere, and minimizes the need for locking. It uses LLVM behind the scenes, which has excellent support for creating optimized machine code. Swift also makes it easy for the compiler to know when things are immutable, and avoids aliasing, which also helps the compiler optimize. As you’ll see, you can often get the same performance as carefully optimized C code.
  • Easy to use: This is the one area where there is, perhaps, a bit of a compromise. It’s quite easy to write basic Swift programs, but there can be obscure type issues that pop up, mysterious error messages a long way from the real site where the problem occurred, and installing and distributing applications and libraries can be challenging. Also, the language has been changing a lot (for the better!) so most information online is out of date and requires changes to make it work. Having said all that, it’s far easier to use than something like C++.
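To make the "Concise" and "Expressive" points concrete, here is a minimal sketch. The `Countdown` type is made up for illustration (it is not part of BaseMath or the standard library); it only exists to show that conforming to `Sequence` is enough to pick up `map`.

```swift
// Concise: closure shorthand plus type inference.
let a = [1, 2, 3]
let b = a.map { $0 + 2 }          // b is inferred as [Int]

// Expressive: a hypothetical type that conforms to Sequence.
// Conforming to IteratorProtocol as well means we only have to write next().
struct Countdown: Sequence, IteratorProtocol {
  var current: Int
  mutating func next() -> Int? {
    guard current > 0 else { return nil }
    defer { current -= 1 }
    return current
  }
}

// map is not written anywhere above; it comes from the Sequence protocol.
let c = Countdown(current: 3).map { $0 + 2 }
print(b, c)   // [3, 4, 5] [5, 4, 3]
```

The same closure works unchanged on both the array and the custom sequence, which is the point of the "Expressive" bullet.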

Protocol-oriented programming

The main trick that lets Swift avoid compromises is its use of Protocol-oriented programming. The basic idea is that we try to use value types as much as possible. In most languages where ease-of-use is important, reference types are widely used since they allow the use of garbage collection, virtual functions, overriding super-class behavior, and so forth. Protocol-oriented programming is Swift’s way of getting many of these benefits, whilst avoiding the overhead of reference types. In addition, by avoiding reference types, we avoid all the complex bugs introduced when we have two variables pointing at the same thing.

Value types are also a great match for functional styles of programming, since they allow for better support of immutability and related functional concerns. Many programmers, particularly in the Javascript world, have recently developed an understanding of how code can be more concise, understandable, and correct, by leveraging a functional style.

If you’ve used a language like C#, you’ll already be familiar with the idea that defining something with struct gives you a value type, and using class gives you a reference type. This is exactly how Swift handles things too.
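A small sketch of the difference, using two made-up types (`PointValue` and `PointRef` are illustrative names only): copying a struct gives an independent value, while copying a class reference gives a second name for the same object.

```swift
struct PointValue { var x: Double }      // value type

final class PointRef {                   // reference type
  var x: Double
  init(x: Double) { self.x = x }
}

let v1 = PointValue(x: 1)
var v2 = v1        // independent copy of the value
v2.x = 10          // does not affect v1

let r1 = PointRef(x: 1)
let r2 = r1        // second reference to the same object
r2.x = 10          // visible through r1 as well

print(v1.x, v2.x)  // 1.0 10.0
print(r1.x, r2.x)  // 10.0 10.0
```

The second `print` is exactly the "two variables pointing at the same thing" situation that value types avoid.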

Before we get to protocols, let’s mention a couple of other fundamentals: Automatic Reference Counting (ARC), and copy-on-write.

Automatic Reference Counting (ARC)

From the docs: “Swift uses Automatic Reference Counting (ARC) to track and manage your app’s memory usage. In most cases, this means that memory management “just works” in Swift, and you do not need to think about memory management yourself. ARC automatically frees up the memory used by class instances when those instances are no longer needed.” Reference counting has traditionally been used by dynamic languages like Perl and Python. Seeing it in a modern compiled language is quite unusual. However, Swift’s compiler works hard to track references carefully, without introducing overhead.

ARC is important both for handling Swift’s reference types (which we still need to use sometimes), and also to handle memory use in value type objects sharing memory with copy-on-write semantics, or which are embedded in a reference type. Chris also mentioned to me a number of other benefits: it provides deterministic destruction, eliminates the common problems with GC finalizers, allows scaling down to systems that can’t have (or don’t want) a GC, and eliminates unpredictable/unreproducible pauses.
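The "deterministic destruction" point can be seen in a few lines. This is only a sketch: `Resource` and the `events` array are hypothetical names used to record the order in which things happen.

```swift
// Records the order of events so we can see exactly when deinit runs.
var events: [String] = []

final class Resource {
  let name: String
  init(name: String) {
    self.name = name
    events.append("init \(name)")
  }
  deinit { events.append("deinit \(name)") }
}

var first: Resource? = Resource(name: "buffer")
var second = first          // two references to the same instance
first = nil                 // still alive: `second` holds a reference
events.append("first cleared")
second = nil                // last reference gone, so deinit runs right here
events.append("second cleared")

print(events)
// ["init buffer", "first cleared", "deinit buffer", "second cleared"]
```

The object is destroyed at the moment the last reference goes away, not at some later collection pass.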

Copy-on-write

One major problem with value types in most languages is that if you have something like a big array, you wouldn’t want to pass the whole thing to a function, since that would require a lot of slow memory allocation and copying. So most languages use a pointer or reference in this situation. Swift, however, passes a reference to the original memory, and only if the callee mutates the object does it get copied (this is done behind the scenes automatically). So we get the best performance characteristics of value and reference types combined! This is referred to as “copy-on-write”, which is rather delightfully referred to in some S4TF docs as “COW 🐮” (yes, with the cow face emoji too!)

COW also helps with programming in a functional style while still allowing mutation when needed, without the overhead of unnecessary copying or the verbosity of manual references.
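Swift’s built-in collections do this for you, but the mechanism is easy to sketch by hand. The example below is an illustration, not how the standard library is actually implemented: `Storage` and `CowInt` are hypothetical names, and the only library call is the standard `isKnownUniquelyReferenced`.

```swift
// Reference-typed backing storage shared between copies.
final class Storage {
  var value: Int
  init(_ value: Int) { self.value = value }
}

// A value type that copies its storage only on the first write after sharing.
struct CowInt {
  private var storage: Storage
  init(_ value: Int) { storage = Storage(value) }

  var value: Int {
    get { return storage.value }
    set {
      if isKnownUniquelyReferenced(&storage) {
        storage.value = newValue          // sole owner: mutate in place
      } else {
        storage = Storage(newValue)       // shared: make our own copy first
      }
    }
  }

  // Exposed only so the demo can observe sharing.
  func sharesStorage(with other: CowInt) -> Bool {
    return storage === other.storage
  }
}

let x = CowInt(1)
var y = x                                      // no copy yet
let sharedBefore = x.sharesStorage(with: y)
y.value = 2                                    // the write triggers the copy
let sharedAfter = x.sharesStorage(with: y)
print(sharedBefore, sharedAfter, x.value, y.value)   // true false 1 2
```

Assignment is cheap because it only shares a reference; the cost of copying is paid only by code that actually mutates.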

Protocols

With value types, we can’t use inheritance hierarchies to get the benefits of object-oriented programming (although you can still use these if you use reference types, which are also supported by Swift). So instead, Swift gives us protocols. Many languages, such as Typescript, C#, and Java, have the idea of interfaces—metadata which describes what properties and methods an object can contain. At first glance, protocols look a lot like interfaces. For instance, here’s the definition from my BaseMath library of ComposedStorage, which is a protocol describing a collection that wraps some other collection. It defines two properties, data and endIndex, and one method, subscript (which is a special method in Swift, and provides indexing functionality, like an array). This protocol definition simply says that anything that conforms to this protocol must provide implementations of these three things.

public protocol ComposedStorage {
  associatedtype Storage:MutableCollection where Storage.Index==Int
  typealias Index=Int

  var data: Storage {get set}
  var endIndex: Int {get}
  subscript(i: Int)->Storage.Element {get set}
}

This is a generic protocol. Generic protocols don’t use <Type> markers like generic classes, but instead use the associatedtype keyword. So in this case, ComposedStorage is saying that the data attribute contains something of a generic type called Storage which conforms to the MutableCollection protocol, and that type in turn has an associatedtype called Index which must be of type Int in order to conform to ComposedStorage. It also says that the subscript method returns whatever type the Storage’s Element associatedtype contains. As you can see, protocols provide quite an expressive type system.

Now look further, and you’ll see something else… there are also implementations provided for this protocol!

public extension ComposedStorage {
  subscript(i: Int)->Storage.Element {
    get { return data[i]     }
    set { data[i] = newValue }
  }
  var endIndex: Int {
    return data.count
  }
}

This is where things get really interesting. By providing implementations, we’re automatically adding functionality to any object that conforms to this protocol. For instance, here is the entire definition from BaseMath of AlignedStorage, a class that provides array-like functionality but internally uses aligned memory, which is often required for fast vectorized code:

public class AlignedStorage<T:SupportsBasicMath>: BaseVector, ComposedStorage {
  public typealias Element=T
  public var data: UnsafeMutableBufferPointer<T>

  public required init(_ data: UnsafeMutableBufferPointer<T>) {self.data=data}
  public required convenience init(_ count: Int)      { self.init(UnsafeMutableBufferPointer(count)) }
  public required convenience init(_ array: Array<T>) { self.init(UnsafeMutableBufferPointer(array)) }

  deinit { UnsafeMutableRawBufferPointer(data).deallocate() }

  public var p: MutPtrT {get {return data.p}}
  public func copy()->Self { return .init(data.copy()) }
}

As you can see, there’s not much code at all. And yet this class provides all of the functionality of the protocols RandomAccessCollection, MutableCollection, ExpressibleByArrayLiteral, Equatable, and BaseVector (which together include hundreds of methods such as map, find, dropLast, and distance). This is possible because the protocols that this class conforms to, BaseVector and ComposedStorage, provide this functionality through protocol extensions (either directly, or by other protocols that they in turn conform to).
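To see the mechanism in isolation, here is a cut-down sketch. `ToyStorage` is a simplified stand-in for ComposedStorage and `Samples` is a made-up conforming type; neither is BaseMath code. The conforming type declares only `data`, and the subscript and `endIndex` come from the protocol extension.

```swift
// Simplified, hypothetical stand-in for a ComposedStorage-style protocol.
protocol ToyStorage {
  associatedtype Storage: MutableCollection where Storage.Index == Int
  var data: Storage { get set }
}

// Implementations shared by every conforming type.
extension ToyStorage {
  subscript(i: Int) -> Storage.Element {
    get { return data[i] }
    set { data[i] = newValue }
  }
  var endIndex: Int { return data.count }
}

// The conforming type only has to supply `data`.
struct Samples: ToyStorage {
  var data: [Double]
}

var s = Samples(data: [1, 2, 3])
s[1] = 20
print(s[1], s.endIndex)   // 20.0 3
```

Note that `Storage` is never spelled out in `Samples`; the compiler infers it from the type of `data`.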

Incidentally, you may have noticed that I defined AlignedStorage as class, not struct, despite all my earlier hype about value types! It’s important to realize that there are still some situations where classes are required. Apple’s documentation provides some helpful guidance on this topic. One thing that structs don’t (yet) support is deinit; that is, the ability to run some code when an object is destroyed. In this case, we need to deallocate our memory when we’re all done with our object, so we need deinit, which means we need a class.

One common situation where you’ll find you really need to use protocols is where you want the behavior of abstract classes. Swift doesn’t support abstract classes at all, but you can get the same effect by using protocols (e.g. in the above code ComposedStorage defines data but doesn’t implement it in the protocol extension, therefore it acts like an abstract property). The same is true of multiple inheritance: it’s not supported by Swift classes, but you can conform to multiple protocols, each of which can have extensions (this is sometimes referred to as mixins in Swift). Protocol extensions share a lot of ideas with traits in Rust and typeclasses in Haskell.
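A short sketch of both ideas, using invented names (`Named`, `Scalable`, and `Shape` are for illustration only): each protocol leaves one property "abstract" and supplies behavior built on it, and a single struct mixes in both.

```swift
// `name` is required but not implemented here, so it acts like an abstract property.
protocol Named { var name: String { get } }
extension Named {
  func greeting() -> String { return "Hello, \(name)" }
}

// A second, unrelated protocol with its own default behavior.
protocol Scalable { var size: Double { get set } }
extension Scalable {
  mutating func scale(by k: Double) { size *= k }
}

// One value type picks up behavior from both protocols.
struct Shape: Named, Scalable {
  let name: String
  var size: Double
}

var sq = Shape(name: "square", size: 2)
sq.scale(by: 3)
print(sq.greeting(), sq.size)   // Hello, square 6.0
```

If `Shape` forgot to declare `name` or `size`, the compiler would reject the conformance, which is the same guarantee an abstract base class would give.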

Generics over Float and Double

For numeric programming, if you’re creating a library then you probably want it to transparently support at least Float and Double. However, Swift doesn’t make this easy. There is a protocol called BinaryFloatingPoint which in theory supports these types, but unfortunately only three math functions in Swift are defined for this protocol (abs, max, and min), along with the standard math operators (+, -, *, /).
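As a sketch of what does work out of the box, the function below is generic over BinaryFloatingPoint and uses only the operators and abs. The name `meanAbs` is made up for this example.

```swift
// Generic over Float and Double using only operators and abs.
func meanAbs<T: BinaryFloatingPoint>(_ xs: [T]) -> T {
  var total: T = 0
  for x in xs { total += abs(x) }
  return total / T(xs.count)     // FloatingPoint provides an init from Int
}

let f: Float = meanAbs([-1, 2, -3])
let d: Double = meanAbs([-1, 2, -3])
print(f, d)   // 2.0 2.0
```

Replace `abs(x)` with a call to a function such as sin or log2 on `x` and the generic version no longer compiles, which is the gap described above.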

You could, of course, simply provide separate functionality for each type, but then you’ve got to deal with creating two versions of everything, and your users have to deal with the same problem too. Interestingly enough, I’ve found no discussions of this issue online, and Swift’s own libraries suffer from this issue in multiple places. As discussed below, Swift hasn’t been used much at all for numeric programming, and these are the kinds of issues we have to deal with. BTW, if you search for numerical programming code online, you will often see the use of the CGFloat type (which suffers from Objective-C’s naming conventions and limitations, which we’ll learn more about later), but that only provides support for one of float or double (depending on the system you’re running on). The fact that CGFloat exists at all in the Linux port of Swift is rather odd, because it was only created for Apple-specific compatibility reasons; it is almost certainly not something you’ll be wanting to use.

Resolving this problem is actually fairly straightforward, and is a good example of how to use protocols. In BaseMath I’ve created the SupportsBasicMath protocol, which is extracted below:

public protocol SupportsBasicMath:BinaryFloatingPoint {
  func log2() -> Self
  func logb() -> Self
  func nearbyint() -> Self
  func rint() -> Self
  func sin() -> Self

}

Then we tell Swift that Float conforms to this protocol, and we also provide implementations for the methods:

extension Float : SupportsBasicMath {
  @inlinable public func log2() -> Float {return Foundation.log2(self)}
  @inlinable public func logb() -> Float {return Foundation.logb(self)}
  @inlinable public func nearbyint() -> Float {return Foundation.nearbyint(self)}
  @inlinable public func rint() -> Float {return Foundation.rint(self)}
  @inlinable public func sin() -> Float {return Foundation.sin(self)}

}

Now in our library code we can simply use SupportsBasicMath as a constraint on a generic type, and we can call all the usual math functions directly. (Swift already provides support for the basic math operators in a transparent way, so we don’t have to do anything to make that work.)

If you’re thinking that it must have been a major pain to write all those wrapper functions, then don’t worry—there’s a handy trick I used that meant the computer did it for me. The trick is to use gyb templates to auto-generate the methods using python code, like so:

% for f in binfs:
  func ${f}(_ b: Self) -> Self
% end # f

If you look at the Swift code base itself, you’ll see that this trick is used liberally, for example to define the basic math functions themselves. Hopefully in some future version we’ll see generic math functions in the standard library. In the meantime, just use SupportsBasicMath from BaseMath.
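Here is a self-contained sketch of the whole pattern with a single function. `ToyMath` is a hypothetical, cut-down protocol in the spirit of SupportsBasicMath (it is not the BaseMath definition), and `sumOfSines` is an invented example of library code written once for both types.

```swift
import Foundation

// Hypothetical one-function version of the protocol idea.
protocol ToyMath: BinaryFloatingPoint {
  func sin() -> Self
}

// Each concrete type forwards to the matching Foundation overload.
extension Float: ToyMath {
  func sin() -> Float { return Foundation.sin(self) }
}
extension Double: ToyMath {
  func sin() -> Double { return Foundation.sin(self) }
}

// Library code is written once, against the protocol.
func sumOfSines<T: ToyMath>(_ xs: [T]) -> T {
  return xs.reduce(0) { $0 + $1.sin() }
}

let sf: Float = sumOfSines([0, 0.5])
let sd: Double = sumOfSines([0, 0.5])
print(sf, sd)
```

The module-qualified call `Foundation.sin(self)` matters: inside a method that is itself named `sin`, an unqualified call would resolve to the method rather than the free function.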

Performance tricks and results

One of the really cool things about Swift is that wrappers like the above have no run-time overhead. As you see, I’ve marked them with the inlinable attribute, which tells LLVM that it’s OK to replace calls to this function with the actual function body. This kind of zero-overhead abstraction is one of the most important features of C++; it’s really amazing to see it in such a concise and expressive language as Swift.

Let’s do some experiments to see how this works, by running a simple benchmark: adding 2.0 to every element of an array of 1,000,000 floats in Swift. Assuming we’ve already allocated an array of appropriate size, we can use this code (note: benchmark is a simple function in BaseMath that times a block of code):

benchmark(title:"swift add") { for i in 0..<ar1.count {ar2[i]=ar1[i]+2.0} }
> swift add: .963 ms

Doing a million floating point additions in a millisecond is pretty impressive! But look what happens if we try one minor tweak:

benchmark(title:"swift ptr add") {
  let (p1,p2) = (ar1.p,ar2.p)
  for i in 0..<ar1.count {p2[i]=p1[i]+2.0}
}
> swift ptr add: .487 ms

It’s nearly the same code, yet twice as fast - so what happened there? BaseMath adds the p property to Array, which returns a pointer to the array’s memory; so the above code is using a pointer, instead of the array object itself. Normally, because Swift has to handle the complexities of COW, it can’t fully optimize a loop like this. But by using a pointer instead, we skip those checks, and Swift can run the code at full speed. Note that due to copy-on-write it’s possible for the array to move if you assign to it, and it can also move if you do things such as resize it; therefore, you should only grab the pointer at the time you need it.
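If you are not using BaseMath, the standard library offers a scoped way to do the same thing. The sketch below uses `withUnsafeBufferPointer` and `withUnsafeMutableBufferPointer`, which are standard Array methods; it is an alternative to the `p` property rather than a description of how BaseMath implements it, and I have not benchmarked it here.

```swift
let ar1 = [Float](repeating: 1.0, count: 1_000)
var ar2 = [Float](repeating: 0.0, count: 1_000)

// The pointers are only valid inside the closures, which enforces the advice
// to grab a pointer only at the time you need it.
ar1.withUnsafeBufferPointer { src in
  ar2.withUnsafeMutableBufferPointer { dst in
    for i in 0..<src.count { dst[i] = src[i] + 2.0 }
  }
}

print(ar2[0], ar2[999])   // 3.0 3.0
```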

The above code is still pretty clunky, but Swift makes it easy for us to provide an elegant and idiomatic interface. I added a new map method to Array, which puts the result into a preallocated array, instead of creating a new array. Here’s the definition of map (it uses some typealiases from BaseMath to make it a bit more concise):

@inlinable public func map<T:BaseVector>(_ f: UnaryF, _ dest: T) where Self.Element==T.Element {
  let pSrc = p; let pDest = dest.p; let n = count
  for i in 0..<n {pDest[i] = f(pSrc[i])}
}

As you can see, it’s plain Swift code. The cool thing is that this lets us now use this clear and concise code, and still get the same performance we saw before:

benchmark(title:"map add") { ar1.map({$0+2.0}, ar2) }
> map add: .518 ms

I think this is quite remarkable; we’ve been able to create a simple API which is just as fast as the pointer code, but to the class user that complexity is entirely hidden away. Of course, we don’t really know how fast this is yet, because we haven’t compared to C. So let’s do that next.
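The same idea can be sketched without any BaseMath types. `mapInto` below is a hypothetical helper written against plain arrays and standard library calls only; it illustrates hiding the pointer code behind a small API and is not the BaseMath `map` shown above.

```swift
extension Array {
  // Hypothetical helper: applies `f` to each element and writes into `dest`,
  // so no new array is allocated.
  func mapInto(_ dest: inout [Element], _ f: (Element) -> Element) {
    precondition(dest.count >= count, "destination too small")
    withUnsafeBufferPointer { src in
      dest.withUnsafeMutableBufferPointer { dst in
        for i in 0..<src.count { dst[i] = f(src[i]) }
      }
    }
  }
}

let input: [Float] = [1, 2, 3]
var output = [Float](repeating: 0, count: 3)
input.mapInto(&output) { $0 + 2.0 }
print(output)   // [3.0, 4.0, 5.0]
```

Callers see an ordinary method taking a closure; the unsafe pointers never leave the extension.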

Using C

One of the really nice things about Swift is how easy it is to add C code that you write, or use external C libraries. To use our own C code, we simply create a new package with Swift Package Manager (SPM), pop a .c file in its Sources directory, and a .h file in its Sources/include directory. (Oh and BTW, in BaseMath that .h file is entirely auto-generated from the .c file using gyb!) This level of C integration is extremely rare, and the implications are huge. It means that every C library out there, including all the functionality built in to your operating system, optimized math libraries, Tensorflow’s underlying C API, and so forth can all be accessed from Swift directly. And if you for any reason need to drop in to C yourself, then you can, without any manual interfacing code or any extra build step.

Here’s our add function in C (this is the float version; the double version is similar, and the two are generated from a single gyb template):

void smAdd_float(const float* pSrc, const float val, float* pDst, const int len) {
  for (int i=0; i<len; ++i) { pDst[i] = pSrc[i]+val; }
}

To call this, we need to pass in the count as an Int32; BaseMath adds the c property to arrays for this purpose (alternatively you could simply use numericCast(ar1.count)). Here’s the result:

benchmark(title:"C add") {smAdd_float(ar1.p, 2.0, ar2.p, ar1.c)}
> C add: .488 ms

It’s basically the same speed as Swift. This is a very encouraging result, because it shows that we can get the same performance as optimized C using Swift. And not just any Swift, but idiomatic and concise Swift, which (thanks to methods like reduce and map) can look much closer to math equations than most languages that are this fast.

Reductions

Let’s now try a different experiment: taking the sum of our array. Here’s the most idiomatic Swift code:

benchmark(title:"reduce sum") {a1 = ar1.reduce(0.0, +)}
> reduce sum: 1.308 ms

…and here’s the same thing with a loop:

benchmark(title:"loop sum") { a1 = 0; for i in 0..<size {a1+=ar1[i]} }
> loop sum: 1.331 ms

Let’s see if our earlier pointer trick helps this time too:

benchmark(title:"pointer sum") {
  let p1 = ar1.p
  a1 = 0; for i in 0..<size {a1+=p1[i]}
}
> pointer sum: 1.379 ms

Well that’s odd. It’s not any faster, which suggests that it isn’t getting the best possible performance. Let’s again switch to C and see how it performs there:

float smSum_float(const float* pSrc, const int len) {
  float r = 0;
  for (int i=0; i<len; ++i) { r += pSrc[i]; }
  return r;
}

Here’s the result:

benchmark(title:"C sum") {a1 = smSum_float(ar1.p, ar1.c)}
> C sum: .230 ms

I compared this performance to Intel’s optimized Performance Libraries version of sum and found this is even faster than their hand-optimized assembler! To get this to perform better than Swift, I did however need to know a little trick (provided by LLVM’s vectorization docs), which is to compile with the -ffast-math flag. For numeric programming like this, I recommend you always use at least these flags (this is all I’ve used for these experiments, although you can also add -march=native, and change the optimization level from O2 to Ofast):

-Xswiftc -Ounchecked -Xcc -ffast-math -Xcc -O2

Why do we need this flag? Because strictly speaking, addition is not associative, due to the quirks of floating point. But this is, in practice, very unlikely to be something that most people will care about! By default, clang will use the “strictly correct” behavior, which means it can’t vectorize the loop with SIMD. But with -ffast-math we’re telling the compiler that we don’t mind treating addition as associative (amongst other things), so it will vectorize the loop, giving us a 4x improvement in speed.
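Here is a tiny demonstration of that non-associativity, written in Swift with values picked so the effect is easy to see. At magnitude 1e8 a Float can only represent multiples of 8, so adding 1 to -1e8 is lost to rounding.

```swift
let a: Float = 1e8
let b: Float = -1e8
let c: Float = 1

let leftFirst  = (a + b) + c   // (0) + 1
let rightFirst = a + (b + c)   // b + c rounds back to -1e8, so the 1 is lost

print(leftFirst, rightFirst)   // 1.0 0.0
```

A vectorized sum adds the elements in a different order from the simple loop, so it can give a slightly different answer in cases like this. That reordering is what -ffast-math gives the compiler permission to do.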

The other important thing to remember for good performance in C code like this is to ensure you have const marked on everything that won’t change, as I’ve done in the code above.

Unfortunately, there doesn’t seem to currently be a way to get Swift to vectorize any reduction. So for now at least, we have to use C to get good performance here. This is not a limitation of the language itself, it’s just an optimization that the Swift team hasn’t gotten around to implementing yet.

The good news is: BaseMath adds the sum method to Array, which uses this optimized C version, so if you use BaseMath, you get this performance automatically. So the result of test #1 is: failure. We didn’t manage to get pure Swift to reach the same performance as C. But at least we got a nice C version we can call from Swift. Let’s move on to another test and see if we can get better performance by avoiding doing any reductions.

Temporary storage

So what if we want to do a function reduction, such as sum-of-squares? Ideally, we’d like to be able to combine our map style above with sum, but without getting the performance penalty of Swift’s unoptimized reductions. To make this work, the trick is to use temporary storage. If we use our map function above to store the result in preallocated memory, we can then pass that to our C sum implementation. We want something like a static variable for storing our preallocated memory, but then we’d have to deal with locking to handle contention between threads. To avoid that, we can use thread local storage (TLS). Like most languages, Swift provides TLS functionality; however rather than make it part of the core language (like, say, C#), it provides a class, which we can access through Thread.current.threadDictionary. BaseMath adds the preallocated memory to this dictionary, and makes it available internally as tempStore; this is then the internal implementation of unary function reduction (there are also binary and ternary versions available):

@inlinable public func sum(_ f: UnaryF)->Element {
  self.map(f, tempStore)
  return tempStore.sum()
}

We can then use this as follows:

benchmark(title:"lib sum(sqr)") {a1 = ar1.sum(Float.sqr)}
> lib sum(sqr): .786 ms

This provides a nice speedup over the regular Swift reduce version:

benchmark(title:"reduce sumsqr") {a1 = ar1.reduce(0.0, {$0+Float.sqr($1)})}
> reduce sumsqr: 1.459 ms

Here’s the C version:

float smSum_sqr_float(const float* restrict pSrc, const int len) {
  float r = 0;
  #pragma clang loop interleave_count(8)
  for (int i=0; i<len; ++i) { r += sqrf(pSrc[i]); }
  return r;
}

Let’s try it out:

benchmark(title:"c sumsqr") {a1 = smSum_sqr_float(ar1.p, ar1.c)}
> c sumsqr: .229 ms

C implementations of sum for all standard unary math functions are made available by BaseMath, so you can call the above implementation by simply using:

benchmark(title:"lib sumsqr") {a1 = ar1.sumsqr()}
> lib sumsqr: .219 ms

In summary: whilst the Swift version using temporary storage (and calling C for just the final sum) was twice as fast as just using reduce, using C is another 3 or more times faster.
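For reference, the thread-local temporary storage described earlier in this section could be set up roughly as follows. This is a sketch under my own assumptions, not BaseMath’s actual tempStore code: `ScratchBuffer`, `threadScratch`, and the key string are all invented for the example. The only Foundation API used is `Thread.current.threadDictionary`.

```swift
import Foundation

// Hypothetical scratch buffer; a class so the dictionary holds a reference.
final class ScratchBuffer {
  var values: [Float]
  init(count: Int) { values = [Float](repeating: 0, count: count) }
}

// Returns the current thread's buffer, creating it on first use.
// Each thread has its own dictionary, so no locking is needed.
func threadScratch(count: Int) -> ScratchBuffer {
  let key = "example.scratch"   // arbitrary key chosen for this sketch
  if let existing = Thread.current.threadDictionary[key] as? ScratchBuffer,
     existing.values.count >= count {
    return existing
  }
  let fresh = ScratchBuffer(count: count)
  Thread.current.threadDictionary[key] = fresh
  return fresh
}

let s1 = threadScratch(count: 16)
let s2 = threadScratch(count: 16)
print(s1 === s2)   // true
```

The second call returns the very same object, so repeated reductions on one thread reuse a single allocation.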

The warts

As you can see, there’s a lot to like about numeric programming in Swift. You can get both the performance of optimized C and the convenience of automatic memory management and elegant syntax.

The most concise and flexible language I’ve used is Python. And the fastest I’ve used is C (well… actually it’s FORTRAN, but let’s not go there.) So how does Swift stack up against these high bars? The very idea that we could compare a single language to the flexibility of Python and the speed of C is an amazing achievement itself!

Overall, my view is that in general it takes a bit more code in Swift than Python to write the stuff I want to write, and there are fewer ways to abstract common code out. For instance, I use decorators a lot in Python, and use them to write loads of code for me. I use *args and **kwargs a lot (the new dynamic features in Swift can provide some of that functionality, but it doesn’t go as far). I zip multiple variables together at once (in Swift you have to zip pairs of pairs for multiple variables, and then use nested parens to destructure them). And then there’s the code you have to write to get your types all lined up nicely.
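The zip point is easiest to see in code. A minimal sketch with three made-up arrays:

```swift
let xs = [1, 2, 3]
let ys = [10, 20, 30]
let zs = [100, 200, 300]

var totals: [Int] = []
// zip takes exactly two sequences, so three inputs need nesting,
// and the loop pattern has to mirror that nesting.
for ((x, y), z) in zip(zip(xs, ys), zs) {
  totals.append(x + y + z)
}
print(totals)   // [111, 222, 333]
```

In Python the equivalent loop header would be `for x, y, z in zip(xs, ys, zs)`, with no nesting.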

I also find Swift’s performance is harder to reason about and optimize than C. C has its own quirks around performance (such as the need to use const and sometimes even requiring restrict to help the compiler) but they’re generally better documented, better understood, and more consistent. Also, C compilers such as clang and gcc provide powerful additional capabilities using pragmas such as omp and loop which can even auto-parallelize code.

Having said that, Swift is closer to achieving the combination of Python’s expressiveness and C’s speed than any other language I’ve used.

There are some issues still to be aware of. One thing to consider is that protocol-oriented programming requires a very different way of doing things to what you’re probably used to. In the long run, that’s probably a good thing, since learning new programming styles can help you become a better programmer; but it could lead to some frustrations for the first few weeks.

This issue is particularly challenging because Swift’s compiler often has no idea where the source of a protocol type problem really is, and its ability to guess types is still pretty flaky. So extremely minor changes, such as changing the name of a type, or changing where a type constraint is defined, can change something that used to work into something that spits out four screens of error messages. My advice is to try to create minimal versions of your type structures in a standalone test file, and get things working there first.

Note, however, that ease of use generally requires compromises. Python is particularly easy, because it’s perfectly happy for you to shoot yourself in the foot. Swift at least makes sure you first know how to untie your shoelaces. Chris told me: the pitch when building Swift in the first place was that the important thing to optimize for is the “end to end time to get to a correct implementation of whatever you’re trying to do”. This includes both time to pound out code, time to debug it, and time to refactor/maintain it if you’re making a change to an existing codebase. I don’t have enough experience yet, but I suspect that on this metric Swift will turn out to be a great performer.

There are some parts of Swift which I’m not a fan of: the compromises made due to Apple’s history with Objective-C, its packaging system, its community, and the lack of C++ support. Or to be more precise: it is largely parts of the Swift ecosystem that I’m not a fan of. The language itself is quite delightful. And the ecosystem can be fixed. But, for now, this is the situation that Swift programmers have to deal with, so let’s look at each of these issues in turn.

Objective-C

Objective-C is a language developed in the 1980s, designed to bring some of the object-oriented features of Smalltalk to C. It was a very successful project, and was picked by NeXT as the basis for programming NeXTSTEP in 1988. With NeXT’s acquisition by Apple, it became the primary language for coding for Apple devices. Today, it’s showing its age, along with the constraints imposed by the decision to make it a strict superset of C. For instance, Objective-C doesn’t support true function overloading. Instead, it uses something called selectors, which are simply required keyword arguments. Each function’s full name is defined by the concatenation of the function name with all the selector names. This idea is also used by AppleScript, which provides something very similar to allow the name print to mean different things in different contexts:

print page 1
print document 2
print pages 1 thru 5 of document 2

AppleScript in turn inherited this idea from HyperTalk, a language created in 1987 for Apple’s much-loved (and missed) HyperCard program. Given all this history, it’s not surprising that today the idea of required named arguments is something that most folks at Apple have quite an attachment to. Perhaps more importantly, it provided a useful compromise for the designers of Objective-C, since they were able to avoid adding true function overloading to the language, keeping close compatibility with C.

Unfortunately, this constraint impacts Swift today, over 30 years after the situation that led to its introduction in Objective-C. Swift does provide true function overloading, which is particularly important in numeric programming, where you really don’t want to have to create whole separate functions for floats, doubles, and complex numbers (and quaternions, etc…). But by default all keyword names are still required, which can lead to verbose and visually cluttered code. And Apple’s style guide strongly promotes this style of coding; their Objective-C and Swift style guides closely mirror each other, rather than allowing programmers to really leverage Swift’s unique capabilities. You can opt out of requiring named arguments by prefixing a parameter name with _, which BaseMath uses everywhere that optional arguments are not needed.
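
Here is a small sketch of the difference. The scale functions are hypothetical, written only for this illustration; they are not part of BaseMath:

```swift
// Hypothetical functions for illustration only.

// Default behavior: each parameter name is also a required argument label.
func scale(value: Double, factor: Double) -> Double {
    return value * factor
}

// Prefixing a parameter with `_` removes the label at the call site.
// True overloading then lets one name cover both Float and Double.
func scale(_ value: Double, _ factor: Double) -> Double {
    return value * factor
}
func scale(_ value: Float, _ factor: Float) -> Float {
    return value * factor
}

let d: Double = 1.5
let f: Float = 1.5

let labeled = scale(value: d, factor: 2)  // labels are required here
let shortD = scale(d, 2)                  // picks the Double overload
let shortF = scale(f, 2)                  // picks the Float overload
```

The unlabeled calls read much closer to the underlying math, and the compiler picks the right overload from the argument types.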

Another area where things get rather verbose is when it comes to working with Foundation, Apple’s main class library, which is also used by Objective-C. Swift’s standard library is missing a lot of functionality that you’ll need, so you’ll often need to turn to Foundation to get stuff done. But you won’t enjoy it. After the pleasure of using such an elegantly designed language as Swift, there’s something particularly sad about using it to access as unwieldy a library as Foundation. For instance, Swift’s standard library doesn’t provide a built-in way to format floats with fixed precision, so I decided to add that functionality to my SupportsBasicMath protocol. Here’s the code:

extension SupportsBasicMath {
  public func string(_ digits:Int) -> String {
    let fmt = NumberFormatter()
    fmt.minimumFractionDigits = digits
    fmt.maximumFractionDigits = digits
    return fmt.string(from: self.nsNumber) ?? "\(self)"
  }
}

The fact that we can add this functionality to Float and Double by writing such an extension is really cool, as is the ability to handle failed conversions with Swift’s ?? operator. But look at the verbosity of the code to actually use the NumberFormatter class from Foundation! And it doesn’t even accept Float or Double, but the awkward NSNumber type from Objective-C (which is itself a clunky workaround for the lack of generics in Objective-C). So I had to add an nsNumber property to SupportsBasicMath to do the casting.
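
For readers who want to try this pattern outside BaseMath, here is a self-contained sketch. FormattableNumber is a hypothetical stand-in for SupportsBasicMath, which has many more requirements, and the way nsNumber is defined here is my assumption about one reasonable approach, not necessarily how BaseMath does it:

```swift
import Foundation

// Hypothetical stand-in for BaseMath's SupportsBasicMath protocol.
protocol FormattableNumber {
    // NumberFormatter only accepts NSNumber, so each conforming type
    // has to say how to box itself.
    var nsNumber: NSNumber { get }
}

extension Float: FormattableNumber {
    var nsNumber: NSNumber { return NSNumber(value: self) }
}

extension Double: FormattableNumber {
    var nsNumber: NSNumber { return NSNumber(value: self) }
}

extension FormattableNumber {
    // Same shape as the extension shown above, written against the stand-in.
    func string(_ digits: Int) -> String {
        let fmt = NumberFormatter()
        fmt.minimumFractionDigits = digits
        fmt.maximumFractionDigits = digits
        return fmt.string(from: nsNumber) ?? "\(self)"
    }
}

let formattedDouble = (3.14159 as Double).string(2)
let formattedFloat = (2.5 as Float).string(3)
print(formattedDouble, formattedFloat)
```

The exact output depends on the formatter’s locale (the decimal separator, for instance), so I haven’t shown an expected result.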

The Swift language itself does help support more concise styles, such as the {f($0)} style of closures. Concision is important for numeric programming, since it lets us write our code to reflect more closely the math that we’re implementing, and to understand the whole equation at a glance. For a masterful exposition of this (and much more), see Iverson’s Turing Award lecture Notation as a Tool for Thought.
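
As a quick illustration of that closure style (a sketch using only the standard library; the variable names are mine):

```swift
let xs: [Double] = [1, 2, 3, 4]

// Fully spelled-out closure syntax.
let squaresLong = xs.map({ (x: Double) -> Double in return x * x })

// Shorthand: trailing closure, implicit return, and $0 for the first argument.
let squares = xs.map { $0 * $0 }

// A sum of squares written so the code stays close to the math:
// $0 is the running total and $1 is the current element.
let sumOfSquares = xs.reduce(0) { $0 + $1 * $1 }
```

Both map calls compute the same thing; the shorthand just removes everything that isn’t the expression itself.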

Objective-C also doesn’t have namespaces, which means that each project picks some 2 or 3 letter prefix which it adds to all symbols. Most of the Foundation library still uses names inherited from Objective-C, so you’ll find yourself using types like CGFloat and functions like CFAbsoluteTimeGetCurrent. (Every time I type one of these symbols I’m sure a baby unicorn cries out in pain…)

The Swift team made the surprising decision to use an Objective-C implementation of Foundation and other libraries when running Swift on Apple devices, but to use native Swift libraries on Linux. As a result, you will sometimes see different behavior on each platform. For instance, the unit test framework on Apple devices is unable to find and run tests that are written as protocol extensions, but they work fine under Linux.

Overall, I feel like the constraints and history of Objective-C bleed into Swift programming too often, and each time it happens, there’s a real friction that pops up. Over time, however, these issues seem to be reducing, and I hope that in the future we’ll see Swift break out from the Objective-C shackles more and more. For instance, perhaps we’ll see a real effort to create idiomatic Swift replacements for some of the Objective-C class libraries.

Community

I’ve been using Python a lot over the last couple of years, and one thing that always bothered me is that too many people in the Python community have only ever used that one language (since it’s a great beginners’ language and is widely taught to undergraduates). As a result, there’s a lack of awareness that different languages can do things in different ways, and each choice has its own pros and cons. Instead, in the Python world, there’s a tendency for folks to think that the Python way is the one true way.

I’m seeing something similar in Swift, but in some ways it’s even worse: most Swift programmers got their start as Objective-C programmers. So a lot of the discussion you see online is from Objective-C programmers writing Swift in a style that closely parallels how things are done in Objective-C. And nearly all of them do nearly all of their programming in Xcode (which is almost certainly my least favorite IDE, except for its wonderful Swift Playgrounds feature), so a lot of advice you’ll find online shows how to solve Swift problems by getting Xcode to do things for you, rather than writing the code yourself.

Most Swift programmers are writing iOS apps, so you’ll also find a lot of guidance on how to lay out a mobile GUI, but there’s almost no information about things like how to distribute command line programs for Linux, or how to compile static libraries. In general, because the Linux support for Swift is still so new, there’s not much information available on how to use it, and many libraries and tools don’t work under Linux.

Most of the time when I was tracking down problems with my protocol conformance, or trying to figure out how to optimize some piece of code, the only information I could find would be a mailing list discussion amongst Apple’s Swift language team. These discussions tend to focus on the innards of the compiler and libraries, rather than how to use them. So there’s a big missing middle ground between app developers discussing how to use Xcode and Swift language implementers discussing how to modify the compiler. There is a good community forming now around the Discourse forums at https://forums.swift.org/, which hopefully over time will turn into a useful knowledge base for Swift programmers.

Packaging and installation

Swift has an officially sanctioned package system, called Swift Package Manager (SPM). Unfortunately, it’s one of the worst packaging systems I’ve ever used. I’ve noticed that nearly every language, when creating a package manager, reinvents everything from scratch, and fails to leverage all the successes and failures of previous attempts. Swift follows this unfortunate pattern.

There are some truly excellent packaging systems out there. The best, perhaps, was and still is Perl’s CPAN, which includes an international automated testing service that tests all packages on a wide range of systems, deeply integrates documentation, has excellent tutorials, and much more. Another terrific (and more modern) system is conda, which not only handles language-specific libraries (with a focus on Python) but also handles automatically installing compatible system libraries and binaries too, and manages to do everything in your home directory so you don’t even need root access. And it works well on Linux, Mac, and Windows. It can handle distribution of either compiled modules or source.

SPM, on the other hand, has none of the benefits of any of these systems. Even though Swift is a compiled language, it doesn’t provide a way to create or distribute compiled packages, which means users of your package will have to install all the pre-requisites for building it. And SPM doesn’t let you describe how to build your package, so (for instance) if you use BaseMath it’s up to you to remember to add the flags required for good performance when you build something that uses it.

The way dependencies are handled is really awkward. Git tags or branches are used for dependencies, and there’s no easy way to switch between a local dev build and the packaged version (like, for instance, the -e flag to pip or the conda develop command). Instead, you have to modify the package file to change the location of the dependency, and remember to switch it back before you commit.

It would take far too long to document all the deficiencies of SPM; instead, you can work on the assumption that any useful feature you’ve appreciated from whatever packaging system you’re using now probably won’t be in SPM. Hopefully someone will get around to setting up a conda-based system for Swift and we can all just start using that instead…

Also, installation of Swift is a mess. On Linux, for instance, only Ubuntu is supported, and different versions require different installers. On Mac, Swift versions are tied to Xcode versions in a confusing and awkward way, and command line and Xcode versions are somewhat separate, yet somewhat linked, in ways that make my brain hurt. Again, conda seems like it could provide the best option to avoid this, since a single conda package can be used to support any flavor of Linux, and Mac can also be supported in the same way. If the work was done to get Swift on to conda, then it would be possible to say just conda install swift on any system, and everything would just work. This would also provide a solution for versioning, isolated environments, and complex dependency tracking.

(If you’re on Windows, you are, for now, out of luck. There’s an old unofficial port to Cygwin. And Swift runs fine on the Windows Subsystem for Linux. But there’s no official native Windows Swift as yet, sadly. There is some excellent news on this front, though: a hero named Saleem Abdulrasool has made great strides towards making a complete native port entirely independently, and in the last few days it has gotten to a point where the vast majority of the Swift test suite passes.)

C++

Whilst Apple went with Objective-C for their “C with objects” solution, the rest of the world went with C++. Eventually, the Objective-C extensions were also added to C++, to create “Objective-C++”, but there was no attempt to unify the concepts across the languages, so the resulting language is a mutant with many significant restrictions. However, there is a nice subset of the language that gets around some of the biggest limitations of C; for instance you can use function overloading, and have access to a rich standard library.

Unfortunately, Swift can’t interface with C++ at all. Even something as simple as a header file containing overloaded functions will cause Swift language interop to fail.

This is a big problem for numeric programmers, because many of the most useful numeric libraries today are written in C++. For instance, the ATen library at the heart of PyTorch is C++. There are good reasons that numeric programmers lean towards C++: it provides the features that are needed for concise and expressive solutions to numeric programming problems. For example, Julia programmers are (rightly) proud of how easy it is to support the critical broadcasting functionality in their language, which they have documented in the Julia challenge. In C++ this challenge has an elegant and fast solution. You won’t find something like this in pure C, however.

So this means that a large and increasing number of the most important building blocks for numeric programming are out of bounds for Swift programmers. This is a serious problem. (You can write plain C wrappers for a C++ class, and then create a Swift class that uses those wrappers, but that’s a very big and very boring job which I’m not sure many people are likely to embark on.)

Other languages have shown potential ways around this. For instance, C# on Windows provides “It Just Works” (IJW) interfacing with C++/CLI, a superset of C++ with support for .Net. Even more interestingly, the CPPSharp project leverages LLVM to auto-generate a C# wrapper for C++ code with no calling overhead.

Solving this problem will not be easy for Swift. But because Swift uses LLVM, and already interfaces with C (and Objective-C), it is perhaps better placed to come up with a great solution than nearly any other language. Except, perhaps, for Julia, since they’ve already done this. Twice.

Conclusion

Swift is a really interesting language which can support fast, concise, expressive numeric programming. The Swift for Tensorflow project may be the best opportunity for creating a programming language where differentiable programming is a first class citizen. Swift also lets us easily interface with C code and libraries.

However, Swift on Linux is still immature, the packaging system is weak, installation is clunky, and the libraries suffer from some rough spots due to the historical ties to Objective-C.

So how does it stack up? In the data science world, we’re mainly stuck using either R (which is the least pleasant language I’ve ever used, but with the most beautifully designed data munging and plotting libraries anywhere) or Python (which is painfully slow, very hard to parallelize, but is extremely expressive and has the best deep learning libraries available). We really need another option. Something that is fast, flexible, and provides good interop with existing libraries.

Overall, the Swift language itself looks to be exactly what we need, but much of the ecosystem needs to be replaced or at least dramatically leveled-up. There is no data science ecosystem to speak of, although the S4TF project seems likely to create some important pieces. This is a really good place to be spending time if you’re interested in being part of something that has a huge amount of potential, and has some really great people working to make that happen, and you are OK with helping smooth out the warts along the way.