21 October 2019 / SWIFT

Value Types and Reference Types in Swift

Value types and reference types are core concepts in Swift, and understanding them is fundamental for every Swift developer. This article covers the following questions:

What are value and reference semantics?
How are structs and classes stored in memory?
How do value and reference types perform?
What happens if you mix both?
When to use what?

Defining Value Types and Reference Types

Swift has three ways of declaring a type: classes, structs and enums. They can be categorized into value types (structs and enums) and reference types (classes). The way they are stored in memory defines the difference between them:

Value types are stored inline, directly in the variable that holds them. Each variable of a value type has its own copy of the data, so operations on one do not affect the other.
Reference types are stored somewhere else, and the variable holds a reference pointing to that place in memory. Several variables of a reference type can point to the same data; hence operations on one variable can affect the data pointed to by another.
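A minimal sketch of this difference. The types PointValue and PointRef are hypothetical and exist only for this illustration:

```swift
// Hypothetical types for illustration only.
struct PointValue {
    var x: Int
}

final class PointRef {
    var x: Int
    init(x: Int) { self.x = x }
}

// Value type: assignment copies the data.
let v1 = PointValue(x: 1)
var v2 = v1
v2.x = 10
print(v1.x, v2.x) // 1 10

// Reference type: assignment copies the reference, not the data.
let r1 = PointRef(x: 1)
let r2 = r1
r2.x = 10
print(r1.x, r2.x) // 10 10
```

Mutating v2 leaves v1 untouched, while r1 and r2 observe the same change because they point to one object.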

Value and Reference Semantics

Value and reference types differ semantically in what they can represent in your code and in how they are passed around:

Reference objects represent an identity. We usually want a single class object to designate an entity in the real world, e.g. a customer. Any object that references that customer will reference the same class object.
Value objects represent things such as a location. We may have several value objects representing the same thing in the real world. If we have two locations and wish to see whether they are equal, we don’t look at their identities but rather at the values they represent.
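The customer and location examples can be sketched as follows. Customer and Location are hypothetical types made up for this illustration; the sketch relies on the identity operator === for class instances and on synthesized Equatable conformance for the struct:

```swift
// Hypothetical types for illustration only.
final class Customer {
    let name: String
    init(name: String) { self.name = name }
}

struct Location: Equatable {
    var latitude: Double
    var longitude: Double
}

let alice = Customer(name: "Alice")
let sameAlice = alice
let otherAlice = Customer(name: "Alice")

// Identity: === asks whether two references point to the same object.
print(alice === sameAlice)  // true
print(alice === otherAlice) // false

// Equality: == compares the values themselves.
let home = Location(latitude: 52.52, longitude: 13.40)
let alsoHome = Location(latitude: 52.52, longitude: 13.40)
print(home == alsoHome)     // true
```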

When talking about Swift value and reference types, we usually mean structs and classes. Throughout the article we’ll be using the terms ‘value type’ and ‘struct’, ‘reference type’ and ‘class’ interchangeably.

Performance Dimensions

Apart from mutability and semantics, the distinction between value and reference types matters in terms of performance. The three dimensions that contribute to the performance difference of Swift structs (and enums) vs classes are:

cost of copying;
cost of allocation and deallocation;
cost of reference counting;

Since we’ll be talking about memory a lot, let’s make sure that we understand what memory is and how data is stored.

Memory Segments

Memory is just a long list of bytes. The bytes are arranged in order, and every byte has its own address. A range of discrete addresses is known as an address space.

The address space of an iOS app logically consists of four segments: text, data, the stack and the heap [Tanenbaum], [Mach-O Runtime Architecture]:

The text segment contains the machine instructions that form the app’s executable code. It is produced by the compiler by translating the Swift code into machine code. This segment is read-only and takes constant space.

The data segment stores Swift static variables, constants and type metadata. All global data that needs an initial value when the program is started goes here.

The stack stores temporary data: method parameters and local variables. Every time we call a method, a new piece of memory is allocated on the stack. This memory is freed when the method exits. With some exceptions, all Swift value types go here.

The heap stores objects whose lifetime is not tied to a single method call. These are all Swift reference types and some cases of value types. The heap and the stack grow towards each other.

As a rule of thumb, Swift value types are allocated on the stack and reference types are allocated on the heap.
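As a rough sketch of where things typically end up, consider the snippet below. Config, Session and connect() are hypothetical names, and the comments describe typical placement only; the compiler is free to decide otherwise when optimizing:

```swift
// Hypothetical names; comments describe typical placement, not a guarantee.

struct Config {
    // Static stored property: global data rather than per-call data.
    static let maxRetries = 3
}

final class Session {
    var attempts = 0
}

func connect() -> Int {
    // Local value type: typically lives on the stack or in a register.
    var remaining = Config.maxRetries

    // The reference `session` is a local variable; the Session instance
    // it points to is typically allocated on the heap.
    let session = Session()

    while remaining > 0 {
        session.attempts += 1
        remaining -= 1
    }
    return session.attempts
}

print(connect()) // 3
```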

Now that we’ve studied how memory segments work, let’s see how things are stored in memory.

Cost of Heap vs Stack Allocation

The stack memory segment works just like the stack data structure. You can push onto the top of the stack and pop off the top of it. A pointer to the top of the stack is enough to implement both operations. Therefore, we can allocate the memory that we need just by decrementing the stack pointer to make space. When the method exits, we increment the stack pointer back to where it was before the method was called.

The cost of stack allocations and deallocations is the cost of assigning an integer [WWDC-416].
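To make the pointer arithmetic concrete, here is a toy model of a downward-growing stack. ToyStack is a hypothetical type invented for this sketch; it only mimics the bookkeeping and is not how Swift exposes its real call stack:

```swift
// A toy model of a downward-growing stack (hypothetical, for illustration).
struct ToyStack {
    let capacity: Int
    private(set) var stackPointer: Int

    init(capacity: Int) {
        self.capacity = capacity
        stackPointer = capacity // empty stack: pointer at the highest address
    }

    /// Reserves `size` bytes by moving the pointer down.
    /// Returns the start of the new frame, or nil if the stack would overflow.
    mutating func push(frameOfSize size: Int) -> Int? {
        guard size >= 0, size <= stackPointer else { return nil }
        stackPointer -= size
        return stackPointer
    }

    /// Releases `size` bytes by moving the pointer back up.
    mutating func pop(frameOfSize size: Int) {
        precondition(size >= 0 && stackPointer + size <= capacity, "unbalanced pop")
        stackPointer += size
    }
}

var stack = ToyStack(capacity: 64)
let frameStart = stack.push(frameOfSize: 16)
print(stack.stackPointer) // 48
stack.pop(frameOfSize: 16)
print(stack.stackPointer) // 64
```

Both operations boil down to a single integer update, which is why stack allocation is so cheap.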

Heap allocation involves more work. We have to search the heap data structure to find an unused block of memory of the appropriate size. We also have to synchronize access to the heap, since multiple threads can be allocating memory there at the same time. To deallocate memory from the heap, we have to reinsert that memory at the appropriate position.

Heap allocation and deallocation cost far more than their stack counterparts [WWDC-416].
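The search and reinsert steps can be sketched with a toy first-fit allocator. Block and ToyHeap are hypothetical types; real allocators, including the system malloc, are far more sophisticated, and this sketch leaves out thread synchronization entirely:

```swift
// A toy first-fit allocator (hypothetical, single-threaded, for illustration).
struct Block {
    var offset: Int
    var size: Int
}

struct ToyHeap {
    private(set) var freeBlocks: [Block] // kept sorted by offset

    init(capacity: Int) {
        freeBlocks = [Block(offset: 0, size: capacity)]
    }

    /// Searches the free list for the first block that is large enough.
    mutating func allocate(size: Int) -> Block? {
        guard size > 0,
              let index = freeBlocks.firstIndex(where: { $0.size >= size }) else {
            return nil
        }
        let found = freeBlocks[index]
        if found.size == size {
            freeBlocks.remove(at: index)
        } else {
            // Split: the unused tail stays on the free list.
            freeBlocks[index] = Block(offset: found.offset + size, size: found.size - size)
        }
        return Block(offset: found.offset, size: size)
    }

    /// Reinserts a block at its sorted position and merges adjacent neighbours.
    mutating func deallocate(_ block: Block) {
        let index = freeBlocks.firstIndex(where: { $0.offset > block.offset }) ?? freeBlocks.count
        freeBlocks.insert(block, at: index)
        // Merge with the following block if the two are contiguous.
        if index + 1 < freeBlocks.count,
           freeBlocks[index].offset + freeBlocks[index].size == freeBlocks[index + 1].offset {
            freeBlocks[index].size += freeBlocks[index + 1].size
            freeBlocks.remove(at: index + 1)
        }
        // Merge with the preceding block if the two are contiguous.
        if index > 0,
           freeBlocks[index - 1].offset + freeBlocks[index - 1].size == freeBlocks[index].offset {
            freeBlocks[index - 1].size += freeBlocks[index].size
            freeBlocks.remove(at: index)
        }
    }
}

var heap = ToyHeap(capacity: 128)
if let a = heap.allocate(size: 32), let b = heap.allocate(size: 16) {
    heap.deallocate(a)
    print(heap.freeBlocks.map { "\($0.offset)+\($0.size)" }) // ["0+32", "48+80"]
    heap.deallocate(b)
    print(heap.freeBlocks.map { "\($0.offset)+\($0.size)" }) // ["0+128"]
}
```

Even this simplified version needs a linear search on allocation and a search plus merging on deallocation, compared with a single pointer update for the stack.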

Although value and reference types are typically allocated on the stack and on the heap respectively, there are exceptions to these rules which are worth consideration.

Stack Promotion of Swift Reference Types

The Swift compiler may promote reference types to be allocated on the stack when their size is fixed or their lifetime can be predicted. This optimization is performed on SIL.

The Swift Intermediate Language (SIL) is a high-level, Swift-specific intermediate language suitable for further analysis and optimization of Swift code.
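A plausible candidate for stack promotion is a class instance that never leaves the function that creates it. The sketch below uses a hypothetical Accumulator class; whether the optimizer actually promotes it depends on the compiler version and optimization flags, so treat this as an assumption rather than a guarantee:

```swift
// Hypothetical example: the Accumulator instance never escapes sum(upTo:).
final class Accumulator {
    var total = 0
}

func sum(upTo n: Int) -> Int {
    guard n > 0 else { return 0 }
    // The instance is not stored, returned, or captured by a closure,
    // so its lifetime is bounded by this call.
    let accumulator = Accumulator()
    for i in 1...n {
        accumulator.total += i
    }
    return accumulator.total
}

print(sum(upTo: 10)) // 55
```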

Here are some such examples that I discovered by reading the Swift compiler source code.

💡 Here you can learn about the different phases of the Xcode Build System.

Boxing of Swift Value Types

The Swift compiler may box value types and allocate them on the heap. I’ve attempted to create a comprehensive list by reading the Swift compiler source code.

Value types can be boxed in the following cases:

1. When conforming to a protocol. Apart from the allocation cost, extra overhead appears when a value type is stored in an existential container and is larger than 3 machine words.

An existential container is a generic container for a value whose runtime type is unknown. Small value types are stored inline inside the existential container. Bigger ones are allocated on the heap, and a reference to them is stored in the existential container’s buffer. The lifetime of such values is managed through the Value Witness Table. This introduces reference counting overhead and a couple of levels of indirection when calling protocol methods.
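The sizes involved can be inspected with MemoryLayout. In the sketch below, Shape, Square and Quad are hypothetical types: Square fits into the 3-word inline buffer, while Quad does not. The code only prints sizes; it does not observe the heap allocation itself:

```swift
// Hypothetical protocol and conforming types for illustration.
protocol Shape {
    func area() -> Double
}

// One Double of storage: small enough for the inline buffer.
struct Square: Shape {
    var side: Double
    func area() -> Double { return side * side }
}

// Four Doubles of storage: larger than 3 machine words on 64-bit platforms.
struct Quad: Shape {
    var a: Double
    var b: Double
    var c: Double
    var d: Double
    func area() -> Double { return a * b + c * d }
}

let word = MemoryLayout<Int>.size

print("Square:", MemoryLayout<Square>.size / word, "word(s)") // 1 on 64-bit platforms
print("Quad:", MemoryLayout<Quad>.size / word, "word(s)")     // 4 on 64-bit platforms

// The container itself: a 3-word buffer plus pointers to the type
// metadata and the protocol witness table, 5 words in total.
print("Shape:", MemoryLayout<Shape>.size / word, "word(s)")

// Both values are stored behind the same existential type.
let shapes: [Shape] = [Square(side: 2), Quad(a: 1, b: 1, c: 1, d: 1)]
let totalArea = shapes.reduce(0) { $0 + $1.area() }
print(totalArea) // 6.0
```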

Let’s see how boxing looks in SIL-generated code. We declare a protocol Bar and a struct Baz conforming to it:

protocol Bar {}
struct Baz: Bar {}

The following command prints the SIL representation of a Swift file:

swiftc -emit-silgen -O main.swift

The output shows that self is boxed in init():

// Baz.init()
sil hidden [ossa] @$s6boxing3BazVACycfC : $@convention(method) (@thin Baz.Type) -> Baz {
bb0(%0 : $@thin Baz.Type):
  %1 = alloc_box ${ var Baz }, var, name "self"   // user: %2
  ...
}

2. When mixing value and reference types.

It’s common to store a reference to a class in a struct and to have a struct as a class field:

// Class inside a struct
class A {}
struct B {
    let a = A()
}

// Struct inside a class
struct C {}
class D {
    let c = C()
}

The SIL output shows that in both cases the structs B and C are allocated on the heap:

// B.init()
sil hidden [ossa] @$s6boxing1BVACycfC : $@convention(method) (@thin B.Type) -> @owned B {
bb0(%0 : $@thin B.Type):
  %1 = alloc_box ${ var B }, var, name "self"     // user: %2
  ...
}

// C.init()
sil hidden [ossa] @$s6boxing1CVACycfC : $@convention(method) (@thin C.Type) -> C {
bb0(%0 : $@thin C.Type):
  %1 = alloc_box ${ var C }, var, name "self"     // user: %2
  ...
}

3. Generic value types.

Let’s declare a generic struct:

struct Bas<T> {
    var x: T

    init(xx: T) {
        x = xx
    }
}

The SIL output shows that self is boxed in init(xx:):

// Bas.init(xx:)
bb0(%0 : $*Bas<T>, %1 : $*T, %2 : $@thin Bas<T>.Type):
  %3 = alloc_box $<τ_0_0> { var Bas<τ_0_0> } <T>, var, name "self" // user: %4
  ....
}

4. Escaping closure captures.

Swift’s closure model is that all local variables are captured by reference. Some of them may still be promoted to the stack as explained in CapturePromotion.
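A small sketch of a capture that has to outlive its stack frame. makeCounter() is a hypothetical helper: the captured variable count must survive after the function returns, so it cannot simply live in that function’s stack frame:

```swift
// Hypothetical helper: returns an escaping closure that captures `count`.
func makeCounter() -> () -> Int {
    var count = 0 // captured by reference and outlives this call
    return {
        count += 1
        return count
    }
}

let next = makeCounter()
print(next()) // 1
print(next()) // 2

// Each call to makeCounter() gets its own captured variable.
let other = makeCounter()
print(other()) // 1
```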

5. Inout arguments.

Let’s generate SIL for foo(x:) that accepts an inout argument:

func foo(x: inout Int) {
    x += 1
}

The SIL output shows that foo(x:) receives x by address ($*Int) rather than by value:

// foo(x:)
sil hidden [ossa] @$s6boxing3foo1xySiz_tF : $@convention(thin) (@inout Int) -> () {
// %0                                             // users: %7, %1
bb0(%0 : $*Int):
...
}
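At the call site, an inout argument is passed with an ampersand, and the mutation is visible to the caller once the call returns. A minimal sketch, using a hypothetical increment(_:) function equivalent to foo(x:) above:

```swift
// Hypothetical function, equivalent in behavior to foo(x:).
func increment(_ value: inout Int) {
    value += 1
}

var counter = 41
increment(&counter) // the caller's variable is updated
print(counter) // 42
```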

Cost of Copying

As we already know, most value types are allocated on the stack, and copying them takes constant time. Copying is also fast because primitive types like integers and floating-point numbers can be kept in CPU registers, so copying them does not require accessing RAM. Most of Swift’s extensible types, like strings, arrays, sets and dictionaries, are copied on write. This means that the actual copy only happens at the point of mutation.

Since reference types do not directly store their data, we only incur reference counting costs when copying them. This involves more than just incrementing and decrementing an integer. Each operation requires several levels of indirection and must be performed atomically, since the heap can be shared between multiple threads.
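Copy-on-write can be observed by comparing the address of an array’s element storage before and after a mutation. storageAddress(of:) is a hypothetical helper written for this sketch; the address is used only for comparison and is never dereferenced:

```swift
// Hypothetical helper: address of the array's element storage as an integer.
func storageAddress(of array: [Int]) -> Int {
    return array.withUnsafeBufferPointer { Int(bitPattern: $0.baseAddress) }
}

let original = [1, 2, 3]
var copy = original

// After assignment both arrays still share one buffer.
let sharedBeforeMutation = storageAddress(of: original) == storageAddress(of: copy)

// The first mutation triggers the actual copy.
copy.append(4)
let sharedAfterMutation = storageAddress(of: original) == storageAddress(of: copy)

print(sharedBeforeMutation) // true
print(sharedAfterMutation)  // false
print(original, copy)       // [1, 2, 3] [1, 2, 3, 4]
```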

💡 We discuss the principles of ARC and the life cycle of heap-allocated objects in Advanced iOS Memory Management with Swift.

Things become interesting when we mix value and reference types. If structs or enums contain references, they pay reference counting overhead proportional to the number of references they contain. This is best demonstrated by a code sample. Let’s create a struct with references and a class with references and inspect their retain counts.

import Foundation // provides CFGetRetainCount

class Ref {}

// Struct with references
struct MyStruct {
    let ref1 = Ref()
    let ref2 = Ref()
}

// Class with references
class MyClass {
    let ref1 = Ref()
    let ref2 = Ref()
}

Let’s print reference counts for MyStruct:

let a = MyStruct()
let anotherA = a
print("self:", CFGetRetainCount(a as CFTypeRef))
print("ref1:", CFGetRetainCount(a.ref1))
print("ref2:", CFGetRetainCount(a.ref2))

It will print:

self: 1
ref1: 2
ref2: 2

And for MyClass:

let b = MyClass()
let anotherB = b
print("self:", CFGetRetainCount(b))
print("ref1:", CFGetRetainCount(b.ref1))
print("ref2:", CFGetRetainCount(b.ref2))

It will print:

self: 2
ref1: 1
ref2: 1

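The output shows that copying MyStruct retains both of its references, whereas copying MyClass retains only the instance itself. MyStruct therefore incurs twice the reference counting cost.

The binary used to generate these reference counts was compiled with debugging disabled and the optimization level set to Optimize for Speed [-O].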

Making the Choice: Struct vs Class

There is no simple answer to the question of whether you should use a class or a struct. Although Apple recommends using classes for identity and structs for the rest of the cases, that’s not enough to guide your decision. Since every situation is different, we should take the performance dimensions into account:

Value types with inner references should be avoided, since they violate value semantics and introduce reference counting overhead.
Value types with dynamic behavior, like arrays and strings, should adopt copy-on-write to amortize the cost of copying.
Value types are boxed when they conform to a protocol or are generic, leading to higher allocation costs.
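For the copy-on-write recommendation, one common way to implement it in a custom value type is to keep the data in a private class and copy it only when the storage is shared. This is a minimal sketch; Storage and Recording are hypothetical names, and wrapping an array (which is already copy-on-write) is done here only to keep the example short:

```swift
// Hypothetical reference-type storage owned by the value type below.
final class Storage {
    var samples: [Double]
    init(samples: [Double]) { self.samples = samples }
}

// Hypothetical value type with hand-written copy-on-write.
struct Recording {
    private var storage: Storage

    init(samples: [Double] = []) {
        storage = Storage(samples: samples)
    }

    var samples: [Double] { return storage.samples }

    mutating func append(_ sample: Double) {
        // Copy the storage only if another Recording shares it.
        if !isKnownUniquelyReferenced(&storage) {
            storage = Storage(samples: storage.samples)
        }
        storage.samples.append(sample)
    }
}

let first = Recording(samples: [1, 2])
var second = first // shares storage; nothing is copied yet
second.append(3)   // storage is shared, so it is copied before the mutation

print(first.samples)  // [1.0, 2.0]
print(second.samples) // [1.0, 2.0, 3.0]
```

Copies that are never mutated stay cheap, and value semantics are preserved because first is unaffected by the change to second.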

For the rest of the cases the best answer we can give is: “It depends”. Evaluate every factor to make your choice as well-informed as possible.

Thanks for reading!
