Performant Code

Julia was designed for high-performance numerical and scientific computing. To take full advantage of Julia’s capabilities, writing performant code is crucial, particularly for the computationally intensive tasks it was designed for.

Learning Objectives

  • Understand the basics of performance tuning strategies in Julia

  • Learn about type stability and its importance in Julia

  • Understand the impact of global variables on performance and how to avoid them

  • Learn the importance of using built-in functions for optimisation

  • Learn to choose the appropriate data structure for different tasks

  • Understand the importance of memory management in writing performant code

  • Learn to profile and optimise Julia code for better performance

  • Understand how to measure running time and memory allocations in Julia

  • Learn to identify bottlenecks in code using profiling tools

Overview of Performance Tuning Strategies

Performance tuning involves writing efficient code and using profiling tools to identify and address performance bottlenecks. The strategies covered here include ensuring type stability, minimizing memory allocations, and optimizing computational efficiency.

Efficient Julia Code

Type Stability

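Definition: Type stability means that the type of a function’s output can be predicted from the types of its inputs, and that each variable inside the function keeps a single concrete type throughout. Julia’s Just-In-Time (JIT) compiler can generate much more efficient machine code for type-stable functions.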

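In the example below, the first function initialises the accumulator s as the integer 0. If it is called with a vector of floating point numbers, s changes from Int to Float64 inside the loop, which makes the function type-unstable for that input. The second function uses the annotation ::Vector{Int} in the function argument declaration, which restricts the argument to a vector (array) of integers, so s stays an Int for every input the function accepts. Note that the annotation by itself does not make the code faster, because Julia compiles a specialised version of a function for each combination of concrete argument types it is called with; what matters is that the types of the variables inside the function stay consistent. Type annotations are written with ::.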

# Type-unstable when arr holds non-integer elements: s starts as an Int
function sum_elements(arr)
    s = 0
    for x in arr
        s += x
    end
    return s
end

# Type-stable function
function sum_elements_stable(arr::Vector{Int})
    s = 0
    for x in arr
        s += x
    end
    return s
end
sum_elements_stable (generic function with 1 method)
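
One way to check whether a function is type-stable is to ask Julia which return type it infers. The sketch below is illustrative only: sum_elements_loose and sum_elements_generic are hypothetical names introduced here, and zero(eltype(arr)) is one common way (not the only one) to keep the accumulator's type consistent for any numeric element type.

```julia
# Hypothetical names for illustration; not part of the original example.

# Accumulator starts as an Int, so its type changes for Float64 input.
function sum_elements_loose(arr)
    s = 0
    for x in arr
        s += x
    end
    return s
end

# Accumulator starts as the zero of the element type, so its type never changes.
function sum_elements_generic(arr)
    s = zero(eltype(arr))
    for x in arr
        s += x
    end
    return s
end

floats = [1.0, 2.0, 3.0]

# Base.return_types lists the return type(s) the compiler infers
# for the given argument types.
loose_types   = Base.return_types(sum_elements_loose, (Vector{Float64},))
generic_types = Base.return_types(sum_elements_generic, (Vector{Float64},))

println("loose:   ", loose_types)    # expected to mention both Int64 and Float64
println("generic: ", generic_types)  # expected to be Float64 only
```

In an interactive session, @code_warntype sum_elements_loose(floats) gives a more detailed view and highlights variables whose type is not fully inferred.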

The example above annotates a function argument, but type annotations can also be applied to individual variables, as in the examples below. Annotating a global variable in this way requires Julia 1.8 or later.

x::Int = 10         # x must be an integer
y::Float64 = 3.14   # y must be a 64-bit floating point number
numbers::Vector{Int} = [1, 2, 3, 4]         # Vector (array) of integers
matrix::Matrix{Float64} = [1.0 2.0; 3.0 4.0] # Matrix of 64-bit floating point numbers
2×2 Matrix{Float64}:
 1.0  2.0
 3.0  4.0
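
A type annotation on a variable also affects later assignments: new values are converted to the declared type, and an error is raised when no exact conversion exists. The sketch below shows this with local variables inside functions, which works regardless of Julia version; typed_local_demo and typed_local_failure are hypothetical names used only for illustration.

```julia
# Hypothetical functions for illustration.
function typed_local_demo()
    x::Int = 10     # x is declared to always hold an Int
    x = 3.0         # 3.0 is converted to the Int 3 on assignment
    return x
end

function typed_local_failure()
    y::Int = 0
    y = 3.5         # 3.5 has no exact Int representation, so this throws
    return y
end

println(typed_local_demo())          # prints 3

try
    typed_local_failure()
catch err
    println(typeof(err))             # prints InexactError
end
```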

Avoiding Global Variables

Global variables can lead to type instability and hinder the compiler’s optimization efforts, because a non-constant global can be reassigned to a value of any type at any time. To avoid these problems, use local variables within functions and pass any necessary global data as arguments.

# Inefficient use of global variable
global_data = rand(1000)

function compute_sum()
    s = 0
    for x in global_data
        s += x
    end
    return s
end

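# Efficient approach: the data is passed in as an argument
function compute_sum(data)
    s = zero(eltype(data))  # accumulator has the same type as the elements
    for x in data
        s += x
    end
    return s
end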

data = rand(1000)
compute_sum(data)
505.9341173092064
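
When data really does need to live at global scope, declaring it const is a common alternative to passing it as an argument, since the compiler can then rely on the type of the binding. The names CONST_DATA and compute_sum_const below are hypothetical and chosen only for this sketch.

```julia
# Hypothetical names for illustration.
const CONST_DATA = rand(1000)   # const: the type of this binding can no longer change

function compute_sum_const()
    s = 0.0                     # Float64 accumulator matches the Float64 elements
    for x in CONST_DATA
        s += x
    end
    return s
end

println(compute_sum_const())
```

Passing the data as an argument is usually still the more flexible design, because the same function can then be reused with different inputs.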

Utilizing Built-in Functions

Julia’s built-in functions are highly optimized and should be preferred over hand-written equivalents wherever possible.

# Custom sum function
function custom_sum(arr)
    s = 0
    for x in arr
        s += x
    end
    return s
end

# Using built-in sum function
arr = rand(1000)
s = sum(arr)
491.2805725536547
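
Many built-in reductions also accept a function as their first argument, which can avoid creating temporary arrays. The following is a small sketch of that pattern; the variable names are illustrative.

```julia
vals = [1.0, 2.0, 3.0, 4.0]

# Allocates a temporary array for the squares, then sums it.
sum_sq_temp = sum(vals .^ 2)

# Passes abs2 to sum, so each element is squared as it is accumulated
# and no temporary array is created.
sum_sq_direct = sum(abs2, vals)

println(sum_sq_temp, " ", sum_sq_direct)   # prints 30.0 30.0

# Other reductions follow the same pattern.
println(maximum(vals))            # prints 4.0
println(count(>(2.0), vals))      # prints 2
```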

Using Appropriate Data Structures

Choosing the right data structure can significantly impact performance. For example, use arrays with a concrete element type for numerical computations, dictionaries for key-value lookups, and tuples for small, fixed-size collections of elements.

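# Inefficient data structure for numerical computation:
# a Vector{Any} can hold values of any type, so the compiler cannot specialise on its elements
data = Any[i for i in 1:1000]

# Efficient data structure: a Vector{Int64} with a concrete element type
data = collect(1:1000)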
1000-element Vector{Int64}:
    1
    2
    3
    4
    5
    6
    7
    8
    9
   10
   11
   12
   13
    ⋮
  989
  990
  991
  992
  993
  994
  995
  996
  997
  998
  999
 1000
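
The sketch below illustrates the three kinds of container mentioned above and shows how to inspect an array's element type. All variable names are illustrative.

```julia
typed_vec = collect(1:5)          # Vector{Int}: concrete element type
boxed_vec = Any[1, 2, 3, 4, 5]    # Vector{Any}: elements may be of any type

# eltype reports the element type; isconcretetype tells whether the
# compiler knows exactly what each element is.
println(eltype(typed_vec), " ", isconcretetype(eltype(typed_vec)))
println(eltype(boxed_vec), " ", isconcretetype(eltype(boxed_vec)))

# Dictionary: lookup by key
ages = Dict("alice" => 31, "bob" => 27)
println(ages["bob"])              # prints 27

# Tuple: small, fixed-size, immutable collection
point = (1.0, 2.0, 3.0)
println(point[2])                 # prints 2.0
```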

Memory Management

Minimizing memory allocations and avoiding unnecessary copying of data can improve performance. For example, view creates a lightweight reference to a subset of an array without copying it.

# Inefficient copying of array subset
subset = data[1:10]

# Efficient use of view
subset = view(data, 1:10)
10-element view(::Vector{Int64}, 1:10) with eltype Int64:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
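
Because a view refers to the original memory, changes to the parent array are visible through the view, whereas a slice is an independent copy. A minimal sketch, with illustrative variable names:

```julia
source = collect(1:10)

copied = source[1:3]          # slicing copies the three elements
viewed = view(source, 1:3)    # a view refers to the original memory

source[1] = 100               # modify the original array

println(copied[1])            # prints 1: the copy is unaffected
println(viewed[1])            # prints 100: the view sees the change

# @view is the macro form of the same operation.
also_viewed = @view source[1:3]
println(also_viewed == viewed)   # prints true
```

The same sharing applies in the other direction: writing through a view modifies the parent array, so a copy is still the right choice when the subset must be changed independently.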

Profiling and Optimization

Introduction to Profiling

Profiling is the process of measuring the performance of your code to identify slow parts and optimization opportunities. Because performance is one of Julia’s design goals, the language ships with a range of built-in tools to help developers understand how their code performs.

Running Time

The @time macro gives a quick measurement of the execution time and memory allocations of a piece of code.

@time sum(rand(1000))
  0.000002 seconds (1 allocation: 8.000 KiB)
499.9326034342019
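
When the measurements are needed as values rather than printed text, the related macros @elapsed and @allocated can be used. The sketch below is illustrative; sum_random is a hypothetical helper, and it is called once beforehand so that compilation is not included in the measurement.

```julia
# Hypothetical helper for illustration.
sum_random(n) = sum(rand(n))

sum_random(10)                           # first call also compiles the function

elapsed_s = @elapsed sum_random(1000)    # seconds, as a Float64
bytes     = @allocated sum_random(1000)  # bytes allocated, as an Int

println("time:  ", elapsed_s, " s")
println("bytes: ", bytes)
```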

We can see that generating and summing 1000 random numbers took about 2 microseconds and allocated 8 KiB. If we increase the number of values to sum, the code takes longer to run. Understanding how the running time changes as the size of the input data changes is key to understanding the complexity of the code. It is commonplace to run a large computational task on a smaller subset of the data first, and then extrapolate from that to estimate how long the full run will take.

@time sum(rand(1000000000))
  3.664800 seconds (2 allocations: 7.451 GiB, 0.78% gc time)
4.999977281247152e8
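
To see how running time grows with input size, the same measurement can be repeated over a range of sizes. This is a rough sketch rather than a rigorous benchmark: time_sum is a hypothetical helper, it times only the call to sum (not the generation of the random numbers), and single measurements like these can be noisy.

```julia
# Hypothetical helper for illustration: time sum on n random numbers.
function time_sum(n)
    data = rand(n)
    sum(data)                   # warm-up call so compilation is not timed
    return @elapsed sum(data)   # seconds for one call
end

for n in (10^3, 10^4, 10^5, 10^6)
    println(n, " elements: ", time_sum(n), " s")
end
```

If the time grows roughly in proportion to n, a measurement on a small input gives a reasonable first estimate for a larger one.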

Optimization Techniques Based on Profiling

Identifying Bottlenecks

The primary purpose of profiling tools is to identify the bottlenecks in your code. Bottlenecks are the portions of the code that take a considerable share of the running time; improving them gives the largest reduction in overall runtime.

# Inefficient code: loops over indices and reads arr[i]
function inefficient_sum(arr)
    s = 0
    for i in 1:length(arr)
        s += arr[i]
    end
    return s
end

# Optimized code: iterates directly over the elements
function optimized_sum(arr)
    s = 0
    for x in arr
        s += x
    end
    return s
end

data = rand(100000)
@time inefficient_sum(data)
@time optimized_sum(data)
  0.010771 seconds (3.24 k allocations: 217.344 KiB, 95.43% compilation time)
  0.003468 seconds (2.65 k allocations: 167.969 KiB, 86.81% compilation time)
50088.16768919239

In the above example, inefficient_sum loops over indices and reads each element with arr[i], while optimized_sum iterates directly over the elements with for x in arr. Note that 1:length(arr) is evaluated only once, when the loop starts, so length(arr) is not called repeatedly; any extra cost of the indexed version comes from bounds checking on arr[i], which the compiler can often remove. The timings shown should also be read with care: 95.43% and 86.81% of the measured time is compilation, because each function is being called for the first time, so the gap mostly reflects compile time rather than loop speed. Calling each function once before timing it gives a fairer comparison. This is a very simple example, and how best to optimise a given piece of code depends on the code in question.
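
For larger programs, timing functions one at a time becomes impractical, and a sampling profiler can show where the time is actually spent. The sketch below uses the Profile standard library; slow_part and workload are hypothetical functions standing in for real code, and the output will vary from machine to machine.

```julia
using Profile   # standard library

# Hypothetical workload for illustration.
function slow_part(n)
    total = 0.0
    for i in 1:n
        total += sqrt(i)
    end
    return total
end

function workload()
    result = 0.0
    for _ in 1:200
        result += slow_part(1_000_000)
    end
    return result
end

workload()                 # warm-up call so compilation is not profiled
Profile.clear()            # discard any earlier samples
@profile workload()        # run again while collecting samples

# Print one line per function with the number of samples collected in it.
Profile.print(format=:flat, sortedby=:count)
```

Functions that account for the most samples are the first candidates for optimisation.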