Performant Code#
Julia was designed for high-performance numerical and scientific computing. To take full advantage of Julia's capabilities, writing performant code is crucial, particularly for the computationally intensive tasks it was designed for.
Learning Objectives#
Understand the basics of performance tuning strategies in Julia
Learn about type stability and its importance in Julia
Understand the impact of global variables on performance and how to avoid them
Learn the importance of using built-in functions for optimisation
Learn to choose the appropriate data structure for different tasks
Understand the importance of memory management in writing performant code
Learn to profile and optimise Julia code for better performance
Understand how to measure running time and memory allocations in Julia
Learn to identify bottlenecks in code using profiling tools
Overview of Performance Tuning Strategies#
Performance tuning involves writing efficient code and using profiling tools to identify and address performance bottlenecks. The strategies covered here include ensuring type stability, minimizing memory allocations, and optimizing computational efficiency.
Efficient Julia Code#
Type Stability#
Definition: Type stability means that the types of a function’s variables and return value can be predicted from the types of its arguments, and do not depend on the values those arguments hold. Julia’s Just-In-Time (JIT) compiler can optimize type-stable code more effectively.
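In the example below, sum_elements initialises its accumulator with s = 0, which is an Int. If arr holds Float64 values, s changes from Int to Float64 inside the loop, so the function is type-unstable for that input. sum_elements_stable restricts its argument with ::Vector{Int}, which specifies that the argument must be a vector (array) of integers and therefore guarantees that s stays an Int throughout. Note that a :: annotation on an argument does not by itself make code faster: Julia compiles a specialised version of a function for each concrete argument type it is called with. The annotation only controls which inputs the method accepts. What matters for performance is that the types of the variables inside the function stay fixed.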
# Type-unstable for non-Int input: s starts as an Int and may change type
function sum_elements(arr)
    s = 0
    for x in arr
        s += x
    end
    return s
end
# Type-stable function: arr can only hold Ints, so s is always an Int
function sum_elements_stable(arr::Vector{Int})
    s = 0
    for x in arr
        s += x
    end
    return s
end
sum_elements_stable (generic function with 1 method)
The above example is used within the context of a function, but type annotations can also be applied at the variable level, as in the examples below. Annotating global variables in this way requires Julia 1.8 or later.
x::Int = 10 # x must be an integer
y::Float64 = 3.14 # y must be a 64-bit floating point number
numbers::Vector{Int} = [1, 2, 3, 4] # Vector (array) of integers
matrix::Matrix{Float64} = [1.0 2.0; 3.0 4.0] # Matrix of 64-bit floating point numbers
2×2 Matrix{Float64}:
1.0 2.0
3.0 4.0
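Type stability can also be checked rather than assumed. The sketch below is a minimal illustration with two hypothetical functions, sum_unstable and sum_stable, which are not part of the original example. It asks the compiler which return type it infers for a Vector{Float64} argument using Base.return_types, an unexported helper from Base.

```julia
# Accumulator starts as an Int, so it changes type when arr holds Float64 values.
function sum_unstable(arr)
    s = 0
    for x in arr
        s += x
    end
    return s
end

# Accumulator starts as a zero of the element type, so its type never changes.
function sum_stable(arr)
    s = zero(eltype(arr))
    for x in arr
        s += x
    end
    return s
end

# Ask the compiler which return type it infers for a Vector{Float64} argument.
unstable_type = only(Base.return_types(sum_unstable, (Vector{Float64},)))
stable_type = only(Base.return_types(sum_stable, (Vector{Float64},)))

println("sum_unstable infers: ", unstable_type)
println("sum_stable infers:   ", stable_type)
println("concrete? ", isconcretetype(unstable_type), " vs ", isconcretetype(stable_type))
```

In an interactive session, @code_warntype sum_unstable([1.0, 2.0]) gives a more detailed view of the same information and highlights the variables whose type is not concrete.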
Avoiding Global Variables#
Global variables can lead to type instability and hinder the compiler’s optimization efforts: because an untyped global can be reassigned to a value of any type at any time, the compiler cannot assume its type inside a function. To avoid these issues, use local variables within functions and pass any necessary global data as arguments.
# Inefficient use of global variable
global_data = rand(1000)
function compute_sum()
    s = 0
    for x in global_data
        s += x
    end
    return s
end
# Efficient approach: the data is passed in as an argument
function compute_sum(data)
    s = zero(eltype(data))  # accumulator has the same type as the elements
    for x in data
        s += x
    end
    return s
end
data = rand(1000)
compute_sum(data)
488.298487673433
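The effect of a global variable on type inference can be made visible. The following is a sketch under stated assumptions: samples, total_from_global and total_from_argument are hypothetical names introduced for illustration, and Base.return_types is an unexported helper from Base that reports the inferred return type.

```julia
# Untyped, non-constant global: its type may change at any time.
samples = rand(1000)

# Reads the global directly, so the compiler cannot know the element type.
function total_from_global()
    s = 0.0
    for x in samples
        s += x
    end
    return s
end

# Receives the data as an argument; Julia specialises on its concrete type.
function total_from_argument(data)
    s = zero(eltype(data))
    for x in data
        s += x
    end
    return s
end

global_type = only(Base.return_types(total_from_global, ()))
argument_type = only(Base.return_types(total_from_argument, (Vector{Float64},)))

println("reading a global infers:   ", global_type)
println("taking an argument infers: ", argument_type)
```

Both functions return the same number, but only the second has a concrete inferred return type. Declaring the global with const, or with a type annotation on Julia 1.8 and later, is another way to let the compiler know its type.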
Utilizing Built-in Functions#
Julia’s built-in functions are highly optimized and well tested, so prefer them over hand-written equivalents whenever a suitable one exists.
# Custom sum function
function custom_sum(arr)
    s = 0
    for x in arr
        s += x
    end
    return s
end
# Using built-in sum function
arr = rand(1000)
s = sum(arr)
496.59417321092207
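Many built-in reductions also accept a function as their first argument, which avoids building intermediate arrays. The sketch below uses a hypothetical array xs to illustrate a few such calls from Base. It is meant as an illustration rather than a benchmark.

```julia
xs = rand(1000)

# sum accepts a function as its first argument, so the squares are
# accumulated without building an intermediate array.
sum_of_squares = sum(abs2, xs)

# Equivalent result, but xs .^ 2 allocates a temporary 1000-element array first.
sum_of_squares_tmp = sum(xs .^ 2)

# Other reductions follow the same pattern.
largest = maximum(xs)
n_above_half = count(>(0.5), xs)   # number of elements greater than 0.5
mean_value = sum(xs) / length(xs)

println(sum_of_squares ≈ sum_of_squares_tmp)
```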
Using Appropriate Data Structures#
Choosing the right data structure can significantly impact performance. For example, use arrays for numerical computations, dictionaries for key-value pairs, and tuples for small, fixed collections of elements.
# Inefficient data structure for numerical computation: elements are stored as Any
data = Any[i for i in 1:1000]
# Efficient data structure: a concretely typed Vector{Int64}
data = collect(1:1000)
1000-element Vector{Int64}:
1
2
3
4
5
6
7
8
9
10
11
12
13
⋮
989
990
991
992
993
994
995
996
997
998
999
1000
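As a rough illustration of the structures mentioned above, the following sketch builds a typed array, an untyped array, a dictionary and a tuple, then inspects their types. All variable names are hypothetical and chosen only for this example.

```julia
# Array: contiguous, concretely typed storage for numerical work.
readings = [0.5, 1.5, 2.5]

# The same values in an untyped container; each element is stored as Any.
readings_any = Any[0.5, 1.5, 2.5]

# Dict: lookup by key.
ages = Dict("alice" => 31, "bob" => 27)

# Tuple: fixed-size, immutable collection whose length is part of its type.
point = (1.0, 2.0, 3.0)

println(eltype(readings))      # Float64
println(eltype(readings_any))  # Any
println(ages["bob"])           # 27
println(typeof(point))         # Tuple{Float64, Float64, Float64}
```

The element type is what the compiler relies on: a Vector{Float64} lets it generate specialised arithmetic, whereas a Vector{Any} forces a type check on every element.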
Memory Management#
Minimizing memory allocations and avoiding unnecessary copying of data can improve performance. For example, view creates a lightweight reference to a subset of an array without copying it.
# Inefficient copying of array subset
subset = data[1:10]
# Efficient use of view
subset = view(data, 1:10)
10-element view(::Vector{Int64}, 1:10) with eltype Int64:
1
2
3
4
5
6
7
8
9
10
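The difference between a copy and a view is easiest to see by writing to each of them. The sketch below uses a hypothetical array parent_array, and also shows the @views macro from Base, which converts the slices in an expression into views.

```julia
parent_array = collect(1:10)

# Slicing on the right-hand side copies the selected elements.
copied = parent_array[1:3]
copied[1] = 100            # parent_array is unchanged

# A view refers to the parent's memory, so writes are visible in the parent.
window = view(parent_array, 1:3)
window[2] = 200            # parent_array[2] is now 200

# @views turns every slice in the expression into a view.
@views total = sum(parent_array[4:6])   # 4 + 5 + 6, with no copy of the slice

println(parent_array[1:3])  # [1, 200, 3]
println(total)              # 15
```

Because a view shares memory with its parent, it should be used with care when the parent is modified elsewhere in the program.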
Profiling and Optimization#
Introduction to Profiling#
Profiling is the process of measuring the performance of your code to identify slow parts and optimization opportunities. Because Julia is designed for writing performant code, it ships with a range of built-in profiling tools to help developers understand the performance of their code.
Running Time#
@time is used within Julia to quickly measure the execution time and memory allocations of a piece of code. The first call to a function also includes the time taken to compile it.
@time sum(rand(1000))
0.000003 seconds (1 allocation: 8.000 KiB)
490.9124246953268
Here, summing 1000 random numbers took about 3 microseconds and allocated 8 KiB, which is the memory for the array created by rand(1000). If we increase the number of values to sum, the code will take longer to run. Understanding how the running time changes as the size of the input grows is key to understanding the complexity of the code. It is commonplace to run a large computational task on a smaller subset of the data first, and then extrapolate from that to estimate how long the full run will take.
@time sum(rand(1000000000))
3.589924 seconds (2 allocations: 7.451 GiB, 0.08% gc time)
5.000077359347286e8
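The idea of timing a task at several input sizes can be sketched with @elapsed and @allocated, which return the elapsed seconds and the allocated bytes as values instead of printing them. The function workload and the chosen sizes are hypothetical and only serve as an illustration.

```julia
# Hypothetical workload: generate n random numbers and add them up.
workload(n) = sum(rand(n))

# Call once with a small input so compilation is not included in the timings.
workload(10)

sizes = [10^3, 10^4, 10^5, 10^6]
times = Float64[]
bytes = Int[]
for n in sizes
    push!(times, @elapsed workload(n))    # seconds
    push!(bytes, @allocated workload(n))  # bytes allocated
end

for (n, t, b) in zip(sizes, times, bytes)
    println("n = ", n, ": ", t, " s, ", b, " bytes")
end
```

If the measured time grows roughly in proportion to n, the time for a much larger run can be estimated by scaling up from the largest size measured. Individual timings are noisy, so such an estimate should be treated as approximate.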
Optimization Techniques Based on Profiling#
Identifying Bottlenecks#
The primary purpose of using profiling tools is to identify the bottlenecks within your code. Bottlenecks are the portions of the code that take a considerable share of the total running time. Improving them has the largest effect on the overall runtime of the code.
# Index-based loop
function inefficient_sum(arr)
    s = 0
    for i in 1:length(arr)
        s += arr[i]
    end
    return s
end
# Direct iteration over the elements
function optimized_sum(arr)
    s = 0
    for x in arr
        s += x
    end
    return s
end
data = rand(100000)
# First call of each function, so the timings include compilation
@time inefficient_sum(data)
@time optimized_sum(data)
0.003901 seconds (3.24 k allocations: 217.344 KiB, 86.99% compilation time)
0.003357 seconds (2.65 k allocations: 167.969 KiB, 85.11% compilation time)
50000.345750146895
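In the above example, both timings are dominated by compilation: each function is being called for the first time, and @time reports that more than 85% of the measured time was spent compiling. The remaining difference between the two is too small to draw conclusions from. To compare them fairly, call each function once before timing it, so that only the run time is measured. Note also that 1:length(arr) is evaluated once, before the loop starts, and that indexing with arr[i] is cheap for arrays, so inefficient_sum is not expected to be much slower than optimized_sum. Iterating directly with for x in arr is nevertheless the more idiomatic form, as it avoids index bookkeeping and works for any iterable collection. This is just a very simple example of some of the profiling techniques that can be used, and the particulars of how to optimise a given piece of code will depend on the code that is being looked at.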
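For code with several stages, a sampling profiler shows where the time is actually spent. The sketch below uses the Profile standard library on a hypothetical two-stage workload. The functions cheap_stage, costly_stage and pipeline are assumptions made for illustration, and the report will vary from machine to machine.

```julia
using Profile

# Hypothetical two-stage workload: the second stage is deliberately heavier.
cheap_stage(xs) = sum(xs)
function costly_stage(xs)
    total = 0.0
    for x in xs, y in xs      # nested loop: length(xs)^2 iterations
        total += x * y
    end
    return total
end
pipeline(xs) = cheap_stage(xs) + costly_stage(xs)

xs = rand(5_000)
pipeline(xs)                  # warm-up call, so compilation is not profiled

Profile.clear()               # discard samples from earlier runs
@profile pipeline(xs)         # collect stack samples while the call runs
Profile.print(format = :flat, sortedby = :count)
```

In the flat report, the lines with the highest sample counts are the candidates for optimisation. Here they would be expected to fall inside costly_stage. If the call finishes too quickly to collect samples, a larger input gives the profiler more to work with.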