Related: rust, systems-programming, Demystifying Alignment and Memory Layout in Rust
Here’s a list of exercises to test your understanding of memory alignment and layout for various types, some of which are Rust-specific. The assumption for these exercises is that the default Rust representation (`#[repr(Rust)]`) is used.
Info
If you’re unfamiliar with the concept of alignment and memory layout, or simply need a refresher, read my write-up on it.
Verifying for yourself
For any examples here, you can verify the results for yourself in Rust with the provided example code below.
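Here is a minimal sketch of such a check, assuming a 64-bit target. It relies on `std::mem::size_of` and `std::mem::align_of` from the standard library; the `layout_of` helper is a hypothetical name introduced only for illustration.

```rust
use std::mem::{align_of, size_of};

/// Returns `(size, alignment)` of `T`, both in bytes.
/// `layout_of` is a hypothetical helper name, not a standard function.
fn layout_of<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}

fn main() {
    // Print the layout of the primitive types from the first set of exercises.
    println!("bool: {:?}", layout_of::<bool>());
    println!("u32:  {:?}", layout_of::<u32>());
    println!("i32:  {:?}", layout_of::<i32>());
    println!("f32:  {:?}", layout_of::<f32>());
    println!("char: {:?}", layout_of::<char>());
}
```

Swapping in any other type, such as `layout_of::<String>()`, should let you check the remaining answers the same way.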
Stretch 🙆♂️
bool
| Size | Alignment |
| --- | --- |
| 1 byte | Byte-aligned |

While a boolean needs only a single bit in theory, in practice it occupies a whole byte, because a byte is the smallest unit of memory that can be addressed.
u32, i32, f32

| Size | Alignment |
| --- | --- |
| 4 bytes | 4-byte aligned |

These types take up 32 bits. Like the other builtin types, they are naturally aligned, meaning their alignment equals their size.
char
| Size | Alignment |
| --- | --- |
| 4 bytes | 4-byte aligned |

4 bytes is sufficient to represent all possible Unicode characters; naturally aligned.
&str
| Size | Alignment |
| --- | --- |
| 16 bytes | 8-byte aligned |

This is a view into a UTF-8 string, made up of two fields, `length: usize` and `address: usize`. The string data itself lives elsewhere (in static memory, for string literals). On most modern computers running a 64-bit architecture, `usize` is 8 bytes long.

Since this is a struct under the hood, the type’s alignment matches its fields’ largest alignment, which is 8 bytes for both fields.
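To make the two fields concrete, here is a small sketch that pulls them out of a string slice with `str::as_ptr` and `str::len`. The `parts` helper is a hypothetical name used only for this illustration, and the sizes are expressed in terms of `usize` so the checks do not depend on a 64-bit target.

```rust
use std::mem::{align_of, size_of};

/// Splits a string slice into its address and its length in bytes.
/// `parts` is a hypothetical helper name for illustration.
fn parts(s: &str) -> (*const u8, usize) {
    (s.as_ptr(), s.len())
}

fn main() {
    let greeting: &str = "hello";
    let (address, length) = parts(greeting);
    println!("address = {:p}, length = {}", address, length);

    // A `&str` is two machine words wide, while a plain `&u8` is one.
    assert_eq!(size_of::<&str>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<&u8>(), size_of::<usize>());
    assert_eq!(align_of::<&str>(), align_of::<usize>());
}
```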
String
| Size | Alignment |
| --- | --- |
| 24 bytes | 8-byte aligned |

`String` is an owned vector of UTF-8 encoded bytes (i.e. `Vec<u8>`[1]). A vector contains `length: usize` and `address: usize`, and also `capacity: usize`.

The same logic as in the previous example applies to its alignment.
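The three words can be inspected on a live value. This is a rough sketch using `String::as_ptr`, `String::len` and `String::capacity`; the sample text and the initial capacity of 16 are arbitrary choices for illustration.

```rust
use std::mem::{align_of, size_of};

fn main() {
    // Reserve more room than needed so length and capacity differ.
    let mut name = String::with_capacity(16);
    name.push_str("Ada");

    // The three words stored inline in the `String` value itself.
    println!("address  = {:p}", name.as_ptr());
    println!("length   = {}", name.len());
    println!("capacity = {}", name.capacity());

    // Same footprint as a `Vec<u8>`: three `usize`-sized words.
    assert_eq!(size_of::<String>(), 3 * size_of::<usize>());
    assert_eq!(size_of::<String>(), size_of::<Vec<u8>>());
    assert_eq!(align_of::<String>(), align_of::<usize>());
}
```

Note that the size of the `String` value stays the same however long the text grows, since the bytes live on the heap.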
Walk 🚶
Given the following struct:
User
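| Size | Alignment |
| --- | --- |
| 32 bytes | 8-byte aligned |

With the default representation the compiler is free to reorder fields, and in practice it places the larger, more strictly aligned fields first: `name` (24 bytes), `age` (4 bytes), then `english_native_lang` (1 byte). The fields’ sizes and alignments are powers of two, so each field ends on a boundary that suits the next one (offsets 0, 24 and 28) and they can be placed directly after each other without padding.

`User`’s alignment follows the largest alignment among its fields, which is that of `name` (8-byte aligned). The fields add up to 29 bytes, so 3 bytes of padding are added at the end to round `User`’s size up to a multiple of 8.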
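A definition consistent with the sizes above can be checked directly. In this sketch the field types (`String`, `u32`, `bool`) and their declaration order are assumptions inferred from the quoted sizes, and `UserC` is a hypothetical `#[repr(C)]` variant added only to show what a fixed, poorly chosen field order costs. The numbers in the comments assume a 64-bit target and current `rustc` behaviour.

```rust
use std::mem::{align_of, size_of};

// Hypothetical definition: field types are inferred from the sizes quoted above.
#[allow(dead_code)]
struct User {
    name: String,              // 24 bytes, 8-byte aligned
    age: u32,                  // 4 bytes, 4-byte aligned
    english_native_lang: bool, // 1 byte, byte-aligned
}

// Same fields, deliberately declared in a poor order and laid out as declared.
#[allow(dead_code)]
#[repr(C)]
struct UserC {
    english_native_lang: bool, // offset 0, followed by 7 bytes of padding
    name: String,              // offset 8
    age: u32,                  // offset 32, followed by 4 bytes of trailing padding
}

fn main() {
    println!("User:  size {}, align {}", size_of::<User>(), align_of::<User>());
    println!("UserC: size {}, align {}", size_of::<UserC>(), align_of::<UserC>());
}
```

On a 64-bit target I would expect `User` to come out at 32 bytes and `UserC` at 40, which is the saving that field reordering buys.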
(bool, u32)
| Size | Alignment |
| --- | --- |
| 8 bytes | 4-byte aligned |

The memory layout for tuples follows the same logic as for structs, so the `u32` member is placed first, followed by the `bool` member. Since this type is 4-byte aligned (following the `u32` member’s alignment), 3 bytes of padding are added to “round up” its size from 5 bytes to a multiple of 4.
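The “round up” step is plain integer arithmetic. Below is a small sketch of it; `round_up` is a hypothetical helper written for this example, not something from the standard library.

```rust
use std::mem::{align_of, size_of};

/// Rounds `size` up to the next multiple of `align`.
/// Hypothetical helper for illustration; `align` must be non-zero.
fn round_up(size: usize, align: usize) -> usize {
    (size + align - 1) / align * align
}

fn main() {
    // Sum of the member sizes: 4 bytes for the u32 plus 1 byte for the bool.
    let unpadded = size_of::<u32>() + size_of::<bool>();
    let align = align_of::<(bool, u32)>();

    println!(
        "unpadded = {}, align = {}, padded = {}",
        unpadded,
        align,
        round_up(unpadded, align)
    );

    // The padded size matches what the compiler reports for the tuple.
    assert_eq!(round_up(unpadded, align), size_of::<(bool, u32)>());
}
```

The same helper reproduces the `User` result: `round_up(29, 8)` gives 32.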
[User; 3]
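| Size | Alignment |
| --- | --- |
| 96 bytes | 8-byte aligned |

This is an array of `User`, 3 elements long. Array elements are stored contiguously with no extra padding between them, so the size is 3 × 32 = 96 bytes, and the array takes on its element’s alignment.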
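One way to see the contiguous layout is to print where each element starts. This is a sketch only: the `User` field types are assumptions inferred from the sizes quoted earlier, and `element_offsets` and `make_user` are hypothetical helpers written for this example.

```rust
use std::mem::size_of;

// Hypothetical `User`; the field types are assumptions inferred from the sizes above.
#[allow(dead_code)]
struct User {
    name: String,
    age: u32,
    english_native_lang: bool,
}

/// Byte offset of each element from the start of the slice.
fn element_offsets<T>(items: &[T]) -> Vec<usize> {
    let base = items.as_ptr() as usize;
    items
        .iter()
        .map(|item| item as *const T as usize - base)
        .collect()
}

/// Builds a sample user; the age and language flag are arbitrary.
fn make_user(name: &str) -> User {
    User {
        name: name.to_string(),
        age: 30,
        english_native_lang: true,
    }
}

fn main() {
    let users: [User; 3] = [make_user("Ann"), make_user("Bob"), make_user("Cy")];

    // Each element starts exactly one `size_of::<User>()` after the previous one.
    println!("offsets: {:?}", element_offsets(&users));
    assert_eq!(size_of::<[User; 3]>(), 3 * size_of::<User>());
}
```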
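Zero-sized types: `()`, `struct Empty;`, `PhantomData`

| Size | Alignment |
| --- | --- |
| 0 bytes | Byte-aligned |

As the name implies, these types in Rust have zero size. They still have an alignment (1 byte for the three listed here), but their values occupy no memory: they only exist in code, and the compiler emits no storage for them at run time! 🔥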
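A quick check of these claims, as a sketch: `Empty` is declared locally, and `PhantomData<u32>` and the zero-length array are arbitrary choices made for the example.

```rust
use std::marker::PhantomData;
use std::mem::{align_of, size_of};

#[allow(dead_code)]
struct Empty;

fn main() {
    // All three occupy no space at all.
    assert_eq!(size_of::<()>(), 0);
    assert_eq!(size_of::<Empty>(), 0);
    assert_eq!(size_of::<PhantomData<u32>>(), 0);

    // Zero-sized types still report an alignment; for these three it is 1.
    assert_eq!(align_of::<()>(), 1);
    assert_eq!(align_of::<Empty>(), 1);
    assert_eq!(align_of::<PhantomData<u32>>(), 1);

    // A zero-length array is zero-sized too, but keeps its element's alignment.
    assert_eq!(size_of::<[u32; 0]>(), 0);
    assert_eq!(align_of::<[u32; 0]>(), 4);

    // Even a large array of zero-sized values takes no space.
    assert_eq!(size_of::<[Empty; 1_000_000]>(), 0);

    println!("all zero-sized checks passed");
}
```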
Given
SpaceShuttleStatus
| Size | Alignment |
| --- | --- |
| 16 bytes | 8-byte aligned |

An enum’s memory layout contains a discriminant (tag) part first, followed by the variant part. Since there are only three variants, a `u8` is enough for the discriminant.

The variants’ sizes and alignments are computed independently. Both `OnGround` and `Docked` are zero-sized, while `Launched` is 8 bytes large and is 8-byte aligned.

The variant part of the enum takes the largest variant’s size, which is that of `Launched` at 8 bytes. Since this data has to be 8-byte aligned, it cannot be placed directly after the 1-byte discriminant. Instead, 7 bytes of padding have to be inserted first, giving 1 + 7 + 8 = 16 bytes.
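Here is a sketch of an enum that matches this description. The payload of `Launched` is an assumption: any 8-byte, 8-byte-aligned type without spare bit patterns would do, and I picked `u64` for illustration. The expected numbers assume a 64-bit target.

```rust
use std::mem::{align_of, size_of};

// Hypothetical definition; the `u64` payload of `Launched` is an assumption.
#[allow(dead_code)]
enum SpaceShuttleStatus {
    OnGround,
    Launched(u64), // 8 bytes, 8-byte aligned
    Docked,
}

fn main() {
    // 1-byte discriminant + 7 bytes of padding + 8-byte payload.
    println!(
        "SpaceShuttleStatus: size {}, align {}",
        size_of::<SpaceShuttleStatus>(),
        align_of::<SpaceShuttleStatus>()
    );
}
```

Keep in mind that enum layout under the default representation is not guaranteed, so this reflects what the compiler does today rather than a promise.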
Run 🏃
SpaceShuttleFlight
Let’s think step-by-step,[2] by working out the memory layout of each field’s type.

- `SpaceShuttleStatus` is 16 bytes long, and is 8-byte aligned.
- `Vec<Astronaut>` points to a dynamic number of `Astronaut`s on the heap. The layout and alignment of `Astronaut` itself is not relevant to that of `SpaceShuttleFlight`; only that of `Vec<T>` is. We know that `Vec<T>` is always 24 bytes long, and 8-byte aligned (see the `String` example).

And so, the memory layout for `SpaceShuttleFlight` is:
| Size | Alignment |
| --- | --- |
| 64 bytes | 8-byte aligned |

Either the `id` or the `passengers` field is placed first, since they both have the largest size (24 bytes).[3] `status` has a size of 16 bytes and is placed last in the struct’s memory layout.

All the fields are 8-byte aligned, and their sizes are multiples of 8, so no additional padding is required to align the fields or the `SpaceShuttleFlight` struct as a whole. The total is 24 + 24 + 16 = 64 bytes, whichever order the compiler picks.
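Putting the pieces together, here is a sketch with assumed definitions: `id` is taken to be a `String` (it is described as 24 bytes), `Launched` carries a `u64`, and the fields of `Astronaut` are invented, since they do not affect the result. The numbers assume a 64-bit target.

```rust
use std::mem::{align_of, size_of};

// Hypothetical element type; its fields do not influence the outer struct's layout.
#[allow(dead_code)]
struct Astronaut {
    name: String,
    missions: u32,
}

// Hypothetical definition; the `u64` payload is an assumption.
#[allow(dead_code)]
enum SpaceShuttleStatus {
    OnGround,
    Launched(u64),
    Docked,
}

// Hypothetical definition; `id: String` is inferred from its 24-byte size.
#[allow(dead_code)]
struct SpaceShuttleFlight {
    id: String,                 // 24 bytes
    passengers: Vec<Astronaut>, // 24 bytes, regardless of `Astronaut`
    status: SpaceShuttleStatus, // 16 bytes
}

fn main() {
    // A `Vec<T>` has the same footprint whatever `T` is.
    assert_eq!(size_of::<Vec<Astronaut>>(), size_of::<Vec<u8>>());

    println!(
        "SpaceShuttleFlight: size {}, align {}",
        size_of::<SpaceShuttleFlight>(),
        align_of::<SpaceShuttleFlight>()
    );
}
```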
Footnotes
1. UTF-8 is a variable-width encoding: a character may span between 1 and 4 bytes. Storing the encoded bytes as `u8` is more compact than storing each character as a `char`, which always takes up 4 bytes.
2. This is a real-life human response, not a `ChainOfThought` response from a chatbot 😆
3. For fields of the same size, the Rust compiler doesn’t provide guarantees on which field’s data goes first.