As far as I know, reference/pointer aliasing can hinder the compiler's ability to generate optimized code, since it must ensure the generated binary behaves correctly in the case where the two references/pointers do alias. For instance, in the following C code,
void adds(int *a, int *b) {
    *a += *b;
    *a += *b;
}
when compiled by clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final) with the -O3 flag, it emits
0000000000000000 <adds>:
0: 8b 07 mov (%rdi),%eax # load a into EAX
2: 03 06 add (%rsi),%eax # load-and-add b
4: 89 07 mov %eax,(%rdi) # store into a
6: 03 06 add (%rsi),%eax # load-and-add b again
8: 89 07 mov %eax,(%rdi) # store into a again
a: c3 retq
Here the code stores back to (%rdi) twice in case int *a and int *b alias.
When we explicitly tell the compiler that these two pointers cannot alias with the restrict keyword:
void adds(int * restrict a, int * restrict b) {
    *a += *b;
    *a += *b;
}
Then Clang will emit a more optimized version that effectively does *a += 2 * (*b), which is equivalent if (as promised by restrict) *b isn't modified by assigning to *a:
0000000000000000 <adds>:
0: 8b 06 mov (%rsi),%eax # load b once
2: 01 c0 add %eax,%eax # double it
4: 01 07 add %eax,(%rdi) # *a += 2 * (*b)
6: c3 retq
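To make it concrete why the two listings are only interchangeable when the pointers don't alias, here is a minimal Rust sketch using raw pointers, which (like plain C pointers) are allowed to alias. The names adds_raw, adds_distinct and adds_aliased are made up for this illustration:

```rust
/// Raw-pointer analogue of the C `adds` without `restrict`.
/// Raw pointers, unlike `&mut` references, are allowed to alias.
///
/// # Safety
/// `a` must be valid for reads and writes, `b` must be valid for reads.
pub unsafe fn adds_raw(a: *mut i32, b: *const i32) {
    unsafe {
        *a += *b; // if a == b, this store also changes *b
        *a += *b; // so *b has to be loaded again here
    }
}

/// Calls `adds_raw` with two pointers to different integers.
pub fn adds_distinct(a: i32, b: i32) -> i32 {
    let mut a = a;
    let b = b;
    unsafe { adds_raw(&mut a, &b) };
    a
}

/// Calls `adds_raw` with both pointers referring to the same integer.
pub fn adds_aliased(x: i32) -> i32 {
    let mut v = x;
    let p: *mut i32 = &mut v;
    unsafe { adds_raw(p, p) };
    v
}
```

With distinct pointers, adds_distinct(3, 5) gives 3 + 2 * 5 = 13, matching the *a += 2 * (*b) shortcut. With aliased pointers, adds_aliased(3) gives 12 (3 doubles to 6, then 6 doubles to 12), whereas the shortcut would give 9, so the compiler may only use it when it knows the pointers are distinct.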
Since Rust makes sure (except in unsafe code) that two mutable references cannot alias, I would think that the compiler should be able to emit the more optimized version of the code.
When I test with the code below and compile it with rustc 1.35.0 with -C opt-level=3 --emit obj,
#![crate_type = "staticlib"]
#[no_mangle]
fn adds(a: &mut i32, b: &mut i32) {
    *a += *b;
    *a += *b;
}
it generates:
0000000000000000 <adds>:
0: 8b 07 mov (%rdi),%eax
2: 03 06 add (%rsi),%eax
4: 89 07 mov %eax,(%rdi)
6: 03 06 add (%rsi),%eax
8: 89 07 mov %eax,(%rdi)
a: c3 retq
This does not take advantage of the guarantee that a and b cannot alias.
Is this because the current Rust compiler is still in development and has not yet incorporated alias analysis to do the optimization?
Is this because there is still a chance that a and b could alias, even in safe Rust?
Even in unsafe code, aliasing mutable references is not allowed and results in undefined behavior. You can have aliasing raw pointers, but unsafe code does not actually allow you to ignore Rust's standard rules. This is a common misconception and thus worth pointing out.
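As a small sketch of what safe Rust allows here: two &mut references to the same variable are rejected at compile time, but two &mut references to different elements of one array can be obtained with the standard split_at_mut method, and those are guaranteed not to overlap. The helper adds_first_two is a made-up name for this illustration:

```rust
pub fn adds(a: &mut i32, b: &mut i32) {
    *a += *b;
    *a += *b;
}

/// Adds the second element to the first one twice.
/// `split_at_mut` hands out two non-overlapping mutable slices,
/// so the two `&mut i32` passed to `adds` cannot alias.
pub fn adds_first_two(values: &mut [i32; 2]) {
    let (left, right) = values.split_at_mut(1);
    adds(&mut left[0], &mut right[0]);
}

// The following does not compile in safe Rust:
//
//     let mut x = 1;
//     adds(&mut x, &mut x);
//
// error[E0499]: cannot borrow `x` as mutable more than once at a time
```

For example, calling adds_first_two on [1, 10] leaves the array as [21, 10].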
The += operations in the body of adds can be reinterpreted as *a = *a + *b + *b only if the pointers don't alias. In that case you can even see what amounts to *b + *b in the second asm listing: 2: 01 c0 add %eax,%eax. If they do alias, the rewrite is invalid, because by the time you add *b for the second time, it contains a different value than the first time around (the one stored at offset 4: of the first asm listing).
*a += 2 * (*b) equivalence for future readers.
Rust originally did enable LLVM's noalias attribute, but this caused miscompiled code. When all supported LLVM versions no longer miscompile the code, it will be re-enabled.
If you add -Zmutable-noalias=yes to the compiler options, you get the expected assembly:
adds:
mov eax, dword ptr [rsi]
add eax, eax
add dword ptr [rdi], eax
ret
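Written by hand, the folded form that this assembly corresponds to looks roughly like the sketch below. adds_folded is a made-up name, and wrapping_add is used in both functions only so that the comparison matches the wrapping machine arithmetic for every input (an assumption of this sketch, not something the original code does):

```rust
/// The original function, with explicit wrapping arithmetic.
pub fn adds(a: &mut i32, b: &mut i32) {
    *a = (*a).wrapping_add(*b);
    *a = (*a).wrapping_add(*b);
}

/// Hand-folded version mirroring the optimized listing.
/// Valid only because `a` and `b` are `&mut` and so cannot alias.
pub fn adds_folded(a: &mut i32, b: &mut i32) {
    let doubled = (*b).wrapping_add(*b); // mov eax, [rsi]; add eax, eax
    *a = (*a).wrapping_add(doubled);     // add [rdi], eax
}
```

Because the two references can never point to the same integer, both functions produce the same value in *a for every pair of inputs.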
Simply put, Rust put the equivalent of C's restrict keyword everywhere, making it far more prevalent than in any typical C program. This exercised corner cases of LLVM more than it was able to handle correctly. It turns out that C and C++ programmers simply don't use restrict as frequently as &mut is used in Rust.
This has happened multiple times.
Rust 1.0 through 1.7: noalias enabled
Rust 1.8 through 1.27: noalias disabled
Rust 1.28 through 1.29: noalias enabled
Rust 1.30 through 1.53: noalias disabled
Rust 1.54 through ???: noalias conditionally enabled depending on the version of LLVM the compiler uses
Related Rust issues
Current case
Incorrect code generation for nalgebra's Matrix::swap_rows() #54462
Re-enable noalias annotations by default once LLVM no longer miscompiles them #54878
Enable mutable noalias for LLVM >= 12 #82834
Regression: Miscompilation due to bug in "mutable noalias" logic #84958
Previous case
Workaround LLVM optimizer bug by not marking &mut pointers as noalias #31545
Mark &mut pointers as noalias once LLVM no longer miscompiles them #31681
Other
make use of LLVM's scoped noalias metadata #16515
Missed optimization: references from pointers aren't treated as noalias #38941
noalias is not enough #53105
mutable noalias: re-enable permanently, only for panic=abort, or stabilize flag? #45029
restrict and miscompile on both Clang and GCC. It’s not limited to languages that aren’t “C++ enough”, unless you count C++ itself in that group.
noalias pointers into account when executing. It created new pointers based on input pointers, improperly copying the noalias attribute even though the new pointers did alias.