optimizing binary size
2022-02-20 · 5 min read
Setup #
# for running cargo bloat
$ RUSTFLAGS="-C target-cpu=native" cargo install cargo-bloat
# for running cargo size (among other things)
$ rustup component add llvm-tools-preview
$ rustup +nightly component add llvm-tools-preview # (for strip=symbols)
$ RUSTFLAGS="-C target-cpu=native" cargo install cargo-binutils
TL;DR #
# compile and sort functions by binary size
$ cargo bloat --release -n 100
# only like 20-25% of the binary size seems to be our code or other relevant
# stuff like ndarray. The rest seems to be mostly panic and fmt
# infrastructure...
# compile and sort crates by binary size
$ cargo bloat --release --crates
# print the size of each section in the binary
$ cargo size --bin kcddice --release -- -A
# if you already have a built rust binary, you can run
# rust-size directly:
$ rust-size -A target/release/kcddice
# strip debug info and all symbols (requires nightly) then print section size
$ RUSTFLAGS="-Z strip=symbols -C target-cpu=native" cargo +nightly \
size --bin kcddice --release -- -A
# before: 1.8 MiB! looking at the sections, it's mostly debug info.
# "-Z strip=symbols" brings this down to like 330-400 KiB (depending on
# other flags etc...)
# TODO: cargo-binutils also installed `cargo strip`; maybe that's helpful?
Cargo.toml #
[profile.release]
codegen-units = 1
lto = true
panic = "abort"
# opt-level = "s" # optimize for size
# opt-level = "z" # optimize for size, and also turn off loop vectorization
opt-level = 3
debug = 0
Compile std with panic = "abort" #
- shaves off maybe 150 KiB?
- removes a decent chunk of the backtrace/unwind infrastructure
# .cargo/config.toml
[unstable]
build-std = ["std", "panic_abort"]
build-std-features = [] # <- turns off backtrace+unwind features
WASM #
https://rustwasm.github.io/twiggy/
Example: fixing bloat #
Let me just run a quick smoketest (which depends on almost every crate in the monorepo)...
$ cargo test -p smoketest
# ..
Finished test [unoptimized + debuginfo] target(s) in 1m 28s
Running unittests src/lib.rs (target/debug/deps/smoketest-bd637d7668a0b714)
# ..
Man that sure took a while to link, I wonder how big the binary is?
$ ls -lah target/debug/deps/smoketest-bd637d7668a0b714
-rwxrwxr-x 1 phlip9 phlip9 880M May 4 11:19 target/debug/deps/smoketest-bd637d7668a0b714
JESUS. RIP MY SSD.
$ rustup component add llvm-tools-preview
$ rust-size -A target/debug/deps/smoketest-bd637d7668a0b714
section size addr
.interp 28 848
.note.gnu.property 32 880
.note.gnu.build-id 36 912
.note.ABI-tag 32 948
.gnu.hash 48 984
.dynsym 3816 1032
.dynstr 2240 4848
.gnu.version 318 7088
.gnu.version_r 432 7408
.rela.dyn 3597696 7840
.rela.plt 120 3605536
.init 27 3608576
.plt 96 3608608
.plt.got 24 3608704
.text 48736640 3608768
.fini 13 52345408
.rodata 3077016 52346880
.debug_gdb_scripts 34 55423896
.eh_frame_hdr 1529196 55423932
.eh_frame 5260568 56953128
.gcc_except_table 1449080 62213696
.tdata 72 63669824
.tbss 696 63669896
.init_array 16 63669896
.fini_array 8 63669912
.data.rel.ro 1139144 63669920
.dynamic 544 64809064
.got 701808 64809608
.data 24032 65511424
.bss 2056 65535456
.comment 43 0
.debug_aranges 5092144 0
.debug_pubnames 117174166 0
.debug_info 183307521 0
.debug_abbrev 2351170 0
.debug_line 29000740 0
.debug_str 182931466 0
.debug_loc 3646232 0
.debug_pubtypes 287664570 0
.debug_ranges 14749600 0
.debug_macro 11301 0
Total 891454821
WTF IS GOING ON WITH THE .debug_pubtypes SECTION???
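To put a number on that, here's a rough sketch: sum the `.debug_*` rows of the `rust-size -A` output and compare against the `Total` row. The function name `debug_share` is made up for this note, and it only assumes the three-column `section size addr` layout shown above.

```shell
# debug_share: read `rust-size -A` output on stdin and report how many bytes
# live in .debug_* sections compared to the Total row.
# (hypothetical helper, not part of cargo-binutils)
debug_share() {
    awk '
        $1 ~ /^\.debug_/ { debug += $2 }   # accumulate every .debug_* section
        $1 == "Total"    { total = $2 }    # remember the grand total
        END {
            if (total > 0)
                printf "debug: %.0f / %.0f bytes (%.1f%%)\n", debug, total, 100 * debug / total
        }
    '
}

# usage:
# rust-size -A target/debug/deps/smoketest-bd637d7668a0b714 | debug_share
```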
Ok ok, let's take a look at what we're working with...
$ sudo apt install dwarfdump
$ dwarfdump --print-type --format-suppress-offsets target/debug/deps/smoketest-bd637d7668a0b714 \
| head -n 10
.debug_pubtypes
'ErrorData<alloc::boxed::Box<std::io::error::Custom, alloc::alloc::Global>>'
'alloc::boxed::Box<std::io::error::Custom, alloc::alloc::Global>'
'alloc::boxed::Box<(dyn core::error::Error + core::marker::Send + core::marker::Sync), alloc::alloc::Global>'
'Result<(), std::io::error::Error>'
'NonNull<u8>'
'u8'
'SimpleMessage'
'ErrorKind'
How many types we got?
$ dwarfdump --print-type --format-suppress-offsets target/debug/deps/smoketest-bd637d7668a0b714 \
| wc -l \
| numfmt --to=si
2.3M
Maybe there's some giga types?
$ dwarfdump --print-type --format-suppress-offsets target/debug/deps/smoketest-bd637d7668a0b714 \
| awk '{ print length, $0 }' \
| sort -n -r \
> smoketest_debug_pubtypes
$ head -n 10 smoketest_debug_pubtypes
72011 '{closure_env#0}<&str, &str, n ..
71962 '&mut (nom::sequence::terminat ..
71957 '(nom::sequence::terminated::{ ..
71957 '(nom::sequence::terminated::{ ..
66740 '&mut nom::branch::alt::{closu ..
66717 '{closure_env#0}<&str, &str, n ..
66717 '{closure_env#0}<&str, &str, n ..
66668 '&mut (nom::sequence::terminat ..
66663 '(nom::sequence::terminated::{ ..
66663 '(nom::sequence::terminated::{ ..
Ok, despite nom taking the top 10, it looks like the primary culprit is my arch-nemesis warp. CURSE YOU WARP AND YOUR COMPOSABLE GENERICS.
Let's see what proportion of our .debug_pubtypes is warp...
$ cat smoketest_debug_pubtypes \
| cut -d " " -f1 - \
| awk '{sum += $1} END {print sum}' \
| numfmt --to=iec-i --suffix=B
271MiB
$ grep "nom" smoketest_debug_pubtypes \
| cut -d " " -f1 - \
| awk '{sum += $1} END {print sum}' \
| numfmt --to=iec-i --suffix=B
3.3MiB
$ grep "warp" smoketest_debug_pubtypes \
| cut -d " " -f1 - \
| awk '{sum += $1} END {print sum}' \
| numfmt --to=iec-i --suffix=B
82MiB
$ grep "lightning" smoketest_debug_pubtypes \
| cut -d " " -f1 - \
| awk '{sum += $1} END {print sum}' \
| numfmt --to=iec-i --suffix=B
45MiB
$ grep "proptest" smoketest_debug_pubtypes \
| cut -d " " -f1 - \
| awk '{sum += $1} END {print sum}' \
| numfmt --to=iec-i --suffix=B
5MiB
$ grep "Vec" smoketest_debug_pubtypes \
| cut -d " " -f1 - \
| awk '{sum += $1} END {print sum}' \
| numfmt --to=iec-i --suffix=B
71MiB
$ grep "hyper" smoketest_debug_pubtypes \
| cut -d " " -f1 - \
| awk '{sum += $1} END {print sum}' \
| numfmt --to=iec-i --suffix=B
95MiB
$ grep "tokio" smoketest_debug_pubtypes \
| cut -d " " -f1 - \
| awk '{sum += $1} END {print sum}' \
| numfmt --to=iec-i --suffix=B
95MiB
A bunch of duplicates...
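Two sketches for digging into that. `uniq -c` shows which type names repeat most often, and a small helper replaces the copy-pasted pipelines above. Both function names (`sum_matching`, `dup_counts`) are hypothetical, and both assume the `length type-name` layout of `smoketest_debug_pubtypes`. Note that the per-crate greps overlap: a type name that mentions both `hyper` and `tokio` is counted in both totals, so the per-crate numbers won't add up to the overall 271MiB.

```shell
# sum_matching PATTERN FILE: sum the length column (field 1) of every line in
# FILE that contains the fixed string PATTERN, and print the total in bytes.
sum_matching() {
    grep -F -- "$1" "$2" | awk '{ sum += $1 } END { print sum + 0 }'
}

# dup_counts FILE: drop the length column, then count identical type names,
# most-repeated first.
dup_counts() {
    cut -d " " -f 2- "$1" | sort | uniq -c | sort -n -r
}

# usage:
# for pat in nom warp lightning proptest Vec hyper tokio; do
#     printf '%s\t%s\n' "$pat" \
#         "$(sum_matching "$pat" smoketest_debug_pubtypes | numfmt --to=iec-i --suffix=B)"
# done
# dup_counts smoketest_debug_pubtypes | head -n 10
```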