optimizing binary size

2022-02-20 · 5 min read

Setup #

# for running cargo bloat
$ RUSTFLAGS="-C target-cpu=native" cargo install cargo-bloat

# for running cargo size (among other things)
$ rustup component add llvm-tools-preview
$ rustup +nightly component add llvm-tools-preview # (for strip=symbols)
$ RUSTFLAGS="-C target-cpu=native" cargo install cargo-binutils

TL;DR #

# compile and sort functions by binary size
$ cargo bloat --release -n 100

# only like 20-25% of the binary size seems to be our code or other relevant
# stuff like ndarray. The rest seems to be mostly panic and fmt
# infrastructure...

# compile and sort crates by binary size
$ cargo bloat --release --crates

# print the size of each section in the binary
$ cargo size --bin kcddice --release -- -A

# if you already have a built rust binary, you can run
# rust-size directly:
$ rust-size -A target/release/kcddice

# strip debug info and all symbols, then print section size
# (-Z strip needs nightly; on Rust 1.59+ the stable equivalent is -C strip=symbols)
$ RUSTFLAGS="-Z strip=symbols -C target-cpu=native" cargo +nightly \
	size --bin kcddice --release -- -A

# before: 1.8 MiB! looking at the sections, it's mostly debug info.
# "-Z strip=symbols" brings this down to like 330-400 KiB (depending on
# other flags etc...)

# TODO: cargo-binutils also installed `cargo strip`; maybe that's helpful?

Cargo.toml #

[profile.release]
codegen-units = 1
lto = true
panic = "abort"
# opt-level = "s" # optimize for size
# opt-level = "z" # optimize for size, also turns off loop vectorization
opt-level = 3
debug = 0

Compile std with panic = "abort" #

  • shaves off maybe 150 KiB?
  • removes a decent chunk of the backtrace/unwind infrastructure

# .cargo/config.toml (build-std is nightly-only and needs an explicit --target)
[unstable]
build-std = ["std", "panic_abort"]
build-std-features = [] # <- turns off backtrace+unwind features

WASM #

twiggy, a code size profiler for Wasm: https://rustwasm.github.io/twiggy/

Example: fixing bloat #

Let me just run a quick smoketest (which depends on almost every crate in the monorepo)...

$ cargo test -p smoketest
# ..
    Finished test [unoptimized + debuginfo] target(s) in 1m 28s
     Running unittests src/lib.rs (target/debug/deps/smoketest-bd637d7668a0b714)
# ..

Man that sure took a while to link, I wonder how big the binary is?

$ ls -lah target/debug/deps/smoketest-bd637d7668a0b714
-rwxrwxr-x 1 phlip9 phlip9 880M May  4 11:19 target/debug/deps/smoketest-bd637d7668a0b714

JESUS. RIP MY SSD.

$ rustup component add llvm-tools-preview
$ rust-size -A target/debug/deps/smoketest-bd637d7668a0b714
section                     size       addr
.interp                       28        848
.note.gnu.property            32        880
.note.gnu.build-id            36        912
.note.ABI-tag                 32        948
.gnu.hash                     48        984
.dynsym                     3816       1032
.dynstr                     2240       4848
.gnu.version                 318       7088
.gnu.version_r               432       7408
.rela.dyn                3597696       7840
.rela.plt                    120    3605536
.init                         27    3608576
.plt                          96    3608608
.plt.got                      24    3608704
.text                   48736640    3608768
.fini                         13   52345408
.rodata                  3077016   52346880
.debug_gdb_scripts            34   55423896
.eh_frame_hdr            1529196   55423932
.eh_frame                5260568   56953128
.gcc_except_table        1449080   62213696
.tdata                        72   63669824
.tbss                        696   63669896
.init_array                   16   63669896
.fini_array                    8   63669912
.data.rel.ro             1139144   63669920
.dynamic                     544   64809064
.got                      701808   64809608
.data                      24032   65511424
.bss                        2056   65535456
.comment                      43          0
.debug_aranges           5092144          0
.debug_pubnames        117174166          0
.debug_info            183307521          0
.debug_abbrev            2351170          0
.debug_line             29000740          0
.debug_str             182931466          0
.debug_loc               3646232          0
.debug_pubtypes        287664570          0
.debug_ranges           14749600          0
.debug_macro               11301          0
Total                  891454821
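
To see at a glance how much of that total is debug info, the table can be piped through a small awk filter. This is only a rough sketch: `debug_share` is a made-up helper name, and the table in the heredoc is invented sample data in the same format, not real output.

```shell
# Sum the `.debug_*` sections from `rust-size -A` output (read on stdin)
# and report their share of the "Total" line.
debug_share() {
	awk '
		# column 1 is the section name, column 2 its size in bytes
		$1 ~ /^\.debug_/ { debug += $2 }
		$1 == "Total"    { total = $2 }
		END {
			if (total > 0)
				printf "debug: %d of %d bytes (%.1f%%)\n", debug, total, 100 * debug / total
		}
	'
}

# Tiny invented table, just to show the output shape.
debug_share <<'EOF'
section        size   addr
.text           600   4096
.rodata         100   8192
.debug_info     200      0
.debug_str      100      0
Total          1000
EOF
# prints: debug: 300 of 1000 bytes (30.0%)
```

On a real binary the same function would be fed directly, e.g. `rust-size -A target/debug/deps/smoketest-bd637d7668a0b714 | debug_share`.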

WTF IS GOING ON WITH THE .debug_pubtypes SECTION???

Ok ok, let's take a look at what we're working with...

$ sudo apt install dwarfdump

$ dwarfdump --print-type --format-suppress-offsets target/debug/deps/smoketest-bd637d7668a0b714 \
	| head -n 10

.debug_pubtypes
 'ErrorData<alloc::boxed::Box<std::io::error::Custom, alloc::alloc::Global>>'
 'alloc::boxed::Box<std::io::error::Custom, alloc::alloc::Global>'
 'alloc::boxed::Box<(dyn core::error::Error + core::marker::Send + core::marker::Sync), alloc::alloc::Global>'
 'Result<(), std::io::error::Error>'
 'NonNull<u8>'
 'u8'
 'SimpleMessage'
 'ErrorKind'

How many types we got?

$ dwarfdump --print-type --format-suppress-offsets target/debug/deps/smoketest-bd637d7668a0b714 \
	| wc -l \
	| numfmt --to=si
2.3M

Maybe there's some giga types?

$ dwarfdump --print-type --format-suppress-offsets target/debug/deps/smoketest-bd637d7668a0b714 \
	| awk '{ print length, $0 }' \
	| sort -n -r \
	> smoketest_debug_pubtypes

$ head -n 10 smoketest_debug_pubtypes
72011  '{closure_env#0}<&str, &str, n ..
71962  '&mut (nom::sequence::terminat ..
71957  '(nom::sequence::terminated::{ ..
71957  '(nom::sequence::terminated::{ ..
66740  '&mut nom::branch::alt::{closu ..
66717  '{closure_env#0}<&str, &str, n ..
66717  '{closure_env#0}<&str, &str, n ..
66668  '&mut (nom::sequence::terminat ..
66663  '(nom::sequence::terminated::{ ..
66663  '(nom::sequence::terminated::{ ..

Ok, despite nom taking the top 10, it looks like the primary culprit is my arch nemesis warp. CURSE YOU WARP AND YOUR COMPOSABLE GENERICS.

Let's see what proportion of our .debug_pubtypes is warp...

$ cat smoketest_debug_pubtypes \
	| cut -d " " -f1 - \
	| awk '{sum += $1} END {print sum}' \
	| numfmt --to=iec-i --suffix=B
271MiB

$ grep "nom" smoketest_debug_pubtypes \
	| cut -d " " -f1 - \
	| awk '{sum += $1} END {print sum}' \
	| numfmt --to=iec-i --suffix=B
3.3MiB

$ grep "warp" smoketest_debug_pubtypes \
	| cut -d " " -f1 - \
	| awk '{sum += $1} END {print sum}' \
	| numfmt --to=iec-i --suffix=B
82MiB

$ grep "lightning" smoketest_debug_pubtypes \
	| cut -d " " -f1 - \
	| awk '{sum += $1} END {print sum}' \
	| numfmt --to=iec-i --suffix=B
45MiB

$ grep "proptest" smoketest_debug_pubtypes \
	| cut -d " " -f1 - \
	| awk '{sum += $1} END {print sum}' \
	| numfmt --to=iec-i --suffix=B
5MiB

$ grep "Vec" smoketest_debug_pubtypes \
	| cut -d " " -f1 - \
	| awk '{sum += $1} END {print sum}' \
	| numfmt --to=iec-i --suffix=B
71MiB

$ grep "hyper" smoketest_debug_pubtypes \
	| cut -d " " -f1 - \
	| awk '{sum += $1} END {print sum}' \
	| numfmt --to=iec-i --suffix=B
95MiB

$ grep "tokio" smoketest_debug_pubtypes \
	| cut -d " " -f1 - \
	| awk '{sum += $1} END {print sum}' \
	| numfmt --to=iec-i --suffix=B
95MiB
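
The same grep/cut/awk pipeline repeats for every pattern, so it could be folded into one helper. A minimal sketch, assuming the "<length> <type name>" file layout produced earlier; `pubtypes_bytes` is a hypothetical name and the sample type names below are invented. It also shows why the per-pattern figures above add up to well over the 271MiB total: a single type name counts toward every pattern it matches.

```shell
# Sum the length column (field 1) of the lines in FILE that match PATTERN.
# Usage: pubtypes_bytes PATTERN FILE
pubtypes_bytes() {
	grep -- "$1" "$2" \
		| cut -d " " -f1 \
		| awk '{ sum += $1 } END { print sum + 0 }'
}

# Invented sample in the "<length> <type name>" layout (lengths are made up too).
sample=$(mktemp)
cat > "$sample" <<'EOF'
30  'warp::filter::and::And<hyper::Body>'
20  'alloc::vec::Vec<tokio::task::JoinHandle>'
10  'u8'
EOF

# The first line counts for both warp and hyper, the second for both
# tokio and Vec, so the four results sum to 100 although the file holds 60.
for pattern in warp hyper tokio Vec; do
	printf '%s\t%s\n' "$pattern" "$(pubtypes_bytes "$pattern" "$sample")"
done
rm -f "$sample"
```

Against the real file this would look like `pubtypes_bytes warp smoketest_debug_pubtypes | numfmt --to=iec-i --suffix=B`.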

A bunch of duplicates...
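
One way to put a number on the duplicates is to collapse identical lines with `sort | uniq -c` and compare the bytes before and after. This is a sketch under the same "<length> <type name>" file assumption; `dup_summary` is a hypothetical helper and the sample data is invented.

```shell
# Summarize exact duplicates in a "<length> <type name>" file:
# total bytes, bytes if every name were kept once, and the difference.
dup_summary() {
	sort "$1" \
		| uniq -c \
		| awk '
			# uniq -c prepends the repeat count, so $1 = count, $2 = length
			{ total += $1 * $2; unique += $2 }
			END { printf "total=%d unique=%d duplicated=%d\n", total, unique, total - unique }
		'
}

# Invented sample, not real dwarfdump output.
sample=$(mktemp)
cat > "$sample" <<'EOF'
10  'Vec<u8>'
10  'Vec<u8>'
10  'Vec<u8>'
5  'u8'
EOF
dup_summary "$sample"   # prints: total=35 unique=15 duplicated=20
rm -f "$sample"
```

On the real data this would be `dup_summary smoketest_debug_pubtypes`, with each field piped through `numfmt` if human-readable sizes are wanted.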