Saturday, 15 June 2019

How to generate a usable map file for Rust code - and related (f)rustrations

Intro


Cargo does not produce a .map file by default, and when you do get one, mangling makes it hard to use. If you're looking for the TL;DR, skip to "How to generate a map file" further down in the article.

Motivation

As a person with experience in embedded programming I find it very useful to be able to look into the map file.

Scenarios where looking at the map file is important:
  • evaluate whether the code changes you made had the desired size impact, or at least no undesired impact; recently I saw a compiler optimize a zero-initialization of an array for speed by putting long blocks of u8 arrays in the .rodata section
  • check if a particular symbol has landed in the appropriate memory section or region
  • make an initial evaluation of which functions/code could be changed to optimize either for code size or for more readability (if the size cost is acceptable)
  • check that particular symbols have the expected sizes and/or alignments
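
For the last point, the expected numbers can be obtained from the compiler itself and then compared against the size column of the map file. A minimal sketch, assuming a made-up SampleBuffer type (it is not taken from any real project):

```rust
use std::mem::{align_of, size_of};

/// Hypothetical type, used only for illustration.
/// `repr(C)` fixes the field order, `align(16)` raises the alignment.
#[repr(C, align(16))]
pub struct SampleBuffer {
    pub len: u32,
    pub data: [u8; 40],
}

/// A static of that type; if it survives linking, its map file entry
/// should report the same size as `size_of::<SampleBuffer>()`.
pub static SAMPLE: SampleBuffer = SampleBuffer { len: 0, data: [0; 40] };

/// Returns (size, alignment) in bytes, as the compiler sees them.
pub fn expected_layout() -> (usize, usize) {
    (size_of::<SampleBuffer>(), align_of::<SampleBuffer>())
}
```

Here the fields take 44 bytes, and the 16 byte alignment pads the type to 48 bytes (0x30), which is the value to look for in the map file.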

Rustrations 

Because these kinds of scenarios are quite frequent in my work and I am used to looking at the .map file, some "rustrations" I currently face are:
  1. No map file is generated by default via cargo, and information on how to do it is sparse.
  2. If generated, the symbols are mangled and it seems each symbol is in a section of its own, making per-section (e.g. .rodata, .text, .bss, .data) or per-file analysis more difficult than it should be.
  3. I haven't found a way to disable mangling globally without editing the Rust sources. I remember there is some tool to un-mangle the output map file, but I forgot its name, and I find the need to post-process suboptimal.
  4. There is no default map file name or location; ideally it should be named after the crate or app, as specified in the .toml file.

How to generate a map file

Generating a map file for Linux (and possibly other OSes)

Unfortunately, not all architectures/targets use the same linker, and on some of them the preferred linker could change for various reasons.

Here is how I managed to generate a map file for an AMD64/x86_64 Linux target, where the linker seems to be GLD (GNU ld):

Create a .cargo/config file with the following content:

.cargo/config:
[build]
    rustflags = ["-Clink-args=-Wl,-Map=app.map"]

This should apply to all targets that use GLD as the linker, so I suspect it is not portable to Windows targets that use the MSVC toolchain.

Generating a map file for thumbv7m with rust-lld


On bare-metal targets such as Cortex-M7 (thumbv7m), where you might want to use the LLVM-based rust-lld, more linker options might be necessary to prevent linking with compiler-provided startup code or libraries, so the config would look something like this:

.cargo/config:
[build]
target = "thumbv7m-none-eabi"
rustflags = ["-Clink-args=-Map=app.map"]

Note that the -Wl, prefix is gone here, because rust-lld is invoked directly rather than through a compiler driver.

The thing I dislike about this is that the target is forced to thumbv7m-none-eabi, so unit tests or generic code which could run on the build computer become harder to test.

Note: if you are using rustc directly, just pass the same extra options on the command line.

Map file generation with some readable symbols

After the changes above are done, you'll get an app.map file with a predefined name (even if the crate is a lib). If anyone knows how to keep the crate name, or at least use lib.map for libs and app.map for apps when the original project name can't be used, please comment.
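
One possible direction for keeping the crate name, sketched under assumptions: newer Cargo versions let a build script (build.rs) pass extra linker arguments through the cargo:rustc-link-arg directive, and Cargo sets the CARGO_PKG_NAME environment variable for build scripts. The helper name map_file_directive is made up for illustration, the -Wl, prefix assumes a gcc-style linker driver as in the Linux setup above, and where the file lands may depend on the directory the linker is run from.

```rust
// Sketch for build.rs. `map_file_directive` is a hypothetical helper name.

/// Builds the Cargo directive that asks the linker for `<pkg_name>.map`.
/// Assumes a gcc-style linker driver, hence the `-Wl,` prefix.
pub fn map_file_directive(pkg_name: &str) -> String {
    format!("cargo:rustc-link-arg=-Wl,-Map={}.map", pkg_name)
}

// In build.rs, `main` would print the directive so that Cargo picks it up:
//
// fn main() {
//     let name = std::env::var("CARGO_PKG_NAME").expect("set by Cargo");
//     println!("{}", map_file_directive(&name));
// }
```

With this approach the rustflags entry in .cargo/config would no longer be needed for the map file, but the linker-specific spelling of the flag remains.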

The problems with the generated map file are that:
  1. all symbol names are mangled, so you can't easily connect them back to the code; the alternative is to force the compiler not to mangle, by adding #[no_mangle] before the interesting symbols.
  2. each symbol seems to be put in its own subsection (e.g. an initialized array in a subsection of .data).

Dealing with mangling

For problem 1, the fix is to add #[no_mangle] in the source to the symbols or functions of interest, like this:

#[no_mangle]
pub fn sing(start: i32, end: i32) -> String {
    // code body follows; the placeholder below only keeps the snippet compiling
    unimplemented!()
}

Dealing with mangling globally

I wasn't able to find a way to convince cargo to apply no_mangle to the entire project, so if you know how to, please comment. I thought that using #![no_mangle] to apply the attribute globally in a file would work, but it doesn't seem to work as expected: the subsection still contains the mangled name, while the symbol seems to be "namespaced".

Here is a section from the map file of the #![no_mangle] (global) version:
.text._ZN9beer_song5verse17h0d94ba819eb8952aE
                0x000000000004fa00      0x61e /home/eddy/usr/src/rust/learn-rust/exercism/rust/beer-song/target/release/deps/libbeer_song-d80e2fdea1de9ada.rlib(beer_song-d80e2fdea1de9ada.beer_song.5vo42nek-cgu.3.rcgu.o)
                0x000000000004fa00                beer_song::verse
 
When the #[no_mangle] attribute is attached directly to the function, the subsection is not mangled and the symbol seems to be global:

.text.verse    0x000000000004f9c0      0x61e /home/eddy/usr/src/rust/learn-rust/exercism/rust/beer-song/target/release/deps/libbeer_song-d80e2fdea1de9ada.rlib(beer_song-d80e2fdea1de9ada.beer_song.5vo42nek-cgu.3.rcgu.o)
                0x000000000004f9c0                verse

I would prefer a global cargo option that switches mangling off for the entire project, so that no code changes are needed. Comments are welcome.
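
If post-processing turns out to be acceptable after all, the mangling scheme seen in the excerpt above (_ZN, then length-prefixed path components, then a hash component, then E) is simple enough to undo by hand. The function below is a simplified sketch and not the tool whose name I forgot: demangle_legacy is a made-up name, and escapes such as $LT$ (used for generics) are left untouched.

```rust
/// Demangles a symbol of the form `_ZN<len><ident>...17h<hash>E`.
/// Returns `None` if the input does not have that shape.
/// Simplified sketch: `$LT$`-style escapes are not decoded.
pub fn demangle_legacy(sym: &str) -> Option<String> {
    let mut rest = sym.strip_prefix("_ZN")?.strip_suffix('E')?;
    let mut parts: Vec<&str> = Vec::new();
    while !rest.is_empty() {
        // Each path component is prefixed with its decimal length.
        let digits = rest.bytes().take_while(|b| b.is_ascii_digit()).count();
        let len: usize = rest.get(..digits)?.parse().ok()?;
        let end = digits.checked_add(len)?;
        parts.push(rest.get(digits..end)?);
        rest = &rest[end..];
    }
    // Drop the trailing hash component: 'h' followed by 16 hex digits.
    let is_hash = |p: &str| {
        p.len() == 17 && p.starts_with('h') && p[1..].bytes().all(|b| b.is_ascii_hexdigit())
    };
    if parts.last().map_or(false, |p| is_hash(p)) {
        parts.pop();
    }
    if parts.is_empty() {
        None
    } else {
        Some(parts.join("::"))
    }
}
```

Calling it on _ZN9beer_song5verse17h0d94ba819eb8952aE yields beer_song::verse, which matches the name the linker printed next to the address in the excerpt.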

Each symbol in its section

The second issue is quite annoying, even though having each symbol in its own section can be useful for controlling every symbol's placement via the linker script. I guess that to fix this I need a custom linker script that redirects, say, all constant "subsections" into the ".rodata" section.

I haven't tried this, but it should work.
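
In the meantime, per-section totals can be recovered by post-processing the map file. A rough sketch, assuming the GLD map layout shown in the excerpts above (section name, address, size, object file, with long names wrapped onto their own line) and counting only per-symbol subsections such as .text.verse; size_per_section and its helpers are made-up names:

```rust
use std::collections::BTreeMap;

/// ".text.verse" -> Some(".text"); names without a subsection give None.
fn top_level(section: &str) -> Option<&str> {
    let rest = section.strip_prefix('.')?;
    let dot = rest.find('.')?;
    Some(&section[..dot + 1])
}

/// Parses a "0x..." token as a hexadecimal number.
fn parse_hex(tok: &str) -> Option<u64> {
    u64::from_str_radix(tok.strip_prefix("0x")?, 16).ok()
}

/// Sums the sizes of per-symbol subsections, grouped by top-level section.
pub fn size_per_section(map: &str) -> BTreeMap<String, u64> {
    let mut totals: BTreeMap<String, u64> = BTreeMap::new();
    // Section name seen alone on the previous line, waiting for its size.
    let mut pending: Option<&str> = None;
    for line in map.lines() {
        let toks: Vec<&str> = line.split_whitespace().collect();
        let entry = match (pending.take(), toks.as_slice()) {
            // Short name: section, address and size share one line.
            (_, [name, addr, size, ..]) if name.starts_with('.') => {
                parse_hex(addr).and(parse_hex(size)).map(|s| (*name, s))
            }
            // Long name: the section name sits alone on its line.
            (_, [name]) if name.starts_with('.') => {
                pending = Some(*name);
                None
            }
            // Continuation line with address and size for the pending name.
            (Some(name), [addr, size, ..]) => {
                parse_hex(addr).and(parse_hex(size)).map(|s| (name, s))
            }
            // Symbol lines (address + name) and anything else are ignored.
            _ => None,
        };
        if let Some((name, size)) = entry {
            if let Some(top) = top_level(name) {
                *totals.entry(top.to_string()).or_insert(0) += size;
            }
        }
    }
    totals
}
```

Feeding it the two .text entries quoted earlier would report 0x61e bytes twice under .text, so the grouping the linker script would have done is approximated after the fact.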