Saturday, 15 June 2019

How to generate a usable map file for Rust code - and related (f)rustrations


Cargo does not produce a .map file, and if it does, mangling makes it very unusable. If you're searching for the TLDR, read from "How to generate a map file" on the bottom of the article.


As a person with experience in embedded programming I find it very useful to be able to look into the map file.

Scenarios where looking at the map file is important:
  • evaluate if the code changes you made had the desired size impact or no undesired impact - recently I saw a compiler optimize for speed an initialization with 0 of an array by putting long blocks of u8 arrays in .rodata section
  • check if a particular symbol has landed in the appropriate memory section or region
  • make an initial evaluation of which functions/code could be changed to optimize either for code size or for more readability (if the size cost is acceptable)
  • check particular symbols have expected sizes and/or alignments


Because these kind of scenarios  are quite frequent in my work and I am used to looking at the .map file, some "rustrations" I currently face are:
  1. No map file is generated by default via cargo and information on how to do it is sparse
  2. If generated, the symbols are mangled and it seems each symbol is in a section of its own, making per section (e.g. .rodata, .text, .bss, .data) or per file analysys more difficult than it should be
  3. I haven't found a way disable mangling globally, without editing the rust sources. - I remember there is some tool to un-mangle the output map file, but I forgot its name and I find the need to post-process suboptimal
  4. no default map file filename or location - ideally it should be named as the crate or app, as specified in the .toml file.

How to generate a map file

Generating map file for linux (and possibly other OSes)

Unfortunately, not all architectures/targets use the same linker, or on some the preferred linker could change for various reasons.

Here is how I managed to generate a map file for an AMD64/X86_64 linux target where it seems the linker is GLD:

Create a .cargo/config file with the following content:

    rustflags = ["-Clink-args=-Wl,"]

This should apply to all targets which use GLD as a linker, so I suspect this is not portable to Windows integrated with MSVC compiler.

Generating a map file for thumb7m with rust-lld

On baremetal targets such as Cortex M7 (thumbv7m where you might want to use the llvm based rust-lld, more linker options might be necessary to prevent linking with compiler provided startup code or libraries, so the config would look something like this:
target = "thumbv7m-none-eabi"
rustflags = [""]
The thins I dislike about this is the fact the target is forced to thumbv7m-none-eabi, so some unit tests or generic code which might run on the build computer would be harder to test.

Note: if using rustc directly, just pass the extra options

Map file generation with some readable symbols

After the changes above ae done, you'll get an file (even if the crate is of a lib) with a predefined name, If anyone knows ho to keep the crate name or at least use for libs, and for apps, if the original project name can't be used.

The problems with the generated linker script are that:
  1. all symbol names are mangled, so you can't easily connect back to the code; the alternative is to force the compiler to not mangle, by adding the #[(no_mangle)] before the interesting symbols.
  2. each symbol seems to be put in its own subsection (e.g. an initalized array in .data.

Dealing with mangling

For problem 1, the fix is to add in the source #[no_mangle] to symbols or functions, like this:

pub fn sing(start: i32, end: i32) -> String {
    // code body follows

Dealing with mangling globally

I wasn't able to find a way to convince cargo to apply no_mangle to the entire project, so if you know how to, please comment. I was thinking using #![no_mangle] to apply the attribute globally in a file would work, but is doesn't seem to work as expected: the subsection still contains the mangled name, while the symbol seems to be "namespaced":

Here is a some section from the #![no_mangle] (global) version:
                0x000000000004fa00      0x61e /home/eddy/usr/src/rust/learn-rust/exercism/rust/beer-song/target/release/deps/libbeer_song-d80e2fdea1de9ada.rlib(beer_song-d80e2fdea1de9ada.beer_song.5vo42nek-cgu.3.rcgu.o)
                0x000000000004fa00                beer_song::verse
When the #[no_mangle] attribute is attached directly to the function, the subsection is not mangled and the symbol seems to be global:

.text.verse    0x000000000004f9c0      0x61e /home/eddy/usr/src/rust/learn-rust/exercism/rust/beer-song/target/release/deps/libbeer_song-d80e2fdea1de9ada.rlib(beer_song-d80e2fdea1de9ada.beer_song.5vo42nek-cgu.3.rcgu.o)
                0x000000000004f9c0                verse
I would prefer to have a cargo global option to switch for the entire project, and code changes would not be needed, comment welcome.

Each symbol in its section

The second issue is quite annoying, even if the fact that each symbol is in its own section can be useful to control every symbol's placement via the linker script, but I guess to fix this I need to custom linker file to redirect, say all constants "subsections" into ".rodata" section.

I haven't tried this, but it should work.

No comments: