Wednesday, 18 July 2007

How to build rrdtool with soft-float for a debian arm machine (which uses hard float)

Being a good Debian user

Notes: the bulk of this article was written on the 17th; updates were added in later; this article is really long, you might want to read it in more sessions

Debian's arm port uses hardfloat. This means that apps like rrdtool (graph function) take ages to run because there is no floating point unit. In order to make this faster you'll want to use soft-float emulation. But this support needs to be present in all depended upon libs, including libc, which is an ABI incompatible change which would break all binaries. I can't use the yet-in-development-and-unofficial armel port since the machine has to be up and running almost 24/7 and I can't afford downtime.

So what to do?

Well, obviously use another soft-float enable libc, like uclibc, in parallel withe the hrad-float enabled libc. Debian Sid contains uclibc development libs, but building the whole toolchain is a pain. There is no real support in the distro itself and building on the arm machine is kind of out of the question. I tried a little in a debootstrapped chroot on the native machine, but this was a pain and I was stuck tring to figure out how to disable whitles and bells for rrdtool since I didn't needed any of the perl, tcl or python support. So I stopped, went to bed and decided I'll continue the next day (the day I wrote the biggest part of this article). The future looked dark, trying to build a softfloat uclibc rrdtool on a system that wasn't prepared for that and from which softfloat was pulled away somewhere during etch's development stage....

Deciding to go with the competition - native building

From link to link I found out that gentoo has arm-softfloat-linux-uclibc port, among others, so I decided to use that on the target machine (a NSLU2) in a chroot. Well I stumbled on the low computing power, when I tryed to emerge sync the tree. I gave up un that and tried to use emerge-websync. It looked better, but I didn't got too far with this either since the machine ended up being overloaded, with figures varying from 3 to 8 for the load average (all of them).

Deciding to go with the competition - cross building

It was time to bring in the big guns and I decided I was going to crossbuild those binaries, probably statically since it meant lower risk for things to break on the target machine.

Since I observed that gentoo has a nice tool called crossdev which actually uses emerge (the equivalent of dpkg+apt in Debian world) to make the cross chain tool and saw that they have some really good documentation and a development version of an arm-softfloat-linux-uclibc, I decided I could .

So I went on and downloaded the latest stage3 tarball for x86 (so I can have a gentoo system to start with), made all the necessary things and created a gentoo chroot. I emerge sync-ed, updated portage, installed crossdev (following the nice instructions on the gentoo embedded site about the subject). This took a while, but it went like a breeze. I made the necessary arangements to be able to cross build packages (setting SYSROOT, making that nice xmerge script).

As my goal was to make an arm-softfloat-linux-uclibc binary for rrdtool I tried directly to xmerge rrdtool (of course, first just pretended -p).

Finding bugs in the packages

First bug

All went fine until I found freetype which for some weird reason detected the i486-pc-linux-gnu-gcc compiler as i486-pc-linux-gnu (observe the missing trailing -gcc). I looked over the logs and determined that the CC_BUILD environment variable was not set. So exported CC_BUILD=i486-pc-linux-gnu-gcc and xmerged the library.

(While trying to fix this issue I entered #gentoo-embedded on freenode and actually met Yuri Vasilev, whom I had met in Edinburgh, at DebConf. Look for the video about Ligusk if you are interested about his project to make a Debian like distribution based on Gentoo. It was nice to meet a friendly "face" - as much you can say that online.)

I reported the bug (already there is a patch, I haven't tested, but looks ok) and the workaround in the gentoo bugzilla... after creating an account, since I never used that BTS before.

Second bug

Next pain was libart_lgpl whose upstream, for some comfortability reason chose to detect the target type sizes (and basically ignoring inttypes.h) with a small program which was built with the cross compiler which later was supposed to be ran on the host. I don't think is necessary to say that i failed, of course, since the binary was an ARM binary, while the host was an x86 machine. Yupee! not.

Ok, two bugs, no softfloat rrdtool yet. I reported this second bug and tried to work around the issue. Since I didn't really used too much a gentoo system, I wanted to know details and what possibilities I had. I got some really nice tips from "solar" (aka Ned Ludd) on #gentoo-embedded. He pointed out that while building a package one can press CTRL+z, do some stuff, like altering the file in the build directory and then fg back. Ok, cool.

All I need now is somebody with direct access to an arm machine that can compile and run for me the program in question. seanius on #debian-devel was kind enough to do this for me and, to my surprize, I got the same output header file as I got in the host x86 chroot. But that detail is not important, let's just compile that rrdtool. Well, first libart_lgpl...

Ok, I CTRL+Z-ed the build process just after the configure stage ended, cd-ed into the build directory, I tampered with the file and changed that line from

./gen_art_config > gen_art_config.h


cp /gen_art_config.h.arm gen_art_config.h

Haha! In your face! Ok, nasty did was done so I fg-ed and went on confident that I will get my hands on that rrdtool very soon. The package build ended succesfully, without any other suprizes, so I went for the graal!

xmerging rrdtool, the bugs don't stop

xmerge -v rrdtool

Bla, bla, bla.... bla, bla, bla, what do you know, there are a few IEEE math function tests in the configuration script of rrdtool. They failed. I don't know why, but I have a feeling that this should be happening on softfloat or even maybe the rrdtool source is not that cross compile aware. I didn't care that much, so I did what I knew best, tampering with an ongoing build.

This time is was a little bit harder to catch since I had to wait for the source to unpack (well, there is a larger frame, but that is the rough timing) and at that moment CTRL+Z the build.

I didn't had much time to waste, so after looking shortly over the tests I realized I should just ignore the error and look for the "failing command" that was interrupting my beloved build. Muhahah!

Thank God for verbose error messages, I could find the place fast and chaged the two "exit 1" statements into two harmless 'echo "Ignoring IEEE math errors"', sic. I had to do this a few times since I got the timping wrong, at some point I got the syntax wrong (at which point I remembered the cp trick I did earlier and saved the modified file in /configure.rrdtool )...

xmerging rrdtool and the 4th bug

After passing past this stage, I got a compilation error which, judging from the command, meant that the linker tried to link the target (arm) binary against the host (x86) libraries.

Oooook, this is not going to stop me! I look at the to try understand why this happened and then realized it wasn't worth it for me to try to fix it the right way. So I cheated once again! Muhaha, I just ran some useful commands which I'll let you feast your eyes to:

ln -sf /usr/arm-softfloat-linux-uclibc/usr/lib/ /usr/lib/
ln -sf /usr/arm-softfloat-linux-uclibc/usr/lib/libpng12.a /usr/lib/
ln -sf /usr/arm-softfloat-linux-uclibc/usr/lib/ /usr/lib/

Ok, good to go. Of course, I had to stop again the build to copy the "fixed" configure script that ignored IEEE math errors, but that is already history.

And, believe it or not, actually the build succeded. I had a cross built rrdtool using shared libraries built against softfloat ucllibc. No guarantees about its correctness, but it built!

This is when I decided to start writing this article.

Still there are things to do

I will have to do a few things before being happy about the victory:
  • check that all the resulted binaries are indeed ARM ELF files, and not Intel x86. It appears the shared libs are already.
    • done, they are (checked the next day)
  • figure out qpkg (is a utility from portage-utils which seems to be a way to make binary packages) to package somehow the result of this work or maybe just ignore it and use plain cp
  • see how I can fit together both the softfloat uclibc binaries and the native debian binaries on the same filesystem, including configuration...
    • but I fear this will be a nightmare, so I might decide to go with static compilation of rrdtool, which I don't think it will be quite straight forward given the experiences describe above; there could be also the size issue to take into account, and since the NSLU2 has only 32MB of RAM, this could be problematic
      • I think I'll have to use a statically linked binary after all (note added late, the second day)
  • see if the rrdtool binary really works and if all the work payed off, performance wise - it should, taking into account what I have heard from wookey during the debconf7 talk about the armel (warning, 73MB ogg file) port
  • be happy with the solution I got or wait for the Debian armel port to catch up and switch to that asap
    • if I'll have to do the switch I'll probably think of ways of migrating the whole system from arm to armel, and not to reinstall everything; this might prove useful for the whole debian armel port, when people are working on migration plans, since arm will be deprecated in favour of armel

1 comment:

Anonymous said...

I've done more or less the same things as you describe here to get rrdtool running on a Western Digital MyBook World Edition running debian.

As there is currently no build for armel, I'm using the arm port. I wanted to run cacti, but rrdtool would take about 17 min to draw a graph.

Using a buildroot environment, I compiled it static, by letting rrdtool's make finish and then deleting the binaries. Running make again shows you exactly the command you have to change, just go in the src directory and issue the command - adding -static and changing all .so libs to .a... Static building allows you to place the resulting binary in /usr/local/bin and don't worry about messing with duplicate libraries etc.

I also had a problem with fonts (not showing), but then figured out that there is a compile-time variable for specifying font locations. Cacti allows you to directly point to the fonts, so that problem was solved as well.

So now I have rrdtool working - instead of 17 minutes it takes a couple of seconds!