Monday 17 December 2007

speechless

Why don't you replace it with a lioness?

I am sure some people will say I am missing something...

This political correctness has gone too far once again.

Thursday 6 December 2007

I am laptop-less, but disk-ful

Today I went for the second time[1] to emag's service[2] to have my laptop's[*] battery replaced, since it was powering the laptop for only 15 (yes, fifteen) minutes. This, only 11 months after I bought the laptop[**].

I didn't notice this drastic reduction until recently (the last figure I remember was about 1 hour and 50 minutes, back in August), because I haven't been so mobile since August.

To make a long story short, I had to: wait, wait, argue, wait, argue, talk normally to the manager, wait, wait, argue, argue, observe sheer contempt from the staff when I asked again for the manager, talk normally to the manager, and leave after 4-5 wasted hours (+2 yesterday).

All this just to be able to keep my data safe by holding on to my hard disk, since they insisted they wanted the whole laptop[3], and making them write on the warranty receipt they were the ones who pulled out the HDD (I might be over-cautious/paranoid, but I wouldn't trust emag with the garbage bin).


Now I am on a forced vacation away from Debian for an undefined time, although I might be able to send mails and make small patches from work.


Hey, emag:
  • teach your employees to behave properly like the manager did! (I don't remember her name, sorry, I'm really bad with names)
  • you'd better redirect clients with broken hardware to the actual services instead of acting like a useless buffer when it is clear you're in over your heads
  • simple and obvious defects like the one in my case (I could have proven it in exactly 15 minutes) should be handled with minimum impact for the client; for example, keep the faulty battery, but let me take the laptop home


[*] Dell Inspiron 6400 / E1505
[**] when I think that the long battery time was one of the most important points, it looks ridiculous now to have such bad degradation in such a short time

[1] first was yesterday, but due to some support person's incompetence I had to postpone - I asked two questions, they gave me two wrong answers
[2] after all this I wouldn't recommend emag to anyone
[3] You might ask why they would need the whole laptop for a clear 'battery is broken' complaint. Apparently "the service can't test the batteries without the laptop". They probably have neither spare laptops nor dedicated equipment to measure batteries.

Tuesday 4 December 2007

People who think the default wiki theme is nice

The people who think the default wiki.d.o theme is nice should think again. Thanks to an anonymous reader I was pointed to http://browsershots.org and I looked at the wiki.debian.org default theme. Then I shrugged.

This was after I made a few screen shots myself to prove that the default wiki.d.o theme looks like crap on MSIE or Opera even on a 1024px wide screen, not to mention an 800px wide screen or a smaller window, be it MSIE or Opera.

In case you think I have faked something, see for yourselves at http://browsershots.org/http://wiki.debian.org/ (the third request is for an 800px wide screen, so expect fun when it is done uploading MSIE variants).

wiki.d.o theme - is it bad timing or too late?

I have recently been working on the wiki.d.o theme; this work was started last year around May - June and I halted it due to lack of time. The point was (as opposed to what many might have thought) to make a theme that looks like the www.d.o theme.

Currently the TODO is outdated, so here is an updated one (I'll update it and push it there, too):
  • fix the alignment issues of the edit page for MSIE
  • port the changes done in __init__.py into debian.py
  • clean up the screen.css and add comments for the hacks made for MSIE
  • decide if the CSS3 border-radius is worth the trouble and, if it makes sense, replace it with a portable implementation like the one done for http://www.debian-ports.org/
  • retest on Opera, MSIE 6.0, iceweasel/Firefox and konqueror (iceweasel is the reference)
  • test on other browsers (maybe MSIE 7.0)
There are some screen shots with the current theme as seen on Opera, Epiphany (for some reason sometimes different from iceweasel) and MSIE 6.0, but here they are again since I expect that my public darcs repo will change:

In MSIE (by far the most annoying browser, and the one that creates the most issues):


In Opera:


In Epiphany (the window was smaller, the content isn't bigger):

If you want to contribute, you can:

darcs get http://users.alioth.debian.org/~eddyp-guest/darcs/wiki.debian.org_theme

read the README to understand what matters, and tackle one of the items on the TODO list above.



I find it a pity that Zack has based his PTS rework on the wiki.d.o default theme, since that theme is, in my opinion, the one that should be corrected (at least the colors should be).

I am not an expert in aesthetics and I don't claim that the www.d.o theme is beautiful, but it seems that the recurrent discussion of it being replaced with something better never found a person to make it happen. This is one of the reasons I stopped last year, since it seemed the dream would come true at that time.

Thursday 29 November 2007

Have you ever driven your dreams?

Thomas writes about a dream of his and how his dream self failed to do the right thing.

Several times, while asleep, I have been aware enough to realize I was dreaming and to start driving my dream the way I wanted.

I made myself fly Superman-like and fly a plane, I managed to come in second place after Ayrton Senna (this was while he was alive and I respected him too much to take his victory), I managed to do a lot of things. At some point I even managed to set right a recursive nightmare I had and got rid of it that way.

Unfortunately, I haven't been able to do this in the last 3 or 4 years, but I know I had to set a goal before going to sleep in order to achieve it.

Has anyone else experienced this?

I thought that was Debian's 50000

Phew, Rob, I thought that was Debian's 50000th bug!

Fortunately, I haven't lost the bet!

Tuesday 27 November 2007

Updates: NSLU2, Andrew S. Tanenbaum in .ro

Last weekend was as hectic as my life has been lately: I have been trying to restore sanity to my NSLU2, I went to a lecture by Andrew S. Tanenbaum and I made a 2.5-hour drive to my parents in about 4 hours because of the fog.

First, my slug:
  • refuses to recognise the USB NIC I have been using until the latest incidents (it either says 'not accepting address, error -71' or 'device descriptor read/64, error -71')
  • sometimes reboots when I insert the USB NIC
  • either doesn't boot at all or boots really slowly when the USB NIC is inserted
  • (obviously) doesn't show the NIC in lsusb listing when it is not recognised
Since the USB NIC works on my laptop, I suspect a hardware problem with the slug. Bummer!

Dear not-so-lazyweb, is there a way to install Debian on an ASUS WL-500G Premium router without losing wireless ability? Or, is there a way to make use of my USB NIC with the ASUS router?



Second, Andrew S. Tanenbaum visited Romania and lectured Friday at the University „Politehnica” Bucharest.

He presented Minix3's architecture and the advantages it has over monolithic OSes. I attended the lecture (although I am not a student anymore) and found it quite nice and well prepared, but I had the feeling that sometimes he was trying to avoid or to bash topics that did not put Minix in a good light or challenged its title of being the first open[0] OS based on a micro-kernel architecture[1]. In spite of that, I found him to be a really good speaker and I liked the overall presentation, although I also expected some on-the-spot demos or at least some recordings.

The things that I remember:
  • 2.4 million subtle code alterations in drivers with only 80000 driver crashes (of course, no kernel crashes)
  • simulation of repeated network driver crashes at different time intervals and how it affects performance - a 30% degradation with crashes that occur once every second and an insignificant degradation with crashes occurring every 10 seconds
  • every driver has a set of rights assigned to it; it was difficult for them to define this - this sounds a lot like SELinux issues
  • messages have a fixed length
  • there is no dynamic memory allocation within the kernel
  • the kernel is 5000 lines of code (all drivers are in user space)
  • really secure system
  • there were performance comparisons with Minix2 and the hit was about 20%; still, it is said that L4 has only an approximate 2-5% performance hit because of the micro-kernel architecture
  • apparently the FreeBSD kernel has only 3 bugs /1000 lines of code
  • Minix uses a BSD license
I also got a Minix live CD (which is more like the Gentoo Linux install CD - just console in the live system) and made an installation of Minix in a qemu machine[2]. Unfortunately, I don't think I'll have the time to delve into the source.

I was thinking, would it be worth the effort to try to make a GNU/Hurd/Minix system (i.e. replace Mach with Minix's micro-kernel)? BTW, is Debian GNU/Hurd now based on L4 or does it still use Mach?


Note: Some of my work colleagues suggested that the presentation was the same as one he made at linux.conf.au last year, but I can't confirm or deny that since I didn't see the recording.



I won't write about the "fog drive", but I'll just say it wasn't pleasant at all, and I felt I was driving in The Twilight Zone for the whole Friday evening.




[0] he gave credit to QNX
[1] For instance, I tried to ask him twice if he felt that GNU Hurd was violating the micro-kernel paradigm or if he could compare it to Minix' architecture. I had the impression that both times he avoided answering and started the usual Hurd bashing, "they have been developing it for 20+ years, but got nothing working", meanwhile "Minix is here". After the lecture/presentation somebody told me that AST briefly said that they "were similar, but different". I didn't catch that line.
[2] thanks to qemu-launcher it is trivial to create and manage multiple qemu virtual machines

Wednesday 21 November 2007

Wednesday 14 November 2007

Lesson relearned: when Linux networking weirdness occurs...

My relearned lesson for the day: when Linux networking weirdness occurs in a NAT environment, remember to try MTU clamping.

Thanks to the comments by Justin and Sesse, I was fast-tracked to the core of the problems I have been experiencing since Thursday, MTU issues. What's worse (from my pov) is that I have encountered this issue before with the provider I had in Timișoara, but, since that ISP was using PPPoE and my current ISP in Bucharest doesn't, I never really made the connection. I even had a commented out iptables rule for MTU clamping in my firewall script.

The rule I am talking about looks like this:

iptables -t mangle -A POSTROUTING -p tcp --tcp-flags SYN,RST SYN -o $EXT_IF -j TCPMSS --clamp-mss-to-pmtu

or like the one I have been using (seems more logical to me):

iptables -I FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu


Note that this is not a fix, but a workaround; the real problem is over-zealous admins or weird setups[1] where someone thinks that banning IP fragmentation (or all ICMP traffic) is a way to secure networks.
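
As a rough illustration of what --clamp-mss-to-pmtu does with the SYN packets it matches: for IPv4 the MSS option is lowered to the path MTU minus 40 bytes (20 for the IP header, 20 for the TCP header). The sketch below, in Python, only models that arithmetic; the function name and the MTU values (1500 for plain Ethernet, 1492 for a PPPoE-like link) are illustrative assumptions, not measurements from the setup described here.

```python
# Model of the arithmetic behind TCPMSS --clamp-mss-to-pmtu (IPv4 only).
# This is an illustration, not the kernel's implementation.

IPV4_HEADER_BYTES = 20  # IPv4 header without options
TCP_HEADER_BYTES = 20   # TCP header without options


def clamped_mss(advertised_mss, path_mtu):
    """Return the MSS a SYN would carry after clamping to the path MTU."""
    ceiling = path_mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES
    # Clamping only ever lowers the value the client advertised.
    return min(advertised_mss, ceiling)


if __name__ == "__main__":
    # Assumed values: a client on Ethernet advertises an MSS of 1460.
    print(clamped_mss(1460, 1500))  # path is plain Ethernet end to end
    print(clamped_mss(1460, 1492))  # path includes a PPPoE-like hop
```

With the assumed values, a path MTU of 1492 brings the advertised 1460 down to 1452, so full-sized segments fit through the narrow hop without needing fragmentation.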


Once again, thanks to everybody who read and/or commented about my issue.

[1] Sesse told me that in his case there was a transparent proxy involved when he experienced MTU weirdness.

Linux: plain weird network behaviour; Windows is OK

Update: problem fixed, thanks for the comments; it was an MTU-related issue.

Note: This is a long post, but I expect it to bring in some questions for many Linux people; I advise you to read this when you have enough time.


After the last events related to connectivity at home which have been lasting since Thursday evening and the weird fact that it seems that NAT doesn't work at home, but does at work for the same setup, I called the ISP's support and wasted 15 minutes trying to convince them at least to send a technical team with another modem just to test if that is the reason for the breakage.

Of course, talking to the ISP support and trying to convince them that there might be a problem on their side since NAT works with another provider but doesn't with them, was fruitless.

Tonight I was stuck and tried another approach, convinced I would confirm my suspicion that there is something wrong on the ISP's side. Still, I am not yet sure what to conclude from what happened.


So, to get a better view of what is going on, I'll describe the setup I have and what its limitations and characteristics are. People in a hurry can skip to the paragraph starting with "I tested" and stare in wonder at will.


So the connection I have is done through a DSL modem which gets the MAC of the network card connected to it and exposes that as its own to the ISP's network. This MAC seems to be quite persistent and special measures must be taken in order to be able to use another NIC to connect. The modem (or probably some machine in the ISP's network; the IP is 10.10.0.1) offers a DHCP address and everything should work fine.

Because of this "your MAC is my MAC" issue, when I connected the first time, I used a USB NIC, so that in case of a broken router or some temporary failure I could use the Internet connection directly from my laptop. I can say this decision has proven in time to be wise.

The router I used until now is an NSLU2 with Debian installed on it. The built-in network interface always faced the internal network.

The router (which I call ritter) served as a NAT router for two machines inside the network, my laptop and my apartment mate's laptop. All was fine until last Thursday, after which it never came back properly.



I tested (doing NAT on my laptop; the laptop shouldn't have been affected by the power problems):
  • with ritter behind the laptop NAT, at home
  • with two different virtual machines as NAT "clients", at home
  • with ritter behind the laptop NAT, at work
  • with a virtual machine as NAT client, at home
  • directly from the laptop
  • NAT made through a SNAT rule
  • NAT made through a MASQUERADE rule
  • with TTL mangled (increased by one, although it was never in the ballpark of a low TTL)

Of course, I have n-checked that /proc/sys/net/ipv4/ip_forward was set to 1, the tables had policy ACCEPT and there were no extra rules, except the basic NAT-ting stuff, the routes were correctly set on both the clients and the machine doing the NAT.


All I could see is that:
  1. the machine doing the NAT was always working fine
  2. at work all NAT clients worked fine
  3. at home any of the NAT "clients" was
    1. able to resolve addresses, even though the DNS server was in ISP territory, on another LAN
    2. able to ping the outside world (if ping was available - in D-I it is not)
    3. hanging when trying to get an HTTP page
    4. able to telnet directly to port 80 (but I didn't try to "GET / HTTP/1.0")
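
In hindsight (see the update at the top of this post) this pattern fits an MTU problem: small packets such as DNS replies, pings and a bare TCP handshake fit through the narrowest hop, while a full-sized HTTP response does not, and if the ICMP "fragmentation needed" replies are filtered the sender never learns that it should send smaller packets. The toy model below, in Python, sketches that reasoning; the function name, the packet sizes and the 1492-byte path MTU are assumptions chosen for illustration, not values measured on this network.

```python
# Toy model of a path-MTU black hole. Sizes are whole IP packets in bytes,
# all assumed to carry the "don't fragment" flag.


def gets_through(packet_size, path_mtu, icmp_filtered):
    """Tell whether the data eventually reaches the other end."""
    if packet_size <= path_mtu:
        return True  # fits through the narrowest hop as it is
    if icmp_filtered:
        # The oversized packet is dropped and the ICMP error that would
        # tell the sender to shrink its packets never arrives: a hang.
        return False
    # The sender receives the ICMP error and retries with smaller packets.
    return True


if __name__ == "__main__":
    PATH_MTU = 1492  # assumed narrowest hop on the path
    for label, size in [("ping", 84), ("TCP SYN", 60), ("HTTP data", 1500)]:
        print(label, gets_through(size, PATH_MTU, icmp_filtered=True))
```

In this model only the full-sized HTTP data is lost, which matches a client that can resolve names, ping and open port 80, yet hangs on a page download.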

So after all of this, I was thinking of trying to see if a new client (the laptop of my apartment mate, a Windows XP machine) would work using the connection at home. It didn't work.

Then I thought of trying to do "Internet Connection Sharing", as it is called in Windows. Of course, there was some pain to find Windows XP drivers for the ASIX AX88172 network card (remember, the modem needed to see the MAC of that NIC), but I managed to find the proper one.


And, almost sure the NAT wouldn't work in this case either, I configured the new connection as a shared one. I didn't even disable the firewall, as I was thinking I could take those down gradually.

I wasn't expecting this, not even by pure chance, but my laptop which was a NAT "client" now was able to browse, ping and do whatever was normal through NAT, while the Windows machine was doing the "Connection sharing".

I was utterly flabbergasted. And that was just the beginning.


I suspected that the problem had coincidentally gone away, but after a minute I was proven wrong. It still didn't work with Linux as the NAT-ting machine. I connected the USB NIC back to the Windows machine and I saw the same thing. NAT was just working.


At that point I was to observe an even more shocking fact: the IP that the Windows machine got was different from the one the Linux machine received, in spite of the fact that the network card was the same, so it would have made sense to get the same one. More than that, the IP that the Windows machine got was from an entirely different network, although it was a valid IP belonging to my ISP.

I was thinking that one reason why it works with Windows might be that there could be some TCP protocol twist that is differently implemented in Windows and the equipment from my ISP gets along better with the Windows network stack.


As a way to test that, I am thinking of somehow forcing the IP on the Linux machine to see if anything changes. But before doing that, I felt the urge to post this; maybe some kind soul will shed some light on this issue for me or drop a hint.

Another reason might be different DHCP servers answering, but I don't know how I can see in Windows who offered the lease.


If anyone has any clue why these weird things happen, please drop a line. I would greatly appreciate it. TIA.

Tuesday 13 November 2007

good news, bad news, such is life

After a really crazy weekend in which it looked like I wasn't able to set up NAT[1], I went to work with the gear and tested the exact same thing I did at home and it worked without a glitch.

So, now I suspect that there might be something wrong with my ISP or the DSL modem, while ritter (my NSLU2) is fine. As a bonus I just installed Debian armel on it (installation report to arrive soon).


When I got home, after the laptop came back from hibernate[2] I saw these messages on all my terminals:

Message from syslogd@bounty at Mon Nov 12 23:57:04 2007 ...
bounty kernel: Uhhuh. NMI received for unknown reason b0.

Message from syslogd@bounty at Mon Nov 12 23:57:04 2007 ...
bounty kernel: You have some hardware problem, likely on the PCI bus.

Message from syslogd@bounty at Mon Nov 12 23:57:04 2007 ...
bounty kernel: Dazed and confused, but trying to continue

Is this a good time to panic?
I guess I'll have to dig into this, too.


In other news, svn-buildpackage 0.6.23 has been uploaded to unstable and it fixes yet another 7 bugs, which brings svn-bp's bug count down to 19 open valid bugs (2 more if you count wontfix bugs, too). This is the lowest bug count of svn-buildpackage since at least the end of April, according to the bug count graph.

Also, oolite 1.65-6 was uploaded to unstable and fixes the breakage due to the gnustep build tools changes. Unfortunately, on arm it is dep-waiting for libgnustep-base-dev which is broken on this arch.


Thanks to my sponsors, you know who you are ;-) .

[1] I have been trying to set up a simple NAT machine for more than 6 hours and, although everything seemed to be OK, checked and double checked, it didn't work
[2] which works now only if I don't use fglrx which is broken from this PoV (bug 449095)

Saturday 10 November 2007

nslu2 broken?

Thursday evening, when I came from work I found my nslu2 not working. It is/was a router and a local mirror. I tried to understand what's going on, but I couldn't. Everything looked fine, except that it did not resolve any addresses and the ping to my provider was not giving any results.

I tried to restart dnsmasq, although it seemed it was running fine, and then, in a desperate gesture (after trying to understand what was going on) I thought of restarting networking. I got locked out and was forced to restart the system.

Everything seemed to come back to normal. Since it was late, I decided to investigate what happened the next morning. But yesterday morning, the network was again down and the slug was inaccessible.

Since the only way to restore sanity in such a situation (because I don't have a serial console on the slug) was to reset and hope for the best, I did that.


The slug didn't start anymore. It seemed to cycle through boot -> reset -> reboot. I tried to see what was going on and connected the hard disk to my laptop. The filesystem was clean, although /var/log/boot.0 was a directory (and that directory had the same content as /var/lib/dpkg/info). I manually removed that, but that didn't make the slug boot.


It seems my slug is kind of dead.

Now I would like to know if there is any way to do remotely what flash-kernel does from within Debian.

Monday 5 November 2007

(debian) work and the silence

I've been really busy lately in RL and Debian work took the hit.

Some news:
  • on the 1st of November, the compendiums on i18n.debian.net managed to waste a huge chunk of disk space and made other scripts (and itself) on churro fail miserably; now there is only a 7-day backlog of the compendiums
  • oolite needs an upload due to the gnustep libs transition; I'll try to prepare tonight the long overdue 1.65-6 version for unstable
  • wormux upstream is preparing for yet another beta which should be really close to the final 0.8 version; I wish I had more time to work on this game
  • sadly, no news on the naughtysvn front :-( from me
  • I have been coding now and then on svn-buildpackage 0.6.23 and I intend to make yet another drop in the bug count visible on the graph.

Sunday 28 October 2007

gpg signatures sent

I finally managed to resend the signatures to the few people I had decided to send them to a while back, after debconf7.

I actually resent all the signatures I thought I should sign (if I didn't socialize at all with you during debconf or before you shouldn't receive a signature from me).

So, please:
  • sorry, if you get my signatures again; if so, ignore
  • don't be mad if you didn't receive a signed key from me, I probably don't consider I know you enough to do that yet ;-)
Now I can cross one more item on my long todo list. Yay!

This message has emerged thanks to: caff, dato, python's smtplib and rfc822, vi, gpg, exim, linksys, dell, todo (the application from openhand) and blogger :-)

Monday 15 October 2007

svn-buildpackage 0.6.22 released to experimental

Thanks to Damyan Ivanov for the upload, svn-buildpackage 0.6.22 is now in experimental. The upload was done to experimental due to the big number of changes affecting it and because I wanted to get a fairly significant amount of testing of the major fixes before propagating the code to unstable.

This release should fix 15 (yes, fifteen) bugs[0], most of which were important bugs or bugs affecting usability.

The major fixes are:
  • mkdir -p like functionality in the repo for the tags and other possibly missing directories - this means that even repositories created with older broken versions should be fixed automatically[1]
  • .svn/deb-layout is no longer a broken cache, but only a real local override for the layout information; .svn/deb-layout is created only on express request via --svn-savecfg; note that although .svn/deb-layout is no longer created automatically, old checkouts should be purged of this cruft (unless the override is wanted)
  • build dependencies are not required on --svn-export
  • automatic creation of the origDir when using origUrl
  • some code clean up
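
To give an idea of what "mkdir -p like functionality in the repo" means: before a directory such as the tags directory can be created, every missing parent has to be created first, shallowest first. The sketch below shows that logic in Python over a plain set of paths; it is not the code from svn-buildpackage (which is not shown in this post), and the function name and the example paths are hypothetical.

```python
# Sketch of "mkdir -p" over a repository layout, modelled as a set of
# directory paths that already exist. Names and paths are made up.


def missing_dirs(target, existing):
    """List the directories to create, parents first, to reach target."""
    to_create = []
    current = ""
    for part in target.strip("/").split("/"):
        if not part:
            continue  # tolerate doubled slashes
        current = current + "/" + part if current else part
        if current not in existing:
            to_create.append(current)
    return to_create


if __name__ == "__main__":
    repo = {"pkg", "pkg/trunk"}
    for path in missing_dirs("pkg/tags/0.6.22", repo):
        print("would create", path)
```

A directory that already exists yields nothing to do, which is what makes the operation safe to repeat on repositories that are only partially set up.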
Installing the package on an etch, lenny or sid system should be straightforward: just get the deb from experimental and install it (no backporting is necessary).


If you usually use svn-buildpackage, please install the experimental version and report any bugs and/or success stories. I am particularly interested in feedback related to the behaviour around .svn/deb-layout and the other methods of specifying layout information.

Please send success stories to me directly at eddy.petrisor @ gmail.com. Bugs should be directed to the BTS, but I hope there will only be mail sent directly to me :-) .


[0] I wonder, does the thickness of the yellow area on these graphs ever decrease?
[1] as soon as the breakage would have been visible in the past

Saturday 13 October 2007

SELinux: MLS/MCS support

When getting this error message:

bounty:~/usr/src/selinux/localpolicies/resolvconf# semodule -i localresovconf.pp
libsepol.link_modules: Tried to link in a non-MLS module with an MLS base.
libsemanage.semanage_link_sandbox: Link packages failed
semodule: Failed!


You have to go back to the checkmodule step and type the same command, but also add the -M parameter:

bounty:~/usr/src/selinux/localpolicies/resolvconf# checkmodule -m -M -o localresovconf.mod *.te
checkmodule: loading policy configuration from resolvconf.te
checkmodule: policy configuration loaded
checkmodule: writing binary representation (version 6) to localresovconf.mod

After that, all is OK:

bounty:~/usr/src/selinux/localpolicies/resolvconf# semodule_package -o localresovconf.pp -m *.mod
bounty:~/usr/src/selinux/localpolicies/resolvconf# semodule -i localresovconf.pp
bounty:~/usr/src/selinux/localpolicies/resolvconf# semodule -l | grep localresolv
localresolvconf 1.0

Friday 12 October 2007

SELinux: unusable from a newbie perspective

Thanks Russell for the explanations on the execmem bits.

Now I am trying to go further and set up my system to really work with SELinux enabled because, although the promise of the targeted policy is to allow you to do your job mostly as you did before, that "mostly" has a really wide meaning, more than you'd think you bargained for.

Examples from my laptop: hald does not start by default (various denials), resolvconf is denied some getattr operations on tmpfs, hald-addon-dell-backlight is denied access to some character device I just know it should have access to, etc. Of course, this means that automatic mounting does not work anymore and there is a decrease in usability just because of the "mostly" part.

I am sure that some of the denials are correct (see the execstack stuff or the execmem), but there are cases where this "mostly" is stretched way too much. IMHO, desktop installs should suffer no restrictions when using the targeted policy (and I mean "no restrictions that would make my system less than it was before enabling and enforcing SELinux").


But let me tell a reason why SELinux sucks without any help from others:

The interface sucks big time.

There are probably good reasons, but if you need to create and use a new policy based on the denials found in the logs, you need to use no less than 4 (four) different tools with the right incantation (although you can use just 2, if you don't want to customize the rules[1]):
  • audit2allow -m local -l
  • checkmodule -M -m -o local.mod local.te
  • semodule_package -o local.pp -m local.mod
  • semodule -i local.pp
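
For reference, the four steps above can be collected in one place. The Python sketch below only builds the command lines, in the order listed, for a given module name; it does not run anything. The function name is made up for this example, and the idea that the output of audit2allow is what ends up saved as local.te is an assumption, since the list above does not spell that out.

```python
# Build (but do not run) the four SELinux commands from the list above.


def local_policy_steps(name="local"):
    """Return the command lines, in order, for a local policy module."""
    return [
        # 1. turn logged denials into a policy source; saving the output
        #    as <name>.te is an assumption, it is not shown in the list
        ["audit2allow", "-m", name, "-l"],
        # 2. compile the source into a module (-M is the MLS switch)
        ["checkmodule", "-M", "-m", "-o", name + ".mod", name + ".te"],
        # 3. wrap the compiled module into a policy package
        ["semodule_package", "-o", name + ".pp", "-m", name + ".mod"],
        # 4. install the package into the running policy
        ["semodule", "-i", name + ".pp"],
    ]


if __name__ == "__main__":
    for argv in local_policy_steps():
        print(" ".join(argv))
```

Keeping the steps as data makes it easy to see that each tool consumes the file produced by the previous one: local.te, then local.mod, then local.pp.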
In the end, you still hit the bad interface:

bounty:~# semodule -i local.pp
libsepol.check_assertion_helper: assertion on line 0 violated by allow hald_t memory_device_t:chr_file { read };
libsepol.check_assertions: 1 assertion violations occured
libsemanage.semanage_expand_sandbox: Expand module failed
semodule: Failed!

Now what? That error message doesn't tell me much, except that some assertion failed. But where does the assertion come from? Why is it an assertion? What is bad about that line?
To me that really looks like an internal error or an inconsistency in what the SELinux tools generate.


[1] still, why advertise the longer path as the Fedora FAQ does? I would have done it the other way around.

Tuesday 9 October 2007

selinux darcs policy; same for oolite

If you enable and enforce the targeted policy in Debian and you use darcs you need to allow it to use execmem:

chcon -t unconfined_execmem_exec_t /usr/bin/darcs


This lets you overcome denials like this one:

type=AVC msg=audit(1191957463.678:108): avc: denied { execmem } for pid=14811 comm="darcs" scontext=user_u:system_r:unconfined_t:s0 tcontext=user_u:system_r:unconfined_t:s0 tclass=process

And avoid this ugly message:

$ darcs w -l
darcs: internal error: getMBlock: mmap: Permission denied
(GHC version 6.6.1 for x86_64_unknown_linux)
Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
Aborted



Since oolite complained of the same issue I also ran this:
chcon -t unconfined_execmem_exec_t /usr/lib/GNUstep/System/Applications/oolite.app/oolite




Update: If you want to know the reasons why darcs (a VCS) needs execmem, read the Red Hat bugs related to this and the upstream bug report on GHC: https://bugzilla.redhat.com/show_bug.cgi?id=195820
https://bugzilla.redhat.com/show_bug.cgi?id=195821
http://cvs.haskell.org/trac/ghc/ticket/738

SELinux is enabled. Now what?

Disclaimer: I am a 100% newbie on SELinux, so any inaccuracies, mistakes or fallacies are almost surely due to this fact.

After reading Russell's latest post on SELinux, and reading the 5 minutes tutorial on SELinux I decided I should try SELinux on my laptop, too.

Now I have it enabled/enforcing/permissive/refpolicy-targeted.

First issue, hal didn't start in my GNOME session, although the hal module appears to be loaded:

bounty:/emul/ia32-linux/usr/lib/dri# semodule -l | grep hal
hal 1.4.0


It seemed that gdomap indirectly required execstack. I cleared the execstack bit (or whatever it is) on libcallback.so.0.0.0 and libavcall.so.0.0.0 and gdomap started.

OTOH, oolite failed to start since it required execmem:

0 eddy@bounty ~ $ oolite
trampoline: cannot make memory executable
Aborted

And after allowing execmem it worked:
# setsebool allow_execmem=1

0 eddy@bounty ~ $ oolite
2007-10-09 01:42:04.686 oolite[26717] initialising SDL
open /dev/sequencer: No such file or directory
2007-10-09 01:42:04.789 oolite[26717] init: numSticks=0
2007-10-09 01:42:04.789 oolite[26717] CREATING MODE LIST
2007-10-09 01:42:04.789 oolite[26717] Added res 1024 x 768
...

I also seem to have some other denied messages, but I hope I'll understand this soon enough to make it work.

I would like to know if it is possible to allow execmem only for oolite, and since I suspect it is, how can I accomplish this?

So, now my question is, where is the fine manual on setting up SELinux? I dug the whole evening to get oolite to start.

Saturday 29 September 2007

trains are my coding anti-kryptonite

Another weekend trip to my parents' meant another successful svn-buildpackage coding session.

I worked on expanding the svnMkdirP functionality to svn-buildpackage and checked svn-upgrade. So when the testing is done (don't know when that will happen :-( ) I'll merge the work done in the svnmkdir-p branch[1] into trunk and ask for a sponsor - I don't think that will be a problem :-P.


If you want to test, don't try trunk yet[*]; you should try the svnmkdir-p branch[1]. Don't forget to bump down the version.


BTW, svn-buildpackage 0.6.22 should close .....


0 eddy@bounty ~/usr/src/svn-buildpackage/svnmkdir-p $ dpkg-parsechangelog | grep ^Closes | sed -r -e 's#Closes:##' -e 's#[0-9]{6}#1 +#g' -e 's#\+$##' | bc
13

13 bugs.
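
The sed expression above turns every six-digit bug number into "1 +", drops the trailing plus sign and lets bc add it all up. The same count can be sketched more directly in Python; the function name is made up for this example, and the sample line simply reuses the bug numbers listed below.

```python
import re

# Count Debian bug numbers in a "Closes:" field, assuming (like the sed
# expression) that every bug number has exactly six digits.


def count_closed_bugs(closes_line):
    """Return how many six-digit bug numbers appear in the line."""
    return len(re.findall(r"\b[0-9]{6}\b", closes_line))


if __name__ == "__main__":
    line = ("Closes: 408690 411666 414581 419996 423487 428225 428689 "
            "433404 433536 434932 435746 436133 436554")
    print(count_closed_bugs(line))
```

Both approaches rely on the six-digit assumption, so they would need adjusting once bug numbers grow another digit.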

Here are the bastards:
  1. 408690
  2. 411666
  3. 414581
  4. 419996
  5. 423487
  6. 428225
  7. 428689
  8. 433404
  9. 433536
  10. 434932
  11. 435746
  12. 436133
  13. 436554




[1] svn+ssh://svn.debian.org/svn/collab-maint/deb-maint/svn-buildpackage/branches/svnmkdir-p or svn://svn.debian.org/svn/collab-maint/deb-maint/svn-buildpackage/branches/svnmkdir-p

[*] the branch is not yet merged into trunk because it needs more thorough testing

Thursday 27 September 2007

Invariant sections...

Thanks to Holger's post I saw this really educational comic strip.

To those who don't get the joke: the invariant sections could put you in that exact position. Please read the chapter about "Invariant sections" from the "Draft Debian Position Statement about the GNU Free Documentation License (GFDL)".

Wednesday 26 September 2007

svn-buildpackage pending changes

Just a few snippets:

Author: eddyp-guest
Date: Wed Sep 26 02:14:15 2007
New Revision: 4959

URL: http://svn.debian.org/wsvn/collab-maint/?sc=1&rev=4959
Log:
create a special branch for the svn "mkdir -p" like functionality until all scripts are converted to use this function and more tests are done

Added:
deb-maint/svn-buildpackage/branches/svnmkdir-p/
- copied from r4958, deb-maint/svn-buildpackage/trunk/

0 eddy@bounty ~/usr/src/svn-buildpackage/svnmkdir-p $ dpkg-parsechangelog
Source: svn-buildpackage
Version: 0.6.22
Distribution: UNRELEASED
Urgency: low
Maintainer: Eddy Petrișor
Date: Wed, 26 Sep 2007 05:24:59 +0300
Closes: 408690 411666 414581 419996 423487 428225 428689 433404 433536 434932 435746 436133
Changes:
svn-buildpackage (0.6.22) UNRELEASED; urgency=low
.
[ Eddy Petrișor ]
* IMPORTANT: changed default behaviour of saving the configuration in
.svn/deb-layout by default to avoid stale data to override the
configuration options that were updated in the repository.
(Closes: #414581)
As a consequence, a new option --svn-savecfg was added to allow a
mechanism for easily overriding options locally
.
[ Gonéri Le Bouder ]
* SDCommon::sd_exit: read the parameter correctly is SDCommon::nosave=1
(Closes: #428225)
.
[ Eddy Petrișor ]
* s-u: when importing options from ~/.svn-buildpackage.conf, filter in
only the valid options (Closes: #428689)
* s-u: replace retcode with retval for consistency with svn-bp
* s-i: manpage still claimed layout 2 was not implmented (Closes: #433404)
* s-i: now really supports injects for layout 2 (with the disadvantage of
not creating the tag directory)
* s-i: no longer fails on initial checkout (Closes: 411666)
* when using origUrl, make sure the origDir exists before downloading
in it
* s-i: man page: document the missing -o option (Closes: 419996, 435746)
* s-u: complete the man page synopsis section (Closes: 436133)
* s-b: do not require the build deps to be present when exporting
(Closes: 423487); thanks Stefano Zacchiroli for the patch
* SDcommon.pm: enhance the guessing algo of the layout to make svn-upgrade
guess correctly on layout 2 repos; thanks Gregor Herrmann for the patch
(Closes: 434932)
* Makefile: the version of the package is placed quoted in "SDCommon.pm" so
that versions like "0.6.22~bpo40+1" don't cause s-b to barf
* SDCommon.pm: implemented a function that emulates a 'mkdir -p'
functionality for svn; this will allow a fix for #434932
* s-i: based on the mkdir-p functionality create missing directories on
inject (Closes: 433536, 408690)


Friday 21 September 2007

software for managing finances

Dear lazyweb,

I decided to track our (mine and my fiancée's) expenses more closely and I am in search of some software for this task.

The requirements for the software:
  • not too hard to use (I don't want to read a whole manual to know how to use it)
  • possibility to assign a category for a given expense (so I can locate problematic type of expenses)
  • can easily extract statistics (e.g.: top last month's expenses, expenses split per category)
  • appropriate for domestic (small scale) accounting
  • available in Debian (or will be soon :-) )
  • preferably GNOME/GTK based
I have never used such software before, so I am a total noob in this area. Please be gentle if I have described the problem naively.

Thanks in advance!

Monday 17 September 2007

svn-buildpackage:development, RFH

Although slow, development on svn-buildpackage still continues. Sorry for the delays, but both Eduard Bloch and I have been unable to assign more time to svn-buildpackage development in the last few months.

From the changelog of the trunk package (unreleased 0.6.22):
  [ Eddy Petrișor ]
* IMPORTANT: changed default behaviour of saving the configuration in
.svn/deb-layout by default to avoid stale data to override the
configuration options that were updated in the repository.
(Closes: #414581)
As a consequence, a new option --svn-savecfg was added to allow a
mechanism for easily overriding options locally

BTW, did you know that svn-buildpackage is maintained in collab-maint and welcomes contributions and committers/reviewers (especially those)?

In case you want to help, this is the place to start from:

svn co svn+ssh://svn.debian.org/svn/collab-maint/deb-maint/svn-buildpackage/trunk

or

svn co svn://svn.debian.org/svn/collab-maint/deb-maint/svn-buildpackage/trunk

If you don't have commit access to collab-maint yet, just make a request on the alioth tracker and that will be fixed ;-) .

Thursday 13 September 2007

hmm, I was here before

I am stumbling again on the problem of not being able to install libgnome-dev and libsvn-dev side by side, namely bug #429025.

For some reason, I was under the impression that the bug was fixed... but apr-util is behind :-(.

Wednesday 12 September 2007

it is easy to let yourself be manipulated

There is this guy whose blog I found about a week ago. He does some really nice and funny sketches mostly with issues regarding Bucharest and things he finds out/encounters.

The fact that he is smart enough to see some of the rudeness happening in Bucharest doesn't help him notice that he is being manipulated by the press (and by himself) when he writes:

"La 6 ani de la 11 septembrie 2001, Osama bin Laden a scos capu' din vagauna cu o noua inregistrare video in care da lumii intregi 2 alternative: convertirea la Islam sau moartea!"

Which translates approximately to:

"Six years after September 11, 2001, Osama bin Laden shows his face out from his hiding place and makes a new video in which he gives the whole world 2 alternatives: converting to Islam or death!"

(I won't take into discussion the correct translation of the message from Osama, if the message is authentic, if Osama is paid by somebody interested in keeping the conflict open, if Osama's forces are actually active in Iraq, if Bush is Osama himself, if we're actually living in 1984 and Osama is already dead but the American manipulative government is using all sorts of recordings from him to keep people inside a cage and feed them propaganda, < insert your theory/idea here >....)

He links to this article, whose title might make one think that what the guy said is true (according to the article)...

But, if read carefully, it is clear that there are a few twists (in order of appearance):
  • the alternatives were given to the American people ("urged the American people")
  • the alternatives were for the ending of the war in Iraq ("there are two ways to end the war in Iraq")
  • the first alternative is escalation of the conflict, not death; that would translate, IMHO, into "we will be fighting in Iraq and you can give up, but we will not", so I'd say surrender/retreat/keep fighting and probably die are all viable alternatives for the first one
  • apparently he says converting to Islam is an alternative ("The second way, he continues, is to reject America's democratic system and convert to Islam.")
  • but, if we read on we see that he actually says just to acknowledge that democracy is not at all a successful model, unless by success one thinks in terms of a system that achieves the interests of the corporations ("It has now become clear to you and the entire world the impotence of the democratic system and how it plays with the interest of the peoples and their blood by sacrificing soldiers and populations to achieve the interests of the major corporations")
  • and more importantly, converting to Islam is not actually a condition, but there is an invitation to do that ("I invite you to embrace Islam,")

So, the conclusion is that it's really easy to fall into the trap of believing what the titles and headlines suggest, rather than what the articles actually say, especially if you keep your head open only to a single stream of ideas (I guess you, my dear reader, are smart enough to see what I mean here).


Disclaimer: my analysis might be off, my English sucks, I might be Osama, I might be Bush, I didn't see the recording, there's a cat on a freezer...

Tuesday 11 September 2007

dpkg i18n going backwards?

Now in dpkg I found this:

dpkg: no, cannot proceed with %s (--auto-deconfigure will help):\n
%s

Before there was something like:

dpkg: no, cannot remove %s (--auto-deconfigure will help):\n
%s


This is going backwards wrt i18n, since one of the basic rules of i18n is to have complete sentences; hiding the verb behind a parameter breaks that rule, because translators can no longer adapt the rest of the sentence to it.
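
To illustrate the complete-sentence rule, here is a minimal sketch in shell rather than in dpkg's own C code. It keeps one whole sentence per action, so each one can be translated on its own. The function name and the "configure" wording are hypothetical; only the "remove" sentence is taken from the older message quoted above:

```shell
#!/bin/sh
# One complete, separately translatable sentence per action, instead of a
# single sentence with the verb hidden behind %s. The "configure" wording is
# hypothetical; only the "remove" sentence is quoted from the old message.
deconfigure_refusal() {
    action=$1
    package=$2
    case "$action" in
        remove)
            template='dpkg: no, cannot remove %s (--auto-deconfigure will help):\n' ;;
        configure)
            template='dpkg: no, cannot configure %s (--auto-deconfigure will help):\n' ;;
        *)
            echo "unknown action: $action" >&2
            return 1 ;;
    esac
    # A real program would look up the whole template in its message catalogue
    printf "$template" "$package"
}

deconfigure_refusal remove libfoo
```

The cost is one catalogue entry per action, which is usually a fair price for sentences that translate cleanly.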

Monday 10 September 2007

debian lenny on my laptop

I did this Friday night:

cat /etc/debian_version
lenny/sid

The good:
  • the upgrade was pretty painless, including the kernel upgrade (I just spent about an hour to resolve/force the upgrade path for some packages since aptitude was stubbornly refusing to upgrade since some recommends were not satisfied)
  • the kernel doesn't seem to need any options to make the headphones work
  • I can use a free driver for the wlan (bcm43xx instead of ndiswrapper)
  • as a bonus, now I can suspend-to-disk while the wlan interface is turned on (with ndis+2.6.18-5 the system hung during suspend)
The bad:
  • the two icons I got on the desktop for the crypto partition I have mounted under /crypto are still both there (the worst thing is that I don't remember/can't find the bug number)
  • suspend-to-ram still doesn't work - I suspect fglrx is the culprit
  • there are now two icons for xchat in the systray. Should I be removing xchat-systray? Apparently yes, although there are a few things that xchat-systray had which are not available in the native systray thingie:
This is xchat-systray's menu:


And this is xchat's native menu:
By the way, the Hide menu entry there hides xchat itself, not the tray icon.



Update:
one more bad: this is still not fixed.


Update++:
I am a moron: apparently this is a known problem and a workaround exists. Now I can have my single xchat-systray icon :-) .

Friday 31 August 2007

handy stuff for reformatting text

<eddyp> anyone knows of a quick way to reformat a text to a certain with in vim?
<eddyp> width, even
<liw> !fmt -w 42
<pkern> Something along the lines of gq I think.
<pkern> {Visual}gq says :help gq
<eddyp> liw: ":%!fmt -w 42" should be correct
<eddyp> thanks
* eddyp notes the fmt command for future reference
<liw> fmt is a pretty handy command at times
<ricky> Wow, that is handy.
* seanius is a fan of the J and fmt combo
<pkern> (gq has the advantage that textwidth is obeyed. And fmt could be specified with formatprg.)
<eddyp> pkern: that gq is handy, too
<eddyp> "gggqG" and bam, everything is refromatted
<eddyp> does anyone mind if I post this on my blog?
<eddyp> pkern, liw, seanius, ricky ?
<liw> eddyp, nah
<pkern> eddyp: Please enlighten me why you could possibly need our approval (apart from copying the log verbatim perhaps).
<eddyp> pkern: exactly :-)
<eddyp> verbatim copy; irc is the best source of documentation :-)
<ricky> eddyp: Go ahead :)
<pkern> eddyp: Then please correct the obvious mistake of mine: ":help gq says {Visual}gq", hah. And I disagree on the usefulness of IRC logs as documentation. :-P
<ricky> As a side note, the -u option seems cut words at spaces properly, which I'd been looking for :)
<seanius> eddyp: if you find my armchair opinions valuable for some reason, then by all means :)
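
Outside vim the same filter works on any text stream. A small sketch, assuming fmt from coreutils is installed; the reflow wrapper and the sample sentence are only for illustration:

```shell
#!/bin/sh
# Reflow text from stdin to a maximum line width with fmt(1), the same
# filter that ":%!fmt -w 42" runs over a whole vim buffer.
reflow() {
    fmt -w "$1"
}

sample='irc is the best source of documentation and fmt is a pretty handy command at times'
printf '%s\n' "$sample" | reflow 42
```

Inside vim, gq has the advantage pkern mentions: it follows textwidth, so the width does not have to be repeated on every invocation.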

Friday 24 August 2007

Dear lazyweb...

Why do I get these errors in a real etch system and I don't in a pbuilder login environment? Missing build-deps?


# Add here commands to compile the package.
/usr/bin/make
make[1]: Entering directory `/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1'
/usr/bin/make -C doc
make[2]: Entering directory `/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc'
docbook2man uclean.sgml
Using catalogs: /etc/sgml/catalog
Using stylesheet: /usr/share/docbook-utils/docbook-utils.dsl#print
Working on: /home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:1:59:W: cannot generate system identifier for public text "-//OASIS//DTD DocBook V4.1//EN"
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:35:0:E: reference to entity "REFENTRY" for which no system identifier could be generated
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:1:0: entity was defined here
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:35:0:E: DTD did not contain element declaration for document type name
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:37:9:E: element "REFENTRY" undefined
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:38:15:E: element "REFENTRYINFO" undefined
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:39:12:E: element "ADDRESS" undefined
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:40:15:E: element "EMAIL" undefined
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:42:11:E: element "AUTHOR" undefined
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:43:19:E: element "FIRSTNAME" undefined
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:44:17:E: element "SURNAME" undefined
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:46:14:E: element "COPYRIGHT" undefined
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:47:11:E: element "YEAR" undefined
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:48:13:E: element "HOLDER" undefined
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:50:12:E: element "DATE" undefined
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:52:10:E: element "REFMETA" undefined
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:53:17:E: element "REFENTRYTITLE" undefined
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:55:15:E: element "MANVOLNUM" undefined
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:57:13:E: element "REFNAMEDIV" undefined
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:58:12:E: element "REFNAME" undefined
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:60:15:E: element "REFPURPOSE" undefined
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:62:17:E: element "REFSYNOPSISDIV" undefined
nsgmls:/home/eddy/usr/src/perso/build-area/svn-buildpackage-0.6.22~bpo40+1/doc/uclean.sgml:63:16:E: element "CMDSYNOPSIS" undefined

I am worried about my apartment mate

The last record of her laptop being on is:

syslog.4.gz:Aug 20 01:01:30 ritter dnsmasq[1984]: DHCPACK(eth0) ... endirra

I haven't seen her since, and her phone is off. I tried calling her before, but when it was on, she didn't answer. What is weird is that in the past, she didn't always answer the phone (or better said, seldom did).

Also, when she had a crush on some guy, she left for a whole week without saying anything...

But now I am worried.

Wednesday 22 August 2007

my nslu2 runs an eabi kernel

Thanks to Riku Voipio's advice, my slug now runs an EABI kernel with OABI compat support. As a bonus I have the latest 2.6.18-5 abi kernel :-) .

Things to remember when trying to do the same with your slug:
  • running make menuconfig (somehow) with the debian/arch/arm/config-ixp4xx file as config will most definitely disable the compilation of the internal driver
  • before flashing a new linux image, check that the drivers for the network interfaces you expect to be up after reboot are present in the deb
  • arm iptables will fail with an eabi kernel, even if it has oabi compat support, you will need to install iptables in the armel chroot; so if you use iptables at all, you need to pass that invocation to the iptables binary from the armel chroot
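
The last point can be handled with a small wrapper. This is only a sketch: the chroot path is the one used on this machine, the location of iptables inside the chroot is an assumption, and iptables_armel and DRY_RUN are made-up names:

```shell
#!/bin/sh
# Forward iptables calls to the binary inside the armel chroot.
# Both paths are assumptions: adjust them to the actual chroot location and
# to wherever iptables lives inside it.
ARMEL_CHROOT=${ARMEL_CHROOT:-/data/chroots/armel-root-fs}
ARMEL_IPTABLES=${ARMEL_IPTABLES:-/sbin/iptables}

iptables_armel() {
    # With DRY_RUN set, print the command instead of running it
    ${DRY_RUN:+echo} chroot "$ARMEL_CHROOT" "$ARMEL_IPTABLES" "$@"
}

# Show what would be executed for a rule listing (no root needed)
DRY_RUN=1 iptables_armel -L -n
```

Existing firewall scripts would then call iptables_armel wherever they used to call iptables.
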
I just chroot-ed into an armel chroot and rrdtool made some nice and meaningful graphs.

Instead of:


I got:



Some complaints:
  • migrating to armel will probably be a pain, if care is not taken to migrate settings, scripts and everything in /etc
  • the load spikes up just to make these graphs :-(

Saturday 18 August 2007

how the Debian Games Team got rid of spam

How to get rid of most spam on Alioth mailing lists.

Until some Debian developer implements this idea in the BTS, the mailing lists and the PTS, spam will keep coming in huge quantities to the mail addresses people expose in the Debian world. I don't care that there will always be spam; I care that the amount of spam should be lower, or probably nonexistent, for one-time senders (I sent just one mail to the BTS from two different mail addresses and it was enough to get huge amounts of spam on them).


The Debian Games Team's ML has suffered for a while, back in 2006, from spam bombing.
We thought that the easiest method to get rid of spam was to require registration in order to be able to send mails to the list.

That was a big mistake because, as some people suspect, mails from the BTS needed to be allowed in and the reporters got an automated reply saying that the mail is waiting for approval, etc, etc - plain rude, people send you bug reports and you say "You are not listed, bla, bla bla". We changed the setting not to send any automatic reply and people were more content.

But still, we had to hand approve or white list valid mails/addresses. That worked for a while, but once the list address ended up on spammer lists, the thing blew out of proportion. We were desperate. We tweaked the settings of the lists in different ways, but in the end we had such a broken setup that we had to approve even white listed addresses.

That did it for me. I had to do something. So I started working on the ML setup and basically I did the following things to get rid of spam:
  • anything coming from white listed mail addresses was approved
  • anything coming from black listed addresses (bellsouth is one of those) is rejected (or should be)
  • anything that came from the BTS (there are some nice headers that the BTS puts) and was evaluated as ham with a score of 0 (there is a field for that, too - some field has Spam=No, Score=...) was allowed
  • anything that came from the BTS that was Spam=No with a score above 0 was held for moderation
  • anything else is held for moderation

This solution allowed us to lighten the burden of manually white listing every email address that ever sent valid mails to the BTS and to focus only on real spam.
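
The rule set boils down to a small decision function. The sketch below only restates the rules in shell; it is not the actual mailing list configuration, and the argument conventions (including comparing the score with a literal 0) are assumptions made for illustration:

```shell
#!/bin/sh
# Decide what to do with one incoming mail, following the five rules above.
#   $1 list status of the sender: whitelist | blacklist | unknown
#   $2 whether the mail came from the BTS: yes | no
#   $3 spam verdict from the BTS header: Yes | No
#   $4 spam score from the same header
# All argument conventions here are assumptions made for illustration.
classify_mail() {
    case "$1" in
        whitelist) echo approve; return 0 ;;
        blacklist) echo reject; return 0 ;;
    esac
    if [ "$2" = yes ] && [ "$3" = No ] && [ "$4" = 0 ]; then
        echo approve
    else
        echo hold
    fi
}

classify_mail unknown yes No 0    # BTS ham with score 0
classify_mail unknown yes No 2    # BTS mail with a non-zero score
classify_mail blacklist no Yes 9  # blacklisted sender
```

Only the first case gets through unattended; everything the rules do not name lands in the moderation queue.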

Try it yourself on your lists, you'll be pleasantly surprised.

Friday 17 August 2007

2 of the cross compiling issues...

... I found when trying to cross compile rrdtool already have a fix.
  • freetype's bug was fixed by somebody in gentoo and they sent the patch upstream; I tested the patch and it works
  • for the libart_lgpl bug I made a couple of (really non-intrusive) patches (one for the bootstrapped version and one for upstream source) and I sent them to both gnome and gentoo; as a bonus, libart_lgpl's art_config.h is now defined based on stdint. The definitions for ART_SIZEOF_* were removed since a grep on the source revealed that they are not used (confirmed by the fact that the source compiled).
I have these patches myself and they work. There shouldn't be any reason for the native build to fail.

The rrdtool issues [1] were not reported since:
  1. I am not sure if it is really a bug (IEEE math test bug) and
  2. I don't have a clean fix (link-with-build's-library issue).



[1] look for 'xmerging rrdtool, the bugs don't stop'

Thursday 16 August 2007

too much to do, too little time

Short update:
  • I had no time to do hacking lately
  • there are too many mails on -boot; I wish they changed their minds and made a -boot-dev list, the BTS mails are incredibly annoying (yes, I made a filter, but it is not that efficient); I am thinking seriously about unsubscribing from that list
  • I would like to withdraw from supporting the ppp-udeb package of ppp since I have no more interest in it; I already unsubscribed from part of the mails related to it
  • I have a new HDD for my router and it is really silent :-) (some good news, at last)
  • glest-data contains some unclearly licensed stuff; imagine how nice that is when upstream is unresponsive (although not dead - adds new messages on the upstream forum, but never answers others' posts)
  • sorry for not having time to work on svn-buildpackage, glest, naughtysvn, wormux and oolite; I would really welcome NMU work on the debian packages (but I don't know what to say about svn-buildpackage since Eduard Bloch is the lead developer)
  • real life is taking its share back
  • I am thinking of upgrading to lenny in an attempt to motivate myself back into coding
  • aspell-ro with support for the correct diacritics should be uploaded soon in sid; cedilla support is dropped; minor obstructions in the way of the upload
  • LVM rules
  • RPM still sucks big time not because of itself, but because of the pure lack of tools for it
  • scons can't cross build
  • the freetype cross building bug (reported and found in gentoo) has a working patch; they claim they sent the patch
  • I got 2 cool presents for my birthday (27 now): an ASUS wl-500g Premium wireless router and a Fujitsu USB HDD
  • chroot-ing in armel from arm isn't possible (for me?):
ritter:/home/eddy# chroot /data/chroots/armel-root-fs/
Illegal instruction
ritter:/home/eddy# uname -a
Linux ritter 2.6.18-4-ixp4xx #1 Sun Aug 5 19:53:40 EEST 2007 armv5tel GNU/Linux
  • I no longer have this on my belly (it washed away):

  • the vacation was really great
  • I have too many interests in and outside of Debian, I'll have to cut back on some of them
  • Ross Burton rules because he wrote tasks and he maintains it properly

Wednesday 8 August 2007

small commits

Update: I have been informed that git and mercurial have support for the hunk commit feature, excellent! See the comments section if you want to know the details. IIRC, the last time someone talked about this feature was when John Goerzen analyzed his options for another VCS (I still can't find any reference to this issue).


Lars, darcs allows this and, as a supplement that is missing from any other VCS I know, it has the ability to select changes in a file individually. That means you can commit only some of the changes affecting a file and leave others for later commits.

So for this diff:

$ darcs diff -u
diff -rN -u old-asd/asdnbfnd new-asd/asdnbfnd
--- old-asd/asdnbfnd 2007-08-08 18:12:17.000000000 +0300
+++ new-asd/asdnbfnd 2007-08-08 18:12:17.000000000 +0300
@@ -1,3 +1,3 @@
-#!/bin/ash
+#!/bin/sh

-echo "intentionaol spelling error"
+echo "intentional spelling error"

you can do this:

$ darcs rec -m "spelling correction"
hunk ./asdnbfnd 1
-#!/bin/ash
+#!/bin/sh
Shall I record this change? (1/?) [ynWsfqadjkc], or ? for help: n
hunk ./asdnbfnd 3
-echo "intentionaol spelling error"
+echo "intentional spelling error"
Shall I record this change? (2/?) [ynWsfqadjkc], or ? for help: y
Finished recording patch 'spelling correction'

$ darcs rec -m "make the script run on posix shell"
hunk ./asdnbfnd 1
-#!/bin/ash
+#!/bin/sh
Shall I record this change? (1/?) [ynWsfqadjkc], or ? for help: y
Finished recording patch 'make the script run on posix shell'

Of course, being able to unrecord/uncommit or to rollback is a nice thing to have.

Bash says: I am different from myself!

Dear lazyweb, is this a bug? Should I file this?

$ sh -x rpm/build_latest_srpm foo 2>&1 | grep -E '\'
+ alias 'rpmbuilder=rpmbuild --define "_topdir /tmp/rpmbuildarea/foo/redhat"'
+ rpmbuild --define '_topdir /tmp/rpmbuildarea/foo/redhat' -bs foo.spec
$ sh rpm/build_latest_srpm foo 2>&1 | grep -E '\'

$ bash -x rpm/build_latest_srpm foo 2>&1 | grep -E '\'
+ alias 'rpmbuilder=rpmbuild --define "_topdir /tmp/rpmbuildarea/foo/redhat"'
+ rpmbuilder -bs foo.spec
rpm/build_latest_srpm: line 46: rpmbuilder: command not found
$ bash rpm/build_latest_srpm foo 2>&1 | grep -E '\'
rpm/build_latest_srpm: line 46: rpmbuilder: command not found

$ rpm/build_latest_srpm foo 2>&1 | grep -E '\'
rpm/build_latest_srpm: line 46: rpmbuilder: command not found

$ type sh
sh is hashed (/bin/sh)
$ type bash
bash is hashed (/bin/bash)
$ ll /bin/*sh
-rwxr-xr-x 1 root root 677184 2006-12-11 23:20 /bin/bash
lrwxrwxrwx 1 root root 21 2006-08-17 21:05 /bin/csh -> /etc/alternatives/csh
-rwxr-xr-x 1 root root 80200 2007-02-02 09:34 /bin/dash
lrwxrwxrwx 1 root root 4 2006-12-19 15:02 /bin/rbash -> bash
lrwxrwxrwx 1 root root 4 2006-12-19 15:02 /bin/sh -> bash
lrwxrwxrwx 1 root root 13 2006-08-17 20:58 /bin/tcsh -> /usr/bin/tcsh


The script has /bin/bash in the shebang.

I tried to reproduce this with another, simpler script, and I observed that the issue happens only when the script is processed by bash. With a /bin/sh shebang it didn't happen, since the command processing the script was sh; all the sh invocations were OK too.

I also suspect that it has to do with this paragraph from the bash manual:


Aliases are not expanded when the shell is not interactive, unless the expand_aliases shell option is set using
shopt (see the description of shopt under SHELL BUILTIN COMMANDS below).


... but different and (I would say) worse behaviour than the sh invocation should not happen.

The test script follows:

#!/bin/bash

alias rpm-build="echo \"I:\""
alias dpkg-select="dpkg --get-selections \"ls\*\""

echo "alias1:" && rpm-build test
echo "alias2:" && dpkg-select

Test and enjoy. Before you ask, yes, I use bash functionality.

Update: Also using set -i just before the alias definition and keeping it during the expansion does not help.
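
One way around the difference, sketched under the assumption that the alias only exists to prepend fixed arguments: a shell function is available in non-interactive shells whether the interpreter is invoked as sh or as bash. The manual paragraph quoted above also names the other route, shopt -s expand_aliases, for scripts that must keep the alias. The function name below is illustrative:

```shell
#!/bin/sh
# Function version of the rpm-build alias from the test script.
# Underscores are used because hyphens are not portable in function names.
rpm_build() {
    echo "I:" "$@"
}

echo "alias1:" && rpm_build test
```

Unlike an alias, the function also receives its arguments as "$@", so quoting survives the call.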

this is the way forward

leveraging existing technologies and obtaining cool things

Tuesday 7 August 2007

tilting at windmills - the NSLU2 issues

It turns out I was playing Don Quijote when I was trying to fix the reset issues I had with my slug.

Of course, I was suspecting the USB rack for the issues, but the issues never showed up on the laptop... until now.

I decided to temporarily use a USB stick for the FS and started copying files over to the stick. During the copying the rack disconnected and everything became clearer. I was tilting at windmills all the time.

BTW, be smarter than me and just change the UUID of the new partition to match the old one instead of regenerating the initrd and reflashing it:

tune2fs -U the_uuid_of_the_previous_root_partition /dev/sda1


Oh, next on the agenda will be to use my ipod mini as a hdd for the slug in order to prevent ruining the stick... And another thing, total silence is really nice, even if the rack was not that noisy.

Monday 6 August 2007

nslu2, the kernel...

Update: it didn't work .... :((

eddy@ritter ~ $ grep '##### kern' -A 2000 /var/log/syslog | grep -E '(reset|####)'
Aug 6 11:19:00 ritter manual message: ##### kernel was changed to linux-image-2.6.18-4-ixp4xx_2.6.18.dfsg.1-12etch2.1_arm.deb #####
Aug 6 11:23:08 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2
Aug 6 11:35:15 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2
Aug 6 11:52:41 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2
Aug 6 11:54:47 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2



I cross built this:

Changes:
linux-2.6 (2.6.18.dfsg.1-12etch2.1) stable; urgency=low
.
* Non-maintainer upload.
* ixp4xx kernel:
Disabled options:
- USB_EHCI_SPLIT_ISO
- USB_EHCI_ROOT_HUB_IT
Enabled options:
- USB_BANDWITH

And it took me on my amd64:

real 105m0.331s
user 79m43.423s
sys 12m46.324s

It is not an upload, it is just my attempt to fix this issue. Let's see if it works.

Oh, thanks to Riku for pointing out that cross building the kernel is trivial.

Friday 3 August 2007

dear anonymous reader...

I have answered your request about the config files for bluetooth.

Have fun!

more news from the nslu2 front

The main reason for the lack of activity during this week was that I had some problems with my router, an NSLU2 running the debian arm port. This is the machine I wanted the softfloat rrdtool for, the one I was talking about in these 2 posts.

It turns out that for some weird reason the USB controller of the slug resets when there is too much traffic going on. I got lots and lots of messages like:


$ grep reset -A 1 /var/log/syslog | grep -E '(reset|repeated)'
Aug 3 07:02:27 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2
...
Aug 3 11:50:29 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2
Aug 3 11:51:03 ritter last message repeated 4 times
Aug 3 11:53:15 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2
Aug 3 11:53:46 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2
Aug 3 11:55:48 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2
Aug 3 11:56:07 ritter last message repeated 2 times
Aug 3 11:56:52 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2
Aug 3 11:56:57 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2



That wouldn't have been a problem as long as the damn thing worked, but in some cases after the reset the root filesystem was lost and all the services running on the machine died (except the networking itself since all of that is part of the kernel and is loaded in memory).

It took me about three evenings to get to this conclusion (imagine being locked out of a machine which advertised running ssh, http, and dns services, being able to telnet and getting a response, but having the full protocol fail).

It all became clearer when, while being connected via ssh, I got this result:

eddy@ritter ~ $ ls
-bash: ls: command not found

"WHAT?" was the first reaction, then I figured that life without ls is unbearable and came up with this bit:

alias ls='for I in * ; do echo $I ; done'

Quite cool to understand a little bit about the system. Still, cat was unavailable so having /proc and /sys available did not help that much. At some point I was thinking about a busybox shell, tried that, it didn't work since it wanted to remove some important bits like the initramfs tools.
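
In hindsight, cat can be approximated with shell builtins as well. A sketch, assuming read and printf are builtins (true for bash) and that only text files are involved, since read cannot carry NUL bytes; the function name is made up:

```shell
#!/bin/sh
# Print text files using only shell builtins, for when /bin/cat is gone.
# Not binary-safe: NUL bytes are lost.
builtin_cat() {
    for file in "$@"; do
        # '|| [ -n "$line" ]' keeps a last line that lacks a trailing newline
        while IFS= read -r line || [ -n "$line" ]; do
            printf '%s\n' "$line"
        done < "$file"
    done
}

# Example on a file that exists on most Linux systems
if [ -r /proc/version ]; then
    builtin_cat /proc/version
fi
```

With that, the files under /proc and /sys become readable again even without any external binaries.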


I googled again and found a good starting point for a discussion of the same problem someone else was having. I ended up trying all the proposed solutions: rmmod-ing ehci-hcd, blacklisting the module, regenerating an initrd image with the blacklisted module... None of these worked (individually) and I ended up having an even more unstable system when ehci was not present (I was losing the root FS approx. 1 min after logging in remotely right after start). Restoring the image with ehci was a pain (not to mention the time spent trying to figure out a way to revert the change safely, since bricking the machine was not out of the question - imagine losing the FS while flashing the ram drive image).

I finally managed to revert to the ehci enabled image and tried the workaround proposed in this gentoo BR. I tried 128 and now I am at 64, and I am still getting those resets. Still, something good came out of this: I followed Michael Prokop's Use root=UUID on NSLU2 article and I think I now have a stable root file system just thanks to him.

Now I am stuck:


grep -E '(max_sect|reset)' /var/log/syslog
Aug 3 07:02:27 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2
Aug 3 08:27:38 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2
Aug 3 08:38:43 ritter manual message: max_sectors was set to 128
Aug 3 08:55:25 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2
Aug 3 09:31:47 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2
Aug 3 09:54:11 ritter set_max_sectors: max_sectors was set to 128
Aug 3 10:09:32 ritter set_max_sectors: max_sectors was set to 128
Aug 3 10:15:27 ritter set_max_sectors: sda's max_sectors was set to 128
Aug 3 10:24:24 ritter set_max_sectors: sda's max_sectors was set to 128
Aug 3 10:38:12 ritter set_max_sectors: sda's max_sectors was set to 128
Aug 3 10:47:04 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2
Aug 3 10:48:27 ritter set_max_sectors: sda's max_sectors was set to 64
Aug 3 11:04:36 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2
Aug 3 11:06:16 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2
....
Aug 3 15:05:58 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2

Should I try setting to 32?
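One way to judge whether lowering the value changes anything is to count the resets logged after each setting. A minimal sketch, assuming the two message formats shown in the log above; count_resets is an illustrative name, and "default" simply means "before the first logged setting":

```shell
# Count USB resets per max_sectors value found in a syslog file.
count_resets() {
    awk '
        BEGIN { current = "default" }
        # A "max_sectors was set to N" line starts a new period.
        /max_sectors was set to/ { current = $NF; next }
        # Every reset is charged to the value in effect at that time.
        /reset high speed USB device/ { resets[current]++ }
        END { for (value in resets) print value, resets[value] }
    ' "$1" | LC_ALL=C sort
}
```

Running count_resets /var/log/syslog prints one line per value followed by the number of resets seen while it was in effect. The counts say nothing about how long each value was active, so they are only a rough hint.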

In case you're wondering, I am doing the setting with this script:

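#!/bin/sh
# Lower max_sectors for the disk that holds the root file system.

MAX_SECTORS=64
# The root entry in fstab is a /dev/disk/by-uuid/... symlink: follow it,
# keep the last path component (e.g. sda1) and strip the partition number.
DISK=$(ls -l `grep 'uuid.* / ' /etc/fstab | awk '{print $1;}'` | sed 's#.*/\(.*\)$#\1#' | tr -d '[0-9]')

if [ -z "$DISK" ]; then
    logger -t 'set_max_sectors' "error: could not detect root disk"
    exit 1
fi

echo "$MAX_SECTORS" > /sys/block/$DISK/device/max_sectors
logger -t 'set_max_sectors' "${DISK}'s max_sectors was set to ${MAX_SECTORS}"

exit 0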


Dear (not at all) lazy web, what can I do to fix/workaround this issue? Any help would be greatly appreciated (personal mail or comments).

Note: I am reluctant to try the fix proposed in the last comment that claims to fix the issue since compiling an arm kernel natively is a pain and I would like to have an official kernel from Debian Etch, as I do now.

Thursday 2 August 2007

softfloat rrdtool sequel

Hey, I am back from vacation. I was back on Saturday but I didn't have the time to blog or do Debian work, so hello :-D.

After coming back from my vacation I went back to the softfloat rrdtool thingie I was working on before and I finally managed to compile a statically linked softfloat uclibc rrdtool. Needless to say, I found that a pain and I ended up:
  • avoiding the Makefile and doing the linking manually because:
    • for some unknown reason, the -static option didn't end up in the final link call
    • libfreetype and libart_lgpl were always added as .so files... - libtool is probably the usual suspect, but I suspect the upstream libraries themselves
  • avoiding xmerge completely and using the source directly since:
    • CFLAGS="-Wl,-static" did not do the right thing (it didn't in the raw source, either)
    • I was about to make a static rrdtool, so I didn't need gentoo's ebuilds at that point

If you are curious about the hideous details, just add a comment or email me and I'll send them or publish them.

So, in the end, I managed to produce a statically linked arm-softfloat-uclibc rrdtool which:
  • did not work directly on the .rrd files generated by the debian packaged collectd since they were "generated on another architecture"; well, with debian's rrdtool, 'rrdtool dump' was not that slow, so it was acceptable to dump with debian's rrdtool and restore with the statically linked one
  • generated graphs in about 1 minute (way slower than my amd64 machine, which does it in 3s, but at least 100+ times faster than debian's arm rrdtool - remember, I tested the generation and it didn't finish even the first graph after 50 minutes)
So that was fun. The total round trip for a round of graph generation was about 15 min which was more than cool and fast compared to the slow native variant, whohoo!
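The dump and restore round trip can be sketched as a small script. This is only an illustration: the two binary paths, the RRD_DUMP and RRD_RESTORE variables and the convert_rrds name are assumptions, while 'rrdtool dump' and 'rrdtool restore' are the regular subcommands:

```shell
#!/bin/sh
# Convert .rrd files via XML so a different rrdtool build can read them.
# Both paths below are assumed locations, adjust them to the real ones.
RRD_DUMP=${RRD_DUMP:-/usr/bin/rrdtool}                     # distribution build
RRD_RESTORE=${RRD_RESTORE:-/usr/local/bin/rrdtool-static}  # static softfloat build

convert_rrds() {
    src_dir=$1
    dst_dir=$2
    mkdir -p "$dst_dir" || return 1
    for rrd in "$src_dir"/*.rrd; do
        [ -e "$rrd" ] || continue
        name=$(basename "$rrd" .rrd)
        # Dump to XML with the build that understands the original file...
        "$RRD_DUMP" dump "$rrd" > "$dst_dir/$name.xml" || return 1
        # ...and rebuild the binary file with the other build.
        "$RRD_RESTORE" restore "$dst_dir/$name.xml" "$dst_dir/$name.rrd" || return 1
        rm -f "$dst_dir/$name.xml"
    done
}
```

Calling convert_rrds with the collectd data directory and a scratch directory leaves converted copies in the latter and never touches the originals.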

Now the downside, I am not through with it. If you look at the picture you'll notice that there is a "small-ish" problem: there are no scales, legends and no letters/glyphs of any kind, so the graphs are still quite useless, but better than nothing:


They should look something like this:




Notice any difference? :-/

Of course, I won't stop here!

I suspect the problems come from the cross compiled libfreetype and libart_lgpl, and I'll probably end up doing some native compiling after all, sigh!

Friday 20 July 2007

Vacation, yes!

Starting tomorrow I will be on vacation until the 28th of July. Excellent! So if you expect something from me, just keep it on hold until next month :-P .

One week of sunbathing on the Bulgarian seaside. You might ask why Bulgaria and not Romania since both have access to the same sea?

It appears that my predictions about the price of a ticket for a vacation on the Romanian seaside exceeding the price of one in Bulgaria or Greece (including transport) came true (some might say this was the case last year, too, but this year it is more obvious). I'll see if the services will be better; afaik, they should be.

Wednesday 18 July 2007

How to build rrdtool with soft-float for a debian arm machine (which uses hard float)

Being a good Debian user

Notes: the bulk of this article was written on the 17th; updates were added later; this article is really long, so you might want to read it in several sittings

Debian's arm port uses hardfloat: the binaries contain floating point instructions. This means that apps like rrdtool (graph function) take ages to run on a machine with no floating point unit, because each of those instructions has to be trapped and emulated by the kernel. In order to make this faster you'll want to use soft-float, where the floating point math is done in library code instead. But this support needs to be present in all depended-upon libs, including libc, which is an ABI-incompatible change that would break all existing binaries. I can't use the still-in-development and unofficial armel port since the machine has to be up and running almost 24/7 and I can't afford downtime.

So what to do?

Well, obviously use another soft-float enabled libc, like uclibc, in parallel with the hard-float enabled libc. Debian Sid contains uclibc development libs, but building the whole toolchain is a pain. There is no real support in the distro itself and building on the arm machine is kind of out of the question. I tried a little in a debootstrapped chroot on the native machine, but this was a pain and I was stuck trying to figure out how to disable the whistles and bells for rrdtool since I didn't need any of the perl, tcl or python support. So I stopped, went to bed and decided I'd continue the next day (the day I wrote the biggest part of this article). The future looked dark, trying to build a softfloat uclibc rrdtool on a system that wasn't prepared for that and from which softfloat was pulled away somewhere during etch's development stage....

Deciding to go with the competition - native building

From link to link I found out that gentoo has an arm-softfloat-linux-uclibc port, among others, so I decided to use that on the target machine (an NSLU2) in a chroot. Well, I stumbled on the low computing power when I tried to emerge sync the tree. I gave up on that and tried to use emerge-websync. It looked better, but I didn't get too far with this either since the machine ended up being overloaded, with figures varying from 3 to 8 for the load average (all of them).

Deciding to go with the competition - cross building

It was time to bring in the big guns and I decided I was going to crossbuild those binaries, probably statically since it meant lower risk for things to break on the target machine.


Since I observed that gentoo has a nice tool called crossdev which actually uses emerge (the equivalent of dpkg+apt in the Debian world) to make the cross toolchain, and saw that they have some really good documentation and a development version of an arm-softfloat-linux-uclibc, I decided I could use that.

So I went on and downloaded the latest stage3 tarball for x86 (so I can have a gentoo system to start with), did all the necessary things and created a gentoo chroot. I emerge sync-ed, updated portage, installed crossdev (following the nice instructions on the gentoo embedded site about the subject). This took a while, but it went like a breeze. I made the necessary arrangements to be able to cross build packages (setting SYSROOT, making that nice xmerge script).

As my goal was to make an arm-softfloat-linux-uclibc binary for rrdtool I tried directly to xmerge rrdtool (of course, first just pretending, with -p).

Finding bugs in the packages

First bug

All went fine until I got to freetype, which for some weird reason detected the i486-pc-linux-gnu-gcc compiler as i486-pc-linux-gnu (observe the missing trailing -gcc). I looked over the logs and determined that the CC_BUILD environment variable was not set. So I exported CC_BUILD=i486-pc-linux-gnu-gcc and xmerged the library.
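As a snippet, the workaround looks roughly like this. It is a sketch under assumptions: the build_cc helper is made up for illustration and the build triplet is taken from CBUILD, falling back to the i486-pc-linux-gnu value of this chroot:

```shell
#!/bin/sh
# Turn a build triplet into the name of the matching gcc.
build_cc() {
    echo "${1}-gcc"
}

# Assumed default: the triplet of the x86 chroot described above.
CBUILD=${CBUILD:-i486-pc-linux-gnu}
CC_BUILD=$(build_cc "$CBUILD")
export CC_BUILD

# Warn early if that compiler is not on PATH in this environment.
command -v "$CC_BUILD" >/dev/null 2>&1 ||
    echo "warning: $CC_BUILD not found in PATH" >&2

# Then rebuild the library with the cross wrapper, e.g.:
#   xmerge freetype
```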

(While trying to fix this issue I entered #gentoo-embedded on freenode and actually met Yuri Vasilev, whom I had met in Edinburgh, at DebConf. Look for the video about Ligusk if you are interested in his project to make a Debian-like distribution based on Gentoo. It was nice to meet a friendly "face" - as much as you can say that online.)


I reported the bug (there is already a patch; I haven't tested it, but it looks ok) and the workaround in the gentoo bugzilla... after creating an account, since I had never used that BTS before.

Second bug

The next pain was libart_lgpl, whose upstream, for convenience, chose to detect the target type sizes (basically ignoring inttypes.h) with a small program which was built with the cross compiler and was later supposed to be run on the host. I don't think it is necessary to say that it failed, of course, since the binary was an ARM binary, while the host was an x86 machine. Yupee! Not.

Ok, two bugs, no softfloat rrdtool yet. I reported this second bug and tried to work around the issue. Since I hadn't really used a gentoo system much, I wanted to know details and what possibilities I had. I got some really nice tips from "solar" (aka Ned Ludd) on #gentoo-embedded. He pointed out that while building a package one can press CTRL+Z, do some stuff, like altering the Makefile.in file in the build directory, and then fg back. Ok, cool.

All I needed now was somebody with direct access to an arm machine who could compile and run the program in question for me. seanius on #debian-devel was kind enough to do this for me and, to my surprise, I got the same output header file as I got in the host x86 chroot. But that detail is not important, let's just compile that rrdtool. Well, first libart_lgpl...

Ok, I CTRL+Z-ed the build process just after the configure stage ended, cd-ed into the build directory, tampered with the Makefile.in file and changed that line from

./gen_art_config > gen_art_config.h

into

cp /gen_art_config.h.arm gen_art_config.h

Haha! In your face! Ok, the nasty deed was done, so I fg-ed and went on, confident that I would get my hands on that rrdtool very soon. The package build ended successfully, without any other surprises, so I went for the grail!
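Since the build has to be paused at the right moment, having the edit ready as a command helps. A minimal sketch of the same substitution done with sed; patch_art_makefile is an illustrative name, and the header path must not contain '#' or '&' because it is pasted into the sed expression:

```shell
# Replace the step that runs the freshly built gen_art_config binary with
# a copy of a header that was generated beforehand on real ARM hardware.
patch_art_makefile() {
    makefile=$1
    header=$2
    sed "s#\./gen_art_config > gen_art_config\.h#cp $header gen_art_config.h#" \
        "$makefile" > "$makefile.tmp" && mv "$makefile.tmp" "$makefile"
}
```

From inside the paused build directory this would be called as patch_art_makefile Makefile.in /gen_art_config.h.arm, followed by fg.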


xmerging rrdtool, the bugs don't stop

xmerge -v rrdtool

Bla, bla, bla.... bla, bla, bla, what do you know, there are a few IEEE math function tests in the configuration script of rrdtool. They failed. I don't know why, but I have a feeling that this is to be expected on softfloat, or maybe the rrdtool source is just not that cross-compile aware. I didn't care that much, so I did what I knew best: tampering with an ongoing build.

This time it was a little bit harder to catch since I had to wait for the source to unpack (well, there is a larger time frame, but that is the rough timing) and CTRL+Z the build at that moment.

I didn't have much time to waste, so after looking briefly over the tests I realized I should just ignore the error and look for the "failing command" that was interrupting my beloved build. Muhahah!

Thank God for verbose error messages, I could find the place fast and changed the two "exit 1" statements into two harmless 'echo "Ignoring IEEE math errors"', sic. I had to do this a few times since I got the timing wrong, and at some point I got the syntax wrong (at which point I remembered the cp trick I did earlier and saved the modified file in /configure.rrdtool )...

xmerging rrdtool and the 4th bug

After getting past this stage, I got a compilation error which, judging from the command, meant that the linker tried to link the target (arm) binary against the host (x86) libraries.

Oooook, this is not going to stop me! I looked at the Makefile.in to try to understand why this happened and then realized it wasn't worth it for me to try to fix it the right way. So I cheated once again! Muhaha, I just ran some useful commands which I'll let you feast your eyes on:

ln -sf /usr/arm-softfloat-linux-uclibc/usr/lib/libart_lgpl_2.so /usr/lib/
ln -sf /usr/arm-softfloat-linux-uclibc/usr/lib/libpng12.a /usr/lib/
ln -sf /usr/arm-softfloat-linux-uclibc/usr/lib/libfreetype.so /usr/lib/

Ok, good to go. Of course, I had to stop the build again to copy the "fixed" configure script that ignored IEEE math errors, but that is already history.

And, believe it or not, the build actually succeeded. I had a cross built rrdtool using shared libraries built against softfloat uclibc. No guarantees about its correctness, but it built!

This is when I decided to start writing this article.

Still there are things to do

I will have to do a few things before being happy about the victory:
  • check that all the resulting binaries are indeed ARM ELF files, and not Intel x86. It appears the shared libs already are.
    • done, they are (checked the next day)
  • figure out qpkg (a utility from portage-utils which seems to be a way to make binary packages) to somehow package the result of this work, or maybe just ignore it and use plain cp
  • see how I can fit together both the softfloat uclibc binaries and the native debian binaries on the same filesystem, including ld.so configuration...
    • but I fear this will be a nightmare, so I might decide to go with static compilation of rrdtool, which I don't think will be straightforward given the experiences described above; there could also be the size issue to take into account, and since the NSLU2 has only 32MB of RAM, this could be problematic
      • I think I'll have to use a statically linked binary after all (note added later, on the second day)
  • see if the rrdtool binary really works and if all the work paid off, performance wise - it should, taking into account what I have heard from wookey during the debconf7 talk about the armel (warning, 73MB ogg file) port
  • be happy with the solution I got or wait for the Debian armel port to catch up and switch to that asap
    • if I have to do the switch I'll probably think of ways of migrating the whole system from arm to armel instead of reinstalling everything; this might prove useful for the whole debian armel port, when people are working on migration plans, since arm will be deprecated in favour of armel
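For the first item on the list, the architecture can be read straight from the ELF header. A minimal sketch, assuming little-endian ELF files (true for both the x86 chroot and this arm target); elf_machine and elf_arch are illustrative names, 40 is the EM_ARM machine number and 3 is EM_386:

```shell
#!/bin/sh
# Print the low byte of the e_machine field (offset 18 of the ELF header),
# or fail if the file does not start with the ELF magic number.
elf_machine() {
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    [ "$magic" = "7f454c46" ] || return 1
    od -An -tu1 -j18 -N1 "$1" | tr -d ' \n'
}

# Map the machine number to a short name.
elf_arch() {
    case "$(elf_machine "$1")" in
        40) echo arm ;;
        3)  echo x86 ;;
        "") echo not-elf ;;
        *)  echo other ;;
    esac
}
```

Looping over the files in /usr/arm-softfloat-linux-uclibc/usr/lib and printing elf_arch for each one shows at a glance whether anything x86 slipped in. Where the file utility is installed, it reports the same information in more detail.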