Showing posts with label linux. Show all posts

Saturday, 25 March 2017

LVM: Converting root partition from linear to raid1 leads to boot failure... and how to recover

I have a system which has 3 distinct HDDs used as physical volumes for Linux LVM. One logical volume is the root partition and it was initially created as a linear LV (vg0/OS).
Since I have PV redundancy, I thought it might be a good idea to convert the root LV from linear to raid1 with 2 mirrors.

WARNING: It seems an LVM raid1 logical volume for / is not supported by grub2, at least not with Ubuntu's 2.02~beta2-9ubuntu1.6 (14.04LTS) or Debian Jessie's grub-pc 2.02~beta2-22+deb8u1!

So I did this:
lvconvert -m2 --type raid1 vg0/OS
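Before and after such a conversion, the layout can be inspected with LVM's standard reporting options (the VG/LV names below match this post; adjust them to your setup):

```shell
# Show segment type, mirror sync progress and backing devices of the LVs
lvs -a -o name,segtype,copy_percent,devices vg0
# A linear LV reports segtype 'linear'; after the conversion it reports
# 'raid1' plus hidden rimage/rmeta sub-LVs for each mirror leg.
```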

Then I restarted to find myself at the 'grub rescue>' prompt.

The initial problem was seen on an Ubuntu 14.04 LTS (aka trusty) system, but I reproduced it on a VM with Debian Jessie.

I downloaded the Super Grub2 Disk and tried to boot the VM. After choosing the option to load the LVM and RAID support, I was able to boot my previous system.

I tried several times to reinstall GRUB, thinking that was the issue, but I always got this kind of error:


/usr/sbin/grub-probe: error: disk `lvmid/QtJiw0-wsDf-A2zh-2v2y-7JVA-NhPQ-TfjQlN/phCDlj-1XAM-VZnl-RzRy-g3kf-eeUB-dBcgmb' not found.

In the end, after digging for answers for more than 4 hours, I decided I might be able to revert the configuration to linear from the (initramfs) prompt.

Initially the LV was inactive, so I activated it:

lvchange -a y /dev/vg0/OS

Then restored the LV to linear:

lvconvert -m0 vg0/OS

Then, just for kicks, I tried to reboot without reinstalling GRUB, which succeeded.
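For anyone landing at the same 'grub rescue>' prompt, the whole recovery described above, done from the initramfs shell, condenses to (assuming the same vg0/OS names as in this post):

```shell
# From the (initramfs) prompt, after booting e.g. via Super Grub2 Disk:
lvchange -a y /dev/vg0/OS   # the LV starts out inactive, so activate it
lvconvert -m0 vg0/OS        # drop the mirrors, i.e. revert raid1 to linear
reboot                      # GRUB did not even need reinstalling
```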

In order to confirm this was the issue, I redid the whole thing, and indeed, with a raid1 root, I always got the lvmid error.

I'll have to check on Monday at work if I can revert the Ubuntu 14.04 system the same way, but I suspect I will have no issues.


Is it true that root on lvm-raid1 is not supported?

Monday, 2 September 2013

Integrating Beyond Compare with Semanticmerge

Note: This post will probably not be to the liking of those who think free software is always preferable to closed source software; if you are such a person, please take this article as an invitation to implement better open source alternatives that can realistically compete with the closed source applications I mention here. I am not going to detail where the open source alternatives are not up to the level of the commercial tools, I'll leave that for the readers or for another article.



Semanticmerge is a merge tool that attempts to do the right thing when it comes to merging source code. It is language aware and currently supports Java and C#. Just today the creators of the software started working on support for C.

Recently they added Debian packages, so I installed it on my system. For open source development, Codice Software, the creators of Semanticmerge, offer free licenses, so I decided to ask for one today and, although it is Sunday, I received an answer and will get my license on Monday.

When a method is moved from one place to another and changed in a conflicting way in two parallel development lines, Semanticmerge can isolate the offending method and pass all its incarnations (base, source and destination or, if you prefer, base, mine and theirs) to a text-based merge tool to allow the developer to decide how to resolve the merge. On Linux, the Semanticmerge samples use kdiff3 as the text-based merge tool, which is nice, but I don't use kdiff3, I use Meld, another open source visual tool for merges and comparisons.


OTOH, Beyond Compare is a merge and compare tool made by Scooter Software which provides a very good text-based 3-way merge with a 3 sources + 1 result pane, and can compare both files and directories. Two of its killer features are the ability to split differences into important and unimportant ones according to the syntax of the compared/merged files, and the ability to easily change or extend the syntax rules in a very user-friendly way. This makes it easy to ignore changes in comments, but also basic refactoring such as variable renaming, or other trivial code-wide changes, which lets the developer focus on the important changes/differences during merges or code reviews.

Syntax support for common file formats like C, Java, shell, Perl etc. is built in (but can be modified, which is a good thing), and new file types with their syntaxes can be added via the GUI, from scratch or based on existing rules.

I evaluated Beyond Compare at my workplace and we decided it would be a good investment to purchase licenses for the people in our department.


Having these two pieces of software separate is good, but having them integrated with each other would be even better. So I decided to see how it could be done. I installed Beyond Compare on my system, too, and looked through the examples.


The first thing I discovered is that the main assumption of the Semanticmerge developers was that the application would be called by the SCM when merges are to be done, so passing lots of parameters would not be a problem. I realised that when I saw how one of the samples' starting scripts invoked Semanticmerge:
semanticmergetool -s=$sm_dir/src.java -b=$sm_dir/base.java -d=$sm_dir/dst.java -r=/tmp/semanticmergetoolresult.java -edt="kdiff3 \"#sourcefile\" \"#destinationfile\"" -emt="kdiff3 \"#basefile\" \"#sourcefile\" \"#destinationfile\" --L1 \"#basesymbolic\" --L2 \"#sourcesymbolic\" --L3 \"#destinationsymbolic\" -o \"#output\"" -e2mt="kdiff3 \"#sourcefile\" \"#destinationfile\" -o \"#output\""
Can you see the problem? It seems Semanticmerge has no persistent knowledge of the user's preferences with regard to the text-based merge tool and exports the issue to the SCM, at the price of overcomplicating the command line. I already mentioned this issue in my license request mail and added the issue and my suggested fix to their feature-voting system.

The upside was that by comparing the command line of the kdiff3 invocations, the kdiff3 documentation and, by comparison, the Beyond Compare SCM integration information, I could deduce the command line necessary for Semanticmerge to use Beyond Compare as an external merge and diff tool.

The -edt, -emt and -e2mt options specify how the external diff tool, external 3-way merge tool and external 2-way merge tool are to be called. Once I understood that, I split the problem into its obvious parts: each invocation had to be mapped from kdiff3 options to Beyond Compare options, adding the occasional bell and whistle where possible.

The parts to figure out, ordered by complexity, were:

  1. -edt="kdiff3 \"#sourcefile\" \"#destinationfile\""

  2. -e2mt="kdiff3 \"#sourcefile\" \"#destinationfile\" -o \"#output\""

  3. -emt="kdiff3 \"#basefile\" \"#sourcefile\" \"#destinationfile\" --L1 \"#basesymbolic\" --L2 \"#sourcesymbolic\" --L3 \"#destinationsymbolic\" -o \"#output\""

Semanticmerge integrates with kdiff3 in diff mode via the -edt option. This was easy to map to Beyond Compare; I just replaced kdiff3 with bcompare:
-edt="bcompare \"#sourcefile\" \"#destinationfile\""
Integration for 2-way merges was also quite easy; the mapping to Beyond Compare was:
-e2mt="bcompare \"#sourcefile\" \"#destinationfile\" -savetarget=\"#output\""
For the 3-way merge I was a little confused because the Beyond Compare documentation and options were inconsistent between Windows and Linux. On Windows, for some of the SCMs, the options that set the titles of the panes are '/title1', '/title2', '/title3' and '/title4' (way too descriptive for my taste /sarcasm), but for others they are '/lefttitle', '/centertitle', '/righttitle', '/outputtitle', while on Linux the options are of the more explicit kind, but with a '-' instead of a '/'.

The basic part was easy: ordering the parameters as 'source, destination, base, output' instead of kdiff3's 'base, source, destination, -o output'. Then I wanted to add the bells and whistles, since it really makes more sense for the developer to see something like 'Destination: [method] readOptions' instead of '/tmp/tmp4327687242.tmp', and because that's exactly what is necessary for Semanticmerge when merging methods: on conflicts, the various versions of the functions are placed in temporary files whose names don't mean anything.

So, after some digging into the examples from Beyond Compare and kdiff3 documentation, I ended up with:
-emt="bcompare  \"#sourcefile\" \"#destinationfile\" \"#basefile\" \"#output\" -lefttitle='#sourcesymbolic' -righttitle='#destinationsymbolic' -centertitle='#basesymbolic' -outputtitle='merge result'"

Sadly, I wasn't able to identify the symbolic name for the output, so I added the hard-coded 'merge result'. If the Codice people would like to help me with this information (or if it exists), I would be more than willing to update the information and make the necessary changes.

Then I added the bells and whistles for the -edt and -e2mt options, ending up with this monstrosity:
semanticmergetool -s=$sm_dir/src.java -b=$sm_dir/base.java -d=$sm_dir/dst.java -r=/tmp/semanticmergetoolresult.java -edt="bcompare \"#sourcefile\" \"#destinationfile\" -lefttitle='#sourcesymbolic' -righttitle='#destinationsymbolic'" -emt="bcompare  \"#sourcefile\" \"#destinationfile\" \"#basefile\" \"#output\" -lefttitle='#sourcesymbolic' -righttitle='#destinationsymbolic' -centertitle='#basesymbolic' -outputtitle='merge result'" -e2mt="bcompare \"#sourcefile\" \"#destinationfile\" -savetarget=\"#output\" -lefttitle='#sourcesymbolic' -righttitle='#destinationsymbolic'"
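Until persistent settings exist, one way to keep the SCM side sane is to hide the whole invocation behind a small wrapper script. The script name and layout below are my own invention, not something Semanticmerge provides:

```shell
#!/bin/sh
# semanticmergetool-bc: hypothetical wrapper so the SCM only ever needs to call
#   semanticmergetool-bc -s=SRC -b=BASE -d=DST -r=RESULT
# while the Beyond Compare integration options stay in one place.

# External diff tool (2 panes, titled)
BC_EDT="bcompare \"#sourcefile\" \"#destinationfile\" -lefttitle='#sourcesymbolic' -righttitle='#destinationsymbolic'"

# External 3-way merge tool (3 sources + 1 result, titled panes)
BC_EMT="bcompare \"#sourcefile\" \"#destinationfile\" \"#basefile\" \"#output\" -lefttitle='#sourcesymbolic' -righttitle='#destinationsymbolic' -centertitle='#basesymbolic' -outputtitle='merge result'"

# External 2-way merge tool (save into the result file)
BC_E2MT="bcompare \"#sourcefile\" \"#destinationfile\" -savetarget=\"#output\" -lefttitle='#sourcesymbolic' -righttitle='#destinationsymbolic'"

exec semanticmergetool "$@" -edt="$BC_EDT" -emt="$BC_EMT" -e2mt="$BC_E2MT"
```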
So when I 3-way merge a function I get something like this (sorry for the high resolution, lower resolutions don't do justice to the tools):



I don't expect this post to remain relevant for too long, because after I sent my feedback to Codice, they were open to my suggestion to have persistent settings for the external tool integration, so, in the future, the command line could probably be as simple as:
semanticmergetool -s=$sm_dir/src.java -b=$sm_dir/base.java -d=$sm_dir/dst.java -r=/tmp/semanticmergetoolresult.java
And the integration could be done via the GUI, while the command line can become a way to override the defaults.

Monday, 4 March 2013

So, I did the responsible thing and fixed appdirs

In my previous post I expressed my frustration at the way a perfectly fine idea, a portable way to get the standard configuration and data directories/files, was broken for Linux and BSD, because the authors of appdirs thought the XDG standard was "subject to some interpretation".

Although I said I decided not to use appdirs, I realised that wouldn't help anyone, so I fixed the code.

During the coding phase I discovered that the authors of appdirs broke the XDG standard even more, this time ignoring XDG_DATA_DIRS while merely talking about XDG_CONFIG_DIRS. When I found this I became convinced the *nix part of the implementation was subject to continuous irony: the comment in this newly found breakage said "Perhaps should *use* that envvar", referring to XDG_CONFIG_DIRS, while the code below hard-codes:

/etc/xdg/<appname>

Sweet, isn't it?
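For reference, the spec's defaults are trivial to express with shell parameter expansion; the app name 'myapp' below is just a placeholder:

```shell
# XDG Base Directory spec defaults when the variables are unset or empty:
config_dirs="${XDG_CONFIG_DIRS:-/etc/xdg}"
data_dirs="${XDG_DATA_DIRS:-/usr/local/share/:/usr/share/}"

# Per-application search path: append the app name to each colon-separated entry
appname=myapp
echo "$config_dirs" | tr ':' '\n' | sed "s|\$|/$appname|"
```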

If you want a fixed version, you can grab it from my repository, on the linux-fixes branch:

https://github.com/eddyp/appdirs/tree/linux-fixes

Friday, 25 January 2013

(Serial) console flooded with kernel messages?

(If you want to skip the explanations and see how to stop the Linux kernel from flooding the console with low-importance messages, go straight to the bottom of the article; it's the small bit at the end with the larger font.)

After connecting to the serial console on my Linksys WRT160NL router I was faced with the problem that the console was flooded with all sorts of messages such as:
DROP IN=eth1 OUT= SRC=X.Y.Z.W DST=255.255.255.255 PROTO=UDP SPT=58488 DPT=2008 LEN=26
DROP IN=eth1 OUT= SRC=X.Y.Z.U DST=178.156.183.255 PROTO=UDP SPT=137 DPT=137 LEN=58
DROP IN=eth1 OUT= SRC=X.Y.Z.W DST=255.255.255.255 PROTO=UDP SPT=58488 DPT=2008 LEN=26
ACCEPT IN=br0 OUT=eth1 SRC=a.b.c.d DST=69.171.246.16 PROTO=TCP SPT=3651 DPT=443
DROP IN=eth1 OUT= SRC=X.Y.Z.U DST=178.156.183.255 PROTO=UDP SPT=137 DPT=137 LEN=58
DROP IN=eth1 OUT= SRC=X.Y.Z.W DST=255.255.255.255 PROTO=UDP SPT=58488 DPT=2008 LEN=26
DROP IN=eth1 OUT= SRC=X.Y.Z.W DST=255.255.255.255 PROTO=UDP SPT=58488 DPT=2008 LEN=26
DROP IN=eth1 OUT= SRC=X.Y.Z.W DST=255.255.255.255 PROTO=UDP SPT=58488 DPT=2008 LEN=26
DROP IN=eth1 OUT= SRC=X.Y.Z.W DST=255.255.255.255 PROTO=UDP SPT=58488 DPT=2008 LEN=26
DROP IN=eth1 OUT= SRC=X.Y.Z.W DST=255.255.255.255 PROTO=UDP SPT=58488 DPT=2008 LEN=26
DROP IN=eth1 OUT= SRC=X.Y.Z.W DST=255.255.255.255 PROTO=UDP SPT=58488 DPT=2008 LEN=26
DROP IN=eth1 OUT= SRC=X.Y.Z.W DST=255.255.255.255 PROTO=UDP SPT=58488 DPT=2008 LEN=26
DROP IN=eth1 OUT= SRC=178.156.183.146 DST=255.255.255.255 PROTO=UDP SPT=17500 DPT=17500 LEN=120
DROP IN=eth1 OUT= SRC=178.156.183.146 DST=178.156.183.255 PROTO=UDP SPT=17500 DPT=17500 LEN=120
DROP IN=eth1 OUT= SRC=178.156.183.146 DST=255.255.255.255 PROTO=UDP SPT=17500 DPT=17500 LEN=120
DROP IN=eth1 OUT= SRC=X.Y.Z.W DST=255.255.255.255 PROTO=UDP SPT=58488 DPT=2008 LEN=26
DROP IN=eth1 OUT= SRC=X.Y.Z.W DST=255.255.255.255 PROTO=UDP SPT=58488 DPT=2008 LEN=26
DROP IN=eth1 OUT= SRC=178.156.177.142 DST=255.255.255.255 PROTO=UDP SPT=17500 DPT=17500 LEN=153
DROP IN=eth1 OUT= SRC=178.156.177.142 DST=255.255.255.255 PROTO=UDP SPT=17500 DPT=17500 LEN=153

The serial console was working, but it was impossible to do anything practical in these conditions. I tried searching the net for 'linux stop console flooding' and similar terms, but didn't get too far, except for the fact that the problem was the loglevel.

Here is the explanation of what this means (quote from Documentation/kernel-parameters.txt):

        loglevel=       All Kernel Messages with a loglevel smaller than the
                        console loglevel will be printed to the console. It can
                        also be changed with klogd or other programs. The
                        loglevels are defined as follows:

                        0 (KERN_EMERG)          system is unusable
                        1 (KERN_ALERT)          action must be taken immediately
                        2 (KERN_CRIT)           critical conditions
                        3 (KERN_ERR)            error conditions
                        4 (KERN_WARNING)        warning conditions
                        5 (KERN_NOTICE)         normal but significant condition
                        6 (KERN_INFO)           informational
                        7 (KERN_DEBUG)          debug-level messages


This was enough to make me go through my local git tree of the kernel, into the Documentation directory, and grep for loglevel. This brought me to this interesting bit from Documentation/sysctl/kernel.txt:
==============================================================

printk:

The four values in printk denote: console_loglevel,
default_message_loglevel, minimum_console_loglevel and
default_console_loglevel respectively.

These values influence printk() behavior when printing or
logging error messages. See 'man 2 syslog' for more info on
the different loglevels.

- console_loglevel: messages with a higher priority than
  this will be printed to the console
- default_message_loglevel: messages without an explicit priority
  will be printed with this priority
- minimum_console_loglevel: minimum (highest) value to which
  console_loglevel can be set
- default_console_loglevel: default value for console_loglevel

==============================================================

So I ran 'cat /proc/sys/kernel/printk' and got (I managed to read it through the flood of messages from the firewall):

7       4       1       7
According to the explanations above, that meant console_loglevel was too high, so to fix it I ran:
echo '2 4 1 7' > /proc/sys/kernel/printk
And, behold, the serial console was usable.
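To make the quieter loglevel survive a reboot, the same value can be set via sysctl or on the kernel command line (the paths below are the common defaults; check your distribution's conventions):

```shell
# One-off, equivalent to writing /proc/sys/kernel/printk directly:
sysctl -w kernel.printk="2 4 1 7"

# Persistent across reboots:
echo 'kernel.printk = 2 4 1 7' >> /etc/sysctl.conf

# Or pass the console loglevel on the kernel command line at boot:
#   loglevel=2
# The suppressed messages are still logged; read them later with dmesg.
```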

Tuesday, 22 January 2013

(Rsnapshot) backup and security - I see problems

In my previous post I was asking for suggestions for backup solutions that would be open/free software, do backups over the network to a local HDD, be cross platform to allow Windows and Linux clients and not be too CPU/memory hungry (on the server).

Several people suggested rsnapshot, BackupPC, areca-backup, and rsync. Thank you for all your suggestions, you have been a tremendous help. I have decided to give rsnapshot a try, since it was suggested that it would actually do what it is supposed to do for Windows clients, too (which was initially my perceived show-stopper for rsnapshot).

Still, when getting to the implementation, I was a little disappointed by the very permissive access that needs to be granted on the client machines, since the backup is initiated from the backup server. Even the so-called more secure suggested solutions seem way too permissive for my taste, since losing control of the backup server basically means giving away total access to the data on all client machines, which is quite a big problem in my opinion.

The data-transfer mechanism employed by rsnapshot is simply
  1. S ==(connects and reads all data)==> C
  2. S stores data in the final storage area
Am I the only one seeing a problem with this idea? If the server can connect to all your client machines and read all areas as it pleases, even if you restrict it to some directories, the data is already compromised when the backup server is compromised (think .ssh private keys, files with wireless network passwords and so on; I won't say card information - you don't keep credit/debit card information on your computer, or at least not in plain text, do you?).

What I would consider a better alternative would be a server-initiated dialogue which goes a little like this (S is server, C is client, '=' represent connections via ssh):
  1. S ---(requests backup initiation procedure)---> C
  2. S waits for a defined period of time for C to connect back and send (already encrypted) data; if nothing arrives, it aborts
  3. S <===(sends encrypted data to be backed up)=== C
  4. S <-(signals the completion of data transfer)-- C
  5. S stores the data in the final storage area
This way, the server can allow access from designated clients only into designated areas (even a chroot is possible); access can even be granted only after a port-knocking procedure and only during the backup time frame (since the server initiates the negotiation, it can expect the knocks only then), so the server is quite well secured. The connection to the server can even be made through an unprivileged account; it can even be one account per client machine, limited to an scponly shell, if you care for that level of security.
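For what it's worth, the 'designated areas for designated clients' part can be approximated with per-client unprivileged accounts and an OpenSSH forced command; rrsync ships with rsync (its install path varies by distribution), and the account names and paths below are only illustrative:

```shell
# On the backup server, in /home/backup-clientA/.ssh/authorized_keys:
# jail this key to rsync transfers under /srv/backups/clientA only
command="/usr/share/doc/rsync/scripts/rrsync /srv/backups/clientA",no-pty,no-port-forwarding,no-X11-forwarding,no-agent-forwarding ssh-rsa AAAA... clientA-backup-key
```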

On the other hand, the client information is secure, since it can be encrypted directly on the client machine and sent only after encryption; the client machine can decide and control what it sends, while the backup server can only store what the client provides. Also, if the server is compromised, the clients' data and systems aren't compromised at all, since the data is on the backup machine but encrypted with a key known only on the client (and a backup copy of it can be stored somewhere safe).

I am aware this approach can be problematic for user/group permission preservation, but that isn't an issue if there is a local <-> remote user mapping, or if the numeric IDs are simply kept.

I am also aware this means smarter clients, and might mean the Windows machines can't implement this completely, but a little more security than "here is all my data" can still be achieved, can't it?

What do other people think? Am I insane or paranoid?

I think I can implement this type of protocol in some scripts (at least one for server and one for clients) and use the backup_script feature of rsnapshot to keep this clean and nice within rsnapshot.

What might prove problematic with this approach is that the rsync speedup is lost (might be?), because the copy is done to a temporary directory which, I assume, is empty, so tough luck. Another problem seems to be that every time the backup is done, the client has to encrypt each of the files to back up, which seems to be a real performance penalty, especially if the data to be backed up is quite large.

Is there an encryption layer that does this automatically at file level, in a similar manner to what LUKS does for entire block devices? Having the right file names, but with scrambled/encrypted contents, seems to be the ideal solution from this PoV.

Thanks for reading and possible suggestions you might point me to.

P.S.: I just thought of this: if there were an encryption layer implemented with FUSE, mounted in some directory on the client machine, the default rsnapshot mechanism could actually work. This would mitigate both the data-accessibility issue and the performance issue, since that file system could be contained within a chroot and the encryption/scrambling would be done transparently on the client, so no data would be plainly accessible. Does anybody know such a FUSE implementation that does on-the-fly file encryption?

P.P.S.: EncFS does exactly what I want with its --reverse option, which is designed exactly for this purpose:
Normally EncFS provides a plaintext view of data on demand. Normally it stores enciphered data and displays plaintext data. With --reverse it takes as source plaintext data and produces enciphered data on-demand. This can be useful for creating remote encrypted backups, where you do not wish to keep the local files unencrypted.
Great!
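A minimal sketch of that setup on the client (the directory names are placeholders):

```shell
# Expose an on-the-fly encrypted view of the plaintext data; the server's
# rsnapshot pull then only ever sees ciphertext.
encfs --reverse /home/user/data /home/user/.encrypted-view

# ... point the server's rsnapshot at /home/user/.encrypted-view ...

fusermount -u /home/user/.encrypted-view   # unmount when done
```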

Monday, 16 January 2012

What's common between Windows 7 and GNOME 3 / gnome-shell?

Update: I managed to make sound work. For some weird reason, a mute switch among the many (and who knows how useful) switches of my sound card was enabled. Now the damn thing works. Did I mention that since the upgrade, all my sound cards (I have a USB sound card, too) have listed as available inputs in Audacity all the inputs of my internal sound card (mic, front mic, line in, CD, etc.)? That makes for a very confusing and loooong list of sound input sources! The upside is that I can finally record clips from televisions that do not provide such a feature and that FastVideoDownload doesn't handle.

I also seem to have found a possible fix for the caps-ctrl issue in Xfce4 (obviously, setting "-option ctrl:swapcap" in ~/.Xkbmap, instead of that Alt modifier).




As I said in my previous post, I will tell you what GNOME 3 and Windows 7 have in common.

Before everything else, I want to make it clear that when I am saying GNOME 3, I am referring to Debian Wheezy's GNOME 3, since I recently upgraded from Squeeze on my laptop. I'll probably drop a line or two about that, too.

First, I'll tell you about the (probably boring, for many) experience with Windows 7. As I said before, my new job requires me to use a Windows machine, so up until a few months ago I was using Windows XP with some additional software and tweaks to make it usable. Then came the Windows 7 „upgrade”. I am using quotes since the more appropriate term would be „fresh installation on a new partition”, not even close to what Debian users are used to calling an upgrade.

So after a fresh Windows 7 installation, my first shock was the fact there was NO Quick Launch*. Some of you might be laughing, but I had never used Windows 7 up until then, I had just seen it on the laptop of a friend of mine (Ovidiu, one of the guys with whom I am doing this podcast, went to Denkfest with, and made these interviews). That was the first shock. Initial discussions about this with Windows users led me to believe Quick Launch was dead and, for some unexplained reason, I believed them. Later, much later, a week ago to be precise, I found out that you can bring back the Quick Launch through some convoluted way**. Up until that point I had to have some icons pinned to the task bar, but some others on the desktop (and I hate that), because some of them, like Cygwin, if pinned, would start a cmd console, since Win 7 pins the process, not the starting script.

Among other things which broke in Win 7 but used to work fine in XP, the Virtual Dimension application, which provides me with virtual desktops, was the first casualty. I have been using a linear, 4-desktops-wide virtual desktop for over 5 years and I am worthless and inefficient if all my apps are on the same desktop. The mail application is always on the first desktop, work and file managers are on the second, the third is for extras and multimedia editing, while the fourth is my gateway to the internet, containing the browser, instant messenger, or whatever.

The shortcuts I use to get to the various desktops are the Win+1 ... Win+4 keyboard shortcuts, but the M$ Evil Empire decided that those shortcuts are going to start or bring forward the first, second and so on applications pinned on the task bar. And you can't change those shortcuts***. Nor is disabling just those possible, since they are all disabled through one huge switch which disables ALL Win+x keyboard shortcuts, among which Win+E (file explorer) and Win+D (Show Desktop) also were. Luckily, Win+L (lock screen) was not disabled. So I disabled all those Win+ shortcuts, since I need virtual desktops.

Now, imagine I had to start a Cygwin console while I had all sorts of apps open! Win+D was disabled, so I had to minimize the apps covering the desktop shortcut for Cygwin, click on the icon to start it, bring back the minimized windows and go on with my work. What a waste of clicks, mouse movement, energy and time, just because some dudes thought a Quick Launch-like feature was useless****.


You might already wonder what those '*' signs mean. Well, sadly, that's what GNOME 3 / gnome-shell and Windows 7 have in common.

GNOME 3 was a shock for me. An empty desktop right after the upgrade. No panels, no shortcuts*, no power indicators, no wicd indicator, no virtual desktops, no desktop icons (I have a few dirs and docs there). Sounds like an Evil Empire decision, doesn't it?

Luckily, I have been using Tilda as my always-ready console, so I could fire up iceweasel from the console to find out where my panel had disappeared to.

I then realised that the upgrade brought me Network Manager, the app which wicd replaced. As a consequence, I had no working wlan, since Network Manager made sure to mess with the network manager I had chosen.

After looking through the documentation of Network Manager and realising that either I had set it up to leave wlan0 alone or I didn't understand NM's documentation, I simply stopped the service, which let Wicd do its job flawlessly.

The first thing I searched for was „Gnome 3 panel” or something of that sort, and I was confronted with the obvious option to appeal to the Forced Fallback Mode, which was disabled. I figured I either had an old version, or Debian had disabled this feature (hoping they provided an alternative). There was also the option to conform to this convoluted way of working** with Actions and such uselessness. I still wonder, what is the purpose of the „Favourites” bar on the left side, since it's accessible only after wasting a lot of mouse movement and time? For Joe Pesci's sake, I use focus-under-mouse just to avoid needless mouse and keyboard manipulation. Why? Why? WHY would I want, every time I need to start or SWITCH to another application, to move the mouse to the upper-left corner, then take my hands off the mouse to type, then move the mouse downwards or across the whole width of the screen to get to my beloved virtual desktops and pick the app I want?

To make a long story short, after even trying XFCE4 (which for some unknown reason almost immediately resets my keyboard layout to the default, with Caps on Caps, instead of my preferred and set Ctrl on Caps - yes, it's global), I managed to find GNOME Shell Frippery**, which made the experience better.

Later I found out that GNOME 3's file manager, Nautilus, has decided that an „up one level” button is useless, since the default is to use that uncopy-pastable button location bar instead of a sane text location bar. And it seems the GNOME developers decided this*** and I should conform.

To add insult to injury, the icons from my old panel are apparently useless**** and even in the fallback mode I can't get them back. Or so the GNOME developers decided.

At some point this Sunday, I don't know how or why, producing sound became impossible. I know the problem is pulseaudio, since when I kill the pulseaudio daemon from the console I can play audio. BTW, great timing, just when I needed sound the most, before releasing episode 32 of our podcast (yay, I realised that xfce just decided to reset my Caps to be Caps again, after I set it to Ctrl a few minutes ago).

I know I praised pulseaudio when I first tried it, but failing to work out of the box or even after some tinkering is a deal breaker for me, so I removed it. Now I find that it is the default in GNOME, yet all it manages to do is prevent audio from working. At least on my machine.

Other problems? Gnome Power Manager manages to hang and block my session, and GNOME somehow managed to fail to start at some point. Yeah, and that sound problem which I hadn't fixed yet didn't go away after removing all the pulseaudio packages which could be removed (e.g. rhythmbox depends on libpulse0, as do some other apps like audacity, so I couldn't remove all pulse-related packages).

I got involved with Debian and GNU/Linux because it was tweakable and customisable and didn't force all sorts of options on me; now I find that with its increasing popularity it becomes more and more like the product of a corporation which decides to change some things just for the sake of change, totally disregarding user experience and usage.

So, in the light of all these problems, I think it's probably time to consider trying KDE. Is it any good lately?

Wednesday, 30 March 2011

Do you agree that Linux desktops are here?

I found some time ago this site, which aims to prove that Linux desktops are currently over 1%. So I registered my own and my wife's Debian GNU/Linux desktops to contribute to the statistics.

http://www.dudalibre.com/gnulinuxcounter?lang=en

As publicity for GNU/Linux, this project has potential to put Linux on the radar of more producers, so, please, add your desktops to the counter!

Saturday, 20 June 2009

MSI PR200WX-058EU sleep - 91a6c462b02d8dc02dbe95e5a407d78078a38d01 is first bad commit

Nailed it!

After a rocky start, I managed to find the commit that broke sleep for my laptop.


91a6c462b02d8dc02dbe95e5a407d78078a38d01 is first bad commit
commit 91a6c462b02d8dc02dbe95e5a407d78078a38d01
Author: H. Peter Anvin

Date: Wed Jul 11 12:18:57 2007 -0700

Use the new x86 setup code for x86-64; unify with i386

This unifies arch/*/boot (except arch/*/boot/compressed) between
i386 and x86-64, and uses the new x86 setup code for x86-64 as well.

Signed-off-by: H. Peter Anvin

Signed-off-by: Linus Torvalds



Simply reverting this commit wouldn't have fixed the problem entirely, since the screen remained blank even after the successful resumes; OTOH, the script used for testing was supposed to do things after resume, things with effects visible on the hard disk, so on the next reboot it was visible whether the last sleep/resume cycle was successful or not.


Great. Oh, and git bisect rules!
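For the record, the bisect itself follows the standard pattern; the version tags below are just examples, not the exact range I used:

```shell
git bisect start
git bisect bad  v2.6.23   # a kernel where sleep is broken (example tag)
git bisect good v2.6.22   # last known-good kernel (example tag)
# git checks out a midpoint; build it, boot it, test sleep/resume, then:
git bisect good           # or: git bisect bad
# repeat until git prints the "first bad commit" message, then:
git bisect reset          # return to the original branch
```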




Now, if you're interested in finding a regression in the kernel and you might be interested in how I automated the thing, here are some small scripts I used:

  • linux-build - a wrapper script around make-kpkg to build .deb packages of the Linux kernels I build; I used it long before this bisect, but now I modified it so that the kernels are clearly versioned and also indicate the commit to which they correspond
  • sleepit - a script that automated the actions needed for a linux kernel to be tested; is really trivial and highly specialized on sleep/resume debugging; it assumes to be ran in the directory where you'd later want to grab dmesg-s outputs from
  • sleeptest - a wrapper script that is smart enough to detect if the current kernel is a kernel to be tested or a stable (regular kernel) one
    • if the kernel is a stable one:
      • looks for the signs left by the last test kernel and depending on them, mark the kernel bad or good in the bisect; this would result in a new checkout which would be processed or, if the bad commit was identified, the script would stop
      • in the case of a new bisect, the new checkout is cleaned up, patched, built, then the script installs the new linux-image .deb[1] and update-grub[2], leaving the reboot command at my discretion for the eventual case something went awry; a failure to compile the kernel in an automated fashion would have dropped me in an interactive console which meant I had to manually do the steps necessary to be ready to boot into the next kernel
    • if the kernel is a test kernel run the sleepit script
The main script is sleeptest, which is run as root to allow the sleep commands, the installation of the kernel and update-grub; the build itself is done via su as my user.

As a supplemental speed up, I configured libpam-usb to authenticate root and myself through a USB storage device, which is quite cool. I am still pondering if I should keep this enabled or migrate to something like libpam-rsa[*].

Of course, the scripts contain stuff hard-coded into them (my user name for one), but they can easily be modified to remove those limitations (generally they use variables).


linux-build


#!/bin/sh
# License: GPLv2+/MIT
# Author: Eddy Petrișor
#
# This script must be run from the kernel tree directory with:
# linux-build [--no-headers] [--rebuild]

FATTEMPT=../attempt

TARGETS="kernel-image kernel-headers modules_config modules"
[ "$1" = "--no-headers" ] && shift && TARGETS="$(echo $TARGETS | sed 's#kernel-headers ##')"

if [ -f $FATTEMPT ]
then
    ATT=`cat "${FATTEMPT}"`
    if [ $# -eq 0 ]
    then
        ATT=`expr $ATT + 1`
        make-kpkg clean
    else
        if [ $# -eq 1 ] && [ $1 = '--rebuild' ]
        then
            # nothing to do, we are already set
            echo 'Preparing for rebuild'
        else
            echo 'Illegal parameters'
            exit 2
        fi
    fi
else
    ATT=1
fi

# no problem if it is rewritten on rebuild
echo "$ATT" >$FATTEMPT

# must define MODULE_LOC for mol module compilation
DIR=`pwd`
cd ..
MODULE_LOC=$(pwd)/modules
# this didn't work
# export ALL_PATCH_DIR=$(pwd)/linux-patches
cd ${DIR}

echo "Modules should be here: ${MODULE_LOC}"
echo "Stop with ctrl+c if the external modules aren't there"

# press ctrl+c, if needed -- disabled for now
#read

export MODULE_LOC
export CONCURRENCY_LEVEL=$(grep -c 'processor' /proc/cpuinfo)

[ -d .git ] && PREFIX="g$(git log --pretty=oneline --max-count=1 | cut -c 1-8)-" || PREFIX=""
APPEND=$PREFIX$(hostname)

#time make-kpkg --rootcmd fakeroot --revision ${ATT} --stem linux --append-to-version -`hostname` --config menuconfig --initrd --uc --us kernel-image kernel-headers modules_config modules
#time make-kpkg --rootcmd fakeroot --revision ${ATT} --stem linux --append-to-version -`hostname` --added-patches 'ata_piix-ich8-fix-map-for-combined-mode.patch,ata_piix-ich8-fix-native-mode-slave-port.patch' --config silentoldconfig --initrd --uc --us kernel-image kernel-headers modules_config modules
time make-kpkg --rootcmd fakeroot --revision ${ATT} --stem linux --append-to-version -$APPEND --config silentoldconfig --initrd --uc --us $TARGETS


sleepit


#!/bin/sh

FAILEDRESUME=/failed-resume
RESUMED=/resumed

modprobe i915
invoke-rc.d acpid stop
echo "$(uname -r)" > $FAILEDRESUME
dmesg >dmesg_before_$(uname -r); echo mem > /sys/power/state; dmesg >dmesg_after_$(uname -r); sync
echo 'resumed, oh my god' > resumed
echo "$(uname -r)" >> $RESUMED
rm -f $FAILEDRESUME
sync
sleep 10
reboot



sleeptest


#!/bin/sh

RESULTSDIR=/root/var/debug/sleep/regression
UNAMER="$(uname -r)"
FAILEDSLEEPFILE=/failed-resume
RESUMED=/resumed
SOURCEDIR=/home/eddy/usr/src/linux/linux-2.6

check_same_commit ()
{
local COMMIT
COMMIT=$(git log --pretty=oneline --max-count=1 | cut -c 1-8)
[ "$COMMIT" = "$1" ] && return 0 || return 1
}

get_rev_from_unamer ()
{
echo "$1" | sed 's#.*-g\([0-9a-f]*\)-heidi#\1#'
}

mark_bad ()
{
cd $SOURCEDIR
su -c 'git reset --hard HEAD' eddy
su -c 'git bisect bad' eddy
cd -
}

mark_good ()
{
cd $SOURCEDIR
su -c 'git reset --hard HEAD' eddy
su -c 'git bisect good' eddy
cd -
}

compile_next ()
{
cd $SOURCEDIR
if [ -f $FAILEDSLEEPFILE ] ; then
LKVER=$(cat $FAILEDSLEEPFILE)
else
LKVER=$(tail -n 1 $RESUMED)
fi
PREVCOMMIT=$(get_rev_from_unamer "$LKVER")

if check_same_commit "$PREVCOMMIT" ; then
echo "It looks like you got your result!"
exit 1337 # of course $? isn't 1337, but anyways
fi

su -c 'make clean && rm -fr debian && git reset --hard HEAD && patch -p1 < ...' eddy
# NOTE: the blog's HTML swallowed part of the script at the '<' above:
# the patch file name, the end of compile_next and the opening of the
# main 'if' (the branch that runs sleepit on a test kernel) are lost.
# The recoverable remainder (the stable-kernel branch) follows.
}

if [ -f $FAILEDSLEEPFILE ] ; then
LKVER="$(cat $FAILEDSLEEPFILE)"
echo "Marking >>> BAD <<< $LKVER ($(get_rev_from_unamer $LKVER))"
mark_bad
else
LKVER=$(tail -n 1 $RESUMED)
echo "Marking >>> good <<< $LKVER ($(get_rev_from_unamer $LKVER))"
mark_good
fi
compile_next && \
cd $SOURCEDIR/.. && \
echo 'Installing the linux-image and running update-grub && reboot' && \
dpkg -i $(ls linux-image-*_$(cat attempt)_*.deb) && \
update-grub
fi


You have my permission to use, modify and redistribute these scripts or modified versions based on these under the terms of the MIT license.


[*] because the libpam-rsa package seems to be unmaintained (especially upstream), while libpam-usb seems to be inactive (maybe it is considered finished by upstream?)

[1] I didn't automate the removal of the previous test kernel, but that could have been done easily

[2] I haven't made a custom grub section for the test kernels in such a way they would boot by default at the next reboot since I considered that to be too cumbersome for the moment (although I had /vmlinuz symlinks) and it was simpler to select manually the kernel

Wednesday, 17 June 2009

[help] kernel: same config as debian, but mine doesn't boot

Update2: I finally managed to figure out what was wrong. The pristine kernel was missing the patch ata_piix-ich8-fix-native-mode-slave-port.patch, which I got from the linux-patch-debian-2.6.18_2.6.18.dfsg.1-24etch2_all.deb package.

Fortunately the package was still available from oldstable, but I am wondering when snapshot.d.[no] will return.

Now I can get back to finding the commit that broke sleep, although I suspect I am searching for 9666f40:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9666f400





Update: I managed to install grub2 and the initramfs fails to find the root because it can't find the "st" volume group. Since I compiled in the kernel the support for LVM, it was clear that any issue that might have appeared was due to the initramfs.

All this seems to be due to this:

(initramfs) dmsetup ls
/proc/misc: No entry for device-mapper found
Is device-mapper driver missing from kernel?
Failure to communicate with kernel device-mapper driver.
/proc/misc: No entry for device-mapper found
Is device-mapper driver missing from kernel?
Incompatible libdevmapper 1.0.2.27 (2008-06-25)(compat) and kernel driver
Command failed

So now I just have to figure out how to patch the old kernels to work or how to install an older libdevmapper.
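In case anyone hits the same wall, here is a hedged checklist for verifying device-mapper support from the (initramfs) prompt; the module name and commands are the standard ones, not something specific to this setup:

```shell
# is the device-mapper driver registered with this kernel?
grep -i device-mapper /proc/misc || echo 'no device-mapper in this kernel'

# if it was built as a module, try loading it
modprobe dm_mod

# print both the library and the kernel driver versions, which must be
# compatible for dmsetup/lvm to be able to talk to the kernel
dmsetup version
```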




My laptop doesn't resume properly from sleep (hibernate/resume works), although it worked at some point in the past with 2.6.18 (at least the one in Debian Etch worked, kind of).

In an attempt to git bisect in order to find which was the commit responsible for the regression, I tried to compile the vanilla 2.6.18 Linux kernel with the exact configuration (with minor differences) as the Debian Etch kernel, but I was surprised to see that my make-kpkg compiled kernel didn't boot.

The differences are:

--- config-2.6.18-6-amd64    2009-06-17 00:57:56.000000000 +0300
+++ /boot/config-2.6.18-heidi 2009-06-17 10:36:49.000000000 +0300
@@ -1,7 +1,7 @@
#
# Automatically generated make config: don't edit
-# Linux kernel version: 2.6.18
-# Thu Dec 25 21:04:29 2008
+# Linux kernel version: 2.6.18-heidi
+# Wed Jun 17 10:36:49 2009
#
CONFIG_X86_64=y
CONFIG_64BIT=y
@@ -1045,7 +1045,7 @@
CONFIG_IDE_GENERIC=m
CONFIG_BLK_DEV_CMD640=y
# CONFIG_BLK_DEV_CMD640_ENHANCED is not set
-CONFIG_BLK_DEV_IDEPNP=m
+CONFIG_BLK_DEV_IDEPNP=y
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
# CONFIG_BLK_DEV_OFFBOARD is not set
@@ -1069,7 +1069,6 @@
CONFIG_BLK_DEV_HPT34X=m
# CONFIG_HPT34X_AUTODMA is not set
CONFIG_BLK_DEV_HPT366=m
-CONFIG_BLK_DEV_JMICRON=m
CONFIG_BLK_DEV_SC1200=m
CONFIG_BLK_DEV_PIIX=m
CONFIG_BLK_DEV_IT821X=m
@@ -1144,7 +1143,6 @@
CONFIG_AIC79XX_DEBUG_ENABLE=y
CONFIG_AIC79XX_DEBUG_MASK=0
CONFIG_AIC79XX_REG_PRETTY_PRINT=y
-CONFIG_SCSI_ARCMSR=m
CONFIG_MEGARAID_NEWGEN=y
CONFIG_MEGARAID_MM=m
CONFIG_MEGARAID_MAILBOX=m
@@ -1360,6 +1358,7 @@
CONFIG_ADAPTEC_STARFIRE_NAPI=y
CONFIG_B44=m
CONFIG_FORCEDETH=m
+CONFIG_DGRS=m
CONFIG_EEPRO100=m
CONFIG_E100=m
CONFIG_FEALNX=m
@@ -1418,6 +1417,7 @@
#
CONFIG_TR=y
CONFIG_IBMOL=m
+CONFIG_3C359=m
CONFIG_TMS380TR=m
CONFIG_TMSPCI=m
CONFIG_ABYSS=m
@@ -2088,7 +2088,6 @@
CONFIG_SENSORS_ATXP1=m
CONFIG_SENSORS_DS1621=m
CONFIG_SENSORS_F71805F=m
-# CONFIG_SENSORS_F75375S is not set
CONFIG_SENSORS_FSCHER=m
CONFIG_SENSORS_FSCPOS=m
CONFIG_SENSORS_GL518SM=m
@@ -2116,7 +2115,6 @@
CONFIG_SENSORS_W83781D=m
CONFIG_SENSORS_W83791D=m
CONFIG_SENSORS_W83792D=m
-CONFIG_SENSORS_W83793=m
CONFIG_SENSORS_W83L785TS=m
CONFIG_SENSORS_W83627HF=m
CONFIG_SENSORS_W83627EHF=m
@@ -2350,6 +2348,7 @@
CONFIG_VIDEO_BTCX=m
CONFIG_VIDEO_IR=m
CONFIG_VIDEO_TVEEPROM=m
+CONFIG_USB_DABUSB=m

#
# Graphics support
@@ -2737,6 +2736,19 @@
CONFIG_USB_SERIAL_GARMIN=m
CONFIG_USB_SERIAL_IPW=m
CONFIG_USB_SERIAL_KEYSPAN_PDA=m
+CONFIG_USB_SERIAL_KEYSPAN=m
+# CONFIG_USB_SERIAL_KEYSPAN_MPR is not set
+# CONFIG_USB_SERIAL_KEYSPAN_USA28 is not set
+# CONFIG_USB_SERIAL_KEYSPAN_USA28X is not set
+# CONFIG_USB_SERIAL_KEYSPAN_USA28XA is not set
+# CONFIG_USB_SERIAL_KEYSPAN_USA28XB is not set
+# CONFIG_USB_SERIAL_KEYSPAN_USA19 is not set
+# CONFIG_USB_SERIAL_KEYSPAN_USA18X is not set
+# CONFIG_USB_SERIAL_KEYSPAN_USA19W is not set
+# CONFIG_USB_SERIAL_KEYSPAN_USA19QW is not set
+# CONFIG_USB_SERIAL_KEYSPAN_USA19QI is not set
+# CONFIG_USB_SERIAL_KEYSPAN_USA49W is not set
+# CONFIG_USB_SERIAL_KEYSPAN_USA49WLC is not set
CONFIG_USB_SERIAL_KLSI=m
CONFIG_USB_SERIAL_KOBIL_SCT=m
CONFIG_USB_SERIAL_MCT_U232=m
@@ -2756,6 +2768,8 @@
#
# USB Miscellaneous drivers
#
+CONFIG_USB_EMI62=m
+CONFIG_USB_EMI26=m
CONFIG_USB_AUERSWALD=m
CONFIG_USB_RIO500=m
CONFIG_USB_LEGOTOWER=m
@@ -3002,7 +3016,6 @@
CONFIG_ADFS_FS=m
# CONFIG_ADFS_FS_RW is not set
CONFIG_AFFS_FS=m
-# CONFIG_ASFS_FS is not set
CONFIG_HFS_FS=m
CONFIG_HFSPLUS_FS=m
CONFIG_BEFS_FS=m
@@ -3201,6 +3214,7 @@
CONFIG_SECURITY_NETWORK_XFRM=y
CONFIG_SECURITY_CAPABILITIES=y
# CONFIG_SECURITY_ROOTPLUG is not set
+CONFIG_SECURITY_SECLVL=m
CONFIG_SECURITY_SELINUX=y
CONFIG_SECURITY_SELINUX_BOOTPARAM=y
CONFIG_SECURITY_SELINUX_BOOTPARAM_VALUE=0

The initramfs simply stopped at an early point with some errors which look really weird, taking into account that Debian kernels shouldn't be that different from mine (photo transcribed below):


Loading, please wait...
unknown keysym 'endash'
/etc/boottime.kmap.gz:23: syntax error
syntax error in map file
key bindings not changed
usb 1-2: device descriptor read/all, error -84
ata_piix 0000:00:1f.2: invalid MAP value 2
resume: libcrypt version: 1.4.1
resume: Could not stat the resume device file '/dev/sda5'
Please type in the full path name to try again
or press ENTER to boot the system:

I suspect the key map error, the usb error and the resume error to be unrelated to the boot problem.

For some reason I suspect the ata_piix error to be related.


After pressing enter, more messages appeared. The photo reads further:

mount: mounting /dev/root on /root failed: No such device
mount: mounting /dev on /root/dev failed: No such device or directory
mount: mounting /sys on /root/sys failed: No such device or directory
mount: mounting /proc on /root/proc failed: No such device or directory
Target filesystem doesn't have /sbin/init.
No init found. Try passing init= bootarg.


(BusyBox prompt follows here).


I looked over the net for some hints, but I wasn't able to find a solution.


Since I am forced to use LILO (/boot is on LVM) and I didn't manage to make grub-pc work on this system, I am kind of stuck and don't know what to do to make the damn kernel boot.

I am running Debian Lenny, but I am willing to backport a few packages, if necessary.

Help would be really appreciated.

Thursday, 21 May 2009

Kernel issues - Debian and pristine

Ever since I bought this laptop I have been quite content with it running Debian GNU/Linux (Lenny) and, except for the sleep functionality not working (bug reported, but no answer about any new approaches), I have had no other major issues.


Actually, the problems are partially solved since I am in a "pick your favourite bug" situation:
  • with the 2.6.26 kernel from Debian the information about the power source is incorrect (kernel bug which goes away for me right since 2.6.27)
  • while with a newer (>=2.6.27) pristine kernel, power information is accurate, but the entire bluetooth stack (init.d scripts and apps) needs a restart after a hibernate/resume cycle to work again; I added my info in the corresponding upstream bug and I hope it gets fixed
Still, I am really curious: since the Debian Kernel Team policy is to accept only patches accepted upstream, theoretically that would mean newer upstream kernels should work wrt that bluetooth issue, assuming this isn't a regression (it doesn't look like one from my experience).

So what is present in the Debian kernels that isn't in the pristine ones that makes BT work after resume? If you know the answer, please add it to the bug report.


On the Debian side of the kernel, which are the chances that newer Lenny kernels will include the power fixes necessary for MSI laptops to report power related info correctly? I know, I know, I should probably report a bug, but I want to know first if there will be a lenny-n-half release, otherwise it doesn't make sense.

Friday, 16 January 2009

Symbian develoment on Linux (almost native) (Part 1)

After my previous failed attempt at installing Carbide C++[1] to allow me to develop stuff for Nokia E71 (i.e. Symbian), recently I have been digging up the internet on information on how to develop directly in Linux.

After much digging and lots and lots of confusion, I actually managed to find some useful information which allowed me to install a cross compiler and the C++ SDK for my phone (S60 3rd Edition Feature Pack 1) on my MSI PR200 laptop, which runs Debian Lenny (amd64). The compiler was built from source and its build arch is x86_64 linux, so it should be a little bit faster than the precompiled binaries for i686.


The key to the solution was the GnuPoc project and the fork that offers support for platforms newer than S60 v2, which is available at:

http://www.martin.st/symbian/


Besides the GnuPoc download, you'll need to install the appropriate toolchain and compiler:

EKA1 + a modified gcc release - for S60 v1 and v2
EKA2 + CodeSourcery's GCC (I chose the source variant) - for S60 v3 or newer and UIQ


The instructions on Martin's page are really good, so I won't repeat them here.



After this, there's the non-free part: the SDK installation, which needs to be downloaded from forum.nokia.com (you need a Forum Nokia account, but the download is free of charge). The installation procedure is also explained on Martin's page and it works properly (at least it worked for me).

By the way, there are two variants of the SDKs: a full SDK (Java, C++) and one only for C++. The instructions refer to the C++ version, but I don't know if they'd work with the full version.



I did everything up to this point, but I haven't yet completed and/or tested whether things work; I'll do that and update the information.


[1] actually I needed a tool chain, but having a graphical IDE was looking really good

Tuesday, 12 February 2008

wireless prohibition

After setting up wireless via WPA2-PSK with AES and realizing I really need the new firmware, I ended up using the latest wireless-2.6 kernel from branch 'everything'.

Now I am again browsing wireless-ly and I am wondering if the digital divide isn't already here:



Is there any reason why this discrimination is done? Do they fear that they won't be able to sell the series?

Friday, 8 February 2008

wpa2-psk with aes on a broadcom wlan0 (2.6.24)

Update: I managed to find out why the wpa_action stuff was needed. Please ignore the lines written like this; they are there just for reference.

One more update: it seems that the firmware needs to be in sync. I ended up using the wireless-2.6 kernel from the everything branch.



I just managed to get my wlan from my laptop to work with WPA2-PSK with AES with the free Broadcom driver (now named b43, formerly bcm43xx).

0b:00.0 Network controller: Broadcom Corporation BCM94311MCG wlan mini-PCI (rev 01)

0b:00.0 0280: 14e4:4311 (rev 01)

In order to do this I needed linux 2.6.24-1 from unstable and the b43 driver.
bounty:/home/eddy# lsmod | grep b43
b43 119976 0
rfkill 12816 3 rfkill_input,b43
mac80211 132236 1 b43
led_class 10120 1 b43
input_polldev 9872 1 b43
ssb 39428 2 b43,b44
pcmcia 45720 2 b43,ssb
pcmcia_core 46500 2 b43,pcmcia
firmware_class 15232 2 b43,pcmcia

The final trick was to convince wpa_supplicant to reload the config with:
bounty:/home/eddy# wpa_action wlan0 reload
wpa_action: reloading wpa_supplicant configuration file via HUP signal

This is the wpasupplicant.conf file that I used:
bounty:/home/eddy# cat /etc/wpa_supplicant/wpasupplicant.conf | grep -v '^\s*#' | sed 's/psk=.*/psk=aaabbb___ENCRIPTED_SEE_wpa_password___cccddd/'
ctrl_interface=/var/run/wpa_supplicant
ap_scan=1
network={
ssid="toblerone"
scan_ssid=1
proto=RSN
key_mgmt=WPA-PSK
pairwise=CCMP
group=CCMP
psk=aaabbb___ENCRIPTED_SEE_wpa_password___cccddd
}

and this is the relevant interfaces area:
auto wlan0
allow-hotplug wlan0
iface wlan0 inet dhcp
wpa-driver wext
wpa-conf /etc/wpa_supplicant/wpasupplicant.conf
wpa-ap-scan 2
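The psk= value in the config above was redacted; for reference, a precomputed key for a network block like that can be generated with wpa_passphrase (the SSID matches the one above; the passphrase is obviously an example):

```shell
# emit a network={...} block with the psk= derived from SSID + passphrase;
# paste the resulting psk= line into wpasupplicant.conf
wpa_passphrase toblerone 'my-secret-passphrase'
```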


This is what it looks like when it is working:

bounty:/home/eddy# iwconfig wlan0
wlan0 IEEE 802.11g ESSID:"toblerone"
Mode:Managed Frequency:2.412 GHz Access Point: 00:1B:FC:45:33:70
Bit Rate=48 Mb/s Tx-Power=27 dBm
Retry min limit:7 RTS thr:off Fragment thr=2346 B
Encryption key:1337-0000-C121-d73D-0207-RE41-0000-3210 [2]
Link Quality=98/100 Signal level=-36 dBm Noise level=-68 dBm
Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0
Tx excessive retries:0 Invalid misc:0 Missed beacon:0
And here's the result of a scan (relevant section):
  Cell 02 - Address: 00:1B:FC:45:33:70
ESSID:"toblerone"
Mode:Master
Channel:1
Frequency:2.412 GHz (Channel 1)
Quality=93/100 Signal level=-42 dBm Noise level=-68 dBm
Encryption key:on
IE: IEEE 802.11i/WPA2 Version 1
Group Cipher : CCMP
Pairwise Ciphers (1) : CCMP
Authentication Suites (1) : PSK
Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 11 Mb/s; 18 Mb/s
24 Mb/s; 36 Mb/s; 54 Mb/s; 6 Mb/s; 9 Mb/s
12 Mb/s; 48 Mb/s
Extra:tsf=0000000072ff542d
and here is proof that it works:
bounty:/home/eddy# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
192.168.77.0 0.0.0.0 255.255.255.0 U 0 0 0 wlan0
0.0.0.0 192.168.77.254 0.0.0.0 UG 0 0 0 wlan0

bounty:/home/eddy# ping debian.org
PING debian.org (192.25.206.10) 56(84) bytes of data.
64 bytes from gluck.debian.org (192.25.206.10): icmp_seq=1 ttl=36 time=201 ms
64 bytes from gluck.debian.org (192.25.206.10): icmp_seq=2 ttl=36 time=199 ms
64 bytes from gluck.debian.org (192.25.206.10): icmp_seq=3 ttl=36 time=200 ms

--- debian.org ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 199.691/200.519/201.046/0.786 ms

Woooohooo! :-)

Thanks to all the people involved in b43 development and to everyone else who made this possible (the Debian developers).

Posted from bed, via wlan.




Update: You need the firmware blob, which can be extracted from a Windows driver with bcm43xx-fwcutter (now called b43-fwcutter); I already had it from my previous attempts to configure wlan with bcm43xx. I am not sure if I should use the new tool. You really need the firmware and driver to be in sync.
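For reference, extracting the firmware with the renamed tool looks roughly like this (the driver-file name varies with the Windows/proprietary driver version you extract from, so treat it as a placeholder):

```shell
# cut the firmware blobs out of the proprietary Broadcom driver file and
# install them where the b43 kernel driver looks for them
b43-fwcutter -w /lib/firmware wl_apsta.o
```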

Update: it seems the b43 driver page (the entire linuxwireless.org site) went down sometime yesterday evening, since yesterday afternoon I was browsing through the site without any issues. (Note: I live in Europe, for reference)

Wednesday, 14 November 2007

Linux: plain weird network behaviour; Windows is OK

Update: problem fixed, thanks for the comments; it was an MTU-related issue.
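Since the fix turned out to be MTU-related: for the record, these are the usual remedies on a Linux NAT box (the interface name and the 1492 value are typical for DSL/PPPoE setups, not taken from this post):

```shell
# either lower the MTU on the external interface...
ip link set dev eth1 mtu 1492

# ...or clamp the TCP MSS to the path MTU on forwarded connections,
# so NAT clients stop hanging on large packets (e.g. http pages)
iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN \
    -j TCPMSS --clamp-mss-to-pmtu
```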

Note: This is a long post, but I expect it to bring in some questions for many Linux people; I advise you to read this when you have enough time.


After the last events related to connectivity at home which have been lasting since Thursday evening and the weird fact that it seems that NAT doesn't work at home, but does at work for the same setup, I called the ISP's support and wasted 15 minutes trying to convince them at least to send a technical team with another modem just to test if that is the reason for the breakage.

Of course, talking to the ISP support and trying to convince them that there might be a problem on their side since NAT works with another provider but doesn't with them, was fruitless.

Tonight I was stuck and tried another approach, convinced I would confirm my suspicion that there is something wrong on ISP's side. Still I am not yet sure what to conclude from what happened.


So, to get a better view of what is going on, I'll describe the setup I have and what are its limitations and characteristics. People in a hurry can skip to the paragraph starting with "I tested" and stare in wonder at will.


So the connection I have is done through a DSL modem which gets the MAC of the network card connected to it and exposes that as its own to the ISP's network. This MAC seems to be quite persistent and special measures must be taken in order to be able to use another NIC to connect. The modem (or probably some machine in ISP's; the IP is 10.10.0.1) offers a DHCP address and everything should work fine.

Because of this "your MAC is my MAC" issue, when I connected the first time I used a USB NIC, so that a broken router or some temporary failure would still have allowed me to use the Internet connection directly from my laptop. I can say this decision has proven in time to be wise.

The router I used until now is a NSLU2 with Debian installed on it. The built in network interface always faced the internal network.

The router (which I call ritter) served as a NAT router for two machines inside the network, my laptop and my apartment mate's laptop. All until last Thursday, after which it never got back properly.



I tested (doing NAT on my laptop; the laptop shouldn't have been affected by the power problems):
  • with ritter behind the laptop NAT, at home
  • with two different virtual machines as NAT "clients", at home
  • with ritter behind the laptop NAT, at work
  • with a virtual machine as NAT client, at home
  • directly from the laptop
  • NAT made through a SNAT rule
  • NAT made through a MASQUERADING rule
  • with TTL mangled (increased by one, although is was never in the ballpark of a low TTL)

Of course, I have checked over and over that /proc/sys/net/ipv4/ip_forward was set to 1, that the tables had policy ACCEPT with no extra rules except the basic NAT-ting stuff, and that the routes were correctly set on both the clients and the machine doing the NAT.
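For completeness, the basic NAT-ting stuff amounts to something like this (eth0 as the external interface and the addresses are examples):

```shell
# enable IPv4 forwarding
echo 1 > /proc/sys/net/ipv4/ip_forward

# variant 1: SNAT, when the external IP is static
iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to-source 10.10.0.99

# variant 2: MASQUERADE, when the external IP is dynamic (e.g. DHCP)
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
```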


All I could see is that:
  1. the machine doing the NAT was always working fine
  2. at work all NAT clients worked fine
  3. at home, all of the NAT "clients" were
    1. able to resolve addresses, even though the DNS server was in ISP territory, in another LAN
    2. able to ping the outside world (when ping was available - in D-I it is not)
    3. hanging when trying to get an http page
    4. able to telnet directly to port 80 (but I didn't try a "GET / HTTP/1.0")

So after all of this, I thought of trying to see if a new client (the laptop of my apartment mate, a Windows XP machine) would work using the connection at home. It didn't work.

Then I thought of trying "Internet Connection Sharing", as it is called in Windows. Of course, it was painful to find Windows XP drivers for the ASIX AX88172 network card (remember, the modem needed to see the MAC of that NIC), but I managed to find the proper one.


And, almost sure the NAT wouldn't work in this case either, I configured the new connection as a shared one. I didn't even disable the firewall, thinking I could take those down gradually.

I wasn't expecting this, not even by pure chance, but my laptop, now a NAT "client", was able to browse, ping and do whatever was normal through NAT, while the Windows machine was doing the "Connection Sharing".

I was utterly flabbergasted. And that was just the beginning.


I was expecting the problem to have gone away coincidentally, but after a minute I was proven otherwise. It still didn't work with Linux as the NAT-ting machine. I connected the USB NIC back to the Windows machine and saw the same thing: NAT was just working.


At that point I observed an even more shocking fact: the IP that the Windows machine got was different from the one the Linux machine had received, in spite of the fact that the network card was the same, so it would have made sense to get the same one. More than that, the IP the Windows machine got was from an entirely different network, although it was a valid IP belonging to my ISP.

I was thinking that one reason why it works with Windows might be some TCP protocol twist that is implemented differently in Windows, so that my ISP's equipment gets along better with the Windows network stack.


As a way to test that, I am thinking of somehow forcing the IP on the Linux machine to see if anything changes. But before doing that, I felt the urge to post this; maybe some kind soul will shed some light on this issue for me or drop a hint.

Another reason might be different DHCP servers answering, but I don't know how I can see in Windows who offered the lease.


If anyone has any clue why these weird things happen, please drop a line. I would greatly appreciate it. TIA.

Monday, 6 August 2007

nslu2, the kernel...

Update: it didn't work .... :((

eddy@ritter ~ $ grep '##### kern' -A 2000 /var/log/syslog | grep -E '(reset|####)'
Aug 6 11:19:00 ritter manual message: ##### kernel was changed to linux-image-2.6.18-4-ixp4xx_2.6.18.dfsg.1-12etch2.1_arm.deb #####
Aug 6 11:23:08 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2
Aug 6 11:35:15 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2
Aug 6 11:52:41 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2
Aug 6 11:54:47 ritter kernel: usb 3-1: reset high speed USB device using ehci_hcd and address 2



I cross built this:

Changes:
linux-2.6 (2.6.18.dfsg.1-12etch2.1) stable; urgency=low
.
* Non-maintainer upload.
* ixp4xx kernel:
Disabled options:
- USB_EHCI_SPLIT_ISO
- USB_EHCI_ROOT_HUB_TT
Enabled options:
- USB_BANDWIDTH

And building it took, on my amd64:

real 105m0.331s
user 79m43.423s
sys 12m46.324s

It's not an upload, just my attempt to fix this issue. Let's see if it works.

Oh, thanks to Riku for pointing out that cross building the kernel is trivial.
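Cross building with make-kpkg is indeed short; a sketch of what the invocation looks like for the NSLU2's ixp4xx kernel (the toolchain prefix depends on the cross compiler you installed, so treat it as an example):

```shell
# from the kernel source tree, with an arm cross toolchain installed
export DEB_HOST_ARCH=arm
make-kpkg --rootcmd fakeroot --arch arm --cross-compile arm-linux-gnu- \
    --revision 2.6.18.dfsg.1-12etch2.1 --initrd kernel-image
```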

Wednesday, 17 January 2007

disable the pcspkr - hacks

I blogged 5 days ago about the annoying beeper and different methods that were supposed to turn it off.

I asked for a clean solution, but I received suggestions for some hacks. Now I decided to make a new post since there was a comment added today and I remembered about a possible hack (dpkg-divert) I was thinking about.

Now here are some of those hacks and the reason behind rejecting them.

  • I thought at some point of using dpkg-divert, as opposed to deleting the module, but this is suboptimal. And I am not sure dpkg-divert would work for such a whacked "use case"
  • Using a hardware method is also suboptimal, for several reasons:
    • I will hear the beep if I don't plug in the dummy jack
    • the same goes when I am using a headset - the volume is loud and seems not to be affected by the general sound volume, so it would be disruptive and annoying to be listening to music and type a wrong command;
    • the Intel HDA sound driver lacks headset jack detection in the current Debian kernel. This means there is no effect if I plug such a jack
  • Beeping should be disabled by default; still, that would only partly solve my problem. I could set that in my .inputrc file or system wide, but that would still mean a beep on some occasions (like gdm log in) - not sure whether that would make it scarier, since it would happen rarely, or not :-)
Still, the question remains: why don't the originally proposed methods work, especially blacklisting?
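For reference, these are the usual silencing methods, including the ones in question (paths are the Debian Etch era ones; whether the blacklist actually takes effect was exactly the open question here):

```shell
# remove the speaker driver for the current session
rmmod pcspkr

# keep it from being loaded at boot
echo 'blacklist pcspkr' >> /etc/modprobe.d/blacklist

# disable the console bell, and the bell under X, for the session
setterm -blength 0
xset b off
```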

Update: the headphones jack works with some small changes, bug submitted (#407252)

Monday, 8 January 2007

New laptop #4 (I have decided the model)

The race had 3 finalists:
- HP nx6235 (in Romanian) - incredibly low price, fully works (with some patches, and except the fingerprint reader) (thanks Marc)
- Dell Inspiron 6400 (aka E1505)
- ASUS F3T-AP008

I have decided against the 64-bit AMDs and in favour of long battery life and the Centrino Core 2 Duo platform.

The HP nx6235 failed to meet my criteria for design and reliability (I have heard that the local HP service has problems with HP products; warranty is 1 year). The shared video memory was a concern regarding overall system performance.

Dell, well... Dell, 32-bit arch... It would have been nice to jump on the 64-bit wagon, but I guess that will have to wait for now. The design is nice, the hard disk is bigger by 20GB (we all know disk storage is never enough ;-), Centrino technology (Intel seems to be OS friendly and drivers seem to exist for all the components), good battery time (I can't wait to test the real battery time), and it has some video memory of its own (HyperMemory - some kind of video cache).

ASUS - the selling points over the Dell were the AMD Turion 64 X2 processor, the design (looks nicer than the Dell, IMHO), the nVidia video card (which was also a concern due to the proprietary driver and its security holes - sorry, no link), and the not-yet-working built-in webcam. There was also a bonus: a bag and a mouse, but that almost never works for me. And I would have had to wait at least a week for it.

The winner is the Dell Inspiron; they also have it in stock, I already placed the order and I will pick it up after I finish this entry ;-) .

I can't wait to fire up Debian Installer on this baby.

Update: when I said I'll skip the 64-bit wagon, I was thinking (for some reason) that the amd64 port does not perform well on the Intel Core 2 Duo processors and I would be forced to use the i386 arch. Thanks for correcting me.
Update: it seems eMag was using false information in their advertising regarding the Dell: the system is NOT using Centrino technology, since it has a Broadcom wireless LAN.

Sunday, 7 January 2007

New laptop #2

A new laptop means more things than just new hardware.
  1. I will no longer have a PowerPC machine; it has been fine, I have been spoiled by Apple hardware and design, but I also had my share of bitterness (like it or not flash is ubiquitous, wlan works on i386 hardware via ndiswrapper, wine is good enough when no free alternatives are around)
  2. my father will use my current laptop in OS X; as sad as it may sound I am glad I will not have to administer the machine that much, and, like it or not, the OS X is better suited for him since he is not "that" skilled with computers
  3. I can't work on powerpc/big-endian specific issues; thus, the long pending and incomplete glest big-endian support patch will have to find a new person to take care of it and finish it. BTW, the glest project needs developers; the current ones are busy and can hardly do maintenance. It's a pity, as the game has incredible graphics.
  4. I will be able to test the Debian Games Team's games... and I will be able to play Oolite again. (BTW, Oolite should autobuild now, but it seems it hasn't been autobuilt; only the upload architecture is at version 1.65-4)

Saturday, 6 January 2007

New laptop #1

I am in search of a new laptop and this is difficult.

What I (know I) want:
  • 1 GB of RAM (I am tired of not enough resources issues)
  • battery life, the longer the better (I expect at least 3.5h of real - not advertised - battery time in coding/reading style usage)
  • at least 100GB storage
  • functional sleep and/or hibernation (I was spoiled by my current PowerBook G4)
  • a video card that is supported in Debian (I'd choose Intel, Nvidia, Ati, in this order) since I want to be able to play the games we package
  • working wlan (open drivers are a plus), bluetooth, and LAN
  • a dual core processor (Intel or AMD)
What I don't know for sure aka "help lazyweb":
  • the video memory should be dedicated (does this affect performance that much these days? even on Intel Centrino systems?)
  • AMD Turion 64 X2 (TL-35), which means the amd64 arch (is it worth the money saved in comparison with the Intel Core Duo? 3.5h battery time on AMD compared to 5h on Intel would mean no)
Dear lazyweb question:
  • Does a 32-bit wlan driver via ndiswrapper work on an amd64 system running in 64-bit mode?
  • What real battery times do people have on their Turion 64x2 and Intel Core Duo respectively?
  • What brands should I avoid?

New laptop #3 (hardware compatibility - mandatory rant)

I am deeply missing a real wiki.debian.org/Hardware page. The current one is a joke. It would be nice to have something in the style of the Gentoo hardware page, with nice pages which show what works and what doesn't.

I know about the Debian GNU/Linux device driver check page which is a good start, but:
  • in most places (installation reports in and out of Debian) you will not see lspci -n output
  • there is more to it than lspci (think lsusb, think stuff behind new/strange/uncommon buses - like the sound chips on some Apple ppc based machines)
  • you need a running linux system (a live CD can be used for the test), but:
    • Debian's live CD is in its infancy (I know, Knoppix should do fine)
    • most live CDs are i386 only (it doesn't matter that much, since amd64 machines should work with i386 live CDs)
  • most places where you can buy hardware from:
    • they don't have the hardware in stock,
    • they want an answer immediately,
    • or they don't allow you to access the machine
  • (I think) the database is outdated and people are not adding stuff to it
  • there is no way to make a search based on the product name of the whole system and/or the component
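The first two points above boil down to collecting a handful of command outputs before writing an installation report or a wiki page. A minimal sketch of such a collection script (the file name hw-report.txt and the exact set of commands are my own choice, not a Debian standard; each tool is guarded so a missing one just leaves a note instead of failing):

```shell
#!/bin/sh
# Collect basic hardware identification for a compatibility report.
out=hw-report.txt
{
  echo "== lspci -nn =="
  if command -v lspci >/dev/null 2>&1; then
    lspci -nn            # PCI devices with numeric vendor:device IDs
  else
    echo "(lspci not available)"
  fi

  echo "== lsusb =="
  if command -v lsusb >/dev/null 2>&1; then
    lsusb                # USB devices, which lspci does not cover
  else
    echo "(lsusb not available)"
  fi

  echo "== kernel =="
  uname -rm              # kernel version and machine architecture
} > "$out"
echo "wrote $out"
```

Attaching something like this to every installation report would give a hardware database the numeric IDs it actually needs, rather than marketing names.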

Current situation (bad stuff)

wiki.debian.org/Hardware is a joke. For example, I was expecting to see a network-hardware compatibility list, but instead I see some useless info.

What annoys me the most in this matter are dead links or links to dead/inactive projects.

Information is scattered all over the place (redirects and directories could help), is mixed/ungrouped, or is outdated.

Current situation (good stuff)

The information partly exists. Probably, in time, the wiki will be cleaned up, but until then you have to dig for the information.

There are plenty of sites and pages with information, but, still, you need to weed out the old/irrelevant stuff.

What to do about it

Before you say anything, no, there is no need for yet another Linux compatibility list project.

Ideas:
  • contribute to the wiki
    • add a wiki page for every machine you installed under Hardware/Compatibility/{Laptop,Desktop,Misc}/$PRODUCTNAME (would it make sense to have it under Hardware/Compatibility/{Laptop,Desktop,Misc}/$PRODUCTNAME/$DIST ?); link your page to the installation-report (you did file an installation report for every installation you did, didn't you?)
    • reorganize pages with directories (as proposed above)
    • remove dead links
    • add redirects for pages that just say "See foo"
    • unify pages that contain almost the same information
  • automatically create a wiki page for each (successful and unsuccessful) Debian installation-report based on the machine type (this would need some logic to weed out home-made systems) - but compute the success/fail rate based strictly on hardware issues
  • help Kenshi with his efforts for better hardware support
Update: Justin told me about the Ubuntu Laptop Testing Team project, which is kind of nice. There is also a scraper script which was supposed to help with queries like "info about laptop X" and "which laptops have these features", but that script didn't work for me.