Saturday, December 29, 2018

The insane OOM (out of memory) Killer

In the late nineties I worked on AIX for the first time. Back in those days there were several flavours of Unix available, all with their differences and idiosyncrasies. Linux was a fledgling and fitted on just one CD. I came across a feature of AIX which I thought was crazy - the OOM (out of memory) killer. In this variant of Unix malloc always succeeded, even when there wasn't enough memory. The idea was that malloc returned a pointer to heap memory but the system wouldn't actually commit that memory until the first reference was made. When that first reference came, the memory had jolly well better be available. If it was then all well and good. If not then the OOM killer came into play. The OOM killer would choose a victim process and kill it. The result was that memory would be freed and the access occurring at the time would succeed. Sounds insane, right? Right. I laughed and thought that this one feature rendered AIX useless compared to the other Unixes and would lead to its demise. How wrong I was. Fast forward a few years. It was added to Solaris. Sigh. Fast forward to today. It has been added to Linux.
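The "succeed now, pay later" behaviour described above can be sketched in a few lines. This is only an illustration under some assumptions: it uses Python's mmap module as a stand-in for malloc, the names reserve and touch are made up for the example, and the sizes are kept tiny so that nothing actually gets killed.

```python
import mmap

PAGE = mmap.PAGESIZE  # size of one page of memory on this machine


def reserve(n_pages):
    """Ask the OS for n_pages of anonymous memory (a stand-in for a big malloc).

    On a kernel that overcommits, this normally succeeds straight away
    because no physical memory has to be found yet.
    """
    return mmap.mmap(-1, n_pages * PAGE)


def touch(region, n_pages):
    """Write one byte into each of the first n_pages pages.

    The first write to a page is the moment the kernel has to find real
    memory for it. If none is left, this is where an OOM killer steps in.
    """
    for i in range(n_pages):
        region[i * PAGE] = 1
    return n_pages


if __name__ == "__main__":
    region = reserve(1024)  # hypothetical size: 1024 pages
    print("reserved", len(region), "bytes")
    print("touched", touch(region, 1024), "pages")
    region.close()
```

The point to notice is that the failure, if there is one, does not show up in reserve, where a program could check for it. It shows up in touch, which is an ordinary memory write with no error return.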

The OOM killer is a kernel development that mirrors what happens when banks try to innovate. It's what I call "the conspiracy of crappiness". It goes like this: some group or other tries to innovate but comes up with a really bad idea that doesn't work well and everyone hates it. The competition discover the move and for some inexplicable reason they copy it. Now everyone hates the competition as well and none of the players can be distinguished in this area. Bank charges on current accounts are an example. So is charging for withdrawals at ATMs (although customers have objected so vehemently to that one that there has been some backpedalling). Well, in the world of Unix we now have the OOM killer.

There's a good article at LWN that explains why this is insane. There's another article that gives tips on how to mitigate the nastiness, but surely that is yet another testimony to the fact that it is nasty. I also came across this article that discusses the nastiness and has an excerpt of an amusing article that discusses the fairness, or otherwise, of how the victim is chosen. Here is the excerpt:


An aircraft company discovered that it was cheaper to fly its planes with less fuel on board. The planes would be lighter and use less fuel and money was saved. On rare occasions however the amount of fuel was insufficient, and the plane would crash. This problem was solved by the engineers of the company by the development of a special OOF (out-of-fuel) mechanism. In emergency cases a passenger was selected and thrown out of the plane. (When necessary, the procedure was repeated.) A large body of theory was developed and many publications were devoted to the problem of properly selecting the victim to be ejected. Should the victim be chosen at random? Or should one choose the heaviest person? Or the oldest? Should passengers pay in order not to be ejected, so that the victim would be the poorest on board? And if for example the heaviest person was chosen, should there be a special exception in case that was the pilot? Should first class passengers be exempted? Now that the OOF mechanism existed, it would be activated every now and then, and eject passengers even when there was no fuel shortage. The engineers are still studying precisely how this malfunction is caused.

Update: 27 March 2022

Since that aircraft analogy I have found an article on the perils of overcommit which gives a more dispassionate assessment, but still concludes it is a terrible idea: https://www.etalabs.net/overcommit.html
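For anyone curious how their own Linux machine is set up, here is a rough sketch that reads the kernel's overcommit policy. It assumes a Linux system that exposes the setting at /proc/sys/vm/overcommit_memory. The function names are made up for the example, and the descriptions are my own paraphrase of the three documented modes.

```python
from pathlib import Path

# Linux-only: the kernel exposes its overcommit policy here.
OVERCOMMIT_FILE = Path("/proc/sys/vm/overcommit_memory")

# Paraphrased descriptions of the three documented modes.
MODES = {
    0: "heuristic overcommit (the usual default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit beyond the configured limit",
}


def describe_mode(mode):
    """Return a short description of an overcommit mode number."""
    return MODES.get(mode, "unknown mode")


def read_mode(path=OVERCOMMIT_FILE):
    """Return the current mode as an int, or None if it cannot be read
    (for example on a non-Linux system)."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    mode = read_mode()
    if mode is None:
        print("could not read the overcommit setting on this system")
    else:
        print(mode, "->", describe_mode(mode))
```

In mode 2 an allocation that cannot be backed is refused up front, so the program gets a chance to handle the failure itself.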

Sunday, December 02, 2018

I can't stand the JBoss Application Server

I wonder which application server people choose when working on Java projects that need to publish dynamic web pages. I have used Tomcat in the past and found it to be pretty good. But for the last few years I have been in an environment where JBoss was chosen. JBoss comes with all sorts of enterprisey EE things such as a JMS implementation and whilst initially this may seem attractive I have decided that I don't like it. I would now recommend that any project that needs JMS and dynamic web pages avoids an enterprise application server offering. Instead I think it is better to choose the web page and JMS solutions separately.

Years ago I wrote a book review for ACCU on a JBoss tutorial book. I gave the book a bad review because it was largely XML fragments concerning JBoss configuration. But I now see that this is what struggling in a JBoss environment is all about. I still think the book was wrong to have such large XML sections though. The precise XML needed to make JBoss do what you want seems to wibble depending on the exact version of JBoss you have and also possibly on what colour socks you are wearing. But it gets worse. Recently (relative to the time of writing this, December 2018) JBoss went proprietary. Red Hat now calls it JBoss Enterprise Application Platform or JBoss-EAP for short. Not to be confused with the old open source version which was just called JBoss. In an attempt to deal with the confusion Red Hat renamed the old one to WildFly and open source development is now done under that name. WildFly does seem to be much better than JBoss but it's all relative; it is still derived from JBoss and so still suffers from the tremendous environmental difficulties caused by obscure and constantly changing XML configuration.


So, for people who want JMS and web pages with dynamic content, I recommend ActiveMQ and Apache Tomcat respectively.