Programmatic OSGi filters in Scalamodules 2.0

2009-09-15

This is a brief note on the new filter support in Scalamodules 2.0. I humbly suggested this functionality to Heiko, and it’s coming in the new version. (It is also on github now, of course.)

What are OSGi filters? Well, they are a subset of LDAP filters, which you most likely aren’t familiar with, unless you’ve dabbled in various black arts of enterprise. They are expressions that can be matched against service registrations in OSGi. Registrations are usually decorated with simple service properties on the familiar name/value form. For instance:

foo=bar
zot=5

Very familiar, I’m sure. Incidentally, this is represented in OSGi with an old-school Java Dictionary. Don’t worry, non-masochistic Scalamodules users won’t ever see that. Here’s a filter that matches it:

(&(foo=bar)(zot<10))

You get the general idea. Besides the and (&), there is also or (|) and negation (!), plus the usual set of operators. See the OSGi spec for full details.

OK, so how is the filter represented in OSGi? Answer: It’s a String.

If you silently go “eew” inside now, read on. Again, we feel that Scalamodules users shouldn’t have to see it.

Sure, strings are fine, but this is still something that needs to be a well-formed expression. So why don’t we add programmatic support for it, to ensure both the well-formedness and possibly other constraints? It’s sort of a reverse parser – instead of getting the AST out of an expression, we build the AST and produce the expression as a String when it is to be used.
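To make the "reverse parser" idea concrete, here is a minimal sketch in plain Java. It is not the Scalamodules API: the class FilterSketch and its methods eq, le, and, or and not are names made up for illustration. It builds a small tree of nodes and only renders the filter string at the end. It uses <=, one of the comparison operators in the OSGi filter syntax, and it skips escaping of special characters in values.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch, not the Scalamodules API: a tiny filter AST that
// renders itself as an LDAP-style filter string on demand.
final class FilterSketch {

    /** A node in the filter tree; render() produces the filter string. */
    interface Node {
        String render();
    }

    /** Leaf node such as (foo=bar). Values are rendered with toString, unescaped. */
    static Node compare(String attribute, String operator, Object value) {
        return () -> "(" + attribute + operator + value + ")";
    }

    static Node eq(String attribute, Object value) {
        return compare(attribute, "=", value);
    }

    static Node le(String attribute, Object value) {
        return compare(attribute, "<=", value);
    }

    static Node and(Node... parts) {
        return composite("&", parts);
    }

    static Node or(Node... parts) {
        return composite("|", parts);
    }

    static Node not(Node part) {
        return () -> "(!" + part.render() + ")";
    }

    /** Composite node: the operator followed by all rendered children. */
    private static Node composite(String operator, Node... parts) {
        List<Node> children = Arrays.asList(parts);
        return () -> children.stream()
                .map(Node::render)
                .collect(Collectors.joining("", "(" + operator, ")"));
    }

    public static void main(String[] args) {
        Node filter = and(eq("foo", "bar"), le("zot", 10));
        System.out.println(filter.render()); // prints (&(foo=bar)(zot<=10))
    }
}
```

Since the parentheses are produced by the tree and not typed by hand, an unbalanced expression simply cannot be built.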

Here are some ways to build the filter in Scalamodules. First, we import the Filter object:

import org.scalamodules.core.Filter._

The below is one of the nicer constructs; it uses Scala-style API pimping to provide implicit conversion to a builder object, which supports the === and its peers:

("foo" === "bar") && ("zot" <== 10)

Direct use of the Filter methods instead would be:

and(set("foo", "bar"), lt("zot", 10))

or:

set("foo", "bar") and lt("zot", 10)
set("foo", "bar") && lt("zot", 10)

Note that arguments don’t have to be strings, they can be anything that turns into a reasonable string with toString, including primitives.

The set method, which becomes a (foo=x)-style equality filter, actually takes varargs. Passing no arguments indicates a present filter, which is a convention for asking if the variable is set:

Filter.set("foo")

This turns into a (foo=*) filter, which OSGi interprets as matching any value. The following, by contrast, is a multi-value filter:

Filter.set("foo", 5, 6)

This turns into (foo=[5,6]), which requires that foo is one of the two values.

So why all this brouhaha? The motivation for all this is simply to make filter construction a more reliable and verifiable process than putting together strings, which (at least when I do it) is error-prone and sometimes code-intensive. If it gets bad enough, I usually end up with small string-construction frameworks to do this for me, anyway. And we want to fail fast – detecting and reporting errors as early as possible. Let’s just say that not all OSGi implementations produce the best error messages on malformed filters.

But wait! What if you have a filter string? Suppose you got it as an input from somewhere. You don’t want to have to parse it and pick it apart, only to reconstruct it programmatically! The Filter.literal method handles the case where you get a valid filter from somewhere else, and want to use it verbatim. So if you have a string you’re happy with, you can turn it into a filter and we’ll take your word for its validity. However, it might fail later, when and if it is passed to OSGi and you’ve been less than diligent.

This post is an overview and not a reference manual; we (will) have scaladocs for that. But the above should give an outline of the general idea – I hope.


Erik Naggum (1965-2009) RIP

2009-06-20

Today, I learned that Erik Naggum had been found dead.

I have been a regular on a local IRC channel with Erik for years, and it was only last night we started obsessing about his absence, which was getting to be longer than usual. He was liable to disappear for short periods, but since we knew his medical condition was rather bad (he had recently been hospitalized as well), a call was placed to his closest family, as well as the authorities. Interesting quirk: If you call the police to report a concern like this, get someone who lives far away to make the call. Their take is that if you can’t be bothered to drive over yourself, it can’t be that important. However, I digress, since this likely would have made no difference in this case. This morning, he was found dead in his apartment.

I don’t know the exact cause of death, but it is not unlikely to be a complication related to his long-time tormentor ulcerative colitis (UC), which is definitely something you don’t want to be diagnosed with.

I didn’t count myself among Erik’s closest friends, and I hadn’t actually seen him in person for years. However, every time I did meet him, he struck me as very friendly and sociable, maybe surprisingly so if you only knew him from his infamous usenet posts. His virtual persona on our channel was sort of a mix: Sometimes confrontational, most of the time sociable and pleasant, but always interesting. His puns were lethal, even in an intensely competitive punning environment such as ours.

And come confrontation time, what biblical proportions of hell he could raise. He is the only person I could imagine deploying IRC protocol weaknesses to hold the entire channel hostage over a disagreement on character sets. I’m not kidding, either. Obsessive and intense at times, yes, but somehow never remotely irrational, and always interesting, challenging and educational, if you only had the time to sit yourself down and follow him through line after line (IRC is a line-oriented medium) of intricately woven reasoning. Which I didn’t always have, unfortunately. Following Erik was naturally time-consuming, I think, because the reality he talked about, as he understood it, was very complex and deep.

Of course, I also regret not having met him more often in person. But, again, his condition did not help here.

He did talk about code he was working on, relating to relational algebra, relational databases (my last Erik firestorm came down on me when I made a jibe at overuse of rdbms’es for business logic – oh boy!) and sequel-like queries for system management. I think it’s safe to say some effort will be made to salvage whatever legacy rests here.

He will be sorely missed by all of us, and some undefinable quality of (virtual) life on our channel will probably never return. In a rather macabre twist, his client is still active on the channel at the time of writing, and will probably time out soon. This is some new form of death that our generation, inventors of virtual life, have brought with us like a nasty side-effect, brewing up trouble in some left-behind code. As they warned us in a certain tv show that we both loved: magic always has consequences. Dealing with them comes soon enough.


Touching people’s screens

2009-06-10

Are you a screen-toucher? Do you drag your oily paws around on its shiny surface all day, while discussing points of interest with your co-workers? Or is there an invisible wall between your fingers and the screen, a mental imperative, a RoboCop’s fourth directive, to avoid direct touch at any cost? I have that wall.

What about other people? Can they touch this? If someone touches my screen, it’s hammer time.

Other rules hold for touch-screens of course – I use an iPhone myself. As for non-touch-screens, I know there are many kinds of people out there (namely, four), but let me just point out this: the day a Stargate thingy or some trans-dimensional portal appears – or something that looks like one, anyway – I will definitely be among the people still having arms by the end of the day.

In the meantime, here’s a poll.


Open Source, Closed Runtime

2009-03-23

As is wont to happen now and again, the other day I received an IllegalArgumentException. Dragging my carcass down the dusty stacktrace, I find the culprit:

    public ThreadPoolExecutor(int corePoolSize,
                              int maximumPoolSize,
                              long keepAliveTime,
                              TimeUnit unit,
                              BlockingQueue workQueue,
                              ThreadFactory threadFactory,
                              RejectedExecutionHandler handler) {
        if (corePoolSize < 0 ||
            maximumPoolSize <= 0 ||
            maximumPoolSize < corePoolSize ||
            keepAliveTime < 0)
            throw new IllegalArgumentException();

Oh boy. That makes me mad. (This actually happened weeks ago, I just now calmed down enough to write this.)

We are looking at a constructor, and we are looking at years of wasted time for mankind. Java is now thankfully open source, but this is a cause of the closed runtime syndrome, being the term I’ve decided to rant about in this post.

In all fairness, this code isn’t half bad. If you detect an illegal argument, you should throw an exception instead of continuing. Argument checking and failing fast are wise practices. But one important aspect is not handled here: Revealing which argument was the illegal one, and why. It’s not like it’s difficult; this would be enough:

    throw new IllegalArgumentException(corePoolSize + "/" +
        maximumPoolSize + "/" +
        keepAliveTime);

Using the source and the actual exception, this gives me more clues about what’s going on, without having to pull out my debugger. But we can still take another big step in the right direction. What if we were to get this information from just the exception, without the need to consult the source? Sounds utopian, but it turns out to be quite straightforward. Here’s an example:

    throw new IllegalArgumentException
        ("corePoolSize: " + corePoolSize);

Of course, this involves dividing up the checks and throwing custom exceptions for each illegal input. Preferably, the exception should list all illegal inputs detected. Actual error handling logic now: More work for the programmer!
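As a rough sketch of what "dividing up the checks" could look like, the code below validates the same conditions as the constructor above, one at a time, and collects every violation into a single message. The class PoolArguments and its validate method are hypothetical names for illustration, not anything in the JDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not the JDK's code: checks each argument separately
// and reports every violation, by name and value, in one exception.
final class PoolArguments {

    static void validate(int corePoolSize, int maximumPoolSize, long keepAliveTime) {
        List<String> problems = new ArrayList<>();
        if (corePoolSize < 0) {
            problems.add("corePoolSize must be >= 0, was " + corePoolSize);
        }
        if (maximumPoolSize <= 0) {
            problems.add("maximumPoolSize must be > 0, was " + maximumPoolSize);
        }
        if (maximumPoolSize < corePoolSize) {
            problems.add("maximumPoolSize (" + maximumPoolSize
                    + ") must be >= corePoolSize (" + corePoolSize + ")");
        }
        if (keepAliveTime < 0) {
            problems.add("keepAliveTime must be >= 0, was " + keepAliveTime);
        }
        if (!problems.isEmpty()) {
            // One exception that lists everything that was wrong.
            throw new IllegalArgumentException(String.join("; ", problems));
        }
    }

    public static void main(String[] args) {
        validate(2, 4, 60L); // legal arguments pass silently
        try {
            validate(-1, 0, -5L);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The second call in main trips three of the four checks, and all three show up in the message, so the reader of the stacktrace does not need the source to see what went wrong.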

More work it is. A hell of a lot of work, actually, like all things code-related. But I think this is an important complement to open source – the open runtime! The closed runtime simply throws an exception (at best). Working with closed runtimes is just as inconvenient as closed source, and given the choice, I’m not so sure I would choose the open source every time.

The open runtime, on the other hand, consciously goes about telling you what’s wrong, instead of hiding it – it considers the task of assembling and emitting error information an important part of its logic.

Maintaining an open runtime is the responsibility of all code loaded into the VM; the application, the libraries, the framework(s). And yes, the standard library as well. The Java object system provides one vital component: exceptions with messages. Equally important is the facility of exception chaining, which I won’t go into here – just do it. However, the third component is often overlooked: The Object toString() method.

Implementing a sensible toString is about the best thing you can do for prosperity, world peace a close second. It means everyone can benefit from using your object in exception messages, as well as log messages. It means every log message and exception they appear in will become a little more informative. If you provide a library used by many, the benefits are boundless. You know a good toString method when you see its output: It tends to describe important state (for value objects) and/or identity (for entities). Important state here being e.g. the factors that determine how the instance will behave, or what its logical meaning is. (If any.)
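For a value object, a sensible toString can be as small as the sketch below. The Endpoint class is a made-up example, not taken from any library; the point is only that the important state ends up in whatever log or exception message the object lands in.

```java
// Hypothetical value object, for illustration only.
final class Endpoint {
    private final String host;
    private final int port;

    Endpoint(String host, int port) {
        this.host = host;
        this.port = port;
    }

    /** Describes the state that determines how this instance behaves. */
    @Override
    public String toString() {
        return "Endpoint[" + host + ":" + port + "]";
    }

    public static void main(String[] args) {
        Endpoint endpoint = new Endpoint("db1", 5432);
        // The object now explains itself inside any message it appears in.
        IllegalStateException e =
                new IllegalStateException("Connection refused by " + endpoint);
        System.out.println(e.getMessage()); // prints Connection refused by Endpoint[db1:5432]
    }
}
```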

So everyone has the responsibility of opening up the runtime, and the further “down” you tend to be, the bigger your responsibility. It’s a trickle-up effect! What would the world be like if an int presented itself as e.g. java.lang.Integer@123123? A lot more difficult to debug, for one thing. Such transparency makes more or less sense for all objects, especially if they are in heavy rotation. So, if they end up in a log message or an exception message, they contribute by adding meaning to it – every time. Adding a toString is a lot better for your karma than not, which is basically like being a time-sucking vampire. A closed runtime sucks time and energy from all who touch it, from fellow developers to IT staff who have to keep it running.

The Java standard library should definitely know its role in making Java runtimes more open. But e.g. ThreadPoolExecutor doesn’t implement toString – how many man-years have been lost to debugging because of that? (It’s not that I want to know, I was going for rhetorical.)

To sum up: Just implement toString sensibly, assemble enlightening exception messages, and always wrap the cause. Afterwards, the world is a little better, your runtime (and possibly others) will be more transparent and less closed, and everyone has more time to write code, because they don’t have to dig around in a debugger to find out what the hell is making your code scream. And, bonus, you won’t see your code in my blog, I promise.
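Since "always wrap the cause" only got a passing mention, here is a minimal sketch of it. ConfigLoader and parsePort are hypothetical names; the constructor taking a message and a cause is the standard one on IllegalArgumentException.

```java
// Hypothetical example of exception chaining: add context, keep the cause.
final class ConfigLoader {

    static int parsePort(String raw) {
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            // Say what we were doing and with which input, and pass the
            // original exception along so its stacktrace is not lost.
            throw new IllegalArgumentException("Invalid port setting: '" + raw + "'", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parsePort(" 8080 ")); // prints 8080
        try {
            parsePort("80a");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage() + ", caused by " + e.getCause());
        }
    }
}
```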


Perils of Utils: 3 Hurdles for Low-Level Reuse

2009-02-13

I recently started thinking again, this time about low-level reuse – yes, the utils library. Trivial to reuse, this is the layer you build on top of the standard library to make life a little easier in general. The StringUtils.isNullOrEmpty() method, and other things that are just missing from the standard library.

Sounds simple! Well, that method, at least, should be straight-forward to reuse. But it occurred to me that I’ve coded these libraries a few times now. For the love of removing code – why? What, if anything, makes such a basic piece of code hard to reuse? Apart from IPR issues, I’ve identified some personality traits, if you will, that I find to be hurdles to reuse. While obviously not exhaustive, these three are:

  • Dependency Addiction
  • Lack of Inner Motivation
  • Bad Language

Undesirable in any person, how do these traits appear in code? I’ll make the case right here that the general problem is connectedness in your code. Connections that go downwards, upwards and sideways. Read on:

Dependency Addiction

This is the trivially identifiable hurdle: the downward connections. Say you’re on a project that could really use a good utils library – maybe with lots of low-level WET boilerplate. (Agile wiseguy insert: “Write Every Time”, so not DRY.) You have some utils that you want to reuse, but it turns out they depend on around half of the Jakarta Commons! Reuse is inhibited by various factors:

  • Conflicting versions of those libraries are already used in the project, or
  • the project has a stricter policy on third-party libraries, or
  • the codebase is smugly designed to be lean and mean, meaning that your library represents a rather considerable addition to the project’s footprint.

Seeing the uneasy frowns descending on your new co-workers, how do you deal with it? Ruthless purges!

One purge tactic is to identify parts that consume dependencies for trivial purposes. Are you really only using a few of the classes from Commons Collections? Isn’t it sometimes worth re-inventing a wheel, if the custom-designed wheel is a lot slicker? Implement the required functionality yourself, and admire your newly-invented wheel. (Which is usually eminently testable, since you know what you need it for.) If it makes reuse work better, chances are that you can be absolved of the sin of wheel-reinventing. (And like many sins, wheel-inventing is fun, too.)

Second, some of the highly-dependent code might be persuaded to move elsewhere. Maybe it pulls in lots of dependencies because it provides a major, maybe un-utils-like, capability. Ask yourself if it is really a good candidate for this utils library. Find out how many usages the culprit has, and try to get a view of its actual utility. Consider being downright unfair on the hapless piece of code, for the greater good of reuse. It might be better off as its own little module.

In short, find the undesirable downward connections. Some come from the library, some go to the library. Identify and eliminate!

Lack of Inner Motivation

This one is slightly more interesting. I also think of it as origin artifacts, and again, undesirable connections. These are undesirable upwards connections.

Sometimes, we see an elegant piece of code that we want to reuse. So we weed out any references to the application code – the host application, i.e. the origin – and parametrize here and there. We get a general piece of functionality that we can move to the utils library. The original application code now simply calls the utility, with its specific parameters. It ends up leaner, more focused, and generally higher-level. You get the opportunity to clean up the logic chunk under consideration. Even when the utils library isn’t the right place for it, moving code out can be a worthwhile exercise in terms of quality. (It can also expose opportunities and trigger yet more radical code cleaning.)

But the utils may not be the right place. The problem arises when the utility isn’t really as general as it looks; maybe it embodies tacit assumptions, or handles special needs of the origin. They are the origin artifacts, the hidden upwards connections. This reveals its reduced utility in other applications, and it springs surprises on innocent re-users at awkward times. If it hangs around, it pollutes the utilness of your utils. Worst case: Similar utilities, with other quirks, make their own way into the same utils library. (Bad language – more on that below.)

The motivation for a utility must be clearly stated and obviously useful, in and of itself. The motivation must be intrinsic, not extrinsic. Can you describe the function without referring to the origin, or explaining a series of not-too-abstract-sounding preconditions that just happen to apply to the origin? If not, maybe it is not actually meant to be a general, reusable utility.

Or maybe it just needs more parameters. When investigating a possible util-impostor, you can fight to keep it, by documenting the quirks. One tactic is to identify the potentially surprising twists and turns, and expose them as options – i.e. more parameters! Parameters are at least good documentation points, and help you expose the connections.

In any case, the default behavior should be left as nicely unsurprising as possible. It may still not be all that generally valuable over time, so you should keep eviction notices handy.
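A small, made-up illustration of turning a quirk into a parameter: suppose the origin application had a join utility that silently dropped nulls, because that happened to suit it. In the sketch below (the class Joiner and the skipNulls flag are hypothetical), the quirk becomes an explicit option, and the two-argument default does the unsurprising thing.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical utility: the origin's habit of dropping nulls is exposed
// as an explicit option, not baked in as a hidden assumption.
final class Joiner {

    /** Joins the items with the separator; nulls are skipped only on request. */
    static String join(List<?> items, String separator, boolean skipNulls) {
        StringBuilder out = new StringBuilder();
        boolean first = true;
        for (Object item : items) {
            if (item == null && skipNulls) {
                continue;
            }
            if (!first) {
                out.append(separator);
            }
            out.append(item); // a kept null is appended as the text "null"
            first = false;
        }
        return out.toString();
    }

    /** Default behavior: nothing is dropped behind the caller's back. */
    static String join(List<?> items, String separator) {
        return join(items, separator, false);
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", null, "b");
        System.out.println(join(items, ","));       // prints a,null,b
        System.out.println(join(items, ",", true)); // prints a,b
    }
}
```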

Bad Language

So far, this has been mostly a trite rehash of common wisdom. The most remotely interesting item is this one, which deals with the sideways connections, or lack thereof. Again, these are hidden and unspoken, but in contrast to the above, there are both undesirable and desirable connections. Lispers might recognize this part as the language-building philosophy of Lisp programming. Warning: this gets vague, we are not in HOWTOs anymore.

So here’s how we look at it: The utils library you’ve been hammering into shape actually represents, if not a new language, then at least an extension to the language, in the same way that the Java standard library also defines Java. Of course, technically, Java 5 is mostly the same language as Java 1.0.2 (and backwards compatible!), but for most practical purposes they’re very different, and the evolution of the standard library is the real difference. (If I didn’t make that point with you, consider Java 1.0.2 and 1.4.2 instead.)

Library design, at least for low-level reusable libraries, is to some extent also language design.

So what are the undesirable sideways connections? Inconsistency, plain and simple. Language design is hard, and challenges include internal consistency and uniformity: Keeping the implicit connections in mind and keeping them logical, keeping the conceptual disconnects out. For instance, if you have overlapping functionality (similar utilities with different quirks, for instance), that’s a disconnect. If there are related (or even overloaded) methods, and their argument ordering varies, that’s a disconnect too. What you get is a confusing mess of a language. What you really want is a practical (maybe even elegant), incremental improvement to your existing language.

And those are only the syntactic connections. The semantic connections are the ways that the various parts can be combined. A sizable utils library has many parts that can be combined in infinitely many ways; it is combinatorial. This is really powerful, actually too powerful for humans to handle, so it must restrict itself from providing connections that are undesirable. Why doesn’t java.lang.String have an openFile() method? Because it’s insane, that’s why. A String can obviously represent the name of a file, but it doesn’t go ahead and provide this connection. String isn’t the obvious place to look for file handling, because strings are more general than files, therefore we allow files to talk about strings (with e.g. file.getName()), but not the other way around.

The good connections, on the other hand, know their place. Notice how the I/O libraries deal with e.g. InputStream, and not the multitude of things that can provide InputStreams. This level of indirection makes for an extra step when you wire things up (INSERT HERE: gripes from people touting the ‘concise’ syntax for that particular case in their favorite – though possibly leaky-runtimey – scripting language). But it also adds a degree of freedom, and more ways to combine the basic parts. To combine with the I/O libraries, you don’t have to be a File, you can just provide an InputStream.
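The same principle applies to your own utilities. In the sketch below, LineCounter is a hypothetical utility that accepts an InputStream, so it combines with files, sockets or, as here, a plain byte array, without knowing about any of them.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical utility: depends on the general InputStream abstraction,
// not on any of the many things that can provide one.
final class LineCounter {

    /** Counts the lines readable from the stream, decoding it as UTF-8. */
    static int countLines(InputStream in) throws IOException {
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        int lines = 0;
        while (reader.readLine() != null) {
            lines++;
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "one\ntwo\nthree\n".getBytes(StandardCharsets.UTF_8);
        // No File required: anything that yields an InputStream will do.
        System.out.println(countLines(new ByteArrayInputStream(data))); // prints 3
    }
}
```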

The Java standard library is rather conservative with frivolous connections, which has probably been good for its longevity.

Good connections must obey some ordering of things by generality, usually in some tree-like structure. The connections should enable combinatorial composition, and avoid flooding the API with maybe-possibly-helpful methods. The ordering is subtle, tree-like and never quite explicit – but it is there – and it will become painfully clear (or at least painful) to users when things are out of order. When your connections aren’t good, things don’t combine well and boilerplate starts to gather like moss in unexpected corners. Or they don’t get used at all, because they’re hiding in the wrong place.

There’s no hard and fast way to fix this – over time, it is the hardest part of growing a good utils library. It simply takes a lot of single-minded whacking of things into shape, just like the Java standard library probably did. But on the whole, I find it useful to imagine myself as a language designer.

Back to Basics

If you read this far and find all this to be basic stuff – it is. It boils down to some basic lessons of design: cohesion and coupling. You want high cohesion (inner motivation, no overlaps, consistent library design) and low coupling (no origin artifacts, minimal dependencies, abstractions ordered by generality). And obviously, you want consistency too.

The best test of a good utils library is taking a break and returning to it. Does it feel natural? Do you know roughly where things are? Or do you find yourself adding more stuff to it, only to discover later that the functionality really was there – just not where you looked? That means it needs more work. But of course, something like this is never completed anyway.


Stop Making Senseless Videos

2008-12-12

I don’t care how 2.0 your new development environment is.

I don’t care if your web site has a stylish white background and three or four big, friendly, rounded icons in primary colors.

The Santa user

I don’t even care if your icons are cute and stylized like the Santa user in the illustration.

I don’t care if you’re not original; it can still be something I want to know about.

So don’t… just don’t make me watch a video about whatever it is. Please. I don’t want to watch a video.

OK, so you have a video. Congratulations! Nice. But does that make you deserving my undivided attention?

How do you know I’m not listening to some music, that I don’t want to pause?

How do you know I’m not in a boring meeting, and have about 40% of brain capacity to spare, ready to peruse something potentially useful?

How do you know I want to spare 10 minutes? I could have skimmed the equivalent information in text form in half a minute.

And how do you know your video doesn’t suck? Count how many of your sentences start with the word “so” or “ok, so”. More than one third, and you should write a couple of paragraphs about it instead. Programmers often have excellent written skills!

This doesn’t just go for the x on rails and general 2.0 crowds, but sites like infoq as well. Please consider that videos have completely different consumption modes. If I can read an article in five minutes while listening to my music, it sure beats turning off the music for half an hour to have it read to me. (Especially when every other sentence begins with “Ok, so …”)

Want to be original? Don’t have a video!


Good, Fast or Cheap? Mu!

2008-11-15

When pursuing these three goals, which do you choose? Can you choose? How many? One? Two? Which one(s)? Surely not all three? I think mu.

The whole setup is off – it’s a misleading question. Speaking as a software engineer, Good is the center of gravity in this little triangle, and your only long-term yardstick. For argument’s sake, let’s assume something actually does end up Good, Fast and Cheap. That’s an actual big whoop, a real big deal – obviously, a lot of good work has been done. So you click your stop-watch, record the time it took and count up the money you spent on it. But: How do you assert how Good it really is? You don’t know yet, you have to keep track of that Goodness over time, from that point on. The world changes (it’s probably quite different already), so you will likely have to deal with additions and changes. This will reveal an important aspect of its Goodness: How easy it is to change, in terms of time and money. Also, hidden non-Goodness can pop up. Maybe someone cut a corner back in the Fast days to keep it Cheap? It happens, for more or less legitimate reasons. That will make it less Good later, or incur a little more time and cost to fix it, which makes it less Cheap and Fast overall. Things change, along with your scores for Fast and Cheap, even if they’re not on anyone’s MBO anymore. But are they really the factors you should be measuring in the first place?

Rant, rant. You can probably see where I’m going: I think measuring the Cheap can be misleading, because anything remotely worthwhile is also worth maintaining, and maintenance usually ends up being the biggest cost. The Goodness will deteriorate unless you put in that time doing the maintenance. Does this reduce your Fast score? Sure, but it’s like Oscar Wilde said: The only thing worse than maintaining is not doing maintenance. (Well, he would have, I’m sure.)

With any piece of software, it’s like this: When it’s Done – i.e. when no one is working on it anymore – it’s dead or dying. At that point, you can measure the time and money spent and say how Fast and Cheap it all was – which is useful for accounting and project management of course, but it’s not really relevant to the software, being dead – and well on its way to being replaced by another project sometime later. Some things will take a long time to find their shape, and will only end up as Good on paper, i.e. a measly one out of three. But some of those things really change the world, like the internet, the NASA space programs and the Manhattan project. (Yes, changing the world can be bad as well.) These projects require a whole new way of thinking to gel – slowly – in the heads of many brilliant people. You couldn’t have rushed setting up the internet by throwing more money at it. Most projects are obviously lesser efforts, but quality (here defined as being Good) generally does cost money, and it does take time.

Turning the tables completely on the whole question and flipping the polemics switch, we can phrase it like this: Unless it is actively burning time and money right now, it can’t be all that Good anymore, so forget about the whole thing being fast and cheap. Those are operational parameters that can be more sensibly applied to the day-to-day business of maintaining the ever-important Goodness. Can we change this? Can we add that? If yes, and fast, you’re good. That’s a bit crudely put, but the polemics switch was on, after all.

If this little (but really simply syndicated) rant seems somehow familiar, it may be due to a high content of well-established, non-controversial truthism, but possibly also because you read my very similar Answer to a Question on LinkedIn.


Can Mutants be House-trained?

2008-11-04

I was pointed to a lightning presentation on mutation testing the other day. I like the idea of mutation testing, which basically is to mutate code randomly – though the operations are well-defined to keep things well-formed – creating lots and lots of mutants. A mutant is a variant of your code, with one or more changes applied. Mutation testing tells you how many of these variants are actually caught by your tests. If a test fails on a variant, that mutant has been detected and is killed. Obviously you want your tests to kill as many of them as possible, for the highest possible death rate. I piped up on the subject in the forum at smidig, and this is sort of a rehash in English.
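To illustrate the terms, here is a hand-made sketch of one mutant and two tests. Nothing here comes from an actual mutation testing tool: MutationSketch, max, maxMutant and the two test methods are all invented for the example, and a real tool would generate the mutants itself.

```java
import java.util.function.IntBinaryOperator;

// Hand-made illustration of a mutant being killed by one test and
// surviving another. A real tool would generate the mutants itself.
final class MutationSketch {

    /** The original code under test. */
    static int max(int a, int b) {
        return a > b ? a : b;
    }

    /** A mutant: the operator > has been replaced by <. */
    static int maxMutant(int a, int b) {
        return a < b ? a : b;
    }

    /** A test that tells the two arguments apart. */
    static boolean strongTestPasses(IntBinaryOperator impl) {
        return impl.applyAsInt(1, 2) == 2 && impl.applyAsInt(5, 3) == 5;
    }

    /** A weak test: with equal arguments, any branch gives the same answer. */
    static boolean weakTestPasses(IntBinaryOperator impl) {
        return impl.applyAsInt(3, 3) == 3;
    }

    public static void main(String[] args) {
        // Both tests pass on the original code.
        System.out.println(strongTestPasses(MutationSketch::max));       // prints true
        System.out.println(weakTestPasses(MutationSketch::max));         // prints true
        // The strong test fails on the mutant: the mutant is killed.
        System.out.println(strongTestPasses(MutationSketch::maxMutant)); // prints false
        // The weak test still passes: the mutant survives it.
        System.out.println(weakTestPasses(MutationSketch::maxMutant));   // prints true
    }
}
```

A surviving mutant is the interesting result: it points at behavior that the tests do not actually pin down.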

In the presentation, a piece of code and its tests are gradually hardened to withstand the assault of the mutants. More mutants are killed, and we prune the mutation space to achieve smaller numbers of mutants. The final move, so to speak, is to declare the method arguments to be, yes, final. This prevents a lot of mutation, since mutants that try to change the parameter simply won’t compile. The mutation space is smaller. This is important, since the number of mutants grows fast, and quickly overwhelms even a small code base.

Now, assigning to parameters is a bad practice in some quarters. (In my own quarter, it’s a howler.) And forbidding the practice of assigning to parameters does seem to help with the mutation rate, but it’s not really tempting to go over the code and harden it piecemeal, littering it with explicit finals all over the place. Also, those modifiers are a double-edged sword. What purpose do they really serve? First, they’re not really needed (unless you do the inner class thing), so one could argue that the only purpose they serve is to prevent mutation. I hate it when some external factor, such as a tool, affects my code so it looks weird. Second, a human programmer could just as easily drop the modifier anyway, if it got in her way. Some programmers remove redundant modifiers on general principle, I do it on morning grumpiness as well.

So, it doesn’t really help. Not really. Worse, it’s quite intrusive code-wise. After all, the mutation testing is just one of many things that the developer needs to know about when writing code, and it gets tiresome after a while to remember that, oh, this weird little thing must be done to keep that automated thing happy, and this one because the so-and-so framework will use reflection to access only, say, protected fields … look out for comments scattered all over saying // make muclipse happy and so on. Stuff like this adds to the time needed to get new people into the project, and generally to the load on programmers’ brains. That’s a scarce resource, so keeping the code as idiomatic and non-quirky as possible is a good thing. Really, try it some time.

Could we lower the mutation rate on a more sweeping scale, and without affecting the code so much? Sure, don’t we have something that we often use to forbid a practice? We do, and it’s called a code standard. A code standard could say “don’t assign to parameters, fool”, and some other tool could enforce that part.

Your IDE is great for it – IDEA points out my code standard misdemeanors with yellow lights.

If the code standard forbids the practice, it seems natural that the mutation tester should know about it. Why should it bother with exploring variants that would be caught by checking against the code standard? In this particular case, it would imply that the mutation tester should behave as if all parameters were final. Presumably, that should lower the number of mutants all over the place, not just in the places we have littered with (all those increasingly mysterious) finals.

This is a lot of philosophizing for a fellow who hasn’t actually tried mutation testing. The idea of marrying mutation testers with code standards could be a silly one, and if so I would love to hear why. If anyone knows any reason why these two, and so on and so forth; then comment and talk geeky to me.


Hadoop and HBase for OSGi

2008-08-27

Here’s a tip if you need to access HBase from an OSGi environment. I tried packaging Hadoop and HBase as two separate bundles, but I ran into an unexpected, and quite unseemly, issue. To make a short story sweet (or so): You will need to package them together in a single bundle, and write a manifest that exports both the HBase and the Hadoop packages.

The issue, shockingly, is that HBase takes the liberty of declaring a class in a Hadoop package! Specifically, the org.apache.hadoop.ipc.HBaseClient class extends Hadoop’s Client class, defined in that same package. This is the kind of necessary evil that pops up from time to time between two projects, especially in early days – after all, Hadoop is at 0.17.1 and HBase at 0.2.0. It does break an important OSGi rule: A package can only be imported from a single bundle. It cannot contain classes contributed from multiple bundles. This means that other bundles can only ever depend on one of the two, never both.

I haven’t thought of any import/export acrobatics that could resolve this, and I’m not sure if anyone should. The HBase bundle needs to see that package exported from the Hadoop bundle, or it can’t extend its Client class. Given that, the HBase bundle then needs to export that same package. I can’t imagine any way that would work, except by accidents in the implementation. Even if it somehow is an edge case within the spec, it certainly isn’t something I would like to spend much think-time on, let alone maintain. The quick fix is to package them together. Do that, and your stuff works.

Feel free to ask for the manifest, if you’re interested. I didn’t have it at hand while writing this post.


Generics: Duping your inner compiler

2008-06-25

Maybe I’m slow, but I recently figured out a handy use for Generics:

    public static S3Service connectS3(String id, String key) {
        AWSCredentials creds = new AWSCredentials(id, key);
        try {
            return new RestS3Service(creds);
        } catch (S3ServiceException e) {
            return failConnect(id, e);
        }
    }

where failConnect is:

    private static <T> T failConnect(String id, Exception e) {
        throw new AWSException("Unable to connect " + id + " @ Amazon!", e);
    }

It feels like being a knowledgeable (runtime) adult, duping an ignorant (compile-time) child. The child is happy, because the connect method has its returns in place, and the failConnect method obviously returns the right type. So I can reuse it here:

    public static SimpleDB connectSDB(String id, String key) {
        try {
            return new SimpleDB(id, key);
        } catch (Exception e) {
            return failConnect(id, e);
        }
    }

failConnect can obviously “return” anything. And it does! Right? Ah, kids are such dupes.
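
If you want to play with the trick without the AWS libraries on the classpath, here is a self-contained sketch of the same idea. Everything in it (FailHelper, ConnectException, connectText, connectNumber) is a hypothetical stand-in, not code from JetS3t or any SimpleDB client.

```java
public class FailHelper {
    // Hypothetical stand-in for the AWSException used above.
    public static class ConnectException extends RuntimeException {
        public ConnectException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Declared to return T, but it always throws and never returns.
    // The compiler infers T from each call site's return type.
    public static <T> T fail(String id, Exception e) {
        throw new ConnectException("Unable to connect " + id, e);
    }

    public static String connectText(String id) {
        try {
            if (id.isEmpty()) {
                throw new IllegalArgumentException("empty id");
            }
            return "text:" + id;
        } catch (IllegalArgumentException e) {
            return fail(id, e); // T is inferred as String
        }
    }

    public static Integer connectNumber(String id) {
        try {
            return Integer.valueOf(id);
        } catch (NumberFormatException e) {
            return fail(id, e); // T is inferred as Integer
        }
    }
}
```

The same helper satisfies both return types, and each caller still sees the exception with the original cause attached.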