Software Networks
Something that has been bugging me as I have played with the Small Worlds tool for visualizing source code as networks is that it seems to make little or no distinction between the kinds of dependencies (the links) between objects (the nodes): inheritance seems to be treated as equal to containment or use. The result is that the tool can consider an object to have vast numbers of global dependencies when that’s not really the case.
For instance, yesterday I loaded Apache Tomcat into the tool by simply adding all the jars in the library directory. Small Worlds identified many Global Butterflies (that is, objects whose changes would affect many other objects that depend on them, directly or indirectly). Looking closer, I saw that these were all found in the Apache Ant packages. In the Tomcat part of the network, a few Local Hubs, Butterflies, and Breakables were found.
I studied some of the Local Hubs and saw that one of them (I can’t remember its name, but it was a central object in the JSP compiler) had a strange dependency on the class Project from Ant. Perhaps the JSP compiler now uses parts of Ant to help compile JSP files. Anyway, since Ant seems to have a remarkably high number of dependencies between its classes, a change in the JSP compiler would, according to the tool, cascade to several hundred other classes, most of them within the Ant framework.
Ant is a build tool whose makefiles are written in XML and specify a series of targets that can depend on each other, so that they are all executed in the right sequence, depending on which build products need to be updated. Each Target object has a number of Task objects, which are in fact instances of concrete subclasses of Task, such as Copyfile and Javac. Now, my point here is that Small Worlds shows Task subclasses as dependents although most of the other Ant classes do not “know” about them. Most Ant classes only deal with Task objects, ignorant of which concrete subclasses these are actually instances of.
This is a common technique to minimize dependencies and keep changes from cascading throughout the larger system. But Small Worlds isn’t sensitive to this – at least not yet. I’m sure that this is something they are working on, but I wonder what they will do about it. This was something my friend Olof and I talked about when we had lunch yesterday.
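The Task arrangement described above can be sketched roughly as follows. This is only an illustration in Python, not Ant's actual code: the class names Task, Target, Copyfile, and Javac come from the post, while the methods `execute`, `add_task`, and `run` and the returned strings are made up for the example.

```python
from abc import ABC, abstractmethod
from typing import List


class Task(ABC):
    """Abstract task; callers depend only on this interface."""

    @abstractmethod
    def execute(self) -> str:
        ...


class Copyfile(Task):
    def execute(self) -> str:
        return "copied file"


class Javac(Task):
    def execute(self) -> str:
        return "compiled sources"


class Target:
    """Holds Task objects without naming any concrete subclass."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tasks: List[Task] = []

    def add_task(self, task: Task) -> None:
        self.tasks.append(task)

    def run(self) -> List[str]:
        # Only the abstract Task interface is used here.
        return [task.execute() for task in self.tasks]
```

Target never mentions Copyfile or Javac, so a change inside either subclass has no way of reaching Target through the source code, even though a dependency graph may draw all of them close together.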
One interesting feature of Small Worlds is its “What If” view. Basically, it shows all objects as tiny dots, grouped in boxes representing the Java packages. If you click on an object, the view shows an animation of how changes to that object might cascade throughout the system. If you do this with Compiler in Tomcat... Hmm, I was going to say that Small Worlds would show such a change cascading into the Ant framework, but this isn’t the case. I must have done something to fool myself. Actually this is pretty interesting: Compiler privately keeps an Ant Project object, which isn’t exposed in its interface. It does have a dependency, but the tool apparently assigns weights to dependencies, thus limiting the scope of a cascading change. I didn’t realize this!
But the argument is still valid. If you simulate a change to the Project class, it cascades across a very large part of the system. This class is considered a Global Butterfly. Many classes in the Tomcat framework are affected, but this would be the absolute worst-case scenario you could imagine. The tool apparently treats private and protected dependencies as more “harmless” than public ones, but it is still too insensitive to the type of dependency.
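To make the worst-case reading concrete, here is a minimal sketch that collects every class depending on a changed class, directly or indirectly. It is not how Small Worlds works internally (I don't know that); the function `transitive_dependents` and the toy graph are assumptions, with class names borrowed from this post.

```python
from collections import deque


def transitive_dependents(depends_on, changed):
    """Return all classes that directly or indirectly depend on `changed`.

    depends_on maps a class name to the classes it depends on.
    """
    # Invert the edges: for each class, record who depends on it.
    dependents = {}
    for cls, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(cls)

    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent != changed and dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


# Hypothetical toy graph, loosely based on the classes named in this post.
toy_graph = {
    "Compiler": ["Project"],
    "ProjectComponent": ["Project"],
    "Task": ["ProjectComponent"],
    "Copyfile": ["Task"],
    "Javac": ["Task"],
}
affected = transitive_dependents(toy_graph, "Project")
```

Every link counts equally here, so a change to Project reaches everything in the toy graph. That is exactly the worst-case picture; it says nothing about how likely an ordinary change is to travel that far.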
The abstract Task class mentioned earlier inherits from ProjectComponent (also abstract), which holds a reference to its Project object. So a change in the Project object would, according to the tool, affect such Task subclasses as Copyfile and Javac. I’m not saying that a dramatic change wouldn’t, but I would like to be able to simulate more average changes as well. This would require a more sensitive analysis of the dependencies. An object that uses only a small fraction of the interface of another is less likely to be affected by changes in the latter, compared to an object that uses, and therefore “knows about”, a larger part of the interface.
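One simple way to put a number on "how much of the interface is used" is the fraction of the other class's members that a client actually touches. This is just a sketch of the idea; the function `coupling_weight` and the member names in the example are hypothetical, not taken from Ant or from Small Worlds.

```python
def coupling_weight(used_members, interface_members):
    """Fraction of a class's interface that a client actually uses.

    Returns a value between 0.0 (uses nothing) and 1.0 (uses everything).
    """
    interface = set(interface_members)
    if not interface:
        return 0.0
    return len(set(used_members) & interface) / len(interface)


# Hypothetical member names, for illustration only.
project_interface = ["get_property", "set_property", "log", "add_target"]
light_user = coupling_weight(["log"], project_interface)
heavy_user = coupling_weight(
    ["get_property", "set_property", "add_target"], project_interface
)
```

Under this measure the client that only calls `log` gets a much lower weight than the one that uses three of the four members, which matches the intuition that it is less exposed to an average change.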
Also, an inheritance relationship might be less sensitive to change than a use relationship, if few of the methods in the superclass are overridden (or invoked, for that matter) by the class in question. The same class might have a use relationship with another class, where several methods are invoked. If we see it as a node in a network with a change cascading, it might cascade the change both to its superclass and to the other class it uses, but as the change travels across these links, the signal is reduced more in the superclass link than in the link to the other class. Think of it as electric wires with different resistances. Each link a change passes through reduces it. Some links reduce the signal more than others, so the nodes along the path are affected to a lesser and lesser degree.
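The wires-with-resistance picture can be sketched as a small propagation routine. This is my own illustration under stated assumptions, not anything Small Worlds is known to do: each link carries a made-up transmission factor between 0 and 1 (low transmission means high resistance), the signal is multiplied by that factor at every hop, and anything below a cutoff is dropped. The names `cascade`, `links`, and `threshold` are hypothetical.

```python
import heapq


def cascade(links, start, threshold=0.05):
    """Return the strongest change signal that reaches each node.

    links maps a node to a list of (neighbour, transmission) pairs,
    where transmission is in [0, 1]. The change starts at `start`
    with strength 1.0.
    """
    strength = {start: 1.0}
    heap = [(-1.0, start)]  # max-heap via negated strengths
    while heap:
        negated, node = heapq.heappop(heap)
        signal = -negated
        if signal < strength.get(node, 0.0):
            continue  # stale entry; a stronger path was already found
        for neighbour, transmission in links.get(node, []):
            passed = signal * transmission
            if passed >= threshold and passed > strength.get(neighbour, 0.0):
                strength[neighbour] = passed
                heapq.heappush(heap, (-passed, neighbour))
    return strength


# Made-up network: the superclass link attenuates more than the use link.
example_links = {
    "A": [("Super", 0.2), ("Used", 0.8)],
    "Super": [("X", 0.5)],
    "Used": [("X", 0.5), ("Y", 0.1)],
}
reached = cascade(example_links, "A")
```

In this toy network X is reachable both via Super and via Used, and it ends up with the stronger of the two signals. Raising `threshold` prunes the weakly affected nodes, which is roughly the "more average change" simulation I am asking for.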
Sure, if the tool shows what would happen when the change is as dramatic as removing the node in question, you can follow the path yourself and judge for yourself how likely it is for the change you are envisioning to actually cascade as broadly. But this involves a quite significant bit of manual work, and I’m positive that the tool could do more for you.
Somewhere I read that the 2.0 version of Small Worlds will be a significant leap ahead in sophistication. I wonder if they are working on something like this. (See also my post “Small Worlds”.)
Follow-up: What I wanted to say, regarding inheritance dependencies, was that they shouldn’t be treated differently from use dependencies. I didn’t mean to say Small Worlds does this (which it doesn’t); I was merely clarifying this to myself.