29. Abuse of Class Hierarchy

Part of CS:2820 Object Oriented Software Development Notes, Fall 2015
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

 

An Aside?

A student approached me offering code sort of like this:

class Basic {
        void someMethod() {
                if (this instanceof SubClassOne) {
                        SubclassOne thisOne = (SubclassOne)this;
                        operateOn( thisOne.somefield );
                } else {
                        SubclassTwo thisTwo = (SubclassTwo)this;
                        fiddleWith( thisTwo.otherfield );
                }
        }
}

class SubClassOne extends Basic {
        SomeClass somefield;
}

class SubClassTwo extends Basic {
        OtherClass otherfield;
}

The student was having problems with this. My observation was that, of course this code could be made to work, but this code is also significantly inefficient and it abuses the spirit of object oriented programming with class hierarchies. Something like this would be far better:

abstract class Basic {
        abstract void someMethod();
                // a concrete class could be used here too
}

class SubClassOne extends Basic {
        SomeClass somefield;
        void someMethod() { // overrides the basic version of SomeMethod
                operateOn( thisOne.somefield );
        }
}

class SubClassTwo extends Basic {
        OtherClass otherfield;
        void someMethod() { // overrides the basic version of SomeMethod
                fiddleWith( thisTwo.otherfield );
        }
}

Why is this better?

This puts all the code that deals with fields of each subclass in that subclass instead of making the superclass aware of those fields. That is generally a good idea, although it is worth noting that it is not always the most sensible way to do things.

Specifically, if we had a factory method that allocated either an instance of SubclassOne or SubclassTwo the sensible place for this factory method will likely be in class Basic, since it always returns an instance of a subclass of Basic. If part of the initialization of the subclass instance by the factory method is specific to the type of that instance, it could be done in a method of the instance, but if that method simply sets a public field of the new instance, it would be just as easy to set that directly in the factory method.

Efficiency?

A lesson we learn repeatedly is that just because something is represented by a single symbol, for example, the + operator, this does not mean that it is computationally simple.

Indeed + is computationally about as simple as it gets when the operands are integers where + means addition. In that case, + can typically be evaluated in one machine instruction which is executed in as little as one CPU cycle. Even with one of the simpler J-code interpreters, it might only take 10 or 20 machine cycles to add two operands.

In contrast, + is much slower when it is applied to strings. There, it means concatenation, and each concatenation involves allocating a new StringBuilder object, copying the text of the operands into this new object, and then converting the StringBuilder to a String. This is far from simple, it may involve hundreds of machine instructions just to allocate the new object.

Here, the question we face is, how simple is it to implement the Java instanceof operator compared to calling a method of a subclass that overrides the corresponding method of the superclass. Here, programming language documentation can actually be misleading. The documentation describes instanceof as a basic operator, which sounds like it must be simple, and then the documentation describes calling a method on an object in terms of searching the subclass hierarchy from the actual class of the object all the way back to the root of the class hierarchy (class Object) looking for an implementation of the desired method.

First note, in purely interpreted languages like Python, that is how it is done. Don't expect efficient computation in Python or similar languages. That language is all about economical use of human resources in the form of programmers. There, the focus is on the human effort it takes to get something to work, not on the computing resources it takes to make it work. Use Python for throw-away programs, feasibility studies, design experiments, and prototypes.

Compiled languages like C++, where the compiler output is directly executable machine code make exactly the opposite choice, placing higher burdens on the programmer in order to allow for the possibility of very efficient execution. Java is in the middle, compiling your code to J-code and then using a a J-code interpreter. Highly tuned J-code interpreters can do just-in-time translation to machine code, allowing execution at speeds that challenge C++, but the simple free J-code interpreter that comes with Java has a performance penalty of roughly 10 to 20 compared to a good C++ compiler for comparable code.

The Java compiler does the search of the class hierarchy for the method that is being called. This search is done at compile time, and in the J-code, the cost of calling a method of a class that is overridden is at most one or two J-instructions slower than calling a static method where neither the concept of overriding nor the concept of object is involved.

Now, compare this to the question of class membership testing.

You might imagine that each object could include a field that identifies its class, and indeed this is true. Each Java object includes a field that points to an object of class Class that describes the class of the object. Let's call this field myClass in this discussion.

If you say anObject instanceof aClass, you could imagine this being equivalent to anObject.myclass==aClass. This would be nice, but it is not so. The problem is anObject instanceof aClass will be true for the actual class of anObject and for every superclass of that class. At the very root of the object hierarchy, anObject instanceof Object is always true, since every class is ultimately a subclass of class Object.

So how can we implement the instanceof operator? It must involve a search of some kind. The most appealing way to arrange this is to put, inside each object of class Class, a set of all the classes for which instanceof is true for that class. This set will rarely be large, since most class hierarchies are relatively shallow, and if we use a hashed set representation, searches of this set can be very fast. Nonetheless, they are searches, and as such, they are significantly slower than simple pointer comparison.