Thanks for the input - the more cross-disciplinary knowledge flow, the better. I like your takeaway in particular. The author is basically saying 'All this OOP stuff that most of the industry has been doing for 20 years? All wrong! Do it my way.' Hearing that biology still uses directed acyclic graphs for classification (e.g., single inheritance) shows just how powerful single inheritance really is, and that we maybe shouldn't be so quick to throw it away because it doesn't match something.
I think sticking to the 'age-old wisdom' of using direct inheritance for IS-A relationships and composition for HAS-A is still the right way to go. It's been tested, it works, and using HAS-A for everything makes for less understandable code, IMO. Often you can fix an IS-A by simply refactoring your graph with a better understanding of the problem space - which, as it turns out, biologists have done as well.
In my personal experience, I've never found a true IS-A relationship in my code. I've found lots of interfaces, though.
A lot of the time, you'll think you're defining a superclass when you start writing things like "Square IS-A Shape; Rectangle IS-A Shape". But Shape will turn out to not define any common behavior, just a category you want to restrict your inputs and outputs to; and you'll want to be able to assert that anything is, in addition to whatever else it is, a Shape. So shape is an interface.
A lot of the time, you'll think you're defining a superclass when you start writing things like "User IS-A Person; ProjectOwner IS-A User". But it'll turn out that you want to keep People around even when they stop being Users, and to keep Users around even when they stop being ProjectOwners. So you'll rearrange things and find that you're now asserting "User IS-A Role; ProjectOwner IS-A Role; Person HAS-MANY Roles." And Role turns out to be, again, an interface.
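For illustration, here's a minimal TypeScript sketch of that rearrangement (all names are taken from the comment or invented; a sketch, not a prescription):

    // Role is just a contract; User and ProjectOwner implement it.
    interface Role {
      roleName(): string;
    }

    class User implements Role {
      roleName(): string { return "user"; }
    }

    class ProjectOwner implements Role {
      constructor(public projectId: string) {}
      roleName(): string { return "project-owner"; }
    }

    // A Person outlives any of its roles: Person HAS-MANY Roles.
    class Person {
      roles: Role[] = [];
    }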
The only example I can think of that does fit single-inheritance is when modelling objects that directly express a genealogy. For example, GithubForkOfProjectA IS-A ProjectA, or Ubuntu IS-A Debian. But these aren't typically things you'd express as types; they're just instances (of, respectively, GithubProject and LinuxDistribution.) Each instance HAS-A clonal-parent-instance.
I guess there's one possibly-practical use of inheritance which I've nearly implemented myself: if you force your database schema migrations to always follow the Open-Closed Principle, and you want to migrate the rows of a table as you encounter them to avoid taking the DB offline, then you could have two separate models for a Foo table, FooV2 and FooV3, where "FooV3 IS-A FooV2". Each row has a version column, and is materialized as the model corresponding to that version. Your code that expected FooV2s would then be satisfied if it was passed a FooV3.
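A minimal TypeScript sketch of the scheme (FooV2, FooV3, and materializeFoo are invented names for illustration):

    class FooV2 {
      constructor(public id: number, public name: string) {}
    }

    // The migration obeys the Open-Closed Principle: V3 only adds fields,
    // so every FooV3 is a valid FooV2.
    class FooV3 extends FooV2 {
      constructor(id: number, name: string, public createdAt: Date) {
        super(id, name);
      }
    }

    // Materialize each row as the model matching its version column;
    // code written against FooV2 is satisfied by either model.
    function materializeFoo(row: { version: number; id: number; name: string; created_at?: string }): FooV2 {
      return row.version >= 3
        ? new FooV3(row.id, row.name, new Date(row.created_at!))
        : new FooV2(row.id, row.name);
    }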
Does anyone actually do this, though? I don't just mean "row-by-row migrations", I mean the "with two models, one version inheriting from the other" part. And, if so, what do you do when you make a change to a model that doesn't obey the Open-Closed Principle: where FooV3s break the FooV2 contract?
The one case where languages without traditional OO inheritance are painful is where you want to do "like this other thing, except for this one particular thing that it does differently". And sure, maybe that's always bad design - but it comes up a lot in real-world business requirements. For all the people saying "you should use alternatives to traditional OO", I've never seen an actual example of how to do this better. You can patch those instances at runtime (urgh), or you can create an object that implements the same interface and delegates to an instance of the base type (much less readable in every language I've seen, and effectively reimplementing inheritance without the syntactic sugar).
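To make the delegation option concrete, here's roughly what it looks like in TypeScript (all names invented); note how every unchanged method still has to be forwarded by hand:

    interface PriceCalculator {
      price(qty: number): number;
      tax(amount: number): number;
    }

    class StandardCalculator implements PriceCalculator {
      price(qty: number): number { return qty * 10; }
      tax(amount: number): number { return amount * 0.2; }
    }

    // "Like StandardCalculator, except tax works differently."
    class ReducedTaxCalculator implements PriceCalculator {
      constructor(private base: PriceCalculator) {}
      price(qty: number): number { return this.base.price(qty); } // pure forwarding boilerplate
      tax(amount: number): number { return amount * 0.05; }       // the one actual change
    }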
I think the right solution is simply to have firmer constraints about the relationships between parent and child classes - just like decoupling a class's implementation from its interface, it should be possible to separate out the interface it exposes to child classes as well. The one library/framework I've seen that does this really effectively is Wicket - it makes extensive use of final classes, final methods, and access modifiers to ensure that when you need to extend Wicket classes you can do so via a well-defined interface that won't break when you upgrade your Wicket dependency. It works astonishingly well.
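As a rough illustration of that idea - not Wicket's actual API, and TypeScript lacks `final`, so treat this as a sketch of the shape of the thing:

    abstract class Panel {
      // Stable public contract; in Java this method would be final.
      render(): string {
        return "<div>" + this.renderBody() + "</div>";
      }

      // The one sanctioned extension point exposed to child classes.
      protected abstract renderBody(): string;
    }

    class GreetingPanel extends Panel {
      protected renderBody(): string { return "Hello!"; }
    }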
You're correct that special-case "like this other thing, except for this one particular thing that it does differently" objects happen all the time due to business rules.
But the Decorator pattern is not "inheritance without the syntactic sugar." Decorated objects, unlike subclass instances, are allowed to break the contract of the object they decorate: they don't have to claim to obey any of its interfaces, they can hide its methods, they can answer with arbitrarily different types to the same messages, etc.
If a language made defining decorators simple, I think it'd remove a lot of what people think of as the use-case for inheritance. (I mean, you aren't supposed to use inheritance for Decorator-pattern use-cases--it will likely break further and further as you try--but people will keep trying as long as the first steps are so much easier than the alternative.)
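A quick TypeScript sketch of that freedom (names invented): the decorator below doesn't implement the decorated object's interface at all - it hides read() and answers with a different type, which a subclass could never legally do:

    interface DataSource {
      read(): string;
    }

    class FileSource implements DataSource {
      read(): string { return "raw contents"; }
    }

    // Decorates a DataSource without claiming to be one.
    class ChunkedSource {
      constructor(private inner: DataSource) {}
      readChunks(size: number): string[] {
        const raw = this.inner.read();
        const chunks: string[] = [];
        for (let i = 0; i < raw.length; i += size) {
          chunks.push(raw.slice(i, i + size));
        }
        return chunks;
      }
    }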
> If a language made defining decorators simple, I think it'd remove a lot of what people think of as the use-case for inheritance. (I mean, you aren't supposed to use inheritance for Decorator-pattern use-cases--it will likely break further and further as you try--but people will keep trying as long as the first steps are so much easier than the alternative.)
I actually agree with this, and I'd be interested to hear of language efforts in that direction. But until there's this easy way to do decorators, just telling people "don't use inheritance" isn't going to work.
> you can create an object that implements the same interface and delegates to an instance of the base type (much less readable in every language I've seen, and effectively reimplementing inheritance without the syntactic sugar).
In case you're hankering for a language that makes delegated composition easier, Go's "embedded fields" are sugar for exactly that:
http://golang.org/ref/spec#Struct_types
Class inheritance should be about the data, not the behavior. If you're looking for an "IS-A" inheritance relationship, ask: what parameters does the class require on construction? If your implementation isn't fundamentally based around the same construction parameters, then it shouldn't be a derived class.
For example: HTTPResponseHandler IS-A TCPResponseHandler because its primary role requires a TCP socket on construction. The only inherited methods should be management of the data from construction.
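A sketch of that rule in TypeScript (class names from the comment above; the method bodies are invented, and Node's net.Socket stands in for "a TCP socket"):

    import { Socket } from "net";

    class TCPResponseHandler {
      constructor(protected socket: Socket) {}
      close(): void { this.socket.end(); }
    }

    // HTTP handling is fundamentally built around the same construction
    // data (a TCP socket), so deriving is justified; the inherited methods
    // just manage that data.
    class HTTPResponseHandler extends TCPResponseHandler {
      sendStatus(code: number): void {
        this.socket.write("HTTP/1.1 " + code + "\r\n\r\n");
      }
    }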
Problems with this only arise when people subclass to gain the interface but not the data. You should never need to do that, and it's a problem with using classes instead of interfaces for your parameter declarations - not a problem with class inheritance.
Except that interfaces are not supposed to have behavior.
Shape can lay itself out with regard to its enclosing rectangle, for example.
From my experience, using IS-A only works for simple concepts, and almost never across more than two layers. But it's still useful.
I suppose that if more languages had the same "automatic function redirection" to sub-components that Golang has, then people would use composition a lot more. That's actually the thing that seduces me the most about the language.
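For what it's worth, the same effect can be faked in TypeScript with a Proxy - emphatically a sketch, not Go's actual mechanism:

    // Redirect any member missing on `outer` to the embedded `inner` component.
    function withDelegation<T extends object, U extends object>(outer: T, inner: U): T & U {
      return new Proxy(outer, {
        get(target, prop, receiver) {
          if (prop in target) return Reflect.get(target, prop, receiver);
          const value = Reflect.get(inner, prop);
          return typeof value === "function" ? value.bind(inner) : value;
        },
      }) as T & U;
    }

    const engine = { start() { return "vroom"; } };
    const car = withDelegation({ honk() { return "beep"; } }, engine);
    car.start(); // "vroom" - redirected to the embedded engine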
> Shape can lay itself out with regard to its enclosing rectangle, for example.
How so? I mean, sure, Shape can declare, say
bool is_inside(Rectangle enclosing_rect)
...but how would Shape know how to calculate that? It'd be an abstract method. All of Shape's methods would be abstract methods. Thus, a Shape is an interface: a contract an object makes with the system to say that it has a given API.
But this is an excellent example of where Shape can have concrete code in it: is_inside is true iff the shape's bounds are inside the rectangle. So, assuming you have a means to test rectangle bounds:
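(A minimal TypeScript sketch of the idea; Rectangle.contains and the coordinate conventions are assumed:)

    class Rectangle {
      constructor(
        public left: number, public top: number,
        public right: number, public bottom: number,
      ) {}
      contains(other: Rectangle): boolean {
        return other.left >= this.left && other.right <= this.right &&
               other.top >= this.top && other.bottom <= this.bottom;
      }
    }

    abstract class Shape {
      // Children implement only this...
      abstract get_bounds(): Rectangle;

      // ...and inherit one shared, concrete is_inside.
      is_inside(enclosing_rect: Rectangle): boolean {
        return enclosing_rect.contains(this.get_bounds());
      }
    }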
Now you don't need to implement cookie-cutter is_inside methods everywhere, but rather just the simpler get_bounds(). In fact, you can add is_inside after all your shapes are implemented and it will just work, as long as they have get_bounds.
Maybe you want to add some sanity checking to make sure bounds never have negative width, so you implement get_bounds() in Shape, and implement get_left(), get_right(), get_top(), get_bottom() in the child classes. Now you don't have to add cookie-cutter assert(left < right) stuff everywhere, just in one place.
Other similar ideas: if you write an "intersects(Point)" method on each shape, then you can pull a concrete implementation of sampling-based area approximation up to the base Shape class, and leave analytical area calculation to the children. This is useful if you are working with distance-field, parametric or noisy shapes, for example.
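Continuing the sketch above (the sampling count and the names are assumed):

    interface Point { x: number; y: number; }

    abstract class Shape {
      abstract intersects(p: Point): boolean;
      abstract get_bounds(): Rectangle;

      // Concrete Monte Carlo area estimate in the base class; children with
      // an analytical formula simply override this.
      approximate_area(samples: number = 10000): number {
        const b = this.get_bounds();
        const boxArea = (b.right - b.left) * (b.bottom - b.top);
        let hits = 0;
        for (let i = 0; i < samples; i++) {
          const p = {
            x: b.left + Math.random() * (b.right - b.left),
            y: b.top + Math.random() * (b.bottom - b.top),
          };
          if (this.intersects(p)) hits++;
        }
        return boxArea * (hits / samples);
      }
    }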
I've been building a game engine in JavaScript as an exercise for the past few weeks. Using IS-A for my base classes and HAS-A composition for various needed and shareable behaviours works wonderfully, is super testable, and makes sense. Seems like the answer in this case does indeed lie somewhere in the middle of the two extremes.
"In JavaScript (and other languages in the same family), classes and subclasses share access to the object’s private properties."
Self-encapsulation ensures internal variables are directly referenced in only two places: inside their accessors. This adheres to the DRY principle. The quoted line above should be written:
balance( balance() - cheque.amount() );
Or more explicitly as:
setBalance( getBalance() - cheque.amount() );
Even if the superclass is later modified to use a transaction history, which would violate the Open-Closed Principle, the subclasses would continue to function correctly. (The transaction history would be implemented inside the accessors, making the additional code transparent to subclasses.)
Source code that eschews self-encapsulation will be brittle. Developers must grok the DRY principle. Code that directly sets or retrieves internal variable values in more than two places should be refactored to eliminate the duplication.
Also, the Account class is incomplete: it should have a withdraw method to mirror the deposit method. The ChequingAccount would then overload Account's withdraw method to take a cheque object, such as:
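(A minimal sketch in TypeScript; all names are assumed, and since JavaScript can't overload by parameter type, the cheque-taking variant gets its own name here:)

    class Account {
      #balance = 0; // self-encapsulated: only the accessors touch it directly

      getBalance(): number { return this.#balance; }
      setBalance(value: number): void { this.#balance = value; } // a transaction history could hook in here

      deposit(amount: number): void { this.setBalance(this.getBalance() + amount); }
      withdraw(amount: number): void { this.setBalance(this.getBalance() - amount); }
    }

    interface Cheque { amount(): number; }

    class ChequingAccount extends Account {
      withdrawCheque(cheque: Cheque): void {
        super.withdraw(cheque.amount()); // `super.` is the superclass-call syntax
      }
    }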
I am unfamiliar with the JavaScript syntax for calling the superclass, but this revised design is otherwise sound. In this way, the parent class can vary its account-balance implementation (e.g., introduce a transaction history) without affecting its children.
> maybe shouldn't be so quick to throw it away because it doesn't match something
I think this touches on a big difference between science and software. In science, models are not "thrown away" because there is no need to throw them away. Multiple models can exist side-by-side, whereas it seems to me that this is not the case in software.
For example, Newtonian mechanics was great for a really long time. Then relativity came along and was found to be superior to Newtonian mechanics in some situations. Does that mean that we threw out Newtonian mechanics? No! It's still a useful model, regardless of the fact that it's wrong in some cases.
The point I'm making is that biology uses "single inheritance" not because it's right, but because it's useful in some cases. In other cases, where it breaks down, you will be forced to use a different model.
> still uses directed acyclic graphs for classification (e.g., single inheritance) shows just how powerful single inheritance really is
Perhaps I'm missing something, but I think the article makes the point that single inheritance results in a tree structure, not a DAG. That aside, I feel it's unfortunate that he chose to illustrate his point by way of analogy, as it seems some of the arguments here are refuting the analogy and not his original point.
My personal experience has been that as software grows in complexity, IS-A relationships tend to fall apart. Generally this is because although the classes have some characteristics in common, they have only SOME characteristics in common. Further, the more the tree grows in breadth, the fewer shared characteristics there are. Over time, this overlap becomes so small as to utterly rob the IS-A relationship of meaning. Saying X is a Y simply means that (generally for historical reasons) you chose to emphasise the commonality of X and Y over the equally valid relationship that X is a Z. Oh, and often an A and a B.
I say grows in breadth because, with inheritance, a tree growing in depth suggests specialisation in each branch WITHOUT behaviour shared between some (but not all) nodes in different branches of the tree. In practice I've found this happens so infrequently as to be barely worthy of consideration.
>> when you see these kinds of errors it probably means that you're classifying things incorrectly
I accept that this may be entirely true. However, I have seen significant resources (time, mental effort, etc) dedicated to discovering the correct classification - although that presupposes that there is a correct classification, so perhaps I should say more useful classification - of a class hierarchy in a project, and have not been able to find something adequate. Whether this was due to our stupidity, or whether this is because there is indeed no clear way to express the relationship in hierarchical terms, the fact remains that it made inheritance an unsuitable way for us to model our problem domain.
> Often you can fix an IS-A by simply refactoring your graph
Your use of the word 'simply' suggests to the reader that this is an easy undertaking. In practice I have found this is often not the case. Saying that we CAN refactor the inheritance tree doesn't mean that it is simple to do, and I posit that by the time you have learnt enough about your domain to recognise you have modelled it incorrectly, the hierarchy is of such complexity that refactoring it is generally very difficult. Again, I'm not arguing against refactoring, but rather pointing out that refactoring inheritance hierarchies is often difficult.
I think ericHosick (above/below?) says it best:
> In the real world, people can build out classification systems and easily add in edge cases when we find new ways things can be classified. In software, that could lead to a complete change in the software architecture.
My experiences are obviously anecdotal, but since I find that I - and many other software developers I talk to - struggle to make domain modelling with class hierarchies work, his argument that it is inherently flawed rings true.
> That aside, I feel it's unfortunate that he chose to illustrate his point by way of analogy, as it seems some of the arguments here are refuting the analogy and not his original point.
I disagree. I think it's extremely fortunate. You are acting as if you know all the answers - and you may even think you know all the answers - but refusing to consider that biology went through something similar (and I bet there were numerous biologists who wanted to tear down the classification system because it was 'wrong') and arrived at an answer that differs from yours doesn't mean there is no link. You could be wrong too. Maybe we are missing the tools to alter our classifications later, and classification hierarchies are the best solution when given the correct tools.