Sunday 7 August 2011

OOP - the Next Generation

TL;DR

Class-inheritance-based OOP has been widely adopted, despite a mixed reception. I attempt to distill the good and the bad to design a new statically-typed approach to OOP. The semantics of OOP and polymorphism through interfaces are very powerful features which we need to keep; the primary casualties are inheritance, properties, events, and "too many concepts" in general; the problems we need to solve are more flexibly-grained isolation, re-use and extension (all of which are imperfectly addressed by inheritance at present). A follow-up post will discuss "Minx", my candidate language which demonstrates what a next-generation static OOP language might look like.

OOP: the next generation

It is undeniable that statically-typed object-orientation has had a profound impact on the industry, speeding up the development of large systems and helping to make their code more intuitive to the reader. But many would ask: at what cost? It seems to me that there are two main viewpoints on OOP: one very much in favour, one very much against. In this analysis I hope to tease apart the strengths and weaknesses of OOP; in a follow-up post I will suggest a new approach which hopes to keep all the advantages while ditching the disadvantages.

OOP strengths (inspiration taken from http://wiki.tcl.tk/13398):

Semantics

Analogy is an important concept throughout computing, allowing us to borrow a familiar vocabulary for new ideas. We can create any system we can conceive - but we can use only things that we understand, that we have a mental model of; and so it only becomes useful to create things that we have some real-world analogy for. The semantics of object-orientation allow us to capture this benefit in the development of our systems by introducing a vocabulary of objects which can be "created" and "destroyed", are described by a number of attributes, can have various "states", and can be moved between these states by a series of operations. file.open is effortlessly intuitive, readable and succinct - and therefore easy to maintain.

Encapsulation

In simple cases, standard OOP provides a great way to "hide" the details of how a feature is implemented. In particular it allows us to create hidden data, which can be manipulated only through publicly visible methods. The author can ensure that the public methods keep certain properties of the hidden state invariant, saving the caller from having to worry about it. At the same time, hiding these details allows us to change them at will without breaking any caller code. This model allows for truly fantastic library code that can save a developer hours of time - C++ smart pointers being a great example.
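To make the idea of hidden state with a protected invariant concrete, here is a minimal C# sketch. The BoundedCounter class and its members are hypothetical names of my own, chosen only for illustration; the point is simply that callers cannot reach the private field, so the invariant 0 <= count <= capacity is the author's problem and nobody else's.

```csharp
using System;

// Hypothetical example: 'count' is reachable only through methods that
// keep 0 <= count <= capacity true at all times.
public sealed class BoundedCounter
{
    private readonly int capacity;
    private int count; // hidden state, starts at 0

    public BoundedCounter(int capacity)
    {
        if (capacity < 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        this.capacity = capacity;
    }

    public int Current() => count;

    // Returns false instead of letting the count exceed the capacity.
    public bool TryIncrement()
    {
        if (count == capacity) return false;
        count++;
        return true;
    }

    // Returns false instead of letting the count drop below zero.
    public bool TryDecrement()
    {
        if (count == 0) return false;
        count--;
        return true;
    }
}
```

Because the field is private, the representation could later change (say, to a long) without touching any caller.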

Polymorphism/dynamic dispatch

If our implementation details are abstracted away from the consumer of our code, we can create different objects with an identical set of visible methods and return them interchangeably, dynamically dispatching to the appropriate method at runtime. Our code can now be more flexibly coupled to its dependencies, making change less painful, and code more reusable.
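As a rough illustration, assuming nothing beyond a hypothetical IShape interface and two made-up implementations, the caller below is written purely against the interface, and which Area() actually runs is decided at run time:

```csharp
using System;

public interface IShape
{
    double Area();
}

public sealed class Circle : IShape
{
    private readonly double radius;
    public Circle(double radius) { this.radius = radius; }
    public double Area() => Math.PI * radius * radius;
}

public sealed class Rectangle : IShape
{
    private readonly double width;
    private readonly double height;
    public Rectangle(double width, double height) { this.width = width; this.height = height; }
    public double Area() => width * height;
}

public static class Shapes
{
    // Returns either implementation behind the same interface.
    public static IShape Make(string kind) =>
        kind == "circle" ? (IShape)new Circle(1.0) : new Rectangle(2.0, 3.0);

    // Depends only on IShape, so new shapes need no change here.
    public static double TotalArea(IShape[] shapes)
    {
        double total = 0;
        foreach (var shape in shapes) total += shape.Area();
        return total;
    }
}
```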

Isolation of separate concerns

A single program needs to deal with many different "concerns" - e.g. the user interface, the back-end database, authentication, business logic, validation, etc. It can be very easy to fall into the trap of muddying the application by mixing the code for different concerns - e.g. writing the validation code in such a way that it is tightly coupled to the current UI: changes in the UI now break the validation. OOP does not enforce anything here - but it does provide an "appropriate place" for each type of code (as member functions of the appropriate class), in the hope that you will be motivated enough to keep your logic separated out.

Extension of existing functionality

Often we find our libraries/frameworks do not provide 100% of what we need, and instead we must extend existing functionality. A scenario I've had to deal with in a number of technologies is creating a toggle button out of a standard push button. Being able to tack a bit more on to an existing object without having to worry about breaking the underlying capabilities is great for productivity.

OOP weaknesses

Not everything is an object

This isn't an issue with all OOP, just the more militant forms where object is the base type in the system and everything must inherit directly or indirectly from it. The most pressing example I'm aware of is the .net framework's static Math object. If ever there was an example of ideology winning out over common sense, it was there - creating a static object with tens of unrelated methods (logarithms, trigonometry, rounding...) rather than admitting that not everything is an object, and putting them in a module where they belong. New systems need to return to supporting, rather than enforcing, object-orientation.

Update: a friend reminded me in the Google+ comments for this post that, strictly speaking, Math is only a static class from C#'s perspective - the VB.net terminology would indeed be Module.

Inheritance hierarchies create strong coupling

Textbook examples of inheritance highlight the capabilities which have helped us produce more maintainable applications, more quickly - reuse, extension, polymorphism and encapsulation. However these examples are different from real-world applications in one important respect: change. The problems we solve are prone to change, so maintainable code must be prepared for any change to any aspect. However the "is-a" relationship that inheritance denotes presents itself as fixed and intrinsic: "Triangle is-a Shape", just like "2 > 1". Inheritance was not conceived with change in mind, and has no place in applications which are. Another side effect of this is the importance of "getting the design right". However, "art is never finished, only abandoned", so we need systems that assume that our initial design will be wrong, and make correcting it easy.

What alternatives are there? Delegation - either via composition or prototypical inheritance - provides a lot of flexibility in extending and re-using functionality. However using composition for extension ends up being very verbose - Peter Norvig discusses why the Java Properties object inherits rather than delegates: the Java implementation team was rushed, and took the course that required writing less code. This is a significant issue - our languages must encourage safe, maintainable behaviour: it must be easy (and quick) to do the right thing. On the other hand I am unaware of disadvantages to statically-typed prototypical inheritance (perhaps due to the limited spread of Omega, the only example I'm aware of). Interfaces provide polymorphism and encapsulation with less coupling; together with delegation they can provide functionality superior to inheritance - we just need a syntax to unlock it.
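To show both the benefit and the verbosity, here is a small C# sketch of delegation by composition. IStringSettings and StringSettings are hypothetical names of my own, loosely in the spirit of the Properties example rather than a port of it. The wrapper exposes only string operations, where inheriting from the dictionary would have leaked its entire public surface, but every forwarded member has to be written out by hand.

```csharp
using System.Collections.Generic;

// Hypothetical interface: the only operations callers may rely on.
public interface IStringSettings
{
    string Get(string key, string fallback);
    void Set(string key, string value);
    int Size();
}

// Delegates to a private dictionary instead of inheriting from it.
public sealed class StringSettings : IStringSettings
{
    private readonly Dictionary<string, string> inner = new Dictionary<string, string>();

    // Each member below is pure forwarding boilerplate; this is the
    // cost that a concise delegation syntax would remove.
    public string Get(string key, string fallback) =>
        inner.TryGetValue(key, out var value) ? value : fallback;

    public void Set(string key, string value) => inner[key] = value;

    public int Size() => inner.Count;
}
```

Three one-line forwards are tolerable; a wrapper around a type with dozens of members is where the temptation to inherit instead becomes strong.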

Insufficient Isolation/Classes are big balls of mud...

Undisciplined applications end up implementing the "big-ball-of-mud anti-pattern" - i.e. having no strict structure and allowing any piece of code to be coupled to any other. This is why dependency management has become such a focus of research, giving rise to approaches like Inversion of Control and Test Driven Development. What mechanisms do modern statically-typed OOP languages give us?

One mechanism is to divide a project into multiple separately-compiled units, taking on the complexity of manually enforcing the dependencies between them. This can create a whole new set of maintenance problems. The other main technique is private/public encapsulation, allowing code to be visible within a given class but invisible outside it. This is a great step forward, but the average class will likely contain members at a number of layers of abstraction:

  • underlying data structure (private fields)
  • primitive operations on the data (private methods)
  • higher operations - there is often a hierarchy of these (public methods)
  • separate concerns that need to interact with the underlying data structure (e.g. serialization, validation, constructor functions that take business objects)

Our current object-orientated systems cannot isolate these - they must all sit in the same class. Often we resort to inheritance hierarchies and the protected modifier to isolate some concerns from the others. Looking outside our classes, we find the public members of classes within an assembly are still a big ball of mud with each other. Object-orientation gave us a bit more granularity, but the original problem remains. I've worked on projects with 30,000-line classes, with circular class dependencies, and with 50+ sub-projects. I recently tried to refactor a 6,000 line class and ended up calling it a day while it was still 4,000 lines - the dependencies were just too hairy to re-map in a reasonable time-frame. Using class and library boundaries to isolate code is a mixing of concerns, and ultimately increases the effort involved in maintaining the application. We need:

  • More flexible ways to achieve the appropriate dependencies for our code, at an appropriate granularity
  • Languages that incentivise layering functionality, rather than putting it all in the same place
  • Languages that make it easy to break up big balls of mud when they occur on any scale.

...but sometimes encapsulation should be breakable

I may sound like I want to have my cake and eat it, but I'm serious: developers need more fine-grained isolation control, while at the same time having easier mechanisms to subvert it when needed. If you're adding a checkbox to a form, 99% of the time you want the default behaviour and you want to be shielded from changing something that could accidentally break it - so you want to be isolated. However the other 1% of the time you need to change the behaviour - and you discover that it is hidden behind an impenetrable wall (especially if you're working with one of the .net framework's sealed classes, where even inheritance isn't available). Isolation exists to improve productivity and stability by helping the developer do the right thing - not to dogmatically enforce any principle, however well-meaning.

... and sometimes encapsulation should be unbreakable

I once came across the following in some C# code I was reading:

public interface IDatabaseWrapper {
  /* technology-agnostic datastore actions*/
}

public class SomeDataReader {

  public void Load(IDatabaseWrapper wrapper) {
    if (wrapper is ConcreteDatabaseWrapper) {
      var concrete = wrapper as ConcreteDatabaseWrapper;
      var a = concrete.somePropertyNotOnTheInterface;

      /* You get the picture*/
    }
  }
}

When I investigated, the story behind this atrocity was a terrible tale of DLL dependency hell - the result of which was that the interface wasn't updated when the datastore class required more members. Perhaps this was just a pragmatic compromise against the time needed to address the dependencies problem, but it completely defeated the point of the original design and increased the maintenance cost of the code.

Notice one thing: it was trivial to completely disregard the supposed encapsulation of the interface when the developer wanted to. All that stood in the way of casting back to the concrete implementation and trampling over the Liskov Substitution Principle was the developer's own conscience. This should not be possible.

Again I could fairly be accused of wanting it both ways - after all I have already argued in favour of more flexible and easier-to-bypass isolation. I see however no contradiction: Encapsulation must be possible (perhaps easy) to bypass in the objects you produce; but impossible to bypass in the objects you consume. This preserves the contracts we set up in our code: input parameters of any type satisfying the signature can be supplied interchangeably. This must be unbreakable - and so no run-time type-checking is permissible.
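For contrast with the snippet above, here is a sketch of what honouring the contract might look like: the missing member is promoted onto the interface, so the consumer never inspects the run-time type. DatastoreLabel and InMemoryDatabaseWrapper are hypothetical names I have made up for illustration; the original code's real members are not shown here.

```csharp
using System;

public interface IDatabaseWrapper
{
    // Hypothetical member, standing in for whatever the reader needed
    // from the concrete class; it now lives on the interface.
    string DatastoreLabel();
}

public sealed class ConcreteDatabaseWrapper : IDatabaseWrapper
{
    public string DatastoreLabel() => "sql";
}

// A second implementation, to show that any conforming type will do.
public sealed class InMemoryDatabaseWrapper : IDatabaseWrapper
{
    public string DatastoreLabel() => "memory";
}

public class SomeDataReader
{
    // No 'is' or 'as': every implementation is treated interchangeably.
    public string Load(IDatabaseWrapper wrapper) =>
        "loaded from " + wrapper.DatastoreLabel();
}
```

C# itself does nothing to stop the downcast, of course; this only shows the shape of code that a stricter language would force on us.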

Sometimes state should be publicly exposed

Here's a scenario which I'm sure a lot of you will have come across: You are creating a business object representing a customer, say. You add the private fields - primary key, customer name, customer code, address... but other code will have to make use of this private data, so you have a decision to make:

  1. Make all code that uses this data a public method of your customer object (which will generally violate separation of concerns), or
  2. Allow this data to be accessed publicly.

We don't want to violate separation of concerns, so we choose 2. Now how will we expose the data? Will we make the fields themselves public? No of course we won't because that's "The Wrong Thing To Do" - all fields must be private! Weren't you paying attention at Uni? Far better of course to add public properties to expose all this data. Then our callers will be isolated from changes to our database schema, and we maintain encapsulation over our internal state. Great call, high five!

What a crock of shit. On paper we've encapsulated our state, but in practice all we now have is a public set of properties which are just as tightly bound to the database schema as the private fields. Our mistake was in the first line when we decided we were going to create a traditional business object - i.e. encapsulating the state behind public methods. We'd be much better off admitting that we are dealing with a data structure, and just using a set of public fields. Others have discussed this in far more detail.
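A minimal sketch of the data-structure alternative, assuming a hypothetical CustomerRecord with a handful of fields: the record is honest about being plain data, and a separate concern (here a made-up mailing-label formatter) lives outside it rather than as a method on the customer.

```csharp
// Hypothetical plain data structure: public fields, no pretence of
// encapsulation, no behaviour.
public sealed class CustomerRecord
{
    public int Id;
    public string Name = "";
    public string Code = "";
    public string Address = "";
}

// A separate concern kept outside the data structure.
public static class CustomerFormatting
{
    public static string MailingLabel(CustomerRecord customer) =>
        customer.Name + "\n" + customer.Address;
}
```

Nothing here is any less coupled to the schema than the property-wrapped version would be; it is just less code pretending otherwise.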

Too many concepts

In a previous job, my career progression was linked to passing Microsoft certification exams (but that's another story). After a few exams I noticed a trend in the exam questions: very often they concerned areas of the API where a novice would assume a different syntax/naming convention. Or to re-phrase, areas where the API was not as intuitive as it could have been. Meditating on this I decided that an optimally designed API would be impossible to write an exam for, as the most logical/intuitive choice should always be the correct one, so no knowledge of the technology would be required. Exam questions therefore highlight design inadequacies.

Now consider the standard "programmer knowledge" style interview questions you've likely either been exposed to, or exposed others to. OOP concepts feature prominently in this list - define inheritance, what does the protected modifier do, how can you create a class which only one other class has access to? I posit that it is an inadequacy in the design of OOP that you need to learn so many concepts to use it. Expressed more aptly: "What the hell is all that crap? And who cares? I just wanna write a program."

Properties are (90% of the time) an exercise in denial

I'm a big fan of the readability of properties and their intuitive semantics. However most usage that I've come across is a variation of the should-have-used-data-structures example above. There are nice things you can do with them, like lazily-instantiated state, but there is an argument for making all properties like these into methods, as the semantics lead the caller to assume that little (if any) processing is performed on call. So in summary, I'd advocate dropping properties because

  1. Exposing private fields with matching public properties does not encapsulate/future-proof anything
  2. 90% of the time a data structure is more appropriate (and making a conscious decision to use data structures rather than objects can inform other aspects of the design).
  3. The other 10% ("doing something clever") can be achieved with "get" methods instead, which at worst means a slight loss of readability (len = string.getLength() instead of len = string.length) and at best means a clearer signpost that significant computation may be involved in executing the call.
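The "doing something clever" case from point 3 might look like the following sketch, where a hypothetical TextDocument class computes its word count lazily behind a get method rather than a property. The class and method names are mine, invented for illustration.

```csharp
using System;

public sealed class TextDocument
{
    private readonly string text;
    private int? wordCount; // null until first requested

    public TextDocument(string text) { this.text = text; }

    // A method rather than a property: the call syntax hints that
    // real work may happen here (at least on the first call).
    public int GetWordCount()
    {
        if (wordCount == null)
        {
            var separators = new[] { ' ', '\n', '\t' };
            wordCount = text.Split(separators, StringSplitOptions.RemoveEmptyEntries).Length;
        }
        return wordCount.Value;
    }
}
```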

Events are just a special-case of callbacks

My argument here is similar to that for properties: despite the nice semantics, they are fundamentally just a way to expose callbacks - yet they are designed for scenarios where zero-to-many subscribers are possible, and only a single object may make use of the subscribers. In how many scenarios are exactly these constraints desired? A button should have exactly one onClick callback - how many stupid UI bugs would never have occurred if this were enforced by the design? At the other end of the scale, consider DOM-style event bubbling, where a handler placed on an element must respond to events raised on a separate element. When the .net designers added this feature for WPF, they had to create the RoutedEvent system from scratch to meet its needs - standard events just weren't flexible enough. What do we gain from making a decision to support this special-case in the language?
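As a sketch of the "exactly one callback" design, assuming a hypothetical stand-alone Button class that has nothing to do with any real UI framework: the single handler is demanded by the constructor, so a button with zero handlers or with two competing handlers cannot be constructed.

```csharp
using System;

// Hypothetical button: exactly one click callback, fixed at construction.
public sealed class Button
{
    private readonly Action onClick;

    public Button(Action onClick)
    {
        // Reject a missing handler up front rather than at click time.
        this.onClick = onClick ?? throw new ArgumentNullException(nameof(onClick));
    }

    // Simulates the user clicking the button.
    public void Click() => onClick();
}
```

A caller who really does want several subscribers can still pass a callback that fans out to them; the difference is that the multi-subscriber case becomes an explicit choice rather than the default.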

In Summary

  • OOP semantics are great (operations on objects) and need to be retained.
  • Not everything is an object, and we will get the design wrong initially. We need pragmatic languages that allow for this.
  • Existing OOP has too many concepts - we should prune where we can (starting with properties and events).
  • Encapsulation must be weaker when creating instances, and stronger (by prohibiting run-time type-checking) when consuming instances.
  • Inheritance needs to go because inheritance hierarchies are too change-resistant.
  • Polymorphism through implementing interfaces is highly change tolerant, and can partially replace inheritance...
  • ... and the rest can be replaced with a concise delegation syntax.
  • We need more isolation mechanisms that we can deploy at whatever granularity is required.
  • We need better ways to work with public data structures (i.e. sets of fields), without pretending they should be private.

These are the qualities a successor to modern static OOP must have. In a follow-up post I'll outline how my candidate language, "Minx", attempts to address these recommendations.