Friday 2 September 2011

Rethinking inheritance: promiscuous methods in Minx

TL;DR

Announcing Minx, a statically typed, functional, OOP language which hopes to demonstrate how flexible a static OOP system can be without inheritance or classes. Vaporware alert! The Minx language does not exist in any form at the moment besides this post - it's just a concept, but at the least I hope it will inspire other languages.

  • Design is focused on simplicity and elegance, removing more from OOP than it adds
  • Features are focused on dependency management, with novel systems for isolating and layering functionality
  • Highly decoupled sharing and extension mechanisms
  • All state defaults to being immutable and non-nullable
  • A simple templating system for generic types/functions

About Minx

In a previous post I outlined the features a next generation OOP language would have - that is, a language that includes all the benefits of modern OOP while eliminating a lot of the frustrations. In this post I showcase a candidate language, "Minx", which aims to meet those goals. Minx is named after my wife, Catherine ;).

Minx (noun): An impudent, cunning, or boldly flirtatious girl or young woman

Minx is for you if:

  • You get frustrated at how easily large applications descend into unmaintainable messes, even with the best of intentions
  • You want the flexibility of a dynamic language, but statically typed
  • You believe dependency management is enough of a concern to be worth building into the language
  • You like simple and elegant solutions, and think modern OOP has too many concepts

Example Code

As an example, here's a Minx implementation of the Wikipedia example code for the "Template Method" design pattern. This pattern always struck me as the best cheerleader for classical inheritance because of how well it leverages the "protected" modifier. However, we can do better :). Because dependency management is such a central feature of the language, I've included a multi-file example.

exclude:

#Default rule
*
  exclude Games.Implementation.*

Games.Public.SampleGames.*
  import Games.Implementation.*

Games.Public.minx:

# Standard turn-based games
Game =
  tryInitializeGame {playerCount int}-> bool,
  makePlay {player int} ->,
  endOfGame -> bool,
  printWinner ->

# Games with a concept of multiple rounds of play; The default
# value means any Game can be used as a RoundBasedGame
RoundBasedGame = Game +
  endOfRound : {roundNo int} ->    

RoundBasedGame.playOneGame = {playerCount int} ->
  unless tryInitializeGame(playerCount)
    # report invalid playerCount
    return

  var! playerNo = 0
  var! roundNo = 0
  until endOfGame()
    makePlay playerNo
    playerNo++
    if playerNo == playerCount
      playerNo = 0
      endOfRound roundNo
      roundNo++

  printWinner()

Games.Testing.minx:

newMockGame = {lastMoveNo int, reportError ({error string} -> )} ->

  int!? numPlayers
  int! nextMoveNo = 0

  return {
    tryInitializeGame : {playerCount int} ->
      if numPlayers?
        reportError "initialized game twice"

      numPlayers = playerCount
      return true,

    makePlay : {player int} ->
      unless numPlayers?
        reportError "must initialize game before first move"

      unless nextMoveNo < lastMoveNo
        reportError "moves were attempted after game has ended"

      int nextPlayer = nextMoveNo % numPlayers

      unless nextPlayer == player
        reportError "players called out of sequence"

      nextMoveNo++,

    endOfGame : ->
      unless numPlayers?
        reportError "must initialize game before checking if ended"  

      return nextMoveNo >= lastMoveNo,

    printWinner : ->
      if nextMoveNo < lastMoveNo
        reportError "cannot print winner until game has ended"

    } as Game

Games.Public.Playable.minx:

Playable =
  playOneGame {playerCount int} ->

# helper for next method
{count int}.times = {callback (->)} ->
  callback() for i in [1..count]

Playable.playMultipleGames = {playerCount int, gameCount int} ->
  gameCount.times -> 
    playOneGame playerCount

Playable.playKnockout = {playerCount int} ->
  playOneGame numPlayers for numPlayers in [playerCount..2] if playerCount >= 2

Games.Implementation/Monopoly.minx:

MonopolyGameData = 
  chanceCards ChanceCard[],
  playerPositions int[]! : [],
  isGameOver bool!

MonopolyGameData.tryInitializeGame = {playerCount int} ->
  # Initialize players
  # Initialize money
  return true

MonopolyGameData.makePlay = {player int} ->
  # Process one turn of player

MonopolyGameData.endOfGame = ->
  return isGameOver

MonopolyGameData.printWinner = ->
  # Display who won

newMonopolyGameData = ->
  return { 
    isGameOver bool! : false,
    chanceCards : [...]
  } as MonopolyGameData

Games.Implementation/Chess.minx:

ChessPiece = 
  isBlack -> bool,
  pieceDescription -> string

ChessGameData = 
  board ChessPiece?[]!

ChessGameData.tryInitializeGame = {playerCount int} ->
  return false unless playerCount == 2
  # Initialize players
  # Put the pieces on the board
  return true

ChessGameData.makePlay = {player int} ->
  # Process a turn for the player

ChessGameData.endOfGame = ->
  return isInStaleMate or isInCheckmate

ChessGameData.printWinner = ->
  # Display the winning player

newChessGameData = ->
  return { board : [...] } as ChessGameData

Games.Public.SampleGames.minx:

newChessGame = ->
  return newChessGameData() as Game

newMonopolyGame = ->
  return newMonopolyGameData() as Game

Main.minx:

newMonopolyGame().playMultipleGames {playerCount: 4, gameCount: 2}
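
For readers who want to experiment with this layering today, here is a rough approximation in Python. It is not Minx and makes several assumptions: typing.Protocol stands in for Minx's structural interfaces, a free function stands in for a method bound to an interface, and the optional end_of_round callback plays the role of RoundBasedGame's default. CountingGame is a hypothetical stand-in for a real game.

```python
from typing import Callable, Optional, Protocol


class Game(Protocol):
    """Structural interface: any object with these methods qualifies."""

    def try_initialize_game(self, player_count: int) -> bool: ...
    def make_play(self, player: int) -> None: ...
    def end_of_game(self) -> bool: ...
    def print_winner(self) -> None: ...


def play_one_game(game: Game, player_count: int,
                  end_of_round: Optional[Callable[[int], None]] = None) -> bool:
    """The template method as a free function; returns False if setup fails."""
    if not game.try_initialize_game(player_count):
        return False
    player_no = 0
    round_no = 0
    while not game.end_of_game():
        game.make_play(player_no)
        player_no += 1
        if player_no == player_count:
            player_no = 0
            if end_of_round is not None:
                end_of_round(round_no)
            round_no += 1
    game.print_winner()
    return True


class CountingGame:
    """Hypothetical game that simply ends after a fixed number of moves."""

    def __init__(self, last_move_no: int) -> None:
        self.last_move_no = last_move_no
        self.moves = []

    def try_initialize_game(self, player_count: int) -> bool:
        return player_count >= 1

    def make_play(self, player: int) -> None:
        self.moves.append(player)

    def end_of_game(self) -> bool:
        return len(self.moves) >= self.last_move_no

    def print_winner(self) -> None:
        print("game over after", len(self.moves), "moves")
```

Note that CountingGame never mentions Game: as in Minx, it satisfies the interface purely by having members with the right names and signatures.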

Basic Syntax

Improving code readability is an important cause. The most readable language I've come across is CoffeeScript, so I've pretty much gone with its syntax throughout.

The main change I've made is to make it case-insensitive: the developer should not be delayed because they typed "OkButton" rather than "OKButton". In other languages, case-sensitivity is needed to overload the same term to use it as property, field, and parameter all at once - which Minx addresses with its choice of features instead, i.e. no properties or constructor functions. The coding standards, on the other hand, are PascalCase for interface names, camelCase for everything else (after JavaScript). These will be implemented as compiler warnings rather than errors, as the readability gains from consistency are not worth a direct productivity cost.

Data Structures

Traditional class-based OOP has monolithic classes, consisting of private fields, private and public methods that can access these fields, and constructor functions that initialize instances. This mixes concerns for method call semantics:

  • Only methods which are part of the class's "Single Responsibility" should be defined within the class
  • Only methods defined within the class get obj.method() semantics
  • Therefore only methods which are part of the class's Single Responsibility get desirable semantics; other responsibilities either get second-class semantics, or get erroneously lumped into the class definition, making maintenance harder

Minx replaces classes with data structures containing fields and separate methods that operate on them, after Go. Methods can be defined for any data structure or interface, after C# extension methods, and these methods can be used to satisfy interfaces (structural typing - or "static duck typing"), after Go. Minx takes both Go and C#'s features to their logical conclusion, allowing methods to bind to a data structure, satisfy interfaces, and allow new methods to bind. Compilation will not be fast. For a discussion of the effect this has on encapsulation, see the section on the Open/Closed Principle.

JSON is one of JavaScript's great gifts to the world and is fast becoming the serialization format of choice - for good reason, as it is succinct yet readable. Minx data structures are strongly typed JSON. The {name : value, ... } syntax becomes {name type : value, ... }, with types/values optional in some circumstances. Note we have dropped into the convenient JavaScript parlance of calling data structure fields "properties" (with no ambiguity, as we have ruled out OOP properties). Data Structures can be extended/combined with the + operator, providing they share no members with the same name.

A Minx interface is a data structure with missing values. Any values defined in an interface are defaults - for example, {name: "Minx"} as {name string, age : 0} binds the value to the interface to produce {name : "Minx", age: 0}. Note that the C# as keyword has been re-purposed for compile-time binding, rather than run-time. This system is similar to Self's prototypical inheritance: the duality between classes and object instances is eliminated. The difference is that it is statically typed (like Omega), can be used to hide as well as add members, and has no clean separation between the original and "cloned" or bound object - they are instead different views of the same data.
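
To make the binding rule concrete, here is a small Python sketch of what "as" does to a value. It is only an approximation: Minx would do this at compile time and produce a view of the same data, whereas this hypothetical bind function works at run time and copies. The REQUIRED sentinel is an assumption used to mark interface members that have no default.

```python
from types import MappingProxyType
from typing import Any, Mapping

# Sentinel marking an interface member that has no default value.
REQUIRED = object()


def bind(value: Mapping[str, Any], interface: Mapping[str, Any]) -> Mapping[str, Any]:
    """Return a read-only result containing only the interface's members."""
    bound = {}
    for name, default in interface.items():
        if name in value:
            bound[name] = value[name]      # the value supplies the member
        elif default is not REQUIRED:
            bound[name] = default          # the interface supplies a default
        else:
            raise TypeError(f"missing required member: {name}")
    # members of `value` that the interface does not name are hidden
    return MappingProxyType(bound)


person = bind({"name": "Minx"}, {"name": REQUIRED, "age": 0})
```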

Functions

In Minx, functions are actions (verbs), concerned with 3 categories of data, each represented by an interface:

  • The target of the action (in the grammatical sense - this or Me in other languages). Omit this for standalone functions, rather than methods
  • The parameters of the action. If omitted, this becomes {}, the empty object
  • The result of the action (inferred from the body)

All the properties specified by the target and parameters are loaded into the initial scope when the function is called. To rephrase, we always explode the target into its constituents: there is no equivalent to this for accessing the whole. This means we have to choose an abstraction level to work with at the definition of the function - gone is the carte blanche for mixing field access, low-level method access and high-level method access. This is probably the most opinionated part of Minx: Complex objects should be layered, and every operation should explicitly choose a level to operate at. For instance, the example code is cleanly layered into:

  • Data structure definition
  • Methods which interact directly with the data structure (and together meet the Game interface)
  • Automatic binding to the RoundBasedGame interface
  • playOneGame which orchestrates the elements in the RoundBasedGame interface, meeting the Playable interface
  • Higher-order methods like playKnockout which can act on any Playable

As a result, Minx should hopefully be relatively resistant to big-ball-of-mud disease. If any of these data structures contains a single property, an instance of the type of the property can be supplied instead (assuming no ambiguity). Binding a method to a data structure produces a stand-alone function, with type denoted (parameterType -> resultType). The () can be omitted where unambiguous. Overloading is allowed, but method names within a single namespace must be unique. Minx also has anonymous functions whose "target" is the current scope. These can be used to create an object satisfying an interface on the fly (see newMockGame in the example).

ValidRange =
  min int,
  max int

{value int}.isInRange = ValidRange ->
  return value <= max and value >= min

bool inRange = 5.isInRange {min: 3, max: 7}

# equivalent to:    
bool inRange = {value: 5}.isInRange {min: 3, max: 7}

# also equivalent to
(ValidRange -> bool) isFiveInRange = 5.isInRange
bool inRange = isFiveInRange {min : 3, max : 7}



{title : "", firstName string, lastName string}.getFullName = ->
  if title == ""
    return "#{firstName} #{lastName}" 
  else 
    return "#{title} #{firstName} #{lastName}"

var teacher =
  title : "Miss", 
  firstName : "Agatha", 
  lastName : "Trunchbull", 
  position : "Headmistress"

var student =
  firstName : "Bruce", 
  lastName : "Bogtrotter", 
  likes : "cake"

teacher.getFullName()
student.getFullName()

Various languages have over the years taken different approaches to de-limiting code blocks - {/}, BEGIN/END, Then/End If etc. Eventually, the creators of the language B2 noted that human-readability required that the block itself be indented, and replaced the delimiters with indentation, codifying what was already common practice but not enforced. This is a very important idea - elevating a best practice into a language feature. Just as B2 (or in the mainstream, Python) exchanged indentation freedom for less error-prone block delimiters, we exchange naming freedom for more flexible method-binding. It is a best practice to name consistently, so method binding is reliant upon it.
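
Name-based binding can be imitated dynamically in Python, which may help to picture it. In this sketch (an illustration only, since Minx would resolve all of this statically) the hypothetical call_on helper inspects a function's parameter names and fills them from any record that has fields of the same name. The names lower and upper are assumptions, chosen to avoid shadowing Python's built-in min and max.

```python
import inspect
from functools import partial
from typing import Any, Callable, Mapping


def call_on(target: Mapping[str, Any], method: Callable[..., Any], **params: Any) -> Any:
    """Call `method`, filling its parameters by name from `target`."""
    signature = inspect.signature(method)
    # keep only the fields the method names; everything else stays hidden
    fields = {name: target[name] for name in signature.parameters
              if name in target and name not in params}
    return method(**fields, **params)


def get_full_name(first_name: str, last_name: str, title: str = "") -> str:
    if title == "":
        return f"{first_name} {last_name}"
    return f"{title} {first_name} {last_name}"


def is_in_range(value: int, lower: int, upper: int) -> bool:
    return lower <= value <= upper


teacher = {"title": "Miss", "first_name": "Agatha",
           "last_name": "Trunchbull", "position": "Headmistress"}
student = {"first_name": "Bruce", "last_name": "Bogtrotter", "likes": "cake"}

# binding a method to a target yields a stand-alone function
is_five_in_range = partial(call_on, {"value": 5}, is_in_range)
```

Both records can be passed to get_full_name even though neither was declared with it in mind; the student simply picks up the default title.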

Modifiers

Immutability

Consider C++ and Scheme: two polar opposites of language design, united in the significance of immutability for code safety and compiler optimization. All data in Minx default to being immutable to emphasize immutability as a best practice. You must add the ! modifier to a type to create a mutable instance.
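
For comparison, this is roughly how the same default has to be requested in Python, where the situation is reversed: data is mutable unless you say otherwise. The sketch assumes the standard dataclasses module; Pet and MutablePet are hypothetical names, and Python offers no per-field equivalent of the ! modifier here.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Pet:
    """Immutable, which is what Minx would give you by default."""
    name: str
    age: int = 0


@dataclass
class MutablePet:
    """Mutable: in Minx this is the case that needs the ! modifier."""
    name: str
    age: int = 0


def rename(pet: Pet, new_name: str) -> Pet:
    """Changing a frozen value means building a new one."""
    return replace(pet, name=new_name)
```

Assigning to a field of a Pet raises FrozenInstanceError at run time, whereas Minx is intended to reject the assignment at compile time.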

Nullability

When attempting to parse an integer from text, what should you return if no integer can be extracted? Fundamentally we need to communicate the presence of a "special value", but many languages' best practices disallow this due to historically poor choices of value (e.g. -1) that were the source of many bugs. IMHO, null or "absence of a value" is the correct special value in the majority of cases. But how do we ensure it's intuitive? In Minx all data defaults to non-nullable. By changing the default, the presence of nullability becomes clear documentation of the author's intention to use null as a special value. Add a ? modifier to the end of types to signify nullability - e.g. when parsing integers, int? allows an integer if one can be parsed, or null if not. Use the variable? operator to check whether a nullable value has a value, and variable| to access the value of a non-null nullable.

Aside: hopefully you - like me - look at the examples below and go "Ugh"! This is the intention - that to introduce mutable or nullable state you must uglify your code very slightly; to make wrong code look wrong. This should provide the correct incentive - to avoid either wherever possible, and so be pushed by the syntax towards more correct and efficient programs.

# use "var" to declare a local where the type can be inferred
var a = "a"

# add modifiers to var
var?! c = "c"
if c?
  var d = c|
c = null

# if the type cannot be inferred it must be specified;
# null can be omitted
string? b 

# immutable
var d = {name : "Minx", age : 0}
# compiler error:
d.name = "Chewbacca"

# we must go out of our way to specify "name" and "e" as 
# mutable properties.
var! e = {name string! : "Minx", age : 0}
e.name = "Chewbacca"

# compiler error as age was not specified mutable.
e.age = 1


Counter = 
  value int,
  increment ->

# callers cannot alter the value property directly
# but have the ability to increment it via the method
getCounter = ->
  var! count =
    value int! : 0

  return count + {
    increment ->
      count.value++
  } as Counter
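
The two ideas above, null as an explicit special value and mutable state hidden behind a narrow interface, can be sketched in Python as follows. This is an analogy under stated assumptions rather than a translation: Optional[int] stands in for int?, and a closure stands in for the Counter view, with parse_int and get_counter as hypothetical names.

```python
from typing import Callable, Optional, Tuple


def parse_int(text: str) -> Optional[int]:
    """Return the parsed integer, or None when there is none (Minx: int?)."""
    try:
        return int(text.strip())
    except ValueError:
        return None


def get_counter() -> Tuple[Callable[[], int], Callable[[], None]]:
    """Return (value, increment): callers can read and bump, never assign."""
    count = 0

    def value() -> int:
        return count

    def increment() -> None:
        nonlocal count  # the only place the hidden state is modified
        count += 1

    return value, increment
```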

Dependency Management via Namespacing

In Minx, the unit of functionality is the namespace, which is taken from the combination of filename and relative path; . or / characters in filenames and paths become . characters in namespaces - so the code with namespace System.StringHandling.Uri could be located in any of the following locations:

  • [ProjectDirectory]/System.StringHandling.Uri.minx
  • [ProjectDirectory]/System.StringHandling/Uri.minx
  • [ProjectDirectory]/System/StringHandling.Uri.minx
  • [ProjectDirectory]/System/StringHandling/Uri.minx
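
The mapping from file location to namespace is simple enough to sketch in a few lines of Python. The function name namespace_of is hypothetical, and the sketch assumes paths are given relative to the project directory with forward slashes.

```python
from pathlib import PurePosixPath


def namespace_of(relative_path: str, extension: str = ".minx") -> str:
    """Map a project-relative file path to its namespace."""
    path = PurePosixPath(relative_path)
    if path.suffix != extension:
        raise ValueError(f"not a {extension} file: {relative_path}")
    # drop the extension, then treat "/" exactly like "."
    return str(path.with_suffix("")).replace("/", ".")
```

All four locations listed above map to System.StringHandling.Uri under this rule.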

The choice is left to the developer, and allows projects to slowly evolve in size without at any point forcing them into a structure they are too big/small for. As you may have noticed in the example code, Minx code files are completely unaware of the namespaces they are contained within and are dependent upon. This is the fundamental requirement for dependency management - that your code has no hard-coded dependencies to any one thing. Sure, it refers to particular named interfaces and requires certain methods - but if you supplied alternatives with the same signature it would never know the difference, right? Now we can re-wire deep aspects of our application's functionality just by changing which files have other files in scope.

But how should we specify what files can be in scope? I was inspired by a mailing-list post by Joe Armstrong about the possibilities of a "Key-value database" of Erlang functions. Decomposing classes back into a sea of functions opens the possibility of doing the same with object-oriented code. The "unique names" for each function are the fully qualified (i.e. with the namespace) function names; if we match these names with wildcards we have a basic query system for this database!

import  a.*
exclude a.b.*
import  a.b.c.*
exclude a.b.c.d

This system is very flexible and requires few statements to specify the locus of functionality required. It is a compiler error to specify these in an order where later statements eclipse previous statements (e.g. import a.b.*; exclude a.*;). In the example above, member d in namespace a.b.c is excluded - but the rest of the a.b.c namespace is included. Dependencies are handled by the three configuration files: import, export, and exclude.
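
One possible reading of these rules, sketched in Python: statements are applied in order and the last one whose pattern matches a name decides whether it is in scope. The sketch assumes shell-style wildcards via the standard fnmatch module, and the eclipse check is a simple heuristic (it reads the earlier pattern as if it were a name), not a full pattern-containment test. The function names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import List, Tuple

Rule = Tuple[str, str]  # (action, pattern); action is "import" or "exclude"


def check_order(rules: List[Rule]) -> None:
    """Reject rule lists where a later pattern eclipses an earlier one."""
    for later_index, (_, later) in enumerate(rules):
        for _, earlier in rules[:later_index]:
            # heuristic: the earlier pattern, read as a name, matches the later one
            if fnmatchcase(earlier, later):
                raise ValueError(f"'{later}' eclipses earlier rule '{earlier}'")


def is_in_scope(name: str, rules: List[Rule]) -> bool:
    """The last rule whose pattern matches `name` decides; default is excluded."""
    check_order(rules)
    in_scope = False
    for action, pattern in rules:
        if fnmatchcase(name, pattern):
            in_scope = action == "import"
    return in_scope


RULES = [
    ("import", "a.*"),
    ("exclude", "a.b.*"),
    ("import", "a.b.c.*"),
    ("exclude", "a.b.c.d"),
]
```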

import

import specifies a list of libraries we want to import functionality from, and what namespaces we want to import from them. If omitted, the project imports nothing from any libraries. Let's say I want to import all the functionality from Standard.dll, except for the System.StringHandling.UriEscaping namespace, which I want to import from Alternative.dll. I want to define this in one place so the rest of the application can be unaware of its dependencies - and so I can swap it out with ease when needed. Think about what a system like this can do for dependency injection:

Standard.dll
  import  *
  exclude System.StringHandling.UriEscaping

Alternative.dll
  exclude *
  import  Alternative.AlternativeUriEscaping

export

export controls the output of the project. Usually this will be a single executable or library - but there may be scenarios where that's undesirable. In circumstances where other systems may require you to batch your code up into multiple projects in order to separate out different concerns, Minx allows you to direct different functionality into different targets. It even allows you to inline to the max by incorporating your imported libraries into a single executable if you so wish. At compilation it's verified that your outputs are non-overlapping, have no circular references, and that every output has access to the functionality it is dependent upon (whether this is in another output or one of the imported libraries).

#Default rule
*
  exclude System.*

MyApp.Database.dll
  export MyApp.DataAccess.*

MyApp.BusinessLogic.dll
  export MyApp.Logic.*
  export MyApp.Calculations.*

MyApp.Main.dll
  export MyApp.*
  exclude MyApp.DataAccess.*
  exclude MyApp.Logic.*
  exclude MyApp.Calculations.*
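
The non-overlap check mentioned above could look something like the following Python sketch. It covers only that one check (not circular references or missing dependencies), assumes the last matching rule wins within a target, and uses hypothetical namespace names in the usage below.

```python
from fnmatch import fnmatchcase
from itertools import combinations
from typing import Dict, List, Set, Tuple

Rule = Tuple[str, str]  # (action, pattern); action is "export" or "exclude"


def exported(namespaces: List[str], rules: List[Rule]) -> Set[str]:
    """Namespaces a target receives; the last matching rule decides."""
    result = set()
    for namespace in namespaces:
        included = False
        for action, pattern in rules:
            if fnmatchcase(namespace, pattern):
                included = action == "export"
        if included:
            result.add(namespace)
    return result


def check_non_overlapping(namespaces: List[str],
                          targets: Dict[str, List[Rule]]) -> Dict[str, Set[str]]:
    """Assign namespaces to targets, failing if two targets share one."""
    assignment = {name: exported(namespaces, rules) for name, rules in targets.items()}
    for (a, in_a), (b, in_b) in combinations(assignment.items(), 2):
        shared = in_a & in_b
        if shared:
            raise ValueError(f"{a} and {b} both export {sorted(shared)}")
    return assignment


TARGETS = {
    "MyApp.Database.dll": [("export", "MyApp.DataAccess.*")],
    "MyApp.BusinessLogic.dll": [("export", "MyApp.Logic.*"),
                                ("export", "MyApp.Calculations.*")],
    "MyApp.Main.dll": [("export", "MyApp.*"),
                       ("exclude", "MyApp.DataAccess.*"),
                       ("exclude", "MyApp.Logic.*"),
                       ("exclude", "MyApp.Calculations.*")],
}
```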

exclude

This is where the exciting stuff happens. exclude specifies at the namespace level what code should be isolated from each other. It is an explicit self-documenting architecture for the project:

#Default rule
*
  exclude *

MyApp.Controller.*
MyApp.UI.*
MyApp.Logic.*
MyApp.Database.*
  import  *

MyApp.UI.*
MyApp.Logic.*
MyApp.Database.*
  exclude MyApp.*

MyApp.UI.*
  import MyApp.UI.*

MyApp.Logic.*
  import MyApp.Logic.*

MyApp.Database.*
  import MyApp.Database.*

The first rule ensures that if we mis-type a namespace anywhere, rather than just getting carte blanche to ignore these scoping rules, the feature has access to nothing. Features in the Controller namespace have access to everything, and the UI, Database and Logic layers are completely segregated from everything in the app except themselves. Code always has access to its own namespace but not sub-namespaces, so extra rules are needed. It is a compile error to specify rules in an order where later patterns are less specific than earlier patterns because of the effect on readability.

We can exert far more control than just this however. Want to swap out the sockets implementation throughout the whole application? Done with a couple of lines. Want to exchange a single method in a library for one of your own? Done. Want to mock a component for developer-testing? Again, done. I'm pretty sure that exclude will quickly grow to be huge, in which case it will be necessary to have a namespace system of its own (e.g. MyApp.Database/exclude contains the sub-rules for the Database namespace) and it will be desirable to have different exclude files for different scenarios - but it would take real-world usage to work out how best to extend this.
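
To check my own reading of the layered example above, here is a Python sketch that evaluates it. It assumes each block applies to code whose namespace matches any of the block's subject patterns, and that later matching statements override earlier ones. It does not model the implicit access code has to its own namespace, and the namespaces used to query it are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import List, Tuple

# Each block: (subject patterns, [(action, target pattern), ...])
Block = Tuple[List[str], List[Tuple[str, str]]]

LAYERS = ["MyApp.UI.*", "MyApp.Logic.*", "MyApp.Database.*"]

RULES: List[Block] = [
    (["*"], [("exclude", "*")]),                          # default rule
    (["MyApp.Controller.*"] + LAYERS, [("import", "*")]),
    (LAYERS, [("exclude", "MyApp.*")]),
    (["MyApp.UI.*"], [("import", "MyApp.UI.*")]),
    (["MyApp.Logic.*"], [("import", "MyApp.Logic.*")]),
    (["MyApp.Database.*"], [("import", "MyApp.Database.*")]),
]


def can_see(subject: str, target: str, rules: List[Block] = RULES) -> bool:
    """True if code in `subject` may reference `target`; later rules override."""
    visible = False
    for subjects, statements in rules:
        if not any(fnmatchcase(subject, pattern) for pattern in subjects):
            continue
        for action, pattern in statements:
            if fnmatchcase(target, pattern):
                visible = action == "import"
    return visible
```

Under these assumptions the controller can see the database layer, the UI cannot, and a namespace that matches none of the named layers falls through to the default rule and sees nothing.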

Generics/Templates

Interfaces/methods can be templated through the use of unknown types. "Don't care" types can be represented by a single _ (or omitted where unambiguous), and types that occur in multiple places can be matched with _Type patterns. Moving into the realm of "things I'm pretty sure should be possible", the latter also bring a variable Type into scope which can be queried for meta-programming. To inject source code at compile-time, use a ~{} block in the same way you would inject code into a string with #{}. Unlike C# or C++, the type parameters are inferred from usage:

GenericList = 
  getItem {index int} -> _ItemType,
  setItem {index int, item _ItemType} -> ,
  count -> int

# usage implies method only valid for ItemType = string
GenericList.forEach =
{withEach {item string} ->} ->
  withEach (getItem(i)) for i in [0..count - 1]


# currying functions
setupLocalServer = 
{parameters} -> 
  setupServer parameters + {ipAddress : "127.0.0.1"}


# static "methodMissing" that maps between two interfaces.
Interface1 = 
  name string,
  age int,
  address string

Interface2 = 
  getName -> string,
  getAge -> int,
  getAddress -> string

Interface1._methodName = ->
  return ~{ methodName.substring(3).toLower() }


# flexible method-call semantics
_.isInRange =
_ -> bool
  return value <= max and value >= min

5.isInRange {max: 7, min: 3}

{max: 7, min: 5}.isInRange 6
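
Two of these ideas have rough counterparts in Python, sketched below. The generic list relies on a static type checker to infer the item type from usage, and the getter adapter imitates the "methodMissing" mapping, with the important difference that Python resolves the method name at run time while Minx is meant to do so at compile time. GetterAdapter and the get_ prefix convention are assumptions made for the illustration.

```python
from types import SimpleNamespace
from typing import Any, Callable, Generic, List, TypeVar

T = TypeVar("T")


class GenericList(Generic[T]):
    """Item type T is inferred by a type checker from how the list is used."""

    def __init__(self, items: List[T]) -> None:
        self._items = list(items)

    def get_item(self, index: int) -> T:
        return self._items[index]

    def count(self) -> int:
        return len(self._items)


def for_each(items: GenericList[T], with_each: Callable[[T], None]) -> None:
    for i in range(items.count()):
        with_each(items.get_item(i))


class GetterAdapter:
    """Maps get_name() onto a `name` field, resolved at run time."""

    def __init__(self, record: Any) -> None:
        self._record = record

    def __getattr__(self, method_name: str) -> Callable[[], Any]:
        # only called for attributes that normal lookup did not find
        if not method_name.startswith("get_"):
            raise AttributeError(method_name)
        field = method_name[len("get_"):]
        value = getattr(self._record, field)
        return lambda: value


person = GetterAdapter(SimpleNamespace(name="Minx", age=0, address="unknown"))
```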

Minx and the SOLID design principles

The Single Responsibility Principle

To quote the Wikipedia article on the subject:

...every object should have a single responsibility, and that responsibility should be entirely encapsulated by the class... [Robert C Martin] defines a responsibility as a reason to change, and concludes that a class or module should have one, and only one, reason to change.

Namespaces (i.e. files) are the Minx unit of functionality, so the equivalent rule should be that every namespace should satisfy exactly one concern, and therefore only have a single reason to change.

The Open/Closed Principle

This principle states objects should be open for extension, closed for modification. The traditional interpretation under inheritance is to make class members containing key logic unmodifiable, while allowing others to be overridden so behaviour can be extended. However in Minx where all encapsulation ("closing") is instead achieved via scope, the principle encourages returning abstracted interfaces which hide internals and allow extension by permuting/adding to the visible methods. The traditional approach is epitomized by the Template Method pattern, and I hope my example code demonstrates how Minx allows a much more elegant and flexible implementation.

The Liskov Substitution Principle

This principle states that any object which is a "sub-type" of another should be interchangeable with it, without it affecting program correctness. Minx makes satisfying this principle easy: on supplying an object to a function, any members or metadata which are not part of the requested interface are hidden from scope. There is no dynamic type-checking, so different instances satisfying an interface are completely indistinguishable. This was in fact one of the motivating reasons for the language (see my previous post).

The Interface Segregation Principle

This principle states that interfaces should have fine granularity so your callers are less dependent on changes to your API. Minx's design makes callers immune to this to an extent, as they can create their own sub-interface to depend upon instead. Minx also allows more sophisticated scenarios where methods which act on lower-level interfaces can be used to construct higher-level interface instances, avoiding the need to lump all methods into a single interface.

The Dependency-Inversion Principle

A. High-level modules should not depend on low-level modules. Both should depend on abstractions.

B. Abstractions should not depend upon details. Details should depend upon abstractions.

Minx makes this easier, as it allows splitting functionality into de-coupled layers without compromising on the semantics. Dependency-injection scenarios are trivial using the namespace functionality (see namespaces). Choose the appropriate abstraction level for your purposes.

Undecided

static

Does static modifiable data have any place in a well-constructed application? I want to say "no", because it causes so many problems and muddies a very clean design. However there are bound to be circumstances where it is the best tool for the job. As it stands, Minx allows it with the same visibility rules as all other members. To stop this getting out of hand I'd be tempted to break the symmetry of the design and make modifiable static variables private to their containing file just to rein in the scope for wrong-doing. Kludge.

arrays vs sequences

You will have noticed the use of the [] notation for collections in the example code. I'm toying with making this primitive a memoizing sequence rather than an array as would be traditional in OOP. Then we could define methods that return iterators, or infinite sequences of natural numbers like [1..] while still defining a nice efficient array from 1 to 10 like [1..10]. I'm sure we can find a way for arrays to still be a fast special case in this system, and increase the functional power dramatically at the same time.
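
A memoizing sequence of the kind described can be sketched in Python as below. This is only one possible shape for the idea, under the assumption that items are pulled from an underlying iterator on demand and cached by position; MemoSeq is a hypothetical name, and the array fast path is not modelled.

```python
from itertools import count
from typing import Generic, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


class MemoSeq(Generic[T]):
    """Lazy sequence that remembers every item it has produced so far."""

    def __init__(self, source: Iterable[T]) -> None:
        self._source: Iterator[T] = iter(source)
        self._cache: List[T] = []

    def __getitem__(self, index: int) -> T:
        if index < 0:
            raise IndexError("negative indexes need a known length")
        # pull items from the source only until `index` is available
        while len(self._cache) <= index:
            try:
                self._cache.append(next(self._source))
            except StopIteration:
                raise IndexError(index) from None
        return self._cache[index]

    def cached(self) -> int:
        """Number of items computed so far."""
        return len(self._cache)


naturals = MemoSeq(count(1))        # like [1..], never fully evaluated
one_to_ten = MemoSeq(range(1, 11))  # like [1..10]
```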

Concurrency

I've deliberately avoided discussing concurrency, as it's not something that's come up a lot in my career, so I have no strong views on how to add it in.

Exceptions

I wussed out on exceptions in the end. I seem to be in a minority in thinking that Java was onto something with checked exceptions: exceptions are part of a method's signature, and static typing should be able to ensure that calling code has taken these exceptions into account. You could minimize the burden by inferring what exceptions are thrown, meaning you'd only have to explicitly specify them on virtual method calls, i.e. methods declared on interfaces - but you could declare a method as throwing no exceptions if you wanted the compiler to enforce it for you (e.g. on application entry points). Of course even Java created unchecked exceptions too, as there will always be exceptions that you cannot reasonably anticipate - running out of memory for example - so I relented, rather than cripple the language.

Conclusions

Hopefully I've shown you that statically typed OOP without inheritance is desirable and feasible, and that in general OOP has a lot of fat that can be trimmed without compromising its utility. What do you think? Would you like to write a large-scale application in Minx? Are there any huge flaws/glaring omissions? How would you go about building a system like it - it hasn't escaped my notice that it's bloody hard to build a statically typed language this flexible. Can it be built? If not, do you know of a quick counter-proof?

Sunday 7 August 2011

OOP - the Next Generation

TL;DR

Class-inheritance-based OOP has been widely adopted, despite a mixed reception. I attempt to distill the good and the bad to design a new statically-typed approach to OOP. The semantics of OOP and polymorphism through interfaces are very powerful features which we need to keep; the primary casualties are inheritance, properties, events, and "too many concepts" in general; the problems we need to solve are more flexibly-grained isolation, re-use and extension (all of which are imperfectly addressed by inheritance at present). A follow-up post will discuss "Minx", my candidate language which demonstrates what a next-generation static OOP language might look like.

OOP: the next generation

It is undeniable that statically-typed object-orientation has had a profound impact on the industry, speeding up the development of large systems and helping to make their code more intuitive to the reader. But at what cost, many would say? It seems to me that there are 2 main viewpoints on OOP: one very much in favour, one very much against. In this analysis I hope to tease apart the strengths and weaknesses of OOP; in a follow-up post I will suggest a new approach which hopes to keep all the advantages while ditching the disadvantages.

OOP strengths (inspiration taken from http://wiki.tcl.tk/13398):

Semantics

Analogy is an important concept throughout computing, allowing us to borrow a familiar vocabulary for new ideas. We can create any system we can conceive - but we can use only things that we understand, that we have a mental model of; and so it only becomes useful to create things that we have some real-world analogy for. The semantics of object-orientation allow us to capture this benefit in the development of our systems by introducing a vocabulary of objects which can be "created" and "destroyed", are described by a number of attributes, can have various "states", and can be moved between these states by a series of operations. file.open is effortlessly intuitive, readable and succinct - and therefore easy to maintain.

Encapsulation

In simple cases, standard OOP provides a great way to "hide" the details of how a feature is implemented. In particular it allows us to create hidden data, which can be manipulated only through publicly visible methods. The author can ensure that the public methods keep certain properties of the hidden state invariant, saving the caller from having to worry about it. At the same time, hiding these details allows us to change them at will without breaking any caller code. This model allows for truly fantastic library code that can save a developer hours of time - C++ smart pointers being a great example.

Polymorphism/dynamic dispatch

If our implementation details are abstracted away from the consumer of our code, we can create different objects with an identical set of visible methods and return them interchangeably, dynamically dispatching to the appropriate method at runtime. Our code can now be more flexibly coupled to its dependencies, making change less painful, and code more reusable.

Isolation of separate concerns

A single program needs to deal with many different "concerns" - e.g. the user interface, the back-end database, authentication, business logic, validation, etc. It can be very easy to fall into the trap of muddying the application by mixing the code for different concerns - e.g. writing the validation code in such a way that it is tightly coupled to the current UI: changes in the UI now break the validation. OOP does not enforce anything here - but it does provide an "appropriate place" for each type of code (as member functions of the appropriate class), in the hope that you will be motivated enough to keep your logic separated out.

Extension of existing functionality

Often we find our libraries/frameworks do not provide 100% of what we need, and instead we must extend existing functionality. A scenario I've had to deal with in a number of technologies is creating a toggle button out of a standard push button. Being able to tack a bit more on to an existing object without having to worry about breaking the underlying capabilities is great for productivity.

OOP weaknesses

Not everything is an object

This isn't an issue with all OOP, just the more militant forms where object is the base type in the system and everything must inherit directly or indirectly from it. The most pressing example I'm aware of is the .net framework's static Math object. If ever there was an example of ideology winning out over common sense, it was there - creating a static object with tens of unrelated methods (logarithms, trigonometry, rounding...) rather than admitting that not everything is an object, and putting them in a module where they belong. New systems need to return to supporting, rather than enforcing, object-orientation.

Update: a friend reminded me on the Google+ comments for this post that strictly speaking Math is only a static class from C#'s perspective - the VB.net terminology would indeed be Module.

Inheritance hierarchies create strong coupling

Textbook examples of inheritance highlight the capabilities which have helped us produce more maintainable applications, more quickly - reuse, extension, polymorphism and encapsulation. However these examples are different from real-world applications in one important respect: change. The problems we solve are prone to change, so maintainable code must be prepared for any change to any aspect. However the "is-a" relationship inheritance denotes presents itself as fixed, intrinsic: "Triangle is-a Shape", just like "2 > 1". Inheritance was not conceived with change in mind, and has no place in applications which are. Another side effect of this is the importance of "getting the design right". However, "art is never finished, only abandoned", so we need systems that assume that our initial design will be wrong, and make correcting it easy.

What alternatives are there? Delegation - either via composition or prototypical inheritance - provides a lot of flexibility in extending and re-using functionality. However using composition for extension ends up being very verbose - Peter Norvig discusses here why the Java Properties object inherits rather than delegates: the Java implementation team was rushed, and took the course that required writing less code. This is a significant issue - our languages must encourage safe, maintainable behavior: it must be easy (and quick) to do the right thing. On the other hand I am unaware of disadvantages to statically-typed prototypical inheritance (perhaps due to the limited spread of Omega, the only example I'm aware of). Interfaces provide polymorphism and encapsulation with less coupling; together with delegation we can provide superior functionality to inheritance - we just need a syntax to unlock it.
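To make the verbosity trade-off concrete, here is a minimal Python sketch of the delegating approach. The class `StringProperties` and its method names are hypothetical and are not the Java API; the sketch only assumes that the goal is a string-only property bag built on top of a general-purpose table.

```python
from typing import Dict, Optional


class StringProperties:
    """Hypothetical string-only property bag that delegates to a dict.

    It has-a dict rather than is-a dict, so callers cannot reach
    dict methods that would let non-string values slip in.
    """

    def __init__(self) -> None:
        self._table: Dict[str, str] = {}

    def set_property(self, key: str, value: str) -> None:
        # The wrapper decides what is allowed in; the table stays hidden.
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError("keys and values must be strings")
        self._table[key] = value

    def get_property(self, key: str, default: Optional[str] = None) -> Optional[str]:
        # Forwarded by hand to the wrapped table.
        return self._table.get(key, default)

    def __len__(self) -> int:
        # Also forwarded by hand.
        return len(self._table)


props = StringProperties()
props.set_property("user", "alice")
```

Every operation the wrapper exposes has to be forwarded explicitly, which is exactly the extra typing that makes inheriting so tempting when time is short.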

Insufficient Isolation/Classes are big balls of mud...

Undisciplined applications end up implementing the "big-ball-of-mud anti-pattern" - i.e. having no strict structure and allowing any piece of code to be coupled to any other. This is why dependency management has become such a focus of research, giving rise to approaches like Inversion of Control and Test Driven Development. What mechanisms do modern statically-typed OOP languages give us?

One mechanism is to divide a project into multiple separately-compiled units, taking on the complexity of manually enforcing the dependencies between them. This can create a whole new set of maintenance problems. The other main technique is private/public encapsulation, allowing code to be visible within a given class but invisible outside it. This is a great step forward, but the average class will likely contain members at a number of layers of abstraction:

  • underlying data structure (private fields)
  • primitive operations on the data (private methods)
  • higher operations - there is often a hierarchy of these (public methods).
  • separate concerns that need to interact with the underlying data structure (e.g. serialization, validation, constructor functions that take business objects)

Our current object-orientated systems cannot isolate these - they must all sit in the same class. Often we resort to inheritance hierarchies and the protected modifier to isolate some concerns from the others. Looking outside our classes, we find the public members of classes within an assembly are still a big ball of mud with each other. Object-orientation gave us a bit more granularity, but the original problem remains. I've worked on projects with 30,000-line classes, with circular class dependencies, and with 50+ sub-projects. I recently tried to refactor a 6,000 line class and ended up calling it a day while it was still 4,000 lines - the dependencies were just too hairy to re-map in a reasonable time-frame. Using class and library boundaries to isolate code is a mixing of concerns, and ultimately increases the effort involved in maintaining the application. We need:

  • More flexible ways to achieve the appropriate dependencies for our code, at an appropriate granularity
  • Languages that incentivise layering functionality, rather than putting it all in the same place
  • Languages that make it easy to break up big balls of mud when they occur on any scale.

...but sometimes encapsulation should be breakable

I may sound like I want to have my cake and eat it, but I'm serious: developers need more fine-grained isolation control, while at the same time having easier mechanisms to subvert it when needed. If you're adding a checkbox to a form, 99% of the time you want the default behaviour and you want to be shielded from changing something that could accidentally break it - so we want to be isolated. However the other 1% of the time we need to change the behaviour - and we discover that it is hidden behind an impenetrable wall (especially if you're working with one of the .net framework's sealed classes, and even inheritance isn't available). Isolation exists to improve productivity and stability by helping the developer do the right thing - not to dogmatically enforce any principle, however well-meaning.

... and sometimes encapsulation should be unbreakable

I once came across the following in some C# code I was reading:

public interface IDatabaseWrapper {
  /* technology-agnostic datastore actions*/
}

public class SomeDataReader {

  public void Load(IDatabaseWrapper wrapper) {
    if (wrapper is ConcreteDatabaseWrapper) {
      var concrete = wrapper as ConcreteDatabaseWrapper;
      var a = concrete.somePropertyNotOnTheInterface;

      /* You get the picture*/
    }
  }
}

When I investigated, the story behind this atrocity was a terrible tale of DLL dependency hell - the result of which was that the interface wasn't updated when the datastore class required more members. Perhaps this was just a pragmatic compromise against the time needed to address the dependencies problem, but it completely defeated the point of the original design and increased the maintenance cost of the code.

Notice one thing: it was trivial to completely disregard the supposed encapsulation of the interface when the developer wanted to. All that stood in the way of casting back to the concrete implementation and trampling over the Liskov Substitution principle was the developer's own conscience. This should not be possible.

Again I could fairly be accused of wanting it both ways - after all I have already argued in favour of more flexible and easier-to-bypass isolation. I see however no contradiction: Encapsulation must be possible (perhaps easy) to bypass in the objects you produce; but impossible to bypass in the objects you consume. This preserves the contracts we set up in our code: input parameters of any type satisfying the signature can be supplied interchangeably. This must be unbreakable - and so no run-time type-checking is permissible.

Sometimes state should be publicly exposed

Here's a scenario which I'm sure a lot of you will have come across: You are creating a business object representing a customer, say. You add the private fields - primary key, customer name, customer code, address... but other code will have to make use of this private data, so you have a decision to make:

  1. Make all code that uses this data a public method of your customer object (which will generally violate separation of concerns), or
  2. Allow this data to be accessed publicly.

We don't want to violate separation of concerns, so we choose 2. Now how will we expose the data? Will we make the fields themselves public? No of course we won't because that's "The Wrong Thing To Do" - all fields must be private! Weren't you paying attention at Uni? Far better of course to add public properties to expose all this data. Then our callers will be isolated from changes to our database schema, and we maintain encapsulation over our internal state. Great call, high five!

What a crock of shit. On paper we've encapsulated our state, but in practice all we now have is a public set of properties which are just as tightly bound to the database schema as the private fields. Our mistake was in the first line when we decided we were going to create a traditional business object - i.e. encapsulating the state behind public methods. We'd be much better off admitting that we are dealing with a data structure, and just using a set of public fields. Others have discussed this in far more detail.

Too many concepts

In a previous job, my career progression was linked to passing Microsoft certification exams (but that's another story). After a few exams I noticed a trend in the exam questions: very often they concerned areas of the API where a novice would assume a different syntax/naming convention. Or to re-phrase, areas where the API was not as intuitive as it could have been. Meditating on this I decided that an optimally designed API would be impossible to write an exam for, as the most logical/intuitive choice should always be the correct one, so no knowledge of the technology would be required. Exam questions therefore highlight design inadequacies.

Now consider the standard "programmer knowledge" style interview questions you've likely either been exposed to, or exposed others to. OOP concepts feature prominently in this list - define inheritance, what does the protected modifier do, how can you create a class which only one other class has access to? I posit that it is an inadequacy in the design of OOP that you need to learn so many concepts to use it. Expressed more aptly, What the hell is all that crap? And who cares? I just wanna write a program.

Properties are (90% of the time) an exercise in denial

I'm a big fan of the readability of properties and their intuitive semantics. However most usage that I've come across is a variation of the should-have-used-data-structures example above. There are nice things you can do with them, like lazily-instantiated state, but there is an argument for making all properties like these into methods, as the semantics lead the caller to assume that little (if any) processing is performed on call. So in summary, I'd advocate dropping properties because

  1. Exposing private fields with matching public properties does not encapsulate/future-proof anything
  2. 90% of the time a data structure is more appropriate (and making a conscious decision to use data structures rather than objects can inform other aspects of the design).
  3. The other 10% ("doing something clever") can be achieved with "get" methods instead, which at worst means a slight loss of readability (len = string.getLength() instead of len = string.length) and at best means a clearer signpost that significant computation may be involved in executing the call.
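The three points above can be sketched in a few lines of Python. This is an illustration under assumptions rather than a recommendation for any particular framework: the `Customer` type, its fields and `get_display_label` are hypothetical names, and making the structure immutable is my own choice for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Plain data structure: the public fields are the contract."""

    customer_id: int
    name: str
    code: str

    def get_display_label(self) -> str:
        # An explicit "get" method signals that work happens on each call,
        # unlike a field read, which the caller can assume is free.
        return f"{self.code}: {self.name}"


customer = Customer(customer_id=1, name="Acme Ltd", code="ACM")
```

Callers read `customer.name` directly, with no pretence that a wrapper property would shield them from a schema change, and anything computed is visibly a method call.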

Events are just a special-case of callbacks

My argument here is similar to that for properties: despite the nice semantics, they are fundamentally just a way to expose callbacks - yet they are designed for scenarios where zero-to-many subscribers are possible, and only a single object may make use of the subscribers. In how many scenarios are exactly these constraints desired? A button should have exactly one onClick callback - how many stupid UI bugs would never have occurred if this were enforced by the design? At the other end of the scale, consider DOM-style event bubbling, where a handler placed on an element must respond to events raised on a separate element. When the .net designers added this feature for WPF, they had to create the RoutedEvent system from scratch to meet its needs - standard events just weren't flexible enough. What do we gain from making a decision to support this special-case in the language?
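As a rough sketch of the "exactly one callback" idea, the following Python class refuses a second handler instead of silently accumulating subscribers. `Button`, `set_on_click` and `click` are hypothetical names for illustration, not any real UI toolkit's API.

```python
from typing import Callable, List, Optional


class Button:
    """Hypothetical button that accepts exactly one click handler."""

    def __init__(self) -> None:
        self._on_click: Optional[Callable[[], None]] = None

    def set_on_click(self, handler: Callable[[], None]) -> None:
        # A second registration is treated as a bug, not as a new subscriber.
        if self._on_click is not None:
            raise RuntimeError("on_click handler is already set")
        self._on_click = handler

    def click(self) -> None:
        if self._on_click is not None:
            self._on_click()


clicks: List[str] = []
button = Button()
button.set_on_click(lambda: clicks.append("clicked"))
button.click()
```

A plain callback like this needs no language-level event feature, and a multi-subscriber or bubbling scheme can still be built on top of callbacks where it is genuinely wanted.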

In Summary

  • OOP semantics are great (operations on objects) and need to be retained.
  • Not everything is an object, and we will get the design wrong initially. We need pragmatic languages that allow for this.
  • Existing OOP has too many concepts - we should prune where we can (starting with properties and events).
  • Encapsulation must be weaker when creating instances, and stronger (by forbidding run-time type-checking) when consuming instances.
  • Inheritance needs to go because inheritance hierarchies are too change-resistant.
  • Polymorphism through implementing interfaces is highly change tolerant, and can partially replace inheritance...
  • ... and the rest can be replaced with a concise delegation syntax.
  • We need more isolation mechanisms that we can deploy at whatever granularity is required.
  • We need better ways to work with public data structures (i.e. sets of fields), without pretending they should be private.

These are the qualities a successor to modern static OOP must have. In a follow-up post I'll outline how my candidate language, "Minx", attempts to address these recommendations.

Saturday 11 June 2011

Ninja Passwords

So Sony have been hacked. Again. I was reading an analysis of the password data the responsible (or rather, irresponsible) parties obtained, and was dismayed at the low password quality on display. "letmein", "password", "123456" and the like were out in full force. I find this very sad, because when you create an account on any site, you give it the honour of curating a part of your identity; this may be a social part (e.g. social networking), a financial part (e.g. e-commerce or banking), or an aspect we take less seriously (e.g. the ability to leave comments on a blog). In all of these cases your password is the "front-door key" to that part of your identity.

Therefore you are putting a great deal of faith in every site you sign up with. I'd like to say "do not get an account with any website you don't fully trust" but that would leave a potentially empty list - instead try to sign up with as few as possible. Why would you entrust any minor part of your identity to a website just to access information, for example? This is why I find OpenID/OAuth great solutions - as Jeff Atwood has said before, it's the equivalent of showing your driver's license to show who you are, rather than creating a new shard of your identity with yet another 3rd party.

It also means all your passwords must be strong. Every password controls access to part of your identity - and there are often ways to "upgrade" from one part of a person's identity to another. For example, take Facebook - a lot of people don't see it as the security risk it is, and so are more lax about it. If you've connected your Facebook profile to your family, I can potentially derive your mother's maiden name. By analysing who you've recently contacted, I can probably deduce where you're living. And there are far more creative things beyond these - everyone has a friend who isn't as security-conscious as them (they're the ones with the public profile listing their date of birth and mobile number). They will be immediately obvious, and are now part of your attack surface!

This can be a lot of work - strong, memorable passwords are hard to produce. I want to share with you my method for overcoming the problem, and an example.

I'm not going to discuss password managers, which some people see as the solution to the problem, because

  1. I've never used them myself so can't comment
  2. You still need a master password for the manager
  3. There are times when a password manager can't help you - e.g. on someone else's computer, or for storing the password you need to access your machine in the first place.
  4. I have concerns about storing all my eggs in one basket, however secure

Generating passwords

Last summer my wife and I holidayed in Peru and I knew I'd be using a lot of internet cafes and other insecure locations for accessing my email. To minimise the risk of having my identity compromised, I generated a fresh "holiday password" that I could change back as soon as I got home (Disclaimer - I can't vouch for the efficacy of this method, but it was the best method I could think of - please describe any better ideas in the comments). This password has now been fully retired so I have no concerns about sharing it with you.

1) Start with something very personal; but that isn't obvious

The knife edge you must walk is to pick something important to you, so that you won't forget it; but something that can't be externally guessed, so a malicious attacker can't work it out. Imagine you are Winston from Nineteen Eighty-Four, trying to trick an enemy who can track and record your every waking action. Choosing your spouse's name, date of birth or even the name of a pet is very common, but this information is far too easy to access with the internet - for example I have no doubt that with a bit of googling you can find out where I went to university and what I studied, the name of my spouse, the date of our marriage or pretty much any other significant event in my life.

Update: Apparently this is known as Kerckhoffs's principle

So stick to the small things. Very few people know I'm slightly obsessed with song lyrics for example. More people are aware I'm a fan of Radiohead (I daresay you can work it out from information about gigs I've attended), but fewer again would know that "Paranoid Android" is one of my favourites. In particular one line comes across with great power in the song,

"When I am King you will be first against the wall"

2) Condense it into something quick to type

So now we have the "seed" for our password, we have to reduce it down. This is purely because while longer passwords are more secure than shorter passwords, overlong passwords are too hard to reliably type. The easiest way to reduce a seed is to convert it to an acronym - but anything that you can reliably remember is fair game (second letter of every word? Reverse-order acronym? Whatever works for you).

WIAKYWBFATW

3) Add special characters/Numbers/Capitals in a way that is personal to you

Come up with some substitutions for the letters in your reduced form, the quirkier the better - remember the game we're playing: It must be memorable to you, but totally illogical to a 3rd party. I tend to look at the keyboard and try and imagine alternative meanings for the keys that fit what I want to say. In my example:

  • 1 for "First" and "I"
  • "U" for "you"
  • / for against
  • | for wall
  • lowercase for "am", "will", "be", "the"

Which leaves us with:

W1aKUwb1/t|
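For the curious, steps 2 and 3 can be written down as a small Python sketch. The function and constant names are my own for illustration, and the substitution table simply encodes the list above. It is only a way of checking the transformation: the whole point of the method is that the rules live in your head, not in a file.

```python
LYRIC = "When I am King you will be first against the wall"

# Personal substitutions from the list above, keyed by lowercased word.
SUBSTITUTIONS = {
    "i": "1",
    "first": "1",
    "you": "U",
    "against": "/",
    "wall": "|",
}
# Words whose initial letter is kept in lowercase.
LOWERCASE_WORDS = {"am", "will", "be", "the"}


def acronym(phrase: str) -> str:
    """Step 2: take the first letter of every word, upper-cased."""
    return "".join(word[0].upper() for word in phrase.split())


def personalise(phrase: str) -> str:
    """Step 3: build the acronym again, applying the personal substitutions."""
    out = []
    for word in phrase.split():
        key = word.lower()
        if key in SUBSTITUTIONS:
            out.append(SUBSTITUTIONS[key])
        elif key in LOWERCASE_WORDS:
            out.append(key[0])
        else:
            out.append(word[0].upper())
    return "".join(out)


reduced = acronym(LYRIC)
password = personalise(LYRIC)
```

Here `reduced` is the acronym from step 2 and `password` is the final form from step 3.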

4) Practise!

The worst thing you can do at this point is to go ahead, change your password, and forget about it until you next need it! Make sure you can reliably reproduce the password at least 5 times on the trot (without looking at the screen!). This will help you develop muscle-memory for the pattern, so that eventually you don't even have to think about your password - your fingers know what to do for you.

So we now have a quick method for producing a password which a computer will struggle to brute-force, but which we will never forget, and can reproduce at a moment's notice. Hopefully you will find this a useful tool for protecting your online identity - please let me know your experiences/your own techniques in the comments.

Sunday 29 May 2011

Show your working

I recently came across this post by John D Cook favouring crude software models over complex ones. I completely agree - there comes a point where bells and whistles on models add only false confidence. However I have a slightly different take on it.

Consider weather forecasting: behind the scenes we know complex models are necessarily at work: computing probability distributions, correlating known historical trends with the current data, and running CFD simulations. Yet what is the end product of this complex statistical analysis? Clear-cut maps, yes/no answers. It will rain tomorrow between 1 and 3; Friday will be perfect for a barbecue; there isn't a hurricane on the way. Apparently if complex calculations produce complex answers we have to paper over them and produce artificially simple answers.

What use is a weather report when I don't know which conclusions are sure-as-dammit, and which merely more-likely-than-not? I'm not the only one who finds these false clear-cut answers unhelpful (and disingenuous). My wife is a medical student and has told me a few stories about the various technological aids they work with in the hospital. One is an electronic ECG analyser, that diagnoses whether a patient's readout has a variety of abnormalities. I was immediately enthused - it sounds like a fascinating problem - and I asked how this incredible opus of statistical inference, inspired by the breath of the Gods themselves, has revolutionised their pitiful Medieval healthcare practices: "The seniors recommend we don't trust it".

I was stunned. How could this be? A computer should be ideally placed to perform such a task. However the perception of the end users is that a human is a more reliable interpreter of the ECG. This message struck a chord with systems I've helped write in the past - specifically invoicing systems. In one case we spent 3 months tracking down a variety of bugs where the client believed the total of the invoice was out by a single penny. The net result of these errors was that the client started manually checking the total of every invoice - defeating the point of the system.

So what's the solution?

I'm reminded of a recurrent conversation between myself and pretty much every maths teacher I ever had:
Teacher: You haven't shown any of your workings for any of the answers on this test! If your answer's not correct you could still get partial credit for having the right working.
Me: But were the answers correct?
Teacher: Yes...
Me: So what's the problem?
I was too young (and arrogant) to see it at the time, but the problem was that mistakes are easy to make, and showing your working is a safety net to reduce the consequences when you make them. This is especially true with complex algorithms where probability and statistics are involved.

When our algorithms show their working, and we dispute the answers they produce, we have the option of following the steps through ourselves. It may be that the algorithm made a duff decision, in which case the user can disregard its conclusions - or it may be that the users would have missed something that the computer nailed, in which case this has now turned into a learning exercise.

I was very much persuaded by the arguments in About Face that users relate to programs in similar ways to colleagues. To show your working is to explain your point of view - and be perceived as a source who sometimes has useful things to contribute, but sometimes makes errors. Crucially, however, in either case you give the user the ability to decide whether to trust your judgement. Without this, you are a stranger, and the user has no means of deciding whether to trust you.

There are of course usability issues to consider in this. The "working" must be intelligible - this is not an excuse to return to splurging out pages of output on every button-click. However intelligent solutions can be found. For example in the ECG example, why not produce an output that superimposes onto the received signal a caricature representing how the program has interpreted it? For the invoicing example, why not provide (on click of a button) itemised calculations with sub-totals, exchange rates and rounding policies explicitly described? And I'm still holding out for weather reports with error bars.
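For the invoicing case, a minimal Python sketch of "showing the working" might look like the following. Everything here is an assumption for illustration: the function name `invoice_working`, the tuple layout of the lines, the 20% tax rate and the round-half-up-to-the-penny policy are not taken from any real system.

```python
from decimal import Decimal, ROUND_HALF_UP

PENNY = Decimal("0.01")


def invoice_working(lines, vat_rate):
    """Return (total, working), where working lists every step taken.

    lines is a list of (description, quantity, unit_price) tuples,
    with unit_price and vat_rate given as Decimal values.
    """
    working = []
    subtotal = Decimal("0")
    for description, quantity, unit_price in lines:
        # Round each line to the penny and record how it was derived.
        line_total = (Decimal(quantity) * unit_price).quantize(
            PENNY, rounding=ROUND_HALF_UP
        )
        working.append(f"{description}: {quantity} x {unit_price} = {line_total}")
        subtotal += line_total
    working.append(f"Sub-total: {subtotal}")
    # State the rounding policy next to the figure it affects.
    vat = (subtotal * vat_rate).quantize(PENNY, rounding=ROUND_HALF_UP)
    working.append(f"VAT at {vat_rate} (rounded half-up to the penny): {vat}")
    total = subtotal + vat
    working.append(f"Total: {total}")
    return total, working


total, working = invoice_working(
    [("Widget", 3, Decimal("1.99")), ("Gadget", 1, Decimal("10.00"))],
    vat_rate=Decimal("0.20"),
)
```

The `working` list is what would sit behind the button: a user who disputes a penny can see which line was rounded, in which direction, and under which policy.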

Whatever the mechanism, when we go beyond spitting out a single number and provide a way for our users to verify the algorithm's calculations, we humble our programs, make them more transparent, and this is the root of trust.

Complex algorithms produce complex answers. 
Any attempt to simplify complex answers will involve losing something, often accuracy.
Your algorithms will not always produce the correct results. 
Show your working.


Update: John D. Cook provided some thoughts via twitter:
... I emphasize to students that their work must be PERSUASIVE and not just correct if they want it to be used
If the real goal is to persuade the user, showing your working is just one route to this - and if implemented poorly (e.g. splurging out pages of text) then it may fail to make a good enough case. We are putting forward an argument to the user, and the considerations for prose arguments (brevity, clarity etc) still apply.

Wednesday 23 September 2009

Energy cost: reading on-screen vs on-paper

As an IT professional a lot of my training material tends to come in the form of electronic documents, typically PDFs or web pages. The easiest option is to print it out so I can read it at leisure in bed, on the train, in the garden etc. However I'm always conscious of the paper that gets wasted this way, so I opt for what feels like the "green" option of reading it electronically.
Recently it occurred to me that having my dual-core 2.8GHz computer running while I read a document had the potential to be far more damaging to the environment than printing the document out, so I set out to calculate which was worse, using the best resources I could find. If anyone finds more accurate numbers, please let me know.
To make the comparison I will try to estimate the embodied energy cost of a single printed page of A4, and compare that with the energy used by my computer while running for the length of time it takes to read the page. I'm only addressing the marginal costs of course, e.g. I'm not considering the energy it took to create the computer or printer, and amortising that over the number of pages printed/read.
So what is the embodied energy cost of a single printed page of A4?
A University databook put the embodied energy of "Paper" (which I have assumed to be bleached white printing paper) at 36.4MJ/kg. So for 80gsm (grams/square metre) copy paper this comes to 2.9 MJ / square metre. A4 paper has an area of 1/16th of a square metre, so a single sheet of A4 has an embodied energy of around 180kJ.
How much energy does it take to run your computer?
Figures vary wildly between processors and monitors, so I've taken a rough estimate at 200W. Hopefully this should be an upper-end estimate for a standard desktop system. However, electricity generation is far from 100% efficient. Coal-fired power stations (which for simplicity we assume are the only source of UK electricity) had an energy efficiency of 36% in 2008, so that 200W used requires 560W of fuel use. At 560W, it takes about 320 seconds to clock up 180kJ, or a little over 5 minutes.
So allowing all those caveats and approximations, if you're going to spend more than 5 minutes reading each page, the environment would be better off if you printed it out and switched off your desktop.
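The arithmetic above can be re-run as a short Python sketch, which makes it easy to plug in better figures. The constant names are my own; the values are the ones quoted in the post.

```python
PAPER_ENERGY_MJ_PER_KG = 36.4     # embodied energy of paper (databook figure)
PAPER_DENSITY_KG_PER_M2 = 0.080   # 80gsm copy paper
A4_AREA_M2 = 1 / 16               # A4 is one sixteenth of a square metre
COMPUTER_POWER_W = 200            # rough upper-end desktop estimate
GENERATION_EFFICIENCY = 0.36      # coal-fired efficiency quoted for 2008

# Embodied energy of one sheet, converted from MJ to J.
sheet_energy_j = (
    PAPER_ENERGY_MJ_PER_KG * PAPER_DENSITY_KG_PER_M2 * A4_AREA_M2 * 1e6
)
# Fuel power needed at the power station to deliver the computer's draw.
fuel_power_w = COMPUTER_POWER_W / GENERATION_EFFICIENCY
# Reading time at which the computer has used as much energy as the sheet.
break_even_s = sheet_energy_j / fuel_power_w

print(f"Energy per A4 sheet: {sheet_energy_j / 1000:.0f} kJ")
print(f"Fuel power for the computer: {fuel_power_w:.0f} W")
print(f"Break-even: {break_even_s:.0f} s ({break_even_s / 60:.1f} min)")
```

Carrying the unrounded intermediate values through gives a break-even slightly higher than the rounded figures in the text, at roughly five and a half minutes, which does not change the conclusion.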