Thursday, December 04, 2008 2:47 PM
by
RickM
More Cores Require More Abstraction: What Does This Mean for Image Processing?
Compilers and programmers are good at very different things, which is why they must come together to build software. The programmer has the vision and the intention; the compiler keeps track of all of the small machine-related details and optimizations. Unfortunately, this is not an ideal world. At any given time a programmer will be worrying about any number of insignificant platform details while working. However, intelligent platforms that take care of many of these details are on the way, and they are coming hand in hand with functional programming and the many-core revolution.
Introduction
Recently, Steve Hawley built a lambda expression interface on top of our existing ImageCommand infrastructure. However, he ran into a few roadblocks, the two biggest of which were performance and flexibility. It turned out that, given our existing very mutable and very object-oriented infrastructure, it is all but impossible to be flexible enough to implement all of our image commands as lambdas without severe performance costs. These costs came directly from having to pass a great deal of semantic information back and forth about the particular changes to each pixel. I see this as a direct result of trying to graft functional programming onto a very procedural API.
The next big programming revolution will be one of having intelligent systems deal with the details. Things like managing threads and wiring complex objects together will disappear from the lives of the average developer and will instead be left to those developing the platforms those developers use. We will move from describing how to do things at a low level to using abstractions that describe what we want done. All of this is necessary for producing software on the very complex and very distributed hardware platforms that are coming.
Performant Complexity Works Well in Non-Distributed Systems
Our own representation of an image is very complex. By this I mean it has a large number of properties. This is necessary because we have many customers, all of whom have slightly different goals, and we need to be able to support anything they might need to do. They must be able to directly access the underlying memory that represents an image for the fastest image processing possible. This type of implementation is performant on the dual- to quad-core environments people are dealing with today. However, as the number of cores increases, it becomes much more important for processing to distribute quickly than for it to be heavily optimized for a single processor.
This may sound like an awful burden, but in reality it is a huge boon to the average programmer. It means that we will be working at a much higher level of abstraction than we are today. Because of all of these extra cores, we will have platforms that optimize our abstractions on the fly without us having to handle the messy details. This is much like what modern compilers and runtime environments do today, only to a much greater extent.
Mathematica has been successful with an interesting approach to abstraction. Usually in the math world an image bitmap is just an integer matrix. However, in its most recent release Mathematica has taken the abstraction even further: unless you need deeper access, an image is just an image. This allows mathematicians and computer scientists to stop worrying about the underlying code, because much of what is desired is inferred. It’s this kind of innovative development that will be driving the future of our programming languages.
An Inferred World
We are all moving in the direction of inferred programming. By this I mean that systems are getting intelligent enough to make very accurate estimations of what you want without you explicitly asking for it. They may not always be correct, but they are in the vast majority of cases, and when they aren’t you can correct them. For example, F# has very intelligent and well-integrated type inference. I have found that it takes much of the type safety burden off of the programmer while still enforcing compile-time type checking. However, I believe this is just the tip of the iceberg.
Part of the reason for the shift to inference is that functional programming lends itself to this type of analysis. Similarly, if we wish for image processing to move in this direction we must change the level of abstraction we work at in order to provide an environment similarly amenable to analysis. This can be broken down into two parts: how we represent the image and how we deal with processing an image.
The first part of this would be to change how we think about storing images in memory. To do this it will be necessary to design an image property manager which will take care of the messy details for you. In image processing, is it necessary to have a static height, width and pixel format? All of these decisions should be inferred. In a cleverly designed system these properties would be given values behind the scenes in whichever way would lose the least precision. This can be done by deferring the fixing of an image's properties until output time. In other words, lazy decision making. If particular property values are required, they should be stated as constraints instead of procedures.
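As a rough sketch of that lazy decision making, here is a small Python illustration. Everything in it is hypothetical: the DeferredImage class, its fields, and the idea of tracking only bit depth are assumptions made for the example, not part of our API. The point is only that operations record what precision they would like, a constraint states what the output must satisfy, and nothing is fixed until the depth is actually needed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DeferredImage:
    """Hypothetical image description whose pixel depth is decided at output time."""
    source_bits: int                                       # bit depth of the data as loaded
    needed_bits: List[int] = field(default_factory=list)   # depth each pending operation wants
    max_bits: Optional[int] = None                         # optional constraint from the caller

    def apply(self, operation_bits: int) -> "DeferredImage":
        # Record the operation's precision requirement; convert nothing yet.
        self.needed_bits.append(operation_bits)
        return self

    def constrain(self, max_bits: int) -> "DeferredImage":
        # A constraint says what the output must satisfy, not how to get there.
        self.max_bits = max_bits
        return self

    def resolve_bits(self) -> int:
        # Pick the depth that loses the least precision, then honor any constraint.
        best = max([self.source_bits] + self.needed_bits)
        return best if self.max_bits is None else min(best, self.max_bits)


image = DeferredImage(source_bits=8).apply(16).apply(32)
unconstrained = image.resolve_bits()             # nothing forces a narrower format
constrained = image.constrain(16).resolve_bits() # caller requires at most 16 bits
```

A real property manager would have to infer width, height and pixel layout as well, but the shape of the idea is the same: collect requirements, then decide once.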
The second part is to change how we define our image manipulations to be in terms of transformations instead of procedures.
Transformations not Procedures
Along with inference and functional programming will come the idea of using transformations on data instead of procedures. In the Microsoft world, this is mainly evidenced by LINQ. In LINQ one describes the transformation of a data set instead of describing the steps needed to change that data.
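To make the distinction concrete, here is a small sketch in Python rather than LINQ itself. A Python generator expression is only a stand-in for a LINQ query, and the pixel values and the threshold of 50 are made up for the example. The first version spells out the steps; the second describes the result.

```python
pixels = [12, 200, 90, 255, 30]

# Procedural: spell out each step needed to change the data.
inverted_procedural = []
for value in pixels:
    if value > 50:
        inverted_procedural.append(255 - value)
inverted_procedural.sort()

# Transformational: describe the data set we want and leave the stepping
# (and, in a smarter runtime, the scheduling) to the platform.
inverted_described = sorted(255 - value for value in pixels if value > 50)
```

Both produce the same list, but only the second leaves the platform free to decide how the work is carried out.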
Right now our commands for processing an image are defined by a set of ordered steps. This comes directly from the heritage of our API, which sprang from procedural programming. If we were instead to view image processing as a set of transformations, we could do a great number of things to speed up processing.
For instance, many image processing commands can be represented simply as matrices. These matrices can then be combined into a single matrix for almost no cost. The combined matrix can be applied once, with the same result as applying each of the original matrices in turn. In this way many different processing commands can be combined and done all at once. Over any image of significant size, this is much more efficient than doing each command separately.
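A minimal sketch of this in plain Python, assuming each command is a 3x3 matrix acting on an (R, G, B) pixel. The two commands here (halving brightness and swapping the red and blue channels) and the helper names are chosen purely for illustration. Note that a real pipeline that rounds or clamps between steps may not match the combined result exactly.

```python
def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(matrix, pixel):
    """Apply a 3x3 color matrix to one (r, g, b) pixel."""
    return tuple(sum(matrix[i][k] * pixel[k] for k in range(3)) for i in range(3))


halve = [[0.5, 0, 0],
         [0, 0.5, 0],
         [0, 0, 0.5]]
swap_red_blue = [[0, 0, 1],
                 [0, 1, 0],
                 [1, 0, 0]]

# Combining costs one 3x3 multiply, no matter how large the image is.
# The rightmost matrix is applied first: halve, then swap.
combined = matmul(swap_red_blue, halve)

pixels = [(200, 100, 50), (10, 20, 30)]

# Two passes over the image, one per command.
step_by_step = [apply(swap_red_blue, apply(halve, p)) for p in pixels]

# One pass over the image with the combined matrix.
single_pass = [apply(combined, p) for p in pixels]
```

The saving grows with the number of commands: ten matrix commands still collapse into one pass over the pixels.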
Some types of image processing cannot be represented as a single matrix. This could be for a great number of reasons. In some cases it is because they perform a great deal of processing in order to decide on the actual transformation, processing that would be directly affected by previous transformations. In other cases the transformation is not uniform over the entire image. Even so, representing them as transformations would at the very least make them easier to parallelize.
This type of abstraction could also carry significant overhead at the single-core level. However, in the many-core world that is coming, these types of optimizations could produce a result much faster than simple procedural case handling could.
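Here is a sketch of why a non-uniform operation still parallelizes once it is written as a transformation. The darken_by_row function is a made-up example whose effect depends on the row's position, so no single matrix describes it. Because it is a pure function of its inputs, though, the rows can be handed out to workers in any order. The sketch uses Python's ThreadPoolExecutor only to show the structure; threads in CPython will not actually run this arithmetic on several cores at once.

```python
from concurrent.futures import ThreadPoolExecutor


def darken_by_row(row_index, row):
    # Pure function: the output depends only on the arguments, and nothing
    # shared is mutated, so rows can be processed independently.
    return [max(value - row_index * 10, 0) for value in row]


image = [[100, 100],
         [100, 100],
         [100, 5]]

# Hand each row to a worker; map() returns results in the original row order.
with ThreadPoolExecutor(max_workers=3) as pool:
    parallel = list(pool.map(darken_by_row, range(len(image)), image))

# The same transformation applied one row at a time, for comparison.
sequential = [darken_by_row(i, row) for i, row in enumerate(image)]
```

Swapping the thread pool for a process pool, or for a platform that schedules across many cores, would not require changing the transformation itself.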
Conclusion
The programming world is about to change in a way that will make it extremely difficult for even a very savvy programmer to hand-optimize software. Even today an article by Michael Swaine on Dr. Dobb’s Portal proclaimed that it’s time to get ready for this and learn functional programming. Instead of mourning this change, we should embrace it, as it will come along with a new layer of abstraction which will, in the end, make our lives easier and our software more performant. As for image processing, it’s important to keep the ideas of inference and transformation in mind as we head into a multicore future. We need to focus on defining problems in a way that makes them easily parallelizable, instead of focusing on specific CPU-level optimizations.
Yes, bits will be wasted, and so will overall processor time, but get over it. In a hugely parallel system, what counts is the time it takes to achieve the needed result, not the amount of work done to get there. A fivefold performance decrease is a worthwhile parallelization cost when dealing with an order of magnitude increase in the number of processing cores: if the work distributes well, five times the work spread across ten times the cores still finishes in roughly half the time.