Saturday, May 30, 2009

Poor-man's Testing Cluster

One of the biggest problems in embedded software development is how to test effectively while the final target hardware either does not exist yet or is too rare, unwieldy, or expensive to give every software developer unlimited, unrestricted access to it.

Typical approaches include simulating the target in software on a general-purpose computer or finding a suitable stand-in for the final hardware platform.

Simulating by running the target OS on another CPU architecture (e.g. a PowerPC target on an Intel-based PC) cannot reproduce some of the important differences between CPU architectures, such as endianness and memory alignment. Simulation that includes the target processor architecture can be very slow, unless the speed difference between the target platform (e.g. a low-powered mobile device) and the host platform (e.g. a standard desktop PC or server) is large enough to make the experience usable for developers, as it is for the Android emulator included in the SDK.

The often simpler and more reliable alternative is to find a platform that is close to the final target, or to deliberately choose the target CPU complex to be close to some standardized platform. Vendors of embedded CPUs often sell functional evaluation boards with reference designs for their chipsets, but those tend to be expensive because of the small manufacturing volumes.

For our project at Xebeo Communications, we deliberately chose an x86-based controller architecture, even though this made life harder for the hardware and manufacturing/logistics teams: for various reasons, building a small-volume x86 platform is not as easy as with other CPU vendors and architectures that explicitly support the embedded market.

But using a controller close to a standard PC allowed us to run a stripped-down FreeBSD kernel as the target OS and to build and test most of the software first on any standard PC, such as our Linux-based desktops. By using an OS abstraction layer, we were able to run and test on most Unix-like operating systems, verifying that the application software would be portable to other architectures if need be.
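The idea behind such an abstraction layer can be illustrated with a minimal sketch (the names here are illustrative, not from the original project): application code calls only thin `os_*` wrappers, and each supported OS provides one small implementation file. Shown here is a POSIX implementation, so the same application code builds unchanged on FreeBSD, Linux, Solaris and the like.

```c
/* Sketch of a thin OS abstraction layer (hypothetical names).
 * Application code never touches pthreads directly; to port to a
 * non-POSIX OS, only these wrappers need a new implementation. */
#include <pthread.h>

typedef struct { pthread_mutex_t impl; } os_mutex_t;

int os_mutex_init(os_mutex_t *m)   { return pthread_mutex_init(&m->impl, NULL); }
int os_mutex_lock(os_mutex_t *m)   { return pthread_mutex_lock(&m->impl); }
int os_mutex_unlock(os_mutex_t *m) { return pthread_mutex_unlock(&m->impl); }

/* Example "application" code: portable because it only uses the os_* API. */
int bump_counter(os_mutex_t *m, int *counter) {
    os_mutex_lock(m);
    (*counter)++;
    os_mutex_unlock(m);
    return *counter;
}
```

The wrappers cost essentially nothing at runtime, but keep every OS-specific call in one place, which is what makes a "build anywhere, test anywhere" workflow practical.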

To make testing even more realistic, we started building low-cost eval boards from the cheapest commodity PC parts we could find at discount online stores like TigerDirect:
  • low-end PC motherboard with matching low-end Intel CPU (Celeron or similar)
  • cheapest matching memory configuration as needed
  • cheapest PC power supply
  • CPU heatsink with fan
  • PCI Ethernet card with the same chipset as our design
  • compact-flash-to-IDE adapter
  • small compact flash card for the network bootloader
  • rubber feet and screws fitting the motherboard mounting holes
  • serial cable for console access
Such a system could be built for under $200 back in 2002, and probably even more cheaply today. The resulting systems can be seen above, stacked on a storage rack; these particular ones were used to run the various automated nightly and continuous builds.

At this price, every software developer could have one next to their desktop and network-boot it from a tftp & nfs service exported from that desktop. By changing a boot option, the board could either run in target emulation mode or in a simple workstation mode for the automated test-suite, thus reducing the CPU load on the workstation.
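The desktop side of such a netboot setup might look roughly like the following sketch (the addresses, paths, the ISC dhcpd / FreeBSD pxeboot combination, and the `boot_mode` variable are all assumptions for illustration; the post does not give the actual configuration):

```conf
# /usr/local/etc/dhcpd.conf on the developer desktop (hypothetical addresses)
subnet 192.168.1.0 netmask 255.255.255.0 {
    range 192.168.1.100 192.168.1.100;   # the test board
    filename "pxeboot";                  # FreeBSD PXE loader, served via tftpd
    option root-path "192.168.1.1:/export/board-root";
}

# /etc/exports on the desktop: NFS-export the board's root filesystem
/export/board-root -maproot=root 192.168.1.100

# /export/board-root/boot/loader.conf: one way to implement the "boot option",
# a kernel environment variable the board's startup scripts could inspect
boot_mode="target-emulation"   # or "workstation" to run the test-suite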

Building and testing a large code-base consumes a lot of CPU, memory and disk-I/O bandwidth, and can bring even a relatively powerful workstation to the point of being barely usable for anything else while the build is running. Having a large number of low-cost CPU blades around to run builds and tests can be a significant boost to developer productivity.