Tuesday, March 17, 2009

Essential Startup Software Development Infrastructure - 2000 Edition

When we started a company in the early days of 2000, I spent some time setting up what was to become our minimal IT infrastructure and software development environment (that's how I ended up with UID 500...). Since we did not have any money (yet), it had to be free/open-source software, and since we did not have any time for evaluation or in-depth research, we tried to go with what seemed to be the most obvious, conservative or mainstream choice at the time for each piece of the solution.

Initially our entire server infrastructure was based on a single Linux box from Penguin Computing, since that was about all we could afford with an empty bank account. In the hope that more machines would soon follow, it ran NIS and NFS servers for a centralized network-wide login, DHCP and DNS (bind) servers for IP network configuration, an HTTP server (apache) for the intranet homepage, and SMTP (sendmail), POP and IMAP servers for basic email service. Many of these initial choices were undone again once we had a real professional Unix sysadmin.

On top of that we built the initial infrastructure to support the software development team. From day one, we wanted the team to work a certain way: put working code at the center of attention; always move the system in small increments from one working state to a new working state; treat only what is integrated into the central repository as really existing; make changing things as easy and risk-free as possible; and so on. The common development infrastructure should support this way of working and make it easy to follow these principles.

The key pieces of this initial infrastructure were:
  • Email including archived mailing lists
  • Version control system
  • Document sharing
  • Build and Test automation
  • Issue tracking
Email is probably the most essential tool to support team collaboration, not just for software development. Archived mailing lists provide an instant and effortless audit trail of any discussion as it unfolds. Email is also a very convenient way to distribute automated notifications. For our first mailing lists, we simply used the built-in alias functionality of the mail delivery system itself (sendmail) and MHonArc as the web-based mail archive tool. All the setup was manual, but that was acceptable since we expected the team to change very slowly - it reached about 20 members at its peak.

At the time, the only serious open-source contender among version control systems was CVS. The version control system is the vault where the crown jewels are kept, and it is the most mission-critical piece of infrastructure. As soon as we had some money in the bank, we replaced CVS with Perforce, since we were familiar and comfortable with its model of operation (same advantages as CVS, but it keeps meta state on the server, commits atomic sets of changes, etc.). We added a web-based repository browser and notification email support, sending out a mail for each submitted change with a link to that particular change in the web-based repository browser. The source-code repository was meant to be the most openly public part of the infrastructure, and nobody should be able to sneak in a change unseen.
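
As a rough illustration of the per-change notification mail, here is a minimal Python sketch. It is not the code we actually ran: the function name, the address values and the URL layout of the repository browser are all made up for the example, and it only builds the message rather than sending it.

```python
from email.message import EmailMessage


def change_notification(change, author, description, browser_url, list_address):
    """Build one notification mail for a submitted change.

    All parameter names and the "<browser_url>/<change>" link layout are
    assumptions for illustration, not the original setup.
    """
    stripped = description.strip()
    first_line = stripped.splitlines()[0] if stripped else "(no description)"

    msg = EmailMessage()
    msg["From"] = author
    msg["To"] = list_address
    msg["Subject"] = f"Change {change}: {first_line}"
    # Body: the full description followed by a link to the change
    # in the web-based repository browser.
    msg.set_content(f"{description}\n\n{browser_url}/{change}\n")
    return msg
```

Handing the resulting message to the local mail system (for example with `smtplib`) would be a separate step, left out here.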

Our document sharing system was very simple. Since we already had version control as the central piece of our workflow, we simply used the version control system to stage our entire intranet website. To add or update a document, you checked in the new version and, if necessary, hand-edited the HTML link on the page where it should appear. This sounds crude, but we were all programmers after all, and editing some HTML did not particularly bother us. The website provided easy access to the current version of any document, and the version control system backing it provided all the history necessary.

The build and test automation was essentially home-grown (loosely inspired by DejaGnu). At its core was a Python script called runtest, which parsed a hierarchy of test definition files within the source tree and ran any test executable specified there. Test cases had to generate output containing PASS or FAIL, and each occurrence of such a keyword would count as a test case. For the official automated build, runtest would log its results to a MySQL database, but the same script could also be used interactively by anybody on the team to make sure tests still worked or to troubleshoot breakages. The automated master build itself was simply a script that ran in a loop, doing a checkout from the source control system and, if there was any change, running a clean build (using a combination of gmake and jam) and executing runtest on the full test suite. As a framework, this was extremely flexible. Tests could be written in any language as long as they could write PASS or FAIL to the console and exit cleanly at the end. For example, we ended up with a very powerful but rather unwieldy network simulation framework written in bash for running high-level integration tests, which could easily be run as part of the runtest suite.
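
To make the PASS/FAIL convention and the master build loop a bit more concrete, here is a small Python sketch. It is a reconstruction for illustration only, not the original runtest: all function names are hypothetical, the treatment of a non-zero exit status as a failure is my assumption, and the test definition files, the MySQL logging and the actual gmake/jam invocations are left out.

```python
import re
import subprocess

# Each occurrence of one of these keywords in the output is one test case.
KEYWORD = re.compile(r"\b(PASS|FAIL)\b")


def count_results(output):
    """Tally the PASS and FAIL keywords found in a test's console output."""
    counts = {"PASS": 0, "FAIL": 0}
    for match in KEYWORD.finditer(output):
        counts[match.group(1)] += 1
    return counts


def run_test(command):
    """Run one test executable (in any language) and tally its results."""
    proc = subprocess.run(command, capture_output=True, text=True)
    counts = count_results(proc.stdout)
    # Assumption: a test that does not exit cleanly counts as one more failure.
    if proc.returncode != 0:
        counts["FAIL"] += 1
    return counts


def build_cycle(last_built, latest_change, clean_build, run_suite):
    """One iteration of the master build loop.

    latest_change, clean_build and run_suite are callables supplied by the
    caller (hypothetical stand-ins for the checkout, the clean build and
    the full runtest run). Returns the change that is now built.
    """
    current = latest_change()
    if current == last_built:
        return last_built  # nothing new was submitted, skip this round
    clean_build()
    run_suite()
    return current
```

A driver would then call `build_cycle` in an endless loop with a short sleep between iterations, which is roughly all the master build did.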

The issue tracking system was not part of the initial setup but followed soon thereafter, with the conversion from CVS to Perforce. We were using Bugzilla (probably again the only viable free choice at the time) with a set of patches to integrate it closely with Perforce, automatically enforcing that each checkin into the source control repository had to be linked to a ticket in the issue tracking system. This provided a very rudimentary workflow and scheduling system for keeping track of work items and for linking source changes to the reason why they were being made.
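
The kind of check that enforces the link between a checkin and a ticket can be sketched in a few lines of Python. This is only an illustration and not the patch set we used: the convention that a change description mentions its ticket as "Bug 1234" or "bug #1234" is an assumption, as are the function names, and the lookup that would confirm the ticket actually exists in Bugzilla is omitted.

```python
import re

# Assumed convention: the description names its ticket as "Bug 1234" or "bug #1234".
TICKET = re.compile(r"\bbug\s*#?(\d+)\b", re.IGNORECASE)


def linked_tickets(description):
    """Return the ticket numbers referenced in a change description."""
    return [int(number) for number in TICKET.findall(description)]


def accept_checkin(description):
    """Accept a checkin only if its description references at least one ticket."""
    return bool(linked_tickets(description))
```

A checkin described as "Fix crash on startup. Bug 1234" would be accepted, while one that only says "typo fix" would be rejected.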