Saturday, May 30, 2009

Manhattanhenge

Today is a geeky holiday: the particular celestial alignment called Manhattanhenge where the sun sets exactly in line with the Manhattan street grid, which is about 30 degrees off from perfect east-west orientation.

This picture, taken from 5th Avenue, shows the sun setting at the end of 22nd Street. The sun is not exactly at street level and the picture is from somewhat before the astronomical time of sunset, since the mathematical horizon is obstructed by the coastline of New Jersey in the background.
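As a rough back-of-the-envelope check of the alignment, the standard formula for the azimuth of sunset on an ideal horizon can be solved for the solar declination that is needed. This is only a sketch: it ignores atmospheric refraction, the size of the sun's disk and the obstructed horizon, and it assumes a latitude of about 40.75 degrees north for Manhattan. Here A is the azimuth of sunset measured clockwise from north (270 + 30 = 300 degrees for a grid that is 30 degrees off), φ is the latitude and δ is the declination of the sun.

```latex
\cos A = \frac{\sin\delta}{\cos\varphi}
\quad\Longrightarrow\quad
\sin\delta = \cos\varphi \,\cos A
\approx \cos(40.75^\circ)\cdot\cos(300^\circ)
\approx 0.758 \times 0.5 \approx 0.379,
\qquad
\delta \approx 22.3^\circ
```

The sun reaches a declination of roughly 22 degrees a few weeks before and again a few weeks after the June solstice, which fits with the alignment showing up at the end of May.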

Some more pictures on flickr from 23rd st and 22nd st.

Poor-man's Testing Cluster

One of the biggest problems in embedded software development is how to most effectively test while the final target hardware is either not built yet or is too rare, unwieldy or expensive to give every software developer unlimited and unrestricted access to.

Typical approaches include simulating the target in software on some general purpose computer or finding some suitable stand-in for the final hardware platform.

Simulation by running the target OS on another CPU architecture (e.g. PowerPC target on an Intel based PC) is inaccurate for some of the important differences between CPU architectures like endianness and memory alignment. Simulation including the target processor architecture can be very slow unless the speed difference between the target platform (e.g. a low powered mobile device) and the host platform (e.g. a standard desktop PC or server) is sufficiently large to make the experience usable for developers - e.g. for the Android emulator included in the SDK.
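As a small illustration of why byte order slips through host-based testing, the following sketch reports the byte order of whatever machine it runs on. It relies only on the standard printf and od utilities; the function name host_byte_order is made up for this example.

```shell
# Report the byte order of the machine this runs on.
# Writes the two bytes 0x01 0x00 and lets od read them back as one
# 16-bit word in host byte order.
host_byte_order() {
    word=$(printf '\001\000' | od -An -tx2 | tr -d ' \n')
    case "$word" in
        0001) echo little ;;   # low byte comes first in memory
        0100) echo big ;;      # high byte comes first in memory
        *)    echo unknown ;;
    esac
}
```

On an Intel based PC this always takes the little-endian path, so code that silently depends on that byte order never gets exercised the way it would on a big-endian target like PowerPC.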

The often simpler and more reliable alternative is to find a platform which is close to the final target or deliberately choose the target CPU complex to be close to some standardized platform. Vendors of embedded CPUs often tend to sell functional evaluation boards with reference designs for their chipsets - but those tend to be expensive because of the small volume of manufacturing.

For our project at Xebeo Communications, we deliberately chose to use an x86 based controller architecture - even though this did make life harder for the hardware and manufacturing/logistics teams, since for various reasons, building a small volume x86 platform is not as easy as with other CPU vendors & architectures which explicitly support the embedded market.

But using a controller which is close to a standard PC allowed us to run a stripped-down FreeBSD kernel as the target OS and preliminarily build and test most of the software on any standard PC, like our Linux based desktops. By using an OS abstraction layer, we were able to run and test on most Unix-like operating systems to verify that the application software would be portable to other architectures if need be.

In order to become even more realistic, we started building low-cost eval boards using the cheapest commodity PC parts we could find from discount online stores like TigerDirect:
  • low-end PC motherboard with matching low-end Intel CPU (celeron or similar)
  • cheapest matching memory configuration as needed
  • cheapest PC power supply
  • CPU heatsink with fan
  • PCI Ethernet card with same chipset as our design
  • compact flash to IDE adapter
  • small compact flash card for network bootloader
  • Rubber feet and screws fitting motherboard mounting holes
  • Serial cable for console access
The resulting systems could be built for under $200 in around 2002 and probably even more cheaply today. They can be seen above, stacked on a storage rack. These particular systems were used to run the various automated nightly and continuous builds.

At this price, every software developer could have one next to their desktop and network boot it from a tftp & nfs service exported from that desktop. By changing a boot option, the board could be used to run the target emulation mode or a simple workstation mode to run the automated test-suite, thus reducing the CPU load on the workstation.

Building and testing a large code-base is very resource consuming in CPU, memory and disk-I/O bandwidth and can bring even a relatively powerful workstation to the point of being barely usable for anything else while the build is running. Having a large number of low-cost CPU blades around to run builds and tests can be a significant booster of developer productivity.

Friday, May 22, 2009

Submarine Mode

I understand that one of Google's ulterior motives with Android is to promote a mobile experience where the user is always connected to the Internet and the G1 is pretty much built around that "always on" networking paradigm - including the special flat-rate data plans from T-Mobile.

On the other hand, data services are not universally cheap yet everywhere in the world and it would be nice to give the user more control over the mobile data usage. Both current commercial Android phones (HTC G1/Dream and G2/Magic) have two types of radio for data usage:
  • GPRS/EDGE/3G cellular data connection
  • IEEE 802.11 WiFi wireless LAN interface
Since the Wifi interface is faster and was not exactly invented with power saving mobile devices in mind, it is presumably more power hungry than the cellular interface.

In the implementation on the G1, Android gives precedence to the Wifi connection if enabled and available when the phone is active - i.e. when the screen is on. Once the screen goes off, the wifi interface is shut down after a few seconds until the screen is turned on again. There is an obscure option in the expert wifi settings (Wi-Fi Settings->(menu key) Advanced-> Wi-Fi sleep policy) to change that default behavior.

If background synchronization for gmail, calendar and contacts is enabled, the phone will periodically (about every 5min according to NetMeter) partially wake up and go online on the cellular data network - even if it is sitting in the middle of a well covered Wi-fi network.

While it is possible to administratively turn off the Wifi radio, it is currently NOT possible to turn off the cellular data connection, while leaving the Wifi interface running. The only option is the "airline mode", which disables all radio interfaces - including the ability to make or receive phone calls.

What seems to be missing is a configuration switch to turn off cellular data only, leaving on the phone service, SMS and the Wifi interface. There is an option to disable data when roaming, where the biggest cost might occur - but I would prefer not to be roaming at all and use a local, maybe pre-paid SIM card instead. (My home operator, who gets paid $1.50 per minute in roaming charges on every phone-call, would probably disagree.) So what I do when traveling is to use a prepaid SIM card for phone calls and administratively deprovision the data service capabilities (by calling the operator to have it disabled) and/or by making sure there are no APN settings configured for this operator. This way I can leave background synchronization and Wifi enabled and still receive email updates while in Wifi coverage - even though this synchronization only happens when I turn on the phone's screen.

However, sometimes it would be nice to be able to splurge and pay the 50c per kbps for doing a quick search or lookup of something on the Internet, even using a prepaid or otherwise expensive data plan - but without the phone taking advantage of the opportunity to sync all my email or download a new OTA update - completely draining my prepaid card before I can stop it.

What I would like is an additional setting - a "submarine mode", where the cellular data interface is only used in an extremely minimal and controlled way - like the radio transmitter on a submarine, trying not to be located. While in this mode, the use of the cellular data service should be reliably cut off, but phone service and Wifi should continue to work. In addition, I should be allowed to very temporarily bring up the cellular data interface and grant access only to the current foreground application (e.g. using iptables with UID matching) - ideally with a data usage counter right in the notification bar, so I can see what is going on.
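To make the iptables idea a bit more concrete, here is a minimal sketch of what the rules might look like. It leans on the fact that Android runs each application under its own Linux UID and on the iptables "owner" match, which only works for locally generated traffic in the OUTPUT chain. The interface name rmnet0 and the UID 10042 are placeholders for illustration, and whether the phone's kernel is built with the owner match at all is an assumption. The function only prints the rules rather than applying them, since applying them needs root.

```shell
# Print (not execute) iptables rules that let a single application UID
# send traffic through the cellular interface and drop everything else
# leaving through it. Wifi and other interfaces are not touched.
submarine_rules() {
    iface=$1   # cellular data interface (hypothetical name, e.g. rmnet0)
    uid=$2     # Linux UID of the current foreground application
    echo "iptables -A OUTPUT -o $iface -m owner --uid-owner $uid -j ACCEPT"
    echo "iptables -A OUTPUT -o $iface -j DROP"
}
```

Calling submarine_rules rmnet0 10042 prints the two rules in the order they would have to be added. Traffic generated on the application's behalf by some other process would be dropped by the second rule, so a real implementation would need more care than this.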

I don't know enough about how cellular data services work to know if something like this would be easily implemented, but I doubt that a feature to allow users to pinch pennies on phone usage would be very high on any operator's priority list right now and as long as operators drive the requirements with cellphone manufacturers, it is their priorities which are being implemented.

Saturday, May 9, 2009

Subversion & Trac (SDI 07 Part III)

In this episode of the series on creating a minimal software development infrastructure, we are dealing with the centerpiece of the solution: setting up Subversion as the version control system and Trac to provide a unified and integrated system for bug/issue tracking, collaborative document editing (wiki) and source control repository browsing as well as a platform for further integration and extension. All the necessary packages are already installed on the server as part of the previous episode.

Assuming that our infrastructure server has a payload partition mounted under /data and our fictitious project is called "sdi07", we are setting up the disk space for our project as follows:
mkdir /data/sdi07
mkdir /data/sdi07/svn
mkdir /data/sdi07/trac
To properly initialize the databases for svn and trac, we need to run the following commands,
svnadmin create /data/sdi07/svn
trac-admin /data/sdi07/trac initenv
and answer a few basic questions for trac-admin. In particular we need to specify the name of the project as it will appear on the Trac main page and the path to the subversion repository ( /data/sdi07/svn as specified above). Since we are using subversion as the version-control system and sqlite as the back-end storage for trac, we stay with the default choices for all the remaining questions.

After initialization, any configuration choices can be changed by editing /data/sdi07/trac/conf/trac.ini or by running trac-admin /data/sdi07/trac with one of the supported commands.

Both Subversion and Trac have their own specific server implementations, but both also support access through an Apache web server. We choose to go the Apache way to unify front-end setup and user account management.

Since all operations originated from the Apache front-end are executed under the permissions of the low-privilege apache/apache Unix user, we sign over the entire project space to that user first:
chown -R apache:apache /data/sdi07
Since we have installed subversion with the apache2 use flag, the necessary modules and config files to support subversion access through Apache have been installed. In order to map subversion access under the URL http://localhost/sdi07/svn/, we add/modify the following to /etc/apache2/modules.d/47_mod_dav_svn.conf:
<Location /sdi07/svn>
DAV svn
SVNPath /data/sdi07/svn
</Location>
The simplest way to configure Trac within the Apache configuration is by using mod_python. To map the Trac instance we just configured under the URL http://localhost/sdi07/trac/, add the following section into the module specific conditional configuration in the file /etc/apache2/modules.d/16_mod_python.conf:
<Location /sdi07/trac>
SetHandler mod_python
PythonHandler trac.web.modpython_frontend
PythonOption TracEnv /data/sdi07/trac
PythonOption TracUriRoot /sdi07/trac
</Location>
Both Subversion and Trac require a user identity - at least for any write operations. If the http accesses are authenticated, the user identity will be passed by Apache to the corresponding Subversion or Trac backends.

Among the many options for user authentication with Apache, the simplest to set up is to use basic authentication with a local htpasswd file. In order to require authentication for all access to our entire project webspace under http://localhost/sdi07/, we add the following section to /etc/apache2/httpd.conf:
<Location /sdi07>
AuthType Basic
AuthName "SDI07"
AuthUserFile /data/sdi07/htpasswd
Require valid-user
</Location>
In order to create access for a new user, execute the following command:
htpasswd2 /data/sdi07/htpasswd <username>
This assumes that an empty htpasswd file has already been created with apache:apache file ownership and permissions. While this approach to user account management is very simple, it is admittedly not very flexible. We will discuss some alternative approaches to user account management later on.
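One possible way to prepare that empty password file is sketched below. The function name init_htpasswd and the 640 file mode are my own choices for illustration, not something the setup requires. The owner is passed in as a parameter so the sketch can be tried out as a normal user; on the server it would be run as root with apache:apache.

```shell
# Create an empty password file for Apache basic authentication and
# hand it over to the web server user.
init_htpasswd() {
    file=$1    # e.g. /data/sdi07/htpasswd
    owner=$2   # e.g. apache:apache
    # Create the file only if it is missing, so existing accounts survive.
    [ -e "$file" ] || : > "$file" || return 1
    # Readable and writable by the owner, readable by the group, no access for others.
    chown "$owner" "$file" && chmod 640 "$file"
}
```

With this in place, init_htpasswd /data/sdi07/htpasswd apache:apache would be run once before adding the first user with htpasswd2.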

In order to finalize the configuration and start up the Apache web front-end, we need to activate the required optional Apache modules in /etc/conf.d/apache2:
APACHE2_OPTS="$APACHE2_OPTS -D PYTHON -D SVN -D DAV -D DAV_FS"

and then finally try to start the newly configured front-end with
/etc/init.d/apache2 configtest
/etc/init.d/apache2 start
while monitoring /var/log/messages and /var/log/apache2/error_log for any errors. Once any potential configuration issues are fixed and Apache is starting up properly, we can add it to the default runtime configuration with
rc-update add apache2 default
After Apache is running properly, we should see a default wiki homepage for our project at http://localhost/sdi07/trac and we should be able to create the basic recommended source-tree layout for a new Subversion project as follows:
svn checkout http://localhost/sdi07/svn my_workspace
cd my_workspace
svn mkdir trunk
svn mkdir branches
svn mkdir tags
svn commit -m "create initial directory structure"
Typical causes for errors at this stage might be file ownership - i.e. not all necessary files have access permissions for the apache user under which the Apache web-server is running - or some typo in any of the configuration files.
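A quick way to check for the ownership problem is to list everything in the project space which does not belong to the apache user. This is just a small sketch around the standard find utility; the function name not_owned_by is made up for this example.

```shell
# List every file and directory below a directory that is NOT owned by
# the given user. An empty result means ownership is not the problem.
not_owned_by() {
    dir=$1    # e.g. /data/sdi07
    user=$2   # e.g. apache
    find "$dir" ! -user "$user" -print
}
```

Running not_owned_by /data/sdi07 apache should print nothing if the earlier chown -R has covered everything. Anything that does show up was probably created later by another user, for example by running svnadmin or trac-admin as root.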

Since this setup is based on the state of the Gentoo world of 2007 (from Sabayon Linux 3.3b live mini-CD), some details certainly have changed with newer versions and need to be adjusted. The versions of the key packages used here in particular are as follows:
dev-util/subversion-1.4.3-r1
dev-python/mod_python-3.3.1
net-www/apache-2.2.4-r1
www-apps/trac-0.10.4

Wednesday, May 6, 2009

Platform Setup (SDI 07 Part II)

In the last episode of this series, we have decided to use Gentoo Linux on a skimpy tabletop server as the platform for the software development infrastructure for our fictitious new project.

True to the hardcore image of Gentoo, the installation process for the Gentoo bootstrap binary distribution is a bit spartan (as of 2007). Fortunately Sabayon Linux provides a Gentoo derived distribution with a live CD to check out hardware compatibility and a simple installation process targeted at desktop end-users. OK, a desktop distribution is probably not optimal for a server, but I needed to get the base system up and running as painlessly as possible (which it did).

Once the minimal base system is configured to connect to the local network and is ready for remote login by an admin user, we can start with the setup of the service infrastructure. Based on the list of the services we want to set up, this is the shopping list of additional packages, which we need to download from the portage repository and build locally:

echo "www-apps/trac cgi fastcgi sqlite enscript silvercity" >> /etc/portage/package.use
echo "dev-util/subversion apache2" >> /etc/portage/package.use
echo "net-mail/dovecot pop3d" >> /etc/portage/package.use
emerge -v apache2
emerge -v subversion
emerge -v trac
emerge -v mod_python
emerge --unmerge ssmtp
emerge -v postfix
emerge -v mhonarc
emerge -v dovecot


Before we start configuring any of the services, here are a few deliberate assumptions and choices we have made for this setup:
  • This server will run all the services required to support collaborative software development, but not any general purpose IT functions which are needed for any kind of team or work environment (networking, email, file & print sharing, Internet access, account administration, etc.)
  • This server will run behind a firewall on a gated, private network. The security on the server is geared towards keeping visitors from casually snooping around or anybody accidentally destroying something rather than keeping out any seriously determined cyber-criminals.
  • Thanks to the cyber-criminals mentioned above and other assorted scum hanging out on the Internet, running a public e-mail server has become an unpleasant hassle which might not be worth doing for a small organization. Instead, using a hosted mail service might prove very attractive. Instead of trying to route email traffic from and to the software development server, we could as well run a completely isolated email system on that server. Members of the software team would have to use two distinct email accounts to communicate within the project team and externally. Most modern email clients can easily manage multiple accounts accessed remotely over protocols like POP or IMAP. If there is already e-mail service provided on the local network, we can easily relay all messages there instead.
  • All the services which are part of the software development infrastructure require a single consistent set of accounts to uniquely identify each user. This is less to prevent unauthorized access than to track all user interactions which are part of the project's evolution and audit trail (checkins, ticket updates, changes to wiki pages, email messages sent to the project mailing lists, etc.). We need a simple way to provide unified account management and probably do that separately from whatever is done for the general IT infrastructure.

Tuesday, May 5, 2009

Cupcake is out of the Oven

The new version of the Android platform - 1.5 "Cupcake" - is now being shipped with the new HTC Magic phone from Vodafone and is also already available for some versions of the HTC Dream/G1. Since an OS update in the field is always a scary business, T-Mobile is likely going to take it slow to upgrade all of the reportedly over a million sold G1 phones.

Cupcake seems to be a relatively minor major release - a few significant new features (on-screen keyboard, video), some UI face lift and some improvement behind the scenes (battery life, performance).

For my own use, there are two features which have made the upgrade to Cupcake a big deal for me.

The touch-screen virtual keyboard is the big one. I have never been a big fan of the G1's hybrid touch-screen plus keyboard design and the virtual keyboard is more than good enough for me. In fact it is a lot better, since for small text input, the work flow from touch-screen navigation to text entry and back is a lot quicker and smoother than before. Since the upgrade I have not used the physical keyboard any more and would be more than happy to lose it...

My second most favorite new feature is the support for bulk operations in the gmail app. Like in the online interface, there is now a row of check boxes in the message list and if you start checking them, a set of operations like bulk-delete or bulk-archive becomes available. I do get a lot of email on my account and this kind of rapid triage is important for me.

I don't particularly care about being able to record video on my cellphone or have home-screen widgets, so many of the other features are lost on me. Since I don't have two phones to compare side by side, I am not even sure anymore what has really changed.

Battery life is probably better, but still somewhat of an issue, but since I have wifi on all day and use sync to a busy gmail account, I can't really complain. I get through a day on one charge, which is pathetic for a regular feature phone, but not bad for a portable computer.

Most of the apps I use are still working - including the ones I wrote myself. I had to do a small update for BistroMath to fix some issue with how the keyboard and landscape mode were detected (with Cupcake, the keyboard is always there...). I was mildly surprised that NetMeter still works without a problem, since it uses non-standard APIs by going directly to the /proc filesystem of the underlying Linux kernel for much of the information.
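To give an idea of what going directly to /proc looks like, here is a minimal sketch that pulls the received and transmitted byte counters for one network interface out of /proc/net/dev. This is not NetMeter's code, and the function name iface_bytes is made up for this example. The second argument lets it read a saved copy of the file instead of the live one.

```shell
# Print "<rx bytes> <tx bytes>" for one interface from a file in
# /proc/net/dev format. Prints nothing if the interface is not listed.
iface_bytes() {
    iface=$1
    file=${2:-/proc/net/dev}
    # Some kernels print "eth0:1234" without a space after the colon, so
    # turn the colon into a space before splitting the line into fields.
    # After that, field 2 is the receive byte counter and field 10 is
    # the transmit byte counter.
    tr ':' ' ' < "$file" |
        awk -v ifc="$iface" '$1 == ifc { print $2, $10 }'
}
```

Since the layout of these files is not a published Android API, anything built on them can break with any platform update.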

Sunday, May 3, 2009

Choosing the Operating System (SDI 07 part I)

For the first part of the discussion on how to set up a minimal software development infrastructure for a startup project, using only open-source software, we are looking at the lowest layer in the technology stack - hardware and operating system.

The first obvious reason for choosing an operating system for this development support server would be familiarity. If there is a particular OS or distribution the administrator is most familiar and comfortable with - this should probably be the most significant argument for choosing it.

At the time of this experiment, I did not have any recent experience with any particular OS for the last few years, so the choice would be based on what I could most likely set up most easily without much of a learning curve and where I could get help most easily when I run into problems.

The most obvious choice for an open-source operating system at this point is Linux, which runs pretty much on anything with a CPU - including almost any commodity PC hardware. For server platforms, my other preferred open-source operating system has typically been FreeBSD - which doesn't try to be anything else but a rock-solid server platform - but is a lot more picky when it comes to hardware and software support.

Even though not a very typical server hardware platform, the machine used for this experiment was going to be my mini tabletop server from AOpen. Linux would probably be my best bet to install and run on this type of hardware without too much trouble.

After choosing Linux, the next question is which distribution?

Assume that an ambitious software project might have a development life-cycle in the order of 12 to 36 months, which is a very long time in the life of a typical Linux distribution. We would like to assume that key systems like version control etc. could be set up at the beginning of the project and would not need to be touched or upgraded again during the most crucial initial development phase. If we need to do any upgrades down the line, we most likely would want these upgrades to be as minimal as possible. From past experience (admittedly with RPM mostly), the package management of most Linux distributions breaks down when trying to do point upgrades on a several year old system which has not been kept up to date - sometimes just because packages are not archived for that long on a distribution's website.

All of the major Linux distributions use some form of package management system for installing and upgrading optional software packages and for keeping track of the dependencies between packages. The most popular package management formats are RPM and DEB which are both based on distributing and installing binary packages. The odd one out among the top Linux distributions is Gentoo Linux, whose package management system, portage, is based on locally compiled source packages.

I am intrigued by the Gentoo portage package management system not for the usually claimed benefits like greater speed or better optimization, but by its potential to reduce non-essential dependencies. Dependencies are often considered the root of all evil in software package management...

Most open-source software packages themselves are extremely portable. They often build and compile from source not only on any Linux distribution but also on most other Unix and Unix-like systems, sometimes even including MS Windows. One of the secrets behind this flexibility is for example the GNU autotools, which allow a package to probe and discover the existing system configuration and to configure its build to account for its current environment.

While most open-source software packages may have essential dependencies without which they cannot work, there are many optional dependencies which may be disabled if not needed. Once a package is built for a particular environment, much of that environment becomes an accidental or spurious dependency for the resulting binary, which needs to be satisfied if we distribute a binary package.

This is just a hunch, but I would think that a source based package manager like portage should be able to get away with far fewer dependencies among packages than even the best ones based on binary packages. A non-scientific sample comparison for Subversion between the gentoo-portage repository and the most popular Debian package system seems to support that intuition:
the portage package has 4-5 mandatory direct dependencies and a few optional ones, which can be enabled or disabled during build, while the debian package is broken up into three different ones (subversion, libsvn1 and subversion-tools) with a dozen or more direct dependencies, not including some of the optional features from the portage package.
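For anybody who wants to repeat this kind of comparison, the Debian side can be counted roughly as sketched below. The helper reads the output of apt-cache depends from standard input and counts the lines marked as dependencies; the function name count_direct_deps is made up for this example and the counting is deliberately crude.

```shell
# Count the dependency lines in `apt-cache depends <package>` output
# read from stdin. This matches "Depends:" and "PreDepends:" lines,
# including alternatives marked with "|", but not "Suggests:" or
# "Recommends:".
count_direct_deps() {
    grep -c 'Depends: '
}
```

On a Debian system this would be used as apt-cache depends subversion | count_direct_deps. On the Gentoo side, emerge --pretend --verbose subversion lists what would be built together with the USE flags in effect, which is where the optional dependencies show up.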

To further test this hypothesis, I have installed and upgraded a few packages on my now roughly two year old, out of date Gentoo Linux system without much of a problem. I saw none of the typical problems like packages being no longer available, being incompatible with the system or causing a cascade of upgrades which might break other existing packages unless they are upgraded as well.

On the other hand, since I did not do the control experiment with a leading binary distribution to compare, who knows if it might have worked out just as easily there.