Saturday, May 25, 2013

Raspberry Pi Internet Access Monitor

If your Internet access is down and you are not watching, is it really down? For most people the answer is most likely - who cares!

Our Internet connection had recently been down a few times when I happened to notice, and I was getting curious about how frequently this was happening.

Besides, it might be interesting to get a long-term record of the stability and "quality" of our ISP and maybe even compare it with the results from users of competing ISPs.

What does it mean for Internet access to be working? Is it enough to check that the link between our home router and the ISP's access router (a DOCSIS cable plant in my case) is up and working? Or should we include end-to-end application layer scenarios like the ability to get my email or my files from some place "in the cloud"?

But what exactly represents "the cloud" or "the Internet"? In reality they are massively large distributed systems at a global scale, consisting of millions of components, points of failure and recovery.

For the purpose of these measurements, we need to choose a few relevant and representative destinations as stand-ins for the whole Internet. Since many people seem to think that Google, Facebook, Amazon or other top-tier web properties ARE the Internet, we may as well use them as a reasonable proxy. Depending on the mix of Internet or "cloud" services we use on a daily basis, it should be easy to come up with a short list of destinations and services which we particularly care about. We also need to be very conscious of the load these measurements put on the services and whether their owners would likely object. Choosing very popular services, which already carry very high traffic loads, helps to make the additional impact of the probes insignificant by comparison. Running these measurements should have less impact on any part of the Internet infrastructure than leaving a few browser tabs with AJAX apps open overnight...

Without access to low-level network and systems monitoring, the most basic way to judge Internet connectivity is to periodically probe whether some destination or service is currently reachable. There are quite a few open-source network and service monitoring tools available, but for our goal of long-term automated connectivity testing, the most natural choice might be SmokePing, an open-source tool by the author of MRTG and RRDtool, very popular among IP network admins.

SmokePing gets its name from plotting latency, jitter (variation in latency) and packet loss in a single graph, drawing jitter as a smoky cloud around the line of median round trip time, as in the example below:

A low-cost, low-power Raspberry Pi, which can be left running in headless mode attached to the Internet gateway, would seem like an ideal platform for such monitoring and measurements. And fortunately, SmokePing already comes pre-packaged with all its dependencies (Perl, Apache etc.) for Raspbian, so installing it is as easy as:

sudo apt-get install smokeping

After that, we can edit the config files in /etc/smokeping/config.d/ to determine what kind of probing to run and how to display the results in the web interface. All the configuration settings are documented here.

The SmokePing prober did not start properly in the default configuration on my system because of a missing reference to sendmail. Removing the sendmail line from /etc/smokeping/config.d/pathnames resolved the problem. The Raspberry Pi has more than enough compute power to run the SmokePing prober for a small configuration as outlined below. Running the web front-end, which generates the graphs from the RRD time series database, can be a bit taxing on one's patience, as rendering each page takes about 10 to 15 seconds at 100% CPU load.

Measurement Setup

The most basic, low-level and least intrusive way to do connectivity probing in IP networks is the ICMP echo protocol, which is implemented in the kernel as part of the IP stack and used by the ping network diagnostic utility. As targets for the ping probes, we choose the front-door addresses of three major Internet companies: Google, Facebook and Yahoo. All of these are logical addresses, which map to a long list of heavily replicated and geographically dispersed physical machines, assigned by a DNS load balancer based on availability and proximity. These 3 companies and sites represent a good part of Internet traffic and are unlikely to go down, especially not all 3 at the same time. Should the pings to all 3 destinations suddenly start failing, the outage would most likely be in our Internet access connection or in the access network of our ISP. SmokePing is configured to send 10 probes to each destination every 5 minutes to collect packet delay and loss information.
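To make the numbers behind those graphs a bit more concrete, here is a minimal Python sketch of how one round of probes could be boiled down to a median round trip time, a spread and a loss rate, and how "all destinations fail at once" could be detected. This is not SmokePing code: the function names are made up for illustration, a lost probe is represented as None, and the plain max-minus-min spread is only a stand-in for the smoke drawn in the graphs.

```python
from statistics import median

def summarize_round(rtts_ms):
    """Summarize one round of probes; None marks a probe that got no reply."""
    if not rtts_ms:
        raise ValueError("a round needs at least one probe")
    answered = sorted(r for r in rtts_ms if r is not None)
    loss_pct = 100.0 * (len(rtts_ms) - len(answered)) / len(rtts_ms)
    if not answered:
        return {"median_ms": None, "spread_ms": None, "loss_pct": loss_pct}
    return {
        "median_ms": median(answered),
        # Crude jitter indicator: distance between slowest and fastest reply
        "spread_ms": answered[-1] - answered[0],
        "loss_pct": loss_pct,
    }

def access_looks_down(rounds_by_target):
    """True only if every target lost every probe in the same round."""
    return all(
        summarize_round(rtts)["loss_pct"] == 100.0
        for rtts in rounds_by_target.values()
    )
```

A single target losing all its probes would not trigger access_looks_down, which matches the reasoning above: only a simultaneous failure of all destinations points at our own access line.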

In order to locate the actual server to probe, ping relies on the domain name system (DNS), itself a highly replicated and distributed infrastructure at the very core of the Internet. In order to isolate IP connectivity from name service issues, we are setting up a secondary set of probes using DNS queries sent directly to the physical IP address of the primary name server of our ISP and to the well-known address of the Google public DNS service (itself heavily replicated using BGP anycast routing). Since this is starting to hit more complex services in application space, we are reducing the polling rate to 5 probes each every 15 minutes.
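For readers curious what such a DNS probe does on the wire, the following is a rough Python sketch that builds a minimal query for an A record and times the reply from a name server addressed by IP. It only illustrates the idea and is not how echoping is implemented; the function names, the default query id and the timeout are assumptions chosen for this example.

```python
import socket
import struct
import time

def build_dns_query(name, query_id=0x1234):
    """Build a minimal DNS query packet asking for the A record of `name`."""
    # Header: id, flags (recursion desired), 1 question, no other records
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels, ended by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN

def time_dns_query(server_ip, name, timeout=2.0):
    """Return the round trip time in ms, or None on timeout or network error."""
    packet = build_dns_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(packet, (server_ip, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # includes socket timeouts
            return None
        elapsed_ms = (time.monotonic() - start) * 1000.0
    # Only count replies that echo our query id
    return elapsed_ms if reply[:2] == packet[:2] else None
```

Calling time_dns_query with the IP address of a name server and some host name returns the latency of one probe. Because the server is addressed by IP, a failure here cannot be blamed on name resolution itself.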

Clearly, latency, latency variance (jitter) and packet loss rates are an important part of network performance, but they do not give the full picture. Ideally, we would also measure the available bandwidth, which is most representative of perceived network "speed". However, doing so requires expensive, heavy load probes, which try to saturate the network path to estimate its capacity limit. Doing so regularly in an automated long-term test would seem a bit frivolous and wasteful.

As a small compromise for qualifying how the network performs for common "cloud" services, we are using an HTTP probe, which downloads a copy of the photo above from a public folder in Dropbox, a cloud-based file storage service, and from the static user content server of the Google+ photo service.
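The idea of such an HTTP probe can be sketched in a few lines of Python: fetch one small file and record how long it took and how many bytes arrived. This is an illustration only, not what echoping does internally; the function name, the timeout and the injectable opener parameter (handy for trying it out without a network) are assumptions of this sketch.

```python
import time
import urllib.request

def time_download(url, opener=urllib.request.urlopen, timeout=10.0):
    """Fetch `url` once and return (elapsed_seconds, bytes_received)."""
    start = time.monotonic()
    with opener(url, timeout=timeout) as response:
        body = response.read()
    return time.monotonic() - start, len(body)
```

Since the file is small, the elapsed time mostly reflects connection setup and server response time rather than available bandwidth, which is the compromise described above.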

Here is the core of the setup for these experiments in /etc/smokeping/config.d/Probes and /etc/smokeping/config.d/Targets respectively:
*** Probes ***

+ FPing
binary = /usr/bin/fping
step = 300
pings = 10

+ EchoPingDNS
binary = /usr/bin/echoping
step = 900
pings = 5

+ EchoPingHttp
binary = /usr/bin/echoping
step = 900
pings = 3

*** Targets ***

probe = FPing

menu = Top
title = Raspberry Pi Internet Access Monitor
remark = Latency to a few select sites and services in the Internet.

+ Internet
menu = Internet
title = Internet Access (Ping)

++ Google
title = Google
menu = Google
host =

++ FB
title = Facebook
menu = Facebook
host =

++ Yahoo
title = Yahoo
menu = Yahoo
host =

+ DNS
menu = DNS
title = Name Servers

++ gdns
title = Google public DNS
menu = Google public DNS
probe = EchoPingDNS
dns_request =
host =

++ cablecom
title = Cablecom DNS
menu = Cablecom
probe = EchoPingDNS
dns_request =
host = <Your ISP's primary NS IP address>

+ Cloud
menu = Cloud
title = Cloud Services

++ dropbox
title = Dropbox
menu = Dropbox
probe = EchoPingHttp
host =
port = 80 
url = /u/12770892/benchmark/raspberrypi.jpg

++ gusercontent
title = Google+ Photo
menu = Google 
probe = EchoPingHttp
host =
port = 80
url = /UB5Y5yJKtj51bs2asd8kJGjOxwigev7JPQz3g9tw1C0=w614-h801-no

Monday, May 13, 2013

Back to Broadcast

About 3 years ago, I speculated in this post that "user curated content" would become the next logical step after the "user generated content" wave unleashed by the interactivity of web 2.0. By and large I have been wrong.

What has happened instead is an accelerating professionalization of online content creation and a return towards the traditional broadcast model with a pronounced split between few creators who produce stuff and the many consumers who consume it.

True, there are still myriads of users engaged in some form of content creation, but increasingly only a few creators matter. True, the cost of creating and distributing digital content has dropped far below the level of a serious barrier to entry, but more than ever it takes a serious amount of luck, perseverance and highly professionalized marketing to stand out from the crowd.

Maybe it is a sign of maturing for any new medium that a period of frantic and chaotic experimentation is followed by consolidation and professionalization, even though in this new medium there is hardly an inherent, natural monopoly and the only scarce resource is the attention of the audience.

It is quite telling that G+, the most recent and most contemporary of the major social networking platforms, is based on an asymmetric relationship model of follower rather than friend and makes a clear distinction between profiles (for users) and pages (for corporate entities and brands).

Some of the most symmetric and egalitarian platforms, like Facebook or YouTube, which date from the mid noughties, are now well into a process of retooling themselves into places where the masses can increasingly follow, subscribe to or endorse a relatively small number of online celebrities.

While the original notion of Facebook's "friend" implies a relatively small number of peer-to-peer relationships, today many Facebook "stars" have millions of "friends", most of whom they would probably not recognize in the street. Many of those are not even people, but brands, which hardly seem capable of feelings such as friendship.

YouTube started out as a video sharing website, where every user could also be an uploader and the key purpose seemed to be sharing and exchanging amateur videos. Today, YouTube more clearly distinguishes between partners, who create content, and viewers, who consume it.

Yes, it is still possible for ordinary Joes or Janes to be friends with each other on Facebook or for any YouTube user to upload a video, but this is no longer the most relevant pattern of use.

Part of the reason for this change of focus may be that it turned out to be very hard to make money with user generated content. While it may be fun for a while to see a few random people's home videos or read about what they are having for lunch, in the long run most of us prefer some level of professionalism in content and production values for our entertainment. The platform operators also face increasing pressure to make money, and the easiest way to do this is to cater to well-funded entities who want to be your friend and want you to consume their content.

But while it's basically back to broadcast, back to being a fan, follower or viewer, the entities who are winning most of our attention are not necessarily the same old traditional household names.

Some blogs, like The Huffington Post, have become veritable new-media powerhouses, and some musicians, like Justin Bieber or Psy, have managed to leverage their presence on YouTube into an A-list international career. And even at a less high-profile level, many talented musicians, photographers, writers or journalists, endowed with a certain knack for self-promotion, have managed to build for themselves a good career as new-media entrepreneurs or social media personalities.

With the digital media revolution, there is an exciting new world evolving before our eyes, but in some ways, it seems to turn out like the old one quite a bit more than I once had thought...