Building a code instrumentation library with Python and ZeroMQ

Software Developer's Journal

Aug 9, 2013

CPOL

This article is written by Rob Martin, and was originally published in the August 2013 issue of the Software Developer's Journal. You can find more articles at the SDJ website.

The problem

Like many people, I confused the Heisenberg Uncertainty Principle with the Observer Effect. The Heisenberg Uncertainty Principle asserts that we cannot simultaneously measure certain pairs of physical properties of a particle, such as position and momentum, with arbitrary precision. That is, the more precisely we know one value, the less precisely we can know the other. This is best illustrated by the story of Heisenberg being pulled over by a police officer. The officer asks Heisenberg if he knows how fast he was driving. "No, but I know where I am," says Heisenberg. The officer says, "Sir, you were driving 76 miles per hour." Heisenberg replies, "Great. Now I'm lost."

On the other hand, the observer effect describes the impact on a system of measuring things within that system. A common example is measuring the pressure in a tire. It's hard to do without having a bit of air leak out, thereby affecting the pressure of the tire. The more we measure, the flatter the tire gets.

Instruments that measure code performance are subject to the observer effect too. The mere process of inserting code (or reflecting on existing code) to measure our application's performance will also affect the performance of the application. If we naively code against a cloud-based metrics tool such as Librato or New Relic, we risk significant impact on our performance waiting for blocking I/O as we record the measurement to a remote location.

I needed a library for measuring code performance, and I wanted to minimize the observer effect. This problem suggested an asynchronous programming pattern that records in real time, but stores the result within another thread or process.
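
The pattern can be sketched with nothing but the standard library. This is only an illustration of the idea, not Zibrato's code: the names `record`, `store_forever`, and `stored` are made up for the example, and a short sleep stands in for a slow network call. The instrumented code only pays for a queue insertion, while a worker thread absorbs the storage latency.

```python
import queue
import threading
import time

metrics = queue.Queue()   # hand-off point between the fast path and the slow path
stored = []               # stands in for a remote metrics service (assumption)

def record(name, value):
    """Fast path: called from instrumented code, never waits on I/O."""
    metrics.put((name, value))

def store_forever():
    """Slow path: runs in a worker thread and absorbs the storage latency."""
    while True:
        item = metrics.get()
        if item is None:          # sentinel: stop the worker
            break
        time.sleep(0.01)          # pretend this is a slow network call
        stored.append(item)

worker = threading.Thread(target=store_forever)
worker.start()

# The instrumented code records five measurements without waiting for storage
for i in range(5):
    record('requests', i)

metrics.put(None)                 # ask the worker to finish
worker.join()
print(len(stored), 'metrics stored')
```

A thread is enough to show the shape of the solution, but it still shares one process with the instrumented code. Moving the slow path into a separate process is where a messaging library becomes useful.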

ZeroMQ

Enter ZeroMQ.

On the surface, it would seem that ZeroMQ is a very fast ("zero" time) message queue, but I find that a bit of a misnomer. It's not a message broker like RabbitMQ. It doesn't support the Advanced Message Queuing Protocol (AMQP). There's no management interface. It doesn't persist messages to a disk, and if you don't have a subscriber, all of the publisher's messages are dropped by default. You can't inspect messages or get statistics on the queue - at least not without writing your own management layer.

ZeroMQ is more like a brilliant and fast socket library with built-in support for a wide variety of asynchronous patterns. This makes ZeroMQ an ideal message dispatcher when you don't need complex broker features. For my metrics library, I didn't need those features, but I did need speed.

Building a code instrumentation library

The features I wanted were simple:

Easy instrumentation for timing and counting.
Highly efficient operation within the instrumented code.
The ability to instrument multiple programs, processes, and threads in one common system. These processes are the publishers of metrics.
Support for multiple back-end systems to consume the metrics. These processes are the subscribers to the metrics.

Because ZeroMQ was my message dispatcher and Librato.com was my primary target for recording and aggregating these metrics, I chose the portmanteau Zibrato as my project name.

The Zibrato library for code instrumentation

Before we get into the architecture of the library, let's take a quick look at the API. The Zibrato library provides three ways to instrument your Python code:

Timers: Zibrato provides a decorator that can time any defined function or method, and a context manager that works with any code block.
from zibrato import Zibrato
z = Zibrato()
# decorated function
@z.time_me(level = 'debug', name = 'myfunct_timer', source = 'myprog')
def myfunct():
    time_consuming_operations()
# context manager
with z.Time_me(level = 'debug', name = 'timer_name'):
    slow_function_to_time()
Counters: Zibrato provides a counter method as a decorator or as a context manager.
from zibrato import Zibrato
z = Zibrato()
# decorated function
@z.count_me(level = 'info', name = 'myfunct_counter', value = 5) # inc by 5
def myfunctc():
    pass
# context manager
with z.Count_me(level = 'info', name = 'counter_name', source = 'deathstar'):
    pass
Gauges: Finally, Zibrato can be used to insert an arbitrary value into the backend at any point in the code.
from zibrato import Zibrato
z = Zibrato()
# Zibrato gauge
z.gauge(level = 'crit', name = 'gauge_name', value=123)

This is just a quick overview of how Zibrato is used to instrument code. For more information, check out the library on PyPI (https://pypi.python.org/pypi/Zibrato) or look at my GitHub repository (https://github.com/version2beta/zibrato).
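
For readers curious how a timer can be offered both as a decorator and as a context manager, here is a minimal sketch in Python 3. It is not Zibrato's implementation: `timed`, `time_me`, `emit`, and the `sent` list are hypothetical names, and the metric is appended to a list where a real library would publish it.

```python
import functools
import time
from contextlib import contextmanager

sent = []  # stands in for the socket a real library would publish to

def emit(kind, level, name, value):
    """Hypothetical helper: collect the metric instead of publishing it."""
    sent.append((kind, level, name, value))

@contextmanager
def timed(level, name):
    """Time the enclosed block and emit the elapsed seconds, even on error."""
    start = time.perf_counter()
    try:
        yield
    finally:
        emit('timer', level, name, time.perf_counter() - start)

def time_me(level, name):
    """Decorator form, built on the context manager above."""
    def decorator(func):
        @functools.wraps(func)   # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            with timed(level, name):
                return func(*args, **kwargs)
        return wrapper
    return decorator

@time_me(level='debug', name='add_timer')
def add(a, b):
    return a + b

result = add(2, 3)
with timed(level='debug', name='block_timer'):
    sum(range(1000))
```

Building the decorator on top of the context manager keeps the timing logic in one place, and the `finally` clause means a measurement is still emitted when the timed code raises.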

The architecture

In order to accomplish the goals behind the API, Zibrato is divided into three parts:

The Zibrato library, which implements the API described above and publishes metrics to the message queue. As a user, I can have zero or more instrumented processes all communicating with my message queue.
A message broker, which subscribes to zero or more publishers of metrics and in turn republishes the metrics to zero or more backend subscribers.
Zero or more backend providers, which subscribe to the message broker to receive metrics, then in turn do whatever is appropriate with them. In my application, the backend provider sends the messages to my Librato account.

This is referred to as an "extended pubsub pattern". The message broker (sometimes called a message bus) is core to this topology. It provides support for multiple publishers, and filters which messages the backend providers will receive.

To implement Zibrato on a server, I generally use supervisord to start the broker and my Librato backend. Then any code that is instrumented can connect to the broker, and the backend will receive whatever messages meet its filter and forward them to Librato. The backend aggregates messages and performs the blocking I/O portion of the work, communicating across the network to Librato.com, in a completely separate process from the instrumented code.

Using ZeroMQ

Clearly, ZeroMQ is the special sauce in the Zibrato architecture, doing the heavy lifting of messages from the instrumented code to the backend providers. It provides the asynchronicity.

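If you don't already have ZeroMQ installed, the Python bindings are easy to install with pip. The Python development headers (python-dev) are an operating system package rather than something pip can install, and they are only needed when pyzmq has to be compiled. On Debian or Ubuntu, for example:

sudo apt-get install python-dev
pip install --upgrade pyzmq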

If you're running Anaconda Python from Continuum Analytics, pyzmq is already installed.

Creating a message broker

Our broker subscribes to publishers and then publishes to subscribers. ZeroMQ refers to this type of device as a "forwarder". It's fairly trivial to do this in Python:

First, we create a ZeroMQ context, which is basically a thread-safe container for our sockets that allows us to cleanly shut everything down when we're done with it.
Then we create a TCP socket to serve as a subscriber to the instrumented code. This could have been done using a Unix domain socket (similar to a named pipe) or a PGM multicast IP socket, but I wanted the ability to connect to the broker on a specific IP address and port, even if the broker is running on a different server from the instrumented code. Potential gotcha: by default, a subscriber filters out all messages, so we have to tell it what messages to receive. An empty string tells it to receive all messages.
Next, we create a TCP socket to serve as a publisher for the backends. Again, I made a design decision allowing backends to run on separate servers.
Finally, we combine our subscriber socket and our publisher socket into a ZeroMQ forwarder device.

Here's the code.

import zmq

# Get the ZeroMQ context
context = zmq.Context()

# Create a subscriber that the instrumented code will connect to
subscriber = context.socket(zmq.SUB)
subscriber.bind('tcp://127.0.0.1:5550')

# Subscribe to all messages (an empty byte string matches everything)
subscriber.setsockopt(zmq.SUBSCRIBE, b'')

# Create a publisher that the backends will connect to
publisher = context.socket(zmq.PUB)
publisher.bind('tcp://127.0.0.1:5551')

# Combine the subscriber and the publisher into a forwarder.
# This call blocks for as long as the broker is running.
zmq.device(zmq.FORWARDER, subscriber, publisher)

This code gets us started. Now any ZeroMQ publisher written in any programming language running on localhost can send messages to our broker. Likewise any subscriber on localhost can receive those messages. If instead of 127.0.0.1 we bound our sockets to an IP address accessible over the network, publishers and subscribers on other machines would be able to connect to the broker, too.

Building a subscriber

A broker without any listeners has little purpose in life. Let's create a simple subscriber that receives messages and prints them to standard output. We'll create this in a separate Python script - it runs separately from the broker.

Here's how to create a subscriber:

First, we create a ZeroMQ context, just as we did for the broker.
Then we create our listener's socket and connect it to the broker's publisher port.
Next, we set a filter to which our socket subscribes. Our call to receive messages from the message broker will only return values that start with this filter. An empty string subscribes to all messages, but the default is to subscribe to no messages so we have to set it to something, even if it's an empty string.
Finally, we set up a loop that keeps listening and prints each message as it comes in.

Our code might look like this:

import zmq

# Get the ZeroMQ context
context = zmq.Context()

# Create a socket and connect it to the broker's publisher port
socket = context.socket(zmq.SUB)
socket.connect('tcp://127.0.0.1:5551')

# Subscribe to all messages (an empty byte string matches everything)
socket.setsockopt(zmq.SUBSCRIBE, b'')

# Keep on keeping on
while True:
    print(socket.recv())

This code gives us a listener that will receive anything the broker forwards and print it to STDOUT. Now all we need is someone who will give the broker some messages to forward.
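
The subscription filter is a plain prefix match on the bytes of each message. The sketch below imitates that rule in pure Python so the behaviour is easy to see. It does not call ZeroMQ, the `matches` function is a made-up helper, and the `level|name|value` message layout is an assumption for illustration rather than Zibrato's wire format.

```python
def matches(message, filters):
    """Return True if the message starts with any subscribed prefix."""
    return any(message.startswith(prefix) for prefix in filters)

# Hypothetical messages, using an assumed level|name|value layout
messages = [
    b'debug|myfunct_timer|0.25',
    b'info|counter_name|1',
    b'crit|gauge_name|123',
]

# No subscription at all: nothing is delivered (the default for a subscriber)
no_filter = [m for m in messages if matches(m, [])]

# An empty prefix matches every message
everything = [m for m in messages if matches(m, [b''])]

# A non-empty prefix selects only the messages that start with it
critical_only = [m for m in messages if matches(m, [b'crit'])]
```

Because matching works on the start of the message, putting the most useful selector first (here, the level) lets each backend choose what it receives with a single subscription.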

Building a publisher

If a tree falls in the forest and there's no one to hear it, does it make any noise? I don't know the answer to that question, but I do know that a message broker without any publishers lives a pretty quiet life.

Connecting our broker to a publisher is easy to do. In a third Python script, we'll create a publisher that sends messages to the broker. If we do it right, the subscriber we created in the last section will receive these messages and print them to STDOUT.

Here's how to build a publisher:

First, we create a ZeroMQ context, just like we did above.
Next, we create our publisher socket.
Finally, we send a message from our publisher to the message broker.

Here's the code:

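import time
import zmq

# Get the ZeroMQ context
context = zmq.Context()

# Create a publisher socket and connect it to the broker's subscriber port
socket = context.socket(zmq.PUB)
socket.connect('tcp://127.0.0.1:5550')

# Give the connection a moment to complete; a publisher that sends
# immediately after connecting can lose its first messages
time.sleep(1)

# Send a message to the broker
socket.send(b'Hello from the publisher')

# Close cleanly so the message is flushed before the script exits
socket.close()
context.term()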

With these three components, we have a publisher that sends messages, a subscriber that receives messages, and a broker that forwards messages from any connected publisher to any connected subscriber.
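
To watch all three roles work together without opening three terminals, the following sketch runs them in a single Python 3 process. It makes several assumptions that differ from the scripts above: it uses in-process (`inproc://`) endpoints with made-up names instead of TCP ports, it runs the broker in a thread, and it calls `zmq.proxy`, the newer pyzmq name for the forwarding loop, in place of `zmq.device`. The publisher resends until the message arrives, because a publish socket drops anything sent before the subscriptions have propagated.

```python
import threading
import zmq

context = zmq.Context()
ready = threading.Event()

def broker():
    """Forward everything from the publisher side to the subscriber side."""
    frontend = context.socket(zmq.SUB)
    frontend.bind('inproc://publishers')
    frontend.setsockopt(zmq.SUBSCRIBE, b'')
    backend = context.socket(zmq.PUB)
    backend.bind('inproc://subscribers')
    ready.set()
    try:
        zmq.proxy(frontend, backend)      # blocks until the context is terminated
    except zmq.ContextTerminated:
        pass                              # normal shutdown path
    finally:
        frontend.close(linger=0)
        backend.close(linger=0)

thread = threading.Thread(target=broker)
thread.start()
ready.wait()                              # bind the endpoints before connecting

subscriber = context.socket(zmq.SUB)
subscriber.connect('inproc://subscribers')
subscriber.setsockopt(zmq.SUBSCRIBE, b'')

publisher = context.socket(zmq.PUB)
publisher.connect('inproc://publishers')

received = None
for attempt in range(50):
    publisher.send(b'hello from the publisher')
    if subscriber.poll(100):              # wait up to 100 ms for a delivery
        received = subscriber.recv()
        break

publisher.close(linger=0)
subscriber.close(linger=0)
context.term()                            # unblocks the proxy in the broker thread
thread.join()
print(received)
```

The retry loop is only there to make the demonstration reliable. A long-running instrumented program connects once and keeps publishing, so losing the first few messages during startup is usually acceptable for metrics.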

Putting it all together

Zibrato is based on the extended PubSub pattern described above, and uses the same three basic components.

On the front end, we have the Zibrato class that provides the instrumentation methods. This is our ZeroMQ publisher. The code itself is very light, and all it needs to do is get the message to our broker, so the net impact on performance is very low.

In the middle, we have a broker coded much like the example above. It is a simple forwarder that accepts connections from multiple publishers and forwards messages to multiple subscribers.

On the back end, we have a library that implements the Librato HTTP API using the Python Requests library to store the metrics we've recorded. It flushes the data to Librato on a set schedule, including rolling up the counters. The Librato backend inherits from a standard backend class, so it's easy to implement other kinds of backends too - like simply outputting to a log file or sending to a central Statsd server.
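
The shape of such a backend can be sketched as follows. This is an illustration under assumptions, not the Zibrato source: the class names, the `receive`, `flush`, and `store` methods, and the metric names are all hypothetical, and the example stores text lines where the Librato backend would make an HTTP request.

```python
class Backend:
    """Hypothetical base class: buffer metrics, then flush them in one batch."""

    def __init__(self):
        self.counters = {}
        self.gauges = {}

    def receive(self, kind, name, value):
        if kind == 'counter':
            # Roll up: keep a running total instead of every increment
            self.counters[name] = self.counters.get(name, 0) + value
        else:
            # Gauges (and timers, in this sketch) keep the latest value
            self.gauges[name] = value

    def flush(self):
        """Hand the buffered metrics to store() and start a new interval."""
        batch = {'counters': self.counters, 'gauges': self.gauges}
        self.counters, self.gauges = {}, {}
        self.store(batch)

    def store(self, batch):
        raise NotImplementedError


class LogBackend(Backend):
    """Stores batches as text lines instead of calling a remote API."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def store(self, batch):
        for group, values in sorted(batch.items()):
            for name, value in sorted(values.items()):
                self.lines.append('%s %s=%s' % (group, name, value))


backend = LogBackend()
backend.receive('counter', 'logins', 1)
backend.receive('counter', 'logins', 5)
backend.receive('gauge', 'queue_depth', 7)
backend.receive('gauge', 'queue_depth', 3)
backend.flush()
print(backend.lines)
```

A subclass only has to supply `store`, which is what makes it cheap to add a log file or Statsd target alongside Librato. In a running backend, `flush` would be called on a timer rather than by hand.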

Credits

Isaac Newton once said, "If I have seen further it is by standing on the shoulders of giants." At best I squat and risk falling off the giants' shoulders, so I prefer the earlier quote from Isaiah di Trani, who said, "Who sees further a dwarf or a giant? Surely a giant for his eyes are situated at a higher level than those of the dwarf. But if the dwarf is placed on the shoulders of the giant who sees further?... So too we are dwarfs astride the shoulders of giants. We master their wisdom and move beyond it."

ZeroMQ is the brilliant work of Pieter Hintjens and iMatix Corporation. It is a powerful and flexible messaging platform and I highly recommend it for asynchronous applications. I also recommend reading Pieter's ZeroMQ Guide. It's lengthy and comprehensive, but it's quite accessible and even an enjoyable read.

http://zeromq.org/
http://zguide.zeromq.org/

I first became aware of Librato on the Ruby Rogues podcast #62 featuring Joe Ruscio, Librato's CTO and cofounder. They've done an excellent job of making metrics easy. Librato offers free development accounts and a free month of production with very reasonable pricing thereafter.

https://metrics.librato.com/
http://rubyrogues.com/062-rr-monitoring-with-joseph-ruscio/

Zibrato was initially inspired by Etsy's Statsd package, a Node.js service that, coupled with Graphite (written in Python and Django), provides a full asynchronous instrumentation stack. On a related note, check out Steve Ivy. He has not only written a Python library for interfacing with Statsd, he's also reimplemented it in Python.

https://github.com/etsy/statsd/
http://graphite.wikidot.com/
https://github.com/sivy

I use Kenneth Reitz's Requests library: HTTP for Humans. This library makes web interactions painless.

http://docs.python-requests.org/

My testing setup has benefited greatly from Gary Bernhardt's Expecter package, and in the future I'll probably refactor my tests to also use his Dingus library. Since I first learned BDD in Ruby, Gary was very helpful in bridging my knowledge gap from Ruby to Python.

https://github.com/garybernhardt/expecter
https://github.com/garybernhardt/dingus

I've become increasingly impressed by the base version of Continuum Analytics' Anaconda Python. It now goes on each of my development machines and virtual development environments.

https://store.continuum.io/cshop/anaconda/

Special thanks to Tracy Harms (https://twitter.com/kaleidic) who spent several days pair programming with me on the Zibrato project. His feedback and insight were invaluable.

~~~

Bio: Rob Martin is a developer at i.TV in Provo, Utah, USA. One of the cooler parts of his job is that he's expected to learn every language used in their stack. Before i.TV, he did Ruby at a Python shop, Python at a PHP house, and Perl on the factory floor.

Rob Martin is version2beta most places online. Follow him on Twitter (@version2beta), Github.com (https://github.com/Version2beta), and on his blog (http://version2beta.com/). i.TV is hiring. Email rob@version2beta.com for more information.
