Paxos Simply Explained

When I was studying distributed algorithms, I wondered why there are surprisingly few simple algorithms, or algorithms with simple explanations, in distributed systems. The existing ones were often quite complicated or very old, like the Echo algorithm, which was discovered by Ernest J.H. Chang in 1982.

One reason is that distributed systems are hard. There are impossibility results like the FLP impossibility theorem by Fischer, Lynch and Paterson, which says that no deterministic fault-tolerant consensus protocol can guarantee progress in an asynchronous network.

The problems that distributed algorithms try to solve are also hard. Normal algorithms are about sorting and searching in graphs and strings. They are described in detail in the algorithm books by Sedgewick or in the one by Cormen, Leiserson and Rivest. Distributed algorithms are more complicated: you have to deal with different times and race conditions, messages which never arrive, processors that may fail, etc. Therefore distributed algorithms are often about avoiding these traps, like mutual exclusion to prevent race conditions, or about treating a distributed system as a unified one, which leads to atomic commits, election, consensus, etc.

One such algorithm is the Paxos algorithm by Leslie Lamport. It is an algorithm for solving the problem of consensus in a network of unreliable processors. Leslie Lamport himself wrote a paper called “Paxos Made Simple” because people complained that the algorithm is too difficult to understand. And it turns out it might not be that difficult to understand the basics at all. It is quite similar to a common sense approach to assembling a team of participants, let us say to play a game of “Kicker”.


Many startups have a “Kicker” table, where employees can occasionally play a game of table football for recreation. A common sense approach to assembling a team of participants is simple. Let us say the employees communicate over a chat program like Slack or Skype, and share a common channel. A typical communication might look like this:

Tom wants to propose a game
5:02 Tom: 1
5:03 David: 2
5:35 John: 3
5:39 Andrew: 4
5:40 Tom: Go
5:40 David: Go
5:41 John: Go
5:41 Andrew: Go
=> Round successful, game can be started

When the last of the 4 necessary team members has agreed to accept, the team has reached consensus on who wants to play, and the game can be started immediately. Let us look at the sequence of steps in a bit more detail:

  1. The first participant who wants to start a game, here the one named “Tom”, starts to count and asks the others to prepare a game. We call him the proposer.
  2. A second participant promises to participate, increases the counter and sends a corresponding message. He is called an acceptor.
  3. If three acceptors have sent their promise to participate, and the counter has reached the necessary limit of 4, which we call a quorum, then the next phase can begin.
  4. In the next phase the proposer can ask for acceptance from the acceptors. Each acceptor must acknowledge the acceptance by sending a message like “Go”.

If not enough acceptors can be found to participate in the game, the round fails, and the game cannot be started. Then a new proposer must make a new proposal, and a new round is started.

John wants to propose a game
6:12 John: 1
6:13 Anne: 2
6:22 Mike: 3
7:15 Aaron: 4
7:16 John: Go
7:17 Anne: Too late, I am already going home
=> Round failed

And while it is far simpler than the real Paxos algorithm, the basic phases are quite similar. We have proposers and acceptors, phase 1 (Prepare & Promise) and phase 2 (Accept & Accepted), rounds can succeed or fail, and in the end the team must reach consensus on who wants to play. You see, sometimes things as simple as chat programs can help to explain complicated distributed algorithms. Slack can help to understand Paxos.
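The two chat rounds above can be written down as a small simulation. This is only a sketch of the Kicker analogy, not of Paxos itself: real Paxos works with numbered proposals and a majority quorum, which are left out here. The names `Round`, `run_round` and `QUORUM` are made up for this illustration.

```python
from dataclasses import dataclass, field

QUORUM = 4  # players needed for one game, as in the chat example


@dataclass
class Round:
    """One proposal round in the chat channel (hypothetical model)."""
    proposer: str
    promised: list = field(default_factory=list)
    accepted: list = field(default_factory=list)

    def __post_init__(self):
        # The proposer opens the count with "1".
        self.promised.append(self.proposer)

    def promise(self, name):
        """Phase 1: a participant joins the count until the quorum is full."""
        if name not in self.promised and len(self.promised) < QUORUM:
            self.promised.append(name)

    def has_quorum(self):
        return len(self.promised) == QUORUM

    def accept(self, name, go=True):
        """Phase 2: a promised participant confirms with "Go" or declines."""
        if go and self.has_quorum() and name in self.promised and name not in self.accepted:
            self.accepted.append(name)

    def succeeded(self):
        # The round succeeds only if everyone who promised also said "Go".
        return self.has_quorum() and len(self.accepted) == QUORUM


def run_round(proposer, joiners, decliners=()):
    """Play one round: count up first, then collect the "Go" messages."""
    rnd = Round(proposer)
    for name in joiners:
        rnd.promise(name)
    for name in rnd.promised:
        rnd.accept(name, go=name not in decliners)
    return rnd.succeeded()


print(run_round("Tom", ["David", "John", "Andrew"]))                     # True
print(run_round("John", ["Anne", "Mike", "Aaron"], decliners={"Anne"}))  # False
```

The first call replays Tom's successful round, the second one replays John's round where Anne goes home before saying “Go”.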

What makes a good developer?

You have certainly heard of those rockstar developers who are 10 or 100 times more powerful and productive than others. What is their secret? What makes a good developer? Well, good or bad is always subjective. Good according to what metric? Taking drugs and damaging property certainly does not make you a rockstar developer. Often the rockstar developers are simply 10 or 100 times as productive because they use the code of 100 or 1000 other developers in a clever way.

I have been a developer for more than 20 years. During my life as a software developer I have met many kinds of developers: ordinary people, young students, experienced veterans, etc. The best ones were often immigrants, newcomers, misfits, the round pegs in the square holes that Steve Jobs mentioned. They were the ones who were passionately curious, loved programming, and relentlessly debugged their programs, because they had the constant drive to learn something new, and the burning desire to prove that they are something and can do special things.

What is a good developer? The ones who thought they were very good often were only good in a certain aspect they valued more than anything else. Some were obsessed with fast code, others were obsessed with beautiful code. While a certain amount of speed and beauty is certainly useful, obsession is rarely advisable. In real life there are always tradeoffs to make. Quality is important, and performance, too. Yet quality is not the same as beauty. And questions remain. How fast is fast enough? How beautiful is beautiful enough? Even if the commit messages in your code repository form a poem when you collect them, it does not mean that your code will work.

According to Larry Wall, the original author of the Perl programming language, there are three great virtues of a programmer: Laziness, Impatience and Hubris. Unfortunately all of them are more or less antisocial, but useful. A lazy developer will create less code, an impatient one fast code, and an arrogant one beautiful code that you can be proud of. If you are one of them, then congratulations, you will probably have major difficulties getting along with others. They will become angry because of your lack of support, your impertinent laziness or your arrogant remarks.

Getting along with others is an important ability, too. Applications are always developed by teams. If you are not able to work with others respectfully, then you will not be successful in the long run. Therefore for me a good developer is someone who produces less code, fast code, and beautiful code that works. All of these points are important. And at the same time this person should be humble enough to get along with others. That is not easy. Yet the basic requirements are easy and free: the willingness to learn new things and to explore the latest trends, because technology is always evolving, and the willingness to get along with others, because programming always happens in teams.


Test Driven Development


Last week I was at the TDD Geecon conference in Poznan, Poland. It took place in a cinema. What a cool place for a conference: PowerPoint presentations on a big cinema screen. My overall impression is that testing and continuous integration have become mainstream and common practice among developers. TDD is just one aspect, a special emphasis on a test-first approach. For an overview of TDD I recommend Steve Freeman’s presentation “Ten Years of TDD” and his book “Growing Object-Oriented Software, Guided by Tests”.

Last year there was a bit of a controversy about whether TDD is dead. I think it is rather Ruby on Rails which is dead, not TDD. I wrote applications in Ruby on Rails for the last 8 years, but in the last year I have mostly developed in JavaScript. With CoffeeScript, CommonJS modules, Bower and the whole emerging JS ecosystem, JavaScript is becoming very powerful, and you can finally use object-oriented programming in JS as well. New powerful JavaScript libraries and frameworks appear frequently. Rails seems to fade into the background.

I observed earlier that testing is like cleaning: as a cook you can cook in a dirty kitchen, but in the long run it is not advisable to let bugs grow in the kitchen. Similarly, as a developer you can develop applications without tests, but it is not advisable to have undetected bugs in the application. And if you always clean a bit, that is to say if you always check in your code a little bit cleaner than when you checked it out, you will eventually arrive at a clean system (like a liveness property in distributed computing).

Tests are like seat belts. There was a time when we drove without seat belts and had no insurance at all. Today seat belts are required almost everywhere when driving. They increase safety and help to make sure that nothing bad happens. Just as in real crash tests, we try to break the system in tests in every possible way. The continuous execution of tests is the task of CI servers, which rely on good test coverage. They are like insurance, possibly costly and cumbersome to configure, but once they are set up, they increase safety a lot and are very comfortable.




If tests are like seat belts, then mocks can be compared to crash test dummies: they look and act a bit like the real objects, but they are just fake. And they are quite useful, if they are not overused. Overuse can be costly and bad. It is important to use safety tools like tests and dummy objects in the right way. A few points I noted at the conference were:

  • TDD guarantees that your code has tests. Because you write tests first and code later, the code you write is always covered by tests, and can be refactored well later on.
  • Mocks are indispensable for TDD, like crash test dummies for car safety, but bad if you overuse them. Do not overuse mocking.
  • A single test should only test one thing, one rule, or one piece of business logic. If the test fails, you know exactly which rule has been violated.
  • Tests should reflect what a method does, not how it does it. We should test behavior, not the underlying algorithm or implementation. If you test the implementation, you cannot change it without breaking the test.
  • Sometimes duplication can be useful to avoid duplication: duplicated test cases are useful if they help to remove repetition in the code through suitable refactoring. In the end more repetition in tests could mean less repetition in code.
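A small sketch may make the points about mocks and behavior more concrete. It uses Python's standard `unittest` and `unittest.mock` modules. The `Checkout` class and its payment gateway are invented for this example and do not come from any talk at the conference.

```python
import unittest
from unittest.mock import Mock


class Checkout:
    """Hypothetical class under test: charges a total through a payment gateway."""

    def __init__(self, gateway):
        self.gateway = gateway

    def pay(self, prices):
        total = sum(prices)
        if total <= 0:
            raise ValueError("nothing to pay")
        return self.gateway.charge(total)


class CheckoutTest(unittest.TestCase):
    def setUp(self):
        # The mock is the crash test dummy that stands in for a real gateway.
        self.gateway = Mock()
        self.gateway.charge.return_value = "ok"

    def test_charges_the_sum_of_all_prices(self):
        # One rule per test: the gateway is charged exactly once with the total.
        Checkout(self.gateway).pay([5, 7])
        self.gateway.charge.assert_called_once_with(12)

    def test_rejects_an_empty_basket(self):
        # Another rule, another test: paying for nothing is an error.
        with self.assertRaises(ValueError):
            Checkout(self.gateway).pay([])
```

Both tests describe what `pay` does, not how it computes the total, so the summation could be rewritten without touching them. Saved as a file, the tests can be run with `python -m unittest`.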

If we take a look back, we notice that tests have always been important in engineering. In rocket science it is well known that a rocket which looks perfect on paper but has not been tested is more likely to end in an explosion than in orbit. For the Saturn V moon rocket, for instance, each element was tested individually over and over again before the first launch of a complete rocket: the rocket engines, the different stages, the Launch Escape System, etc. I read somewhere that there was not a single element of the Saturn V rocket which had not been tested thoroughly before. Tests can bring us to the moon and back :-)

Tests are also a bit like exams. They examine if the system or the code fulfills all the necessary requirements. In the far future, when we possibly have deep learning in autonomous entities, we might write only tests, and the system tries to pass the tests after it has learned and trained for a while by itself. You write the test and the system does the rest. In this sense, TDD could be the future.

Photo Credit:
– Don’t be a Dummy from Brett Klger
– Touring Club Suisse/Schweiz/Svizzero TCS via Compfight cc

Coding and Communication


Many of the best coders, developers and programmers have one problem: they do not know how to communicate well. Or they do not want to communicate. They know how to write code in the most complicated languages, but they do not communicate well with their peers, neighbors and colleagues, although they communicate well with their machine all the time (by typing, hacking, pointing, etc.). Either they do not speak the language well, they find it boring, or they do not like to waste time talking. They would rather sit with their headphones on and talk to their computer, which is of course what they are paid for. But this lack of communication is a problem, because every developer in a team (a developer rarely works alone) would like to know what the other team members are doing, have done and plan to do.

But there is indeed a way to get developers to talk with each other: give them a program to communicate with. Give them a chat program, and they can communicate by coding. Developers only really talk with each other if they can use an application for it. Luckily we have plenty of chat programs like Skype, Campfire, HipChat or Slack. If we need to exchange larger texts we can use email and wikis.

Likewise developers are always happy if their fellows show them what they have done, i.e. their code, their work, and how it works. Unfortunately they usually won’t do it. They only show their code to each other if they must, or if they can use an application for it, for example a version control system with a nice GUI like GitHub, GitLab or Gitorious. The pull requests from GitHub or merge requests from GitLab can be used for code reviews. Actually, this is one of the best features of GitHub, isn’t it?

Finally, developers only ask others for help if they can use an application for it, like Stack Overflow for instance. Maybe instead of forcing programmers to communicate in a language they do not like, it is better to give them an additional tool they like, an application. By using this application they can communicate by coding. Many of the most promising startup companies at the moment, like Stack Overflow, GitHub, or Slack, are actually tools for coders to communicate with each other.

The network cable picture is from Flickr user tueksta

Mind the gap between platform and requirements


You probably know the famous “Mind the Gap” signs in the subway (for example in London). They remind you of the gap between train and platform. As developers we should always be aware of the gap between platform and requirements. It is the task of the developer to close the gap between framework and requirements. But if the gap is too large, you might stumble, and the risk of failure rises.

If you need more than a few lines for a “Hello World” program, then the gap is apparently too large, and you are probably using the wrong language, library or framework. If you already need many lines of code for a very simple problem, then you will of course need many more lines for a complex real-world problem. Probably too many to keep it simple, which is rule number 1 in software development. In order to bridge the gap without stumbling (or falling into the abyss) we often use plugins, libraries or frameworks.

Actually, closing the gap means closing the gap on multiple levels. Frontend development means adapting the views and templates until the gap between things which should be displayed and things that can be displayed is closed. Things which should be displayed are typically specified in the requirements and the wire frame models. As a developer you tweak and twist your interface until every pixel looks like it should.

Backend development means similarly adapting the data model and the business logic until the gap between things which should be stored and things which can be stored is closed.

The Flickr photo is from user comisariopolitico

The rise and fall of the Microsoft empire

People have always been fascinated by the rise and fall of empires, as the popularity of Edward Gibbon’s monumental work “The History of the Decline and Fall of the Roman Empire” has shown. Even a large and mighty empire can crumble and fall. The Roman Empire vanished. The British Empire is gone. It can happen to tech empires as well: does anyone remember the rise and fall of DEC? DEC (“Digital Equipment Corporation”) was a major American company in the computer industry and a leading vendor of computer systems, software and peripherals from the 1960s to the 1990s. The empires of IBM and DEC are gone. IBM is only a shadow of its former self, and DEC has vanished with the emergence of Microsoft. Now, there is no reason why Microsoft should not have a similar fate. Empires can rise and fall again.

The reason why Microsoft became a successful empire is not that their software was superior. Neither MS-DOS nor the x86 processors from Intel were better than comparable products. The x86 processor architecture is indeed often considered ugly. But they were cheap and widespread. Compatibility was the key. PCs with MS-DOS were the business standard. They were good enough to run simple word processing and spreadsheet software. Software written for MS-DOS would run on any MS-DOS computer. A lock-in effect with a positive feedback loop set in: people wrote software for PCs because PCs sold well and were widely distributed in the business world, and people in turn bought PCs because there was a lot of software available for them. Soon everybody in the business world was using PCs, and the old DEC empire started to crumble. Microsoft used the new market power to gain a competitive advantage in the world of window systems. Again compatibility was the key. How many people remember the OS/2 operating system from IBM or VAX/VMS from DEC today? All commercial competitors disappeared until only Microsoft was left with Windows. Linux was able to survive in the open-source corner, a niche that is hard to tackle even for large corporations. But it was no serious opponent in the world of window systems.

This has changed. There are 750 million Android devices today. Times in the IT industry change fast. Now apparently the Microsoft empire starts to decay (or at best to stagnate). The very pillars which made Microsoft successful begin to crumble. The new Windows 8 system is no longer compatible with the classic world of Microsoft Windows software. There is no longer a central desktop where Windows applications would run. There is a desktop, but it is hidden behind a new interface. As you know, Windows 8 comes with a new colorful interface named “Metro”, which is intended to replace the desktop. Microsoft wants people to use the new “Metro” interface instead of the classic desktop, and wants people to download apps from their app store, similar to Apple’s App Store or Google Play (the former Android Market). Apparently Microsoft tries to keep pace with their competitors. Unfortunately they seem to damage the very pillar they are built on: compatibility.

Using old Windows software on a new Windows 8 system is a hassle. Older Windows programs, for instance, often use a help file in the Windows Help format. This format is no longer supported in Windows 8. Just try to enable the legacy Windows help system winhlp32 on Windows 8. It is annoying. If you start an old application which uses Windows Help, then you might get the following message: “The Help for this program was created in Windows Help format, which depends on a feature that isn’t included in this version of Windows. However, you can download a program that will allow you to view Help created in the Windows Help format.” If you do this, and follow the official links, then you will get a link to an update of the help system, and if you try to install this update, an error message appears which claims “the update is not applicable to this computer”. Great. It is possible to get it working, it is just difficult. There is in fact a non-functional stub of WinHlp32.exe in Windows 8, which shows the above message that the help does not work. It is possible to replace the WinHlp32 file, but the “TrustedInstaller” prevents you from doing it. Obviously Microsoft does not mind or does not care if older programs (for their own platform) do not work.

From my humble point of view, Microsoft needs to fix two things: they need to ensure compatibility as much as they can (for example by fixing things like the WinHlp32 problem, even if it is a minor issue), and they must win back the hearts of business customers. These are the pillars their empire is built on.

  • Microsoft successfully managed to alienate many of their loyal developers and now even their main customers, i.e. small and large businesses. Their main software is called Office, and it is used in offices: in most offices I know there are PCs running Microsoft Windows. If MSFT continues to alienate these customers, then they will have a problem. These users do not have touch screen devices, and they are used to a classic graphical user interface with desktop and mouse input. They want to use the Office software they know (Word, Excel and PowerPoint) in the way they always used it. The new Metro interface is not useful at all for classic computers with keyboard and mouse. By hiding the old desktop behind the new Metro UI, the multi-dimensional window UI is essentially being replaced by a 2-dimensional UI made of rectangular colorful tiles, like the ones we had in the age of DOS. The new Metro UI and the flat colored “live tiles” feel like a step back to that age. A finger is always less precise than a mouse pointer, just because it is much wider. It may be useful for pointing at pictures or icons, but it is not useful for working with office software. A real step forward would have been a 3D UI (as found in games today), where the traditional desktop could be accessed through windows. That would have been revolutionary.
  • Apparently they neglected the compatibility of existing Windows software. This was always an advantage of Windows. Now traditional Windows software does not run as well as it always did, and the new Microsoft app store offers only a few apps. Whether Microsoft’s app store will offer as many good apps as the stores from Apple and Google remains doubtful. Developers tend to develop software for widely distributed systems, but most of the new devices run Android (i.e. a Linux derivative). Users increasingly buy and use computers without a Microsoft OS, either smartphones (iPhones and Android phones) or tablets (iPads or Android tablets). Whether Windows phones will be successful is an open question. Any UI rises and falls with the number of good apps available for it. A total replacement of the old desktop in the medium term would render all existing applications useless. And when it comes to devices with touchscreens, iPad and Android devices are at least as good as the new Windows 8, but more widely distributed.

This means Microsoft loses all traditional advantages at once by the radical switch to a new UI. We will see how it turns out. I have a feeling that it will not turn out well. Too much change, and too late. Is this the beginning of the end of the Microsoft empire? Will they end like IBM, a pale shadow of their former self? People increasingly buy smartphones and tablet PCs, but they are not from Microsoft: they are mainly from Apple (iPhone & iPad), or equipped with Android. We have seen in the microcomputer revolution what happens to older, larger systems if they are increasingly replaced by newer, smaller systems with a new operating system. I am curious how it will turn out this time.

( Photo Credit: Pedro Vezini via Compfight cc )

Unsteadiness of progress in development

There is a certain unsteadiness and ruggedness in the software world. Software development often feels like moving on a rugged landscape: sometimes it goes amazingly fast, but often you are just stuck and do not make progress for hours. Either you make a lot of progress in a short time, or you make no progress at all for a long time span. There are times when you make a few keystrokes and everything just works, for instance when you stick a few plugins together, make some function calls and add a few lines of code. These are the good times, when you think you have achieved world domination and can move an army of bits with a few keystrokes, when the programmers are like little gods in their little self-made binary universes.

And then there are times when things look desperate, when nothing works at all, and you do not know why, and cannot figure it out. An exception has been raised, an error occurs, or something does not work, and you have no idea why. Plugins for instance are wonderful if they work out of the box, automatically. But if they do not work, then it becomes cumbersome. The more automated a plugin or component is, the more annoying it is when it stops working, because in this case you have no other option than examining it in detail, which means drilling down through the simple shell into the complex core where you understand nothing at first.

Version conflicts and dependency hells can be very time-consuming and annoying, too. Ruby-on-Rails programs for example need the right combination of Ruby Version (for example Ruby 1.8.7 or 1.9.2), the right Ruby-On-Rails Version (2.3.8 or 3.2), and the right RubyGems Version (say 1.3.5). The gems or plugins have their own versions, too. The whole system only works if everything fits together. In the beginning this is no problem, for a new system usually everything is up-to-date. But then time goes on, and you have to update the Linux version, or the Ruby version, or the RubyGems Version. And suddenly the other versions no longer fit. It can be very frustrating to get the system working again in this case.
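One way to notice such drift early is to write down the set of versions the application was last tested with and compare it with what is installed. The following is only a sketch of that idea: the version numbers are the examples from the paragraph above, treating them as one working set is an assumption, and the `PINNED` table and the `mismatches` function are hypothetical names, not part of any Ruby or Rails tooling.

```python
# Versions the application was last tested with. The numbers are the examples
# from the text; that they form one working combination is an assumption.
PINNED = {"ruby": "1.8.7", "rails": "2.3.8", "rubygems": "1.3.5"}


def mismatches(installed, pinned=PINNED):
    """Return {component: (pinned, installed)} for every component that drifted."""
    return {
        name: (wanted, installed.get(name))
        for name, wanted in pinned.items()
        if installed.get(name) != wanted
    }


# Simulate an update of the Ruby version only.
after_update = dict(PINNED, ruby="1.9.2")

print(mismatches(PINNED))        # {}
print(mismatches(after_update))  # {'ruby': ('1.8.7', '1.9.2')}
```

The check does not tell you which combinations work, it only shows which component moved away from the last known working set.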

Software programs usually are not fault-tolerant systems at a basic level: there is no graceful degradation in machine language. On the lowest level, in machine language or assembly, the program works only if there is no error. A single error can bring the system to a full stop. Either the computer program runs, which means you have to get every instruction right, or it hangs or throws an exception and stops completely. It is of course usually possible to figure the problem out, if you have enough time, but sometimes it takes a long time to understand what is going on in the various stages of debugging.

Photo Credit: tim caynes via Compfight cc