Looking back at FSEN 2015

Not a liveblog about a conference this time, but hey, still the talks were interesting!

From 22 to 24 April I attended FSEN 2015. For me, the main reason to attend the conference was presenting my work on benchmarking parity games. The conference was extremely well-organised at IPM in Tehran, in particular the students part of the local organisation deserve a special word of thanks for the effort they made in making the conference into an unforgettable experience! By the way, did I already say that the people in Iran are very friendly, and that it is a really nice country to visit? Well, now you know, anyway!

The program of the conference was nicely balanced, with topics such as software architecture, runtime verification, static analysis, software testing, and formal methods. There are a few talks that I would like to discuss in a bit more detail (disclaimer: the choice for these particular talks does not say anything about the quality of the talks).

Automated Integration of Service-oriented Software Systems by Paola Inverardi

As the title suggests, in her talk Paola concentrates on the ways we integrate software. Today, when developing software, having to integrate both black box components (those of which we do not have the source code), and white box components (for which do have access to the code) is a fact of life. The goal is to build a software system satisfying our requirements.

Given the nature of today’s systems, getting 100% correct systems is impossible (especially since we do not really know what we get when using blackbox components) therefore it makes sense to concentrate on what we can do. The statement Paola makes is the following:

Final system behavior and dependability depends on the integration code, integrating white box and black box components.

In other words, the code that we write to integrate systems is what we can control, and therefore it is the key to building dependable systems from components. However, in order to achieve confidence in the system we need:

  • behavioural models,
  • the process lifecycle, and
  • automation.

For this, she proposes to view software science as an experimental science. That is, we need to move from a creationist world in which formal models come with the code, to an experimental world in which the knowledge of software is limited to what can be observed.

In her talk Paola discussed two experimental approaches that help to move towards this goal:

  • Learning interaction protocols using a WSRL description of an interface (based on the tool Strawberry).
  • Synthesising integration code using, e.g., integration patterns.

Important but hard question to ask when using these approaches: when is your model complete enough with respect to a given goal? In partial when using quantitative and/or partial models we need more research!

QuickCheck: Testing the Hard Stuff and Staying Sane by John Hughes

The second keynote of the conference was by John Hughes. This was one of the most entertaining talks I have seen in the past years, while still covering a lot of content.

QuickCheck is a tool for automated test case generation, and when a test fails a minimal counter example is generated. It used Erlang as a declarative language for modeling and specification. Using a single model, many bugs can be found using random generation of test cases. John explained the theory behind the tool using a wonderful interactive example of a circular buffer, implemented in C, which of course contained all sorts of subtle bugs that were found using QuickCheck.

Now of course, you are wondering, does it work when applied to real world software. John showed it does by providing some industrial examples that he worked on with his company QuviQ.

When you put the car together, you can’t even boot it, let alone drive it.

  • Testing cars agains the AUTOSAR standard, which is a standard for software components in cars. This project is all about the integration of software from different vendors in a single car. They discovered around 200 problems using QuickCheck, half of which turned out the problems in the standard. The severity of the integration problems is nicely illustrated by the quote “When you put the car together, you can’t even boot it, let alone drive it.”
  • Analysing software problems that were hard to reproduce using traditional testing approaches at Klarna, race conditions in Erlang’s Dets database we discovered. For this case study it was necessary to extend test case generation to generate parallel unit tests.

The take home message of this talk is: Don’t write tests… Generate them!

Formal Methods Friday

This post is already getting quite lengthy, but I would still like to say something about the Friday of the conference. As the title of this section suggests, the Friday was mainly on formal methods. The level of talks during this day was particularly high (although I admit, that might have to do with my preference). Two extremely interesting talks were the following (I do recommend checking out the papers!):

  • A behavioural theory for a pi-calculus with preorders by Jean-Marie Madiot. An excellent paper introducing an axiomatisation for a pi-calculus in the context of preorders. He discusses interesting problems with respect to congruence that pop up in the process.
  • A theory of integrating tamper evidence with stabilization by Ali Ebnenasir. In this talk Ali takes a look at system correctness, but not in a traditional sense. Instead, he wonders what happens in unexpected circumstances; for instance, what happens if an adversary tampers with the system? He goes into the notions of stabilisation (a legitimate state is eventually reached by system actions only) and introduces tamper-evident stabilisation, which is weaker than stabilisation, and allows for some tampering by the adversary.

Paper: Benchmarking Parity Games

Update 26 May 2015: Slides added.

Next week, on April 24, I will present my paper Benchmarking Parity Games at FSEN 2015. The core contribution of this paper is a set of parity games that can be used for benchmarking purposes.


While working on my PhD thesis, and several papers on parity games that are part of it, I observed that papers in the literature that discuss parity game algorithms take one of the following approaches:

  • the paper focuses on the theoretical complexity of the algorithm.
  • the paper describes an experimental evaluation of an algorithm, but uses a limited set of parity games for this evaluation.

An additional observation for the second category of papers is that each paper defines its own set of parity games to do this evaluation. If we want to compare experimental results, for instance because we want to see whether our own algorithm or heuristic is an improvement compared to an existing algorithm, it is problematic that we cannot do so based on a standard set of games.


In this paper, I provide a set of parity games that can be used to overcome the problem illustrated above. The set consists of games representing different problems: synthetic games, i.e. random games and families of games that are hard for some existing algorithms; model checking games; equivalence checking games; and games for satisfiability and validity of modal logics.

Additionally, I describe an analysis of the games with respect to a set of structural properties, including a newly developed notion of alternation depth for parity games. This new notion is an attempt to more accurately describe the inherent complexity of a parity game, along the same lines as the different notions of alternation depth that exist for the modal mu-calculus. Details can be found in the paper.

Concluding remarks

Currently the set of benchmarks does not include parity games generated from synthesis problems; such games would form a very welcome addition. If you would like to contribute such games (or any other games) please contact me (e.g. at ), I would be more than happy to include them.

The analysis of the benchmarks that is described in the paper also gives rise to some interesting research questions:

  • Can structural properties of graphs be computed more efficiently (especially width measures of graphs could not be computed for reasonably sized instances)?
  • Are there heuristics that can approximate the graph measures that cannot be computed exactly?
  • Can the theoretical complexity of parity game algorithms be improved by considering the notion of alternation depth that I propose?

Further reading

An extended version of the paper is available as J. J. A. Keiren. Benchmarks for Parity Games. arXiv.org [cs.LO] No. 1407.3121. 2014.


  • Parity game generator GitHub repository. This is the code I used to generate the parity games. For instructions see the ReadMe in the repository.
  • PGInfo GitHub repository. This is the tool that I created to determine the structural information of the parity games.
  • The generated games, and the corresponding data is available from the servers of the Open University in the Netherlands, you can download the following:
    • Model checking: Parity games encoding model checking problems.
    • Equivalence checking: Parity games encoding equivalence checking problems.
    • PGSolver: Parity games generated using PGSolver. These include: random games, and games that are hard for certain algorithms.
    • MLSolver: Parity games generated using MLSolver. These are games that encode satisfiability and validity problems for modal logics.
    • Data (yaml): This is a file in YAML format that includes high-level information about all parity games described before. Detailed (structural) information about the games can be found as separate YAML files in the respectipe ZIP archives.
    • Data (sqlite): This contains a database in SQLite format that was derived from the YAML files provided in the archives, and the previous data file. The sole purpose of this derived data is to enable efficient plotting of results. The plots in the paper have been based on this file.

Contributions to the tools and the sets of games are appreciated.