On Games and Simulations

This is a guest blog by Hans-Dieter Hiep, a student from Vrije Universiteit Amsterdam who worked with me to build a GUI for our work on game-based (bi)simulations. The tool described in the post is available through BitBucket, see https://bitbucket.org/jkeiren/game_based_simulation/branch/hans-java.

Update 20/3/2017: Hans-Dieter has now also started his own blog, and reblogged the post on http://www.hansdieterhiep.nl/blog/on-games-and-simulations/.

In this blog post, I would like to report on a relatively short but productive project with Jeroen Keiren and Tim Willemse. Let’s call the result of this project the Bisimulation Games tool. Work began on this project in the summer of 2016. In particular, this post covers the project’s set-up and architectural choices, serves in a small part as a promotion of the idea to do more interactive visualization of theoretical concepts in computer science, and also contains a sneak-preview of another project that replicates the most important design choice of this project.

In particular, this post does not cover the theory behind the tool in depth. We will see games for checking bisimulation and a short introduction how the application handles a particular scenario. For a more thorough background, see Games for Bisimulation and Abstraction (on Arxiv).


An important aspect of this project is the interactive visualization of games for showing bisimulation relations. We work with Labelled Transition Systems (LTS) here: an abstract representation of the state space of, for example, computer programs. Another example can be seen in the window of the application (above) we have five states: t_0 and t_1, s_2 and c_3 and c_4. Between states we have transitions: these represent observable actions (eat pizza and hack) or internal actions (so-called tau).

David de Frutos-Escrig, Jeroen Keiren and Tim Willemse have worked towards a definition of a game semantics for showing a number of bisimulation relations. In particular, as part of this project, Keiren has implemented game semantics for bisimulation, branching bisimulation and branching bisimulation with explicit divergence as a Haskell program. This program defines the game state space and game state transitions. Using these two definitions one can determine whether, given a vertex, either a winning strategy exists for either the player or for the opponent.

An important part of this project is to reuse this existing implementation. Here, an important design decision was made: to port the Haskell code to Frege, a Haskell implementation for the Java Virtual Machine. This allowed me to write the visualization and interaction with games in the Java AWT/Swing framework. The Frege project can be still considered experimental, as exemplified by a minor compiler bug that was found and fixed during the execution of this project.

Winning strategies

The next step in the project is strategy extraction. The existing Haskell implementation by Keiren provided an implementation for an algorithm for extracting the set of winning vertices for each player of the game: Spoiler (played by the computer) and Duplicator (played by the human operator). For an interactive application, a set of winning vertices is not enough: a human operator can take a step that would ultimately lead to her defeat, even if a winning strategy for Duplicator exists at that point. Thus, an extraction algorithm was developed that, from the set of winning vertices, determines a strategy (assigning to a vertex the best next move) for Spoiler, that is guaranteed to eventually win if a winning strategy for Spoiler exists. This part is not proved and only tested informally, but seems to work fine for the tested examples.


An important aspect is helping human operators discover flaws in their models. The application may act as a “debugger”, showing a step-by-step walkthrough of counter-examples that may help humans understand the model. Let us see an example, that is taken from Korver (see [16] in Games for Bisimulation and Abstraction mentioned earlier).

Suppose the user selects the node 0 (blue, left image) as opponent and A (red, left image) as player vertex. Remark that this is also logged in the message pane, on top of the window. First, the winning vertices are computed by the algorithm by Keiren (and shown in the lower left corner: the expected game outcome if both players play according to a winning strategy). Then a strategy is extracted, and the first step is taken: Spoiler challenges Duplicator to mimic the action “a”, as highlighted in blue.

Consequently, the human operator must choose an internal action (“tau” to B), since the other action (“b” to D) does not match the challenge. Then the Spoiler drops the challenge (second to last message, right image), and challenges Duplicator with a new action: from 0 to 4, mimic action “b”. Since Duplicator sits still in the new state of B, it cannot undo the “tau” step. Hence, the game is concluded: Duplicator has lost, and indeed, no winning strategy for Duplicator exists for these two starting nodes.

Internal Debugging

One interesting problem encountered during the execution of the project was a bug in the Haskell framework code, that I introduced because of a misinterpretation and misunderstanding of Büchi games. The game semantics is defined as a Büchi game. Such games may have infinite plays.

From the perspective of some player, a Büchi objective is to find a strategy of a finite play where the other player loses or an infinite play in which an infinite number of points is earned, or so-called accepting states are visited infinitely often. Observe that if we visit at least one accepting state in a cycle (a Büchi objective), then in an infinite play where the player remains within this cycle, it has visited an infinite number of accepting states.

So to prevent one player from winning, its opponent has to prevent that the player has an infinite play in which it reaches its objective. Thus the opponent must ensure that it either plays a finite game where the other player loses, or an infinite play in which all of the visited states are non-accepting (a co-Büchi objective).

To help finding the source of the problem, a monad was developed that keeps track of function context that is threaded through the functional code. For example, it dumps the current labeling of some vertex set each iteration of a function. Then a GUI was developed that tracks and shows, for each step, the current state of the function. This allowed me to find where the problem was located. The bug was found to have mixed up Büchi objectives and co-Büchi objectives in the complements of winning vertex sets, having not fully understood the duality.

The same debugging monad (and the corresponding table UI with minor adaptions) were later reused to debug the strategy extraction algorithm. These tools seemed to be of good use during the development.


The project was executed in 4 weeks. This short timespan ensured that only the most important aspects were touched. However, to make the project more worthwhile in the (near) future, it seems useful to explore even more bisimulation relations.

Although the current version of the tool is a fun toy to show simple models, there are numerous downsides: it is slow in execution (Frege is not the most efficient Haskell implementation), it may contain bugs or flaws in the GUI code, and the strategy extraction may be wrong. To extend the tool for more relations, one has to adapt both the GUI code and the simulation framework. And, finally, only I have worked on the GUI part of the application, so the code is ugly or unclear for others to work with.

However, it seems that the architectural choice (of writing the core of the application in Haskell/Frege and use only Java for interactivity and visualization) is a good one, for several reasons: first, the system is naturally split in two parts and needs to have a well-defined interface between these parts. That increases clarity of thought. Secondly, Haskell seems a better suited language for defining state spaces and transitions (especially thanks to the List monad) than Java. And Java has a lot of client installations and includes a GUI toolkit out-of-the-box. With this choice one has the best of both worlds.

Indeed, to verify this architectural choice, I have worked on a “spiritual successor” of this project over the past few weeks, that culminated in a new tool. The new tool, let’s call it the Distributed Algorithms tool, allows users to explore all possible executions of some built-in algorithms on a user-defined network. This project was supervised by  Wan Fokkink.

The framework behind the Distributed Algorithms tool allows several basic distributed algorithms (for example Cidon’s traversal algorithm, see image) to be defined using a simple process algebraic framework. This framework allows one to specify the algorithm in terms of a state space, a message space and a local transition function with basic actions: sending, receiving, internal. Compare this with the game state space, game state transitions of the Bisimulation Games tool.

The user interface of the Distributed Algorithms tool is more advanced than the user interface of the Bisimulation Games tool. The Distributed Algorithms tool allows users to graphically construct networks from scratch, whereas in the Bisimulation Games tool users need to define networks textually in an external text editor. However, the Bisimulation Games tool might be perceived as easier to use. Another difference is that for the Bisimulation Games tool, one cannot undo a step: a user has to redo every action from scratch to revisit a particular choice. By allowing users to control the flow of time and undo choices, as is possible in the Distributed Algorithms tool, it is easier for users to explore different scenarios.

In the Distributed Algorithms tool, more effort and focus than before went into cleaner and better organized GUI code. Also with this project, the bulk of work is done within 4 weeks. However, one of the downsides of the above architectural choice became apparent: there is significant overhead in programming time spent translating Java objects to and from Frege’s compiled Haskell objects.

Concluding words

All in all, I believe that these visualization and simulation projects are fun ways of exploring theoretical concepts. A visualization is often burdensome to write, but I believe it pays off the effort in the long run. Visualization tools allows users (including its authors) to explore theoretical concepts on a different level. It may be tedious for some concepts to work out examples manually, and visualizing them solves that problem. Interactivity makes the concepts even more alive: it allows users to experiment with the theory without knowing the formalization behind it, and may serve as an easy introduction into a new field of study.

As a programmer, the practical nature of the tools required a good understanding of the subject matter, and some creativity to correctly implement and visualize them. As was mentioned, errors were made because of misunderstanding or misinterpretation. It seems one has to be patient (e.g. by building debugging tools) to solve one’s own problems. At the same time, one needs a certain amount of rigor to prevent introducing problems in the first place.

Finally, the choice to work with both the Java language and the Haskell language is an interesting one. I hope to work more on these kinds of projects in the near future: short in duration, diving deep in theory, and (re)presenting the results interactively as a tool that is hopefully pleasant to use.

Project name Bulk of work Java code Frege code
Bisimulation Games 4 weeks 4.689 LoC 580 LoC*
Distributed Algorithms** 4 weeks 12.625 LoC 881 LoC

Table overview of relative size of projects, *includes work by Keiren, **not yet finished.

A MOOC on system validation is born

Exciting news! Today the MOOC on system validation that I have developed with Mohammad Reza Mousavi from Halmstad University  has started. The MOOC is based on the book Modeling and Analysis of Communicating Systems, which was coauthored by Mohammad, and on which my System Validation course at TU Delft is also based.

About the course

Anyone who has ever designed an embedded system or a communication protocol involving several components executing simultaneously knows that such software is inherently susceptible to bugs. Typical problems include race conditions, deadlocks, and unexpected interplay between different components. The parallel nature of these systems makes it notoriously hard to detect such bugs using testing (timing, e.g., plays a crucial role). This course is designed to provide an introduction to the problems that arise in the design of such systems. It provides ways to model such systems and reason about them.

Related posts

The course has been in the making for some time, and I reported on the experiences in setting it up here.

Recording a MOOC on System Validation: Some Experiences

Recently I visited Halmstad University. Together with Mohammad Mousavi, we recorded a MOOC (Massive Open Online Course) on the subject of system validation. The MOOC goes with the book “Modeling and Analysis of Communicating Systems” by Jan Friso Groote and Mohammad Mousavi.

In this post, I plan to share some observations based on our MOOC recording experience.

1. The available budget and environment largely dictates the end-result. In order to design a nice and attractive MOOC, a lot of time is spent on designing the MOOC, splitting up the topics that need to be covered into compact recordings, and finding attractive material to show in the MOOC. The amount of time that can be invested is constrained by the budget that is available. In our case we were fortunate to have the help of a student in designing the material. Second, the budget and environment constrain the available recording facilities, and the amount of time spent on recording and editing the lectures. All lectures in our MOOC are the product of a single take (which does not mean that a single take was enough to every lecture).

2. At least 3 takes are needed to record a single video. Our experience is that, although you prepare your words in detail, in the first take you are searching for the right phrasing and the correct pace to cover the material. You might expect that the second take is better, but in practice it often turned out to be worse than the first. Typically, the third take was one which was reasonably good, and that could be used to produce the video for the MOOC.

3. You can record at most 2 lectures a day. We have been recording lectures with two different lecturers; our setup was to have a single lecturer covering a single topic in multiple videos. When recording these lectures, experience shows that after about 4 takes the quality of recordings of a single lecturer is dramatically reduced. Therefore, most of the time both Mohammad and me managed to record a single lecture every day (with one of us being able to do two on an exceptional day). Recording a single take felt to be about as exhausting as teaching an hour in front of a classroom audience for both of us.

4. It is hard to come up with appealing graphics. The topics we are covering are highly theoretical in nature. Depending on the level of detail you want to cover, visually attractive material is hard to come by. In our case, we mainly relied on examples and case studies in the techniques were used in order to get nice visuals. The theory is still covered in pretty theoretical slides, including some formulas and graphs. I wonder how others go about covering such theoretical material in a MOOC setting (of course, we could argue that you should not cover such material, but hey… we did anyway); if you’ve got nice ideas or pointers please respond in the comments!

5. The business model for MOOC hosting is somewhat elitist and unfair. Larger MOOC companies only host course from large world-leading institutes that contribute a large sum.
Smaller MOOC companies either ask for a huge amount per course and per edition or
often are not as visible as the larger companies. We are currently considering Canvas as a free MOOC hosting with fair visibility and quality.

The end-result still needs to be edited, and worked into the MOOC such that it can be published online. After we have done so, I will write about it again. In the mean-time I am going to test-drive the recordings in my course on system validation at TU Delft to (partly) flip the classroom. Also, I plan to discuss what I would like to do differently when recording other MOOCs in a subsequent blog post.

WSFM/BEAT 2015 Deadline Approaching

Update 25/6/2015: Changed the dates.

This year, the Workshop on Web Services and Formal Methods (WS-FM) and the Workshop on Behavioural Types have joined forces, forming the International Symposium on Web Services, Formal Methods and Behavioural Types (WS-FM/BEAT 2015). Check out the website of WSFM-BEAT 2015.

The submission deadlines are approaching:

  • Submission of abstracts: Wednesday 1 July 2015.
  • Submission of papers: Friday 3 July 2015.

Paper submission is open. Please consider submitting your papers!

Looking back at FSEN 2015

Not a liveblog about a conference this time, but hey, still the talks were interesting!

From 22 to 24 April I attended FSEN 2015. For me, the main reason to attend the conference was presenting my work on benchmarking parity games. The conference was extremely well-organised at IPM in Tehran, in particular the students part of the local organisation deserve a special word of thanks for the effort they made in making the conference into an unforgettable experience! By the way, did I already say that the people in Iran are very friendly, and that it is a really nice country to visit? Well, now you know, anyway!

The program of the conference was nicely balanced, with topics such as software architecture, runtime verification, static analysis, software testing, and formal methods. There are a few talks that I would like to discuss in a bit more detail (disclaimer: the choice for these particular talks does not say anything about the quality of the talks).

Automated Integration of Service-oriented Software Systems by Paola Inverardi

As the title suggests, in her talk Paola concentrates on the ways we integrate software. Today, when developing software, having to integrate both black box components (those of which we do not have the source code), and white box components (for which do have access to the code) is a fact of life. The goal is to build a software system satisfying our requirements.

Given the nature of today’s systems, getting 100% correct systems is impossible (especially since we do not really know what we get when using blackbox components) therefore it makes sense to concentrate on what we can do. The statement Paola makes is the following:

Final system behavior and dependability depends on the integration code, integrating white box and black box components.

In other words, the code that we write to integrate systems is what we can control, and therefore it is the key to building dependable systems from components. However, in order to achieve confidence in the system we need:

  • behavioural models,
  • the process lifecycle, and
  • automation.

For this, she proposes to view software science as an experimental science. That is, we need to move from a creationist world in which formal models come with the code, to an experimental world in which the knowledge of software is limited to what can be observed.

In her talk Paola discussed two experimental approaches that help to move towards this goal:

  • Learning interaction protocols using a WSRL description of an interface (based on the tool Strawberry).
  • Synthesising integration code using, e.g., integration patterns.

Important but hard question to ask when using these approaches: when is your model complete enough with respect to a given goal? In partial when using quantitative and/or partial models we need more research!

QuickCheck: Testing the Hard Stuff and Staying Sane by John Hughes

The second keynote of the conference was by John Hughes. This was one of the most entertaining talks I have seen in the past years, while still covering a lot of content.

QuickCheck is a tool for automated test case generation, and when a test fails a minimal counter example is generated. It used Erlang as a declarative language for modeling and specification. Using a single model, many bugs can be found using random generation of test cases. John explained the theory behind the tool using a wonderful interactive example of a circular buffer, implemented in C, which of course contained all sorts of subtle bugs that were found using QuickCheck.

Now of course, you are wondering, does it work when applied to real world software. John showed it does by providing some industrial examples that he worked on with his company QuviQ.

When you put the car together, you can’t even boot it, let alone drive it.

  • Testing cars agains the AUTOSAR standard, which is a standard for software components in cars. This project is all about the integration of software from different vendors in a single car. They discovered around 200 problems using QuickCheck, half of which turned out the problems in the standard. The severity of the integration problems is nicely illustrated by the quote “When you put the car together, you can’t even boot it, let alone drive it.”
  • Analysing software problems that were hard to reproduce using traditional testing approaches at Klarna, race conditions in Erlang’s Dets database we discovered. For this case study it was necessary to extend test case generation to generate parallel unit tests.

The take home message of this talk is: Don’t write tests… Generate them!

Formal Methods Friday

This post is already getting quite lengthy, but I would still like to say something about the Friday of the conference. As the title of this section suggests, the Friday was mainly on formal methods. The level of talks during this day was particularly high (although I admit, that might have to do with my preference). Two extremely interesting talks were the following (I do recommend checking out the papers!):

  • A behavioural theory for a pi-calculus with preorders by Jean-Marie Madiot. An excellent paper introducing an axiomatisation for a pi-calculus in the context of preorders. He discusses interesting problems with respect to congruence that pop up in the process.
  • A theory of integrating tamper evidence with stabilization by Ali Ebnenasir. In this talk Ali takes a look at system correctness, but not in a traditional sense. Instead, he wonders what happens in unexpected circumstances; for instance, what happens if an adversary tampers with the system? He goes into the notions of stabilisation (a legitimate state is eventually reached by system actions only) and introduces tamper-evident stabilisation, which is weaker than stabilisation, and allows for some tampering by the adversary.

Paper: Benchmarking Parity Games

Update 26 May 2015: Slides added.

Next week, on April 24, I will present my paper Benchmarking Parity Games at FSEN 2015. The core contribution of this paper is a set of parity games that can be used for benchmarking purposes.


While working on my PhD thesis, and several papers on parity games that are part of it, I observed that papers in the literature that discuss parity game algorithms take one of the following approaches:

  • the paper focuses on the theoretical complexity of the algorithm.
  • the paper describes an experimental evaluation of an algorithm, but uses a limited set of parity games for this evaluation.

An additional observation for the second category of papers is that each paper defines its own set of parity games to do this evaluation. If we want to compare experimental results, for instance because we want to see whether our own algorithm or heuristic is an improvement compared to an existing algorithm, it is problematic that we cannot do so based on a standard set of games.


In this paper, I provide a set of parity games that can be used to overcome the problem illustrated above. The set consists of games representing different problems: synthetic games, i.e. random games and families of games that are hard for some existing algorithms; model checking games; equivalence checking games; and games for satisfiability and validity of modal logics.

Additionally, I describe an analysis of the games with respect to a set of structural properties, including a newly developed notion of alternation depth for parity games. This new notion is an attempt to more accurately describe the inherent complexity of a parity game, along the same lines as the different notions of alternation depth that exist for the modal mu-calculus. Details can be found in the paper.

Concluding remarks

Currently the set of benchmarks does not include parity games generated from synthesis problems; such games would form a very welcome addition. If you would like to contribute such games (or any other games) please contact me (e.g. at ), I would be more than happy to include them.

The analysis of the benchmarks that is described in the paper also gives rise to some interesting research questions:

  • Can structural properties of graphs be computed more efficiently (especially width measures of graphs could not be computed for reasonably sized instances)?
  • Are there heuristics that can approximate the graph measures that cannot be computed exactly?
  • Can the theoretical complexity of parity game algorithms be improved by considering the notion of alternation depth that I propose?

Further reading

An extended version of the paper is available as J. J. A. Keiren. Benchmarks for Parity Games. arXiv.org [cs.LO] No. 1407.3121. 2014.


  • Parity game generator GitHub repository. This is the code I used to generate the parity games. For instructions see the ReadMe in the repository.
  • PGInfo GitHub repository. This is the tool that I created to determine the structural information of the parity games.
  • The generated games, and the corresponding data is available from the servers of the Open University in the Netherlands, you can download the following:
    • Model checking: Parity games encoding model checking problems.
    • Equivalence checking: Parity games encoding equivalence checking problems.
    • PGSolver: Parity games generated using PGSolver. These include: random games, and games that are hard for certain algorithms.
    • MLSolver: Parity games generated using MLSolver. These are games that encode satisfiability and validity problems for modal logics.
    • Data (yaml): This is a file in YAML format that includes high-level information about all parity games described before. Detailed (structural) information about the games can be found as separate YAML files in the respectipe ZIP archives.
    • Data (sqlite): This contains a database in SQLite format that was derived from the YAML files provided in the archives, and the previous data file. The sole purpose of this derived data is to enable efficient plotting of results. The plots in the paper have been based on this file.

Contributions to the tools and the sets of games are appreciated.


Moving On

At the end of the year I received good news on the career development front. On 1 February (or, actually 2 February, since the 1st is a Sunday) I will start as assistant professor at the Open University in the Netherlands. As such, you can find me at the premises in Heerlen one day week, typically on Tuesday. The other days I can found at the location in Nijmegen, where I am a visiting research at the Radboud University.

I am looking forward to teach some nice courses at the OU, as well as to continue my research in this new environment!