On Games and Simulations

This is a guest blog by Hans-Dieter Hiep, a student from Vrije Universiteit Amsterdam who worked with me to build a GUI for our work on game-based (bi)simulations. The tool described in the post is available through BitBucket, see https://bitbucket.org/jkeiren/game_based_simulation/branch/hans-java.

Update 20/3/2017: Hans-Dieter has now also started his own blog, and reblogged the post on http://www.hansdieterhiep.nl/blog/on-games-and-simulations/.

In this blog post, I would like to report on a relatively short but productive project with Jeroen Keiren and Tim Willemse. Let’s call the result of this project the Bisimulation Games tool. Work began on this project in the summer of 2016. In particular, this post covers the project’s set-up and architectural choices, serves in a small part as a promotion of the idea to do more interactive visualization of theoretical concepts in computer science, and also contains a sneak-preview of another project that replicates the most important design choice of this project.

In particular, this post does not cover the theory behind the tool in depth. We will see games for checking bisimulation and a short introduction how the application handles a particular scenario. For a more thorough background, see Games for Bisimulation and Abstraction (on Arxiv).

Background

An important aspect of this project is the interactive visualization of games for showing bisimulation relations. We work with Labelled Transition Systems (LTS) here: an abstract representation of the state space of, for example, computer programs. Another example can be seen in the window of the application (above) we have five states: t_0 and t_1, s_2 and c_3 and c_4. Between states we have transitions: these represent observable actions (eat pizza and hack) or internal actions (so-called tau).

David de Frutos-Escrig, Jeroen Keiren and Tim Willemse have worked towards a definition of a game semantics for showing a number of bisimulation relations. In particular, as part of this project, Keiren has implemented game semantics for bisimulation, branching bisimulation and branching bisimulation with explicit divergence as a Haskell program. This program defines the game state space and game state transitions. Using these two definitions one can determine whether, given a vertex, either a winning strategy exists for either the player or for the opponent.

An important part of this project is to reuse this existing implementation. Here, an important design decision was made: to port the Haskell code to Frege, a Haskell implementation for the Java Virtual Machine. This allowed me to write the visualization and interaction with games in the Java AWT/Swing framework. The Frege project can be still considered experimental, as exemplified by a minor compiler bug that was found and fixed during the execution of this project.

Winning strategies

The next step in the project is strategy extraction. The existing Haskell implementation by Keiren provided an implementation for an algorithm for extracting the set of winning vertices for each player of the game: Spoiler (played by the computer) and Duplicator (played by the human operator). For an interactive application, a set of winning vertices is not enough: a human operator can take a step that would ultimately lead to her defeat, even if a winning strategy for Duplicator exists at that point. Thus, an extraction algorithm was developed that, from the set of winning vertices, determines a strategy (assigning to a vertex the best next move) for Spoiler, that is guaranteed to eventually win if a winning strategy for Spoiler exists. This part is not proved and only tested informally, but seems to work fine for the tested examples.

Interactivity

An important aspect is helping human operators discover flaws in their models. The application may act as a “debugger”, showing a step-by-step walkthrough of counter-examples that may help humans understand the model. Let us see an example, that is taken from Korver (see [16] in Games for Bisimulation and Abstraction mentioned earlier).

Suppose the user selects the node 0 (blue, left image) as opponent and A (red, left image) as player vertex. Remark that this is also logged in the message pane, on top of the window. First, the winning vertices are computed by the algorithm by Keiren (and shown in the lower left corner: the expected game outcome if both players play according to a winning strategy). Then a strategy is extracted, and the first step is taken: Spoiler challenges Duplicator to mimic the action “a”, as highlighted in blue.

Consequently, the human operator must choose an internal action (“tau” to B), since the other action (“b” to D) does not match the challenge. Then the Spoiler drops the challenge (second to last message, right image), and challenges Duplicator with a new action: from 0 to 4, mimic action “b”. Since Duplicator sits still in the new state of B, it cannot undo the “tau” step. Hence, the game is concluded: Duplicator has lost, and indeed, no winning strategy for Duplicator exists for these two starting nodes.

Internal Debugging

One interesting problem encountered during the execution of the project was a bug in the Haskell framework code, that I introduced because of a misinterpretation and misunderstanding of Büchi games. The game semantics is defined as a Büchi game. Such games may have infinite plays.

From the perspective of some player, a Büchi objective is to find a strategy of a finite play where the other player loses or an infinite play in which an infinite number of points is earned, or so-called accepting states are visited infinitely often. Observe that if we visit at least one accepting state in a cycle (a Büchi objective), then in an infinite play where the player remains within this cycle, it has visited an infinite number of accepting states.

So to prevent one player from winning, its opponent has to prevent that the player has an infinite play in which it reaches its objective. Thus the opponent must ensure that it either plays a finite game where the other player loses, or an infinite play in which all of the visited states are non-accepting (a co-Büchi objective).

To help finding the source of the problem, a monad was developed that keeps track of function context that is threaded through the functional code. For example, it dumps the current labeling of some vertex set each iteration of a function. Then a GUI was developed that tracks and shows, for each step, the current state of the function. This allowed me to find where the problem was located. The bug was found to have mixed up Büchi objectives and co-Büchi objectives in the complements of winning vertex sets, having not fully understood the duality.

The same debugging monad (and the corresponding table UI with minor adaptions) were later reused to debug the strategy extraction algorithm. These tools seemed to be of good use during the development.

Execution

The project was executed in 4 weeks. This short timespan ensured that only the most important aspects were touched. However, to make the project more worthwhile in the (near) future, it seems useful to explore even more bisimulation relations.

Although the current version of the tool is a fun toy to show simple models, there are numerous downsides: it is slow in execution (Frege is not the most efficient Haskell implementation), it may contain bugs or flaws in the GUI code, and the strategy extraction may be wrong. To extend the tool for more relations, one has to adapt both the GUI code and the simulation framework. And, finally, only I have worked on the GUI part of the application, so the code is ugly or unclear for others to work with.

However, it seems that the architectural choice (of writing the core of the application in Haskell/Frege and use only Java for interactivity and visualization) is a good one, for several reasons: first, the system is naturally split in two parts and needs to have a well-defined interface between these parts. That increases clarity of thought. Secondly, Haskell seems a better suited language for defining state spaces and transitions (especially thanks to the List monad) than Java. And Java has a lot of client installations and includes a GUI toolkit out-of-the-box. With this choice one has the best of both worlds.

Indeed, to verify this architectural choice, I have worked on a “spiritual successor” of this project over the past few weeks, that culminated in a new tool. The new tool, let’s call it the Distributed Algorithms tool, allows users to explore all possible executions of some built-in algorithms on a user-defined network. This project was supervised by  Wan Fokkink.

The framework behind the Distributed Algorithms tool allows several basic distributed algorithms (for example Cidon’s traversal algorithm, see image) to be defined using a simple process algebraic framework. This framework allows one to specify the algorithm in terms of a state space, a message space and a local transition function with basic actions: sending, receiving, internal. Compare this with the game state space, game state transitions of the Bisimulation Games tool.

The user interface of the Distributed Algorithms tool is more advanced than the user interface of the Bisimulation Games tool. The Distributed Algorithms tool allows users to graphically construct networks from scratch, whereas in the Bisimulation Games tool users need to define networks textually in an external text editor. However, the Bisimulation Games tool might be perceived as easier to use. Another difference is that for the Bisimulation Games tool, one cannot undo a step: a user has to redo every action from scratch to revisit a particular choice. By allowing users to control the flow of time and undo choices, as is possible in the Distributed Algorithms tool, it is easier for users to explore different scenarios.

In the Distributed Algorithms tool, more effort and focus than before went into cleaner and better organized GUI code. Also with this project, the bulk of work is done within 4 weeks. However, one of the downsides of the above architectural choice became apparent: there is significant overhead in programming time spent translating Java objects to and from Frege’s compiled Haskell objects.

Concluding words

All in all, I believe that these visualization and simulation projects are fun ways of exploring theoretical concepts. A visualization is often burdensome to write, but I believe it pays off the effort in the long run. Visualization tools allows users (including its authors) to explore theoretical concepts on a different level. It may be tedious for some concepts to work out examples manually, and visualizing them solves that problem. Interactivity makes the concepts even more alive: it allows users to experiment with the theory without knowing the formalization behind it, and may serve as an easy introduction into a new field of study.

As a programmer, the practical nature of the tools required a good understanding of the subject matter, and some creativity to correctly implement and visualize them. As was mentioned, errors were made because of misunderstanding or misinterpretation. It seems one has to be patient (e.g. by building debugging tools) to solve one’s own problems. At the same time, one needs a certain amount of rigor to prevent introducing problems in the first place.

Finally, the choice to work with both the Java language and the Haskell language is an interesting one. I hope to work more on these kinds of projects in the near future: short in duration, diving deep in theory, and (re)presenting the results interactively as a tool that is hopefully pleasant to use.

Project name Bulk of work Java code Frege code
Bisimulation Games 4 weeks 4.689 LoC 580 LoC*
Distributed Algorithms** 4 weeks 12.625 LoC 881 LoC

Table overview of relative size of projects, *includes work by Keiren, **not yet finished.

The Complexity Landscape of Partial-Observation Stochastic Games – Krishnendu Chatterjee, ATVA’14

Krishnendu heads of in a direction different from Monday’s tutorial talk altogether. Whereas on Monday he considered perfect information games, today he is considering games with imperfect information. Instead of a mean-payoff objective, he considers game graphs with a parity objective.

The motivation to consider games with imperfect information is that, in practice, the assumption that a controller knows everything is unrealistic since, e.g., systems have internal variables, and sensors are unreliable. In today’s talk, transitions are stochastic.

In the talk, a lot of different complexity results are discussed. Essentially, all combinations are considered where: player 1 has partial/perfect information and player 2 has partial/perfect information, and then for all these instances, three types of winning criteria are considered: sure winning, almost sure winning, and quantitative analysis.

His slides contain a number of nice comprehensive overview slides regarding the complexity results, so I would encourage you to check them out.

To conclude, I want you to give the following quote that came up during the talk. “In computer science, what God van do we can also do by non-determinism”.

Multidimensional Quantitative Games and MDPs – Krishnendu Chatterjee, ATVA’14

This week I’m attending ATVA’14. On Monday we started with a series of tutorial talks, the first one by Krishnendu Chatterjee. Once available I’ll update the post with a link to the slides.

Krishnendu started of by introducing two-player games in general, using three motivating examples:

  1. Concurrency synthesis, e.g. synthesising locks in concurrent programs, where the objective is the have maximal performance without introducing data races.
  2. Incompatible synthesis, i.e. synthesising a program satisfying a set of Boolean objectives that cannot all be satisfied at the same time. The objective is then, e.g., to satisfy as many of them as possible.
  3. Quantitative simulation, which is useful in the case that programs cannot simulate each other exactly, now the problem is to match steps as closely as possible.

In the first part of the talk Krishnendu very clearly introduced quantitative games, in particular mean-payoff games (objectives about long-run averages), as well as lim-inf and lim-sup objectives. Mean pay-off games are two-player games in which the edges carry some (possibly negative) integer weight. The game is played by two players, player 1 and player 2, and the objective for player 1 is to make sure that the long run average of these weights exceeds a certain threshold. He explained the consequences of the choice for memoryless and finite- and infinite-memory strategies in this setting: mean-payoff games are memoryless determined, so having finite or infinite memory does not change the expressive power. Deciding whether player 1 has a winning strategy is in NP and co-NP.

The second part of the tutorial focused on generalising the ideas in mean-payoff games to the multidimensional setting. This means that instead of satisfying a single mean-payoff objective, multiple objectives have to be satisfied. There are multiple reward functions on the edges, and player 1 wins if all mean-payoff objectives are satisfied. He studies the impact of this w.r.t. different problems. It turns out that:

  • infinite memory strategies are more powerful than finite memory strategies, which in turn are more powerful than memoryless strategies.
  • determining winning in the setting with finite memories is co-NP complete.

In the process, he shows an algorithm that uses a linear programming approach to compute flows in strongly connected components in the game graph (to encode the mean payoff objective), and proofs some properties using the principle of edge induction. Both approaches are somewhat of a recurring theme, that you should check out! (I will add some reference later)

In the third part of his talk, he generalises the above ideas to Markov Decision Processes; here the algorithmic approach does not use SCCs, but instead work on maximal end components.

Open Problems

During his talk, Krishnendu introduces several open problems. One of which is to determine whether deciding winning in multi-dimensional mean-payoff games with finite memory is polynomial time computable.