Converting a pyspiel game state to a dictionary of array-likes #1254

Open
kurtamohler opened this issue Jul 25, 2024 · 3 comments
Labels: contribution welcome (It's a nice feature! But we do not have the time to do it ourselves. Contribution welcomed!)

kurtamohler commented Jul 25, 2024

Would it be possible to convert a pyspiel game's State object to a dictionary of array-likes and back again, in an efficient way? If that is currently not supported, would it be possible to add this feature?

At the moment, it seems to me that this is not possible. pyspiel states are implemented in C++ and bound to Python with pybind11, and it doesn't look like any of the bound methods or properties provide a dict-of-arrays representation of the state.

I'm asking about this because I am looking into adding an environment wrapper class for OpenSpiel to TorchRL. Ideally, the wrapper would be stateless, so the state would need to be provided to the wrapper's step function as part of a TensorDict, which is a dictionary of array-likes.
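To make the request concrete, here is a rough sketch of the kind of round-trip I have in mind (the method names are hypothetical; nothing like this exists in pyspiel today):

import pyspiel

game = pyspiel.load_game("tic_tac_toe")
state = game.new_initial_state()

# Hypothetical round-trip -- neither method exists in pyspiel today:
# state_dict = state.to_dict_of_arrays()
#   -> e.g. {'board': array(...), 'current_player': array(0), ...}
# restored = game.new_state_from_dict_of_arrays(state_dict)
# assert restored.serialize() == state.serialize()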

Some other RL environment libraries support dict-of-arrays representations, like Brax and Jumanji. Just to give an example:

import jumanji
import jax
env = jumanji.make('Snake-v1')
key = jax.random.PRNGKey(0)
state, _ = env.reset(key)

def state_to_dict_of_arrays(state):
    res = {}
    for key, value in state.items():
        if hasattr(value, '_fields'):
            # NamedTuple value (e.g. a position): expand each of its
            # fields into its own array.
            res[key] = {}
            for field in value._fields:
                res[key][field] = jax.numpy.asarray(getattr(value, field))
        else:
            res[key] = jax.numpy.asarray(value)
    return res

state_to_dict_of_arrays(state)
{'body': Array([[False, False, False, False, False, False, False, False, False,
         False, False, False],
         ...
        [False, False, False, False, False, False, False, False, False,
         False, False, False]], dtype=bool),
 'body_state': Array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
...
  'col': Array([2, 4], dtype=int32)},
 'length': Array(1, dtype=int32),
 'step_count': Array(0, dtype=int32),
 'action_mask': Array([ True,  True,  True,  True], dtype=bool),
 'key': Array([2467461003,  428148500], dtype=uint32)}
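For comparison, the closest I can get with the existing pyspiel API is a one-way, lossy conversion built from per-state accessors (the helper name below is my own, and observation_tensor assumes the game provides one, as tic_tac_toe does). As far as I can tell, there is no supported way to rebuild a State from such a dict, which is the missing piece:

import numpy as np
import pyspiel

game = pyspiel.load_game("tic_tac_toe")
state = game.new_initial_state()

def partial_state_dict(state):
    # One-way conversion using existing accessors. Lossy: hidden
    # information and the full game history are not captured, so the
    # State cannot be reconstructed from this dict.
    return {
        'current_player': np.asarray(state.current_player()),
        'observation': np.asarray(state.observation_tensor()),
        'legal_actions_mask': np.asarray(state.legal_actions_mask()),
        'is_terminal': np.asarray(state.is_terminal()),
    }

partial_state_dict(state)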
lanctot (Collaborator) commented Jul 28, 2024

That would be a great feature to have. You are correct: it does not currently exist. We don't have the time to add this ourselves, but it would make a welcome contribution to the code base!

lanctot added the contribution welcome label on Jul 28, 2024
Ayah-Saleh added a commit to Ayah-Saleh/open_spiel that referenced this issue Sep 26, 2024
lanctot (Collaborator) commented Sep 27, 2024

Someone has submitted an implementation: #1279

Can I ask a quick question about the technical path forward?

My understanding from glancing over a few threads is that the use case here is to make it easier for RL frameworks that use a dict-of-arrays representation to consume OpenSpiel environments.

However, wouldn't it be better to do this via our observer framework rather than directly over pyspiel states? The pyspiel states contain everything, including information that should be hidden from RL agents, whereas the observer is designed to expose exactly the information that the RL agents should see.

I just want to settle this design choice before we get too far into core API additions, but I definitely want to support integration with RL frameworks.

@elkhrt any opinions on this?

elkhrt (Member) commented Sep 27, 2024

I'm not completely clear on what's needed here. If you want a structured view of the state from the point of view of a player, we have the interfaces for that, but only a handful of games have implemented them.

import random

import pyspiel
from open_spiel.python import observation

game = pyspiel.load_game("leduc_poker")
state = game.new_initial_state()
obs = observation.make_observation(game)

# Play a random game, printing the current player's structured
# observation (a dict of numpy arrays) at each decision node.
while not state.is_terminal():
  if state.current_player() >= 0:
    obs.set_from(state, state.current_player())
    print(state.current_player(), obs.dict)
  state.apply_action(random.choice(state.legal_actions()))

This emits something like:

0 {'player': array([1., 0.], dtype=float32), 'private_card': array([1., 0., 0., 0., 0., 0.], dtype=float32), 'community_card': array([0., 0., 0., 0., 0., 0.], dtype=float32), 'pot_contribution': array([1., 1.], dtype=float32)}
1 {'player': array([0., 1.], dtype=float32), 'private_card': array([0., 0., 0., 1., 0., 0.], dtype=float32), 'community_card': array([0., 0., 0., 0., 0., 0.], dtype=float32), 'pot_contribution': array([1., 1.], dtype=float32)}
0 {'player': array([1., 0.], dtype=float32), 'private_card': array([1., 0., 0., 0., 0., 0.], dtype=float32), 'community_card': array([0., 0., 0., 0., 1., 0.], dtype=float32), 'pot_contribution': array([1., 1.], dtype=float32)}
1 {'player': array([0., 1.], dtype=float32), 'private_card': array([0., 0., 0., 1., 0., 0.], dtype=float32), 'community_card': array([0., 0., 0., 0., 1., 0.], dtype=float32), 'pot_contribution': array([1., 1.], dtype=float32)}
0 {'player': array([1., 0.], dtype=float32), 'private_card': array([1., 0., 0., 0., 0., 0.], dtype=float32), 'community_card': array([0., 0., 0., 0., 1., 0.], dtype=float32), 'pot_contribution': array([1., 5.], dtype=float32)}
1 {'player': array([0., 1.], dtype=float32), 'private_card': array([0., 0., 0., 1., 0., 0.], dtype=float32), 'community_card': array([0., 0., 0., 0., 1., 0.], dtype=float32), 'pot_contribution': array([9., 5.], dtype=float32)}

If the game doesn't support these structured observations, you'll just get a single tensor, e.g. tiny_hanabi:

0 {'observation': array([0., 1., 0., 0., 0., 0., 0., 0.], dtype=float32)}
1 {'observation': array([0., 1., 1., 0., 0., 0., 0., 0.], dtype=float32)}

Is the idea to add things like the action_mask to this? If so, I suggest the right thing to do is to add a flag to _Observation, passed through make_observation, which adds the extra fields to the dict and updates them in set_from. It could be mildly more efficient to do some of this in the C++ layer, but that's probably no big deal.

For reference, the line where each field is written into the observation dict:

self.dict[tensor_info.name] = values
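In the meantime, a workaround along those lines can be sketched in user code by wrapping the existing observer and appending the mask by hand (the helper name here is mine, not part of the API):

import random

import numpy as np
import pyspiel
from open_spiel.python import observation

def observe_with_mask(obs, state, player):
    # Fill the observer from the state, then append the action mask
    # as an extra entry alongside the structured observation fields.
    obs.set_from(state, player)
    d = dict(obs.dict)
    d['action_mask'] = np.asarray(state.legal_actions_mask(player), dtype=bool)
    return d

game = pyspiel.load_game("leduc_poker")
state = game.new_initial_state()
obs = observation.make_observation(game)

# Leduc poker starts with chance nodes (card deals); skip ahead to a
# decision node before observing.
while state.is_chance_node():
    state.apply_action(random.choice(state.legal_actions()))
print(observe_with_mask(obs, state, state.current_player()))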
