Commit

Fix bug in target Q-value for illegal actions
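Fix DQN bug: set ILLEGAL_ACTION_LOGITS_PENALTY to a large negative number (torch.finfo(torch.float).min) instead of sys.float_info.min, which is a tiny positive value that is effectively 0.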
nathanlct authored Aug 2, 2024
1 parent 1428a82 commit f10eb08
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion open_spiel/python/pytorch/dqn.py
@@ -30,7 +30,7 @@
"Transition",
"info_state action reward next_info_state is_final_step legal_actions_mask")

-ILLEGAL_ACTION_LOGITS_PENALTY = sys.float_info.min
+ILLEGAL_ACTION_LOGITS_PENALTY = torch.finfo(torch.float).min


class SonnetLinear(nn.Module):
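
To illustrate why the one-line change matters, here is a minimal sketch in plain Python. It assumes the penalty is added to the Q-values of illegal actions before taking the max for the target, which is how the commit title suggests it is used; the helper `masked_max` and the sample values are hypothetical and not taken from dqn.py. The constant `NEW_PENALTY` is a hard-coded stand-in for `torch.finfo(torch.float).min` so the sketch runs without PyTorch.

```python
import sys

# sys.float_info.min is the smallest POSITIVE normalized double
# (about 2.2e-308), not the most negative float.
OLD_PENALTY = sys.float_info.min

# Stand-in for torch.finfo(torch.float).min, the most negative
# finite float32 value (hard-coded here to avoid importing torch).
NEW_PENALTY = -3.4028234663852886e38


def masked_max(q_values, legal_mask, penalty):
    """Hypothetical helper: add `penalty` to illegal actions, then take the max."""
    return max(q + (1 - m) * penalty for q, m in zip(q_values, legal_mask))


# Action 1 is illegal but has the highest raw Q-value.
q_values = [1.0, 5.0, 2.0]
legal_mask = [1, 0, 1]

# With the old penalty the illegal action still wins the max.
old_target = masked_max(q_values, legal_mask, OLD_PENALTY)

# With the new penalty the illegal action is pushed far below the legal ones.
new_target = masked_max(q_values, legal_mask, NEW_PENALTY)

print(old_target, new_target)
```

Under these assumptions, the old constant leaves the illegal action's value unchanged, so it can leak into the target, while the new constant ensures only legal actions can be selected.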
