sb3Dungeon

Note: This is me manually controlling the agent

sb3Dungeon is an attempt to create an RL agent that can navigate a randomly generated dungeon and reach the exit while avoiding an enemy that follows it using A*.

Visualization created with pygame.

Environment:

Each tile is represented as an integer in a NumPy array. The dungeon size, tile size, and number of dungeon paths are all parameters that can be easily adjusted to test different environments. Empty tiles, rock tiles, the exit tile, the agent, and the enemy each use a different integer. The array is updated every frame to track the positions of the agent and the enemy, and it serves as the observation given to the RL model.
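As a rough illustration of this encoding, here is a minimal sketch. The tile codes and the helper names `make_grid` and `move_entity` are assumptions made for the example; the integers and functions actually used in the project may differ.

```python
import numpy as np

# Hypothetical tile codes; the project's real values may differ.
EMPTY, ROCK, EXIT, AGENT, ENEMY = 0, 1, 2, 3, 4

def make_grid(height, width):
    """Return an all-empty dungeon grid of integer tile codes."""
    return np.full((height, width), EMPTY, dtype=np.int8)

def move_entity(grid, code, old_pos, new_pos):
    """Move an entity's code to new_pos and leave an empty tile behind."""
    grid[old_pos] = EMPTY
    grid[new_pos] = code

grid = make_grid(5, 8)
grid[2, 7] = EXIT
grid[2, 0] = AGENT
grid[0, 4] = ENEMY

# One frame later the agent has stepped one tile to the right.
move_entity(grid, AGENT, (2, 0), (2, 1))

# A copy of the grid is what the RL model would receive as its observation.
observation = grid.copy()
```

Keeping the whole state in one integer array means the observation is just a copy of the grid, with no separate position vectors to keep in sync.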

The dungeon is randomly generated by first creating paths to allow every dungeon to be completed, and then creating clusters of rocks around the paths.

First, the paths are created using Brownian motion. The number of paths generated can be changed in the Constants.py file. Each tile that is part of a path is explicitly labeled as such, which is an easy way to make sure rocks are never placed on path tiles. Second, random tiles are chosen to be the origins of "rock clusters". The boundary tiles of a cluster are added to a list, and one is randomly selected to become a rock. The boundary list is then updated, and the process repeats until the desired number of rocks has been placed. Finally, the enemy is placed at a random location in the dungeon, and the agent is placed at a random location in the start zone.
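The two generation steps can be sketched roughly as follows. This is not the code from Dungeon.py: it stands in a simple random walk that never steps left for the Brownian motion, and the tile codes, function names, and grid dimensions are all assumptions made for illustration.

```python
import random
import numpy as np

# Hypothetical tile codes; the project's real values may differ.
EMPTY, ROCK, PATH = 0, 1, 2

def carve_path(grid, rng):
    """Random-walk from the left edge to the right edge, labeling tiles as PATH."""
    height, width = grid.shape
    row, col = rng.randrange(height), 0
    grid[row, col] = PATH
    while col < width - 1:
        # Assumption: up, down, or right only, so the walk always finishes.
        d_row, d_col = rng.choice([(-1, 0), (1, 0), (0, 1)])
        row = min(max(row + d_row, 0), height - 1)
        col += d_col
        grid[row, col] = PATH

def empty_neighbors(grid, tile):
    """Return the in-bounds 4-neighbors of tile that are still EMPTY."""
    height, width = grid.shape
    row, col = tile
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates
            if 0 <= r < height and 0 <= c < width and grid[r, c] == EMPTY]

def grow_cluster(grid, origin, size, rng):
    """Grow a rock cluster from origin; PATH tiles are never overwritten."""
    if grid[origin] != EMPTY:
        return 0
    grid[origin] = ROCK
    placed = 1
    boundary = empty_neighbors(grid, origin)
    while placed < size and boundary:
        tile = rng.choice(boundary)      # pick a random boundary tile
        boundary.remove(tile)
        grid[tile] = ROCK
        placed += 1
        # Update the boundary with the new rock's empty neighbors.
        new_tiles = [n for n in empty_neighbors(grid, tile) if n not in boundary]
        boundary.extend(new_tiles)
    return placed

rng = random.Random(0)
grid = np.full((20, 40), EMPTY, dtype=np.int8)
for _ in range(2):
    carve_path(grid, rng)
path_count = int((grid == PATH).sum())

for _ in range(5):
    empties = [(int(r), int(c)) for r, c in np.argwhere(grid == EMPTY)]
    if not empties:
        break
    grow_cluster(grid, rng.choice(empties), size=8, rng=rng)
```

Because the boundary list only ever contains EMPTY tiles, labeling path tiles with their own code is enough to guarantee that a left-to-right route survives rock placement.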

Note that the enemy does not move while the agent is in either the spawn area or the exit area, which are the empty areas on either side of the environment. This allows the agent to perform more complicated maneuvers to dodge the enemy.
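A minimal sketch of this enemy behavior, assuming a 4-connected grid, a Manhattan-distance heuristic, and safe zones expressed as a set of column indices. The function names `astar` and `enemy_step` and the `safe_cols` parameter are hypothetical and are not taken from Goblin.py.

```python
import heapq

def astar(walls, start, goal):
    """Return a shortest 4-connected path from start to goal as a list of (row, col), or None."""
    height, width = len(walls), len(walls[0])

    def heuristic(pos):
        # Manhattan distance never overestimates on a 4-connected grid.
        return abs(pos[0] - goal[0]) + abs(pos[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    came_from = {}
    best_cost = {start: 0}
    while open_heap:
        _, cost, current = heapq.heappop(open_heap)
        if current == goal:
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1]
        if cost > best_cost[current]:
            continue  # stale heap entry
        row, col = current
        for nxt in ((row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)):
            r, c = nxt
            if 0 <= r < height and 0 <= c < width and not walls[r][c]:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = current
                    heapq.heappush(open_heap, (new_cost + heuristic(nxt), new_cost, nxt))
    return None

def enemy_step(walls, enemy, agent, safe_cols):
    """Move the enemy one tile toward the agent, unless the agent is in a safe zone."""
    if agent[1] in safe_cols:
        return enemy  # enemy is frozen while the agent is in the spawn or exit area
    path = astar(walls, enemy, agent)
    if path is None or len(path) < 2:
        return enemy
    return path[1]

# 1 marks a rock; the enemy must go around the middle wall.
walls = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
route = astar(walls, (1, 0), (1, 4))
frozen = enemy_step(walls, (1, 4), (1, 0), safe_cols={0})
chasing = enemy_step(walls, (1, 0), (1, 4), safe_cols={0})
```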

Agent:

The agent is the yellow tile. Each frame it can move in one of the four cardinal directions; diagonal moves are not allowed. If it attempts to move into a blocked tile such as a rock, its reward goes down but the environment does not progress, meaning it can attempt to choose again.

Reward Function:

This is still a work in progress. The idea is to have the agent want to reach the exit as quickly as possible while avoiding the enemy. As of now, the agent starts with a reward of 0, which slowly decreases as time goes on. When the environment resets, either because the maximum number of frames was reached or because the enemy caught the agent, the agent's reward increases according to how close it was to the exit. This reward function is still rudimentary, though, and doesn't perform as well as it could.
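One way to express that shape of reward is sketched below. The constants, the use of Manhattan distance, and the function names are all assumptions; the project's actual values and distance measure are not stated here.

```python
STEP_PENALTY = 0.01  # hypothetical per-frame cost

def step_reward():
    """Small negative reward applied every frame to encourage speed."""
    return -STEP_PENALTY

def terminal_bonus(agent, exit_pos, max_distance):
    """Bonus in [0, 1] that grows as the agent ends the episode closer to the exit."""
    distance = abs(agent[0] - exit_pos[0]) + abs(agent[1] - exit_pos[1])
    return 1.0 - min(distance, max_distance) / max_distance

def episode_return(frames, agent, exit_pos, max_distance):
    """Total reward for an episode that ended after the given number of frames."""
    return frames * step_reward() + terminal_bonus(agent, exit_pos, max_distance)

halfway = terminal_bonus((0, 5), (0, 10), max_distance=10)
total = episode_return(frames=100, agent=(0, 10), exit_pos=(0, 10), max_distance=10)
```

With these particular numbers, an episode that ends on the exit tile after 100 frames nets roughly zero, so the balance between the step penalty and the bonus is the main thing to tune.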

Interesting Notes:

Theoretically, this should be solvable almost 100% of the time if the agent learns to use the start zone to bait the enemy into a faraway position and then slip past it. Since the enemy can only move as fast as the agent, an agent that is at least one tile ahead of the enemy should always be able to make it to the exit.

