On the Effects of Intra-episodic Language Signals in the Minigrid Environment

Master's thesis at the University of Potsdam (work in progress)

Are intra-episodic language signals (provided by a simple heuristic) beneficial for follower success in the environment? If so, in which situations in particular?
Can intra-episodic language production be learned in the environment? And does such a learned generator (a neural model) perform better than a simple heuristic?
Can joint learning of adaptable (neural) models outperform simple heuristics? How do the resulting policies behave, and why are they better or worse?

data
envs
evaluation
feature_extraction
follower
playback
speaker
training
util
.gitignore
LICENSE
README.md
manual.py
prepare_path.sh
