Glyph ID Handling #29

Open
StephenOman opened this issue Sep 9, 2024 · 0 comments
Labels
enhancement New feature or request

This is a migrated copy of the original feature request here: facebookresearch#21

Currently, each dungeon tile (ignoring the char/color/specials observations that are also available) is an int16 between 0 and nethack.MAX_GLYPH == 5976. We use an embedding lookup table of that size with embedding_dim == 32. That's 5976 * 32 == 191232 floating-point values, or 191232 * 16 == 3059712 bits at 16 bits per value, or ~0.38MB. That doesn't seem like too much, but see pytorch/pytorch#24912. Also, it does not give the agent a cue that certain ids (e.g., dog and large dog) are more related than others (large dog vs wall).
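To make the size estimate easy to re-check, here is a minimal sketch of the arithmetic in plain Python. The helper name `embedding_table_bytes` is made up for illustration, and the 32-bit line is an extra assumption (the estimate above uses 16 bits per value).

```python
MAX_GLYPH = 5976      # value quoted above for nethack.MAX_GLYPH
EMBEDDING_DIM = 32    # embedding width quoted above


def embedding_table_bytes(num_rows, dim, bits_per_value):
    """Size in bytes of a dense num_rows x dim lookup table (hypothetical helper)."""
    return num_rows * dim * bits_per_value // 8


num_values = MAX_GLYPH * EMBEDDING_DIM
half_precision = embedding_table_bytes(MAX_GLYPH, EMBEDDING_DIM, 16)
single_precision = embedding_table_bytes(MAX_GLYPH, EMBEDDING_DIM, 32)  # assumption: float32 storage

print(f"{num_values} values")
print(f"{half_precision / 1e6:.2f} MB at 16 bits per value")
print(f"{single_precision / 1e6:.2f} MB at 32 bits per value")
```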

The way these glyphs are organized is that first come all the monsters (NUMMONS many, which is 381), then pets (again NUMMONS many, because in theory every monster can be tame), then a single glyph for an invisible monster (GLYPH_INVIS_OFF, which is 762), then a glyph for each "detected" monster (again NUMMONS many). For some obscure reason, corpses come next, which are not monsters (but there are NUMMONS many), and then ridden monsters, which are monsters (NUMMONS many). The check glyph_is_monster(glyph) does this:

    #define glyph_is_monster(glyph)                            \
        (glyph_is_normal_monster(glyph) || glyph_is_pet(glyph) \
         || glyph_is_ridden_monster(glyph) || glyph_is_detected_monster(glyph))

This makes a list like [i for i in range(nethack.MAX_GLYPH) if nethack.glyph_is_monster(i)] have length nethack.NUMMONS*4 == 1524, but it's not contiguous.

Cf. https://github.com/fairinternal/NetHack/blob/rl/win/rl/helper.cc#L37 for a list of the offsets and take a look at the comment in https://github.com/fairinternal/NetHack/blob/rl/include/display.h#L235 explaining this.
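The layout described above can be reproduced without importing nethack. The following is only a sketch: it assumes GLYPH_MON_OFF is 0 and that the monster-related blocks are packed back to back in the order given, and it re-implements `glyph_is_monster` as range checks rather than calling the real helper.

```python
NUMMONS = 381  # number of monster types, as quoted above

# Assumed offsets: blocks packed back to back, starting at 0.
GLYPH_MON_OFF = 0
GLYPH_PET_OFF = GLYPH_MON_OFF + NUMMONS
GLYPH_INVIS_OFF = GLYPH_PET_OFF + NUMMONS
GLYPH_DETECT_OFF = GLYPH_INVIS_OFF + 1       # a single invisible-monster glyph
GLYPH_BODY_OFF = GLYPH_DETECT_OFF + NUMMONS  # corpses
GLYPH_RIDDEN_OFF = GLYPH_BODY_OFF + NUMMONS
GLYPH_OBJ_OFF = GLYPH_RIDDEN_OFF + NUMMONS   # first id after the monster-related blocks


def glyph_is_monster(glyph):
    """Range-check stand-in for the C macro: normal, pet, detected, or ridden."""
    return (
        GLYPH_MON_OFF <= glyph < GLYPH_INVIS_OFF        # normal monsters and pets
        or GLYPH_DETECT_OFF <= glyph < GLYPH_BODY_OFF   # detected monsters
        or GLYPH_RIDDEN_OFF <= glyph < GLYPH_OBJ_OFF    # ridden monsters
    )


monster_ids = [g for g in range(GLYPH_OBJ_OFF) if glyph_is_monster(g)]

# Ids at which a new run starts after a hole in the sequence.
gaps = [b for a, b in zip(monster_ids, monster_ids[1:]) if b != a + 1]

print(len(monster_ids), gaps)
```

The two holes are the invisible-monster glyph and the corpse block, which is why the ids are not contiguous.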

After the monster-related groups come objects, then the cmap entries for dungeon features (MAXPCHARS == 96), then explosions and zap beams (NUM_ZAP << 2 == 8 << 2 == 32 many). Then there are NUMMONS << 3 == 3048 (!) "swallow" glyphs. That's a lot for stuff that basically never happens to our agents. Then there are WARNCOUNT == 6 warning glyphs and finally NUMMONS statue glyphs.

As a graphic representation, the glyph ids are:

MMMMMMPPPPPPDDDDDD%%%%%RRRRRROOOOOOOCXZSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSTTTTTT
MonstsPets--DetectBody-RiddenObjectsCXZSwaaaaaaalllllllllllooooooooooowwww-----------Statue

Where

    glyph_labels = {
        GLYPH_MON_OFF: "M",  # 6.38%
        GLYPH_PET_OFF: "P",  # 6.38%
        GLYPH_INVIS_OFF: " ",  # 0.02%
        GLYPH_DETECT_OFF: "D",  # 6.38%
        GLYPH_BODY_OFF: "%",  # 6.38%
        GLYPH_RIDDEN_OFF: "R",  # 6.38%
        GLYPH_OBJ_OFF: "O",  # 7.58%
        GLYPH_CMAP_OFF: "C",  # 1.46%
        GLYPH_EXPLODE_OFF: "X",  # 1.05%
        GLYPH_ZAP_OFF: "Z",  # 0.54%
        GLYPH_SWALLOW_OFF: "S",  # 51.00%
        GLYPH_WARNING_OFF: "W",  # 0.10%
        GLYPH_STATUE_OFF: "T",  # 6.38%
        MAX_GLYPH: "-",
    }

More than half of all glyph ids are swallow!
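The percentages can be re-derived from the counts quoted earlier in this issue. This sketch only covers the blocks whose sizes are stated explicitly here (objects, cmap, and explosions are left out because their counts are not spelled out); the dictionary keys are illustrative labels, not nethack identifiers.

```python
MAX_GLYPH = 5976
NUMMONS = 381
NUM_ZAP = 8
WARNCOUNT = 6

# Block sizes taken from the counts quoted in this issue.
block_sizes = {
    "monsters": NUMMONS,       # same size for pets, detected, corpses, ridden, statues
    "invisible": 1,
    "zap beams": NUM_ZAP << 2,
    "swallow": NUMMONS << 3,
    "warnings": WARNCOUNT,
}

# Share of the whole glyph id space, in percent, rounded to two decimals.
shares = {name: round(100 * size / MAX_GLYPH, 2) for name, size in block_sizes.items()}

for name, share in shares.items():
    print(f"{name:10s} {block_sizes[name]:5d} ids  {share:6.2f}%")
```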

We should rethink the featurization of the glyph ids.
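One possible direction, offered only as a sketch and not as a decided design: split each glyph id into a (group, monster index) pair, so that a dog, a pet dog, and a dog statue share the same monster index. The function `factorize` and the group labels are hypothetical. The offsets assume the blocks are packed in the order described above, with swallow, warning, and statue glyphs at the end, and that swallow glyphs are laid out as 8 consecutive ids per monster.

```python
from bisect import bisect_right

NUMMONS = 381
MAX_GLYPH = 5976
WARNCOUNT = 6

# (start offset, group label). Later offsets are computed backwards from MAX_GLYPH.
GROUP_STARTS = [
    (0, "monster"),
    (NUMMONS, "pet"),
    (2 * NUMMONS, "invisible"),
    (2 * NUMMONS + 1, "detected"),
    (3 * NUMMONS + 1, "corpse"),
    (4 * NUMMONS + 1, "ridden"),
    (5 * NUMMONS + 1, "other"),  # objects, cmap, explosions, zaps lumped together here
    (MAX_GLYPH - NUMMONS - WARNCOUNT - (NUMMONS << 3), "swallow"),
    (MAX_GLYPH - NUMMONS - WARNCOUNT, "warning"),
    (MAX_GLYPH - NUMMONS, "statue"),
]
STARTS = [start for start, _ in GROUP_STARTS]
PER_MONSTER = {"monster", "pet", "detected", "corpse", "ridden", "statue"}


def factorize(glyph):
    """Split a glyph id into (group label, monster index or None)."""
    if not 0 <= glyph < MAX_GLYPH:
        raise ValueError(f"glyph id out of range: {glyph}")
    start, group = GROUP_STARTS[bisect_right(STARTS, glyph) - 1]
    if group in PER_MONSTER:
        return group, glyph - start
    if group == "swallow":
        return group, (glyph - start) >> 3  # assumed: 8 swallow glyphs per monster
    return group, None


print(factorize(5), factorize(NUMMONS + 5), factorize(MAX_GLYPH - NUMMONS + 5))
```

With a split like this, an agent could combine a small group embedding with a NUMMONS-row monster embedding instead of learning a separate row for each of the 5976 ids.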

@StephenOman StephenOman added the enhancement New feature or request label Sep 9, 2024