You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
When importing some DLLs with their PDBs, Ghidra creates multiple versions of the same class names, all different from each other. By default, the wrong version of the class is used, which makes everything problematic to read.
To Reproduce / Expected behavior
Import the binaries in the attached archive from the 1_0 or 1_1 folder (when importing other versions, i.e. 1.2 and newer, this does not occur, but I included them in the archive just in case. For my testing I used binaries from the 1_1 folder!!!).
Look at CBasePlayer and its copies. The real CBasePlayer has a size of 0x26A0 (1_1 version), in the GIFs below, my research from additional context column and decompile in IDA Pro confirms this. Some classes also have the same issue (you can find them by simply typing .conflict substring into Data Type Manager window).
If it can help you somehow, then the hierarchy of the same CBasePlayer is as follows: CBaseEntity -> CBaseDelay -> CBaseAnimating -> CBaseToggle -> CBaseMonster -> CBasePlayer
I know that the issue title may not sound quite correct, so feel free to edit it. I'll try to be as brief as I can.
I managed before to import a few dozen PDBs from other different games that were compiled on different MSVC versions and linker flags, and in all this time I encountered such a bug only in two DLLs. And the most interesting thing is that both of these DLLs are the first two versions of the same game modification.
But in next versions of that game modification, this did not happen (but the versions as a whole did not differ from each other significantly, rather bug fixes and the addition of a couple of small features), so I have an assumption that perhaps some options were added or removed used in the linker for newer patches, due to which Ghidra perceives PDB reading better? Their toolchain remained the same judging by linker version from dumpbin /headers (MSVC 7.10 aka .NET 2003).
Based on the same GUID from DLL and PDB match, the modification date of both files is the same and the fact that they were supplied with the modification by the developer and updated with each new patch - there is no doubt to think that all this is genuine. In IDA Pro, the correct and only version of the class is determined, I can say this based on the size of the class itself and the offsets of the fields to the member variables, comparing with those that are saved in the game code.
The text was updated successfully, but these errors were encountered:
This is another tough issue, and just wanted to give you an initial reply. I have a related internal issue to re-investigate the mechanism used for mapping forward references to structure definitions. However, that whole mechanism stinks in general, but has been instrumental to getting PDB Universal working. The issue is that, in many places, forward references are used, such as with a pointer to a specific structure. The forward reference is just an empty version of the named structure. Something has to match this forward reference to the actual definition.
Normally, we should have a complete picture. A function might have an argument that is the pointer to struct A. Internal to the structure, there might be access to offsets within A but no real definition to A. In this case, we would need to rely on data type propagation from the calling function (and possibly further upstream from it). If a function has a local struct A, then we would know the precise definition. If the function only has a pointer to A and then does an allocation (new) as in your example, then we might have a chance to limit the definitions (like you did) to a shorter list of those that have the same size (that is unless there is a flexible array at the end of the structure). If we had a complete picture of all data type usage (to include locals and parameters), we would have a better chance of determining the correct definition to use at any place using any of these techniques. I cannot say for sure that this would fix everything, but I feel that it would go a long way.
The specific issue in your 1_1 (the only one I've looked at so far) examples is that there is only one forward reference to CBasePlayer, but 18-20 different definitions. What the "mapper" does, is to try to map one forward reference to one definitions; I would normally see the same number of each from examples I had built. And I had normally always tried to do clean builds. But when I would forget to do a clean build, I would encounter what we see in this 1_1 example: multiple definitions, where generally, the older definitions do not matter, but they still exist within the PDB. However... there are times where multiple different definitions could be found within a program, so we cannot pick just one and run with it either.
I'd like to fix this issue, but it would likely take some time and rely on other fixes, such as for #7099. I wish I could give you the best guidance for before continuing to work. I think you might be able to fix things by hand by noting the size issues (advise to check for flex arrays). You also might find that using the PDB MSDIA solution might work (I have not yet tried).
Describe the bug
When importing some DLLs with their PDBs, Ghidra creates multiple versions of the same class names, all different from each other. By default, the wrong version of the class is used, which makes everything problematic to read.
To Reproduce / Expected behavior
1_0
or1_1
folder (when importing other versions, i.e. 1.2 and newer, this does not occur, but I included them in the archive just in case. For my testing I used binaries from the1_1
folder!!!).0x26A0
(1_1 version), in the GIFs below, my research from additional context column and decompile in IDA Pro confirms this. Some classes also have the same issue (you can find them by simply typing.conflict
substring into Data Type Manager window).If it can help you somehow, then the hierarchy of the same CBasePlayer is as follows:
CBaseEntity -> CBaseDelay -> CBaseAnimating -> CBaseToggle -> CBaseMonster -> CBasePlayer
Screenshots
Attachments
application.log
cryoffear-binaries-server.zip
Environment (please complete the following information):
Additional context
I know that the issue title may not sound quite correct, so feel free to edit it. I'll try to be as brief as I can.
I managed before to import a few dozen PDBs from other different games that were compiled on different MSVC versions and linker flags, and in all this time I encountered such a bug only in two DLLs. And the most interesting thing is that both of these DLLs are the first two versions of the same game modification.
But in next versions of that game modification, this did not happen (but the versions as a whole did not differ from each other significantly, rather bug fixes and the addition of a couple of small features), so I have an assumption that perhaps some options were added or removed used in the linker for newer patches, due to which Ghidra perceives PDB reading better? Their toolchain remained the same judging by linker version from
dumpbin /headers
(MSVC 7.10 aka .NET 2003).For context on why I was looking for this via the
player
function in the first GIF to get the size of the class, see the following lines from the source code:https://github.com/HLSources/Paranoia/blob/c2d8018e3fa9de209ba76c0bbe8491bc09d8d797/code_src/dlls/player.cpp#L342
https://github.com/HLSources/Paranoia/blob/c2d8018e3fa9de209ba76c0bbe8491bc09d8d797/code_src/dlls/util.h#L90-L93
https://github.com/HLSources/Paranoia/blob/c2d8018e3fa9de209ba76c0bbe8491bc09d8d797/code_src/dlls/cbase.h#L887-L908
https://github.com/HLSources/Paranoia/blob/c2d8018e3fa9de209ba76c0bbe8491bc09d8d797/code_src/dlls/cbase.h#L341-L345
https://github.com/HLSources/Paranoia/blob/c2d8018e3fa9de209ba76c0bbe8491bc09d8d797/code_src/dlls/util.h#L102-L107
And fortunately, in the Half-Life code (and the modification is based on it), for the save-restore system, the developers save some fields and their offsets into the game code using
offsetof
and macros:https://github.com/HLSources/Paranoia/blob/c2d8018e3fa9de209ba76c0bbe8491bc09d8d797/code_src/engine/eiface.h#L367-L384
https://github.com/HLSources/Paranoia/blob/c2d8018e3fa9de209ba76c0bbe8491bc09d8d797/code_src/dlls/player.cpp#L88
This will serve as an additional assistant in our case, which will confirm whether the offset of a given field matches what is specified in the class or not. I showed this in the example in the second GIF.
Based on the same GUID from DLL and PDB match, the modification date of both files is the same and the fact that they were supplied with the modification by the developer and updated with each new patch - there is no doubt to think that all this is genuine. In
IDA Pro
, the correct and only version of the class is determined, I can say this based on the size of the class itself and the offsets of the fields to the member variables, comparing with those that are saved in the game code.The text was updated successfully, but these errors were encountered: