VMRay and dynamic improvements #2537

Merged on Dec 17, 2024 (5 commits)
Changes from 1 commit
10 changes: 5 additions & 5 deletions capa/features/extractors/vmray/__init__.py
@@ -151,9 +151,8 @@ def _compute_sections(self):
             for pefile_section in self.sample_file_static_data.pe.sections:
                 self.sections[pefile_section.virtual_address] = pefile_section.name
         elif self.sample_file_static_data.elf:
-            if self.sample_file_static_data.elf.sections:
-                for elffile_section in self.sample_file_static_data.elf.sections:
-                    self.sections[elffile_section.header.sh_addr] = elffile_section.header.sh_name
+            for elffile_section in self.sample_file_static_data.elf.sections:
+                self.sections[elffile_section.header.sh_addr] = elffile_section.header.sh_name

     def _compute_monitor_processes(self):
         for process in self.sv2.processes.values():
@@ -193,13 +192,14 @@ def _compute_monitor_processes(self):
                 # for the other fields we've observed cases with slight deviations, e.g.,
                 # the ppid for a process in flog.xml is not set correctly, all other data is equal
                 sv2p = self.monitor_processes[monitor_process.process_id]
-                if self.monitor_processes[monitor_process.process_id] != vmray_monitor_process:
-                    logger.debug("processes differ: %s (sv2) vs. %s (flog)", sv2p, vmray_monitor_process)
+
                 assert (sv2p.pid, sv2p.monitor_id, sv2p.origin_monitor_id) == (
                     vmray_monitor_process.pid,
                     vmray_monitor_process.monitor_id,
                     vmray_monitor_process.origin_monitor_id,
                 )
+                if self.monitor_processes[monitor_process.process_id] != vmray_monitor_process:
+                    logger.debug("processes differ: %s (sv2) vs. %s (flog)", sv2p, vmray_monitor_process)

     def _compute_monitor_threads(self):
         for monitor_thread in self.flog.analysis.monitor_threads:
2 changes: 1 addition & 1 deletion capa/features/extractors/vmray/models.py
@@ -276,7 +276,7 @@ class ElfFileHeader(BaseModel):

 class ElfFile(BaseModel):
     # file_header: ElfFileHeader
-    sections: Optional[list[ElfFileSection]] = None
+    sections: list[ElfFileSection] = []
Collaborator

incidentally, is this the correct way to set the default value, particularly as a list? i see this pattern used throughout the file.

my worry is that the default value = [] uses the same instance of a mutable list, rather than copies of it. sorta like when you have a kwarg parameter def foo(bar=[]).

in the past, i've used pydantic.Field for these. but maybe pydantic is extra smart and doesn't require this. @mr-tz @mike-hunhoff
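
For reference, the shared-default pitfall raised above can be reproduced in plain Python. This is only a sketch: `foo` and `bar` come from the comment, while `foo_safe` is a hypothetical name for the conventional fix.

```python
def foo(item, bar=[]):
    # the default list is built once, when the function is defined
    bar.append(item)
    return bar


first = foo("a")
second = foo("b")

# both calls mutated and returned the very same list object
shared = first is second
leaked = list(second)


def foo_safe(item, bar=None):
    # conventional fix: create a fresh list on every call
    if bar is None:
        bar = []
    bar.append(item)
    return bar


# each call without an explicit list gets its own object
independent = foo_safe("a") is not foo_safe("b")
```

The open question in this thread is whether a pydantic field default behaves like `foo` or like `foo_safe`.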

Collaborator Author

hm, great question. mypy accepted the change in this form, and I saw the pattern throughout the file. Other files use Optional[list[<foo>]] = None or Field; we should clean up the inconsistencies (separately).

Collaborator

My understanding is that pydantic handles this correctly (i.e. it copies the default for each instance) for non-hashable default values (e.g. lists). source: https://docs.pydantic.dev/latest/concepts/models/#fields-with-non-hashable-default-values
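
A quick way to check this locally, assuming pydantic is installed. `SectionsDefault` and `SectionsFactory` are hypothetical stand-ins for `ElfFile`, with plain strings in place of `ElfFileSection`; this is a sketch, not code from the PR.

```python
from pydantic import BaseModel, Field


class SectionsDefault(BaseModel):
    # hypothetical stand-in for ElfFile, using the `= []` pattern from this PR
    sections: list[str] = []


class SectionsFactory(BaseModel):
    # explicit alternative mentioned earlier in the thread
    sections: list[str] = Field(default_factory=list)


a, b = SectionsDefault(), SectionsDefault()
a.sections.append(".text")

# if the default list were shared, b would now contain ".text" as well
assert a.sections == [".text"]
assert b.sections == []

c, d = SectionsFactory(), SectionsFactory()
c.sections.append(".data")
assert d.sections == []
```

Both spellings should give each instance its own list, so the choice between them is mostly about consistency across the codebase.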

Collaborator

whoa that's cool!

so, looks good to me. and maybe we can update our remaining code to use this pattern.



class StaticData(BaseModel):
4 changes: 2 additions & 2 deletions tests/test_vmray_model.py
@@ -103,8 +103,8 @@ def test_vmray_model_elffile():
         """
     )

-    assert elffile.sections and elffile.sections[0].header.sh_name == "abcd1234"
-    assert elffile.sections and elffile.sections[0].header.sh_addr == 2863311530
+    assert elffile.sections[0].header.sh_name == "abcd1234"
+    assert elffile.sections[0].header.sh_addr == 2863311530


 def test_vmray_model_pefile():