Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

better file comparing #142

Merged
merged 2 commits into from
Jan 30, 2024
Merged

better file comparing #142

merged 2 commits into from
Jan 30, 2024

Conversation

MilyMilo
Copy link
Contributor

@MilyMilo MilyMilo commented Jan 25, 2024

Use changes in CTFd 3.7+ (sha1sum for files) to determine whether files should be re-uploaded.

CTFd/CTFd#2451

By comparing the hash we can avoid always deleting and re-uploading files when syncing or verifying challenges.

Also did some housekeeping:

  • load_installed_challenge should throw an exception when it fails to load the challenge. This has been always assumed and there are checks before that call, but there's just no reason for it to return None instead of throwing.
  • Moved helper functions to be before the public functions, as they were all over the place.

These changes should be backward compatible, sha1sum has been treated completely optionally, and we'll still use the old behavior if it's not present.

Comment on lines +437 to +451
# Create a dictionary of remote files in { basename: {"url": "", "location": ""} } format
def _normalize_remote_files(self, remote_files: List[str]) -> Dict[str, Dict[str, str]]:
normalized = {}
for f in remote_files:
file_parts = f.split("?token=")[0].split("/")
normalized[file_parts[-1]] = {"url": f, "location": f"{file_parts[-2]}/{file_parts[-1]}"}

return normalized

# Create a dictionary of sha1sums in { location: sha1sum } format
def _get_files_sha1sums(self) -> Dict[str, str]:
r = self.api.get("/api/v1/files?type=challenge")
r.raise_for_status()
return {f["location"]: f.get("sha1sum", None) for f in r.json()["data"]}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The diff got a bit messy but this is the first real addition

Comment on lines 502 to +533
if "files" not in ignore:
self._delete_existing_files()
if challenge.get("files"):
self._create_files()
# Get basenames of local files to compare against remote files
local_files = {f.split("/")[-1]: f for f in self.get("files", [])}
remote_files = self._normalize_remote_files(remote_challenge.get("files", []))

# Delete remote files which are no longer defined locally
for remote_file in remote_files:
if remote_file not in local_files:
self._delete_file(remote_files[remote_file]["location"])

# Only check for file changes if there are files to upload
if local_files:
sha1sums = self._get_files_sha1sums()
for local_file_name in local_files:
# Creating a new file
if local_file_name not in remote_files:
self._create_file(self.challenge_directory / local_files[local_file_name])
continue

# Updating an existing file
# sha1sum is present in CTFd 3.7+, use it instead of always re-uploading the file if possible
remote_file_sha1sum = sha1sums[remote_files[local_file_name]["location"]]
if remote_file_sha1sum is not None:
with open(self.challenge_directory / local_files[local_file_name], "rb") as lf:
local_file_sha1sum = hash_file(lf)

if local_file_sha1sum == remote_file_sha1sum:
continue

# if sha1sums are not present, or the hashes are different, re-upload the file
self._delete_file(remote_files[local_file_name]["location"])
self._create_file(self.challenge_directory / local_files[local_file_name])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the updated code for comparing the sums with sync

Comment on lines -729 to +812
local_files = {Path(f).name: f for f in challenge.get("files", [])}
remote_files = {f.split("/")[-1].split("?token=")[0]: f for f in remote_challenge["files"]}
# Check if files defined in challenge.yml are present
try:
self._validate_files()
local_files = {Path(f).name: f for f in challenge.get("files", [])}
except InvalidChallengeFile:
return False

remote_files = self._normalize_remote_files(remote_challenge["files"])
# Check if there are no extra local files
for local_file in local_files:
if local_file not in remote_files:
return False

sha1sums = self._get_files_sha1sums()
# Check if all remote files are present locally
for remote_file in remote_files:
if remote_file not in local_files:
for remote_file_name in remote_files:
if remote_file_name not in local_files:
return False

# Check if the remote files are the same as local
r = self.api.get(remote_files[remote_file])
# sha1sum is present in CTFd 3.7+, use it instead of downloading the file if possible
remote_file_sha1sum = sha1sums[remote_files[remote_file_name]["location"]]
if remote_file_sha1sum is not None:
with open(self.challenge_directory / local_files[remote_file_name], "rb") as lf:
local_file_sha1sum = hash_file(lf)

if local_file_sha1sum != remote_file_sha1sum:
return False

return True

# If sha1sum is not present, download the file and compare the contents
r = self.api.get(remote_files[remote_file_name]["url"])
r.raise_for_status()
remote_file_contents = r.content
local_file_contents = (self.challenge_directory / local_files[remote_file]).read_bytes()
local_file_contents = (self.challenge_directory / local_files[remote_file_name]).read_bytes()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are changes for mirror

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the function has been copied from CTFd to be compatible

@ColdHeat ColdHeat merged commit ff5c8a9 into CTFd:master Jan 30, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants