Extract pattern matching functionality is bad #104

Open
Morilli opened this issue Oct 5, 2024 · 3 comments

Comments

@Morilli
Contributor

Morilli commented Oct 5, 2024

The description for the --pattern option in wad extract reads: extract only files matching pattern with shell-like wildcards.
One problem is that while it's possible to pass the option multiple times, like -p "*.bin" -p "*.dds", this is not clear from reading the command description alone.
Furthermore, it is not possible to filter only extensionless files. Perhaps the extracted paths should also be sanitized before being passed to the pattern matching function to allow filtering by "guessed" extensions, like .cdtb.bin.

Maybe just making the pattern a regex would also solve all of this, as you can match basically everything with regex alone.
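To illustrate the limitation, here is a minimal sketch using only the Python standard library. The sample paths are made up, and the regex is just one possible definition of "extensionless" (no dot in the last path component); neither is taken from cdtb.

```python
import fnmatch
import re

# Made-up sample paths, for illustration only.
paths = [
    "data/characters/aatrox/aatrox.bin",
    "assets/textures/icon.dds",
    "data/unknown/1a2b3c4d5e6f7a8b",
]

# Several shell-like patterns combine with OR, as with -p "*.bin" -p "*.dds".
patterns = ["*.bin", "*.dds"]
by_glob = [p for p in paths if any(fnmatch.fnmatchcase(p, pat) for pat in patterns)]

# In fnmatch, "*" also matches "." and "/", so a single pattern cannot say
# "a last component of any length that contains no dot".
# A regex can express it directly.
extensionless = re.compile(r"(?:^|/)[^./]+$")
by_regex = [p for p in paths if extensionless.search(p)]

print(by_glob)
print(by_regex)
```

With these inputs, the glob patterns select the two files with an extension and the regex selects only the extensionless one.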

@benoitryder
Member

> Furthermore, it is not possible to filter only extensionless files.

Shell-like patterns are fairly limited, but regular expressions are less widely known and less convenient for simple cases.

We could add an option to extract only the files listed in a file (or on stdin).
The user would then be able to filter with any kind of regexp or script.
And one-liners would still be possible; something like this:

cdtb wad-list some.wad | grep -v '\.[a-z0-9]\+$' | cdtb wad-extract --from-list - some.wad
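As a rough sketch of what the receiving side of such an option might do, assuming one path per line and "-" meaning stdin: the helper names `read_path_list` and `keep_listed` are hypothetical and are not part of cdtb.

```python
import sys


def read_path_list(source):
    """Read one path per line from a file name, or from stdin if source is "-"."""
    if source == "-":
        lines = sys.stdin.read().splitlines()
    else:
        with open(source, encoding="utf-8") as f:
            lines = f.read().splitlines()
    # Strip surrounding whitespace and ignore blank lines.
    return {line.strip() for line in lines if line.strip()}


def keep_listed(paths, wanted):
    """Keep only the paths present in the wanted set, preserving input order."""
    return [p for p in paths if p in wanted]
```

A set makes each membership test cheap, which matters when an archive holds many files.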

@Morilli
Contributor Author

Morilli commented Oct 23, 2024

What about this?

diff --git a/cdtb/__main__.py b/cdtb/__main__.py
index 2981da8..4629922 100644
--- a/cdtb/__main__.py
+++ b/cdtb/__main__.py
@@ -133,10 +133,11 @@ def command_wad_extract(parser, args):
     elif args.unknown == 'no':
         wad.files = [wf for wf in wad.files if wf.path is not None]
 
+    wad.guess_extensions()
+    wad.sanitize_paths()
     if args.pattern:
         wad.files = [wf for wf in wad.files if any(wf.path is not None and fnmatch.fnmatchcase(wf.path, p) for p in args.pattern)]
 
-    wad.guess_extensions()
     wad.extract(args.output, overwrite=not args.lazy)
 
 

This would transform file paths before applying the pattern, allowing patterns like *.cdtb.bin.
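The effect of the reordering can be shown with a small self-contained sketch. `FakeFile` and `guess_extension` are stand-ins invented for this example; how cdtb actually guesses extensions is not shown in this thread.

```python
import fnmatch
from dataclasses import dataclass


@dataclass
class FakeFile:
    path: str  # path as stored in the archive


def guess_extension(f):
    # Stand-in: pretend content sniffing identified an extensionless file as .bin.
    if "." not in f.path.rsplit("/", 1)[-1]:
        f.path += ".cdtb.bin"


def select(files, patterns):
    # Same matching expression as in the diff, minus the None check.
    return [f for f in files if any(fnmatch.fnmatchcase(f.path, p) for p in patterns)]


files = [FakeFile("data/unknown/1a2b3c4d")]

# Pattern applied to the raw path: nothing matches.
before = select(files, ["*.cdtb.bin"])

# Pattern applied after the path transformation: the file matches.
for f in files:
    guess_extension(f)
after = select(files, ["*.cdtb.bin"])
```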

@benoitryder
Member

We should probably (at least by default) apply patterns to what's really in the file.
Otherwise, it makes sense.
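One way to keep raw paths as the default while still allowing the new behavior is an opt-in switch. This is only a sketch: the `matches` helper and its `use_guessed` parameter are hypothetical and do not exist in cdtb.

```python
import fnmatch


def matches(raw_path, guessed_path, patterns, use_guessed=False):
    """Return True if the chosen path matches any of the patterns.

    raw_path: path as stored in the archive (None if unknown).
    guessed_path: path after extension guessing and sanitizing.
    use_guessed: hypothetical opt-in switch, off by default.
    """
    target = guessed_path if use_guessed else raw_path
    if target is None:
        return False
    return any(fnmatch.fnmatchcase(target, p) for p in patterns)
```

By default the pattern sees only what is really in the file; passing `use_guessed=True` makes patterns like `*.cdtb.bin` usable.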
