Windows support #52

Open
alaabenfatma opened this issue Oct 15, 2020 · 4 comments
Labels
proposal Ask/propose a new feature

Comments

@alaabenfatma
Contributor

alaabenfatma commented Oct 15, 2020

The module does not support Windows. It would be great to add this support to make the project OS independent.

State: Work in progress... ⌛

@alaabenfatma alaabenfatma added the proposal Ask/propose a new feature label Oct 15, 2020
@alaabenfatma alaabenfatma self-assigned this Oct 15, 2020
@alaabenfatma alaabenfatma changed the title [WIP] Windows support [WIP ⌛] Windows support Oct 15, 2020
@marcorosa
Member

Hi, the reason the project does not support Windows is that a library essential to the scanner (hyperscan) does not support Windows systems.

From the home page of python-hyperscan:

A CPython extension for Hyperscan, Intel's open source, high-performance multiple regex matching library.
Currently only supports manylinux-compatible Linux distributions.

That's why we decided to provide docker containers: this way it is possible to use Credential Digger (with the UI) also from Windows.

@alaabenfatma
Contributor Author

Hi @marcorosa,
You are totally right about the Hyperscan part.
I also agree that Docker works around this limitation for users who just want to use the tool as is. However, it leaves Windows users with a tool that cannot interact with anything on the host machine (Windows in this case). For example, we cannot integrate the project into a CI/CD pipeline on Windows, because it can be neither executed nor communicated with directly on that OS.

My solution, which is in progress (+), targets the core of the project (the GitScanner module), in the hope of delivering an OS-independent Python module and taking us one step closer to plug-and-play.

@marcorosa
Member

So, if I understand correctly, are you trying to replace these lines?

Hyperscan was the best-performing library we tested; I hope performance will stay reasonably close with another library. In the meantime, I'll stay tuned for your PR ;)

@alaabenfatma
Contributor Author

I agree, Hyperscan is the best when it comes to performance for our use case, and I believe that removing it from the project is not a good choice.
That said, I am planning on breaking the GitScanner module into three files:
.
└── GitScanner/
    ├── GitScanner_Linux/
    │   └── (that uses Hyperscan)
    └── GitScanner_Windows/
        └── (that uses regex)

This can be achieved with a platform test that checks whether we are running on Windows. Based on that test, we can decide which class to use and which library to load. As for the dependencies, we will install Hyperscan on Linux and Regex on Windows.
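A minimal sketch of such a platform test, using only the standard library. The class names `HyperscanBackend` and `RegexBackend` and the helper `select_backend` are hypothetical placeholders for illustration, not names taken from the project:

```python
import platform


class HyperscanBackend:
    """Hypothetical placeholder for the Linux scanner that would wrap Hyperscan."""
    name = "hyperscan"


class RegexBackend:
    """Hypothetical placeholder for the Windows scanner that would wrap regex."""
    name = "regex"


def select_backend(system=None):
    """Return the backend class for a platform name.

    `system` defaults to platform.system(), which returns strings such as
    "Linux", "Windows", or "Darwin".
    """
    if system is None:
        system = platform.system()
    # Only Windows takes the regex path; every other platform keeps Hyperscan.
    return RegexBackend if system == "Windows" else HyperscanBackend


if __name__ == "__main__":
    print(select_backend().name)
```

In a real module, the `import hyperscan` or `import regex` statement would presumably live inside the chosen branch, so that the library missing on a given platform is never imported.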

The following changes will be included in the requirements.txt file:

hyperscan; platform_system != "Windows"
regex; platform_system == "Windows"

In other words, Linux users will not experience any performance loss, whereas Windows users will gain access to the tool (with a slight performance loss as a result of not using Hyperscan).
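To illustrate where the performance difference could come from, here is a rough sketch of a regex-based fallback that tries each rule in turn against a line, whereas Hyperscan matches all patterns in one pass. The standard-library `re` module stands in for the third-party `regex` package here, and the `RegexScanner` class and the rules are assumptions made up for this example, not the project's actual code or rule set:

```python
import re

# Hypothetical rules for illustration only; not the project's real rule set.
RULES = {
    1: r"password\s*=\s*\S+",
    2: r"AKIA[0-9A-Z]{16}",
}


class RegexScanner:
    """Hypothetical fallback scanner that checks every rule against a line."""

    def __init__(self, rules):
        # Compile each pattern once, keeping its rule id alongside it.
        self._compiled = [(rule_id, re.compile(p)) for rule_id, p in rules.items()]

    def scan(self, line):
        """Return the ids of all rules that match somewhere in `line`."""
        return [rule_id for rule_id, pattern in self._compiled if pattern.search(line)]


if __name__ == "__main__":
    scanner = RegexScanner(RULES)
    print(scanner.scan("password = hunter2"))
```

The cost of this approach grows with the number of rules, since every line is searched once per pattern.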

It is somewhat similar to what we've done with the SQLite and Postgres clients.

2 participants