You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
While trying to run checkov's test suit on my local machine I noticed that some files were being (unexpectedly) ignored. After a little digging, I realised that this is because I cloned the repo under a path that looks something like: ~/my_git_forks/checkov.
The reason this is a problem is because the following patterns are treated as regular expressions by checkov:
and the .git pattern obviously excludes any paths that include the word 'git' (incl. at least one preceding character)
The more fundamental issues
I think the more fundamental problem is that checkov documents --skip-path as including regexes, documents CKV_IGNORED_DIRECTORIES to include non-regex patterns (default values arenode_modules,.terraform,.serverless instead of node_modules,\\.terraform,\\.serverless), but then treats them equally in the code by merging them in some places, and explicitly treats them as both regexes and non-regex patterns in the same line of code.
I think someone has noticed something similar in the past because I see a few p.replace(".terraform", r"\.terraform") sprinkled around the code base.
However this just seems like another inconsistency because if users set --skip-path="^.*\.terraform.*$", it will get converted internally to ^.*\\.terraform.*$, which would not ignore paths like path/to/.terraform.
How to fix it?
Since checkov allows users to pass regexes to --skip-path and regards CKV_IGNORED_DIRECTORIES as legacy, it makes sense to consistently treat all patterns as regular expressions. However, since I just started looking at this project a couple of days ago, I have no idea what the consequences of doing this are...
Some (naive?) suggestions
Escape the default values for CKV_IGNORED_DIRECTORIES:
Remove any hard-coded references to p.replace(".terraform", r"\.terraform")
Consistently threat all references to ignored_directories, excluded_paths, and skip_path as regex patterns. This means removing all ~path in skip_path invocations since this is not compatible with regex (e.g. "\.hidden" not in ".hidden/secret.txt")
As a safeguard and to avoid ~path in skip_path code being committed in the first place, the excluded_paths variable should probably be a set of re.patterns instead of strings. e.g.:
Currently, patterns set in the form of .hidden will always exclude directories of the form .hidden-not. However, this is a problem both in the non-regex ~path in skip_path way of doing things and naive regex definitions. However, at least with regex the user can be explicit and say something like (^|.*\/)\.hidden\/?
However, this is not very user friendly and maybe there should be a way to allow users to define excluded directories in a .hidden,also_hidden flavour rather than the verbose (and non-Windown-compatible) --skip-path='(^|.*\/)\.hidden\/?' --skip-path='(^|.*\/)also_hidden\/?'
The text was updated successfully, but these errors were encountered:
Hey! New checkov user here 👋
How I noticed the problem
While trying to run checkov's test suit on my local machine I noticed that some files were being (unexpectedly) ignored. After a little digging, I realised that this is because I cloned the repo under a path that looks something like:
~/my_git_forks/checkov
.The reason this is a problem is because the following patterns are treated as regular expressions by checkov:
checkov/checkov/secrets/utils.py
Line 10 in 42d3178
and the
.git
pattern obviously excludes any paths that include the word 'git' (incl. at least one preceding character)The more fundamental issues
I think the more fundamental problem is that checkov documents
--skip-path
as including regexes, documentsCKV_IGNORED_DIRECTORIES
to include non-regex patterns (default values arenode_modules,.terraform,.serverless
instead ofnode_modules,\\.terraform,\\.serverless
), but then treats them equally in the code by merging them in some places, and explicitly treats them as both regexes and non-regex patterns in the same line of code.I think someone has noticed something similar in the past because I see a few
p.replace(".terraform", r"\.terraform")
sprinkled around the code base.However this just seems like another inconsistency because if users set
--skip-path="^.*\.terraform.*$"
, it will get converted internally to^.*\\.terraform.*$
, which would not ignore paths likepath/to/.terraform
.How to fix it?
Since checkov allows users to pass regexes to
--skip-path
and regardsCKV_IGNORED_DIRECTORIES
as legacy, it makes sense to consistently treat all patterns as regular expressions. However, since I just started looking at this project a couple of days ago, I have no idea what the consequences of doing this are...Some (naive?) suggestions
CKV_IGNORED_DIRECTORIES
:checkov/checkov/common/runners/base_runner.py
Line 46 in 42d3178
i.e.,:
Same needs to be done here:
checkov/checkov/common/util/env_vars_config.py
Line 44 in 42d3178
checkov/checkov/secrets/utils.py
Line 10 in 42d3178
p.replace(".terraform", r"\.terraform")
ignored_directories
,excluded_paths
, andskip_path
as regex patterns. This means removing all ~path in skip_path
invocations since this is not compatible with regex (e.g."\.hidden" not in ".hidden/secret.txt"
)path in skip_path
code being committed in the first place, theexcluded_paths
variable should probably be a set ofre.pattern
s instead of strings. e.g.:Other things to consider
Currently, patterns set in the form of
.hidden
will always exclude directories of the form.hidden-not
. However, this is a problem both in the non-regex ~path in skip_path
way of doing things and naive regex definitions. However, at least with regex the user can be explicit and say something like(^|.*\/)\.hidden\/?
However, this is not very user friendly and maybe there should be a way to allow users to define excluded directories in a
.hidden,also_hidden
flavour rather than the verbose (and non-Windown-compatible)--skip-path='(^|.*\/)\.hidden\/?' --skip-path='(^|.*\/)also_hidden\/?'
The text was updated successfully, but these errors were encountered: