feat(biome_glob): add dedicated crate for globs #4609
@@ -0,0 +1,27 @@ | ||
|
||
[package] | ||
authors.workspace = true | ||
categories.workspace = true | ||
description = "<DESCRIPTION>" | ||
edition.workspace = true | ||
homepage.workspace = true | ||
keywords.workspace = true | ||
license.workspace = true | ||
name = "biome_glob" | ||
repository.workspace = true | ||
version = "0.1.0" | ||
|
||
[lints] | ||
workspace = true | ||
|
||
[dependencies] | ||
biome_deserialize = { workspace = true, optional = true } | ||
biome_text_size = { workspace = true, optional = true } | ||
globset = { workspace = true } | ||
schemars = { workspace = true, optional = true } | ||
serde = { workspace = true, optional = true } | ||
|
||
[features] | ||
biome_deserialize = ["dep:biome_deserialize", "dep:biome_text_size"] | ||
schemars = ["dep:schemars"] | ||
serde = ["dep:serde"] |
@@ -1,31 +1,86 @@ | ||||||
use biome_rowan::{TextRange, TextSize}; | ||||||
//! `biome_glob` provides globbing functionality. When listing the globs to match, it is also possible to provide globs that function as "exceptions" by prefixing them with `!`. | ||||||
//! | ||||||
//! ## Matching a path against a glob | ||||||
//! | ||||||
//! You can create a glob from a string using [core::str::FromStr::from_str] or the corresponding method `parse`. | ||||||
//! A glob can match against anything that can be turned into a [std::path::Path]. | ||||||
//! This is, for example, the case of strings. | ||||||
//! | ||||||
//! ``` | ||||||
//! use biome_glob::Glob; | ||||||
//! | ||||||
//! let glob = "*.rs".parse::<Glob>().expect("correct glob"); | ||||||
//! assert!(glob.is_match("lib.rs")); | ||||||
//! assert!(!glob.is_match("src/lib.rs")); | ||||||
//! ``` | ||||||
//! | ||||||
//! ## Matching against multiple globs | ||||||
//! | ||||||
//! When a path is expected to be matched against several globs, | ||||||
//! you should compile the path into a [CandidatePath]. | ||||||
//! [CandidatePath] may speed up matching against several globs. | ||||||
//! | ||||||
//! ``` | ||||||
Let's add a paragraph that explains the example.
||||||
//! use biome_glob::{CandidatePath, Glob}; | ||||||
//! | ||||||
//! let globs: &[Glob] = &[ | ||||||
//! "**/*.rs".parse().expect("correct glob"), | ||||||
//! "**/*.txt".parse().expect("correct glob"), | ||||||
//! ]; | ||||||
//! | ||||||
//! let path = CandidatePath::new(&"a/path/to/file.txt"); | ||||||
//! | ||||||
//! assert!(globs.iter().any(|glob| path.matches(glob))); | ||||||
//! ``` | ||||||
//! | ||||||
//! ## Matching against multiple globs and exceptions | ||||||
//! | ||||||
//! biome_glob supports negated globs, which are particularly useful for encoding exceptions. | ||||||
//! In the following example, we accept all files ending with the `rs` extension and reject the ones ending with the `txt` extension. | ||||||
What does "encoding exceptions" actually mean? We could try to use simpler language. I would add a small link/reference to
||||||
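One way to read "exceptions" in simpler terms: the last glob in the list that matches the path decides the result, and a negated glob decides "no". The sketch below illustrates that reading with the standard library only. It is not the crate's code: `Rule` and the small predicate functions are hypothetical stand-ins for compiled globs, and the two free functions only borrow the names of the `CandidatePath` methods shown in this diff. The behaviour is inferred from the tests in this pull request and from the "iterate in reverse order" comment.

```rust
/// Hypothetical stand-in for a compiled glob: a negation flag plus a predicate.
struct Rule {
    negated: bool,
    matches: fn(&str) -> bool,
}

// Hypothetical predicates standing in for the globs `*`, `a*`, `a`, and `b`.
fn any_name(_path: &str) -> bool { true }
fn starts_with_a(path: &str) -> bool { path.starts_with('a') }
fn is_a(path: &str) -> bool { path == "a" }
fn is_b(path: &str) -> bool { path == "b" }

/// Returns the verdict of the last rule that matches `path`,
/// or `default` when no rule matches.
fn matches_with_exceptions_or(default: bool, path: &str, rules: &[Rule]) -> bool {
    // Walk from the last rule to the first: the first hit in this order is
    // the last matching rule, so earlier rules never need to be evaluated.
    for rule in rules.iter().rev() {
        if (rule.matches)(path) {
            // A negated rule is an exception: it rejects the path.
            return !rule.negated;
        }
    }
    default
}

/// Same as above, rejecting the path when no rule matches.
fn matches_with_exceptions(path: &str, rules: &[Rule]) -> bool {
    matches_with_exceptions_or(false, path, rules)
}
```

With the rules `*`, `!a*`, `a`, the path `a` is accepted because the last rule that matches it (`a`) is not negated. With only `*` and `!a*`, it is rejected.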
//! | ||||||
//! ``` | ||||||
//! use biome_glob::{CandidatePath, Glob}; | ||||||
//! | ||||||
//! let globs: &[Glob] = &[ | ||||||
//! "**/*.rs".parse().expect("correct glob"), | ||||||
//! "!**/*.txt".parse().expect("correct glob"), | ||||||
//! ]; | ||||||
//! | ||||||
//! let path = CandidatePath::new(&"a/path/to/file.txt"); | ||||||
//! | ||||||
//! assert!(!path.matches_with_exceptions(globs)); | ||||||
//! ``` | ||||||
//! | ||||||
//! ## Supported syntax | ||||||
//! | ||||||
//! A Biome glob pattern supports the following syntaxes: | ||||||
//! | ||||||
//! - star `*` that matches zero or more characters inside a path segment | ||||||
//! - globstar `**` that matches zero or more path segments | ||||||
//! - Use `\*` to escape `*` | ||||||
//! - `?`, `[`, `]`, `{`, and `}` must be escaped using `\`. | ||||||
//! These characters are reserved for future use. | ||||||
//! - Use `!` as first character to negate the glob | ||||||
//! | ||||||
Comment on lines +58 to +63: I think we should provide a minimal example for each bullet that we explain.
||||||
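As a possible illustration of the escaping bullet, the sketch below scans a pattern for the first reserved character that is not preceded by `\`. It is a standard-library-only sketch, not the crate's parser: `first_unescaped_reserved` is a hypothetical helper, it reports a byte index, and it ignores other error cases such as a trailing `\`.

```rust
/// Returns the byte index of the first unescaped reserved character
/// (`?`, `[`, `]`, `{`, `}`) in `pattern`, or `None` if there is none.
/// Hypothetical helper for illustration only.
fn first_unescaped_reserved(pattern: &str) -> Option<usize> {
    let mut chars = pattern.char_indices();
    while let Some((index, c)) = chars.next() {
        match c {
            // A backslash escapes the next character, so skip it.
            '\\' => {
                chars.next();
            }
            // Reserved characters must be escaped.
            '?' | '[' | ']' | '{' | '}' => return Some(index),
            _ => {}
        }
    }
    None
}
```

Under these assumptions, `file?.rs` is reported at index 4, while `file\?.rs` and `*.rs` are accepted.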
//! A path segment is delimited by path separator `/` or the start/end of the path. | ||||||
//! | ||||||
|
||||||
I believe we should have a brief explanation of the features made available by this crate.
||||||
/// A restricted glob pattern only supports the following syntaxes: | ||||||
/// | ||||||
/// - star `*` that matches zero or more characters inside a path segment | ||||||
/// - globstar `**` that matches zero or more path segments | ||||||
/// - Use `\*` to escape `*` | ||||||
/// - `?`, `[`, `]`, `{`, and `}` must be escaped using `\`. | ||||||
/// These characters are reserved for future use. | ||||||
/// - Use `!` as first character to negate the glob | ||||||
/// | ||||||
/// A path segment is delimited by path separator `/` or the start/end of the path. | ||||||
#[derive(Clone, Debug, serde::Deserialize, serde::Serialize)] | ||||||
#[serde(try_from = "String", into = "String")] | ||||||
pub struct RestrictedGlob { | ||||||
/// A Biome glob pattern. | ||||||
#[derive(Clone, Debug)] | ||||||
#[cfg_attr(feature = "serde", derive(serde::Deserialize, serde::Serialize))] | ||||||
#[cfg_attr(feature = "serde", serde(try_from = "String", into = "String"))] | ||||||
pub struct Glob { | ||||||
is_negated: bool, | ||||||
glob: globset::GlobMatcher, | ||||||
} | ||||||
impl RestrictedGlob { | ||||||
impl Glob { | ||||||
/// Returns `true` if this glob is negated. | ||||||
/// | ||||||
/// ``` | ||||||
/// use biome_js_analyze::utils::restricted_glob::RestrictedGlob; | ||||||
/// | ||||||
/// let glob = "!*.js".parse::<RestrictedGlob>().unwrap(); | ||||||
/// let glob = "!*.js".parse::<biome_glob::Glob>().unwrap(); | ||||||
/// assert!(glob.is_negated()); | ||||||
/// | ||||||
/// let glob = "*.js".parse::<RestrictedGlob>().unwrap(); | ||||||
/// let glob = "*.js".parse::<biome_glob::Glob>().unwrap(); | ||||||
/// assert!(!glob.is_negated()); | ||||||
/// ``` | ||||||
pub fn is_negated(&self) -> bool { | ||||||
|
@@ -52,31 +107,31 @@ impl RestrictedGlob { | |||||
self.glob.is_match_candidate(&path.0) | ||||||
} | ||||||
} | ||||||
impl PartialEq for RestrictedGlob { | ||||||
impl PartialEq for Glob { | ||||||
fn eq(&self, other: &Self) -> bool { | ||||||
self.is_negated == other.is_negated && self.glob.glob() == other.glob.glob() | ||||||
} | ||||||
} | ||||||
impl Eq for RestrictedGlob {} | ||||||
impl std::hash::Hash for RestrictedGlob { | ||||||
impl Eq for Glob {} | ||||||
impl std::hash::Hash for Glob { | ||||||
fn hash<H: std::hash::Hasher>(&self, state: &mut H) { | ||||||
self.is_negated.hash(state); | ||||||
self.glob.glob().hash(state); | ||||||
} | ||||||
} | ||||||
impl std::fmt::Display for RestrictedGlob { | ||||||
impl std::fmt::Display for Glob { | ||||||
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { | ||||||
let repr = self.glob.glob(); | ||||||
let negation = if self.is_negated { "!" } else { "" }; | ||||||
write!(f, "{negation}{repr}") | ||||||
} | ||||||
} | ||||||
impl From<RestrictedGlob> for String { | ||||||
fn from(value: RestrictedGlob) -> Self { | ||||||
impl From<Glob> for String { | ||||||
fn from(value: Glob) -> Self { | ||||||
value.to_string() | ||||||
} | ||||||
} | ||||||
impl std::str::FromStr for RestrictedGlob { | ||||||
impl std::str::FromStr for Glob { | ||||||
type Err = RestrictedGlobError; | ||||||
fn from_str(value: &str) -> Result<Self, Self::Err> { | ||||||
let (is_negated, value) = if let Some(stripped) = value.strip_prefix('!') { | ||||||
|
@@ -91,7 +146,7 @@ impl std::str::FromStr for RestrictedGlob { | |||||
// Only `**` can match `/` | ||||||
glob_builder.literal_separator(true); | ||||||
match glob_builder.build() { | ||||||
Ok(glob) => Ok(RestrictedGlob { | ||||||
Ok(glob) => Ok(Glob { | ||||||
is_negated, | ||||||
glob: glob.compile_matcher(), | ||||||
}), | ||||||
|
@@ -101,14 +156,15 @@ impl std::str::FromStr for RestrictedGlob { | |||||
} | ||||||
} | ||||||
} | ||||||
impl TryFrom<String> for RestrictedGlob { | ||||||
impl TryFrom<String> for Glob { | ||||||
type Error = RestrictedGlobError; | ||||||
fn try_from(value: String) -> Result<Self, Self::Error> { | ||||||
value.parse() | ||||||
} | ||||||
} | ||||||
// We use a custom impl to precisely report the location of the error. | ||||||
impl biome_deserialize::Deserializable for RestrictedGlob { | ||||||
#[cfg(feature = "biome_deserialize")] | ||||||
impl biome_deserialize::Deserializable for Glob { | ||||||
fn deserialize( | ||||||
value: &impl biome_deserialize::DeserializableValue, | ||||||
name: &str, | ||||||
|
@@ -120,7 +176,10 @@ impl biome_deserialize::Deserializable for RestrictedGlob { | |||||
Err(error) => { | ||||||
let range = value.range(); | ||||||
let range = error.index().map_or(range, |index| { | ||||||
TextRange::at(range.start() + TextSize::from(1 + index), 1u32.into()) | ||||||
biome_text_size::TextRange::at( | ||||||
range.start() + biome_text_size::TextSize::from(1 + index), | ||||||
1u32.into(), | ||||||
) | ||||||
}); | ||||||
diagnostics.push( | ||||||
biome_deserialize::DeserializationDiagnostic::new(format_args!("{error}")) | ||||||
|
@@ -132,7 +191,7 @@ impl biome_deserialize::Deserializable for RestrictedGlob { | |||||
} | ||||||
} | ||||||
#[cfg(feature = "schemars")] | ||||||
impl schemars::JsonSchema for RestrictedGlob { | ||||||
impl schemars::JsonSchema for Glob { | ||||||
fn schema_name() -> String { | ||||||
"Regex".to_string() | ||||||
} | ||||||
|
@@ -156,7 +215,7 @@ impl<'a> CandidatePath<'a> { | |||||
} | ||||||
|
||||||
/// Tests whether the current path matches `glob`. | ||||||
pub fn matches(&self, glob: &RestrictedGlob) -> bool { | ||||||
pub fn matches(&self, glob: &Glob) -> bool { | ||||||
glob.is_match_candidate(self) | ||||||
} | ||||||
|
||||||
|
@@ -165,9 +224,9 @@ impl<'a> CandidatePath<'a> { | |||||
/// Let's take an example: | ||||||
/// | ||||||
/// ``` | ||||||
/// use biome_js_analyze::utils::restricted_glob::{CandidatePath, RestrictedGlob}; | ||||||
/// use biome_glob::{CandidatePath, Glob}; | ||||||
/// | ||||||
/// let globs: &[RestrictedGlob] = &[ | ||||||
/// let globs: &[Glob] = &[ | ||||||
/// "*".parse().unwrap(), | ||||||
/// "!a*".parse().unwrap(), | ||||||
/// "a".parse().unwrap(), | ||||||
|
@@ -189,7 +248,7 @@ impl<'a> CandidatePath<'a> { | |||||
/// | ||||||
pub fn matches_with_exceptions<'b, I>(&self, globs: I) -> bool | ||||||
where | ||||||
I: IntoIterator<Item = &'b RestrictedGlob>, | ||||||
I: IntoIterator<Item = &'b Glob>, | ||||||
I::IntoIter: DoubleEndedIterator, | ||||||
{ | ||||||
self.matches_with_exceptions_or(false, globs) | ||||||
|
@@ -203,9 +262,9 @@ impl<'a> CandidatePath<'a> { | |||||
/// | ||||||
/// | ||||||
/// ``` | ||||||
/// use biome_js_analyze::utils::restricted_glob::{CandidatePath, RestrictedGlob}; | ||||||
/// use biome_glob::{CandidatePath, Glob}; | ||||||
/// | ||||||
/// let globs: &[RestrictedGlob] = &[ | ||||||
/// let globs: &[Glob] = &[ | ||||||
/// "a/path".parse().unwrap(), | ||||||
/// "!b".parse().unwrap(), | ||||||
/// ]; | ||||||
|
@@ -222,7 +281,7 @@ impl<'a> CandidatePath<'a> { | |||||
/// ``` | ||||||
pub fn matches_directory_with_exceptions<'b, I>(&self, globs: I) -> bool | ||||||
where | ||||||
I: IntoIterator<Item = &'b RestrictedGlob>, | ||||||
I: IntoIterator<Item = &'b Glob>, | ||||||
I::IntoIter: DoubleEndedIterator, | ||||||
{ | ||||||
self.matches_with_exceptions_or(true, globs) | ||||||
|
@@ -232,7 +291,7 @@ impl<'a> CandidatePath<'a> { | |||||
/// Returns `default` if no glob matches. | ||||||
fn matches_with_exceptions_or<'b, I>(&self, default: bool, globs: I) -> bool | ||||||
where | ||||||
I: IntoIterator<Item = &'b RestrictedGlob>, | ||||||
I: IntoIterator<Item = &'b Glob>, | ||||||
I::IntoIter: DoubleEndedIterator, | ||||||
{ | ||||||
// Iterate in reverse order to avoid unnecessary glob matching. | ||||||
|
@@ -379,45 +438,33 @@ mod tests { | |||||
|
||||||
#[test] | ||||||
fn test_restricted_regex() { | ||||||
assert!(!"*.js" | ||||||
.parse::<RestrictedGlob>() | ||||||
.unwrap() | ||||||
.is_match("file/path.js")); | ||||||
assert!(!"*.js".parse::<Glob>().unwrap().is_match("file/path.js")); | ||||||
|
||||||
assert!("**/*.js" | ||||||
.parse::<RestrictedGlob>() | ||||||
.unwrap() | ||||||
.is_match("file/path.js")); | ||||||
assert!("**/*.js".parse::<Glob>().unwrap().is_match("file/path.js")); | ||||||
} | ||||||
|
||||||
#[test] | ||||||
fn test_match_with_exceptions() { | ||||||
let a = CandidatePath::new(&"a"); | ||||||
|
||||||
assert!(a.matches_with_exceptions(&[ | ||||||
RestrictedGlob::from_str("*").unwrap(), | ||||||
RestrictedGlob::from_str("!b").unwrap(), | ||||||
Glob::from_str("*").unwrap(), | ||||||
Glob::from_str("!b").unwrap(), | ||||||
])); | ||||||
assert!(!a.matches_with_exceptions(&[ | ||||||
RestrictedGlob::from_str("*").unwrap(), | ||||||
RestrictedGlob::from_str("!a*").unwrap(), | ||||||
Glob::from_str("*").unwrap(), | ||||||
Glob::from_str("!a*").unwrap(), | ||||||
])); | ||||||
assert!(a.matches_with_exceptions(&[ | ||||||
RestrictedGlob::from_str("*").unwrap(), | ||||||
RestrictedGlob::from_str("!a*").unwrap(), | ||||||
RestrictedGlob::from_str("a").unwrap(), | ||||||
Glob::from_str("*").unwrap(), | ||||||
Glob::from_str("!a*").unwrap(), | ||||||
Glob::from_str("a").unwrap(), | ||||||
])); | ||||||
} | ||||||
|
||||||
#[test] | ||||||
fn test_to_string() { | ||||||
assert_eq!( | ||||||
RestrictedGlob::from_str("**/*.js").unwrap().to_string(), | ||||||
"**/*.js" | ||||||
); | ||||||
assert_eq!( | ||||||
RestrictedGlob::from_str("!**/*.js").unwrap().to_string(), | ||||||
"!**/*.js" | ||||||
); | ||||||
assert_eq!(Glob::from_str("**/*.js").unwrap().to_string(), "**/*.js"); | ||||||
assert_eq!(Glob::from_str("!**/*.js").unwrap().to_string(), "!**/*.js"); | ||||||
} | ||||||
} |
This example doesn't show the usage of `FromStr::from_str`, and we should add it. Also, I would remove the `is_match` call because the previous paragraph didn't mention the `is_match` function. If you intend to keep `is_match` in the example, you should explain it in the previous paragraph, and explain why `src/lib.rs` isn't a match and why `lib.rs` is a match.
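To make the `lib.rs` versus `src/lib.rs` point concrete, here is a minimal sketch of the two rules from the supported syntax: `*` stays inside one path segment, and `**` stands for zero or more whole segments. It uses only the standard library and is not how the crate matches paths (the crate delegates to `globset`). The names `segment_matches`, `path_matches`, and `glob_matches` are hypothetical, and escapes and negation are not handled.

```rust
/// Matches one path segment against one pattern segment, where `*`
/// stands for zero or more characters. Escapes are not handled.
fn segment_matches(pattern: &[u8], text: &[u8]) -> bool {
    match pattern.split_first() {
        None => text.is_empty(),
        // `*` may consume any number of bytes of this segment, including none.
        Some((first, rest)) if *first == b'*' => {
            (0..=text.len()).any(|i| segment_matches(rest, &text[i..]))
        }
        // Any other byte must match literally.
        Some((first, rest)) => match text.split_first() {
            Some((t, text_rest)) => first == t && segment_matches(rest, text_rest),
            None => false,
        },
    }
}

/// Matches a list of path segments against a list of pattern segments,
/// where a whole segment `**` stands for zero or more path segments.
fn path_matches(pattern: &[&str], path: &[&str]) -> bool {
    match pattern.split_first() {
        None => path.is_empty(),
        Some((first, rest)) if *first == "**" => {
            (0..=path.len()).any(|i| path_matches(rest, &path[i..]))
        }
        Some((first, rest)) => match path.split_first() {
            Some((segment, path_rest)) => {
                segment_matches(first.as_bytes(), segment.as_bytes())
                    && path_matches(rest, path_rest)
            }
            None => false,
        },
    }
}

/// Splits both inputs on `/` and matches them segment by segment.
fn glob_matches(pattern: &str, path: &str) -> bool {
    let pattern: Vec<&str> = pattern.split('/').collect();
    let path: Vec<&str> = path.split('/').collect();
    path_matches(&pattern, &path)
}
```

In this sketch, `*.rs` has a single segment, so it can only match a path with a single segment: `lib.rs` matches and `src/lib.rs` does not. `**/*.rs` matches both, because `**` can stand for zero segments or for `src`.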