feat(biome_glob): add dedicated crate for globs #4609

Open
wants to merge 3 commits into
base: main
12 changes: 12 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
@@ -115,6 +115,7 @@ biome_diagnostics_categories = { version = "0.5.7", path = "./crates/biome_diagn
biome_diagnostics_macros = { version = "0.5.7", path = "./crates/biome_diagnostics_macros" }
biome_formatter = { version = "0.5.7", path = "./crates/biome_formatter" }
biome_fs = { version = "0.5.7", path = "./crates/biome_fs" }
biome_glob = { version = "0.1.0", path = "./crates/biome_glob" }
biome_graphql_analyze = { version = "0.0.1", path = "./crates/biome_graphql_analyze" }
biome_graphql_factory = { version = "0.1.0", path = "./crates/biome_graphql_factory" }
biome_graphql_formatter = { version = "0.1.0", path = "./crates/biome_graphql_formatter" }
27 changes: 27 additions & 0 deletions crates/biome_glob/Cargo.toml
@@ -0,0 +1,27 @@

[package]
authors.workspace = true
categories.workspace = true
description = "<DESCRIPTION>"
edition.workspace = true
homepage.workspace = true
keywords.workspace = true
license.workspace = true
name = "biome_glob"
repository.workspace = true
version = "0.1.0"

[lints]
workspace = true

[dependencies]
biome_deserialize = { workspace = true, optional = true }
biome_text_size = { workspace = true, optional = true }
globset = { workspace = true }
schemars = { workspace = true, optional = true }
serde = { workspace = true, optional = true }

[features]
biome_deserialize = ["dep:biome_deserialize", "dep:biome_text_size"]
schemars = ["dep:schemars"]
serde = ["dep:serde"]
Original file line number Diff line number Diff line change
@@ -1,31 +1,86 @@
use biome_rowan::{TextRange, TextSize};
//! `biome_glob` provides globbing functionality. When listing the globs to match, it is also possible to provide globs that function as "exceptions" by prefixing them with `!`.
//!
//! ## Matching a path against a glob
//!
//! You can create a glob from a string using [core::str::FromStr::from_str] or the corresponding method `parse`.
//! A glob can match against anything that can be turned into a [std::path::Path].
//! This is, for example, the case of strings.
//!
//! ```
//! use biome_glob::Glob;
//!
//! let glob = "*.rs".parse::<Glob>().expect("correct glob");
//! assert!(glob.is_match("lib.rs"));
//! assert!(!glob.is_match("src/lib.rs"));
//! ```
Comment on lines +12 to +14
Member

This example doesn't show the usage of FromStr::from_str and we should add it.

Also, I would remove `is_match`, because the previous paragraph didn't mention the `is_match` function. If you intend to keep `is_match` in the example, you should explain it in the previous paragraph, and explain why `src/lib.rs` isn't a match and why `lib.rs` is.
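
One way to explain the two assertions is that `*` never crosses a `/`, while `**` stands for zero or more whole path segments. The following is a minimal std-only sketch of that rule, not the crate's implementation (the diff shows the real matcher delegating to `globset` with `literal_separator(true)`). The names `segment_matches`, `path_matches`, and `glob_matches` are hypothetical, and escapes and negation are deliberately left out.

```rust
/// Matches a single path segment against a single pattern segment, where
/// `*` stands for zero or more bytes. Illustrative model only: it backtracks
/// naively and does not handle `\` escapes.
fn segment_matches(pattern: &[u8], text: &[u8]) -> bool {
    match pattern.split_first() {
        // An exhausted pattern matches only an exhausted segment.
        None => text.is_empty(),
        // `*` may consume any prefix of the remaining segment, including none.
        Some((c, rest)) if *c == b'*' => {
            (0..=text.len()).any(|i| segment_matches(rest, &text[i..]))
        }
        // Any other byte must match literally.
        Some((c, rest)) => match text.split_first() {
            Some((t, tail)) => c == t && segment_matches(rest, tail),
            None => false,
        },
    }
}

/// Matches a list of path segments against a list of pattern segments,
/// where a `**` pattern segment stands for zero or more path segments.
fn path_matches(pattern: &[&str], path: &[&str]) -> bool {
    match pattern.split_first() {
        None => path.is_empty(),
        Some((seg, rest)) if *seg == "**" => {
            (0..=path.len()).any(|i| path_matches(rest, &path[i..]))
        }
        Some((seg, rest)) => match path.split_first() {
            Some((head, tail)) => {
                segment_matches(seg.as_bytes(), head.as_bytes()) && path_matches(rest, tail)
            }
            None => false,
        },
    }
}

/// Splits both inputs on `/` so that only `**` can span a separator.
fn glob_matches(pattern: &str, path: &str) -> bool {
    let pattern: Vec<&str> = pattern.split('/').collect();
    let path: Vec<&str> = path.split('/').collect();
    path_matches(&pattern, &path)
}
```

Under this model `glob_matches("*.rs", "lib.rs")` holds, `glob_matches("*.rs", "src/lib.rs")` does not because the pattern has one segment and the path has two, and `glob_matches("**/*.rs", "src/lib.rs")` holds again.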

//!
//! ## Matching against multiple globs
//!
//! When a path is expected to be matched against several globs,
//! you should compile the path into a [CandidatePath].
//! [CandidatePath] may speed up matching against several globs.
//!
//! ```
Member

Let's add a paragraph that explains the example.

//! use biome_glob::{CandidatePath, Glob};
//!
//! let globs: &[Glob] = &[
//! "**/*.rs".parse().expect("correct glob"),
//! "**/*.txt".parse().expect("correct glob"),
//! ];
//!
//! let path = CandidatePath::new(&"a/path/to/file.txt");
//!
//! assert!(globs.iter().any(|glob| path.matches(glob)));
//! ```
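
A possible explanation for this example: the path is prepared once and the prepared value is then reused for every glob in the list. The sketch below models that idea with the standard library only. `Candidate` and `has_extension` are hypothetical names, and `has_extension` only covers patterns shaped like `**/*.<ext>`; the real `CandidatePath` appears to wrap a `globset` candidate, since the diff calls `is_match_candidate(&path.0)`.

```rust
/// Hypothetical prepared path: the final segment is located once, up front,
/// so that each later check does not have to rescan the whole path.
struct Candidate<'a> {
    file_name: &'a str,
}

impl<'a> Candidate<'a> {
    fn new(path: &'a str) -> Self {
        // `rsplit` always yields at least one item, even for an empty path.
        let file_name = path.rsplit('/').next().unwrap_or(path);
        Candidate { file_name }
    }

    /// Models a pattern of the form `**/*.<ext>`: the final segment must end
    /// with a dot followed by `ext`.
    fn has_extension(&self, ext: &str) -> bool {
        self.file_name
            .strip_suffix(ext)
            .is_some_and(|stem| stem.ends_with('.'))
    }
}
```

With `let candidate = Candidate::new("a/path/to/file.txt")`, the expression `["rs", "txt"].iter().any(|ext| candidate.has_extension(ext))` is true because the second entry matches, which mirrors the `any` call in the example above.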
//!
//! ## Matching against multiple globs and exceptions
//!
//! biome_glob supports negated globs, which are particularly useful for encoding exceptions.
//! In the following example, we accept all files in the `src` directory, except the ones ending with the `txt` extension.
Member

Suggested change
//! biome_glob supports negated globs, which are particularly useful for encoding exceptions.
//! biome_glob supports exceptions - or negated globs -, which are particularly useful for encoding exceptions.

What does "encoding exceptions" actually mean? We could try to use simpler language.

I would add a small link/reference to git negated globs. What do you think?

//!
//! ```
//! use biome_glob::{CandidatePath, Glob};
//!
//! let globs: &[Glob] = &[
//! "**/*.rs".parse().expect("correct glob"),
//! "!**/*.txt".parse().expect("correct glob"),
//! ];
//!
//! let path = CandidatePath::new(&"a/path/to/file.txt");
//!
//! assert!(!path.matches_with_exceptions(globs));
//! ```
//!
//! ## Supported syntax
//!
//! A Biome glob pattern supports the following syntaxes:
//!
//! - star `*` that matches zero or more characters inside a path segment
//! - globstar `**` that matches zero or more path segments
//! - Use `\*` to escape `*`
//! - `?`, `[`, `]`, `{`, and `}` must be escaped using `\`.
//! These characters are reserved for future use.
//! - Use `!` as first character to negate the glob
//!
Comment on lines +58 to +63
Member

I think we should provide a minimal example for each bullet that we explain
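
As a starting point for per-bullet examples, here is a small std-only sketch of the syntax rules in the list: a leading `!` marks negation, `\` escapes the next character, and an unescaped `?`, `[`, `]`, `{`, or `}` is rejected. `check_pattern` and `InvalidAt` are hypothetical names, not the crate's API. Treating a trailing `\` as an error and reporting the offset within the full pattern are assumptions made for this sketch.

```rust
/// Byte offset of the offending character (hypothetical error type).
#[derive(Debug, PartialEq)]
struct InvalidAt(usize);

/// Splits off a leading `!` and checks that reserved characters are escaped.
/// Returns the negation flag and the pattern body on success.
fn check_pattern(pattern: &str) -> Result<(bool, &str), InvalidAt> {
    let (negated, body) = match pattern.strip_prefix('!') {
        Some(rest) => (true, rest),
        None => (false, pattern),
    };
    // Offsets are reported relative to the full pattern, `!` included.
    let offset = pattern.len() - body.len();
    let mut chars = body.char_indices();
    while let Some((i, c)) = chars.next() {
        match c {
            '\\' => {
                // The next character is escaped; a trailing backslash has
                // nothing to escape (assumed to be an error here).
                if chars.next().is_none() {
                    return Err(InvalidAt(offset + i));
                }
            }
            // Reserved for future use, so they must be escaped.
            '?' | '[' | ']' | '{' | '}' => return Err(InvalidAt(offset + i)),
            _ => {}
        }
    }
    Ok((negated, body))
}
```

For instance, `check_pattern("!**/*.txt")` yields `Ok((true, "**/*.txt"))`, `check_pattern(r"a\?b")` is accepted, and `check_pattern("a?b")` yields `Err(InvalidAt(1))`.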

//! A path segment is delimited by path separator `/` or the start/end of the path.
//!

Member

I believe we should have a brief explanation of the features provided by this crate.

/// A restricted glob pattern only supports the following syntaxes:
///
/// - star `*` that matches zero or more characters inside a path segment
/// - globstar `**` that matches zero or more path segments
/// - Use `\*` to escape `*`
/// - `?`, `[`, `]`, `{`, and `}` must be escaped using `\`.
/// These characters are reserved for future use.
/// - Use `!` as first character to negate the glob
///
/// A path segment is delimited by path separator `/` or the start/end of the path.
#[derive(Clone, Debug, serde::Deserialize, serde::Serialize)]
#[serde(try_from = "String", into = "String")]
pub struct RestrictedGlob {
/// A Biome glob pattern.
#[derive(Clone, Debug)]
#[cfg_attr(feature = "serde", derive(serde::Deserialize, serde::Serialize))]
#[cfg_attr(feature = "serde", serde(try_from = "String", into = "String"))]
pub struct Glob {
is_negated: bool,
glob: globset::GlobMatcher,
}
impl RestrictedGlob {
impl Glob {
/// Returns `true` if this glob is negated.
///
/// ```
/// use biome_js_analyze::utils::restricted_glob::RestrictedGlob;
///
/// let glob = "!*.js".parse::<RestrictedGlob>().unwrap();
/// let glob = "!*.js".parse::<biome_glob::Glob>().unwrap();
/// assert!(glob.is_negated());
///
/// let glob = "*.js".parse::<RestrictedGlob>().unwrap();
/// let glob = "*.js".parse::<biome_glob::Glob>().unwrap();
/// assert!(!glob.is_negated());
/// ```
pub fn is_negated(&self) -> bool {
@@ -52,31 +107,31 @@ impl RestrictedGlob {
self.glob.is_match_candidate(&path.0)
}
}
impl PartialEq for RestrictedGlob {
impl PartialEq for Glob {
fn eq(&self, other: &Self) -> bool {
self.is_negated == other.is_negated && self.glob.glob() == other.glob.glob()
}
}
impl Eq for RestrictedGlob {}
impl std::hash::Hash for RestrictedGlob {
impl Eq for Glob {}
impl std::hash::Hash for Glob {
fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
self.is_negated.hash(state);
self.glob.glob().hash(state);
}
}
impl std::fmt::Display for RestrictedGlob {
impl std::fmt::Display for Glob {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
let repr = self.glob.glob();
let negation = if self.is_negated { "!" } else { "" };
write!(f, "{negation}{repr}")
}
}
impl From<RestrictedGlob> for String {
fn from(value: RestrictedGlob) -> Self {
impl From<Glob> for String {
fn from(value: Glob) -> Self {
value.to_string()
}
}
impl std::str::FromStr for RestrictedGlob {
impl std::str::FromStr for Glob {
type Err = RestrictedGlobError;
fn from_str(value: &str) -> Result<Self, Self::Err> {
let (is_negated, value) = if let Some(stripped) = value.strip_prefix('!') {
@@ -91,7 +146,7 @@ impl std::str::FromStr for RestrictedGlob {
// Only `**` can match `/`
glob_builder.literal_separator(true);
match glob_builder.build() {
Ok(glob) => Ok(RestrictedGlob {
Ok(glob) => Ok(Glob {
is_negated,
glob: glob.compile_matcher(),
}),
@@ -101,14 +156,15 @@ impl std::str::FromStr for RestrictedGlob {
}
}
}
impl TryFrom<String> for RestrictedGlob {
impl TryFrom<String> for Glob {
type Error = RestrictedGlobError;
fn try_from(value: String) -> Result<Self, Self::Error> {
value.parse()
}
}
// We use a custom impl to precisely report the location of the error.
impl biome_deserialize::Deserializable for RestrictedGlob {
#[cfg(feature = "biome_deserialize")]
impl biome_deserialize::Deserializable for Glob {
fn deserialize(
value: &impl biome_deserialize::DeserializableValue,
name: &str,
@@ -120,7 +176,10 @@ impl biome_deserialize::Deserializable for RestrictedGlob {
Err(error) => {
let range = value.range();
let range = error.index().map_or(range, |index| {
TextRange::at(range.start() + TextSize::from(1 + index), 1u32.into())
biome_text_size::TextRange::at(
range.start() + biome_text_size::TextSize::from(1 + index),
1u32.into(),
)
});
diagnostics.push(
biome_deserialize::DeserializationDiagnostic::new(format_args!("{error}"))
@@ -132,7 +191,7 @@ impl biome_deserialize::Deserializable for RestrictedGlob {
}
}
#[cfg(feature = "schemars")]
impl schemars::JsonSchema for RestrictedGlob {
impl schemars::JsonSchema for Glob {
fn schema_name() -> String {
"Regex".to_string()
}
@@ -156,7 +215,7 @@ impl<'a> CandidatePath<'a> {
}

/// Tests whether the current path matches `glob`.
pub fn matches(&self, glob: &RestrictedGlob) -> bool {
pub fn matches(&self, glob: &Glob) -> bool {
glob.is_match_candidate(self)
}

@@ -165,9 +224,9 @@ impl<'a> CandidatePath<'a> {
/// Let's take an example:
///
/// ```
/// use biome_js_analyze::utils::restricted_glob::{CandidatePath, RestrictedGlob};
/// use biome_glob::{CandidatePath, Glob};
///
/// let globs: &[RestrictedGlob] = &[
/// let globs: &[Glob] = &[
/// "*".parse().unwrap(),
/// "!a*".parse().unwrap(),
/// "a".parse().unwrap(),
@@ -189,7 +248,7 @@ impl<'a> CandidatePath<'a> {
///
pub fn matches_with_exceptions<'b, I>(&self, globs: I) -> bool
where
I: IntoIterator<Item = &'b RestrictedGlob>,
I: IntoIterator<Item = &'b Glob>,
I::IntoIter: DoubleEndedIterator,
{
self.matches_with_exceptions_or(false, globs)
@@ -203,9 +262,9 @@ impl<'a> CandidatePath<'a> {
///
///
/// ```
/// use biome_js_analyze::utils::restricted_glob::{CandidatePath, RestrictedGlob};
/// use biome_glob::{CandidatePath, Glob};
///
/// let globs: &[RestrictedGlob] = &[
/// let globs: &[Glob] = &[
/// "a/path".parse().unwrap(),
/// "!b".parse().unwrap(),
/// ];
@@ -222,7 +281,7 @@ impl<'a> CandidatePath<'a> {
/// ```
pub fn matches_directory_with_exceptions<'b, I>(&self, globs: I) -> bool
where
I: IntoIterator<Item = &'b RestrictedGlob>,
I: IntoIterator<Item = &'b Glob>,
I::IntoIter: DoubleEndedIterator,
{
self.matches_with_exceptions_or(true, globs)
@@ -232,7 +291,7 @@ impl<'a> CandidatePath<'a> {
/// Returns `default` if no glob matches.
fn matches_with_exceptions_or<'b, I>(&self, default: bool, globs: I) -> bool
where
I: IntoIterator<Item = &'b RestrictedGlob>,
I: IntoIterator<Item = &'b Glob>,
I::IntoIter: DoubleEndedIterator,
{
// Iterate in reverse order to avoid unnecessary glob matching.
@@ -379,45 +438,33 @@ mod tests {

#[test]
fn test_restricted_regex() {
assert!(!"*.js"
.parse::<RestrictedGlob>()
.unwrap()
.is_match("file/path.js"));
assert!(!"*.js".parse::<Glob>().unwrap().is_match("file/path.js"));

assert!("**/*.js"
.parse::<RestrictedGlob>()
.unwrap()
.is_match("file/path.js"));
assert!("**/*.js".parse::<Glob>().unwrap().is_match("file/path.js"));
}

#[test]
fn test_match_with_exceptions() {
let a = CandidatePath::new(&"a");

assert!(a.matches_with_exceptions(&[
RestrictedGlob::from_str("*").unwrap(),
RestrictedGlob::from_str("!b").unwrap(),
Glob::from_str("*").unwrap(),
Glob::from_str("!b").unwrap(),
]));
assert!(!a.matches_with_exceptions(&[
RestrictedGlob::from_str("*").unwrap(),
RestrictedGlob::from_str("!a*").unwrap(),
Glob::from_str("*").unwrap(),
Glob::from_str("!a*").unwrap(),
]));
assert!(a.matches_with_exceptions(&[
RestrictedGlob::from_str("*").unwrap(),
RestrictedGlob::from_str("!a*").unwrap(),
RestrictedGlob::from_str("a").unwrap(),
Glob::from_str("*").unwrap(),
Glob::from_str("!a*").unwrap(),
Glob::from_str("a").unwrap(),
]));
}

#[test]
fn test_to_string() {
assert_eq!(
RestrictedGlob::from_str("**/*.js").unwrap().to_string(),
"**/*.js"
);
assert_eq!(
RestrictedGlob::from_str("!**/*.js").unwrap().to_string(),
"!**/*.js"
);
assert_eq!(Glob::from_str("**/*.js").unwrap().to_string(), "**/*.js");
assert_eq!(Glob::from_str("!**/*.js").unwrap().to_string(), "!**/*.js");
}
}
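
The three assertions in `test_match_with_exceptions` are consistent with a last-match-wins rule: the list is walked from the end, the first glob that matches decides the outcome, and a fallback value applies when nothing matches. The sketch below models that rule with the standard library only. `Rule` and the small matcher functions are hypothetical stand-ins for parsed globs, and the body of `matches_with_exceptions_or` is inferred from the "iterate in reverse order" comment and the tests, since the diff elides it.

```rust
/// Hypothetical stand-in for a parsed glob: a negation flag plus a matcher.
struct Rule {
    negated: bool,
    matches: fn(&str) -> bool,
}

/// Walks the rules backwards; the last rule in the list that matches decides.
/// Returns `default` when no rule matches at all.
fn matches_with_exceptions_or(path: &str, default: bool, rules: &[Rule]) -> bool {
    for rule in rules.iter().rev() {
        if (rule.matches)(path) {
            // A negated rule excludes the path, a plain rule includes it.
            return !rule.negated;
        }
    }
    default
}

// Tiny matchers standing in for the globs used in the tests above.
fn match_all(_path: &str) -> bool { true }                        // models `*`
fn starts_with_a(path: &str) -> bool { path.starts_with('a') }    // models `a*`
fn is_a(path: &str) -> bool { path == "a" }                       // models `a`
fn is_b(path: &str) -> bool { path == "b" }                       // models `b`
```

With `default` set to `false` this mirrors `matches_with_exceptions`; with `true` it mirrors `matches_directory_with_exceptions`, where a directory that no glob mentions is still accepted.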
1 change: 1 addition & 0 deletions crates/biome_js_analyze/Cargo.toml
@@ -19,6 +19,7 @@ biome_control_flow = { workspace = true }
biome_deserialize = { workspace = true, features = ["smallvec"] }
biome_deserialize_macros = { workspace = true }
biome_diagnostics = { workspace = true }
biome_glob = { workspace = true, features = ["biome_deserialize", "schemars", "serde"] }
biome_js_factory = { workspace = true }
biome_js_semantic = { workspace = true }
biome_js_syntax = { workspace = true }
Original file line number Diff line number Diff line change
@@ -7,7 +7,7 @@ use biome_deserialize_macros::Deserializable;
use biome_js_syntax::JsModule;
use biome_rowan::BatchMutationExt;

use crate::{utils::restricted_glob::RestrictedGlob, JsRuleAction};
use crate::JsRuleAction;

pub mod legacy;
pub mod util;
@@ -94,7 +94,7 @@ pub struct Options {
#[serde(untagged)]
pub enum ImportGroup {
Predefined(PredefinedImportGroup),
Custom(RestrictedGlob),
Custom(biome_glob::Glob),
}
impl Deserializable for ImportGroup {
fn deserialize(