Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite FormatArg::Display #26

Merged
merged 2 commits into from
Aug 6, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions tr/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -18,3 +18,10 @@ default = ["gettext-rs"]
lazy_static = "1.2"
gettext-rs = { version = "0.7", optional = true, features = ["gettext-system"] }
gettext = { version = "0.4", optional = true }

[dev-dependencies]
criterion = "0.5"

[[bench]]
name = "my_bench"
harness = false
38 changes: 38 additions & 0 deletions tr/benches/my_bench.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
use criterion::{black_box, criterion_group, criterion_main, Criterion};
use tr::tr;

pub fn short_literal(c: &mut Criterion) {
c.bench_function("short_literal", |b| b.iter(|| {
tr!("Hello");
}));
}

pub fn long_literal(c: &mut Criterion) {
c.bench_function("long_literal", |b| b.iter(|| {
tr!("Hello, world! This is a longer sentence but without argument markers. That is all for now, thank you for reading.");
}));
}

pub fn short_argument(c: &mut Criterion) {
c.bench_function("short_argument", |b| b.iter(|| {
tr!("Hello {}!", black_box("world"));
}));
}

pub fn long_argument(c: &mut Criterion) {
c.bench_function("long_argument", |b| b.iter(|| {
tr!("Hello {} and {} and {} and {} and {} and {} and {} and finally {}!",
black_box("Mercury"),
black_box("Venus"),
black_box("Earth"),
black_box("Mars"),
black_box("Jupiter"),
black_box("Saturn"),
black_box("Uranus"),
black_box("Neptune"),
);
}));
}

criterion_group!(benches, short_literal, long_literal, short_argument, long_argument);
criterion_main!(benches);
141 changes: 67 additions & 74 deletions tr/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -81,80 +81,75 @@ pub mod runtime_format {
//!
//! TODO: better error reporting and support for more replacement option

/// The result of the runtime_format! macro.
/// This implements the Display
pub struct FormatArg<'a> {
#[doc(hidden)]
pub format_str: &'a str,
#[doc(hidden)]
pub args: &'a [(&'static str, &'a dyn (::std::fmt::Display))],
}

impl<'a> ::std::fmt::Display for FormatArg<'a> {
fn fmt(&self, f: &mut ::std::fmt::Formatter<'_>) -> ::std::fmt::Result {
let mut arg_idx = 0;
let mut pos = 0;
while let Some(mut p) = self.format_str[pos..].find(|x| x == '{' || x == '}') {
if self.format_str.len() - pos < p + 1 {
break;
}
p += pos;

// Skip escaped }
if self.format_str.get(p..=p) == Some("}") {
self.format_str[pos..=p].fmt(f)?;
if self.format_str.get(p + 1..=p + 1) == Some("}") {
pos = p + 2;
} else {
// FIXME! this is an error, it should be reported ('}' must be escaped)
pos = p + 1;
}
continue;
}
/// Converts the result of the runtime_format! macro into the final String
pub fn display_string(format_str: &str, args: &[(&str, &dyn ::std::fmt::Display)]) -> String {
use ::std::fmt::Write;
let fmt_len = format_str.len();
let mut res = String::with_capacity(2 * fmt_len);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about the capacity here. It is hard to know what's the require capacity since we don't know the size of the arguments. I would have let the default string growing algorithm do its job. But it's true that 2 times the format length is probably fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, there is a trade-off here. I would set at least fmt_len, because the final string will almost never be shorter than the format one. If anything {} could get an empty string, but that is just two bytes less.

The thing is that if you write tr!("a {} b {} c {} d"), you will do 7 write operations, and I guess at least 3 allocations, depending on the length of the arguments.

I've run my benchmarks many times and 2 * fmt_len seems to be the sweet spot.

And since then I've been reading the std library; it uses this handy but undocumented Arguments::estimated_capacity: it does some smart heuristics, and... hey! it doubles some length 🤓!

If you agree, I'd leave it as is.

let mut arg_idx = 0;
let mut pos = 0;
while let Some(mut p) = format_str[pos..].find(|x| x == '{' || x == '}') {
if fmt_len - pos < p + 1 {
break;
}
p += pos;

// Skip escaped {
if self.format_str.get(p + 1..=p + 1) == Some("{") {
self.format_str[pos..=p].fmt(f)?;
// Skip escaped }
if format_str.get(p..=p) == Some("}") {
res.push_str(&format_str[pos..=p]);
if format_str.get(p + 1..=p + 1) == Some("}") {
pos = p + 2;
continue;
}

// Find the argument
let end = if let Some(end) = self.format_str[p..].find('}') {
end + p
} else {
// FIXME! this is an error, it should be reported
self.format_str[pos..=p].fmt(f)?;
// FIXME! this is an error, it should be reported ('}' must be escaped)
pos = p + 1;
continue;
};
let argument = self.format_str[p + 1..end].trim();
let pa = if p == end - 1 {
arg_idx += 1;
arg_idx - 1
} else if let Ok(n) = argument.parse::<usize>() {
n
} else if let Some(p) = self.args.iter().position(|x| x.0 == argument) {
p
} else {
// FIXME! this is an error, it should be reported
self.format_str[pos..end].fmt(f)?;
pos = end;
continue;
};

// format the part before the '{'
self.format_str[pos..p].fmt(f)?;
if let Some(a) = self.args.get(pa) {
a.1.fmt(f)?;
} else {
// FIXME! this is an error, it should be reported
self.format_str[p..=end].fmt(f)?;
}
pos = end + 1;
continue;
}
self.format_str[pos..].fmt(f)

// Skip escaped {
if format_str.get(p + 1..=p + 1) == Some("{") {
res.push_str(&format_str[pos..=p]);
pos = p + 2;
continue;
}

// Find the argument
let end = if let Some(end) = format_str[p..].find('}') {
end + p
} else {
// FIXME! this is an error, it should be reported
res.push_str(&format_str[pos..=p]);
pos = p + 1;
continue;
};
let argument = format_str[p + 1..end].trim();
let pa = if p == end - 1 {
arg_idx += 1;
arg_idx - 1
} else if let Ok(n) = argument.parse::<usize>() {
n
} else if let Some(p) = args.iter().position(|x| x.0 == argument) {
p
} else {
// FIXME! this is an error, it should be reported
res.push_str(&format_str[pos..end]);
pos = end;
continue;
};

// format the part before the '{'
res.push_str(&format_str[pos..p]);
if let Some(a) = args.get(pa) {
write!(&mut res, "{}", a.1)
.expect("a Display implementation returned an error unexpectedly");
} else {
// FIXME! this is an error, it should be reported
res.push_str(&format_str[p..=end]);
}
pos = end + 1;
}
res.push_str(&format_str[pos..]);
res
}

#[doc(hidden)]
Expand All @@ -163,15 +158,13 @@ pub mod runtime_format {
macro_rules! runtime_format {
($fmt:expr) => {{
// TODO! check if 'fmt' does not have {}
format!("{}", $fmt)
String::from($fmt)
}};
($fmt:expr, $($tail:tt)* ) => {{
let format_str = $fmt;
format!("{}", $crate::runtime_format::FormatArg {
format_str: AsRef::as_ref(&format_str),
//args: &[ $( $crate::runtime_format!(@parse_arg $e) ),* ],
args: $crate::runtime_format!(@parse_args [] $($tail)*)
})
$crate::runtime_format::display_string(
AsRef::as_ref(&$fmt),
$crate::runtime_format!(@parse_args [] $($tail)*),
)
}};

(@parse_args [$($args:tt)*]) => { &[ $( $args ),* ] };
Expand Down
Loading