-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
draft of a possible solution to the #14 issue and changes in public interface #24
Conversation
Signed-off-by: Maxim Zhiburt <[email protected]>
@zhiburt I'm currently swamped with work from my workplace and can only look at this at the weekend. Very happy that you took my PR and added your adaptions yourselves, looking forward to review this! |
Perhaps, as a draft, it's not as clean as It should be. The idea is basically that /// read until it finds a match by needle
/// returns the needle's interest
pub fn read_until<N: Needle + ?Sized>(&mut self, needle: &N) -> Result<N::Interest>;
/// `Match` is a structure which contains a start and end of
/// a distance which `read_util` should delete from buffer.
pub struct Match<T> {
begin: usize,
end: usize,
pub interest: T,
}
/// `Needle` trait has only one method `find` which return `Some`
/// if something was found `None` otherwise.
pub trait Needle {
type Interest;
fn find(&self, buffer: &str, eof: bool) -> Option<Match<Self::Interest>>;
} By the way the It allows us to have different return values for each Needle. impl Needle for String {
/// return a content before the match
type Interest = String;
...
}
impl Needle for NBytes {
/// first N bytes
type Interest = String;
...
}
impl Needle for Regex {
/// content before the match and the match itself
type Interest = (String, String);
...
}
impl Needle for [ReadUntil] {
/// The copy of the buffer and an index of
/// a first needle by which we get successful match
type Interest = (String, usize);
...
} |
src/reader.rs
Outdated
@@ -19,12 +19,42 @@ enum PipedChar { | |||
EOF, | |||
} | |||
|
|||
#[derive(Clone)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess, it's necessary for vec!
macros
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about vec!
probably I got mistaken but there's a real cause here. Currently string_needle
method consumes/owns a ReadUntil
instance. I hope the convert methods could be changed to have a &self
parameter rather than self
in which case the Clone
could be deleted.
pub enum ReadUntil { | ||
String(String), | ||
Regex(Regex), | ||
EOF, | ||
NBytes(usize), | ||
Any(Vec<ReadUntil>), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice that you got rid of this, I was always under the impression that the Any doesn't belong into this enum because it is "one level above"
src/reader.rs
Outdated
} | ||
} | ||
|
||
impl Needle for String { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why does it need a impl both for str and for String?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question, actually there's no reason in String
implementation, as I remember I've experienced some string cast issue and just added it to not have the cast. (don't remember what it was)
And actually I was wondering at the time, might it's better not to implement the interface for existed types and create a wrapper structure e.g as I did with NBytes(usize)
even though it could be implemented just to usize
. I am not sure which way is better for _string
.
impl Needle for Regex { | ||
type Interest = (String, String); | ||
|
||
fn find(&self, buffer: &str, eof: bool) -> Option<Match<Self::Interest>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is there a Needle::find
function for regex here and also a function find
which also handles regex?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you mean this one It just hasn't been deleted, but it can be :). I didn't clean the code as much as you see
https://github.com/philippkeller/rexpect/blob/0212cd6ff069e5633ea8e6d4dad8212f88ef6464/src/reader.rs#L86
but if you mean Regex::find(&self, buffer)
, here we just call original find method regex::Regex::find
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes I meant the pub fn find
function. Nice that it can be deleted. All clear then!
@zhiburt I like the approach very much, your proposal regarding the API changes makes sense (e.g. matching a string only needs to return match before string) and then allows to return the index for Any. See my comments to your code above. I also had a look at the API for pexpect and they only return the index of the match (which makes only sense if you provide a list of needles) or -1 if it didn't match. If you provide a regex then the matching string is stored in an attribute Some questions:
IMO this should first merged into a branch in order to more easily play around and be able add/change documentation and then I could release this together with the api changes for windows #11 in version 0.5 |
I am not versed in git, but I bet have a different branch first not a bad idea |
I created a new branch |
Good evening @philippkeller, I've investigated the former mentioned issue with the buffer and not very expandable use reader::{ABInterest, NBytes, Regex, Until};
// currently it equals [ReadUntil::Nbytes(30), ReadUntil::String("Hi"), ReadUntil::Regex(Regex::("").unwrap())]
let until = Until(
NBytes(30),
Until("Hi".to_string(), Regex::new("").unwrap())
);
// we could wrap it by a function `or `
// NBytes(30).or("Hi".to_string())
// .or(Regex::new("").unwrap())
match p.exp(&until) {
Ok(ABInterest::A(s)) => {}
Ok(ABInterest::B(ABInterest::A(s))) => {}
Ok(ABInterest::B(ABInterest::B((s, regex)))) => {}
Err(e) => assert!(false, format!("got error: {}", e)),
} So user is building the Why it's much more expandable? I don't think that user defined Not sure if the names of type is expansionary enough :) re implementation of
|
Aha, so a user could implement I was first fearing that this is bringing more complexity/abstractions into the code base. |
The
I couldn't disagree, but from my perspective return a original The proposal is only linked to old As I've mentioned it returns index + copy of the buffer. (I decided to return copy since I was not sure that buffer will be unchanged until the next call of a indexed object). let until = vec![ReadUntil::NBytes(30), ReadUntil::String("Hi".to_string()), Regex::new("").unwrap()];
if let Ok(buffer, index)= = p.exp(until.as_slice()) {
let res = &until[index]
.string_needle()
.unwrap()
.find(&buffer, true)
.unwrap();
} But there are a bit more here. As you can see I cast the indexed I propose to change the construction from That literally a play around of rust type system to get able to remove The code with the same logic as previous one. // Until and ABInterest is renamed to UntilOR and ORInterest correspondingly (I guess if it is planer)
use reader::{ORInterest, NBytes, Regex, UntilOR};
let until = UntilOR(
NBytes(30),
UntilOR("Hi".to_string(), Regex::new("").unwrap())
);
match p.exp(&until) {
Ok(ORInterest::Left(s)) => {}
Ok(ORInterest::Right(ORInterest::Left(s))) => {}
Ok(ORInterest::Right(ORInterest::Right((s, regex)))) => {}
Err(e) => assert!(false, format!("got error: {}", e)),
} If the order of parameters in UntilOR will be changed we will get an compiler error. And there's none of copying. And there's no second But yep I am certainly not confident to say that it is a good way to follow as might it is slightly too complicated interface. Just wanted to have some comments according to it :) |
Frankly I'm a bit lost, maybe because I'm not into rust that much as you are :-) What I'm not sure of is the changes it means to the api. So easiest for me would be if you could solve the conflicts and then I could merge it into the branch to play around, would that be possible? |
Signed-off-by: Maxim Zhiburt <[email protected]>
Signed-off-by: Maxim Zhiburt <[email protected]>
@philippkeller I hope I resolved the conflicts. The idea came to my mind today acording to my former proposal. We might are able to create a macros to hide the complexity of underling types in order to clean API. I've created a scratch of it but I am truly not versed in macros so I succeded only in creating a macros with constant number of parameters. But I guess the expandable macro can be implemented as well. Example: the same match find {
OrInterest::Lhs(OrInterest::Lhs(first)) => assert!(false),
OrInterest::Lhs(OrInterest::Rhs(second)) => assert_eq!("Hello ".to_string(), second),
OrInterest::Rhs(third)) => assert!(false),
} until!(find, {
first => { assert!(false) },
second => { assert_eq!("Hello ".to_string(), second) },
third => { assert!(false) },
}); |
@zhiburt so I merged your PR into the new branch and started to play around. Early feedback: I find the exp_any interface too complicated right now. I was hoping that the Sorry for no more thorough feedback, I wanted to give an early feedback as soon as I had a time window :-) hope you don't take it personal Additionally: did you see the windows branch? Maybe we can have some "general chat" on where this repo should be going, the way we work together, etc. Not sure what would be the best way to chat about this, maybe Telegram or Slack would serve this? But I'm not using in any code related communities so not sure what is the best option. |
As I said I totally agree with a such position :).
I too. As I currently understand that it can be hidden only by macros.
There's an issue with rust type system I guess here, so it is kind of problematic bu I'll check it. |
chore: Ensure pre-commit gets non-system Python
Hi @philippkeller, In the #22 I put an example in what way may be changed the interface of
ReadUntil
.And today I went through the use cases of
exp
function and noticed that in all cases there's no need for all the datafind
function gives so I went slightly further and check what can be done.The main idea behind this PR is to provide a cleaner calls of
read_until
function.The example when we match string and return the buffer before and the string itself but we know the string already since we did a match by that so the second part is always needless.
A bunch of examples
The resolution of #14 could looks like
we have an index of the ReadUntil element by which we get successful match, and
the copy of the buffer on which it be called so if the user want he can repeat the match on the buffer.
This is a draft and it's affected by a linter which I am sorry about.
All tests are passed.