Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: a convention for error handling #25

Open
shailesh1729 opened this issue Aug 10, 2015 · 15 comments
Open

RFC: a convention for error handling #25

shailesh1729 opened this issue Aug 10, 2015 · 15 comments
Labels

Comments

@shailesh1729
Copy link
Contributor

This RFC proposes a convention for structuring methods in SciRust
which can cater to the conflicting needs of efficiency, easy of use
and effective error handling.

For the impatient:

// Efficient access without bound checks
unsafe fn get_unchecked(&self, r : usize, c : usize) -> T;
// Safe access with bound checks, raises error if invalid address
fn get_checked(&self, r : usize, c : usize) -> Result<T, Error>;
// User friendly version. Panics in case of error
fn get(&self, r : usize, c : usize) -> T;

// Efficient modification without bound checks
unsafe fn set_unchecked(&mut self, r : usize, c : usize, value : T);

// Safe modification with bound check
fn set(&mut self, r : usize, c : usize, value : T);

Detailed discussion

The audience of SciRust can be possibly divided into
two usage scenarios.

  • A script style usage, where the objective is to quickly
    do some numerical experiment, get the results and analyze them.
  • A library development usage, where more professional libraries
    would be built on top of fundamental building blocks provided
    by SciRust (these may be other modules shipped in SciRust itself).

While the first usage scenario is important for getting new users hooked
to the library, the second usage scenario is also important for justifying
why Rust should be used for scientific software development compared
to other scientific computing platforms.

In context of the two usage scenarios, the design of SciRust has three conflicting goals:

  • Ease of use
  • Efficiency
  • Well managed error handling

While ease of use is important for script style usage,
efficiency and well managed error handling are important
for serious software development on top of core components
provided by SciRust.

We will consider the example of a get(r,c) method
on a matrix object to discuss these conflicting goals.
Please note that get is just a representative method
for this discussion. The design ideas can be applied in
many different parts of SciRust once accepted.

If get is being called in a loop, usually the code
around it can ensure that the conditions for accessing
data within the boundary of the matrix are met correctly.
Thus, a bound checking within the implementation of get
is just an extra overhead.

While this design is good for writing efficient software,
it can lead to a number of memory related bugs and goes
against the fundamental philosophy of Rust (Safety first).
There are actually two different options for error handling:

  • Returning either Option<T> or Result<T, Error>.
  • Using the panic mechanism.

Option<T> or Result<T, Error> provides the users a
fine grained control over what to do when an error occurs.
This is certainly the Rusty way of doing things. At the
same time, both of these return types make the user code
more complicated. One has to add extra calls to .unwrap()
even if one is sure that the function is not going to fail.

Users of scientific computing tend to prefer an environment
where they can get more work done with less effort. This is
a reason of the success of specialized environments like
MATLAB. Open source environments like Python (NumPy, SciPy)
try to achieve something similar.

While SciRust doesn't intend to compete at the level of
simplicity provided by MATLAB/Python environments, it does
intend to take an extra effort wherever possible to address
the ease of use goal.
In this context, the return type of a getter should
be just the value type T. This can be achieved
safely by using a panic if the access boundary
conditions are not met.

The discussion above suggests up to 3 possible ways of
implementing methods like get.

  • An unchecked (and unsafe) version for high efficiency code
    where the calling code is responsible for ensuring that
    the necessary requirements for correct execution of the
    method are being met.
  • A safe version which returns either Option<T> or
    Result<T, Error> which can be used for professional
    software development where the calling code has full control
    over error handling.
  • Another safe version which panics in case of error but provides
    an API which is simpler to use for writing short scientific
    computing scripts.

Proposed convention

We propose that a method for which these variations
need to be supported, should follow the convention defined below:

  • A method_unchecked version should provide basic implementation
    of the method. This should assume that necessary conditions
    for successful execution of the methods are already being
    ensured by the calling code. The unchecked version of method
    MUST be marked unsafe. This ensures that the calling code
    knows that it is responsible for ensuring the right conditions
    for calling the unchecked method.
  • A method_checked version should be implemented on top of
    a method_unchecked method. The checked version should
    check for all the requirements for calling the method safely.
    The return type should be either Option<T> or
    Result<T, Error>. In case the required conditions for
    calling the method are not met, a None or Error
    should be returned. Once the required conditions are met,
    method_unchecked should be called to get the result
    which would be wrapped inside Option or Result.
  • A method version should be built on top of method_checked version.
    It should simply attempt to unwrap
    the value returned by method_checked and return as T.
    If method_checked returns an error or None, this version
    should panic.

First two versions are suitable for professional development
where most of the time we need a safe API while at some times
we need an unsafe API for efficient implementation.
The third version is suitable for script style usage scenario.

The convention has been illustrated in the three versions of
get at the beginning of this document.

API bloat

While this convention is expected to lead into an API bloat,
but if the convention is followed properly across the library,
then it should be easy to follow (both from the perspective
of users of the library and from the perspective of developers
of the library).

@Sean1708
Copy link
Contributor

Just food for thought, what if the safety was baked into the type? E.g. have a Matrix<T> type and an UncheckedMatrix<T> type. That way the API would be identical across the board, you would just choose which one you need, and since the data layout would be identical converting between the two shouldn't be too expensive.

@Sean1708
Copy link
Contributor

Thinking a bit further into it, you could still have the three methods by doing something along the lines of

impl Checked for Matrix {
    fn get(...) {
        // ...
    }
}

impl Unchecked for UncheckedMatrix {
    fn get(...) {
        // ...
    }
}

fn get<T: Checked>(m: T, c: usize, r: usize) -> ... {
    match m.get(c, r) {
        Ok(v) => v,
        Err(_) => panic!(),
    }
}

Obviously that's only a rough thought but I think it might make a much cleaner API.

@shailesh1729
Copy link
Contributor Author

Does this mean that a higher level function would either work with a safe matrix or an unsafe matrix only? I was thinking in the lines of linear algebra code where most of the time, one would use the safe matrix API (like elementary row/column operations), while occasionally getting dirty with the unsafe API for efficiency purposes.

@daniel-vainsencher
Copy link
Contributor

Hi, I also am not sure how I would use the matrix type based API. Seems
like the matrices would not be easy to replace, having different apis.
On Aug 11, 2015 7:02 AM, "Shailesh Kumar" [email protected] wrote:

Does this mean that a higher level function would either work with a safe
matrix or an unsafe matrix only? I was thinking in the lines of linear
algebra code where most of the time, one would use the safe matrix API
(like elementary row/column operations), while occasionally getting dirty
with the unsafe API for efficiency purposes.


Reply to this email directly or view it on GitHub
#25 (comment).

@shailesh1729
Copy link
Contributor Author

Rust slices have both get_unchecked() and get() methods. https://doc.rust-lang.org/std/primitive.slice.html#method.get_unchecked . The proposed design is not very different. But rather than having 2 versions of get, here 3 versions are being suggested. The panic version is meant primarily for making writing scripty code easier.

@daniel-vainsencher
Copy link
Contributor

TL;DR:

  • I think that "get" and "get_checked" have completely different use cases,
    having nothing to do with scripts.
  • "get_unchecked" should replace "get" only under pretty special
    circumstances.
  • I am not sure we need such a convention for every function, maybe we
    should discuss a few different functions before generalizing (a kind of
    "rule of three" [1])
  • We might want to replace get_unchecked with a type level solution for
    vetting indices safely outside the access.

Reasoning:

get_checked: indices are input from untrusted source, like file or a user.
There is no reason to expect them to be correct, and we need the program to
do something sophisticated (like gve the user another chance to input) if
they are wrong. This is mainly useful in programs (where we know what to do
if indices are wrong). In libraries, if your using program gives wrong
indices, all you need is to help find the bug by giving a clear message and
a stack trace.

get: every other case. So I expect my indices to be correct, but I want
them checked just in case. Wrong indices indicate a bug somewhere, hence
panic is appropriate.

get_unchecked: this is only for optimization purposes, so should be used
only when some profiling shows that bounds checks are a problem. The
"unsafe" property should be considered a burden on the provider of indices
to prove (informally, of course) that they are within bounds, and thus
avoided whenever possible.

About other functions: get and set are potentially cheap enough that bounds
checking overhead might be significant, and some access patterns are
obvious enough safe that these per-element bounds checks should be avoided.
For example, a column iterator will access contiguous memory, hence be very
cheap, and it is easy to ensure the start and end points are correct. On
the other hand, the column iterator function itself probably doesn't need a
"get_unchecked" version, simply because the relative cost of checks to work
is nothing (unless used a lot to access 1-row matrices, in which case the
user should find a more appropriate implementation).

I've had a different idea btw. We could associate to a matrix two new
types: a checked_row_index and a checked_column_index. Then create versions
of get that accept those types without checking. Then create values of
those types only through known safe sources: col_index_iter,
check_col_index, first_col_index etc. Then we can use optimally fast code,
but move most of the burden of proof to the type system. This depends on
some details about the type system that I don't know, and I'm not sure
whether the pattern generalizes, but it tries to follow "Make illegal
states unrepresentable" [2]. For example, we probably want safe_indices to
borrow the shape of the matrix, to prevent it from changing shape (don't
know if we can borrow a field).

Daniel
[1] https://en.wikipedia.org/wiki/Rule_of_three_(computer_programming)
[2] https://blogs.janestreet.com/effective-ml-revisited/

On Aug 10, 2015 4:46 PM, "Shailesh Kumar" [email protected] wrote:

This RFC proposes a convention for structuring methods in SciRust
which can cater to the conflicting needs of efficiency, easy of use
and effective error handling.

For the impatient:

// Efficient access without bound checks
unsafe fn get_unchecked(&self, r : usize, c : usize) -> T;
// Safe access with bound checks, raises error if invalid address
fn get_checked(&self, r : usize, c : usize) -> Result<T, Error>;
// User friendly version. Panics in case of error
fn get(&self, r : usize, c : usize) -> T;

// Efficient modification without bound checks
unsafe fn set_unchecked(&mut self, r : usize, c : usize, value : T);

// Safe modification with bound check
fn set(&mut self, r : usize, c : usize, value : T);

Detailed discussion

The audience of SciRust can be possibly divided into
two usage scenarios.

  • A script style usage, where the objective is to quickly do some
    numerical experiment, get the results and analyze them.
  • A library development usage, where more professional libraries would
    be built on top of fundamental building blocks provided by SciRust (these
    may be other modules shipped in SciRust itself).

While the first usage scenario is important for getting new users hooked
to the library, the second usage scenario is also important for justifying
why Rust should be used for scientific software development compared
to other scientific computing platforms.

In context of the two usage scenarios, the design of SciRust has three
conflicting goals:

  • Ease of use
  • Efficiency
  • Well managed error handling

While ease of use is important for script style usage,
efficiency and well managed error handling are important
for serious software development on top of core components
provided by SciRust.

We will consider the example of a get(r,c) method
on a matrix object to discuss these conflicting goals.
Please note that get is just a representative method
for this discussion. The design ideas can be applied in
many different parts of SciRust once accepted.

If get is being called in a loop, usually the code
around it can ensure that the conditions for accessing
data within the boundary of the matrix are met correctly.
Thus, a bound checking within the implementation of get
is just an extra overhead.

While this design is good for writing efficient software,
it can lead to a number of memory related bugs and goes
against the fundamental philosophy of Rust (Safety first).
There are actually two different options for error handling:

  • Returning either Option or Result<T, Error>.
  • Using the panic mechanism.

Option or Result<T, Error> provides the users a
fine grained control over what to do when an error occurs.
This is certainly the Rusty way of doing things. At the
same time, both of these return types make the user code
more complicated. One has to add extra calls to .unwrap()
even if one is sure that the function is not going to fail.

Users of scientific computing tend to prefer an environment
where they can get more work done with less effort. This is
a reason of the success of specialized environments like
MATLAB. Open source environments like Python (NumPy, SciPy)
try to achieve something similar.

While SciRust doesn't intend to compete at the level of
simplicity provided by MATLAB/Python environments, it does
intend to take an extra effort wherever possible to address
the ease of use goal.

In this context, the return type of a getter should
be just the value type T. This can be achieved
safely by using a panic if the access boundary
conditions are not met.

The discussion above suggests up to 3 possible ways of
implementing methods like get.

An unchecked (and unsafe) version for high efficiency code
where the calling code is responsible for ensuring that
the necessary requirements for correct execution of the
method are being met.

A safe version which returns either Option or
Result<T, Error> which can be used for professional
software development where the calling code has full control
over error handling.

Another safe version which panics in case of error but provides
an API which is simpler to use for writing short scientific
computing scripts.

Proposed convention

We propose that a method for which these variations
need to be supported, should follow the convention defined below:

A method_unchecked version should provide basic implementation
of the method. This should assume that necessary conditions
for successful execution of the methods are already being
ensured by the calling code. The unchecked version of method
MUST be marked unsafe. This ensures that the calling code
knows that it is responsible for ensuring the right conditions
for calling the unchecked method.

A method_checked version should be implemented on top of
a method_unchecked method. The checked version should
check for all the requirements for calling the method safely.
The return type should be either Option or
Result<T, Error>. In case the required conditions for
calling the method are not met, a None or Error
should be returned. Once the required conditions are met,
method_unchecked should be called to get the result
which would be wrapped inside Option or Result.

A method version should be built on top of method_checked version.
It should simply attempt to unwrap
the value returned by method_checked and return as T.
If method_checked returns an error or None, this version
should panic.

First two versions are suitable for professional development
where most of the time we need a safe API while at some times
we need an unsafe API for efficient implementation.
The third version is suitable for script style usage scenario.

The convention has been illustrated in the three versions of
get at the beginning of this document.
API bloat

While this convention is expected to lead into an API bloat,
but if the convention is followed properly across the library,
then it should be easy to follow (both from the perspective
of users of the library and from the perspective of developers
of the library).


Reply to this email directly or view it on GitHub
#25.

@Sean1708
Copy link
Contributor

Seems like the matrices would not be easy to replace, having different apis.

I was actually thinking the opposite, that the two types would have exactly the same API and only the implementations would differ. Consider the following workflow:

// let's say, for example, that matrix_from_file() is guaranteed to return a 20x20 matrix or larger
let m: Matrix<f64> = matrix_from_file("filename");

// this element is not guaranteed to be there
if let Ok(element) = m.get(50, 50) {
    println!("m[50, 50] = {}", element);
}

// we can guarantee that this is in bounds so we start using UncheckedMatrix
let m = m as UncheckedMatrix;
let mut i = 0.0;
for j in 0..20 {
    i += m.get(j, j);
}

I think it would make a cleaner API but actually the more of this I write out the more error prone it seems for the user. I just always feel like the type system isn't being used properly when I have to write things like get_checked rather than just get.

@daniel-vainsencher
Copy link
Contributor

let Ok(element) = m.get(50, 50)
i += m.get(j, j);

These two functions called get do not have the same return type, hence the
APIs are not compatible. This is what I meant by not easy to replace.

On Tue, Aug 11, 2015 at 2:04 PM, Sean Marshallsay [email protected]
wrote:

Seems like the matrices would not be easy to replace, having different
apis.

I was actually thinking the opposite, that the two types would have
exactly the same API and only the implementations would differ. Consider
the following workflow:

// let's say, for example, that matrix_from_file() is guaranteed to return a 20x20 matrix or largerlet m: Matrix = matrix_from_file("filename");
// this element is not guaranteed to be thereif let Ok(element) = m.get(50, 50) {
println!("m[50, 50] = {}", element);
}
// we can guarantee that this is in bounds so we start using UncheckedMatrixlet m = m as UncheckedMatrix;let mut i = 0.0;for j in 0..20 {
i += m.get(j, j);
}

I think it would make a cleaner API but actually the more of this I write
out the more error prone it seems for the user. I just always feel like the
type system isn't being used properly when I have to write things like
get_checked rather than just get.


Reply to this email directly or view it on GitHub
#25 (comment).

Daniel Vainsencher

@Sean1708
Copy link
Contributor

Of course. Sorry, I completely missed that (it's been a long week already).

@daniel-vainsencher
Copy link
Contributor

No problem. I wonder if its possible to flesh out the shape-borrowing
safe-by-type indices I proposed above. Something like

get<C: SafeColIndex, R: SafeRowIndex>(col: C, row: R) -> ... (doesn't need
checking)
and
get<C, R>(col: C, row: R) -> ... (checks and panics if needed)

On Tue, Aug 11, 2015 at 3:52 PM, Sean Marshallsay [email protected]
wrote:

Of course. Sorry, I completely missed that (it's been a long week already).


Reply to this email directly or view it on GitHub
#25 (comment).

Daniel Vainsencher

@daniel-vainsencher
Copy link
Contributor

Sure, but what I would love is if we can arrange that users cannot create
them! users can only receive them from the Matrix, which actually verifies
that they actually are correct.

Something like

let c = m.min_col() // has type SafeColIndex, so can be used in unchecked
indexing.
In Haskell it is possible to not export the constructors for types, and
then library users cannot (easily) create invalid instances, and you can
enforce various invariants. Here we would want each such index to borrow a
reference for the "shape" field of the matrix, so that shape cannot change
while safe indices exist.

On Tue, Aug 11, 2015 at 4:52 PM, Sean Marshallsay [email protected]
wrote:

That's a good idea, maybe it could be done as a newtype, something like

struct Safe(usize);
fn get(self, c: Safe, r: Safe) -> ...

Then you could call it like

let el = mat.get(Safe(3), Safe(4));


Reply to this email directly or view it on GitHub
#25 (comment).

Daniel Vainsencher

@shailesh1729
Copy link
Contributor Author

Just to clarify on one point. The RFC doesn't intend to suggest that three versions of a method should be considered for each method.

Typical development would go like this.

  • Start with a single method like method_name() returning Option<T> or Result<T, Error>. This is a standard Rust style approach. panic() should be avoided in basic development.
  • If using the method is becoming cumbersome for writing straight-forward scripts, then we need two versions. One version as the above and the other version which panics. The panicky version has a return type of T. The panicky version in turn uses the implementation of the version returning Option<T> or Result<T, Error>. The preference would be to use the method_name() for the panicky version. The question here is how the method should be named for resulty/optiony version. The RFC above suggests the name to be method_name_checked.
  • In very rare circumstances, for efficiency purposes, we need an unsafe version of the method also which skips some of the parameter validation. This version would be named as method_name_unchecked. In this case, the optiony / resulty version would build on top of it by adding extra parameter validation.

So in summary:

  • all methods should by default be optiony or resulty.
  • if optiony / resulty version is difficult to use for script style usage, then introduce a panicky version which in turn uses the optiony / resulty version.
  • In extremely rare circumstances an unchecked version is also introduced for efficiency purposes. This is done by refactoring the implementation from the optiony / resulty version.

The objective of this RFC is to develop a guideline for the development process.

@daniel-vainsencher
Copy link
Contributor

So looks like we agree that the unsafe version should be a rarity.

I think that if most methods should be optiony/resulty, we should strive
for Results with clearly explained error types, to enable users to do
something smarter than unwrap.

For example, I would make the get have "column/row out of bounds" error.
What would a resulty matrix inversion return as error types? or dot
product? What is some concrete code where you would handle these Results,
and not by panic! ?

Daniel

On Tue, Aug 11, 2015 at 8:01 PM, Shailesh Kumar [email protected]
wrote:

Just to clarify on one point. The RFC doesn't intend to suggest that three
versions of a method should be considered for each method.

Typical development would go like this.

Start with a single method like method_name() returning Option or Result<T,
Error>. This is a standard Rust style approach. panic() should be
avoided in basic development.

If using the method is becoming cumbersome for writing
straight-forward scripts, then we need two versions. One version as the
above and the other version which panics. The panicky version has a return
type of T. The panicky version in turn uses the implementation of the
version returning Option or Result<T, Error>. The preference would
be to use the method_name() for the panicky version. The question here
is how the method should be named for resulty/optiony version. The RFC
above suggests the name to be method_name_checked.

In very rare circumstances, for efficiency purposes, we need an unsafe
version of the method also which skips some of the parameter validation.
This version would be named as method_name_unchecked. In this case,
the optiony / resulty version would build on top of it by adding extra
parameter validation.

So in summary:

  • all methods should by default be optiony or resulty.
  • if optiony / resulty version is difficult to use for script style
    usage, then introduce a panicky version which in turn uses the optiony /
    resulty version.
  • In extremely rare circumstances an unchecked version is also
    introduced for efficiency purposes. This is done by refactoring the
    implementation from the optiony / resulty version.

The objective of this RFC is to develop a guideline for the development
process.


Reply to this email directly or view it on GitHub
#25 (comment).

Daniel Vainsencher

@shailesh1729
Copy link
Contributor Author

Please look at src/error.rs type SRError for the current list of error codes. This certainly would be extended further. There is also a to_string method available for the SRError type which presents a human readable string for it. It is expected that some of these error codes will have additional parameters also attached with them in due course. For their usage, just search on SRError in the code base. You will see plenty of examples.

@daniel-vainsencher
Copy link
Contributor

Ah, looks pretty good! Already had the pleasure to encounter
DimensionMismatch, so I changed it to also store and print the dimensions
of the participating matrices, since this makes it much easier to find the
originating bug. I'll send a PR with that.

On Wed, Aug 12, 2015 at 10:55 PM, Shailesh Kumar [email protected]
wrote:

Please look at src/error.rs type SRError for the current list of error
codes. This certainly would be extended further. There is also a to_string
method available for the SRError type which presents a human readable
string for it. It is expected that some of these error codes will have
additional parameters also attached with them in due course. For their
usage, just search on SRError in the code base. You will see plenty of
examples.


Reply to this email directly or view it on GitHub
#25 (comment).

Daniel Vainsencher

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants