-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Safe Access to Contiguous Storage #2307
Safe Access to Contiguous Storage #2307
Conversation
Co-authored-by: Alex Martini <[email protected]>
- this is redundant, since it is the same as `String.UTF8View`
|
||
```swift | ||
extension Span where Element: ~Copyable & ~Escapable { | ||
func withUnsafeBufferPointer<E: Error, Result: ~Copyable & ~Escapable>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think Result
can be ~Escapable
here without the lifetimes proposal.
I have a meta-question about the design here. Span is already non-escaping itself, and these operations that produce an unsafe buffer pointer obviously have to break that requirement. Given that, it doesn't feel like using the closure here is providing extra safety over (e.g.) a property that produces the UnsafeBufferPointer
, but it's a lot less ergonomic. Could we consider instead to have:
var unsafeBufferPointer: UnsafeBufferPointer<Element> { get }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Span.withUnsafeBufferPointer
ensures that the Span's lifetime, and whatever that span depends on, extend across all valid uses of the pointer. The rules for pointer validity in closure taking methods are clear. There's nothing new here. As long as programmers follow the usual rules, they can't introduce any new form of unsafety.
The compiler should be able to diagnose pointer escapes from the closure, but that's a separate discussion!
func withUnsafeBytes<E: Error, Result: ~Copyable & ~Escapable>( | ||
_ body: (_ buffer: UnsafeRawBufferPointer) throws(E) -> Result | ||
) throws(E) -> Result | ||
} | ||
|
||
extension RawSpan { | ||
func withUnsafeBytes<E: Error, Result: ~Copyable & ~Escapable>( | ||
_ body: (_ buffer: UnsafeRawBufferPointer) throws(E) -> Result | ||
) throws(E) -> Result | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same suggestion about having a property here:
var unsafeTypes: UnsafeRawBufferPointer { get }
} | ||
``` | ||
|
||
##### Accessing subranges of elements: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I noted earlier on, all of the extracting
functions depend on lifetime dependencies. Perhaps provide a closure-based withSubspan
?
/// - Parameters: | ||
/// - type: The type as which to view the bytes of this span. | ||
/// - Returns: A typed span viewing these bytes as instances of `T`. | ||
public func unsafeView<T: BitwiseCopyable>(as type: T.Type) -> Span<T> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, this requires lifetime dependencies. Change it to a closure-taking withUnsafeView
?
/// The closure's parameter is valid only for the duration of | ||
/// its execution. | ||
/// - Returns: The return value of the `body` closure parameter. | ||
func withUnsafeBytes<E: Error, Result: ~Copyable & ~Escapable>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted earlier, can this be a property?
```swift | ||
extension RawSpan { | ||
|
||
public func extracting(_ bounds: Range<Int>) -> Self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same issue as earlier; this is based on lifetime dependencies.
|
||
Some types store their internal representation in a piecewise-contiguous manner, such as trees and ropes. Some operations naturally return information in a piecewise-contiguous manner, such as network operations. These could supply results by iterating through a list of contiguous chunks of memory. | ||
|
||
#### <a name="MutableSpan"></a>Safe mutations of memory with `MutableSpan<T>` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking forward to this proposal! ;)
public func extracting(droppingFirst k: Int) -> Self | ||
/// - Parameters: | ||
/// - indices: a range of indices to validate | ||
public func boundsPrecondition(_ indices: Range<Int>) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
span.boundsPrecondition(range)
is just precondition(span.boundsContain(range))
, right? Do we really need/want sugar for this?
Re: naming, I think things get somewhat easier if we drop the precondition func, then we just have boundsContain
, whose naming isn't great because it's kind of oblique. The question we're really asking is "are the indices in this range valid for this span?". Something like span.indicesContain(range)
, though that then raises the question of "why isn't this just span.indices.contains(range)
?" (probably performance?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have these, do we also need ClosedRange
versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, boundsPrecondition's implementation is precondition(boundsContain(range)). There's maybe an argument to be made that the sugar hurts here, since it doesn't capture the correct line of code in the defaulted arguments of precondition().
I haven't needed the ClosedRange
versions because the RangeExpression
-taking functions end up forwarding to the usual Range
-taking functions. Maybe we would want them. We could take out the sugared preconditions and add ClosedRange bounds-checking functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking great to me, thank you
public func extracting(droppingFirst k: Int) -> Self | ||
/// - Parameters: | ||
/// - indices: a range of indices to validate | ||
public func boundsPrecondition(_ indices: Range<Int>) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have these, do we also need ClosedRange
versions?
public func withSpan<E: Error, Result: ~Copyable>( | ||
_ body: (_ elements: Span<Element>) throws(E) -> Result | ||
) throws(E) -> Result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm one of those people who would like this API :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The complex semantics of this makes it undesirable, as exhibited by the overcomplicated signature. What we actually need is a storage
property that directly returns a Span
, tied to the lifetime of self.
```swift | ||
@frozen | ||
public struct Span<Element: ~Copyable & ~Escapable>: Copyable, ~Escapable { | ||
internal var _start: UnsafePointer<Element> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Span
has the wrong pointer type: UnsafePointer<Element>
. The underlying pointer should be UnsafeRawPointer
. An important goal of Span is that it can be used to reinterpret memory as different BitwiseCopyable
elements. It needs to be possible to convert Span<T: BitwiseCopyable>
to Span<U: BitwiseCopyable>
. That should require an explicit unsafe API but should still be well-defined. If you do this conversion with an underlying typed pointer, you get undefined behavior. We want multiple read-only spans to safely view the same underlying memory as different types. We also want to be able to view slices of a RawSpan
as a particular element type. This is the same problem. Note that RawSpan
does not provide any additional functionality; it is only a convenience for buffers of heterogenous elements, where declaring an element type might be misleading at the API level.
We do not want to describe strict aliasing rules and undefined behavior in the Span API!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm; unrelated to the above: didn't we intend to use buffer-pointer layout for this? The start pointer ought to be optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated this to be optional, with an explanation (hadn't noticed that a push failed on Friday.) We won't use buffer-pointer layout after all, because we do want the pointer to be raw.
extension RawSpan { | ||
public init<T: BitwiseCopyable>(_ span: Span<T>) | ||
|
||
public func unsafeView<T: BitwiseCopyable>(as type: T.Type) -> Span<T> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're mentioning RawSpan.unsafeView<T: BitwiseCopyable>(as type: T.Type) -> Span<T>
here, we may also want to explain that this is a safe API as long as the memory contains a value bit pattern for T
at each position. Span<T>
does not impose any strict aliasing demands on the programmer; they can use multiple read-only spans to safely view the same underlying memory as different types.
public var count: Int { get } | ||
public var isEmpty: Bool { get } | ||
|
||
public subscript(_ position: Int) -> Element { _read } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public subscript(_ position: Int) -> Element { _read }
This is not, in fact, the API that we want long term; it won't be compatible with lifetime-dependent values that are derived from the span.
We really want some new spelling for ~Copyable
projections, like:
public subscript(_ position: Int) -> Element { borrow }
For now, we could say:
public subscript(_ position: Int) -> Element { unsafeAddress }
Either way, we end up with an extra mangling for this API that may become vestigial in the future. But with unsafeAddress
, at least we have the full functionality of some future, safe "borrow accessor".
@jckarter what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The future directions attempts to explain that the _read
accessor is a placeholder. I expect the implementation of such placeholders to be non-ABI until there is a final spelling. I could add a reference to the future directions item after this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I realize the plan is to initially implement Span.subscript via _read
. I'll just point out that it will limit usability. unsafeAddress
will at least support the same use cases that a borrow
accessor would support.
- add `boundsContain(_ bounds: ClosedRange<Int>)`
```swift | ||
@frozen | ||
public struct Span<Element: ~Copyable & ~Escapable>: Copyable, ~Escapable { | ||
internal var _start: UnsafePointer<Element> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm; unrelated to the above: didn't we intend to use buffer-pointer layout for this? The start pointer ought to be optional.
/// - Parameters: | ||
/// - index: an index to validate | ||
/// - Returns: true if `index` is a valid index | ||
public func boundsContain(_ index: Int) -> Bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This name needs to be reconsidered. isValidIndex(_:)
/// Parameters: | ||
/// - span: a span of the same type as `self` | ||
/// Returns: whether `span` is a subrange of `self` | ||
public func contains(_ span: borrowing Self) -> Bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, we have blown the contains
name to mean elementwise containment, not structural containment. This usage is not compatible with that, so we need to use a different name for it.
isSuperspan(of:)
? isSubspan(of:)
? isWider(than:)
? 😕
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Notes from in person discussions]
-
span.isWithin(other)
appears to be a workable spelling for this. -
We also need to have an
isIdentical(to:)
method, to check equivalence. (a.isIdentical(to: b) == a.isWithin(b) && b.isWithin(a)
, but we wouldn't want to spell/implement it that way.) -
If the elements are bitwise copyable, then this operation needs to check alignment, which is a surprising complication, explaining why we'd want to ship this functionality in the stdlib. (This also needs to be called out within the docs.)
This may also turn into a solid argument against using the
isWithin
name, especially when considering the equivalent APIs onRange
in point 5 below. (If so, perhaps something likeisSubspan
is the way to go.) -
We should add equivalent members to the
UnsafeBufferPointer
types, to keep the stdlib coherent with itself, and to resolve some long standing omissions of basic functionality.extension Unsafe[Mutable]BufferPointer { func isIdentical(to other: UnsafeBufferPointer<Element>) -> Bool func isIdentical(to other: UnsafeMutableBufferPointer<Element>) -> Bool func isWithin(_ other: UnsafeBufferPointer<Element>) -> Bool func isWithin(_ other: UnsafeMutableBufferPointer<Element>) -> Bool }
-
isWithin
ought to also be added to the range types:extension Range { func isWithin(other: Range) -> Bool func isWithin(other: ClosedRange) -> Bool } extension ClosedRange { func isWithin(other: Range) -> Bool func isWithin(other: ClosedRange) -> Bool }
(These types already provide the functionality of
isIdentical
in their==
operation.)The notion of unaligned bounds makes this somewhat dangerous --
Range<UnsafePointer<T>>.isWithin(_:)
would sometimes return different results than anisWithin
call onUnsafeBufferPointer
orSpan
values over the same region of memory. -
The preexisting
[Closed]Range.overlaps
hints at a need to provide[Raw]Span.overlaps
andUnsafe[Mutable][Raw]BufferPointer.overlaps
operations. (As well asintersect
andunion
/merge
/combine
operations.) This does not seem to be an urgent need though.
/// This is an unsafe operation. Failure to meet the preconditions | ||
/// above may produce an invalid value of `T`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the address has incorrect alignment, are we really going to return an invalid value? (Rather than, say, sometimes causing a fault?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There exist systems where a load instruction simply masks off the low bits and always loads whatever it finds at the resulting aligned address (AltiVec loads behave this way on PPC, for example).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unaligned loads can fault in embedded world as well, especially when targeting memory that isn't DRAM. In a sense, the worst outcome isn't necessarily an invalid value of T.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point of this is that there is no trap other than bounds checking, and the T
instance's memory could be in a state that does not respect T
's invariants.
public func contains(_ span: borrowing Self) -> Bool | ||
|
||
public func offsets(of span: borrowing Self) -> Range<Int> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same problems as with Span
-- contains
implements the wrong semantics for that name, and offsets(of:)
has an awkward name/shape.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public func isWithin(_ other: borrowing Self) -> Bool // Perhaps removed
public func byteOffsets(of subspan: borrowing Self) -> Range<Int>?
public func withSpan<E: Error, Result: ~Copyable>( | ||
_ body: (_ elements: Span<Element>) throws(E) -> Result | ||
) throws(E) -> Result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The complex semantics of this makes it undesirable, as exhibited by the overcomplicated signature. What we actually need is a storage
property that directly returns a Span
, tied to the lifetime of self.
|
||
```swift | ||
@frozen | ||
public struct Span<Element: ~Copyable & ~Escapable>: Copyable, ~Escapable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we proposing Span
with support for nonescapable elements?
This is a proposal for safe access to borrowed memory, adding
Span
(formerly known as "BufferView") andRawSpan
.This proposal is a companion to (and depends on) Non-Escapable Types and on Lifetime Dependency Annotations for Non-escapable Types.