handling for arbitrary derive cycles #46
Interesting. Would `H` look like this?

```rust
struct H {
    d: Box<D>,
    g: G,
}
```

When we "remove" a node, it sounds like that type would always then be referred to via a `Box`. In addition to "containment cycles", I'm also interested in dependency cycles so that we can use a more expansive set of derive macros. Consider, say, a struct that contains a type such as an `f32` that constrains which traits can be derived. It may be that these are different enough problems that we want to address containment and derives distinctly.
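Assuming `D` and `G` along the lines of the cycle being discussed (their fields are hypothetical), the boxed reference type-checks like this:

```rust
// Hypothetical containment cycle H -> D -> H, mirroring the struct
// above. The Box gives H a finite size; the Option (an assumption
// here) gives the recursion a base case so a value can be built.
struct G {
    value: f64,
}

struct D {
    h: Option<H>,
}

struct H {
    d: Box<D>,
    g: G,
}

fn main() {
    let h = H {
        d: Box::new(D { h: None }),
        g: G { value: 1.0 },
    };
    assert!(h.d.h.is_none());
}
```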
Oops I forgot to include H, but yes that is what H would look like.
Yes, by "removing" a node we are making all references to its type a `Box`.

Edit: I realized I might be misunderstanding your question. The solution I've thought of optimizes for "removing" the minimal number of nodes. I thought that this would be nice because it would make things easier to implement. However, you could be asking whether, since we are changing every reference to this type to a `Box`, we should be optimizing for something else instead.
I think I'm confused on how having a type reference cycle causes an issue. I'll go over my current understanding of the problem and a proposed solution. Hopefully that will help you find where I am missing some understanding of why these cycles could cause an issue.

Problem statement: we want to find, on a per-type basis, the maximal set of desired traits that we can implement using the derive macro.

Proposed solution: start by assuming that every type can impl all of our desired traits:

```rust
struct A {
    b: B,
}

struct B {
    inner: f64,
}
```

We wouldn't look at the reference from `A` to `B`; for each type we would only consider its "known" (primitive) types, such as the `f64` inside `B`. If any of these "known" types are restricting in any way then we would want to apply that restriction to the current type. In the example above, we would say that `B` cannot derive `Eq` because `f64` does not implement it.

We should be able to look only at the "known" types because we are doing this for every node: if there is a path C -> B -> A, then when we are at node A we would traverse to node C on the transpose graph, so any restriction found at `A` propagates to `B` and `C`, and we need not worry about the reference to type `B` when processing `C` itself.

This would in the worst case require a traversal from every node that reaches every other node, in the case of a graph where all nodes are contained in a single SCC and where every node has a restrictive "known" type. This would have a runtime of O(n²).
Note that #300 addressed containment cycles; there is still more work to be done in terms of properly selecting which derive macros to apply.
JSON Schema can define types that have cycles. This is simple to handle in languages like JavaScript or Java, but more complex in Rust since those cycles must be explicitly broken with a `Box<T>`. Note that use of a `Vec` or `HashMap` also breaks the containment cycle, but has implications for derive computation which we will discuss later. Note too that where one "breaks" a cycle may have multiple solutions, some that require more breaks than others. Note also that it may not be feasible to reconstruct the types, e.g. if the JSON Schema were derived from Rust types, because the information about `Box` indirections is explicitly discarded (and probably reasonably so, but one could imagine including hints; more on that later as well).

Currently we break trivial `A -> A` cycles, i.e. a type that directly contains itself. We can do this without a bunch of graph traversal, and it solved a proximate problem.
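The trivial self-cycle looks like the following sketch (the field name is assumed):

```rust
// A type that contains itself directly: without the Box the type
// would have infinite size, and without the Option no finite value
// could ever be constructed.
struct A {
    next: Option<Box<A>>,
}

fn main() {
    // A two-element chain terminated by None.
    let a = A { next: Some(Box::new(A { next: None })) };
    assert!(a.next.unwrap().next.is_none());
}
```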
The more general case requires us to decompose the type graph into strongly connected subgraphs that form a DAG (e.g. with algorithms proposed by Tarjan, Dijkstra, or Kosaraju). In this case, the edges are defined by struct or newtype containment, either directly or via an `Option` type. Within each strongly connected subgraph we then would determine where to "break" the cycles by inserting `Box`es. The general case of this requires exponential time to compute. While the number of nodes (types) in a cycle is likely to be small, we still may elect for a heuristic, the simplest of which would be to cut all edges. There's very little harm in cutting more than is absolutely required--the serialization isn't affected, for example--the only consequence is to the legibility and ergonomics of the generated types.

For JSON Schema generated from Rust types, it could be helpful to annotate boxed types with an extension. This could act as a heuristic when slicing a strongly connected component, i.e. we use these extensions to see if they properly break the containment cycle and do something else if they don't.
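As a sketch of the decomposition step, here is a minimal Kosaraju pass over a toy graph in which integer indices stand in for types (the graph itself is hypothetical, not drawn from any real schema):

```rust
// Kosaraju's algorithm: DFS the forward graph recording finish order,
// then DFS the transpose in reverse finish order; each transpose tree
// is one strongly connected component.
fn kosaraju(n: usize, edges: &[(usize, usize)]) -> Vec<usize> {
    let mut adj = vec![Vec::new(); n];
    let mut radj = vec![Vec::new(); n];
    for &(u, v) in edges {
        adj[u].push(v);
        radj[v].push(u);
    }
    // First pass: record finish order on the forward graph.
    fn dfs(u: usize, adj: &[Vec<usize>], seen: &mut Vec<bool>, order: &mut Vec<usize>) {
        seen[u] = true;
        for &v in &adj[u] {
            if !seen[v] {
                dfs(v, adj, seen, order);
            }
        }
        order.push(u);
    }
    let mut order = Vec::new();
    let mut seen = vec![false; n];
    for u in 0..n {
        if !seen[u] {
            dfs(u, &adj, &mut seen, &mut order);
        }
    }
    // Second pass: flood-fill the transpose; comp[u] is u's component id.
    let mut comp = vec![usize::MAX; n];
    let mut c = 0;
    for &u in order.iter().rev() {
        if comp[u] == usize::MAX {
            let mut stack = vec![u];
            while let Some(x) = stack.pop() {
                if comp[x] != usize::MAX {
                    continue;
                }
                comp[x] = c;
                for &v in &radj[x] {
                    if comp[v] == usize::MAX {
                        stack.push(v);
                    }
                }
            }
            c += 1;
        }
    }
    comp
}

fn main() {
    // Cycle 0 <-> 1, with 2 hanging off it: two components.
    let comp = kosaraju(3, &[(0, 1), (1, 0), (1, 2)]);
    assert_eq!(comp[0], comp[1]);
    assert_ne!(comp[0], comp[2]);
}
```

Within each component one could then pick containment edges to `Box`; the cut-all-edges heuristic mentioned above simply boxes every intra-component edge.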
The derive macros we apply to types have a similar problem. Consider, for example, a struct that contains a `Vec<u32>`.
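A minimal type of this shape, assuming a single `Vec<u32>` field, might be:

```rust
// With u32 this derive compiles, because Vec<T> is Eq whenever T: Eq.
// Swapping u32 for f32 would make #[derive(Eq)] fail to compile,
// since f32 implements only PartialEq.
#[derive(Eq, PartialEq)]
struct Example {
    a: Vec<u32>,
}

fn main() {
    assert!(Example { a: vec![1, 2] } == Example { a: vec![1, 2] });
}
```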
For this struct we could `#[derive(Eq, PartialEq)]`, but if we change the `u32` to an `f32` we could not! A `Vec<T>` is `Eq` only if `T: Eq`, and a `HashSet<T>` isn't `Ord` regardless of the traits implemented by `T`.

From the list of desirable traits to implement, such as `Hash`, `Ord`, and `Eq`, the ones we can apply to a type depend on the types to which it refers. And those references may form a cycle. As above, we must compute the strongly connected components. Above, the edges were containment; here the edges are all references (i.e. a `Vec` is an edge here but not above). Within each strongly connected component we must take the intersection of all supportable traits.
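The intersection step can be sketched as follows (a hypothetical helper; the trait-name strings stand in for each member type's independently supportable set):

```rust
use std::collections::HashSet;

// Within one strongly connected component, every member's derivable
// trait set is the intersection of the sets its members could
// support on their own. Assumes a non-empty component.
fn scc_traits(per_type: &[HashSet<&'static str>]) -> HashSet<&'static str> {
    per_type
        .iter()
        .skip(1)
        .fold(per_type[0].clone(), |acc, s| {
            acc.intersection(s).copied().collect()
        })
}

fn main() {
    let a: HashSet<_> = ["Eq", "Hash", "Ord"].into_iter().collect();
    let b: HashSet<_> = ["Eq", "Hash"].into_iter().collect(); // e.g. holds a HashSet, so no Ord
    let c: HashSet<_> = ["Hash", "Ord"].into_iter().collect(); // e.g. a member rules out Eq
    let shared = scc_traits(&[a, b, c]);
    // Only Hash survives the intersection across the component.
    assert!(shared.contains("Hash"));
    assert!(!shared.contains("Eq") && !shared.contains("Ord"));
}
```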