
Rust generator support? #4436

Open

Sajjon opened this issue Apr 2, 2024 · 16 comments

Labels: enhancement (New feature or request), new language (Tracks issues requesting new languages generation support)
Milestone: Backlog

@Sajjon commented Apr 2, 2024

Any plans for a Rust generator?

@sebastienlevert (Contributor) commented

It's definitely something we have been exploring. Right now, the demand is still unclear and we'd love to get more customer signals on the volume of demand Rust has today.

Can you share some of the scenarios you are thinking of for Rust support?

@Sajjon (Author) commented Apr 6, 2024

> It's definitely something we have been exploring. Right now, the demand is still unclear and we'd love to get more customer signals on the volume of demand Rust has today.
>
> Can you share some of the scenarios you are thinking of for Rust support?

Hey @sebastienlevert! We at Radix DLT - "The radically better user and developer experience needed for everyone to confidently use Web3 & DeFi." - are heavy users of openapi, at two levels:

But the openapi generator's Rust generator is broken: it fails to generate our models, mostly due to enums being generated wrongly. So we have various scripts which modify our openapi schema to make it work with Rust, which is suboptimal.

And in our middleware crate Sargon we are soon about to consume the Gateway (higher-level) API's schema, but the openapi generator generates broken enums there too. The schema-modifying script above was written for the lower-level Core API.

So we were kind of hoping Kiota could come to the rescue! :D So now you have two tricky, big schemas to test against :D

@sebastienlevert (Contributor) commented

Thanks for the context! We'll be taking this into consideration! Right now, it's not something that is prioritized. Though, we are happy to provide guidance if you want to participate with us building the support for Rust! Thanks!

@dhedey commented Apr 9, 2024

Thanks @sebastienlevert! I'm Head of Architecture at RDX Works, and work with @Sajjon.

Having been deep in Open API generation pains for years, we're really enthused about kiota, and the approach you're taking.

We're hoping to transition to kiota for the next major version of our generated SDKs across most of our languages.

As part of that effort, it might be possible for us to participate with building out Rust capabilities in kiota. We'd need to explore this further and I'll let you know if that's something we might be able to commit to in due course.

@sebastienlevert (Contributor) commented

Sounds good and happy to meet you! Our docs on adding new language support are missing as of now, but we would definitely help and guide your team. Let us know and we can start the process!

@andrueastman andrueastman added enhancement New feature or request new language Tracks issues requesting new languages generation support labels Apr 12, 2024
@andrueastman andrueastman added this to the Backlog milestone Apr 12, 2024
@baywet (Member) commented Apr 15, 2024

Hi everyone 👋
Joining the conversation here a bit late since I was on vacation, but happy to see this issue.
You'll want to check out #3745 and associated links, as the implementation efforts and sequence will be very similar. (Dart is the community-driven language we've made the most progress with to date.)

@dhedey commented Apr 15, 2024

Many thanks @baywet! Just to help with trying to gauge the work, I've spent a couple of hours scoping out the work ahead. I don't suppose you might be able to double-check my assumptions and answer a few questions on how to approach implementation?

No rush from my part for replies here, if we were able to offer help, we'd still be a few months away from doing so.

Quick summary

To attempt to unify all your comments and my understanding, is this roughly right?

  • The first bit of work would be to start with an abstractions crate kiota-abstractions for the rust language, which would contain interfaces which the generated SDKs would use (and would include traits - i.e. rust interfaces - for request information, SerializationWriter, ParseNode, and a response adapter)
  • And then there is separate work on a number of areas (e.g. list of tickets for swift). Notably:
    • Code for generating e.g. models, clients and request builders - which statically require only the kiota-abstractions crate.
    • A default kiota-http Core Service implementation.
    • Default serialization implementations, e.g. kiota-serialization-json, kiota-serialization-http, kiota-serialization-multipart
    • Various authentication provider implementations

Questions:

  • It sounds like (unlike with e.g. the dotnet implementation) all these crates would ideally live in a singular kiota-rust monorepo? Is that still the preference? (It's fine/ideal from a Rust POV.)
  • I assume kiota-http is allowed to use some widely-used open-source http client dependency? And ditto with the serialization implementations?

Some rust-specific questions

There are a couple of things which I can foresee affecting the rust implementation, which I imagine will bite anyone integrating it for rust, and wanted to run them by you...

Lack of inheritance

Unfortunately, not quite aligning with the design philosophy of Kiota, the Rust language isn't object oriented. That's not necessarily an issue - we can craft rough equivalents to hierarchies with enums, structs and traits. Although I think it would take a bit of work to get something which both feels nice and still covers all the generation cases.

In a separate post below, I'll dump some personal reflections/lessons from the previous Open API generator which might help anyone looking to implement these models.

But, I was wondering if you had any advice?

First, I just wanted to double-check my understanding:

  • It looks like there's some kind of Kiota intermediate representation CodeDom(?) which is used for generation in each language
  • It looks like each language has:
    • A refiner (e.g. Typescript), to massage the types into better types for output (e.g. typescript creates discriminated unions; Java avoids unions)
    • And then Writers (e.g. Java) which have:
      • A convention service, for customising generation
      • And then generators for Classes (Rust structs/Enums), Properties, Methods, etc (roughly for each code type which can be output by the refiner)

Questions

A few quick questions:

  • I assume we can make use of the refiner to massage the types to fit Rust a little better, before generating?
  • How are things tested? Are there any particularly gnarly test cases we can use to stress the edge-cases? (e.g. multi-tier inheritance, including e.g. a property bag; combinations of OneOf and AnyOf; inline objects; use of $ref and overwritten descriptions)?
  • I can't see any anyOf or oneOf at all in kiota-samples - am I missing them?

Auto-serialization and serde

In reference to this comment on this post:

> A couple of principles to keep in mind:
>
>   • we don't want the generated code to depend on anything else than the abstractions to "build" (in a static type sense)
>   • we want the generated code to be agnostic from any given serialization format/library
>   • we want to avoid using reflection as much as possible (perf, side effects...)
>
> This is why models implement the auto-serialization pattern (they describe how to serialize/deserialize themselves)

I really like the auto-serialization approach.

In rust, the ecosystem basically has a single standard for this - the serde crate - which is serializer/deserializer agnostic.

A quick aside on serde, compared with kiota's abstractions:

  • It has its own equivalent of the kiota ISerializationWriter - the serde Serializer trait - and similar, and generates the auto-serialization code with macros.
  • On the deserialization side, the serde Deserializer uses a visitor pattern, with a visitor per type (e.g. one for each struct). This is a little more generic than the IParseNode interface.
  • Due to the Rust trait coherence rules which limit who can implement traits (think "implement interfaces"), it's very common in the Rust ecosystem for any crate exposing models to add a serde crate feature. If the feature is enabled, it includes derives of serde::Serialize and serde::Deserialize on models, which allow the models to be used with any serde-compatible frameworks.
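To illustrate that feature-gating pattern, here is a minimal sketch (the type, field names and the "serde" feature name are hypothetical, not actual kiota output):

```rust
// Hypothetical generated model sketch: the cfg_attr-gated derives only
// apply when a crate feature named "serde" is enabled, so the crate has
// no serde dependency at all otherwise.
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
#[derive(Debug, Clone, PartialEq)]
pub struct Pet {
    pub name: String,
    pub barks: bool,
}

fn main() {
    // Without the feature enabled, the type still works as a plain model:
    let pet = Pet { name: "Rex".to_string(), barks: true };
    assert_eq!(pet, pet.clone());
}
```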

Questions

  • I assume it would be most aligned with kiota (and likely easiest) to generate code using ISerializationWriter and IParseNode style interfaces.
  • But - I wonder if it might be reasonable to:
    • Also have a crate kiota-serde (as part of the default implementations) which in concept acts as a bridge between ISerializationWriter and serde::Serializer and from serde::Deserialize to IParseNode (the latter might need a bit of thought). We might be able to put this bridge in the form of a kiota_serde::AsSerde(MyModel) newtype wrapper.
    • Have an optional feature in the generated code crate which, if enabled, would add additional trait implementations to support the deserializer part of kiota-serde - if required. (I haven't given much thought about if this would be required, but my intuition says it might be).

With such an approach, this might enable users of generated models to use the models with other parts of the rust ecosystem, and also allow us to implement the default serializer libraries as using libraries like serde-json etc.

@baywet (Member) commented Apr 15, 2024

Thanks for the detailed analysis here, I'll try to reply to the questions without missing anything.

  • mono-repo: preferred if the language allows it and it doesn't overcomplicate CI/CD workflow definitions
  • which http client to use: preferably the one offered by the platform (like HttpClient in dotnet) if one exists and it's not ancient (HttpUrlConnection in Java); otherwise the one most widely adopted by the language community that supports the requirements (http/2 support out of the box, MIT license or similar, middleware pipeline for retry/redirect/compression/etc. handling)
  • serialization implementation: same rationale applies as with http; for the form and multipart serialization, make sure those libraries do not require any server type dependency.
  • CodeDom: it's a reflection of "all the elements a language agnostic client needs to have to call the API".
  • refiner: specializes the DOM into something more language specific to make the writers job easier, yes please use refiners to get closer to Rust "native experience".
  • writers: visitor DP that writes the code for all the elements in the DOM.
  • testing: you'll see unit tests for each element in the tests projects, there's nothing fancy besides SRP and AAA unit tests here.
  • no examples of anyOf/oneOf in kiota samples AFAIK, as this is a subset of Microsoft Graph (which doesn't have those) or very simple APIs. Please note that oneOf/anyOf support is not yet implemented in TypeScript/Ruby/Swift, so don't use those for your tests/to observe how things look.
  • serde: I'm not familiar at all with this, but it looks interesting. We'd like to favor alignment across languages on the serialization infrastructure as it has impacts on the different datatypes supported, which OpenAPI/JsonSchema features and version are supported etc. Do you think the "Json implementation could simply forward things to serde"? If so, I guess we'd automatically have support for xml/yaml/cbor/etc... which would be neat.

Don't hesitate if you have more questions!

@dhedey commented Apr 15, 2024

Fantastic, very helpful, thanks @baywet!

> We'd like to favor alignment across languages on the serialization infrastructure as it has impacts on the different datatypes supported, which OpenAPI/JsonSchema features and version are supported etc.

Indeed, that makes a lot of sense.

> Do you think the "Json implementation could simply forward things to serde"? If so, I guess we'd automatically have support for xml/yaml/cbor/etc... which would be neat.

I think it might be possible to create some kind of adaptor between serde and the kiota abstractions (certainly they try to solve very similar problems) - but I haven't tried or investigated this too deeply. I think this would be the most natural way to interface the kiota abstractions into the rust ecosystem, but I imagine when the project gets to implementing, it might be possible to investigate this.

@dhedey commented Apr 16, 2024

And, as mentioned, whilst they're on my mind, I wanted to dump some various reflections regarding model generation in Rust, off the back of some painful experience with the Rust Open API generator...

These are all rather opinionated, so could do with moving to some kind of discussion place / moving off elsewhere, when there's a separate rust repo.

Boxing

To avoid issues with large stack usage and recursive types, there will have to be some use of boxes.

Possibly the least painful place to put them is in enum variants, and Option<Box<T>> - as this prevents the two biggest issues - recursive types and large enum variants causing excessive stack usage in the other variants.

The largest struct stack size is then bounded by the size of the struct definitions in the Open API schema.

But possibly we should also have boxing of struct fields... Open to reflections here:

  • Are we happy with arbitrarily large structs? (probably?)
  • We could possibly consider adding boxes into the generated models in particular places dependent on the size of structs below (e.g. after 1KB of stack or something). But this might be rather arbitrary.
  • Possibly boxing every child struct/enum might be a simpler model, although much more expensive in terms of allocations.
  • Smithy Rust uses a nice algorithm to make a deterministic choice about where to insert boxes

Iteration order

Ideally the extended "properties" collection would use insertion order. That way, { "x" => 1, "a" => 2 } would come out correctly with "x" before "a". This is quite important in some use cases, and would require use of the indexmap crate, as HashMap makes no iteration-ordering guarantees, and BTreeMap iterates in key order (alphabetical for string keys).

Perhaps this can be a feature or generation option. Like it is with the rust open api generator.
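A stdlib-only sketch of why this matters: BTreeMap silently re-orders keys, while an insertion-ordered collection (a Vec of pairs here, standing in for indexmap's IndexMap) round-trips the original order:

```rust
use std::collections::BTreeMap;

// The JSON object { "x": 1, "a": 2 }, with "x" inserted first.

// A Vec of pairs trivially preserves insertion order (indexmap gives you
// the same guarantee with O(1) keyed lookup on top).
fn insertion_ordered_keys() -> Vec<String> {
    let props: Vec<(String, i64)> = vec![("x".into(), 1), ("a".into(), 2)];
    props.into_iter().map(|(k, _)| k).collect()
}

// BTreeMap iterates in key order, losing the order the service sent.
fn btreemap_keys() -> Vec<String> {
    let mut props = BTreeMap::new();
    props.insert("x".to_string(), 1i64);
    props.insert("a".to_string(), 2i64);
    props.keys().cloned().collect()
}

fn main() {
    assert_eq!(insertion_ordered_keys(), vec!["x", "a"]); // round-trips correctly
    assert_eq!(btreemap_keys(), vec!["a", "x"]);          // re-ordered alphabetically
}
```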

Inheritance

Note - I love this guide from Redocly regarding common patterns with oneOf/anyOf/allOf and use of discriminators.

Consider:

# Pseudo-Schema (all fields required unless marked with a ?)
AnimalMixin = object:{ name: string }
Gift = object:{ name: string }
FelineMixin = object:{ purrs: bool }:additional_properties=Gift
PetType = string:enum=["Dog", "Cat"]
PetTypeExtensible = string:AnyOf=[PetType, {}] // Makes it extensible
Pet = object:AllOf=[
    AnimalMixin,
    object:{ pet_type: PetType }:discriminator("pet_type", { "dog": Dog, "cat": Cat })
]
Dog = object:AllOf=[Pet, object:{ barks: bool }]
Cat = object:AllOf=[Pet, FelineMixin, object:{ licks: bool }]

Now the interpretation of the above is a bit gnarly, even in a single-parent inheritance language:

  • Should Dog have a pet_type field? Or property / getter? Should it be fixed as "Dog" in the serializer?
    • I think given Dog is a Pet and the only valid Pet has "dog", the answer ideally would be "no field", "readonly getter" (but I haven't looked at what kiota does here)
  • Should Cat inherit from Pet? What about FelineMixin?
    • I assume Kiota has Cat inherit from Pet, but I haven't checked what rules it uses.

In any case, I assume kiota must have some default representation of this object in its internal CodeDom representation, but I haven't dug into this too deeply.


We'd then need to work out how to map this to Rust. Personally, I'd propose the following, which would solve some pain points we have with the open api generator we are currently using.

My vision would be roughly:

  • To have concrete structs for each leaf in the hierarchy... and probably each internal node too
  • To also output enums for any object with a discriminator
  • Add some ability to interpret descendants as their ancestors in the type hierarchy (by using trait objects and From)
enum PetType {
    Dog,
    Cat,
}

enum Pet {
    Dog(Box<Dog>),
    Cat(Box<Cat>),
}

struct Dog { // Implicitly has `pet_type: PetType::Dog`
    pub name: String,
    pub barks: bool,
}

struct Cat { // Implicitly has `pet_type: PetType::Cat`
    pub name: String,
    pub purrs: bool,
    pub licks: bool,
    pub properties: IndexMap<String, Gift>,
}

struct Gift {
    pub name: String,
}

// And then From implementations for the type hierarchy
impl From<Dog> for Pet { /* ... */ }
impl From<Dog> for AnimalMixin { /* ... */ }
impl From<Cat> for Pet { /* ... */ }
impl From<Cat> for AnimalMixin { /* ... */ }
impl From<Cat> for FelineMixin { /* ... */ }

// Traits representing the type hierarchy; some assorted ones below. Every type could include a trait conceptually.
// For each property `x`:
// * There are `x_ref` and `x_mut` allowing access to a shared or unique reference (naming is in keeping with the `AsRef` / `AsMut` traits)
// * There is a `set_x` method to allow the use of a builder pattern.
// * Sometimes (for types implementing copy: value types or Open API enums) we also implement `get_x` which returns an owned type.
// Additional properties are exposed with a separate `additional_properties_ref` / `additional_properties_mut`
// 
// Note - for each type we have an associated object safe trait, `AsX`, and a full trait `IsX` (names TBC):
// See https://doc.rust-lang.org/reference/items/traits.html#object-safety
trait AsAnimalMixin {
    fn name_ref(&self) -> &str;
    fn name_mut(&mut self) -> &mut String;
    fn set_name(&mut self, value: String) -> &mut Self;
}
trait IsAnimalMixin: AsAnimalMixin + Into<AnimalMixin> {}
trait AsPet: AsAnimalMixin {
    fn get_pet_type(&self) -> PetType;
    fn pet_type_ref(&self) -> &PetType; // We may need to have a static PetType variant to allow this to work for Cat/Dog
    // NOTE: The following would panic on a discriminated child of Pet e.g. a Cat/Dog which can't mutate PetType.
    fn set_pet_type(&mut self, value: PetType) -> &mut Self;
    fn pet_type_mut(&mut self) -> &mut PetType;
}
trait IsPet: AsPet + IsAnimalMixin + Into<Pet> {}
trait AsFelineMixin {
    fn get_purrs(&self) -> bool;
    fn set_purrs(&mut self, value: bool) -> &mut Self;
    fn purrs_ref(&self) -> &bool;
    fn purrs_mut(&mut self) -> &mut bool;

    fn additional_properties_ref(&self) -> &IndexMap<String, Gift>;
    fn additional_properties_mut(&mut self) -> &mut IndexMap<String, Gift>;
}
trait IsFelineMixin: AsFelineMixin + Into<FelineMixin> {}

// And the trait impls would be as follows:
// * X implements AsX and IsX
// * If X is an ancestor of Y then X implements AsY, and IsY... and Y implements From<X>.
// e.g. for Dog:
impl AsAnimalMixin for Dog { /* ... */ }
impl From<Dog> for AnimalMixin { /* ... */ } // Gives Dog: Into<AnimalMixin>
impl IsAnimalMixin for Dog {}
impl AsPet for Dog { /* ... */ }
impl From<Dog> for Pet { /* ... */ }
impl IsPet for Dog {}
impl AsDog for Dog { /* ... */ }
impl IsDog for Dog {}

API Extensibility and Completeness

For completeness, we may also need some struct representation of each intermediate "class" in the hierarchy, such as the "Pet" class without the discriminator.

Even with a discriminator on an OAS type, such a base type is typically used as a fallback by OOP OAS generators in the case where a discriminator doesn't match. This is important for ensuring an Open API client stays compatible with certain kinds of schema extension, allowing API writers to add new variants in particular places without failing the whole request decoding.

// This fits inside the hierarchy under Pet
struct PetBase {
    pub name: String,
    pub pet_type: PetType,
}

trait AsPetBase: AsAnimalMixin {
    fn get_pet_type(&self) -> PetType;
    fn set_pet_type(&mut self, value: PetType) -> &mut Self;
}
trait IsPetBase: AsPetBase + IsAnimalMixin + Into<PetBase> {}

// We'd then wish to modify Pet as follows:
enum Pet {
    Dog(Box<Dog>),
    Cat(Box<Cat>),
    Unmatched(Box<PetBase>),
}

// And we modify IsPet and AsPet to inherit from IsPetBase instead of IsAnimalMixin
// And we add various impls for IsPetBase and AsPetBase...
impl AsPetBase for PetBase { /* .. */ }
impl AsPetBase for Pet { /* .. */ }
impl AsPetBase for Dog { /* .. */ }
impl AsPetBase for Cat { /* .. */ }

Although in this case, because we are using an enum as a discriminator, this extension wouldn't actually be beneficial, because we can't add a new enum variant without it being a breaking change to the API schema.

So, if we change Pet to use pet_type: PetTypeExtensible (or pick up on some Open API extension which suggests enums are extensible), then I'd expect to generate something like the following. This would allow a new pet such as a "Hamster" to be decoded as a Pet::Unmatched(PetBase { pet_type: PetTypeExtensible::Other("hamster"), .. }) without breaking the rest of the request deserialization:

enum PetTypeExtensible {
    Dog,
    Cat,
    Other(String),
}

// This fits inside the hierarchy under Pet
struct PetBase {
    pub name: String,
    pub pet_type: PetTypeExtensible,
}

trait AsPetBase: AsAnimalMixin + Into<PetBase> {
    fn get_pet_type(&self) -> PetTypeExtensible;
    fn set_pet_type(&mut self, value: PetTypeExtensible) -> &mut Self;
}

NOTE: This could also conceivably be allowed as a general option in the schema generation, something like enumUnknownDefaultCase in this comment, to allow for extensible enums.

Unions (oneOf, anyOf)

  • AnyOf / OneOf with a discriminator would also support an enum - but wouldn't support inheritance through it.
  • AnyOf / OneOf without a discriminator may likely need to use some kind of struct, with Option<Box<X>> for each of the different options which matched. This would depend on how these are represented/supported in kiota - support for these based on structural validity is one of a number of areas of the Open API spec which is ugly and non-performant to support / be spec-compliant with.
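A sketch of what such an undiscriminated wrapper might look like (type and field names are purely illustrative, not kiota's actual representation):

```rust
// Candidate schemas of a hypothetical AnyOf of Cat | Dog.
struct Cat { licks: bool }
struct Dog { barks: bool }

// Wrapper: each alternative that structurally matched during parsing
// gets populated; a strict OneOf would hydrate at most one slot.
struct CatOrDog {
    cat: Option<Box<Cat>>,
    dog: Option<Box<Dog>>,
}

fn main() {
    // A payload that structurally matched only the Dog schema:
    let parsed = CatOrDog { cat: None, dog: Some(Box::new(Dog { barks: true })) };
    assert!(parsed.cat.is_none());
    assert!(parsed.dog.map(|d| d.barks).unwrap_or(false));
    let _ = Cat { licks: true }; // keep Cat constructible in the sketch
}
```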

Required and Nullable

  • A not-required field should map to Option<X>
  • Nullable should map to kiota_abstractions::Nullable<X>, with options Null and NotNull which should implement many similar things to Option.

A non-required, nullable field would be Option<Nullable<X>>. Ideally we could implement various traits on Option<Nullable<X>> to let you easily create and as_ref / as_mut the value.
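A minimal sketch of what such a Nullable type could look like (this is an assumption about a hypothetical kiota_abstractions API, not an existing one):

```rust
// Distinguishes an explicit JSON `null` (Null) from a present value
// (NotNull); an absent, non-required field is then Option<Nullable<T>>.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Nullable<T> {
    Null,
    NotNull(T),
}

impl<T> Nullable<T> {
    // Mirror a couple of Option-style helpers.
    pub fn as_ref(&self) -> Nullable<&T> {
        match self {
            Nullable::Null => Nullable::Null,
            Nullable::NotNull(v) => Nullable::NotNull(v),
        }
    }
    pub fn into_option(self) -> Option<T> {
        match self {
            Nullable::Null => None,
            Nullable::NotNull(v) => Some(v),
        }
    }
}

fn main() {
    // The three wire states: absent, present-but-null, present-with-value.
    let absent: Option<Nullable<i32>> = None;
    let null: Option<Nullable<i32>> = Some(Nullable::Null);
    let value: Option<Nullable<i32>> = Some(Nullable::NotNull(7));
    assert!(absent.is_none());
    assert_eq!(null, Some(Nullable::Null));
    assert_eq!(value.and_then(Nullable::into_option), Some(7));
}
```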

Derives

  • Structs representing Open API objects can implement Clone, Debug, Eq, PartialEq... and I guess I'd consider Ord, PartialOrd, Hash
  • Enums representing Open API discriminators can implement Clone, Debug, Eq, PartialEq... and I guess I'd consider Ord, PartialOrd, Hash
  • Enums representing Open API enums can implement Clone, Copy, Debug, Eq, PartialEq, Ord, PartialOrd, Hash

We could also have Default - but only if all its children support Default (and an enum can only derive Default by designating a default variant) - this has been broken in the past in the Open API generator.
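As a quick check of the Default caveat: a struct can derive Default only if every field's type implements it, and (since Rust 1.62) an enum can derive it only by marking a unit variant with #[default]:

```rust
// Derivable: i32 and String both implement Default.
#[derive(Default, Debug, PartialEq)]
struct AllDefaultable {
    count: i32,
    name: String,
}

// An enum needs an explicitly designated default variant.
#[derive(Default, Debug, PartialEq)]
enum Mode {
    #[default]
    A,
    B,
}

fn main() {
    assert_eq!(
        AllDefaultable::default(),
        AllDefaultable { count: 0, name: String::new() }
    );
    assert_eq!(Mode::default(), Mode::A);
    let _ = Mode::B; // keep the non-default variant constructible
}
```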

Constructors

I'd be tempted to not have constructors and just make all fields pub. Particularly as Rust doesn't support named parameters like C#.

The Open API generators having default parameters in constructors caused us lots of pain when models changed, and we were missing new fields, or providing values for the wrong fields.

Optional parameters and Defaults

The main issue with this approach may be frustration with elegantly constructing objects with optional parameters, when the object also has one or more required parameters... Although this is a general rust problem when at least one of the fields is not default-able.

You could consider something like default_with_v1, but it requires adding this extra model for the required fields. Another option would be default_with_v2 which sacrifices some type safety / visibility of the field names (by hiding them in the constructor) but gives better code ergonomics:

enum MyEnum {
    A,
    B,
}

struct MyExampleModel {
    optional_1: Option<i32>,
    required_1: i32,
    optional_2: Option<i32>,
    required_2: MyEnum,
    required_3: MyEnum,
}

struct MyExampleModelDefaultRequired {
    required_2: MyEnum,
    required_3: MyEnum,
}

impl MyExampleModel {
    fn default_with_v1(required: MyExampleModelDefaultRequired) -> Self {
        let MyExampleModelDefaultRequired { required_2, required_3, } = required;
        Self {
            optional_1: Default::default(),
            required_1: Default::default(),
            optional_2: Default::default(),
            required_2,
            required_3,
        }
    }

    fn default_with_v2(required_2: MyEnum, required_3: MyEnum) -> Self {
        Self {
            optional_1: Default::default(),
            required_1: Default::default(),
            optional_2: Default::default(),
            required_2,
            required_3,
        }
    }
}

fn my_function() -> MyExampleModel {
    MyExampleModel {
        optional_1: Some(1),
        ..MyExampleModel::default_with_v1(MyExampleModelDefaultRequired {
            required_2: MyEnum::A,
            required_3: MyEnum::B,
        })
    }
}

Integer types

I think it makes sense to keep with the interpretation of format from the Open API specification

We should use the following:

  • type: integer, format: int32 => i32
  • type: integer, format: int64 => i64
  • type: integer, no format => i64 EDIT: Kiota says this is i32
  • type: number, format: float => f32
  • type: number, format: double => f64
  • type: number, no format => f64 EDIT: Kiota sets this to f32

There is a question about whether to use the minimum / maximum ranges to further restrict the type (to e.g. i8 / u8 / u32 / u64). For example, if we know that 0 <= x <= 255, we could fit it in a u8, or know it's >=0 and int32, arguably it's a little nicer to stick it into a u32 rather than an i32.

For me, if I were designing things from the ground up, my answer would be "yes, be intelligent here" - but I assume we should go with what kiota's policy is on this, if there is one.

@baywet (Member) commented Apr 19, 2024

Boxing

What do you consider large structs? Lots of symbols? Or holding a lot of data? (e.g. a property is an array/collection/map with lots of entries)
Could you provide examples to illustrate the difference in approaches?

Iteration order

The client MUST maintain the order of the collections it receives from the service. It also MUST serialize collections in the same order they were provided. While it's not common, order matters to a lot of services/applications for the business logic. (e.g. what about an API that returns race finalists, but without providing a numerical position or a time to complete the race?)

Inheritance

(and composed types)

This is something we've already given a lot of thoughts, there are still rough edges in some scenarios but effectively:

  • OneOf: exclusive union, results in wrapper types with only one member at a time that can be hydrated for languages that can't express exclusive unions.
  • AnyOf: inclusive union, results in wrapper types with multiple members at a time that can be hydrated for languages that can't express inclusive unions.
  • AllOf: (1 inline + 1 schema ref) inheritance
  • AllOf: (more than one inline schema, or more than one schema ref), mushed into a new type, although this one is not properly implemented yet. allOf - Properties not generated - C# #4346 and associated issues.

Discriminators are supported in all of the oneOf/anyOf/allOf scenarios implemented today, with a caveat that only one level will be supported (kiota clients don't recursively walk the type graph at runtime as it could become really expensive)
Any of those cases with just one member type will be "squashed".

All of that is represented through the code DOM for you already (inheritance is code class + parent in the start block, oneOf/anyOf are code union/code exclusion type that can be converted to a wrapper class at refiner stage, intersection types will be code classes)

In your cat and dog example, why wouldn't you have a base/abstract class "pet" that holds the name and type properties instead of an enum? (I know nothing about Rust at this point, sorry if the question is stupid)
Ruby also has the notion of mixins; when I started working on that with a couple of interns, we first tried to go down that route. And at least for Ruby, the fact that those concepts didn't "obviously" map to traditional OOP concepts made things harder. We ended up using the mixins to model inheritance and implementations and other aspects. Not sure it was the best solution, but so far the people who have tried it have not complained about that aspect.

API Extensibility and Completeness

Pretty much where I was going with my previous comments. One important piece of information you might not have about kiota: we left the discriminators as strings for this very problem. As adding a new member type (inheritance or other scenarios) to the "discriminated types collection" would have been a breaking change for some languages (think type assertions, switches, etc... type of scenarios where if you exhaust the possibilities, the compiler doesn't require a default, but if a new possibility gets introduced, the compiler will yell at you)
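The effect of keeping discriminators as strings can be sketched in Rust: matching on a string forces a fallback arm, so a new server-side type degrades gracefully instead of breaking exhaustive matches (names are illustrative, not kiota's generated shape):

```rust
// An "extensible" discriminated type: unknown discriminator values land
// in a catch-all variant rather than failing deserialization.
#[derive(Debug, PartialEq)]
enum Pet {
    Dog,
    Cat,
    Unmatched(String),
}

fn from_discriminator(value: &str) -> Pet {
    match value {
        "dog" => Pet::Dog,
        "cat" => Pet::Cat,
        // Future variants added by the API land here without breaking clients.
        other => Pet::Unmatched(other.to_string()),
    }
}

fn main() {
    assert_eq!(from_discriminator("dog"), Pet::Dog);
    assert_eq!(from_discriminator("hamster"), Pet::Unmatched("hamster".to_string()));
}
```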

Unions (oneOf, anyOf)

I think I've talked about that in my previous comments, let me know if there are further questions/comments.

Required and Nullable

Please read #3911

Derives

I'm guessing implementing those enables equality comparisons, sorts, etc? In dotnet and other languages, they are usually an additional interface to implement on a different type (string has StringEqualityComparers and StringOrderComparers for example). We haven't done any generation in that regard to date and didn't get any request for it. I'd put that in the nice-to-have category at the moment and focus on the other concepts first.

Constructors

Used for models to set default values for properties and to set the backing store + additional data: we need to make sure the model will be functional; if there are alternatives to achieve the same goal, we can leverage those instead.
Used for request builders to set some of the required properties (request adapter, uri template, etc...): this is a good guard rail to minimize the generated code and ensure people will end up with a functional request adapter.

Optional and defaults

Again #3911 and #3398

Integer types

I'd align those mappings with the other languages, otherwise maintenance is going to be a nightmare. Please don't quote the swagger reference or @darrelmiller is going to get mad 🤣
After we implemented the types mapping here, I did some work to help formalize it to the registry, so kiota doesn't implement everything there, and discussions revealed a bit of drift we'll correct in another major version.

Corrected table:

  • type: integer, format: int32 => i32
  • type: integer, format: int64 => i64
  • type: integer, no format => i32
  • type: number, format: float => f32
  • type: number, format: double => f64
  • type: number, no format => f32

But the generator does that for you automatically to the code dom. What I suggest is that you look at the parsenode/serialization writer interfaces to find the closest mappings.

I hope that answers a lot of your questions, don't hesitate to follow up!

@dhedey commented Apr 19, 2024

@baywet First off - massive thank you on sharing your thoughts / links. Super helpful!

Will reply in chunks.

Boxing

What do you consider large structs? lots of symbols? or hold a lot of data? (e.g. a property is an array/collection/map with lots of entries)

Sorry, I'm not being very precise. When I say "large struct" I loosely mean a large, possibly nested structure which lives as one unit in memory. In Rust, any type without allocation/indirection lives like this. Boxes are explicit, and a lot of Rust's speed comes from avoiding allocations; data generally lives on the stack unless it is explicitly moved into a box or belongs to a heap-allocated structure such as a Vec (think C# List) or some kind of map.

But if you nest lots of Rust types together naively (without boxes), you can end up with a really large item, which can cause stack overflow (each stack frame needs to allocate space for it, particularly in debug builds, which don't optimise away moves/copies).

To get around this risk of stack overflow with very large types, you need to introduce boxes (or equivalent stack allocation) somewhere.

Could you provide examples to illustrate the difference in approaches?

Consider a model like this:

struct MyParentItem { 
   a: MyMassiveItem,
   b: MySumTypeItem,
}
struct MyMassiveItem {
   property_1: MyMiniItem,
   //...
   property_1000: MyMiniItem,
}
struct MyMiniItem {
    a: u64,
}
enum MySumTypeItem {
   A(MyParentItem),
   B(MyMassiveItem),
   C(MyMiniItem),
   None,
}

OPTION 1 - Enum variants only

Rust requires boxes to support recursive types (in the above example, MyParentItem > MySumTypeItem form a recursive loop) - so we'd need some boxes. The first option is to add them just for enum variants, e.g. same as above, except:

enum MySumTypeItem {
   A(Box<MyParentItem>),
   B(Box<MyMassiveItem>),
   C(Box<MyMiniItem>),
   None,
}

The drawback is that if you never have any enums, and have a really large request/response built purely out of structs, then nothing gets boxed and you end up with lots of stack usage. I don't think it would be a problem in practice in most cases though.
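A self-contained version of Option 1 (collapsed to a couple of fields so it compiles standalone) shows that boxing only the enum variants is enough to break the recursive loop:

```rust
// Self-contained sketch of Option 1: only enum variants are boxed.
// Boxing the variants breaks the MyParentItem -> MySumTypeItem -> MyParentItem
// cycle, so both types have a finite, known size.
struct MyParentItem {
    a: MyMiniItem,
    b: MySumTypeItem,
}

struct MyMiniItem {
    a: u64,
}

#[allow(dead_code)]
enum MySumTypeItem {
    A(Box<MyParentItem>),
    C(Box<MyMiniItem>),
    None,
}

fn main() {
    // Recursive construction works; each level of nesting adds one
    // heap allocation rather than growing the stack footprint.
    let nested = MyParentItem {
        a: MyMiniItem { a: 1 },
        b: MySumTypeItem::A(Box::new(MyParentItem {
            a: MyMiniItem { a: 2 },
            b: MySumTypeItem::None,
        })),
    };
    match nested.b {
        MySumTypeItem::A(parent) => println!("nested parent a = {}", parent.a.a),
        _ => println!("no nesting"),
    }
}
```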

OPTION 2 - Enum variants + "large structs"

(for some definition of large thing)

So, same as Option 1, except we'd also add a box here:

struct MyParentItem { 
   a: Box<MyMassiveItem>,
   b: MySumTypeItem,
}

This is roughly how a human might solve the problem, adding manual boxes in places to divide up large types where it makes sense. The problem is that the choice would be quite arbitrary and cause box churn across the whole code-base, as stably deciding which types to box based on how large they are isn't particularly solvable.

OPTION 3 - Box all non-built-in-types

Simplest to reason about, but the code is less idiomatic, with boxes everywhere and a bit more boilerplate to work with.

struct MyParentItem { 
   a: Box<MyMassiveItem>,
   b: Box<MySumTypeItem>,
}
struct MyMassiveItem {
   property_1: Box<MyMiniItem>,
   //...
   property_1000: Box<MyMiniItem>,
}
struct MyMiniItem {
    a: u64,
}
enum MySumTypeItem {
   A(Box<MyParentItem>),
   B(Box<MyMassiveItem>),
   C(Box<MyMiniItem>),
   None,
}

Personal reflections

I think offering Option 1 as the default, possibly with a break-glass configuration choice for Option 3 if people want it, might be best.
On reflection, I think Option 2 would be too complicated and hard to define/reason about, and would end up with too much churn as an API evolves.

@dhedey
Copy link

dhedey commented Apr 19, 2024

A few more replies:

Iteration order

The client MUST maintain the order of the collections it receives from the service. It also MUST serialize collections in the same order they were provided. While it's not common, order matters to a lot of services/applications for the business logic. (e.g. what about an API that returns race finalists, but without providing a numerical position or a time to complete the race?)

I totally agree with you. That just means we will need to pull the ecosystem-standard indexmap library into kiota-abstractions, because there's no such map in Rust's core/std.
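To make the trade-off concrete, here's a minimal std-only sketch: std's HashMap makes no ordering guarantee, so an order-preserving structure is needed. A Vec of pairs stands in here for illustration; the real abstraction would likely wrap indexmap's IndexMap:

```rust
use std::collections::HashMap;

fn main() {
    // std's HashMap makes no guarantee about iteration order at all...
    let mut unordered = HashMap::new();
    unordered.insert("third", 3);
    unordered.insert("first", 1);

    // ...so an order-preserving structure is needed. A Vec of pairs is the
    // simplest std-only stand-in; indexmap::IndexMap is the ecosystem standard.
    let mut ordered: Vec<(&str, i32)> = Vec::new();
    ordered.push(("gold", 1));
    ordered.push(("silver", 2));
    ordered.push(("bronze", 3));

    // Serializing in insertion order is now guaranteed.
    for (k, v) in &ordered {
        println!("{} => {}", k, v);
    }
}
```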

Inheritance etc

This is something we've already given a lot of thoughts, there are still rough edges in some scenarios but effectively:

Yes, I can imagine it's received a lot of thought!

In your cat and dog example, why wouldn't you have a base/abstract class "pet" that holds the name and type properties instead of an enum? (I know nothing about Rust at this point, sorry if the question is stupid)

Rust doesn't have classes or inheritance at all! The closest we get is traits (think interfaces). Types (structs, enums) can implement many traits, and traits can have dependencies on other traits, a bit like a class hierarchy.

So the Rust way of representing inheritance would be to create the hierarchy in traits, and then have concrete structs implementing those traits. Sounds quite a lot like your Ruby project!

This works fine conceptually, but it's also really important to be able to match on an exhaustive list of children. That's why you need the enums too. Enums are much more bread-and-butter Rust than complex trait hierarchies.

Any type with a discriminator allows us to have it appear as an enum (e.g. Pet), as well as a struct (e.g. PetBase). The enum variant would be most useful / idiomatic I think.
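A hedged sketch of that dual representation for the cat/dog example (all names here are illustrative, not generated Kiota code):

```rust
// Shared "base" fields live in a struct; shared behaviour in a trait;
// the exhaustive set of concrete kinds in an enum. Names are illustrative.
struct PetBase {
    name: String,
}

#[allow(dead_code)]
struct Cat {
    base: PetBase,
    lives_left: u8,
}

#[allow(dead_code)]
struct Dog {
    base: PetBase,
    good_boy: bool,
}

// The trait plays the role an abstract base class would in OOP languages.
trait PetLike {
    fn name(&self) -> &str;
}

impl PetLike for Cat {
    fn name(&self) -> &str { &self.base.name }
}

impl PetLike for Dog {
    fn name(&self) -> &str { &self.base.name }
}

// The enum gives exhaustive matching over the known children,
// which is the more idiomatic Rust surface for callers.
enum Pet {
    Cat(Cat),
    Dog(Dog),
}

fn describe(pet: &Pet) -> String {
    match pet {
        Pet::Cat(c) => format!("{} is a cat", c.name()),
        Pet::Dog(d) => format!("{} is a dog", d.name()),
    }
}

fn main() {
    let pet = Pet::Dog(Dog {
        base: PetBase { name: "Rex".to_string() },
        good_boy: true,
    });
    println!("{}", describe(&pet)); // prints "Rex is a dog"
}
```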

As for how it maps to the CodeDom... the CodeDom sounds very OOP-centric, so we might have to look at whether it's possible/sensible to refine away the OOP; or it might make sense to carve out at a slightly lower / more intermediate level in the processing pipeline (I don't imagine this would be necessary, and obviously we should explore the CodeDom first).

AllOf: (more than one inline schema, or more than one schema ref), mushed into a new type, although this one is not properly implemented yet. #4346 and associated issues.

In some ways the beauty of the Rust approach is that, by not being strictly tied to inheritance, you can support multiple parents just fine.

Might it be permissible to extend the CodeDom to provide hints of multiple parents so that Rust could handle this? Or would that be too complicated?

API Extensibility and Completeness

Pretty much where I was going with my previous comments. One important piece of information you might not have about kiota: we left the discriminators as strings for this very problem.

Interesting! I guess this makes sense.

Derives

These all come for free via Rust's macro system... We just need to decide which ones we want.
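For illustration, a single derive attribute opts a model into these behaviours (the type here is a made-up example, not generated code):

```rust
// Opting into equality, ordering, hashing, cloning and debug printing
// is a one-line attribute on the type; no separate comparer types
// are needed as in .NET.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
struct ModelId {
    tenant: u32,
    item: u32,
}

fn main() {
    let a = ModelId { tenant: 1, item: 2 };
    let b = a.clone();
    assert_eq!(a, b); // PartialEq/Eq
    let mut ids = vec![ModelId { tenant: 2, item: 0 }, a];
    ids.sort(); // Ord (field order: tenant, then item)
    println!("{:?}", ids[0]); // prints ModelId { tenant: 1, item: 2 }
}
```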

Required and Nullable

Please read #3911

I give my full support to handling required + not nullable. We could have it so that required + not nullable would map to X and everything else to Option<X>.

That said, the Rust philosophy is to be as explicit as possible in the type system, so the most idiomatic approach would be to have distinct types X / Option<X> / Nullable<X> / Option<Nullable<X>>, with trait methods on Option<Nullable<_>> to make it easy to use. But it sounds like this isn't very Kiota, and consistency with the rest of Kiota sounds pretty important.

I'm just not sure how we'd allow people to distinguish {} from { field: null } without it (as per #3846 (comment)). While we could theoretically implement a "backing bag" approach which would handle required/non-required, built naively its interface would be very unnatural in Rust (since there are no getters/setters, it would have to be method based).

When I'm building a Rust API, I'd want it to be compiler-guaranteed that I've set all required properties on a response... Any solution which didn't do that would be a blocker.

I think we'd need to explore this area further.
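One hypothetical, std-only sketch of how the {} vs { field: null } distinction could be modeled in the type system (this is not a Kiota type, purely an illustration of the idea):

```rust
// A hypothetical three-state field value, distinguishing a property that
// was absent from the payload ({}), explicitly null ({ "field": null }),
// and present with a value. Illustrative only; not a Kiota type.
#[derive(Debug, PartialEq)]
enum FieldValue<T> {
    Absent,
    Null,
    Present(T),
}

impl<T> FieldValue<T> {
    /// Collapse to an Option when the absent/null distinction doesn't matter.
    fn into_option(self) -> Option<T> {
        match self {
            FieldValue::Present(v) => Some(v),
            _ => None,
        }
    }
}

fn main() {
    let from_empty_object: FieldValue<String> = FieldValue::Absent;
    let from_explicit_null: FieldValue<String> = FieldValue::Null;

    // A serializer could now tell the two cases apart...
    assert_ne!(from_empty_object, from_explicit_null);
    // ...while callers who don't care can flatten both to None.
    assert_eq!(from_empty_object.into_option(), None);
    println!("ok");
}
```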

@dhedey
Copy link

dhedey commented Apr 19, 2024

Integer types

I'd align those mappings with the other languages, otherwise maintenance is going to be a nightmare. Please don't quote the swagger reference or @darrelmiller is going to get mad 🤣

🤣. Sorry Darrel! Haha I feel the pain. One of my frustrations with Open API is how bad/inconsistent the docs are (the swagger ones are in the 🙈 category) - and sadly the better docs are often buried further down search engine results! I will bookmark the registry.

And noted on the mapping of no format => i32 / f32.


Right, I think that's it for now 👍. Thanks again for your reflections / input.

It'll be parked from my/our side for a while. I might find time to do some out-of-hours work on this over the next few months (but personal life's about to get rather hectic, so that might be optimistic) and we might be able to come back to it with dedicated in-work-hours developer time in a few months, though that wouldn't be before June/July time at the earliest.

In the meantime, I encourage anyone else from the community looking at this thread to get involved / pick up / share the gauntlet - I just wanted to start the scoping work so that we could have a bit of a shared understanding of what an implementation would look like.

Actually, one final question @baywet regarding where this mono-repo should live -- would it be best for Microsoft to create a microsoft/kiota-rust repository base (including licenses, codeowners, etc.)? Or should I/we/someone else start on things first, and then once you see it begin, it can be moved across to Microsoft later?

@baywet
Copy link
Member

baywet commented Apr 22, 2024

Might it be permissible to extend the CodeDom to provide hints of multiple parents so that Rust could handle this? Or would that be too complicated?

Interestingly, this would make supporting allOf and its different use cases much simpler.
Generally speaking, we try to avoid adding anything language-specific to the DOM at the Kiota builder stage; that's what the refiners are for. If you need additional elements in the DOM so the refiner can "augment" the context for a given language, no issue with that.
Arguably, inheritance support is language-specific and should be done at the refiner stage. But it represents a lot of code that's been placed in the builder for historical reasons, and I'd rather not get into a huge refactoring at this point.
TODO ask about fields duplication in inheritance scenario

These all come from free by Rust's macro system... We just need to decide which ones we want.

Could we start with none of them for now then? And let user demand drive things here?

though that wouldn't be before June/July time at the earliest

No rush! We really appreciate the discussion here, as it'll help everyone who has interest in rust + kiota to get a better understanding of the task at hand!

regarding where this mono-repo should live

The approach we took on Dart with @ricardoboss was to have the mono-repo under his own org, and when things start to finalize, we'll move it under Microsoft. There are a lot of reasons for that, including security and compliance requirements on our end that I'd rather not have in our way as we iterate rapidly at the beginning (effectively, we can't have non-MSFT GitHub handles with direct permissions on a Microsoft-owned repo, which means any initial configuration you need done would have to go through an employee...). Does that work for you?

@baywet
Copy link
Member

baywet commented Apr 22, 2024

An option I forgot to mention for the repo here and for @ricardoboss is that we can temporarily put the repos under https://github.com/kiota-community/ for now as well.
