[Security Solution] Generate artefacts (Zod schemas + TS types) in a proper order (#187044)

**Closes:** #182928

## Summary

This PR modifies `kbn-openapi-generator` to produce Zod schemas and related TS types for shared schemas in `components.schemas` in a valid order, independent of the order in which the schemas are defined in the OpenAPI spec.

## Details

The current `kbn-openapi-generator` implementation generates Zod schemas and related TS types for an OpenAPI spec file's `components.schemas` in a straightforward way: it repeats the schema definition order, which can lead to cases where a dependent artefact is defined before its dependencies. Engineers have to manually order schemas in OpenAPI spec files to make sure the produced artefacts are valid TS files, even though the OpenAPI specification doesn't impose such a requirement.

To illustrate the problem, consider the following OpenAPI spec:

```yaml
...
components:
  x-codegen-enabled: true
  schemas:
    MainSchema:
      type: object
      properties:
        fieldA:
          $ref: '#/components/schemas/DepA'
          nullable: true
        fieldB:
          $ref: '#/components/schemas/DepB'
        fieldC:
          $ref: '#/components/schemas/DepC'
    DepB:
      type: boolean
    DepC:
      type: integer
    DepA:
      type: string
```

Running code generation for the spec above produces the following artefacts:

```ts
import { z } from 'zod';

export type MainSchema = z.infer<typeof MainSchema>;
export const MainSchema = z.object({
  fieldA: DepA.nullable().optional(),
  fieldB: DepB.optional(),
  fieldC: DepC.optional(),
});

export type DepB = z.infer<typeof DepB>;
export const DepB = z.boolean();

export type DepC = z.infer<typeof DepC>;
export const DepC = z.number().int();

export type DepA = z.infer<typeof DepA>;
export const DepA = z.string();
```

which is not valid, since `MainSchema` reads the `DepA`, `DepB` and `DepC` constants before they are declared.

### After the fix

The fix takes into account that, in the general case, chains of references form a graph. When the references contain no cycles, they form a [Directed Acyclic Graph (DAG)](https://en.wikipedia.org/wiki/Directed_acyclic_graph). In graph terminology a schema is a node and a reference is an edge. [Topological sorting algorithms](https://en.wikipedia.org/wiki/Topological_sorting) order DAG nodes so that every node comes after all the nodes it has incoming edges from, starting with the nodes that have zero incoming edges. Cycles can be taken into account as well, so that the nodes which don't take part in cycles still end up sorted. References sorted this way have dependencies defined before dependent schemas.
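
As an illustration of the "start from zero incoming edges" idea, here is a minimal sketch of the queue-based variant of topological sorting (often called Kahn's algorithm). It is not the implementation from this PR; `RefsGraph`, `sortByIncomingEdges` and the `CycleA`/`CycleB` schemas are hypothetical names used only for this example, and every referenced schema is assumed to be a key of the graph with no duplicate references.

```typescript
// Hypothetical shape: schema name -> names of the schemas it references.
type RefsGraph = Record<string, string[]>;

// Queue-based topological sort: a schema is emitted once all of its
// dependencies have been emitted. Not the DFS used in this PR.
function sortByIncomingEdges(graph: RefsGraph): string[] {
  const names = Object.keys(graph);
  const unresolved: Record<string, number> = {};
  const dependents: Record<string, string[]> = {};

  for (const name of names) {
    const deps = graph[name] ?? [];
    unresolved[name] = deps.length;
    for (const dep of deps) {
      dependents[dep] = (dependents[dep] ?? []).concat(name);
    }
  }

  // Start with the schemas that have no dependencies at all.
  const queue = names.filter((name) => unresolved[name] === 0);
  const sorted: string[] = [];

  while (queue.length > 0) {
    const name = queue.shift() as string;
    sorted.push(name);
    for (const dependent of dependents[name] ?? []) {
      const left = (unresolved[dependent] ?? 0) - 1;
      unresolved[dependent] = left;
      if (left === 0) {
        queue.push(dependent);
      }
    }
  }

  return sorted;
}

const exampleGraph: RefsGraph = {
  MainSchema: ['DepA', 'DepB', 'DepC'],
  DepB: [],
  DepC: [],
  DepA: [],
  CycleA: ['CycleB'],
  CycleB: ['CycleA'],
};

// ['DepB', 'DepC', 'DepA', 'MainSchema']
console.log(sortByIncomingEdges(exampleGraph));
```

Schemas that take part in a cycle never reach zero unresolved dependencies, so this sketch simply leaves them out of the result while the remaining schemas are still sorted.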

After applying this fix, running code generation for the OpenAPI spec above produces the following:

```ts
import { z } from 'zod';

export type DepA = z.infer<typeof DepA>;
export const DepA = z.string();

export type DepB = z.infer<typeof DepB>;
export const DepB = z.boolean();

export type DepC = z.infer<typeof DepC>;
export const DepC = z.number().int();

export type MainSchema = z.infer<typeof MainSchema>;
export const MainSchema = z.object({
  fieldA: DepA.nullable().optional(),
  fieldB: DepB.optional(),
  fieldC: DepC.optional(),
});
```

### Notes

- The implementation preserves the original schema order when possible.
  Generally speaking, topological sorting doesn't define the relative order within a group of elements that have the same number of incoming edges (dependencies in our case), so the resulting order can vary. **To reduce the diff, the topological sorting implementation in this PR preserves the original schema order when possible**. When sorting is necessary, schemas are placed in dependency discovery order. In the OpenAPI spec above the schemas are ordered `DepB`, `DepC`, `DepA`, while the produced TS file has them ordered `DepA`, `DepB`, `DepC`, since that is the order in which the dependencies were discovered in `MainSchema`.
- There are two ways to implement topological sorting: Depth-First Search (DFS) and Breadth-First Search (BFS). This PR uses a recursive DFS implementation since it makes it possible to preserve the original order where possible and is more readable.
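
The discovery-order behaviour described in these notes can be reproduced with a small self-contained sketch of a recursive DFS. It follows the same idea as the `sortTopologically` function added in this PR but uses plain records instead of the generator's types; `RefsGraph` and `sortPreservingOrder` are hypothetical names for illustration.

```typescript
// Hypothetical shape: schema name -> names of the schemas it references.
type RefsGraph = Record<string, string[]>;

// Recursive DFS: walk schemas in their original order and emit each schema
// only after all of its dependencies have been emitted.
function sortPreservingOrder(graph: RefsGraph, originalOrder: string[]): string[] {
  const sorted: string[] = [];
  const visited = new Set<string>();

  const visit = (name: string): void => {
    if (visited.has(name)) {
      return;
    }
    visited.add(name);
    for (const dep of graph[name] ?? []) {
      visit(dep);
    }
    sorted.push(name);
  };

  for (const name of originalOrder) {
    visit(name);
  }

  return sorted;
}

// The spec from the description: dependencies are defined after MainSchema.
const specGraph: RefsGraph = {
  MainSchema: ['DepA', 'DepB', 'DepC'],
  DepB: [],
  DepC: [],
  DepA: [],
};
// ['DepA', 'DepB', 'DepC', 'MainSchema']
console.log(sortPreservingOrder(specGraph, Object.keys(specGraph)));

// An already valid order is left untouched.
const validGraph: RefsGraph = { DepA: [], DepB: [], MainSchema: ['DepB', 'DepA'] };
// ['DepA', 'DepB', 'MainSchema']
console.log(sortPreservingOrder(validGraph, Object.keys(validGraph)));
```

Marking a schema as visited before descending into its dependencies also keeps the recursion finite when references form a cycle.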
maximpn authored Jul 4, 2024
1 parent 015fd39 commit 72b2252
Showing 9 changed files with 206 additions and 33 deletions.
28 changes: 3 additions & 25 deletions packages/kbn-openapi-generator/src/parser/lib/get_circular_refs.ts
@@ -9,7 +9,8 @@
import type { OpenApiDocument } from '../openapi_types';
import type { PlainObject } from './helpers/plain_object';
import { extractByJsonPointer } from './helpers/extract_by_json_pointer';
import { findRefs } from './find_refs';
import { findLocalRefs } from './helpers/find_local_refs';
import { parseRef } from './helpers/parse_ref';

/**
* Extracts circular references from a provided document.
@@ -19,7 +20,7 @@ export function getCircularRefs(document: OpenApiDocument): Set<string> {
const localRefs = findLocalRefs(document);
const circularRefs = new Set<string>();
const resolveLocalRef = (localRef: string): PlainObject =>
extractByJsonPointer(document, extractJsonPointer(localRef));
extractByJsonPointer(document, parseRef(localRef).pointer);

// In general references represent a disconnected graph. To find
// all references cycles we need to check each reference.
@@ -80,26 +81,3 @@ function findCycleHeadRef(

return result;
}

/**
* Finds local references
*/
function findLocalRefs(obj: unknown): string[] {
return findRefs(obj).filter((ref) => isLocalRef(ref));
}

/**
* Checks whether the provided ref is local.
* Local references start with `#/`
*/
function isLocalRef(ref: string): boolean {
return ref.startsWith('#/');
}

/**
* Extracts a JSON Pointer from a local reference
* by getting rid of the leading slash
*/
function extractJsonPointer(ref: string): string {
return ref.substring(1);
}
120 changes: 116 additions & 4 deletions packages/kbn-openapi-generator/src/parser/lib/get_components.ts
@@ -6,12 +6,124 @@
* Side Public License, v 1.
*/

import type { OpenApiDocument } from '../openapi_types';
import type {
OpenApiDocument,
OpenApiComponentsObject,
OpenApiSchemasObject,
} from '../openapi_types';
import { extractByJsonPointer } from './helpers/extract_by_json_pointer';
import { findLocalRefs } from './helpers/find_local_refs';
import { parseRef } from './helpers/parse_ref';

export function getComponents(parsedSchema: OpenApiDocument) {
if (parsedSchema.components?.['x-codegen-enabled'] === false) {
/**
* Returns document components.
*
* It performs topological sorting of component schemas to enable arbitrary
* schemas definition order.
*/
export function getComponents(document: OpenApiDocument): OpenApiComponentsObject | undefined {
if (document.components?.['x-codegen-enabled'] === false) {
return undefined;
}

return parsedSchema.components;
if (!document.components) {
return;
}

const refsAdjList = buildLocalRefsAdjacencyList(document.components);
const sortedSchemaRefs = sortTopologically(
refsAdjList,
Array.from(Object.keys(document.components?.schemas ?? {}))
);
// Starting from ES2020 functions returning or traversing object properties
// make it in ascending chronological order of property creation. It makes
// it possible to assemble schemas object which will be traversed in
// the right order preserving topological sorting.
const sortedSchemas: OpenApiSchemasObject = {};

for (const schemaName of sortedSchemaRefs) {
sortedSchemas[schemaName] = extractByJsonPointer(document, `/components/schemas/${schemaName}`);
}

return {
...document.components,
schemas: sortedSchemas,
};
}

/**
* References adjacency list with keys as schema name and value
* as a set of schemas the key references to.
*/
type ReferencesAdjacencyList = Map<string, Set<string>>;

/**
* Builds a references adjacency list. An adjacency list allow to apply
* any graph algorithms working with adjacency lists.
* See https://en.wikipedia.org/wiki/Adjacency_list
*/
function buildLocalRefsAdjacencyList(
componentsObj: OpenApiComponentsObject
): ReferencesAdjacencyList {
if (!componentsObj.schemas) {
return new Map();
}

const adjacencyList: ReferencesAdjacencyList = new Map();

for (const [schemaName, schema] of Object.entries(componentsObj.schemas)) {
const dependencies = adjacencyList.get(schemaName);
const dependencySchemaNames = findLocalRefs(schema).map((ref) => parseRef(ref).schemaName);

if (!dependencies) {
adjacencyList.set(schemaName, new Set(dependencySchemaNames));
} else {
for (const dependencySchemaName of dependencySchemaNames) {
dependencies.add(dependencySchemaName);
}
}
}

return adjacencyList;
}

/**
* Sorts dependent references in topological order. Local dependencies are placed
* before dependent schemas. External references aren't involved.
* See https://en.wikipedia.org/wiki/Topological_sorting
*
* It uses Depth First Search (DFS) variant of topological sort to preserve schemas
* definition order in OpenAPI specification document. Topological sorting doesn't
* define any order for non dependent schemas. Preserving original ordering looks
* like a good option to minimize diffs and have higher result predictability.
*
* @param adjacencyList An adjacency list, e.g. built via buildLocalRefsAdjacencyList
* @param originalOrder A string array having schema names sorted in OpenAPI spec order
* @returns A string array sorting in topological way
*/
function sortTopologically(
adjacencyList: ReferencesAdjacencyList,
originalOrder: string[]
): string[] {
const sortedSchemas: string[] = [];
const visited = new Set<string>();
const addToSorted = (schemaName: string): void => {
if (visited.has(schemaName)) {
return;
}

visited.add(schemaName);

for (const dependencySchemaName of adjacencyList.get(schemaName) ?? []) {
addToSorted(dependencySchemaName);
}

sortedSchemas.push(schemaName);
};

for (const schemaName of originalOrder) {
addToSorted(schemaName);
}

return sortedSchemas;
}
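
The comment in `getComponents` relies on object property order to carry the sorted result. A minimal sketch of that behaviour follows; `reorderKeys` is a hypothetical helper and not part of the generator. It also shows one caveat worth keeping in mind: integer-like keys are always enumerated first in ascending numeric order, regardless of insertion order, so the approach assumes schema names are not integer-like strings.

```typescript
// Hypothetical helper: rebuilds an object so its string keys are
// enumerated in the given order (keys missing from `source` are skipped).
function reorderKeys<T>(source: Record<string, T>, order: string[]): Record<string, T> {
  const result: Record<string, T> = {};
  for (const key of order) {
    if (key in source) {
      result[key] = source[key] as T;
    }
  }
  return result;
}

const schemas = { MainSchema: 'object', DepA: 'string' };
// ['DepA', 'MainSchema']
console.log(Object.keys(reorderKeys(schemas, ['DepA', 'MainSchema'])));

// Integer-like keys ignore insertion order: ['1', '2', 'b']
console.log(Object.keys(reorderKeys({ b: 'x', '2': 'y', '1': 'z' }, ['b', '2', '1'])));
```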
@@ -8,7 +8,7 @@

import { uniq } from 'lodash';
import type { OpenApiDocument } from '../openapi_types';
import { findRefs } from './find_refs';
import { findRefs } from './helpers/find_refs';

export interface ImportsMap {
[importPath: string]: string[];
@@ -0,0 +1,17 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0 and the Server Side Public License, v 1; you may not use this file except
* in compliance with, at your election, the Elastic License 2.0 or the Server
* Side Public License, v 1.
*/

import { findRefs } from './find_refs';
import { isLocalRef } from './is_local_ref';

/**
* Finds local references
*/
export function findLocalRefs(obj: unknown): string[] {
return findRefs(obj).filter((ref) => isLocalRef(ref));
}
@@ -6,8 +6,8 @@
* Side Public License, v 1.
*/

import { hasRef } from './helpers/has_ref';
import { traverseObject } from './helpers/traverse_object';
import { hasRef } from './has_ref';
import { traverseObject } from './traverse_object';

/**
* Traverse the OpenAPI document recursively and find all references
@@ -0,0 +1,15 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0 and the Server Side Public License, v 1; you may not use this file except
* in compliance with, at your election, the Elastic License 2.0 or the Server
* Side Public License, v 1.
*/

/**
* Checks whether the provided ref is local.
* Local references start with `#/`
*/
export function isLocalRef(ref: string): boolean {
return ref.startsWith('#/');
}
46 changes: 46 additions & 0 deletions packages/kbn-openapi-generator/src/parser/lib/helpers/parse_ref.ts
@@ -0,0 +1,46 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0 and the Server Side Public License, v 1; you may not use this file except
* in compliance with, at your election, the Elastic License 2.0 or the Server
* Side Public License, v 1.
*/

interface ParsedRef {
uri: string;
pointer: string;
schemaName: string;
}

/**
* Parses an OpenAPI reference a.k.a JSON reference
* See https://datatracker.ietf.org/doc/html/draft-pbryan-zyp-json-ref-03
*
* JSON reference consists of an optional uri and required JSON pointer
* looking like `uri#pointer`. While RFC implies URI usage mostly relative
* paths are used.
*
* An example looks like
*
* ```
* ../path/to/my/file.schema.yaml#/components/schemas/MySchema
* ```
*
* This function returns `uri`, JSON `pointer` and
* `schemaName` which is the last part of the JSON pointer. In the example
* above `schemaName` is `MySchema`.
*/
export function parseRef(ref: string): ParsedRef {
if (!ref.includes('#')) {
throw new Error(`Reference parse error: provided ref is not valid "${ref}"`);
}

const [uri, pointer] = ref.split('#');
const schemaName = pointer.split('/').at(-1)!;

return {
uri,
pointer,
schemaName,
};
}
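
To make the three returned parts concrete, here is a self-contained sketch of the same splitting logic applied to a local and an external reference. `parseRefSketch` is a hypothetical stand-in written for this example, not the function from the diff, and it splits on the first `#` only.

```typescript
interface ParsedRefSketch {
  uri: string;
  pointer: string;
  schemaName: string;
}

// Hypothetical stand-in for parseRef: splits `uri#pointer` on the first `#`
// and takes the last pointer segment as the schema name.
function parseRefSketch(ref: string): ParsedRefSketch {
  const hashIndex = ref.indexOf('#');
  if (hashIndex === -1) {
    throw new Error(`Reference parse error: provided ref is not valid "${ref}"`);
  }

  const uri = ref.slice(0, hashIndex);
  const pointer = ref.slice(hashIndex + 1);
  const segments = pointer.split('/');
  const schemaName = segments[segments.length - 1] ?? '';

  return { uri, pointer, schemaName };
}

// Local reference: the uri part is an empty string.
// { uri: '', pointer: '/components/schemas/DepA', schemaName: 'DepA' }
console.log(parseRefSketch('#/components/schemas/DepA'));

// External reference: the uri part is the relative file path.
console.log(parseRefSketch('../path/to/my/file.schema.yaml#/components/schemas/MySchema'));
```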
5 changes: 5 additions & 0 deletions packages/kbn-openapi-generator/src/parser/openapi_types.ts
@@ -16,6 +16,11 @@ interface AdditionalProperties {
}

export type OpenApiDocument = OpenAPIV3.Document<AdditionalProperties>;
export type OpenApiComponentsObject = OpenAPIV3.ComponentsObject;
export type OpenApiSchemasObject = Record<
string,
OpenAPIV3.ReferenceObject | OpenAPIV3.SchemaObject
>;

// Override the OpenAPI types to add the x-codegen-enabled property to the
// components object.
@@ -71,7 +71,7 @@ export function registerHelpers(handlebarsInstance: typeof Handlebars) {
});

/**
* Checks whether provided schema is circular or a part of the circular chain.
* Checks whether provided schema is circular or a part of a circular chain.
*
* It's expected that `context.circularRefs` has been filled by the parser.
*/