Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Data loader integration #51

Open
MichalLytek opened this issue Mar 27, 2018 · 45 comments
Open

Feature: Data loader integration #51

MichalLytek opened this issue Mar 27, 2018 · 45 comments
Assignees
Labels
Enhancement 🆕 New feature or request
Milestone

Comments

@MichalLytek
Copy link
Owner

MichalLytek commented Mar 27, 2018

https://github.com/facebook/dataloader

  1. Caching resolving the same data
  2. Instead of normal resolving:
@FieldResolver()
author(@Root() recipe: Recipe) {
  return this.userRepository.findById(recipe.authorId);
}

Introduce batched resolving - taking array of roots:

@FieldResolver()
async author(@Root() recipes: Recipe[]) {
  const ids = recipes.map(r => r.authorId);
  return this.userRepository.findByIds(ids);
}
  1. Decorator:
@Mutation()
async mutationThatAffectLoader(
  @Arg("input") input: SomeInput,
  @DataLoader(User) loader,
) {
  const result = await this.mutateSomething(input);
  loader.clear(input.someKey);
  return result;
}
@MichalLytek MichalLytek added the Enhancement 🆕 New feature or request label Mar 27, 2018
@MichalLytek MichalLytek added this to the 1.0.0 release milestone Mar 27, 2018
@MichalLytek MichalLytek self-assigned this Mar 27, 2018
@laukaichung
Copy link

laukaichung commented Apr 29, 2018

Looking forwards to it on solving n+1 requests.

@laukaichung
Copy link

laukaichung commented Apr 30, 2018

I'm not sure how it's supposed to work.

Does @FieldResolver() automatically cache the returned result with dataloader?

@FieldResolver()
author(@Root() recipe: Recipe) {
  return this.userRepository.findById(recipe.authorId);
}

It looks like I don't need to do anything at all that involves dataloader.
If it's true, I'll wait for the integration instead of starting to use dataloader now.

@MichalLytek
Copy link
Owner Author

MichalLytek commented Apr 30, 2018

@Resolver(() => Post)
class PostResolver {
  @Query(returns => [Post])
  posts() {
    return this.manager.find(Post);
  }

  @FieldResolver()
  comments(@Root() posts: Post[]) {
    const postIds = posts.map(post => post.id);
    const comments = this.manager.find(Comment, { postId: In(postIds) });
    // grouping and providing correct order for dataloader
    return posts.map(post =>
      comments.filter(comment => comment.postId === post.id)
    );
  }
}

So executing this query:

query {
  posts {
    id
    title
    comments {
      text
    }
  }
}

will call DB two times - one for fetching posts collection, one for fetching comments collection, because comments resolver will be batched.
And yes, field resolvers will be cached, so calling the field resolver with the same parent and args data will not execute it again.

@laukaichung
Copy link

laukaichung commented Apr 30, 2018

This is great. It looks like the dataloader is working under the hood in @FieldResolver. I don't even need to call up @DataLoader(Comment) and explicitly cache the fetched comments in @FieldResolver.

@chrisdevereux
Copy link

When I've used DataLoader before, I've generally used it at the repository layer, rather than on resolvers. The main benefit of this being that it doesn't leak logic around batching into the service layer (requiring everything in the app to work on n ids rather than 1).

The main downside is that it requires me to construct services & repositories per-request. Which I can't really figure out how to do with TypeGraphQL — would require the resolvers & DI context to be constructed per request.

What do you think about solving this problem by allowing this, rather than bundling dataloader support into the resolver?

@MichalLytek
Copy link
Owner Author

It's easy to solve - create a global middleware that will create new dataloaders and attach them to context. Then you can use the loaders from context in resolver class methods.

const DataLoader: MiddlewareFn = ({context}, next) => {
  if (!context.isDataLoaderAttached) {
    context.isDataLoaderAttached = true;

    context.personLoader = new DataLoader(/* ... */);
    // ...
  }
  return next();
}

Of course you would need to provide new context object in express-graphql or graphql-yoga.

I'm not a fan of creating DI Container for each "request" as it denies the idea of single-threaded Node.js.

@ashleyw
Copy link

ashleyw commented Jul 24, 2018

Is anyone aware of a workable strategy to get dataloader integrated with type-graphql today?

@MichalLytek
Copy link
Owner Author

@ashleyw you can always fallback to the plain apollo-like resolvers strategy - create dataloaders for your app for each request (like in example above - new DataLoader) and then retrieve them using @Ctx decorator for use in your resolvers.

@ashleyw
Copy link

ashleyw commented Jul 24, 2018

@19majkel94 thanks, I'll look into it!

Is there anything you've discovered that prevents the batched resolving you posted above? That seems like it would be a sweet, almost automatic solution!

@MichalLytek
Copy link
Owner Author

@ashleyw This is how it works in @pleerock's vesper and it's quite easy to copy reimplement it in TypeGraphQL. But I am really busy right now, so bugfixing and maintance is all I can do right now 😞

@goldcaddy77
Copy link

goldcaddy77 commented Jul 27, 2018

This should get you started. Create a DataLoaderMiddleware:

import DataLoader = require('dataloader');
import { MiddlewareInterface, NextFn, ResolverData } from 'type-graphql';
import { Service } from 'typedi';

import { Context } from './context.interface';

@Service()
export class DataLoaderMiddleware implements MiddlewareInterface<Context> {
  async use({ root, args, context, info }: ResolverData<Context>, next: NextFn) {
    if (!context.dataLoader.initialized) {
      context.dataLoader = {
        initialized: true,
        loaders: {}
      };

      const loaders = context.dataLoader.loaders!;

      context.connection.entityMetadatas.forEach(entityMetadata => {
        const resolverName = entityMetadata.targetName;
        if (!resolverName) {
          return;
        }

        if (!loaders[resolverName]) {
          loaders[resolverName] = {};
        }

        entityMetadata.relations.forEach(relation => {
          // define data loader for this method if it was not defined yet
          if (!loaders[resolverName][relation.propertyName]) {
            loaders[resolverName][relation.propertyName] = new DataLoader((entities: any[]) => {
              return context.connection.relationIdLoader
                .loadManyToManyRelationIdsAndGroup(relation, entities)
                .then(groups => groups.map(group => group.related));
            });
          }
        });
      });
    }
    return next();
  }
}

Then attach when building your schema:

  const schema = await buildSchema({
    globalMiddlewares: [DataLoaderMiddleware],
    resolvers: [__dirname + '/**/*.resolver.ts']
  });

Then you can use in your FieldResolvers like so:

  @FieldResolver()
  photos(@Root() user: User, @Ctx() ctx: Context): Promise<Photo[]> {
    return ctx.dataLoader.loaders.User.photos.load(user);
  }

This still requires a bunch of boilerplate. I'd love to figure out a way to automatically create these field resolvers while also getting the schema generated and type safety.

@miroslavzeman
Copy link

@19majkel94 Hey, I would like to ask if the current approach with FieldResolver() for batching is working or not. Im trying to use the FieldResolver for batching but its not working. More in description in my code

import { Args, FieldResolver, Query, Resolver, Root } from "type-graphql";
import { Page } from "../../entity/Page/Page";
import { PageArgs } from "./PageArgs";
import { EntityManager, getManager } from "typeorm";
import { PagePart } from "../../entity/PagePart/PagePart";

@Resolver(() => Page)
class PageResolver {
  private manager: EntityManager;
  constructor() {
    this.manager = getManager();
  }

  // This query is called once and receive all pages from DB in array
  // Each page has some parts which are fetched separately using FieldResolver.
  @Query(() => [Page])
  pageAll(@Args() args: PageArgs): Promise<Page[]> {
    return this.manager.find(Page);
  }

  @FieldResolver(() => PagePart, { nullable: true })
  async pageParts(@Root() pages: Page[]) {
    // pages arg should contain an array of all fetched pages from pageAll()
    // BUT instead im receiving in pages arg a single page as object
    // Means: If I fetch 50 pages, the FieldResolver is called 50 times
    // receiving each page separately as object instead of array of 50 items.

    console.log("Field resolver called");
    return null;
  }
}

export default PageResolver;

Query example:

{
  pageAll {
    pageParts {
      id,
      name,
      content
    }
  }
}

Any help of what im doing wrong, please? Thanks!

@MichalLytek
Copy link
Owner Author

@miro4994 It's an issue, "To Do in Board", so how this is supposed to work? Is it a feature documented in docs? 😄 https://19majkel94.github.io/type-graphql/

@miroslavzeman
Copy link

Well, its morning... My mistake :D Thanks for quick response.

@joonhocho
Copy link

I've recently created BatchLoader written in TypeScript.
Compared to DataLoader, it only does batching, but does it better.
It shares identical apis (load, loadMany) with DataLoader.
It handles duplicate keys (further reduces round trips when duplicate ids are provided) and is composable to create a new loader via mapping function.
It doesn't do any caching, but can easily be intergated with any cache library.
You may want to consider it over DataLoader. :)

@MichalLytek
Copy link
Owner Author

MichalLytek commented Oct 7, 2018

it only does batching, but does it better.

What does it means "better"? Can you show an example what is better, or maybe show some benchmarks with comparison to the dataloader?

It doesn't do any caching, but can easily be intergated with any cache library.

I think that Facebook's DataLoader is a well established solution that is known to every graphql developer, so it will be the first choose for now as it has "battery" included.
Maybe in the future I will try to expose the data loading API to plugins so library consumers could choose their own baching or caching library.

@apiel
Copy link

apiel commented Oct 7, 2018

One of the issue of dataloader, is that you have to create a loader per field. So if in your query you have 2 fields that are both related to the same entity, you will have to load them 2 times.
For example, you have a table game with a column player1 and another column player2, that both are related to another table user, you will still need 2 loader, one for player1 and one for player2. Would be cool to only have one loader user.
Also another weird thing with dataloader is that it use setTimeout to trigger the loading of the data. It would be great to get rid of this and maybe find a solution where graphql library would trigger the loader, maybe with an event or a callback function.

But, I guess to start, using dataloader might be the best approch ;-)

@joonhocho
Copy link

joonhocho commented Oct 7, 2018

@19majkel94

First, batchloader takes care of duplicate keys. For example, for the following code

loader.load(1),
loader.load(2),
loader.load(1),

with BatchLoader, your batch function will get [1, 2] as keys without duplicate keys, while DataLoader gives you [1,2,1]. There is a bit of optimization to remove duplicate keys.

As @apiel noted, with BatchLoader you can map a loader to create another loader to get specific fields or chain another query.

usernameLoader = userLoader.mapLoader((user) => user && user.username);
pictureLoader = userLoader.mapLoader((user) => user && user.picture);

usernameLoader.load(userId)
pictureLoader.load(userId)

both loaders will be sharing the same userLoader's batch queue so there are only one call to userLoader's batch function with keys being [userId].

You can compose a new loader by chaining loaders as well:

chainedLoader = loader1.mapLoader(v => loader2.load(v)).mapLoader(v => ...)

@apiel, We also use setTimeout to create batch queue. I don't see why it's a problem tho. Otherwise, you will need to somehow manage the queue yourself. Do you have any reason that you don't like setTimeout? do you have a better approach or use case?

@MichalLytek
Copy link
Owner Author

MichalLytek commented Oct 8, 2018

with BatchLoader, your batch function will get [1, 2] as keys without duplicate keys, while DataLoader gives you [1,2,1]. There is a bit of optimization to remove duplicate keys.

But we are talking about the cached version:

However, when the memoization cache is disabled, your batch function will receive an array of keys which may contain duplicates!

https://github.com/facebook/dataloader#disabling-cache

with BatchLoader you can map a loader to create another loader to get specific fields or chain another query

So can you show me that in this example? How many DB calls to the tables will DataLoader produces and the BatchLoader? 😉

query {
  user(id: 12345) {
	name
	posts(last: 5) {
     title
     comments(last: 3) {
       content 
     }
  }
}

@lukejagodzinski
Copy link

lukejagodzinski commented Oct 17, 2018

I was trying to use custom Repositories from typeorm with partial success. I've done something like this and batching is working fine:

import { EntityRepository, Repository } from "typeorm";
import * as DataLoader from "dataloader";

import { User } from "../types/User";

@EntityRepository(User)
export class UserRepository extends Repository<User> {
  private loader: DataLoader<number, User> = new DataLoader(ids => {
    return this.findByIds(ids);
  });

  findById(id: number) {
    return this.loader.load(id);
  }
}

and later in the UserResolver:

import { Arg, Int, Query } from "type-graphql";
import { InjectRepository } from "typeorm-typedi-extensions";

import { User } from "../types/User";

import { UserRepository } from "../repositories/UserRepository";

export class UserResolver {
  constructor(
    @InjectRepository(User) private readonly userRepository: UserRepository,
  ) {}

  @Query(returns => User, { nullable: true })
  user(@Arg("id", type => Int) id: number) {
    return this.userRepository.findById(id); // It's using DataLoader
  }
}

So in the query like this:

query {
  user1: user(id: 1) {
    fullName
  }
  user2: user(id: 2) {
    fullName
  }
}

DataLoader will receive array of IDs [1, 2];

However, the @InjectRepository decorator will create a new UserRepository instance for each decorator invocation. So in the case like this one it will fail to reuse data loader and caching will not work:

query {
  user1: user(id: 1) {
    fullName
    posts {
      title
      user {
        fullName
      }
    }
  }
  user2: user(id: 2) {
    fullName
  }
}

as there will be two instances of repositories, one for the UserResolver and one for the PostResolver class.

So the DataLoader in the UserResolver class will receive IDs [1, 2] and later the PostResolver DataLoader will receive ID [1] and another database query will have to be made.

Do you know if there is a way of reusing injected repository? I'm new to all the DI stuff in TypeScript.
Or I have to use the context object to store data loaders?

@MichalLytek
Copy link
Owner Author

MichalLytek commented Oct 18, 2018

Do you know if there is a way of reusing injected repository? I'm new to all the DI stuff in TypeScript.
Or I have to use the context object to store data loaders?

Of course you can store dataloader in context.
You can also use scoped containers feature:
https://19majkel94.github.io/type-graphql/docs/dependency-injection.html#scoped-containers

I was trying to use custom Repositories from typeorm with partial success. I've done something like this and batching is working fine:

You can use typeorm-loader:
https://github.com/Webtomizer/typeorm-loader

@laukaichung
Copy link

laukaichung commented Nov 6, 2018

I just learnt the hard way that dataloader instance should not be shared application-wide. I'm getting similar caching problem as graphql/dataloader#65

I have to create dataloader instance per request and pass it to the Context. I'm looking for alternatives that can be used for batch requests as well as caching results.

@gotexis

This comment has been minimized.

@GilShik
Copy link

GilShik commented Jan 19, 2020

Hi @MichalLytek,

Was wondering if there is a way to inject the root array(for N+1 problem) for a query when using NestJS,
cause I'm getting only the Root object?

Thanks,
Gil

@rfgamaral
Copy link

@laukaichung Have you tried to use scoped containers with a wrapper class around a dataloader and try that instead?

@MichalLytek What would be the cons of the approach I just described?

@micnil
Copy link

micnil commented Mar 31, 2020

I'm experimenting with a custom middleware for this, basically the idea is to attach a Dataloader decorator to fields that you want to batch, and then return the batch function from that resolver:

@Resolver(Poll)
export class PollResolver {
  // ....

  @FieldResolver()
  @Dataloader()
  votes() {
    // Return the batch function
    return (polls: Poll[]) => this.batchVotes(polls);
  }

  async batchVotes(polls: Poll[]) {
    return await this.voteRepo.getVotesForPolls(polls.map(poll => poll.uuid));
  }
}

The Dataloader middleware simply creates the dataloader (if it doesn't already exist) and stores it in a scoped container. Then calls the dataloader and returns the result:

// dataloader.middleware.ts
import { createMethodDecorator, ResolverData } from 'type-graphql';
import Dataloader, { BatchLoadFn, Options } from 'dataloader';
import Context, { Service } from 'typedi';
import { ResolverContext } from '../types/resolver-context';

interface Loaders {
  [key: string]: Dataloader<any, any>;
}

@Service()
class DataloaderFactory {
  loaders: Loaders = {};

  makeLoader(id: string, batchFunction: BatchLoadFn<any, any>, options?: Options<any, any>) {
    if (!!this.loaders[id]) {
      return this.loaders[id];
    }
    const loader = new Dataloader(batchFunction, options);
    this.loaders[id] = loader;
    return loader;
  }
}

function DataloaderMiddleware<K, V>(options?: Options<K, V>) {
  return createMethodDecorator(
    async ({ root, context, info }: ResolverData<ResolverContext>, next) => {
      const dataloaderFactory = Context.of(context.requestId).get(
        DataloaderFactory
      );
      const batchFunction = await next();
      const loader = dataloaderFactory.makeLoader(
        info.parentType.toString() + '.' + info.fieldName,
        batchFunction,
        options
      );
      
      return loader.load(root);
    }
  );
}

export {DataloaderMiddleware as Dataloader}

As you can see, we can also pass in the dataloader options to the decorator as an argument, making it possible to provide the cache key function for example. Any feedback would be appreciated.

@adamalfredsson
Copy link

adamalfredsson commented May 11, 2020

Inspired by @micnil middleware implementation I tried experimenting with property decorators instead. I would like to inject a unique data-loader instance for each request so I created a custom parameter decorator @RequestContainer that pulls dependencies from a container scoped to each request. This is how I would use my dataloaders:

@Resolver(() => Book)
export class BookResolver {
  @FieldResolver(() => User)
  public async author(
    @Root() root: Book,
    @RequestContainer() userDataLoader: UserDataLoader
  ): Promise<User> {
    return userDataLoader.load(root.userId);
  }
}

So each data-loader is a service class extending the DataLoader class. This allows me to inject dependencies like this:

@Service()
export class UserDataLoader extends DataLoader<string, User | undefined> {
  constructor(userService: UserService) {
    super(async (ids) => {
      const users = await userService.findByIds(ids);
      return ids.map((id) => users.find((user) => user.id === id));
    });
  }
}

Finally I would make sure that there is a unique instance of the data-loader for each request, by using a custom parameter decorator:

export function RequestContainer(): ParameterDecorator {
  return function (target: Object, propertyName: string | symbol, index: number) {
    return createParamDecorator<Context>(({ context, info }) => {
      const paramtypes = Reflect.getMetadata('design:paramtypes', target, propertyName);
      const requestContainer = Container.of(context.requestId);
      return requestContainer.get(paramtypes[index]);
    })(target, propertyName, index);
  };
}

The request id is just an uuid added to each request context.

It would also be advised to reset these containers after each request. Please read more about this in the type-graphql docs.

@slaypni
Copy link

slaypni commented Aug 7, 2020

Thanks to @nomadoda idea, I've created an utility type-graphql-dataloader to make use of DataLoader with TypeGraphQL more simple. If an application is using TypeORM, just add @TypeormLoader decorator to relation properties so that the fields will be batch loaded.

@ObjectType()
@Entity()
export class Book {
  @Field((type) => ID)
  @PrimaryGeneratedColumn()
  id: number;

  @Field((type) => User)
  @ManyToOne((type) => User, (user) => user.books)
  @TypeormLoader((type) => User, (book: Book) => book.authorId)
  author: User;

  @RelationId((book: Book) => book.author)
  authorId: number;
}

In case of not using TypeORM, @Loader decorator is also available to define custom loader.

@Resolver((of) => Book)
export class BookResolver {
  @FieldResolver()
  @Loader<number, User>(async (ids) => {  // batchLoadFn
    const repository = getRepository(User);
    const users = keyBy(await repository.findByIds(ids), "id");
    return ids.map((id) => users[id]);
  })
  author(@Root() root: Book) {
    return (dataloader: DataLoader<number, User>) =>
      dataloader.load(root.authorId);
  }
}

@tafelito
Copy link

tafelito commented Aug 7, 2020

@slaypni that looks good. Now if you have a loader with a complex key, like a pair of keys, how would you use it with @TypeormLoader?

@slaypni
Copy link

slaypni commented Aug 7, 2020

@tafelito That is a limitation of the current @TypeormLoader. In the above example, @TypeormLoader won't be applicable to author property if User had composite primary key. For that case, you could resort to @Loader writing custom loader function.

@geoyws
Copy link

geoyws commented Oct 14, 2020

@slaypni @MichalLytek could we merge type-graphql-dataloader into TypeGraphQL?

@MichalLytek
Copy link
Owner Author

No, TypeGraphQL is orm-lib agnostic. Why would you want to do that?

@geoyws
Copy link

geoyws commented Oct 19, 2020

Oh goodness, apologies, I mistook this forum for TypeORM's.

@geoyws
Copy link

geoyws commented Oct 25, 2020

We've started using @slaypni's type-graphql-dataloader at our company a week ago. It works well and has been saving us a lot of man-hours.

@lowe0292
Copy link

We've started using @slaypni's type-graphql-dataloader at our company a week ago. It works well and has been saving us a lot of man-hours.

Seconded! I'm new to TypeGraphQL and expected the API to automatically resolve relations via decorators if requested in the query. type-graphql-dataloader bridges that gap.

@fknop
Copy link

fknop commented Feb 8, 2021

Inspired by @micnil middleware implementation I tried experimenting with property decorators instead. I would like to inject a unique data-loader instance for each request so I created a custom parameter decorator @RequestContainer that pulls dependencies from a container scoped to each request. This is how I would use my dataloaders:

@Resolver(() => Book)
export class BookResolver {
  @FieldResolver(() => User)
  public async author(
    @Root() root: Book,
    @RequestContainer() userDataLoader: UserDataLoader
  ): Promise<User> {
    return userDataLoader.load(root.userId);
  }
}

So each data-loader is a service class extending the DataLoader class. This allows me to inject dependencies like this:

@Service()
export class UserDataLoader extends DataLoader<string, User | undefined> {
  constructor(userService: UserService) {
    super(async (ids) => {
      const users = await userService.findByIds(ids);
      return ids.map((id) => users.find((user) => user.id === id));
    });
  }
}

Finally I would make sure that there is a unique instance of the data-loader for each request, by using a custom parameter decorator:

export function RequestContainer(): ParameterDecorator {
  return function (target: Object, propertyName: string | symbol, index: number) {
    return createParamDecorator<Context>(({ context, info }) => {
      const paramtypes = Reflect.getMetadata('design:paramtypes', target, propertyName);
      const requestContainer = Container.of(context.requestId);
      return requestContainer.get(paramtypes[index]);
    })(target, propertyName, index);
  };
}

The request id is just an uuid added to each request context.

It would also be advised to reset these containers after each request. Please read more about this in the type-graphql docs.

I'm having an issue with this solution for subscriptions. The request id being created on the connection of the subscription, the data loader is not "renewed" for every subscription execution resulting in stale data in my subscriptions.
Any ideas how we could solve this with subscriptions?

@jgiunta1
Copy link

jgiunta1 commented Sep 2, 2021

I'm adding my solution to the mix, I just added the dataloader to the context so its created for every request. I'm using Mikro-Orm but you can replace the entity manager with any other way you query the database in your application.

/index.ts

const DataLoader = require('dataloader');

...

	const apolloServer = new ApolloServer({
		context: async ({ req, res }): Promise<MyContext> => {
			const em = orm.em.fork();
			return {
				em: em,
				req,
				res,
				bookLoader: new DataLoader(async (keys: string[]) => {
					// Keys here would be a list of Author ID's. Each author can have multiple books.
					// This data loader is to get all books by the authors in the list
					// Use entity manager, em, to batch your calls here for the entity you'd like to resolve
					const booksToResolve = await em.find(Book, {
						author: {
							id: { $in: keys },
						},
					});

					const bookMap: Map<String, Book[]> = new Map();

					booksToResolve.forEach((book) => {
						let bookArray = bookMap.get(book.author.id);
						if (bookArray) {
							bookArray.push(book);
							bookMap.set(book.author.id, bookArray);
						} else {
							bookMap.set(book.author.id, [book]);
						}
					});

					const resolvedBooks: Book[][] = keys.map((key) => {
						let books = bookMap.get(key);
						if (!books) {
							return [];
						}
						return books;
					});

					return resolvedBooks;
				}),
			};
		},
	});

...

resolvers/author.ts

	@FieldResolver(() => [Book])
	async books(
		@Root() author: Author,
		@Ctx() context: MyContext,
	): Promise<Book[]> {
		return context.bookLoader.load(author.id);
	}

@kicks321
Copy link

Status on this?

@MichalLytek
Copy link
Owner Author

@kicks321
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement 🆕 New feature or request
Projects
None yet
Development

No branches or pull requests