Expose the tree of field subselections to the resolver on demand #169

Open, wants to merge 6 commits into base: master

@xmirya xmirya commented Feb 25, 2018

This implements what's requested in #17.

Use case: we have Order and Owner entities stored in the DB, each Order refers to one Owner. We want to expose the API to list orders, optionally providing owners info, e.g.

type QueryRoot {
  orders(some criteria): [Order!]
}

type Order {
  field1
  field2
  ...
  owner: Owner!
}

type Owner {
  field1
  field2
  ...
}

Without knowing in the orders resolver whether the user has requested Owner info, we end up doing 1 query for the orders list plus N queries, one per Owner (assuming each Order has a different Owner). If the orders resolver could check ahead of time whether Owner is requested, it could run just one query (with a relational DB, using INNER JOIN) or two (in the common case: one to fetch the orders, a second to fetch the owners by the list of IDs found by the first).

This is well aligned with how many ORMs allow related entity preloading, like orm.GetOrders(...).With("Owner"), so order.GetOwner() later does not result in a separate query.
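A minimal sketch of that check, for illustration only. The `SelectedField` type below is a hypothetical stand-in with assumed `Name` and `Selected` members (the type exposed by this PR may be shaped differently), and the table and column names are invented:

```go
package main

// SelectedField is a hypothetical stand-in for the type this PR exposes;
// the real field names may differ.
type SelectedField struct {
	Name     string
	Selected []SelectedField
}

// hasField reports whether a field with the given name was selected
// directly under the field being resolved.
func hasField(fields []SelectedField, name string) bool {
	for _, f := range fields {
		if f.Name == name {
			return true
		}
	}
	return false
}

// ordersQuery picks the SQL to run: a JOIN when the client asked for
// "owner", a plain SELECT otherwise. Table and column names are illustrative.
func ordersQuery(fields []SelectedField) string {
	if hasField(fields, "owner") {
		return "SELECT o.*, w.* FROM orders o INNER JOIN owners w ON w.id = o.owner_id"
	}
	return "SELECT o.* FROM orders o"
}
```

An orders resolver that declares the optional argument could call `ordersQuery(fields)` once and avoid the per-order Owner lookups.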

@darh
Contributor

darh commented Mar 10, 2018

If I'm interpreting the SemaphoreCI setup errors correctly, someone needs to fix the SemaphoreCI-GitHub integration for this build to pass?

@tonyghita
Member

SemaphoreCI hasn't really handled the github.com/neelance/graphql-go -> github.com/graph-gophers/graphql-go transition very gracefully :(

I'll look into what I can do to straighten it out.

@darh
Contributor

darh commented Mar 20, 2018

It looks like the SemaphoreCI issue is fixed. Is anything else stopping us from merging this?

@darh
Contributor

darh commented Mar 26, 2018

@tonyghita @xmirya 👋

@xmirya
Author

xmirya commented Mar 26, 2018

I see no reason for this not to be merged (naturally, as I filed this PR :) ); I have been using this patch for some time in one of my private projects.

dvic added a commit to qdentity/graphql-go that referenced this pull request Mar 27, 2018
Copied from graph-gophers#169 and
used package 'query' instead of 'selected'.
@dvic
Contributor

dvic commented Mar 27, 2018

@xmirya I applied this patch to our fork along with a basic test: https://github.com/qdentity/graphql-go/blob/5472fce1344b4de8067df6cf53d09384c6533ff3/graphql_test.go

@dvic
Contributor

dvic commented Mar 28, 2018

@xmirya Added support for wrapper types, i.e., the argument can be any named type declared as type X []pubquery.SelectedField

qdentity@d95424a

@0xSalman
Contributor

@xmirya

Hey,

Thanks for the PR. I also came up with a solution but I like your solution better.

I have a couple of suggestions:

  1. Wouldn't it be better to put selection set in the context to avoid an extra argument in resolver method? Context could also be passed down with selection set intact and downstream can benefit from it if and when needed. I think this approach would make the api more generic.

  2. I think it would be nice if the selection set also included the variables for a field, if there are any. This would be very useful for querying graph DBs (e.g., DGraph), and I am sure queries to SQL DBs could also benefit from it. For example, the following schema

schema {
	query: Query
}

type Query {
	user(id: ID!): User!
}

type User {
	id: ID!
	name: String!
	email: String!
	phone: String
	address: [String]
	friends(page: Pagination): [User!]
}

input Pagination {
  first: Int
  after: ID
  last: Int
}

with GraphQL Query

query GetUser($id: ID!, $page: Pagination) {
	user(id: $id) {
		name
		email
		phone
		address
		friends(page: $page) {
			id
			name 
			email
		}
	}
}

would benefit from a selection set that carries the variable values for each field. When you build the SQL/graph query for the DB from the selection set, you could identify the pagination values for the field friends and specify limit, offset, filtering, etc. I am sure there are other use cases where it would be useful.
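As an illustration of the idea, here is a rough sketch of reading the pagination values for `friends` out of a selection set. Everything in it is an assumption: the `SelectedField` type, its `Args` member, and the choice to decode input objects as `map[string]interface{}` with Go `int` values are hypothetical and not taken from the patch.

```go
package main

// SelectedField is a hypothetical shape for a selected field that also
// carries its resolved argument values; it is not the PR's actual type.
type SelectedField struct {
	Name     string
	Args     map[string]interface{}
	Selected []SelectedField
}

// friendsLimit looks for a selected "friends" field and returns the
// "first" value of its "page" argument, or def when it is absent.
func friendsLimit(fields []SelectedField, def int) int {
	for _, f := range fields {
		if f.Name != "friends" {
			continue
		}
		// Assumes input objects are decoded as nested maps.
		page, ok := f.Args["page"].(map[string]interface{})
		if !ok {
			return def
		}
		if first, ok := page["first"].(int); ok {
			return first
		}
		return def
	}
	return def
}
```

The user resolver could then pass the returned value as the LIMIT of the query that loads friends, instead of fetching every friend and trimming afterwards.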

Looking forward to your reply, thanks!

@tonyghita
Member

Hey, thanks for the contribution!

I'm still getting up to speed with the implementation details of execution but I'd like to get this feature in. I believe we have a similar PR in #70.

It'd be nice to include some unit tests to cover the new behavior, and perhaps a benchmark to get an idea of how much additional overhead the feature adds (if any).

@xmirya
Author

xmirya commented Apr 9, 2018

@salmana1 On including it in the context: I'd not do it, for three reasons: (1) the optional parameter allows constructing the tree only when the method requests it by declaring that parameter; with the context solution we'd have to either always construct it or pass some factory function that constructs it (which is a bit ugly IMO); (2) the optional parameter solution ensures the runtime and parsing performance of existing code does not change; (3) IMO passing anything as a context value under some key is not "backward compatible": whatever key name is chosen, it still might collide with a key someone already uses in existing code.

For variables in subselections - will add it to the patch.

@tonyghita - will add tests (it would probably be nice to provide an example and update the docs as well). As for benchmarking: I expect it not to change the runtime performance of the existing 1- or 2-parameter methods, since the difference is one "if" branch; same for memory use, as Go booleans are packed AFAIK. It might still be useful to show how adding this optional parameter affects the GQL method call speed.

@0xSalman
Contributor

@xmirya

Thanks for replying and clarifying. After commenting here, I thought about the context approach again and came to the same conclusion as you. I like the optional argument approach more now. The patch with variables would be nice. If you haven't started working on it yet, I can create a PR.

@tonyghita

It would be awesome if you could please merge this PR soon. I think we can follow up with another PR to add unit tests and benchmark results.

@nicksrandall
Member

I left some comments on this subject here: #70 (comment)

As I mentioned there, I believe that the context is the idiomatic place to put these fields even if it requires a constructor function. We could also use a typed key for the context variable so that there is no possible collision.

That said, I don't feel so strongly about this as to prevent this PR from being merged if the general consensus is that an extra argument is preferred to using the context.

@0xSalman
Contributor

0xSalman commented Apr 10, 2018

I think people will use selection set more often than not. Perhaps we could achieve best of both worlds by introducing a configurable option via SchemaOpt. Construct the selection set tree and put it in the context only when it is configured. This approach will make sure that there is no performance impact to existing code and there is an idiomatic solution.

What do you guys think?

@tonyghita
Member

tonyghita commented Apr 10, 2018

I haven't experienced a need for the selection set in resolvers, but I can see how this may be important if your resolvers directly communicated with a database.

Is anyone able to provide an example of how they would use this functionality in their resolver code? One or more concrete code examples could help guide this discussion.

@darh
Contributor

darh commented Apr 10, 2018 via email

@0xSalman
Contributor

@tonyghita

We are using DGraph as a database and having the selection set helps us convert a GraphQL query into a DGraph query. Or at least know which fields/nodes we need to pull from the DB. We do not want to pull all the nodes & reverse nodes and then figure out what to send back to client.

May I ask, how are you interacting with db? Are you pulling everything from DB and filtering what to send back to client in the app?

@tonyghita
Member

tonyghita commented Apr 10, 2018

@salmana1 We don't interact with any databases directly—instead we make service requests to hundreds of data services which front databases.

The databases are denormalized (generally joins happen in the application layer). We haven't really had to care if we are over-fetching from services since it all happens within the same network, so generally the costs of over-fetching at this level are negligible.

This is the approach I've tried to demonstrate in https://github.com/tonyghita/graphql-go-example

@ldechoux

ldechoux commented Apr 18, 2018

Hi,

@tonyghita My use case (we currently use apollo-server with Node.js, which makes the field tree available at the resolver level):
We are creating an application for displaying TV programs and their associated videos.

The (simplified) schema :

type Video {
  id: ID!,
  title: String!
  duration: Int!,
  img: String!
}

type VideosConnector {
  offset: Int!,
  hasNext: Boolean,
  total: Int!,
  items: [Video!],
}

type Program {
  id: ID!,
  name: String!,
  category: String!
  img: String!,
  videos(offset: Int = 0, limit: Int = 20): VideosConnector
}

type ProgramConnector {
  offset: Int!,
  hasNext: Boolean,
  total: Int!,
  items: [Program!], 
}

type Query {
  getLastPrograms(offset: Int = 0, limit: Int = 20): ProgramConnector,
  getProgramsByCategory(category: String!, offset: Int = 0, limit: Int = 20): ProgramConnector,
  searchPrograms(search: String!, offset: Int = 0, limit: Int = 20): ProgramConnector,
  getProgramById(id: ID!): Program,
}

Screens :
The homepage of the app displays the last programs and a selection of programs ordered by category.
For each program the app displays the program name, its picture, and the total number of available videos.
The request looks like this:

fragment Prg on Program {
  id name category img
  videos { total }
}

{
  getLastPrograms {
    total offset hasNext
    items { ...Prg }
  }
  series : getProgramsByCategory(category:"series") {
    total offset hasNext
    items { ...Prg }
  }
  tv_show : getProgramsByCategory(category:"tv_show") {
    total offset hasNext
    items { ...Prg }
  }
  news : getProgramsByCategory(category:"news") {
    total offset hasNext
    items { ...Prg }
  }
}

When a user selects a program, the app displays a program page with all the associated videos.
The request looks like this:

{
  getProgramById(id:"1234-5678-01234") {
    id name category img
    videos {
       total offset hasNext
       items {
          id title img duration
       }
    }
  }
}

Resolvers :
In these requests, the app asks for the VideosConnector node.
In the first case, just to get the total number of videos associated with a program; in the second, to get all the videos.

What we have done with apollo-graphql :
In the VideosConnector resolver, if the items node is not requested, we use a dataloader (executing a count request) to get all the video counts at once.
If the items node is requested, we execute a select request on the video table for the program to retrieve all the video information.

Using a dataloader and a count request in the first case is more efficient. That is why we need access to the field tree to achieve this kind of optimization.
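Translated to Go, the decision described above might look roughly like the following sketch. The `SelectedField` type and the helper names are hypothetical, and the returned strings merely label which kind of request would be issued.

```go
package main

// SelectedField is a hypothetical node of the selection tree.
type SelectedField struct {
	Name     string
	Selected []SelectedField
}

// selects reports whether the given path of field names exists in the tree.
func selects(fields []SelectedField, path ...string) bool {
	if len(path) == 0 {
		return true
	}
	for _, f := range fields {
		if f.Name == path[0] && selects(f.Selected, path[1:]...) {
			return true
		}
	}
	return false
}

// videosPlan decides how to load the videos connector of a Program:
// "select" when items are requested, "count" when only the connector
// itself is requested, and "none" when videos is not selected at all.
func videosPlan(programFields []SelectedField) string {
	switch {
	case selects(programFields, "videos", "items"):
		return "select"
	case selects(programFields, "videos"):
		return "count"
	default:
		return "none"
	}
}
```

With the homepage query, `videosPlan` would return "count", so a batched count request is enough; with the program page query it would return "select".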

I hope my use case helps!

@darh
Contributor

darh commented Apr 22, 2018

So, where are we with this? :) Planning to push my API in (pre)production mid-May and I would really like to have this support built in.. ty.

@lubo

lubo commented Jun 5, 2018

Any update?

@karlos1337

+1

@OoXoSoO

OoXoSoO commented Jun 29, 2018

Any update? I rly need this feature

@abradley2

Any update? This is indeed a make-or-break feature for me as well- I can't get any semblance of good performance without this.

@avocade

avocade commented Oct 5, 2018

Just ran into this issue ourselves, found it strange that it wasn't already in the lib. But great that it's close to merge, ship it 🚀 :)

@lfv89

lfv89 commented Jan 22, 2019

This is an important and repeatedly requested feature. Is there anything specifically preventing this PR from being merged? Anything we can help with?

@choonkeat

@nicksrandall @pavelnikolov @tonyghita ?

Member

@pavelnikolov pavelnikolov left a comment

Can you please add unit tests for the new functionality? At least one resolver and then make sure that the proper fields get passed as arguments?

@@ -9,6 +9,7 @@ import (
"github.com/graph-gophers/graphql-go/internal/common"
"github.com/graph-gophers/graphql-go/internal/exec/packer"
"github.com/graph-gophers/graphql-go/internal/schema"
pubselected "github.com/graph-gophers/graphql-go/selected"
Member

What does the pub prefix stand for?


- Optional `context.Context` argument.
- Mandatory `*struct { ... }` argument if the corresponding GraphQL field has arguments. The names of the struct fields have to be [exported](https://golang.org/ref/spec#Exported_identifiers) and have to match the names of the GraphQL arguments in a non-case-sensitive way.
- Optional `[]selected.SelectedField` argument to receive the tree of selected subfields in the GraphQL query (useful for preloading of database relations)
Member

I don't like the repetition of the word "selected" in: selected.SelectedField. It'd be nice if we can avoid that.

Member

Maybe selected.Field 🤔

Member

Ideally, even better package name...

Just came across this PR. Given the author has not replied, I'm happy to do these changes (along with tests) if there's any chance that this gets approved and merged afterwards cc/ @pavelnikolov

Member

@jesushernandez this is the most requested feature and I really want it to get some traction but I'm not sure that this is the best approach.

How can I help getting more traction?

selected/selected.go (resolved review thread)
@gragera

gragera commented Apr 25, 2019

This would be so good for the library :) . Is there any news regarding this PR?

@ihgann

ihgann commented May 1, 2019

👍 Definitely would like to see this get implemented, it would make queries from resolvers far more efficient in some use cases.

Having it be optional will allow certain upstream resolvers to still follow the natural convention, while allowing DB queries to be a lot faster.

Note

I think, should this get implemented, it would be highly worthwhile calling out certain traps one could get caught in by relying on this (especially if they want to use dataloader).

For example:

query {
    authors {
        id
        name
        books {
            id
            name
            relatedBooks {
                id
                name
                isbn
            }
        }
    }
}

The major benefit to this query IMHO would not be the fact that I need not blindly SELECT * from authors, but that I could under-the-hood JOIN on the books.id. Assuming I'm using dataloader in my resolver for Books, my Authors resolver would be able to prime the books dataloader automatically and I could reduce my needed queries from 2 -> 1, able to prime my cache of authors & books simultaneously.

But there'd be a small gotcha for someone who might naively try to over-optimize, for instance by selecting columns granularly (instead of using *). For example, for the query above they may run SELECT id, name, relatedBookIds from books in that join (forgetting that the nested relatedBooks also fetches isbn). The cache would then be missing the isbn field in the queried books, which can result in some... odd situations (either needing to re-fetch, which is inefficient, or returning an invalid value/error). I think this would be an easy pitfall to fall into if this feature were implemented, and it should be called out as a word of caution somewhere in the documentation.
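One way to sidestep that pitfall is to collect the union of leaf fields requested on every Book node at any depth before building the column list. The sketch below assumes a hypothetical `SelectedField` type and assumes the caller knows which field names resolve to books; neither comes from the PR.

```go
package main

import "sort"

// SelectedField is a hypothetical node of the selection tree.
type SelectedField struct {
	Name     string
	Selected []SelectedField
}

// bookColumns returns the sorted union of leaf fields requested on every
// node whose name is in bookFields, at any depth of the tree.
func bookColumns(fields []SelectedField, bookFields map[string]bool) []string {
	seen := map[string]bool{}
	var walk func(fs []SelectedField, inBook bool)
	walk = func(fs []SelectedField, inBook bool) {
		for _, f := range fs {
			// A field without children is treated as a plain column.
			if inBook && len(f.Selected) == 0 {
				seen[f.Name] = true
			}
			walk(f.Selected, bookFields[f.Name])
		}
	}
	walk(fields, false)

	cols := make([]string, 0, len(seen))
	for c := range seen {
		cols = append(cols, c)
	}
	sort.Strings(cols)
	return cols
}
```

For the query above, with "books" and "relatedBooks" marked as book fields, the result includes isbn alongside id and name. Foreign-key columns such as relatedBookIds would still have to be added separately.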

@PatrickBuTaxdoo

Is there any way to get this merged soon?

@abourget

What's up with DealTap#5 ? They've had that over there.. do we want something similar here?

@keithmattix

Bump. Any news?

@jeanpi

jeanpi commented Feb 9, 2020

+100 for having access to selected fields and their arguments, especially for pagination.

@qneyrat

qneyrat commented Jul 6, 2020

@tonyghita up, Any news? :)

@ajbouh

ajbouh commented Mar 24, 2021

After catching up on all of the relevant conversation, it seems like this PR (#169) overlaps with two other viable PRs to provide access to the query from the resolver.

I've implemented a similar approach in my own fork. Similarly to the approach in this PR, I compute the value returned by the context lazily.

There are two major differences between my fork and this PR:

  • I return enough information to forward the current field resolution to another GraphQL instance. This is why I include Alias, TypeName, Arguments, and ArgumentTypes.
  • I return information about the field that's being resolved instead of returning a slice of selected fields. So rather than returning just the "child selections" I return the "current node" and its "child selections".

These are the data structures I've settled on:

type Field struct {
	Alias         string
	Name          string
	TypeName      string
	Arguments     map[string]interface{}
	ArgumentTypes map[string]ArgumentType
	Selected      []Field
}

type ArgumentType struct {
	NamedType string
	Elem      *ArgumentType
	NonNull   bool
}

type FieldFunc func() Field

func GetFieldFromContext(ctx context.Context) (field Field, ok bool) {
	fieldFunc, ok := ctx.Value(FieldContextKey).(FieldFunc)
	if ok {
		field = fieldFunc()
	}
	return
}

@pavelnikolov
Member

This has been one of the most commonly requested features for this library. I agree that we need to expose the selected fields, however, I'm still not completely sure if we need to expose the fields through the context or through a strongly typed argument in the resolver. And if we go the path of strongly typed argument in the resolver we need to make sure that this is opt-in and doesn't introduce any breaking changes to existing codebases. I agree that having a field with its subfields makes a lot of sense - especially for graph databases.

@ajbouh

ajbouh commented Mar 25, 2021

I don't have a strong opinion about which approach is best. If we wanted to get things moving with something we might start with the context approach and mark the package as experimental. This way people can opt into it and experiment with it without needing the project to make a long term commitment about compatibility.

Once the design settles and the preferred approach becomes more obvious, then we either remove the experimental tag or deprecate the package.

@maoueh

maoueh commented Mar 25, 2021

You can also link to my own PR, #422, which is quite similar to the others.

I must admit I did not research enough before doing my own work, so I apologize in advance :) We use it in production; our main current use case is to inspect the Unions selected by the user so that the backend registers for only a subset of the fields.

@ajbouh

ajbouh commented Mar 25, 2021

Yes, your PR looks solid as well!

Assuming they expose adequate information to forward the current resolution operation, I look forward to any of these being merged

@maoueh

maoueh commented Mar 25, 2021

And so do I :)

@mehran-prs

Any update?

@jhelberg

How can I help to get this closed? I'm happy to spend a couple hours on this.

@jhelberg

jhelberg commented Oct 3, 2021

I manually integrated this PR into the newest code and it works like a charm. I agree with the two naming oddities (selected.SelectedField and pubselected). There is a small incompatibility where API code calls its own resolver methods: the []selected.SelectedField argument must then be supplied as well. In practice one will implement an extra function to do that.
My API has become a lot faster now, as I can avoid some SQL joins and sub-selects when fields are not selected by the caller.
Faster as in under 0.1 seconds instead of 5 seconds.
Thumbs up for this patch; it has been in use for a week now, serving over 1 million calls.
One suggestion: it's an array now, maybe a map is better; most of the time I want to know whether a particular field is requested. A map seems more natural and avoids looping over all fields until the one I'm looking for is encountered.

@willglynn

@jhelberg: Arguments throw a wrench in that, which is out of scope for this PR but is part of a derived changeset I'm using in production. (I intend to open a PR for that once this PR lands.) The selection data would have to be expressed as a map of slices to capture cases where a field is resolved more than once, e.g.

query {
  foo: node(id: "foo") { … }
  bar: node(id: "bar") { … }
}

map[string][]x would mean extra allocations and extra indirection, which makes it not obviously a performance win over iterating through []x once.

My resolvers usually look something like:

var withFoo, withBar bool
for _, field := range fields {
  switch field.Name {
  case "foo":
    withFoo = true
  case "bar":
    withBar = true
  }
}
// do the operation, optionally withFoo or withBar

When my resolver needs nested arguments, I fish them out while iterating:

var nodeIds []string
for _, field := range fields {
  switch field.Name {
  case "node":
    nodeIds = append(nodeIds, field.arguments["id"])
  }
}
// bulk-retrieve nodeIds
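For comparison, the map-of-slices alternative discussed above could be sketched as follows. The `SelectedField` type with `Alias` and `Name` members is a hypothetical stand-in; the point is only to show the extra map and per-key slices that the single loop avoids.

```go
package main

// SelectedField is a hypothetical selected field; Alias distinguishes
// several selections of the same underlying field.
type SelectedField struct {
	Alias string
	Name  string
}

// indexByName groups the selected fields by field name. A field that is
// selected more than once (under different aliases) yields a slice with
// several entries, which is why a plain map[string]SelectedField would
// lose information.
func indexByName(fields []SelectedField) map[string][]SelectedField {
	index := make(map[string][]SelectedField, len(fields))
	for _, f := range fields {
		index[f.Name] = append(index[f.Name], f)
	}
	return index
}
```

A lookup such as `len(indexByName(fields)["node"]) > 0` then answers the "is this field requested" question, at the cost of building the index first.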
