Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Graph support #207

Open
wants to merge 2 commits into
base: v2
Choose a base branch
from
Open

Graph support #207

wants to merge 2 commits into from

Conversation

ekkis
Copy link

@ekkis ekkis commented Aug 4, 2016

this patch enhances the library by allowing use of the ’graph’ format
the REST API support (passed in via the resultDataContents field).
please note I’ve added a test but am not clever enough to make it pass

ekkis added 2 commits July 31, 2016 20:19
this patch enhances the library by allowing use of the ’graph’ format
the REST API support (passed in via the resultDataContents field).
please note I’ve added a test but am not clever enough to make it pass
@ekkis ekkis mentioned this pull request Aug 4, 2016
@aseemk
Copy link
Member

aseemk commented Aug 5, 2016

Thanks for your contribution, @ekkis!

Just for easier conversation, here's what you wrote in #206 (comment) onward:

I'm not sure that overloading the lean attribute was the right thing to do. that attribute doesn't appear in the API and it would have been better if resultDataContents had just been supported directly. I didn't want to have both and removing lean would probably have been an IBC, ergo my choice, but do please feel free to modify it as you will

and yes, I know about time being short... but it would be great (for me) if you could do a release to npm with the patch soon. I'm reading about npm link (thank you for pointing that out in your docs) so I can get by in the meantime

@ekkis
Copy link
Author

ekkis commented Aug 5, 2016

if you want to talk I hang out on freenode#node.js

@aseemk
Copy link
Member

aseemk commented Aug 5, 2016

Sorry, I had to run just as I began to respond. Will resume later!

@aseemk
Copy link
Member

aseemk commented Aug 5, 2016

I agree that overloading lean doesn't feel like the right thing to do. My first thought is to just introduce a new format option that defaults to something ('rest'? 'row'? 'table'??), but can be set to 'graph'. (Note though that 'graph' always returns metadata like labels, so lean: true would be ignored.)

I say this because requiring the graph format feels like a very different concern than "only send me properties, not metadata".

But first it'd be good to get on the same page for what the format actually is, and what this driver should return. E.g. looking at the docs:

http://neo4j.com/docs/2.3.0/rest-api-transactional.html#rest-api-return-results-in-graph-format

What's coming over the wire is something like this, for a single result row in 'graph' format:

{
    "nodes": [{
        "id": "19",
        "labels": ["Bike"],
        "properties": {
            "weight": 10
        }
    }, {
        "id": "21",
        "labels": ["Wheel"],
        "properties": {
            "spokes": 32
        }
    }, {
        "id": "20",
        "labels": ["Wheel"],
        "properties": {
            "spokes": 3
        }
    }],
    "relationships": [{
        "id": "9",
        "type": "HAS",
        "startNode": "19",
        "endNode": "20",
        "properties": {
            "position": 1
        }
    }, {
        "id": "10",
        "type": "HAS",
        "startNode": "19",
        "endNode": "21",
        "properties": {
            "position": 2
        }
    }]
}

Does it make sense for the API to just return this data as-is, except transform the nodes and relationships into their respective instances?

One really odd thing about this over-the-wire format is that a graph is sent per row. If most or all rows share the same sub-graph (e.g. node or nodes), that gets duplicated over and over.

So alternately, should this API be combining all of the graph rows into one graph?

One immediate thought I have is: what does the new, official Neo4j driver do? From the docs, I can't find any mention of support for this 'graph' format either. Not sure the new binary protocol even has that notion? Maybe it does. I'll ask on Slack.

At this moment, my feeling is: this is still such a new and unique use case, that I would hesitate to solidify this API and merge this PR without understanding how it's really used.

Thank you again for the contribution! And thanks for your patience. In the meantime, I encourage you to simply work with your own fork. You can either publish your fork to npm and point your package.json to it, or simply point your package.json at your git repo.

@aseemk aseemk added the question label Aug 5, 2016
@aseemk aseemk added this to the v2 redesign milestone Aug 5, 2016
@aseemk aseemk self-assigned this Aug 5, 2016
@ekkis
Copy link
Author

ekkis commented Aug 5, 2016

my view is that node-neo4j should support returning the data sets that the API does. if I can ask Neo for data in graph format, then I should be able to do so through this module. no reason why this module should selectively eliminate certain formats.

insofar as the interface, I would drop "lean" and add a "resultDataContents", which would be provided (at least) the values the API accepts: "row", "graph" and "rest". it could also be provided other values for formats that node-neo4j generates (you mentioned "table").

but if I'm transitioning from some other module that queries Neo via the API, I will expect to receive the results in the format the API returns. if the module does any transformations on the data returned, it should be under a value other than those the API accepts e.g. "table"'

as for what the new, official driver does, I don't think it matters. they have changed things. I haven't had the time to review all the changes but it's irrelevant to this discussion.

I would really prefer that you include this enhancement in the next release as it breaks none of the existing tests. it's merely additional functionality that should have been there to begin with in order to truly represent the REST API that the database supports.

it will also my make life considerably easier since if I have to maintain my own fork I'll constantly need to merge in the new changes.

p.s. as for understanding the use cases, there is at least one that I'm concerned with: the linkurious library, which grabs the data from the server in that format and hands it to the sigma engine. the lack of support in node-neo4j means that I can't transition from using linkurious' database-access layer to your module. at least not without having to write some transformation to take data from one of the formats you do support and convert it to the "graph" format, which would be awful work given as how I can just get it from the server directly

p.p.s. as for lean, you'd probably want to keep it for backward compatibility but it would just be an alias for resultDataContents. if true "row", if false "rest", if undefined the default value for resultDataContents would be used

@ekkis
Copy link
Author

ekkis commented Aug 8, 2016

thoughts?

@aseemk
Copy link
Member

aseemk commented Aug 8, 2016

Hey @ekkis, thanks for your patience. I was out of town this weekend.

my view is that node-neo4j should support returning the data sets that the API does.

Thanks for your thoughts! I think you have a valid perspective, but I disagree. =)

An over-the-wire API has different goals than a code API, and part of this driver's job is to abstract the wire API away — and provide a friendly code API.

A prime example of this is that the wire API's row and rest formats return query results as an array (rows) of arrays (columns). But this is a very unfriendly code API.

E.g. if you're returning the list of your friends, and you want to display their names and link to their profiles, you'd have to access e.g. friend[0] and friend[1] with the wire API (if you were returning name and email as the two columns). This library transforms those into friendlier dictionaries, so you can do friend.name and friend.email.

If this driver didn't do that transformation, it'd just be shifting the burden to your code to do it. It's unimaginable to write code without a transformation like that.

This transformation is also something the new, official Neo4j 3.x drivers do.

as for what the new, official driver does, I don't think it matters. they have changed things. I haven't had the time to review all the changes but it's irrelevant to this discussion.

I don't feel strongly here, but given that neither Neo4j 2.x nor this driver will be developed much more at this point (given Neo4j 3.x and official drivers), I think there's something to be said for not diverging too much from the present/future of Neo4j and drivers.

insofar as the interface, I would drop "lean" and add a "resultDataContents", which would be provided (at least) the values the API accepts: "row", "graph" and "rest". it could also be provided other values for formats that node-neo4j generates (you mentioned "table").

but if I'm transitioning from some other module that queries Neo via the API, I will expect to receive the results in the format the API returns. if the module does any transformations on the data returned, it should be under a value other than those the API accepts e.g. "table"'

This is related to the first point: if this driver's goal is to abstract away the wire API and provide a friendlier code API, I explicitly don't want you (the consumer) to have to read the Neo4j wire API documentation. This driver's documentation aims to be thorough and sufficient.

if I can ask Neo for data in graph format, then I should be able to do so through this module. no reason why this module should selectively eliminate certain formats.

I would really prefer that you include this enhancement in the next release as it breaks none of the existing tests. it's merely additional functionality that should have been there to begin with in order to truly represent the REST API that the database supports.

I absolutely agree with you that this driver shouldn't eliminate formats; it should expose the features and capabilities of the database.

I'm just hesitant to merge a specific API right now because every new API adds surface area that needs to be maintained, increases the complexity of the codebase, etc., and there are still big open API questions that make a good API hard to get right without more use cases.

A good example of an open question is: along the lines of "friendly code API", this driver transforms node & relationship JSON (which has a bunch of wire artifacts like wire API URLs in it) to Node and Relationship class instances. But this wouldn't suit your use case, where Linkurious.js wants the exact same JSON that the database returns over the wire.

it will also my make life considerably easier since if I have to maintain my own fork I'll constantly need to merge in the new changes.

I hear you on this. I know that maintaining forks isn't ideal. I can tell you, though, that there have been no significant changes to node-neo4j v2 in months now, and I don't anticipate any significant new changes going forward either, given Neo4j 3.x and official drivers. (So I'm considering this driver, along with all Neo4j 2.x drivers, as "winding down" towards their end of life.)

So does that help and make it reasonable for you to work off your own fork? I can say with confidence that you won't have to worry about merging any significant changes.

All of this said, I just realized another idea: you could use db.http to make your own Cypher wire request! (Tip: you'll want to use raw: true.)

https://github.com/thingdom/node-neo4j/tree/v2#http--plugins

Then you can request the graph format and get back the exact raw JSON that Linkurious.js needs. And this was the whole point of db.http — so that you could always talk to the db directly, and not be limited by the driver's current feature set.

In fact, db.cypher is itself implemented via db.http, as you've probably seen:

@http {method, path, headers, body, raw: true}, (err, resp) =>

I wish I had thought of this sooner! Try it out and let me know what you think.

@ekkis
Copy link
Author

ekkis commented Aug 8, 2016

part of this driver's job is to abstract the wire API away — and provide a friendly code API.

and it can continue to do so whilst also allowing retrieval of the data unfiltered. consider that data in raw formats may be more "friendly" to some consumers than the formats you've designed

I explicitly don't want you (the consumer) to have to read the Neo4j wire API documentation

and I'm not suggesting this change. it's good that I as a consumer don't have to deal with the raw format. what I'm advocating is the ability for me to consume those underlying format (which were designed with consumers in mind to begin with) if I need to. consider it this way: "graph" is a "friendly" format that you didn't have to write but could provide to your audience with my tiny patch

you could use db.http to make your own Cypher wire request

yes, of course. but the point of having a driver is that it serves as an abstraction to the underlying service and in that capacity should fully support the functionality of the service

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants