This repository has been archived by the owner on Feb 26, 2021. It is now read-only.

API consideration ExecuteAsync #14

Open
mgravell opened this issue Nov 15, 2017 · 4 comments
Comments

@mgravell

mgravell commented Nov 15, 2017

Some initial thoughts; not sure if I should log these individually or whether this is fine, but:

  • not sure how well it works for non-query and multi-grid query scenarios
  • the factory approach as illustrated by Execute_success() with the list add in the factory precludes non-buffered data streams; really feels like the primary API should expose something more akin to async-enumerable (or a spoofed similar if that still isn't nailed down)
  • the column bind mechanism looks ... unexpected; personally I would have expected that to be part of the factory method, i.e. Func<SomeRowApi, TResult>; but in particular the current API is incompatible with both immutable types (which are a thing) and value types (because of pass-by-value) (edit: I guess technically you could work with immutable types via return obj.WithName(row.ReadString()) etc in each branch of a switch, but it would be horribly inefficient for classes)
  • not sure quite how well the arbitrary name prepare/execute is going to map to other providers
  • whole topic of parameterization
  • whole topic of column metadata inspection - meaning: as a library author, I want to construct the factory/binder at runtime based on looking at the columns
  • might need a whole lot more async when dealing with large data sets or wide columns (multiple large strings etc) - perhaps via ValueTask<T>-based Read*Async APIs?

I appreciate the current status is minimal viable working code and that most of these things are almost certainly on your radar, but I wanted to throw them down.
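To make the value-type point concrete, here is a minimal sketch of why a bind-into-an-existing-instance callback loses writes on a struct, while a factory that returns the result does not. `FakeRow`, `CustomerStruct`, and `BindDemo` are hypothetical names for illustration only; none of them is Peregrine's actual API.

```csharp
using System;

// Hypothetical stand-in for a row reader; not Peregrine's actual type.
public sealed class FakeRow
{
    public int ReadInt32() => 42;
}

public struct CustomerStruct
{
    public int Id;
}

public static class BindDemo
{
    // Bind-style callback: receives an existing instance and mutates it.
    // For a struct, `target` is a copy, so the caller never sees the write.
    public static void BindByValue(CustomerStruct target, FakeRow row)
    {
        target.Id = row.ReadInt32();
    }

    // Factory-style callback: builds and returns the result, so it works
    // for structs and for immutable types alike.
    public static CustomerStruct Create(FakeRow row)
    {
        return new CustomerStruct { Id = row.ReadInt32() };
    }

    public static void Main()
    {
        var row = new FakeRow();
        var bound = new CustomerStruct();
        BindByValue(bound, row);
        Console.WriteLine(bound.Id);        // 0: the mutation happened on a copy
        Console.WriteLine(Create(row).Id);  // 42
    }
}
```

Passing the struct by `ref` would also fix the lost write, but it would not help immutable types, which is why the factory shape seems the more general option.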

In particular, re the primary multi-row API, my "gut" suggests something more like:

public WhateverAsyncEnumerableLooksLike<TResult> Execute(...,
    Func<RowAPI, Task<TResult>> factory) => Execute(..., _ => factory); // ignore the metadata, always use the given factory
public WhateverAsyncEnumerableLooksLike<TResult> Execute(...,
    Func<MetadataAPI, Func<RowAPI, Task<TResult>>> factoryFactory) // oh shit, I think I just java'd myself

so that the following is possible (for the "we know what the data looks like, thanks" case):

var data = session.Execute<Customer>(..., row => ReadCustomerAsync(row));
...
static async Task<Customer> ReadCustomerAsync(RowAPI row) {
    var customer = new Customer();
    customer.Id = row.ReadInt32(); // in my head I'm assuming this increments the column automagically
    customer.Notes = await row.ReadStringAsync(); // value task; might be in the buffer, might not?
    customer.Foo = row.ReadDouble();
    ...
    return customer;
}

whereas a library-based reader (that doesn't know about types ahead of time) might be more like:

var data = session.Execute<T>(..., metadata => GetTypeReader<T>(metadata));

where GetTypeReader looks at the T and the metadata, and hands back a (possibly cached) configuration/strategy-generated Func<RowAPI, Task<T>>
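As a rough illustration of what such a `GetTypeReader` could look like, the sketch below resolves columns to properties by name once, then caches the resulting delegate per type and column layout. Everything here is an assumption made for the example: `MetadataApi`, `BoxedRow`, `ReadValue`, `TypeReaders`, and `CustomerDto` are hypothetical, values are boxed for brevity (a real low-allocation reader would avoid that), and the target is restricted to classes with a parameterless constructor.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Reflection;
using System.Threading.Tasks;

// Hypothetical metadata type: just the column names, in order.
public sealed class MetadataApi
{
    public string[] ColumnNames { get; }
    public MetadataApi(params string[] columnNames) { ColumnNames = columnNames; }
}

// Hypothetical row type: each ReadValue call advances to the next column.
public sealed class BoxedRow
{
    private readonly object[] _values;
    private int _column;
    public BoxedRow(object[] values) { _values = values; }
    public object ReadValue() => _values[_column++];
}

public sealed class CustomerDto
{
    public int Id { get; set; }
    public string Notes { get; set; }
}

public static class TypeReaders
{
    private static readonly ConcurrentDictionary<string, Delegate> Cache =
        new ConcurrentDictionary<string, Delegate>();

    public static Func<BoxedRow, Task<T>> GetTypeReader<T>(MetadataApi metadata)
        where T : class, new()
    {
        // Cache key: target type plus the column layout.
        var key = typeof(T).FullName + "|" + string.Join(",", metadata.ColumnNames);
        return (Func<BoxedRow, Task<T>>)Cache.GetOrAdd(key, _ => Build<T>(metadata));
    }

    private static Func<BoxedRow, Task<T>> Build<T>(MetadataApi metadata)
        where T : class, new()
    {
        const BindingFlags flags =
            BindingFlags.Public | BindingFlags.Instance | BindingFlags.IgnoreCase;

        // Resolve each column to a property once, up front (null if no match).
        PropertyInfo[] props = metadata.ColumnNames
            .Select(name => typeof(T).GetProperty(name, flags))
            .ToArray();

        return row =>
        {
            var result = new T();
            foreach (var prop in props)
            {
                var value = row.ReadValue(); // always consume the column
                if (prop != null && prop.CanWrite) prop.SetValue(result, value);
            }
            return Task.FromResult(result);
        };
    }
}
```

A caller would obtain the reader once per result set, for example `TypeReaders.GetTypeReader<CustomerDto>(new MetadataApi("id", "notes"))`, and then invoke it per row. Reflection-based `SetValue` is only a placeholder for whatever compiled or generated strategy a library would actually use.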


If Customer was immutable, then the same would be:

static async Task<Customer> ReadCustomerAsync(RowAPI row) {
    var id = row.ReadInt32(); // in my head I'm assuming this increments the column automagically
    var notes = await row.ReadStringAsync(); // value task; might be in the buffer, might not?
    var foo = row.ReadDouble();
    ...
    return new Customer(id, notes, foo, ...);
}

The key point I'm trying to explore here is essentially: when and how can callers inspect column metadata, and when and how can they use that to influence what Peregrine is doing? And also: when and how do immutable/value-types work.
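The non-buffered, factory-per-row shape discussed above can be sketched with `IAsyncEnumerable<T>`, which shipped later in C# 8 and so was not available when this was written. This is only an illustration under stated assumptions: `RowApi`, `Customer`, and `StreamingDemo.Execute` are hypothetical stand-ins, and the rows come from an in-memory array rather than a database.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical row API: each Read* call advances to the next column.
public sealed class RowApi
{
    private readonly object[] _values;
    private int _column;
    public RowApi(object[] values) { _values = values; }
    public int ReadInt32() => (int)_values[_column++];
    public double ReadDouble() => (double)_values[_column++];
    // ValueTask avoids a Task allocation when the value is already buffered.
    public ValueTask<string> ReadStringAsync() =>
        new ValueTask<string>((string)_values[_column++]);
}

public sealed class Customer
{
    public int Id;
    public string Notes;
    public double Foo;
}

public static class StreamingDemo
{
    // Stand-in for session.Execute: yields one result per row as it is read,
    // instead of adding every row to a list before returning.
    public static async IAsyncEnumerable<TResult> Execute<TResult>(
        IEnumerable<object[]> rows, Func<RowApi, Task<TResult>> factory)
    {
        foreach (var values in rows)
            yield return await factory(new RowApi(values));
    }

    public static async Task<Customer> ReadCustomerAsync(RowApi row)
    {
        var customer = new Customer();
        customer.Id = row.ReadInt32();
        customer.Notes = await row.ReadStringAsync();
        customer.Foo = row.ReadDouble();
        return customer;
    }

    public static async Task Main()
    {
        var rows = new[]
        {
            new object[] { 1, "first", 1.5 },
            new object[] { 2, "second", 2.5 },
        };
        await foreach (var c in Execute<Customer>(rows, ReadCustomerAsync))
            Console.WriteLine($"{c.Id} {c.Notes}");
    }
}
```

Because the consumer pulls rows one at a time with `await foreach`, it can stop early or process a large result set without holding it all in memory.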

@mgravell
Author

after some thought, I wonder whether returning something akin to IDataReader but backed by the new impl is frankly simpler and more flexible...

@anpete
Contributor

anpete commented Nov 15, 2017

@mgravell Excellent points. Thanks for taking the time to put this together.

This code shouldn't be considered any kind of API proposal and I agree completely that it is terrible! 😄 Right now, this is just an experiment (unfinished) we put together to help us establish a perf. baseline for one data access scenario.

The initial results are promising in that we see 25%-33% throughput improvement over ADO.NET, and we get pretty close to the numbers produced by the pgbench tool. So we know it is possible!

We plan to evolve this some more (@davidfowl wants to try his Pipelines magic), but we would also love to get PRs on this. I think it would be great to have a bunch of different API proposals in here that we could compare.

@anpete
Contributor

anpete commented Nov 15, 2017

@mgravell Another thing to bear in mind: the goal here is to be fast (and low-allocating), and we expect to have to trade some amount of usability to get there. The idea is that frameworks like Dapper and EF can use these very low-level APIs, but they may not be great for general consumption.

@mgravell
Author

mgravell commented Nov 15, 2017 via email
