Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

quickcheck zebra reading #499

Merged
merged 8 commits into from
Mar 6, 2017
Merged

quickcheck zebra reading #499

merged 8 commits into from
Mar 6, 2017

Conversation

tranma
Copy link
Contributor

@tranma tranma commented Feb 21, 2017

This adds a test that just savages some random Icicle facts to Zebra, reads them into an input struct, and peeks the result.

depends on icicle-lang/zebra-ambiata#64

! @jystic @amosr
/jury approved @jystic @amosr

(bools, ls, rs) <- List.unzip3 <$> mapM (fromVSum (tableOf l) (tableOf r)) rows
lcols <- mergeValues l ls
rcols <- mergeValues r rs
pure $ [ Zebra.IntColumn (Storable.fromList bools) ] <> lcols <> rcols
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice 👍


SumT ErrorT t -> do
(bools, vs) <- List.unzip <$> mapM (fromVSumError (tableOf t)) rows
pure $ [ Zebra.IntColumn (Storable.fromList bools) ] <> concat vs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is how icicle does it, but not sure that this specialisation should end up in a zebra file

should probably be (tags, errs, vs) instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or just get rid of this case and let SumT handle it maybe

@@ -65,6 +66,7 @@ main
, Icicle.Test.Avalanche.Melt.tests

, Icicle.Test.Sea.Psv.tests
-- , Icicle.Test.Sea.Zebra.tests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not enabled?

@jacobstanley
Copy link
Contributor

It might be easier to convert the icicle type to a Schema then convert the icicle value to a zebra Value, then you can use:

fromRow :: Schema -> Value -> Either (TableError Schema) (Table Schema)

might be a safer test because otherwise you're testing your encoding of tables vs the icicle generated code.

@tranma
Copy link
Contributor Author

tranma commented Feb 21, 2017

true, that is less savage actually -- I'll convert to a Schema

@tranma tranma force-pushed the topic/input-test branch 2 times, most recently from a4dce76 to 566607b Compare February 27, 2017 03:29
@tranma
Copy link
Contributor Author

tranma commented Feb 27, 2017

using zebra Schema now

@tranma tranma force-pushed the topic/input-test branch 2 times, most recently from 4cb4399 to 4079cfe Compare February 27, 2017 03:38
, pure TimeT
, pure StringT
, StructT <$> arbitrary
, OptionT <$> arbitrary
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the stuff in the "small" list should probably be non-recursive, otherwise you might generate OptionT (OptionT (OptionT etc and never terminate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right ok. This was because I wanted more structs

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they should all be equally likely except when the size is 1 or less

@tranma tranma force-pushed the topic/input-test branch 4 times, most recently from 732b087 to dac0214 Compare February 28, 2017 05:12
-- iprogram: { *mempool, input, ... }
-- input: { *chord_time, *fact_count, *tombstone, *input_start, ... }
programs0_ptr <- peekWordOff fleet_ptr 4
tombstones_ptr <- peekWordOff programs0_ptr 3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some serious pointer twiddling :D

pure Schema.Int

UnitT ->
Nothing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be mapped to Schema.Struct Boxed.empty

Schema.Field (Schema.FieldName (nameOfStructField f)) <$> schemaOfType t
in Schema.Struct . Boxed.fromList <$> mapM fieldOf (Map.toList fields)

_ -> Nothing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be able to map all of ValType to zebra? probably doesn't matter for a test I guess

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a savage way to let you comment out cases for debugging :D

pure . Zebra.Double $ x

VUnit ->
pure . Zebra.Int $ 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

pure . Zebra.Int . fromIntegral . Icicle.packedOfTime $ x

VString x ->
pure . Zebra.ByteArray . ByteString.pack . Text.unpack $ x
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/ByteString.pack . Text.unpack/Data.Text.Encoding.encodeUtf8/ ?


VNone
| OptionT t <- ty
-> Zebra.Enum 0 <$> zebraOfValue t (defaultOfType t)
Copy link
Contributor

@jacobstanley jacobstanley Feb 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be
Zebra.Enum 0 (Zebra.Struct Boxed.empty)
or
Zebra.Enum 0 <$> zebraOfValue UnitT VUnit

OptionT t ->
let
noneOf x =
Schema.Variant (Schema.VariantName "none") <$> schemaOfType x
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be:
Schema.Variant (Schema.VariantName "none") (Schema.Struct Boxed.empty)
or
Schema.Variant (Schema.VariantName "none") <$> schemaOfType UnitT

-> Zebra.Array . Boxed.fromList <$> mapM (zebraOfValue t) xs

VFactIdentifier x ->
pure . Zebra.Int . fromIntegral . getFactIdentifierIndex $ x
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case (fact identifier) is missing from schemaOfType

v' <- schemaOfType' v
pure . Schema.Struct . Boxed.fromList $
[ Schema.Field (Schema.FieldName "keys") (Schema.Array k')
, Schema.Field (Schema.FieldName "vals") (Schema.Array v') ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is cool for a test, but maybe we want Array (Struct [("key", k'), ("val", v')]) for production? not sure just wanted to bring it up for discussion. I can imagine when zebra is pretty printed having the key/value pair together would be useful.

| StructT struct <- ty
, types <- getStructType struct
-> do let vs = Map.elems (Map.intersectionWith (,) types xs)
Zebra.Struct . Boxed.fromList <$> mapM (uncurry zebraOfValue) vs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to have a guard saying length vs == length types or similar

@jacobstanley
Copy link
Contributor

💯 for the test

👍 after we resolve the mapping from icicle types to zebra, happy to discuss IRL, maybe it's not so important, but I can see that testing stuff becoming the basis for our production mapping

@tranma tranma force-pushed the topic/input-test branch from 45a9de5 to b45b766 Compare March 1, 2017 02:54
@@ -155,7 +156,7 @@ static int64_t zebra_translate_column
istring_t *strings = anemone_mempool_alloc (mempool, count * sizeof (istring_t));
int offset = 1;
for (int i = 0; i != count; ++i) {
int64_t index = i + elem_start;
int64_t index = i;
Copy link
Contributor

@amosr amosr Mar 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this. I think this should include the elem_start, but that elem_start should not have added += len below, and also the recursive call should use start instead of elem_start.

the elem_start refers to the current level - that is the indices/lengths array. then start refers to the nested values, so you need to use it when calling zebra_translate_table

I think this code would pass as long as elem_start = 0 for nested arrays (strings), but not when you start splitting nested arrays up with max rows

does that make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I think I did that in the other branch -- (reading zebra facts in chunk)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like this?

                      int64_t index = i + elem_start;
                      int64_t start = data.a.n[index]   - data.a.n[0];
                      int64_t len   = data.a.n[index+1] - data.a.n[index];
 -                    offset = zebra_translate_table (mempool, elem_start, len, (void**)(strings + i), &data.a.table);
 -                    elem_start += len;
 +                    offset = zebra_translate_array (mempool, start, len, (void**)(strings + i), &data);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

example probably better

nested array of strings
  ["beef", "helps", "keep", "order"]

lengths:
  [4, 5, 4, 5]
scan (data.a.n):
  [0, 4, 9, 13, 18]
values (data.a.table):
  "beefhelpskeeporder"

zebra_translate_table(elem_start = 0, count = 1, table)
= ["beef"]

zebra_translate_table(elem_start = 0, count = 2, table)
= ["beef", "helps"]

zebra_translate_table(elem_start = 2, count = 2, table)
= ["keep", "order"]

the last one is the most interesting because it needs to start looking at scan array 2 and 3 (data.a.n), so to pull out the value "beef" it looks up scan[2] = 9 for the start index, and scan[3] = 13 for the end index, meaning the length is 13-9=4.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry I wrote that example comment above before I saw your reply

auxillary :: [ValType] -> [ValType]
auxillary [] = []
auxillary (t@(BufT _ a) : ts) = (ArrayT a : t : auxillary ts)
auxillary (t : ts) = t : auxillary ts
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

silly question, no chance of bufs inside other types at this point?

e ->
pure (VLeft (VError e))

peekInputs (x:xs) (index - 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this peekInputs might be worth moving in to sea eval, wherever peekOutput is

@amosr
Copy link
Contributor

amosr commented Mar 1, 2017

🥇 wow

@tranma
Copy link
Contributor Author

tranma commented Mar 6, 2017

I'm going to rebase and merge this, ignoring the unmelted value failure (see #500)

@tranma tranma force-pushed the topic/input-test branch from 2e31527 to e8fdcc4 Compare March 6, 2017 00:44
@tranma tranma merged commit d515064 into master Mar 6, 2017
@tranma tranma deleted the topic/input-test branch March 6, 2017 02:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants