Associative Map Example #343

oxinabox · 2020-12-12T16:42:05Z

This is pretty neat, I think.
Closes #332.
It has a list-backed dictionary and a hash-based dictionary.

TODO

Put under common interface (I need to work out why it is giving so many errors)
Use new interface/instance stuff, not @instance
Implement a fast hashing algorithm
~~Split Hashable into separate type classes for isequal and hash~~ make Hashable use Eq, and don't make Float hashable

Maybe TODO

Deleting
Overwriting (right now it basically leaks memory if you overwrite anything)

Not TODO:

Rebucketing when too full (that seems like it would complicate things a bit too much for this PR)
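
For readers unfamiliar with the two structures, here is a minimal Python sketch of the idea. It is not the Dex code in this PR: the names `ListDict`, `HashDict`, `store` and `try_get` are made up for illustration, and Python's built-in `hash` stands in for the Dex hash function. The sketch deliberately mirrors the limitations listed above: storing never removes an older entry for the same key, and the bucket count is fixed.

```python
class ListDict:
    """Association list: lookups scan the entries linearly."""

    def __init__(self):
        self.entries = []  # list of (key, value) pairs

    def store(self, key, value):
        # Append only. An older entry for the same key stays in the list,
        # which is the "leak on overwrite" behaviour noted in the TODO list.
        self.entries.append((key, value))

    def try_get(self, key):
        # Scan newest-first so the latest store for a key wins.
        for k, v in reversed(self.entries):
            if k == key:
                return v
        return None


class HashDict:
    """Fixed number of buckets; each bucket is a ListDict (no rebucketing)."""

    def __init__(self, n_buckets):
        self.buckets = [ListDict() for _ in range(n_buckets)]

    def _bucket(self, key):
        # Python's hash() is an assumption standing in for the Dex hash.
        return self.buckets[hash(key) % len(self.buckets)]

    def store(self, key, value):
        self._bucket(key).store(key, value)

    def try_get(self, key):
        return self._bucket(key).try_get(key)


# Usage, loosely following the benchmark data in this thread.
d = HashDict(n_buckets=128)
for i in range(256):
    d.store((10 * i, i), i)
print(d.try_get((1280, 128)))  # 128
print(d.try_get((1, 2)))       # None
```

With a fixed bucket count, lookups degrade toward a linear scan as buckets fill up, which is the cost of leaving rebucketing out.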

google-cla · 2020-12-12T16:42:10Z

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only `@googlebot I consent.` in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

oxinabox · 2020-12-12T16:45:47Z

And I made the bot sad again because I committed on top of the wrong branch. 😂

oxinabox · 2020-12-13T17:37:01Z

Benchmarks

Dex listdict: 6 us
Dex hashdict: 9 us
Julia hashdict: 0.07 us
Julia littledict: 0.33 us
Dex hash: 8 us (that's the bottleneck)
Julia hash: 0.06 us

Dex

full_data = for i:(Fin 256). ([10 * ordinal i, ordinal i], i)
big_listdict = fold (emptyListDict _ _)  \i state. storeListDict state (fst full_data.i) (snd full_data.i)
big_hashdict = fold (emptyHashDict 128 _ _) \i state. storeHashDict state (fst full_data.i) (snd full_data.i)

%time
tryGetHashDict big_hashdict [1280, 128]
(Just (128@Fin 256))

Compile time: 150.174 ms
Run time:     9.000 us

%time
thehash [1280, 128]
1720020324

Compile time: 74.541 ms
Run time:     8.000 us

%time
tryGetListDict big_listdict [1280, 128]
(Just (128@Fin 256))

Compile time: 130.772 ms
Run time:     6.000 us

Julia

The OrderedCollections.LittleDict is not quite the list dict implemented here, but it is close.
It's what I wanted to implement but couldn't get to work: one list of keys, one list of values.
(And I don't know what the Julia Dict uses for collisions, but I am sure it isn't a LittleDict.)

julia> using OrderedCollections, BenchmarkTools

julia> const full_data = [([10ii, ii]=>ii) for ii in 1:256];

julia> const littledict = LittleDict(full_data);

julia> const hashdict = Dict(full_data);

julia> @btime littledict[$[1280, 128]];
  330.862 ns (0 allocations: 0 bytes)

julia> @btime hashdict[$[1280, 128]];
  73.187 ns (0 allocations: 0 bytes)

julia> @btime hash($([1280, 128]))
  64.405 ns (0 allocations: 0 bytes)
0x71191e98f759495d
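
The "one list of keys, one list of values" layout mentioned above can be sketched in a few lines of Python. This is only an illustration of that layout, not the LittleDict source and not the Dex code in this PR. The class name `ParallelListDict` and its methods are hypothetical.

```python
class ParallelListDict:
    """Two parallel lists: keys[i] maps to values[i]; lookup is a linear scan."""

    def __init__(self):
        self.keys = []
        self.values = []

    def _index_of(self, key):
        for i, k in enumerate(self.keys):
            if k == key:
                return i
        return None

    def store(self, key, value):
        i = self._index_of(key)
        if i is None:
            self.keys.append(key)
            self.values.append(value)
        else:
            # Overwrite in place, so repeated stores do not grow the lists.
            self.values[i] = value

    def try_get(self, key):
        i = self._index_of(key)
        return None if i is None else self.values[i]


pd = ParallelListDict()
pd.store((1280, 128), 128)
pd.store((1280, 128), 129)
print(pd.try_get((1280, 128)), len(pd.keys))  # 129 1
```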

dougalm · 2020-12-14T00:25:11Z

If the hash is the performance bottleneck, should we use a cheaper one instead of the PRNG-grade threefry?

apaszke

Nice! BTW I wouldn't use %time for benchmarks, because it only executes the code once. When it only takes a few us or ns, you should use %bench, which will execute it a number of times to get a more accurate estimate.

Finally, there is one caveat, which is that for some reason even the most trivial functions seem to take around 2us when run in %bench, so it is unlikely that you will ever see values in the range of nanoseconds, even if that's the true cost. I suspect it's a problem with our benchmarking infrastructure, or it might be a cost of doing an FFI call from Haskell, but I didn't have time to investigate.
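
The difference between timing once and timing repeatedly can be shown with a small Python sketch. It illustrates the general technique only and says nothing about how %time or %bench are implemented. The helper names are hypothetical, and the median is just one possible way to summarise the samples. Timing an empty function gives a rough floor for the harness overhead, analogous to the roughly 2us floor described above.

```python
import statistics
import time


def time_once(fn) -> float:
    """Wall-clock seconds for a single call, as a one-shot timer would report."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def time_repeated(fn, repeats: int = 1000) -> float:
    """Median wall-clock seconds over many calls."""
    return statistics.median(time_once(fn) for _ in range(repeats))


def work():
    return sum(range(100))


single = time_once(work)             # noisy: includes warm-up effects
typical = time_repeated(work)        # steadier estimate
floor = time_repeated(lambda: None)  # cost of the timing harness itself
print(f"single={single:.2e}s typical={typical:.2e}s floor={floor:.2e}s")
```

If `typical` is close to `floor`, the measurement is dominated by overhead rather than by the function under test.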

oxinabox · 2020-12-14T15:28:34Z

examples/hashmap.dx

+:p tryGetHashDict big_hashdict [210, 20]
+:p tryGetListDict big_listdict [210, 20]
+
+' ## Wrap under a common interface


I would like help working out what I am doing wrong in this section

oxinabox · 2020-12-14T16:45:49Z

> If the hash is the performance bottleneck, should we use a cheaper one instead of the PRNG-grade threefry?

A quick Google search suggests people commonly use the FNV hash algorithm, which seems like it would be easy enough to implement for Int32.
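
As a rough idea of how small FNV is, here is a Python sketch of the 32-bit FNV-1a variant. The thread does not say which FNV variant or byte order the PR ends up using, so both are assumptions here, and the helper `hash_int32s` is hypothetical.

```python
FNV_OFFSET_BASIS_32 = 0x811C9DC5
FNV_PRIME_32 = 0x01000193


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a: XOR each byte into the state, then multiply by the prime."""
    h = FNV_OFFSET_BASIS_32
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF  # keep the state within 32 bits
    return h


def hash_int32s(values) -> int:
    """Hash a sequence of Int32 values via their little-endian bytes (assumed order)."""
    data = b"".join(v.to_bytes(4, "little", signed=True) for v in values)
    return fnv1a_32(data)


print(hex(fnv1a_32(b"a")))       # 0xe40c292c
print(hash_int32s([1280, 128]))  # some value in [0, 2**32)
```

The inner loop is one XOR and one multiply per byte, which is why it is usually considered much cheaper than a PRNG-grade hash.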

oxinabox

.

dan-zheng · 2020-12-18T22:59:45Z

dev branch has been merged into main, now that plotting has been revamped. dev branch will be deleted now, we can continue development on main.

I'll change the base branch of this PR to main now!

oxinabox · 2020-12-19T16:50:15Z

The FNV-based hash doesn't seem much faster.
Hard to say though, since I can't use %bench (#359).
I suspect the time to do the FFI call is dominating.

%time
threefryHash 0 1
-841280227

Compile time: 322.792 ms
Run time:     114.000 us 
%time
threefryHash 0 1
-841280227

Compile time: 864.773 ms
Run time:     8.000 us 
%time
threefryHash 0 1
-841280227

Compile time: 224.515 ms
Run time:     5.000 us

vs

%time
fnvhash 0 1
-1717489731

Compile time: 36.299 ms
Run time:     9.000 us 
%time
fnvhash 0 1
-1717489731

Compile time: 22.220 ms
Run time:     6.000 us 
%time
fnvhash 0 1
-1717489731

Compile time: 22.274 ms
Run time:     5.000 us

oxinabox · 2020-12-24T23:47:00Z

examples/hashmap.dx

+    tryGet = tryGetHashDict
+
+
+' all of the following error except the first


This comment is wrong: only the tryGet calls error,
but I don't know why they do.

:p tryGet eg_listdict 'a'
Type error:Ambiguous type variables: [?2]
( ()
, ( ( tmp @> ( ((k:Type) ?-> (v:Type) ?-> (Associative (ListDict Word8 Int32) k v) ?=> (ListDict Word8 Int32) -> k -> Maybe v)
             , (tryGet (ListDict Word8 Int32)) )
    , tmp1 @> ( ((v:Type) ?-> (Associative (ListDict Word8 Int32) Word8 v) ?=> (ListDict Word8 Int32) -> Word8 -> Maybe v)
              , (tmp Word8) )
    , tmp2 @> ( ((Associative (ListDict Word8 Int32) Word8 ?2) ?=> (ListDict Word8 Int32) -> Word8 -> Maybe ?2)
              , (tmp1 ?2) )
    , tmp3 @> (((ListDict Word8 Int32) -> Word8 -> Maybe ?2), (tmp2 _))
    , tmp4 @> ((Word8 -> Maybe ?2), (tmp3 eg_listdict))
    , tmp5 @> ((Maybe ?2), (tmp4 'a'))
    , tmp6 @> ((Maybe ?2), tmp5)
    , _ans_ @> ((Maybe ?2), tmp6) )
  , [ tmp:((k:Type) ?-> (v:Type) ?-> (Associative (ListDict Word8 Int32) k v) ?=> (ListDict Word8 Int32) -> k -> Maybe v) = tryGet (ListDict Word8 Int32)
    , tmp1:((v:Type) ?-> (Associative (ListDict Word8 Int32) Word8 v) ?=> (ListDict Word8 Int32) -> Word8 -> Maybe v) = tmp Word8
    , tmp2:((Associative (ListDict Word8 Int32) Word8 ?2) ?=> (ListDict Word8 Int32) -> Word8 -> Maybe ?2) = tmp1 ?2
    , tmp3:((ListDict Word8 Int32) -> Word8 -> Maybe ?2) = tmp2 _
    , tmp4:(Word8 -> Maybe ?2) = tmp3 eg_listdict
    , tmp5:(Maybe ?2) = tmp4 'a'
    , tmp6:(Maybe ?2) = tmp5
    , _ans_:(Maybe ?2) = tmp6 ] ) )

apaszke · 2021-01-06

I'm pretty sure that this is because Dex isn't aware that the dict parameter of Associative actually implies the values of k and v. Nothing prevents you from defining weird instances such as Associative (HashDict k v) Int Float, so it conservatively assumes that this code is ambiguous.

FWIW, the same problem in Haskell is solved using the FunctionalDependencies extension, but we don't have it in Dex, and I'm not sure if we should. Our type class design is still a bit up in the air.
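
The reasoning above can be modelled with a toy instance resolver in Python. This is a deliberately simplified illustration, not how Dex or GHC resolve instances. The table contents, the `use_fundeps` flag and all names are hypothetical. Without a dependency from the dictionary type to its key and value types, the resolver refuses to guess an unknown parameter. With one, the unknown is filled in before lookup.

```python
class AmbiguousInstance(Exception):
    """Raised when a class parameter is unknown and may not be guessed."""


# Declared instances of a three-parameter class
# Associative(dict_type, key_type, value_type).
INSTANCES = {
    ("ListDict Word8 Int32", "Word8", "Int32"): "list-dict implementation",
}

# Optional functional dependency: dict_type -> (key_type, value_type).
FUNDEPS = {
    "ListDict Word8 Int32": ("Word8", "Int32"),
}


def resolve(dict_type, key_type=None, value_type=None, use_fundeps=False):
    """Pick an instance; None marks a type the caller could not infer."""
    if use_fundeps and dict_type in FUNDEPS:
        dep_key, dep_value = FUNDEPS[dict_type]
        key_type = key_type or dep_key
        value_type = value_type or dep_value
    if key_type is None or value_type is None:
        # Open world: another instance with a different value type could be
        # declared later, so an unknown parameter is never guessed.
        raise AmbiguousInstance(f"cannot determine all parameters for {dict_type}")
    return INSTANCES[(dict_type, key_type, value_type)]


# Mirrors `tryGet eg_listdict 'a'`: dict and key types known, value type unknown.
try:
    resolve("ListDict Word8 Int32", key_type="Word8")
except AmbiguousInstance as err:
    print("ambiguous:", err)

print(resolve("ListDict Word8 Int32", key_type="Word8", use_fundeps=True))
# list-dict implementation
```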

google-cla bot added the cla: no label Dec 12, 2020

oxinabox force-pushed the ox/hashmap branch from 6edb79c to 9738530 Compare December 12, 2020 16:44

google-cla bot added cla: yes and removed cla: no labels Dec 12, 2020

apaszke reviewed Dec 14, 2020

oxinabox commented Dec 14, 2020

oxinabox commented Dec 15, 2020

dan-zheng changed the base branch from dev to main December 18, 2020 22:59

oxinabox force-pushed the ox/hashmap branch from 5641b5f to 656af03 Compare December 19, 2020 11:30

oxinabox commented Dec 24, 2020

oxinabox marked this pull request as draft December 29, 2020 19:51

apaszke mentioned this pull request Jan 15, 2021

Allow bundling associated types in interfaces #460

Open

srush mentioned this pull request Jan 20, 2021

JSON implementation #474

Open

oxinabox added 4 commits February 14, 2021 19:57

Add Associative Collections example

5337bd1

timing (squash me. should use %bench)

Add FFI for FNV based hash

a9cf872

wip

b5c6f07

rework hashable typeclass to depend on Eq

94e8d03

oxinabox force-pushed the ox/hashmap branch from c95ebdb to 94e8d03 Compare February 14, 2021 19:57

update to new syntax

0d072b6

oxinabox mentioned this pull request Oct 1, 2021

Add explicit interface methods #660

Merged

apaszke force-pushed the main branch from 46b8727 to 8db43fc Compare May 13, 2022 14:51
