-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dynamic datapoint dimensions (fixed per tree) #53
Comments
First , thanks for the interest, it's so great to hear when people are actually using the code! Mhmm, good question about the dimensions though. I mean, you can fork the project, apply your proposed changes and then make a Pull Request. Then we can see how many changes are necessary and what the impact on the codebase and on the potential performance are? |
I'd be happy to make a pull request, but I'm not exactly sure how to properly do it while protecting the user from not putting different dimensions into it. I'm still pretty into OO design, ( i know..), so from that point of view, hey a tree should just know it's own dimensions. But enums don't do that. The init can take the first element and use the dimension of that one, we could throw error on init without values, we should loop over all values to check dimensions and throw error on difference. Or we provide a dimensions value in the init, but we can't store it nor check for it in future operations. The insert and remove operations can also get the dimension from the provided value, or get it as an extra argument. From what I see, that's it. I don't know how to make it so that we constantly guard the user against user error. I'd do it myself, for my purposes, I'll embrace the terrible crash if I mess up :). That's not friendly for most users though and shouldn't make it into the core. It will also break everyones code, since the base datapoint protocol needs changing. If you know of a good way around this I'm open to building it (except full rewrites :p) EDIT: removed unfinished sentence. |
Oh well, I'd say try to open a PR, to see that tests succeed and all and we go from there :) This way we can discuss the individiual changes in the code much better. (each node in the tree has an element, so if you add a new node, we can check f the new node has the same dimension, so it shouldn't be to hard to make a good error message). |
I have made a pull request without breaking anything, any thrown errors will make breaking changes. It is not as friendly as could be, but at least it is fully optional. |
Mhmm, it looks like the automatic Unit tests aren't running through. |
Sure, no worries. I’ll just continue building my integration after the weekend and we can see how (and if) you want to integrate the result.
|
Just for understanding, @AlexDM0 , in your use-case, the number of dimensions is very diferrent per house, it's not just that all houses fall into 2D, 3D and 4D categories, or s similarly limited list? |
A house contains a number of broadcasting beacons, which we sell (called Crownstones). Each user can have a different amount of beacons in his house. A user can be invited to join the houses of friends or an office, which can also contain a different number of beacons. I have 35 Crownstones in my house, so that is 35 dimensions. Another user may have 4, or 10. The more you have, the more accurate the room level localisation is. We have testing houses with 50+ Crownstones. We could limit the amount of houses to say 20, and keep 20 instances of datapoint definitions, where we can set the dimensions at runtime, but that's a little dirty :) |
Mhmm, yes, I see. So each beacon can then have a
And now for each house you have a Set of Float-vectors where each vector has the same size for that house&app-install?
|
I'm wondering whether this is actually a rather frequent use case and whether we should build it more clearly into the type I know, the dimensionsoverride works, but it feels less clean. |
Int value actually, they are signal strength indicators. I'll first construct a sorted list from all the collected datapoints. So index 0 is beacon X index 1, beacon Y etc. This array goes into each datapoint (with values for missing data) so all have the same dimensions. Each data point has a room label, and this sorted list. Every incoming dataset will be mapped to this sorted array, and queried in the KD tree. I'll use the distance function with one we have here to get the distance estimate between points. At least, that's what I had in mind so far. It's an experiment :). We have a number of difference classifiers and I wanted to add this method as a test. It is indeed an ugly solution, but if we'd do a cleaner version, it would break everyones code since they use the current datapoint protocol. I wouldnt blame you if you wouldnt merge this :) It is for a usecase that you not really designed for and thats perfectly fine. |
It might be that all people have to do is to delete the word Will look into it later today, I think there is also a way of annotating the code, so Xcode can make the right suggestion how people can fix their code when the load a new version of the repo. |
Except you lose the guarantee that all datapoints in the tree have the same dimension. This is an important point for the tree. |
You could have a class which gets the dimensions on initialisation, which can wrap your methods, do checks and error reporting and use the dimensionOverride internally. That way users won't see it, its safe, but its not functional anymore... pick your poison :) |
I think I can just create a warning in the |
Yes, but you can document that with a warning. If you use this, its up to you to have your stuff in order. A sort of advanced usage. Most users likely wont need it anyway and they're safe due to the static dimensions. |
Hi there! We'd like to use your KDTree classifier for indoor localisation in a smart home. I liked your talk so I was looking into experimenting with this lib.
Our datavectors (per house for instance) have X dimension, but another house may have Y dimensions. The user is in control of adding houses, and he determines how many dimensions a vector has (by having more/less beacons).
We want to generate a tree for each house and the issue then becomes the static let dimensions. We can set the dimensions of all datapoints by Element.dimensions = X. We could keep a number of types beforehand, but keeping this artificial limit on the amount of houses seems a bit ugly. I can't see a way to have a "type-factory" that generates these structs with custom dimensions either.
I could clone your lib and change the 3 instances where you take it from Element and instead take it from the provided element itself. This makes it of course the responsibility of the user to ensure all datapoints have the same dimensions.
What's your take on this?
The text was updated successfully, but these errors were encountered: