Error when running DenStream.runOnNewSample() without running DenStream.runInitialization() #1

Open
luisalfarob opened this issue Mar 5, 2018 · 2 comments

Comments

@luisalfarob

luisalfarob commented Mar 5, 2018

Hello,
When I try to run your code without calling the runInitialization function, I get the following error:
Traceback (most recent call last):
File "C:\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 5, in
result.append(den1.runOnNewSample(sample))
File "C:\git\OutlierDenStream\DenStream.py", line 178, in runOnNewSample
self.buffer.append(sample)

This error can be resolved by changing this line:
self.buffer.append(sample)
to:
self.buffer.append(sample.value)

Solved!
Now I get another error:
Traceback (most recent call last):
File "C:\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 5, in
result.append(den1.runOnNewSample(sample))
File "C:\git\OutlierDenStream\DenStream.py", line 179, in runOnNewSample
if (len(self.buffer) >= self.numberInitialSamples):
TypeError: '>=' not supported between instances of 'int' and 'NoneType'

I noticed that the variable numberInitialSamples is not set (it is None). So I'm wondering whether I need to change the way I run the code, or whether there are bugs when running the code for multicluster operations...
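For reference, the comparison in the traceback fails in plain Python 3 whenever the right-hand side is None. The snippet below is a minimal reproduction that does not use DenStream at all; the names `buffer` and `number_initial_samples` are stand-ins chosen for illustration.

```python
# Stand-ins for self.buffer and self.numberInitialSamples (illustrative names).
buffer = [1.0, 2.0, 3.0]
number_initial_samples = None  # never assigned a real value

try:
    # Same shape as the failing line: int >= None is not defined in Python 3.
    len(buffer) >= number_initial_samples
    message = ""
except TypeError as exc:
    message = str(exc)

print(message)
```

So the second traceback points at a missing value for numberInitialSamples rather than at the buffer itself.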

I hope you can help me with this problem.
Regards.

@anrputina
Owner

Hello,
Unfortunately, the algorithm currently has a very specific purpose (anomaly detection), so the initial clustering part is not well developed. In particular, for now, the algorithm manages only two clusters: normal and abnormal. If it is able to merge the new sample into one of the core-micro-clusters, the output is 'Normal'; otherwise it is 'Abnormal'.
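The two-outcome behaviour described above can be sketched roughly as follows. This is a simplified illustration, not the code in DenStream.py: it assumes each core-micro-cluster is represented only by a center point, and that "able to merge" means the sample lies within a radius `epsilon` of some center. The function name `classify` and its parameters are hypothetical.

```python
import math


def classify(sample, centers, epsilon):
    """Return 'Normal' if sample is within epsilon of any center, else 'Abnormal'.

    Assumption: a cluster is just a center point; the real algorithm keeps
    weighted statistics per micro-cluster, which this sketch omits.
    """
    for center in centers:
        # Euclidean distance between the sample and this cluster center.
        distance = math.sqrt(sum((s - c) ** 2 for s, c in zip(sample, center)))
        if distance <= epsilon:
            return "Normal"
    return "Abnormal"


centers = [(0.0, 0.0), (10.0, 10.0)]
near = classify((0.1, 0.1), centers, epsilon=0.5)
far = classify((5.0, 5.0), centers, epsilon=0.5)
```

Under these assumptions, a sample close to an existing center is labelled 'Normal' and anything else is labelled 'Abnormal', which matches the two-cluster output described in the comment.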

I can add multicluster support in a future update (unfortunately, I am working on something else right now). If you want to contribute, any help is welcome. If you try to use the multicluster operations now, I think you will only get bad results.

As for your error: self.numberInitialSamples is initialized to None, so you have to assign a value when creating the object, something like this:
den = DenStream(lamb=0.03, epsilon='auto', beta=0.03, mu='auto', ..., numberInitialSamples=XX)
I hope this helps, but in any case the code is not ready for multiclustering (just two clusters, normal and abnormal, in my case).
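One way to make this failure easier to diagnose would be to check for the missing value before buffering. The class below is a hypothetical stand-in written for illustration, not the real DenStream class: only the parameter name numberInitialSamples and the method name runOnNewSample come from this thread, and the return strings are invented to show the control flow.

```python
class BufferedDetector:
    """Hypothetical stand-in that buffers samples until enough have arrived."""

    def __init__(self, numberInitialSamples=None):
        self.numberInitialSamples = numberInitialSamples
        self.buffer = []
        self.initialized = False

    def runOnNewSample(self, sample):
        if self.initialized:
            return "ready"
        # Fail early with a clear message instead of an int/None comparison error.
        if self.numberInitialSamples is None:
            raise ValueError(
                "numberInitialSamples must be set when runInitialization() is skipped"
            )
        self.buffer.append(sample)
        if len(self.buffer) >= self.numberInitialSamples:
            self.initialized = True
            return "initialized"
        return "buffering"


detector = BufferedDetector(numberInitialSamples=2)
states = [detector.runOnNewSample(x) for x in (0.1, 0.2, 0.3)]
```

With numberInitialSamples=2, the first sample is buffered, the second triggers initialization, and later samples are handled normally.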

Write to me if you need anything else.

@ZahoorAhmad

Could you let me know how to run this for anomaly detection? There is no documentation for startingBuffer in the initialization.
Thanks in advance.
