affinity propagation result is not consistent with sklearn in python #131

Open
xiuliren opened this issue Nov 9, 2018 · 3 comments

Comments

@xiuliren
Contributor

xiuliren commented Nov 9, 2018

The two clustering results are different. The Julia version did not do any clustering: the assignment of each object is just its own index! My similarity matrix is too large to show here.

using Clustering

@time affinityPropResult = Clustering.affinityprop(similarityMatrix)

affinityPropResult.assignments

using PyCall

@pyimport sklearn.cluster as cl
af = cl.AffinityPropagation(affinity="precomputed")[:fit](similarityMatrix)

labels = af[:labels_]

The Travis tests also do not verify the correctness of the result.

@alyst
Member

alyst commented Nov 11, 2018

Thanks for the report! You are most welcome to submit a fix.
Otherwise, if you have a small (<100 entities, the smaller the better) reproducible example, I can look into that.

@xiuliren
Contributor Author

I double-checked the results: they are the same in some random tests.
I had not added 1 to the Python result, and Python labels start from 0.

labels = af[:labels_] .+ 1

@xiuliren
Contributor Author

This issue still exists. My tests show that the diagonal values must be set to the median of the similarity matrix; otherwise the result is not consistent with Python!

This is my test code:

using Distances
using Clustering
using LinearAlgebra
using Random
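using Statistics

Random.seed!(123)

d = 10           # dimension of each point
n = 44           # number of points (one per column of x)
x = rand(d, n)
S = -pairwise(Euclidean(), x, x)   # similarity = negative Euclidean distance

# set diagonal value to median value
# S = S - diagm(0 => diag(S)) + median(S)*I

R = affinityprop(S)

k = length(R.exemplars)   # number of clusters found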

using PyCall

@pyimport sklearn.cluster as cl
af = cl.AffinityPropagation(affinity="precomputed")[:fit]( S )
ref_assignments = af[:labels_] .+ 1

@assert randindex(R.assignments, ref_assignments)[2]==1.0
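For reference, scikit-learn documents that when `preference` is not passed to `AffinityPropagation`, it is set to the median of the input similarities. That would be consistent with the observation above, assuming the Julia side takes its preferences from the diagonal of `S` as given. Below is a minimal standard-library Python sketch of the same preprocessing as the commented-out Julia line. The helper names `negative_euclidean_similarity` and `with_median_diagonal` are made up for illustration, and the random points are not the same as those produced by `Random.seed!(123)` in Julia.

```python
import math
import random
import statistics


def negative_euclidean_similarity(points):
    """Return S with S[i][j] = -(Euclidean distance between points i and j)."""
    return [[-math.dist(p, q) for q in points] for p in points]


def with_median_diagonal(S):
    """Return a copy of S whose diagonal is the median of all entries of S.

    Mirrors `S - diagm(0 => diag(S)) + median(S)*I` from the Julia test:
    the median is taken over the full matrix, original diagonal included.
    """
    pref = statistics.median(v for row in S for v in row)
    return [[pref if i == j else v for j, v in enumerate(row)]
            for i, row in enumerate(S)]


random.seed(123)
d, n = 10, 44  # same sizes as in the Julia test above
points = [[random.random() for _ in range(d)] for _ in range(n)]

S = negative_euclidean_similarity(points)
S_med = with_median_diagonal(S)  # hypothetical input for the Julia-style call
```

Passing a matrix prepared like `S_med` to both implementations should at least remove the difference in preferences from the comparison; whether the remaining results match exactly is not something this sketch checks.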
