Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specialized type for Boolean sparse arrays #79

Open
heliosdrm opened this issue Sep 19, 2019 · 10 comments
Open

Specialized type for Boolean sparse arrays #79

heliosdrm opened this issue Sep 19, 2019 · 10 comments

Comments

@heliosdrm
Copy link
Collaborator

heliosdrm commented Sep 19, 2019

Recurrence matrices are wrappers of SparseMatrixCSC{Bool,Int}, which are useful to make some operations. However it is inefficient to create them (cf. #76), and also take more memory than what is needed (all nonzero values are true, so they don't add any information at all).

It would be useful to have a special type of sparse matrices for these cases, which only store row indices and the last index of each column - but not the values, with optimized methods. This might come in handy even for other applications, not only for recurrence analysis, so in my opinion it is worth to be in a separate package (outside JuliaDynamics).

I feel like investigating if such a package exists, and otherwise draft it. When it is working, we could use it to improve performance in this package.


Want to back this issue? Post a bounty on it! We accept bounties via Bountysource.

@Datseris
Copy link
Member

Indeed this seems useful. But it could also be a lot of effort.

What I think would make sense is to first implement it here, and add the operations necessary for RecurrenceAnalysis to work. Then, if you feel you want to spent more effort to make it generally useful you can do it.

But why do you suggest for it to be outside of JuliaDynamics? It can be a separate package from DynamicalSystems.jl , but what is the specific reason for it to be in another org?

@Datseris
Copy link
Member

By the way, this is really funny because I was thinking to do exactly something along these lines next month, where I will be spending some time doing quality PRs all over JuliaDynamics. (Just defended my PhD last Friday!)

@heliosdrm
Copy link
Collaborator Author

I think it doesn't really matter whether that package is made inside or outside JuliaDynamics. I said that because the concept is not a specific tool for dynamical systems, so drafting it outside also makes sense, and would save the effort of maintenance by JuliaDynamics.

Nevertheless: I agree that we could start by implementing only what RecurrenceAnalysis needs. And then extend in a separate package it if it is worth the effort.

@heliosdrm
Copy link
Collaborator Author

By the way, this is really funny because I was thinking to do exactly something along these lines next month, where I will be spending some time doing quality PRs all over JuliaDynamics. (Just defended my PhD last Friday!)

Congratulations!!!

@pucicu
Copy link
Contributor

pucicu commented Sep 19, 2019 via email

@asinghvi17
Copy link
Member

asinghvi17 commented Sep 25, 2019

Congratulations!

Here's some code that should help avoid materializing the Boolean values:

module SingleValueArrays

struct SingleValueArray{T, N} <: AbstractArray{T, N}
    value::T
    size::NTuple{N, Int64}
end

SingleValueArray(i::Int64, val::T) where T = SingleValueArray{T, 1}(Tuple(i), val)

Base.size(s::SingleValueArray) = s.size

Base.length(s::SingleValueArray) = prod(s.size)

Base.getindex(s::SingleValueArray, args...) = s.value # we really don't care

Base.setindex!(s::SingleValueArray, args...) = error("SingleValueArrays are immutable!")

export SingleValueArray

end

Then, we could just pass this into the SparseMatrixCSC constructor.

I'm not sure if we can avoid materializing the array when constructing the sparse matrix, though. Will need to be tested.

Edit: one could envision a segmented vector, which has n values and breaks between them; that might avoid materializing the full column array, as well (at some runtime penalty for index lookup).

@Datseris
Copy link
Member

Base.getindex(s::SingleValueArray, args...) = s.value # we really don't care

well, we do care a little bit :P one has to check for out of bounds here. This type does not store any information about which entries have the value. I think the best way forward is indeed the original suggestion of making a specialized type of sparse array that has the I and the J of the sparse matrix, but not the values (the nnz).

@asinghvi17
Copy link
Member

My idea was that we could put this array type as the nzvals while constructing the sparse matrix, so that we can avoid materializing the trues. In that case, though we could do bounds checking, it may not really be necessary.

@Datseris
Copy link
Member

ah, okay, now I understand. Sorry. What does size achieve then?...

@asinghvi17
Copy link
Member

There's an internal size check in the SparseArray constructor IIRC, so one needs to be able to provide a size.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants