Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Zigzag persistence part1 #917

Merged
merged 69 commits into from
Oct 24, 2024
Merged

Conversation

hschreiber
Copy link
Collaborator

@hschreiber hschreiber commented Jul 21, 2023

New module to compute zigzag persistence. To simplify the review process, this version contains only the main functionality, that is to compute the barcode.
Following functionalities will be added in separated PRs:

  • python interface
  • construction of oscillating rips filtrations from given edges
  • use of discrete Morse theory to simplify filtration
  • implementation of Dey and Hou's FastZigzag algorithm (?)(further tests are necessary)

The code is based on PR #401 and requires PR #886 and draft #669.

src/Zigzag_persistence/concept/ZigzagPersistenceOptions.h Outdated Show resolved Hide resolved
src/Zigzag_persistence/include/gudhi/Zigzag_persistence.h Outdated Show resolved Hide resolved
src/Zigzag_persistence/include/gudhi/Zigzag_persistence.h Outdated Show resolved Hide resolved
src/Zigzag_persistence/include/gudhi/Zigzag_persistence.h Outdated Show resolved Hide resolved
src/Zigzag_persistence/include/gudhi/Zigzag_persistence.h Outdated Show resolved Hide resolved
src/Zigzag_persistence/include/gudhi/Zigzag_persistence.h Outdated Show resolved Hide resolved
src/Zigzag_persistence/include/gudhi/Zigzag_persistence.h Outdated Show resolved Hide resolved
src/Zigzag_persistence/include/gudhi/Zigzag_persistence.h Outdated Show resolved Hide resolved
src/Zigzag_persistence/include/gudhi/Zigzag_persistence.h Outdated Show resolved Hide resolved
src/Zigzag_persistence/include/gudhi/Zigzag_persistence.h Outdated Show resolved Hide resolved
* non-zero coefficients generating it. A face should be represented by the arrow number when the face appeared for
* the first time in the filtration (if a face was inserted and then removed and reinserted etc., only the last
* insertion counts). The face range should be ordered by increasing arrow numbers.
* @param dimension Dimension of the inserted face.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can see, the only thing the dimension is used for is the output, and we wouldn't lose anything if we removed the dimension completely from this class, wrappers could handle it on their own.
I am not asking to do that (I haven't thought about it much), I just wanted to write this down.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is the same than for the matrix module: the dimension is only useful for the persistence diagram. I could make it optional for both.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can imagine what removing dim from Zigzag_persistence would look like. I am not so sure what making it optional would mean. It sounds like a lot of complications, extra overloads and if constexpr ☹️ Since the dimension doesn't hurt that much in Zigzag_persistence (if we don't want it, we can pass 0 all the time and ignore it in the callback), and if there was no support for dimension it wouldn't be hard to work around (store the dimension in a specific location, known to the callback, before each insert/remove), I don't think it deserves complications by making it optional, we should just pick yes or no.
It looks like the dimension is already kind of optional in the matrix, since zigzags don't pass the dimension to the matrix as far as I can see 😉

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the dimension is already kind of optional in the matrix, since zigzags don't pass the dimension to the matrix as far as I can see 😉

Ah, wait, this is not normal! Because Zigzag was based on the simplex tree before, it only worked with simplices and the matrix would automatically compute the dimension, so I didn't had to add it. But now that the code works for general cells, we have to give the dimension also to the matrices, as col.get_dimension is used to get the infinite bars...
In the matrix module, the dimension is always stored for chain/RU/boundary matrices and never for base matrices.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests didn't notice the problem because... there were no infinite bars? Or they were all of the same dimension?

Copy link
Collaborator Author

@hschreiber hschreiber Jul 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit error-prone indeed... But there is worse and at least it comes in handy, as simplices are still the bigger use case.
If the size is 1, the dimension will just be put to 0. Everything else will work completely alright. It is just that when returning the barcode, the dimension will be completely wrong.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I am not completely right: there is one point where a wrong dimension will make problems and that is in the boundary matrix (with only R and not RU). The reduction algorithm uses the clearing optimization which uses the dimensions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the size is 1, the dimension will just be put to 0.

Doesn't that deserve an assert?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could make it optional for both.

How horrible would it look?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How horrible would it look?

It won't be more horrible than any other thing in the matrix module, but you will not like it for sure. I would need to make two interval types (are make the interval inherit its dimension) and then add some std::conditional and if constexpr here and there.

/**
* @brief Type for filtration values.
*/
using filtration_value = unspecified;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@VincentRouvreau would have to weigh in, but I think the type Filtration_value is capitalized in all the rest of Gudhi, so it looks a bit strange here.

Copy link
Collaborator Author

@hschreiber hschreiber Jul 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to use lower case for all types which are or have high probability to be native types and upper case for classes/structs. But true, it makes the Filtration_value case weird. I don't have a very strong opinion about it, so I will let you both decide.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you get inspiration from somewhere for this style? Since we haven't used that in the rest of Gudhi, I am a bit reluctant to introduce this inconsistency whose benefits are not really obvious to me.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember where I got it from, but I think it was from a comment of someone, as I didn't paid much attention to this detail before. So either it was from someone working on Gudhi, or from one of my coauthors.

Like I said, I can change that for Gudhi if you prefer. Is it only for Filtration_value or is there some unwritten rule about type nomenclature?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know specifically about Gudhi (although we seem to have mostly done it this way), but it is fairly common to capitalize all types, so you can tell immediately if something is a type or a variable (in a wide sense that also includes functions). You can wait for Vincent to come back, but I think I would prefer to stick to this convention. It also makes it unnecessary (although not forbidden) to end a type name in _t or _type.

Comment on lines 78 to 80
* using dimension_type = Zigzag_persistence::dimension_type;
* using index_type = Zigzag_persistence::index;
* using filtration_value_type = Filtered_zigzag_persistence::filtration_value;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks strange that dimension_type already ends in _type while index becomes index_type.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but when I just write using index = Zigzag_persistence::index; the compiler complains about some "Redefinition of 'index' as different kind of symbol" with a first definition in "strings.h". I could just change index everywhere to index_type, but as the type is used a lot, I prefer the shorter version. Or I change dimension_type to dimension. Or perhaps capitalizing the first letter is just enough (as you want me to do anyway).

Like I said, I have no preferences on this. But then I would change all type names once and for all to homogenize them. So, if I understood well, your preference would be: all types start with a capitalized letter and only the name of object without any "_type" or "_t" at the end? Eg. dimension_type ---> Dimension.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Index should work (unless it conflicts with something else), or having the typedef not in the global namespace (in a subnamespace, in a function/class, etc).
Yes on the description of my preference, I think that's mostly what we've been doing so far in Gudhi (with obvious exceptions like value_type for compatibility with the standard library, or Thing_type where "type" is not in the C++ sense but means "kind", "category").

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, then I would do a PR for the matrix module too, to keep things coherent.

/**
* @brief Type for filtration values.
*/
using filtration_value = unspecified;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you get inspiration from somewhere for this style? Since we haven't used that in the rest of Gudhi, I am a bit reluctant to introduce this inconsistency whose benefits are not really obvious to me.

* non-zero coefficients generating it. A face should be represented by the arrow number when the face appeared for
* the first time in the filtration (if a face was inserted and then removed and reinserted etc., only the last
* insertion counts). The face range should be ordered by increasing arrow numbers.
* @param dimension Dimension of the inserted face.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can imagine what removing dim from Zigzag_persistence would look like. I am not so sure what making it optional would mean. It sounds like a lot of complications, extra overloads and if constexpr ☹️ Since the dimension doesn't hurt that much in Zigzag_persistence (if we don't want it, we can pass 0 all the time and ignore it in the callback), and if there was no support for dimension it wouldn't be hard to work around (store the dimension in a specific location, known to the callback, before each insert/remove), I don't think it deserves complications by making it optional, we should just pick yes or no.
It looks like the dimension is already kind of optional in the matrix, since zigzags don't pass the dimension to the matrix as far as I can see 😉


// Only the closed bars where output so far, so the open/infinite bars still need to be retrieved.

// in this example, outputs (0, 0) and (0, 7)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not "[0] 0 - inf"?

Copy link
Collaborator Author

@hschreiber hschreiber Jul 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The callback method will write "[0] 0 - inf", but it will be called with the arguments 0 and 0 (callback(0,0)).

Perhaps I should write "computes" instead of "outputs" to be more clear.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the decimal values help more than the verb.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But those are integers?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh, my comment seems completely off, sorry. I wonder if I intended to post it elsewhere instead...

src/Zigzag_persistence/doc/zigzag_ex.png Outdated Show resolved Hide resolved
*
* \li \gudhi_example_link{Zigzag_persistence,example_usage_zigzag_persistence.cpp} - A simple example to showcase how
* to use the @ref Zigzag_persistence class to compute a barcode.
* <details>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this doesn't work with the version of doxygen we use in the CI, we really need to upgrade before the next release (not this PR).

// The bars are stored within the class and where not output at all for now.

// get all bars in a vector
auto barcode = zp.get_persistence_diagram();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should use zp.get_persistence_diagram(0, true) so it matches the other examples? It can also be a hint to users that the documentation of this function may be useful. On the other hand, it complicates (slightly) the example.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to keep the example as simple as possible. It shows also what the default parameters do. But I have no strong opinion about it.


// Only the closed bars where output so far, so the open/infinite bars still need to be retrieved.

// in this example, outputs (0, 0) and (0, 7)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the decimal values help more than the verb.

src/common/doc/main_page.md Outdated Show resolved Hide resolved
src/Zigzag_persistence/doc/COPYRIGHT Outdated Show resolved Hide resolved
hschreiber and others added 2 commits October 24, 2024 11:09
@VincentRouvreau VincentRouvreau added the 3.11.0 GUDHI version 3.11.0 label Oct 24, 2024
@VincentRouvreau VincentRouvreau merged commit 139e285 into GUDHI:master Oct 24, 2024
5 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.11.0 GUDHI version 3.11.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants