Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

depth_first_search: Return visitor by value #307

Open
wants to merge 7 commits into
base: develop
Choose a base branch
from

Conversation

rkohrt
Copy link

@rkohrt rkohrt commented Sep 12, 2022

In the documentation the DFSVisitor is taken by reference: https://www.boost.org/doc/libs/1_80_0/libs/graph/doc/depth_first_visit.html , while in the actual header it is taken by copy.
I assume the code in the header is by mistake and I would actually like it to take the visitor by reference for my usecase.

As far as I can tell this mismatch existed already from the beginning: 121bb31

@jeremy-murphy
Copy link
Contributor

Yes, the algorithm has DFVisitor& in the declaration, but if you search for all the references to "visitor" on that page, you'll notice that it is otherwise defined as being taken by value.

If you recall, visitors to STL algorithms (such as std::for_each) are also taken by value, so that is the 'correct' thing to do here. Unfortunately, this algorithm doesn't implement the complete visitor design, which should include returning the visitor by value at the end.

So the real fix is to change these algorithms from returning void to returning DFSVisitor by value.

@jeremy-murphy
Copy link
Contributor

Thanks for the quick response. For a moment there I thought that detail::depth_first_visit_impl needed to be changed in the same way, but since it already takes the visitor by reference, well, let's leave it that way.

@jeremy-murphy
Copy link
Contributor

We can remove that [1] note now about a mutable visitor needing to hold data by pointer/reference.

@jeremy-murphy jeremy-murphy changed the title depth_first_search.hpp: align code to documentation depth_first_search: Return visitor by value Sep 15, 2022
@jeremy-murphy
Copy link
Contributor

Great, all tests pass, nothing broken!

@jeremy-murphy
Copy link
Contributor

Oh, now I just realized that this is only the depth_first_VISIT page, so we also have to modify the documentation page for depth_first_SEARCH.

There is also the BREADTH_first algorithm to consider. It would be odd to change one and not the other.

Also... we do need a test to show that the new interface works. Nothing complicated, just something that shows that a mutating visitor returns in the expected state. One that counts the number of times each different visit function is called would be good. When it is returned, you assert the counts equal what you expect.

I know this is more work than you expected, but it is a really positive improvement for the library. I'll advise you through it and merge it at the end.

@rkohrt
Copy link
Author

rkohrt commented Sep 16, 2022

Okay, will make these changes next week

@rkohrt
Copy link
Author

rkohrt commented Sep 27, 2022

Will need to have a closer look at the breath first search to make the change there, because the interface is different. But here is the test for the dfv

@jeremy-murphy
Copy link
Contributor

It's looking really good!

Comment on lines 87 to 88
BOOST_TEST(visitor_copy.vertex_events == 6u*2u + 1u); // discover_vertex + finish_vertex for each vertex and once start_vertex
BOOST_TEST(visitor_copy.seen_edges == 2u*9u);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Magic numbers are easily confusing; presumably 6 = g.num_vertices(), and 9 = g.num_edges()?

Comment on lines 62 to 63
size_t vertex_events = 0;
size_t seen_edges = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this is a compilation error in C++03, sorry. You'll have to initialize them the old-fashioned way, in the constructor.

@badfilms
Copy link

badfilms commented Oct 5, 2022

I believe that the issue with passing a visitor by value has to do with when the DFS implementation is recursive, which is why the source for that implementation explicitly has the following comment:

DFSVisitor& vis, // pass-by-reference here, important!

From my experience, this is due to the fact that visitors are inherited from (or otherwise defined in a manner that meets the visitor concept), and thus we cannot expect a user-defined visitor to not have state (which will not be maintained in the current recursive implementation unless passed by reference).

Obviously, the recursive implementation shouldn't be preferred to the iterative one in the majority of use cases, and I believe is more of a legacy implementation from many years ago, but it is worth noting.

But the inconsistencies make the design decision less transparent. Because of the point mentioned above regarding state, I was always under the impression that it was due to the fact that passing a visitor by value has the potential to be expensive--unlike the higher order functions passed by value to algorithms like std::for_each, which just take an invokable that should be trivially copyable.

@jeremy-murphy
Copy link
Contributor

Hey @badfilms , thanks for joining in and pointing out those facts that yes, make the original design decision less transparent and the best way forward now less obvious.

I'm still happy to proceed with the design that I have suggested, copying the visitor, until a definite red flag appears.

But the inconsistencies make the design decision less transparent. Because of the point mentioned above regarding state, I was always under the impression that it was due to the fact that passing a visitor by value has the potential to be expensive--unlike the higher order functions passed by value to algorithms like std::for_each, which just take an invokable that should be trivially copyable.

It's worth noting that std::for_each actually returns the visitor passed into it, which is only done because the visitor might have state that you want to inspect after the algorithm completes. In fact, one might even argue that the only reason to use std::for_each is to apply a stateful 'visitor' and inspect it at the end. (Any transformation (i.e. mutation) of the elements can be accomplished using std::transform.)

Having state does not mean copying is expensive, right? It's only if the copy/assignment constructor is non-trivial that it becomes an issue. Now if a visitor wanted to have a huge amount of state, then I would say it should just hold a pointer to it -- which itself is trivially copyable. And this argument goes for both 'visitors' in std::for_each and graph algorithms.

But on the other hand, I can't really argue against taking visitors by reference either. ¯\_(ツ)_/¯

@badfilms
Copy link

badfilms commented Oct 6, 2022

It's worth noting that std::for_each actually returns the visitor passed into it, which is only done because the visitor might have state that you want to inspect after the algorithm completes. In fact, one might even argue that the only reason to use std::for_each is to apply a stateful 'visitor' and inspect it at the end. (Any transformation (i.e. mutation) of the elements can be accomplished using std::transform.)

@jeremy-murphy Ahh right. I clearly wasn't thinking when I wrote that because my brain was differentiating a DFSVisitor, for example, from a visitor to a std algorithm due to the fact that it has multiple points of interest, forgetting the case of a function object.

}

size_t vertex_events;
size_t seen_edges;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would actually be tempted to count each distinct event separately, to get a complete picture of what happens.
This would also help catch any subtle bugs.

@jeremy-murphy
Copy link
Contributor

I'm still keen to see this finished, btw. Are you still interested in working on it, @rkohrt ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants