Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New behavior of igraph::disjoint_union #271

Open
bockthom opened this issue Oct 21, 2024 · 17 comments
Open

New behavior of igraph::disjoint_union #271

bockthom opened this issue Oct 21, 2024 · 17 comments

Comments

@bockthom
Copy link
Collaborator

The following comment has been posted by @maxloeffler in #260:


Unfortunately, somewhat about their new version breaks in our usage of igraph::disjoint_union (which worked prior to updating igraph). I researched a bit and found that the breaking change was introduced 4 months ago in this commit. I have not yet figured out a way to fix that issue.

Concrete error description

Error Message:

── Error (test-networks.R:404:5): Extraction of sub-networks ───────────────────
<vctrs_error_ptype2/vctrs_error_incompatible_type/vctrs_error_incompatible/vctrs_error/rlang_error/error/condition>
Error in `vctrs::vec_c(attr[[exattr[a]]], ea[[exattr[a]]])`: Can't combine `..1` <list> and `..2` <character>.
Backtrace:

  1. ├─global get.sample.network() at test-networks.R:404:5
  2. │ └─net.builder$get.multi.network() at util-networks.R:2179:5
  3. │   └─igraph::disjoint_union(authors.net, artifacts.net) at util-networks.R:1280:13
  4. │     └─vctrs::vec_c(attr[[exattr[a]]], ea[[exattr[a]]])
  5. └─vctrs (local) `<fn>`()
  6.   └─vctrs::vec_default_ptype2(...)
  7.     ├─base::withRestarts(...)
  8.     │ └─base (local) withOneRestart(expr, restarts[[1L]])
  9.     │   └─base (local) doWithOneRestart(return(expr), restart)
 10.     └─vctrs::stop_incompatible_type(...)
 11.       └─vctrs:::stop_incompatible(...)
 12.         └─vctrs:::stop_vctrs(...)
 13.           └─rlang::abort(message, class = c(class, "vctrs_error"), ..., call = call)

When breaking the variables are instanciated as follows:

Browse[1]> attr
$date
$date[[1]]
[1] "2004-10-09 18:38:13 UTC" "2005-02-09 18:49:49 UTC"

$date[[2]]
[1] "2010-07-12 10:05:36 UTC" "2010-07-12 11:05:35 UTC"

$date[[3]]
[1] "2010-07-12 12:05:34 UTC" "2010-07-12 12:05:40 UTC"

$date[[4]]
[1] "2010-07-12 12:05:41 UTC" "2010-07-12 12:05:42 UTC"


$artifact.type
$artifact.type[[1]]
[1] "Mail" "Mail"

$artifact.type[[2]]
[1] "Mail" "Mail"

$artifact.type[[3]]
[1] "Mail" "Mail"

$artifact.type[[4]]
[1] "Mail" "Mail"


$message.id
$message.id[[1]]
[1] "<[email protected]>" "<[email protected]>"

$message.id[[2]]
[1] "<[email protected]>" "<[email protected]>"

$message.id[[3]]
[1] "<[email protected]>" "<[email protected]>"

$message.id[[4]]
[1] "<[email protected]>" "<[email protected]>"


$thread
$thread[[1]]
[1] "<thread-1>" "<thread-1>"

$thread[[2]]
[1] "<thread-2>" "<thread-2>"

$thread[[3]]
[1] "<thread-3>" "<thread-3>"

$thread[[4]]
[1] "<thread-4>" "<thread-4>"


$weight
[1] 2 2 2 2 1 1 1 1 1

$type
[1] "Unipartite" "Unipartite" "Unipartite" "Unipartite"

$relation
[1] "mail" "mail" "mail" "mail"

Browse[1]> ea
$weight
[1] 1 1 1 1 1

$artifact.type
[1] "Feature" "Feature" "Feature" "Feature" "Feature"

$type
[1] "Unipartite" "Unipartite" "Unipartite" "Unipartite" "Unipartite"

$relation
[1] "callgraph" "callgraph" "callgraph" "callgraph" "callgraph"

Browse[1]> a
[1] 2

Originally posted by @maxloeffler in #260 (comment)

@bockthom
Copy link
Collaborator Author

bockthom commented Oct 21, 2024

I've moved this to a new issue. I am actually not sure whether this is a problem on our side or on the side of igraph.

One and a half years ago, I spotted an issue in how igraph::disjoint_union handles types. Therefore, I implemented a workaround in coronet and reported the bug to igraph and requested a fix from them. With yesterday's release, the igraph people have released a fix for the bug report that I have opened one and a half years ago. For more information about my workaround, see #236 and this commit in coronet. In this commit, I also have added a link to the bug report that I had submitted to igraph.

What we have to do on our side now: Check whether we need to adjust our code to be compatible with igraph's fix - or whether there is a problem in igraph's fix that we need to report to igraph again.

@bockthom
Copy link
Collaborator Author

bockthom commented Oct 23, 2024

I looked a little bit deeper into this issue:

exattr <- intersect(names(ea), names(attr)) means that exattr is c("weight", "artifact.type", "type", "relation"). Consequently, exattr[a] is "artifact.type". When we look at attr and ea, we can see that the artifact.type in the first one is a list, while in the second one, it is a vector:

$artifact.type
$artifact.type[[1]]
[1] "Mail" "Mail"

$artifact.type[[2]]
[1] "Mail" "Mail"

$artifact.type[[3]]
[1] "Mail" "Mail"

$artifact.type[[4]]
[1] "Mail" "Mail"

vs.

$artifact.type
[1] "Feature" "Feature" "Feature" "Feature" "Feature"

The question now is: Why do we have different data structures for the artifact type for the different networks? The first one (i.e., the list) originates from the author network, while the second one (i.e., the character vector) originates from the artifact network.

@maxloeffler Could you please investigate why we have different data structures for artifact.type in author networks and artifact networks? Thanks! It could also be that this is independent of the network type, but just related to how the edges and their attributes are constructed.

@bockthom bockthom added this to the v4.5 milestone Oct 23, 2024
@bockthom
Copy link
Collaborator Author

bockthom commented Oct 23, 2024

@maxloeffler I forgot about mentioning one additional check in our meeting:

Could you please also check what happens in our test suite when you comment out the changes of this commit: a953555 With the new igraph version 2.1.1, the workaround introduced in that commit should not be necessary any more (at least, I'd hope so). However, even if it is not necessary any more, let's keep the workaround for a while not to abruptly cease coronet support of previous igraph versions.

@maxloeffler
Copy link
Contributor

Reverting the mentioned commit leads to an additional 159 tests failing, so I assume it is still necessary. Also, I realized that in order to patch around in igraphs code of disjoint_union I would need to recompile igraph from source, I'll postpone this check therefore and first evaluate the edge attributes in coronet as discussed.

@bockthom
Copy link
Collaborator Author

Reverting the mentioned commit leads to an additional 159 tests failing

Damn. Did you run it with the edge-attribute-handling fix that solves the list vs. character vector problem? If so, that many tests failing is in stark contrast to my expectations, because the fix of igraph should make the mentioned commit needless.

Could you please dive deeper into what's the problem in the 159 failing tests? It would be good to know whether there are similar list vs. vector situations that we did not come across yet, or whether something else is completely broken. Thanks!

@bockthom
Copy link
Collaborator Author

Also, I realized that in order to patch around in igraphs code of disjoint_union I would need to recompile igraph from source

It depends. If it is just R code in a function that is directly executed by us, you could try to just redefine the entire function and call your function instead of the one from the package. If it is nested within some other functions or interacts with some non-public functions, than you might be right that recompiling could be necessary. But lets first deal with the edge attributes before moving on to new problems...

@maxloeffler
Copy link
Contributor

Okay, seems like the 159 was fake news. I must have had some other change effect the outcome without my knowledge. Reverting the patch of a953555 does only raise one additional error.

@maxloeffler
Copy link
Contributor

Regarding the edge attributes sometimes being of list type and other times being of character type, I found that calling get.multi.network (where igraph::disjoint_union is being called in) may break due to every edge attribute that is not explicitly handled in EDGE.ATTR.COMB. The default behavior for edge attributes when simplifying is concatenating all values of edges that are being merged into a list. However, if there is just one entry the entry will remain of type character. The only solution I see for this problem is to precautiously convert every edge attribute to list by default.

@bockthom
Copy link
Collaborator Author

I'm afraid you are right. This would involve numerous tests to be changed and checks to be adjusted. But there is one part I am not sure about:

The default behavior for edge attributes when simplifying is concatenating all values of edges that are being merged into a list.

Really? This only would happen if the attributes are vectors of multiple length already beforehand. If there is only one value, then it should be combined into a vector instead of a list.

So, in general, we should check for each edge attribute (there is a limited number of them) whether this problem can occur at all. And if so, it might be necessary to wrap exactly these attributes into lists.

@maxloeffler
Copy link
Contributor

  1. Regarding the concatenating list/vector thing: I definitely do not have the extent of R knowledge that you have, but isn't that exactly what happened in the error I reported earlier? In total, eight edges, in pairs of two edges, each with artifact.type being Mail (singular), have been simplified by the concat edge.comb.attr, and the result was this list:
$artifact.type
$artifact.type[[1]]
[1] "Mail" "Mail"

$artifact.type[[2]]
[1] "Mail" "Mail"

$artifact.type[[3]]
[1] "Mail" "Mail"

$artifact.type[[4]]
[1] "Mail" "Mail"
  1. I tried to overwrite the igraph::disjoint_union function to test out what breaks when we apply the same fix they applied for the edge attributes also to the vertex attributes, but I was not able as the function makes use of internal igraph functions that are not exposed post-compile. Though compiling igraph manually should not be too much of a hassle, I believe.

@maxloeffler
Copy link
Contributor

maxloeffler commented Nov 4, 2024

A quick update:

I dug into the initial setting of edge attributes for all networks and found that it can vary how attributes are obtained and assinged. Therefore, I decided a good point to convert attributes to list is at the end of the get.author.network, get.artifact.network, etc., methods. I added my preliminary code there. It checks for each present edge attribute whether it is explicitly handled by EDGE.ATTR.HANDLING or not, and if not, then it reassigns itself as.list(...). All previous errors from igraph::disjoint_union are gone.

Here is my preliminary code for reference:

edges = igraph::as_data_frame(network, "edges")
for (attr in igraph::edge_attr_names(network)) {
  if (!attr %in% names(EDGE.ATTR.HANDLING)) {
      network = igraph::set_edge_attr(network, attr, value = as.list(edges[[attr]]))
  }
}

From these changes, around 140 tests are now failing. Half of them fail because they expect an edge attribute to be of type chr, num or whatever but receive instead a list (which still holds the correct values inside, just not in the correct format). These tests just each need a quick change in the expected values. The other half fail because somewhere in splitting by timestamps the coronet implementation now breaks.

The reason for this is caused by a slightly different behavior of vectors compared to lists in R, especially in combination with unlist and POSIXct values. Previously, the date edge attribute was formatted as a vector similar to all the attributes that are explicitly mentioned in EDGE.ATTR.HANDLING and are therefore not affected by my preliminary code. However, now they are lists.
Unfortunately, we cannot simply exclude the date attribute from the conversion. I made a few tests and found that igraph::disjoint_union may still break in some cases when we do not convert the date attribute, in concrete, when one network contains simplified edges containing multiple date values per edge and the other one doesn't.
This lets me conclude that the coronet implementation needs a few adjustments to work correctly with list dates. I suspect these to be of minor kind, e.g., removing some unlists or replacing some of the date conversion functions by others, but I cannot be certain yet.

Let me know when you have further input or questions

@maxloeffler
Copy link
Contributor

I started fixing the implementation to work with edge attributes all being in list format and stumbled upon yet another problem.

But first of all just a note: Our networks are now not consistent anymore with igraph networks. That means in all tests where we build a network and compare it against a network created via igraph, we need to call the helper function I introduced to 'listify' the edge attribute first. I don't know whether this is a problem to you, at least it allows for easy fixes to all these tests.

Here is the problem I ran into: Lets say we want to add bipartite edges to a network via add.edges.for.bipartite.relation. Here edges are constructed from data which is passed as a parameter and then these are appended with the igraph::add_edges function. Now we have to distinguish between two cases, 1) the network we pass is empty 2) it already has edges.
In case 1) The edges we have to pass to igraph::add_edges cannot already be 'listified' as then it complaines that the types expected for the attributes do not match, e.g., POSIXct expected list() received.
In case 2) the edges we add must be 'listified' already as otherwise we get strange conversion problems. This for example is the result of adding three non-listified edges to a network that already has two regular edges.

  .. ..$ date         :List of 5
  .. .. ..$ : POSIXct[1:1], format: "2016-07-12 15:58:59"
  .. .. ..$ : POSIXct[1:1], format: "2016-07-12 16:00:45"
  .. .. ..$ : num 1.47e+09
  .. .. ..$ : num 1.47e+09
  .. .. ..$ : num 1.47e+09

When listifying first, this problem dissolves.

Either we distinguish in code if the network already has edges and if yes only then 'listify', which in my opinion yields bad code or we remove (or adapt) the types of edge attributes for newly created empty networks. Instead of the date attribute being POSIXct it would have to be list or list(vector(POSIXct)) if it possible to specify types in such a detail in R.

@bockthom
Copy link
Collaborator Author

bockthom commented Nov 12, 2024

But first of all just a note: Our networks are now not consistent anymore with igraph networks. That means in all tests where we build a network and compare it against a network created via igraph, we need to call the helper function I introduced to 'listify' the edge attribute first. I don't know whether this is a problem to you, at least it allows for easy fixes to all these tests.

Hm, I am not really in favor of this solution. But it seems that it is, unfortunately, quite difficult to fix the tests without helper function. And just creating a new helper function for the tests does also not make sense. So, we need to make sure that the helper function itself is properly tested. If this is the case, I'd agree to use the helper function within the tests, but only with gritted teeth.

[...] Either we distinguish in code if the network already has edges and if yes only then 'listify', which in my opinion yields bad code or we remove (or adapt) the types of edge attributes for newly created empty networks. Instead of the date attribute being POSIXct it would have to be list or list(vector(POSIXct)) if it possible to specify types in such a detail in R.

Removing the types of edge attributes would be a step backwards - because then everything would be of type list(). Within a list, you can store elements of any type, there is no restriction possible (as far as I know). That is, we would loose all expected type information.

On the other hand, when we implement the discussed changes, we make every edge attribute (except for those handled separately by the existing constant) to be of type list anyway––so the type for newly created empty networks would already be wrong if everything should be a list (which then needs to be the case also for empty networks right after network creation).

In sum, no matter which solution we choose, we have to give up one of our principles: either we lose concrete types, or we lose type consistency. At the moment, it sounds to be least problematic to adjust case 1) such that empty networks have already listified edge attributes. This would involve that we would omit type information upon network construction, in general (except for those handled separately by the existing constant). In such a case, we would need to make sure that we don't run into any other problems with POSIXct conversions when the expected type is just list()...

@maxloeffler
Copy link
Contributor

Im kinda sorry to only come with bad news week after week but upon fixing the rest of the issues I fell over a bug that is both grave and hard to fix on our side without hacky code. Let me illustrate the issue with some quick quotes and a bit code.

This is from the igraph docs on disjoint_union:

For graphs that lack some vertex/edge attribute, the corresponding values in the new graph are set to NA.

Now as a quick recap, lets see how igraph does implement the edge attribute merging in version 2.1. Most importantly to notice is that they now use the vctrs package both for concatenating attributes together as well as for defining the value of args that are not present:

attr <- list()
  ec <- sapply(graphs, ecount)
  cumec <- c(0, cumsum(ec))
  for (i in seq_along(graphs)) {
    ea <- edge.attributes(graphs[[i]])
    exattr <- intersect(names(ea), names(attr)) # existing and present
    noattr <- setdiff(names(attr), names(ea)) # existint and missing
    newattr <- setdiff(names(ea), names(attr)) # new
    for (a in seq_along(exattr)) {
      attr[[exattr[a]]] <- vctrs::vec_c(attr[[exattr[a]]], ea[[exattr[a]]])
    }
    for (a in seq_along(noattr)) {
      attr[[noattr[a]]] <- vctrs::vec_c(attr[[noattr[a]]], vctrs::unspecified(ec[[i]]))
    }
    for (a in seq_along(newattr)) {
      attr[[newattr[a]]] <- vctrs::vec_c(vctrs::unspecified(cumec[[i]]), ea[[newattr[a]]])
    }
  }
  edge.attributes(res) <- attr

Finally, lets see how vctrs really behaves:

Browse[1]> str(vctrs::unspecified(1))
 'vctrs_unspecified' logi NA
 
 Browse[1]> str(x)
List of 4
 $ : chr "<thread-13#8>"
 $ : chr "<thread-13#8>"
 $ : chr "<thread-13#9>"
 $ : chr "<thread-13#9>"

Browse[1]> str(vctrs::vec_c(x, vctrs::unspecified(1)))
List of 5
 $ : chr "<thread-13#8>"
 $ : chr "<thread-13#8>"
 $ : chr "<thread-13#9>"
 $ : chr "<thread-13#9>"
 $ : NULL

For what reason soever we get a NULL here instead of an NA. Of course this messes with all tests that expect NAs as well makes the output of the networks hugely inconsistent with some values being NAs and some being NULLs. The docs of vctrs vec_c also have no explanation for this.

This time I see good option to fix this. Either, we post-convert NULLs to NAs but this will effectively remove the possibility to have NULLs on purpose as well as being yet another layer of processing, or we replace every NA we have in the tests and the implementation (regarding edge attributes) by NULL as a new way of representing an empty attribute value.

@bockthom
Copy link
Collaborator Author

Im kinda sorry to only come with bad news week after week

You don't need to be sorry for that. Everything is just the way it is.

This is from the igraph docs on disjoint_union:

For graphs that lack some vertex/edge attribute, the corresponding values in the new graph are set to NA.

[...]

Browse[1]> str(vctrs::vec_c(x, vctrs::unspecified(1)))
List of 5
 $ : chr "<thread-13#8>"
 $ : chr "<thread-13#8>"
 $ : chr "<thread-13#9>"
 $ : chr "<thread-13#9>"
 $ : NULL

For what reason soever we get a NULL here instead of an NA. Of course this messes with all tests that expect NAs as well makes the output of the networks hugely inconsistent with some values being NAs and some being NULLs. The docs of vctrs vec_c also have no explanation for this.

If this is neither in line with the docs of igraph, nor with the docs of vctrs, this may be something that we should report. The question is: Where should we report it? Maybe to both? Or just to igraph to let them decide on how to deal with that?
We should also find out whether this is something that has changed since they started to use vctrs or whether this was also inconsistent to their docs before. If we report those things, I would appreciate if you could provide me with some minimum example that I could use for reporting the bug/inconsistency.

And we should have an argument why we exactly need this behavior, to convince them that this something that should be fixed.

But when we report this, we should also find out prior to reporting what happens with the vertex attributes in coronet when they would use vctrs also for vertex attributes - to be able to mention both problems in one issue.

This time I see good option to fix this. Either, we post-convert NULLs to NAs but this will effectively remove the possibility to have NULLs on purpose as well as being yet another layer of processing,

No, I don't want to treat NA and NULL identically.

or we replace every NA we have in the tests and the implementation (regarding edge attributes) by NULL as a new way of representing an empty attribute value.

I also don't like this solution.

If this is really a problem of igraph or vctrs, we should ask them to fix these problems on their side.
However, we would need to think about a temporary fix on our side just in case they need quite some time to fix it on their sides...

@maxloeffler
Copy link
Contributor

It is indeed not addressed in the docs of vctrs but in may be consistent with the rest of their code which it is definitely not in igraph. Here is a minimal working example:

# Create network with a duplicate edge.
# Edges have the attr.one edge-attribute.
network.one = igraph::make_empty_graph() +
     igraph::vertices("A", "B") +
     igraph::edges(c("A", "B", "A", "B"), attr.one = "test")

# Simplify both edges into one.
# Use the "concat" strategy for the attr.one edge-attribute.
network.one = igraph::simplify(network.one, edge.attr.comb = list(attr.one = "concat"))

# Create second network without the attr.one edge-attribute.
network.two = igraph::make_empty_graph() +
     igraph::vertices("C", "D") +
     igraph::edges(c("C", "D"), attr.two = "test")

# Join both networks.
union = igraph::disjoint_union(network.one, network.two)

After running this code you can observe how the non-existence of attr.one in edges of network.two is expressed as a NULL instead of an NA, while the non-existence of attr.two in edges of network.one is expressed as an NA.

Browse[1]> str(igraph::as_data_frame(union,"edges"))
'data.frame':   2 obs. of  4 variables:
 $ from    : chr  "A" "C"
 $ to      : chr  "B" "D"
 $ attr.one:List of 2
  ..$ : chr  "test" "test"
  ..$ : NULL
 $ attr.two: chr  NA "test"

This problem is really niche so we have to get a bit more creative if it does not suffice to say that it is not consistent with their documentation. Maybe you know of some statistic evaluation where they use is.na to filter out non-present values (is.na(NULL) != is.na(NA))? I couldn't really find anything directly.

I will now check what happens with the vertex attributes.

@maxloeffler
Copy link
Contributor

Using vctrs for the vertex-attributes as well does not have any negative impact on our codebase. I changed the implementation, then re-compiled, then verified that the correct code is being loaded and did not observe any tests that break from it.

maxloeffler added a commit to maxloeffler/coronet that referenced this issue Dec 3, 2024
Since igraph version 2.1, when joining networks using
'igraph::disjoint_union', edge attributes of the joining networks
require identical types. As simplifiying networks necessarily converts
types of edge attributes to list when merging edges, attributes now have
to be of type list by default.

Edge attributes that are explicitly considered during simplification
and, therefore, are not converted to lists are excluded from this rule.

This works towards fixing se-sic#271.

Signed-off-by: Maximilian Löffler <[email protected]>
maxloeffler added a commit to maxloeffler/coronet that referenced this issue Dec 3, 2024
Adjust the tests in accordance to converting edge attributes to list
type in the implementation.

This works towards fixing se-sic#271.

Signed-off-by: Maximilian Löffler <[email protected]>
maxloeffler added a commit to maxloeffler/coronet that referenced this issue Dec 3, 2024
'plyr::rbind.fill' uses NULL to fill missing values in lists. As we now
use lists for most edge attributes, we need to handle this case
separately to ensure missing values are filled with NAs instead.

To fix this issue, we need to instantiate missing columns in dataframes
with NAs before calling 'plyr::rbind.fill'. This operation is constant
with respect to the amount of rows and should not impact performance too
much.

This works towards fixing se-sic#271.

Signed-off-by: Maximilian Löffler <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants