Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bad conversion of NULL when passing integer vector #1564

Open
szhorvat opened this issue Oct 22, 2024 · 8 comments
Open

Bad conversion of NULL when passing integer vector #1564

szhorvat opened this issue Oct 22, 2024 · 8 comments
Labels
bug an unexpected problem or unintended behavior
Milestone

Comments

@szhorvat
Copy link
Member

szhorvat commented Oct 22, 2024

When passing an integer vector from R to C, the NULL value is converted to an empty vector.

This is the ultimate cause of the UBSan failure that CRAN caught,

> sample_motifs(g, 3)
vendor/cigraph/src/misc/motifs.c:719:12: runtime error: -nan is outside the range of representable values of type 'long'
    #0 0x7fdade541593 in igraph_motifs_randesu_estimate /data/gannet/ripley/R/packages/tests-clang-UBSAN/igraph/src/vendor/cigraph/src/misc/motifs.c:719:12
    #1 0x7fdadde7184f in R_igraph_motifs_randesu_estimate /data/gannet/ripley/R/packages/tests-clang-UBSAN/igraph/src/rinterface.c:8689:3
    #2 0x55ea33fe822a in R_doDotCall (/data/gannet/ripley/R/R-clang/bin/exec/R+0xf122a)
    #3 0x55ea34030024 in bcEval_loop eval.c
    #4 0x55ea3401f39b in bcEval eval.c
    #5 0x55ea3401eb24 in Rf_eval (/data/gannet/ripley/R/R-clang/bin/exec/R+0x127b24)
    #6 0x55ea34042141 in R_execClosure eval.c
    #7 0x55ea3404162b in applyClosure_core eval.c
    #8 0x55ea3401ef75 in Rf_eval (/data/gannet/ripley/R/R-clang/bin/exec/R+0x127f75)
    #9 0x55ea34076b00 in Rf_ReplIteration (/data/gannet/ripley/R/R-clang/bin/exec/R+0x17fb00)
    #10 0x55ea3407849e in run_Rmainloop (/data/gannet/ripley/R/R-clang/bin/exec/R+0x18149e)
    #11 0x55ea3407850a in Rf_mainloop (/data/gannet/ripley/R/R-clang/bin/exec/R+0x18150a)
    #12 0x55ea33f5e947 in main (/data/gannet/ripley/R/R-clang/bin/exec/R+0x67947)
    #13 0x7fdaeee2950f in __libc_start_call_main (/lib64/libc.so.6+0x2950f) (BuildId: 8257ee907646e9b057197533d1e4ac8ede7a9c5c)
    #14 0x7fdaeee295c8 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x295c8) (BuildId: 8257ee907646e9b057197533d1e4ac8ede7a9c5c)
    #15 0x55ea33f5e864 in _start (/data/gannet/ripley/R/R-clang/bin/exec/R+0x67864)

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior vendor/cigraph/src/misc/motifs.c:719:12 
[1] -9.223372e+18

It's not just an UBSan warning. sample_motifs() is essentially broken, and likely so are some other functions that take optional integer vectors.

In rinterface.c, I see this:

  if (!Rf_isNull(sample)) {
    R_SEXP_to_vector_int_copy(sample, &c_sample);
    IGRAPH_FINALLY(igraph_vector_int_destroy, &c_sample);
  } else {
    IGRAPH_R_CHECK(igraph_vector_int_init(&c_sample, 0));
    IGRAPH_FINALLY(igraph_vector_int_destroy, &c_sample);
  }

It should produce a NULL pointer instead of an empty vector.

This should be changed, but be aware that doing so may cause issues elsewhere, which will need to be fixed as well.

I don't see the null check in types-RC.yaml, so I assume it comes directly from Stimulus. Can you have a look, @Antonov548 ?

@szhorvat
Copy link
Member Author

Generally, for any OPTIONAL vector, NULL in R should be converted to a NULL pointer in C. If this is not appropriate for some function, we should either make changes in the C core (ping me) or special-case it in the R interface.

@szhorvat szhorvat added this to the 2.1.2 milestone Oct 22, 2024
@krlmlr
Copy link
Contributor

krlmlr commented Oct 30, 2024

Thanks for the analysis. I'm not sure I follow.

rigraph/src/rinterface.c

Lines 8681 to 8689 in a871789

if (!Rf_isNull(sample)) {
R_SEXP_to_vector_int_copy(sample, &c_sample);
IGRAPH_FINALLY(igraph_vector_int_destroy, &c_sample);
} else {
IGRAPH_R_CHECK(igraph_vector_int_init(&c_sample, 0));
IGRAPH_FINALLY(igraph_vector_int_destroy, &c_sample);
}
/* Call igraph */
IGRAPH_R_CHECK(igraph_motifs_randesu_estimate(&c_graph, &c_est, c_size, (Rf_isNull(cut_prob) ? 0 : &c_cut_prob), c_sample_size, (Rf_isNull(sample) ? 0 : &c_sample)));

In this code, line 8689 (the one shown in CRAN's backtrace) checks sample for being an R NULL, and passes 0 (a C NULL) in this case. Initialization of c_sample may be redundant, but perhaps harmless?

Can this be replicated with Docker and https://github.com/cynkra/docker-images/pkgs/container/docker-images%2Frigraph-san ?

@krlmlr
Copy link
Contributor

krlmlr commented Oct 30, 2024

Fortunately, the fix is easy.

@krlmlr krlmlr added the bug an unexpected problem or unintended behavior label Oct 30, 2024
@szhorvat
Copy link
Member Author

Fortunately, the fix is easy.

OK, I see. I didn't read the code carefully, just observed the behaviour.

@krlmlr
Copy link
Contributor

krlmlr commented Oct 31, 2024

The CI shows another unrelated error, hopefully similarly simple to fix.

https://github.com/igraph/rigraph/actions/runs/11601556838/job/32304898594#step:5:6515

@szhorvat
Copy link
Member Author

So what happened was that as.numeric(NULL) gives a zero-length vector, right? Which is precisely what we don't want here. I investigated this from the C core side, improved the error messages there, and when I saw zero-length vectors arriving, I assumed that R/igraph was intentionally passing those. Some C functions do interpret zero-length vectors as "nothing" but we're getting rid of that behaviour.

There are a lot of as.numeric() in the R glue code ... a lot of the in-conversion in type-RR.yaml just does as.numeric(). Can we / should we replace this with a function that keeps NULL as NULL?

In C, generally, the convention is that a vector parameter can be omitted by passing NULL for it. Not all parameters can be omitted. interfaces.yaml should indicate which one can using OPTIONAL. For some functions, it still doesn't use OPTIONAL, but specifies NULL as a default value.

When a parameter cannot be omitted, but the user still passes NULL in C, the result is a crash. So blindly passing NULL from R even for non-OPTIONAL parameters is not okay.

This should be reviewed carefully, the glue code should be adapted. What I can do on the C side is to make sure that OPTIONAL is always used when appropriate.

@szhorvat
Copy link
Member Author

@ntamas What is the proper way to specify an optional vector parameter in functions.yaml? Should it be OPTIONAL SOMETHING param=NULL or is OPTIONAL SOMETHING param enough? It would make sense to me that =NULL shouldn't be necessary when OPTIONAL is already given.

@szhorvat
Copy link
Member Author

szhorvat commented Oct 31, 2024

See igraph/igraph#2690 and #1567

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug an unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants