Limit scope of libc usage in note-c. #126

haydenroche5 · 2023-12-11T23:00:57Z

See commit messages for more details.

zfields · 2023-12-12T16:18:23Z

n_cjson.c

+ // Greater than z -> not a valid letter.
+ // Between a and Z -> not a valid letter.
+ // Between a and z (inclusive) -> already lower.
+ if (c < 'A' || c > 'z' || (c < 'a' && c > 'Z') || (c >= 'a' && c <= 'z')) {


This seems overly complicated.

It becomes more obvious if you swap in numbers.

Return the letter if:
x < 65 OR x > 122 OR (x < 97 AND x > 90) OR (x >= 97 AND x <= 122)

-128 <= x < 65,
x > 122,
90 < x < 97,
97 <= x <= 122

-128 <------ 65 ______ 91 --97--122--> 127

Suggested change

- if (c < 'A' || c > 'z' || (c < 'a' && c > 'Z') || (c >= 'a' && c <= 'z')) {
+ if (c < 'A' || c > 'Z') {

Am I missing something?


zfields · 2023-12-12T16:19:50Z

scripts/check_libc_dependencies.sh

+ "memset"
+ # These string functions are ok.
+ "strchr"
+ "strcmp"


This seems like a good opportunity to not allow these unsafe functions, especially when the safer version is listed below (strncmp).

Obviously there may be a good reason, but if we can get away from it we probably should.

FWIW, I don't think this should prevent this commit from happening. If it's a chip shot then get it, but otherwise we should add it to the backlog - especially now that there is a mechanism for protecting against this.



haydenroche5 · 2023-12-12T17:10:39Z

n_cjson.c

+ // Greater than z -> not a valid letter.
+ // Between a and Z -> not a valid letter.
+ // Between a and z (inclusive) -> already lower.
+ if (c < 'A' || c > 'z' || (c < 'a' && c > 'Z') || (c >= 'a' && c <= 'z')) {


Nah I think your change works.
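
For reference, the equivalence of the two conditions can be checked exhaustively. The following is a minimal sketch, assuming ASCII and a signed 8-bit input range. `sketch_tolower` and the other names here are hypothetical stand-ins for illustration, not the actual Jtolower implementation from this PR.

```c
#include <assert.h>
#include <stdbool.h>

/* Original condition from the diff: true when c is returned unchanged. */
bool unchanged_original(int c)
{
    return c < 'A' || c > 'z' || (c < 'a' && c > 'Z') || (c >= 'a' && c <= 'z');
}

/* Simplified condition from the suggested change. */
bool unchanged_simplified(int c)
{
    return c < 'A' || c > 'Z';
}

/* Hypothetical ASCII-only lowercase helper built on the simplified check. */
char sketch_tolower(char c)
{
    if (c < 'A' || c > 'Z') {
        return c; /* not an uppercase letter, nothing to convert */
    }
    return (char)(c + ('a' - 'A'));
}

/* Self-check: both conditions agree for every value in the signed 8-bit range. */
void check_equivalence(void)
{
    for (int c = -128; c <= 127; ++c) {
        assert(unchanged_original(c) == unchanged_simplified(c));
    }
}
```

The last three clauses of the original condition together cover everything above 'Z', which is why the whole expression collapses to "below 'A' or above 'Z'".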

haydenroche5 · 2023-12-12T17:14:20Z

scripts/check_libc_dependencies.sh

+ "memset"
+ # These string functions are ok.
+ "strchr"
+ "strcmp"


I'd have to re-immerse myself in the discourse on this age-old argument, but my rough understanding is that you're not that much better off using strncmp over strcmp: https://stackoverflow.com/questions/24353504/whats-wrong-with-strcmp

If your string isn't null-terminated, all bets are off.

After some discussion with Ray, we settled on certain libc functions that we're ok using in note-c. Everything else from libc should be excluded. This commit adds a CMake option to build the note-c library without libc. It also tells the linker to generate errors for undefined references. The result is that we can run this build to see what libc functions we're using in note-c. A new Bash script, check_libc_dependencies.sh, processes the output of this build. If any non-permitted functions are found in the undefined reference errors, the script fails. This protects us against the introduction of non-permitted libc functions. A new job in ci.yml runs this script on every PR and push to master.

zfields · 2023-12-12T19:25:26Z

scripts/check_libc_dependencies.sh

+ "memset"
+ # These string functions are ok.
+ "strchr"
+ "strcmp"


To me, the two most compelling arguments from the post you linked are:

FOR:

Reading, writing, executing... it doesn't matter. Any memory reference to an unintended address is undefined behavior. In the most apparent scenario, you attempt to access a page that isn't mapped into your process's address space, causing a page fault, and subsequent SIGSEGV. In the worst case, you sometimes run into a \0 byte, but othertimes you run into some other buffer, causing inconstant program behavior. – Jonathon Reinhart

AGAINST:

And if you're trying to call strcmp or strncmp with an array that you thought contained a null-terminated string but actually doesn't, then your code already has a bug. Using strncmp() might help you avoid the immediate symptom of that bug, but it won't fix it. - Keith Thompson

To my way of thinking, strncmp() is more intentional than strcmp(), because it requires the programmer to consider/know the size of the string they want to test. Yes, bugs can exist with both, but the scope of the damage can be lessened with strncmp(). As a counter-point, it's usually easier to find a smoking gun when the offense is greater, but that damage is inflicted upon the user (so that's not a great experience for them).

All in all, I would go ahead and merge the PR today, but I think we would be better served by reducing our libc footprint and moving toward intentional coding.
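
To make the "intentional" point concrete, here is a minimal sketch of a bounded comparison, assuming the caller knows the size of the buffer being tested. `field_equals` and `field_equals_demo` are hypothetical names for illustration and are not part of note-c. The sketch also assumes `strlen` is acceptable for the trusted literal on the right-hand side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true when buf (buf_len bytes) holds exactly the
 * string `expected`, terminator included. Never reads past buf_len bytes. */
bool field_equals(const char *buf, size_t buf_len, const char *expected)
{
    size_t expected_len = strlen(expected); /* expected is a trusted literal */
    if (expected_len >= buf_len) {
        return false; /* no room in buf for the text plus its terminator */
    }
    /* Compare one byte more than the text so the terminator must match too. */
    return strncmp(buf, expected, expected_len + 1) == 0;
}

/* Usage: an unterminated buffer is rejected without reading out of bounds. */
void field_equals_demo(void)
{
    char unterminated[4] = {'o', 'k', 'a', 'y'};
    char terminated[4] = {'o', 'k', '\0', 'x'};

    assert(!field_equals(unterminated, sizeof unterminated, "okay"));
    assert(!field_equals(unterminated, sizeof unterminated, "ok"));
    assert(field_equals(terminated, sizeof terminated, "ok"));
}
```

As Keith Thompson's comment notes, the bound does not fix a missing terminator. It only limits how far the comparison can read when that bug exists.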

haydenroche5 requested a review from zfields December 11, 2023 23:00

haydenroche5 self-assigned this Dec 11, 2023

haydenroche5 force-pushed the libc_purge branch from 43d59bc to 42f86cb Compare December 11, 2023 23:01

Replace all instances of strcpy with strlcpy.

506b4c6

haydenroche5 force-pushed the libc_purge branch from 42f86cb to 5578119 Compare December 11, 2023 23:04

zfields reviewed Dec 12, 2023

haydenroche5 commented Dec 12, 2023

haydenroche5 added 2 commits December 12, 2023 09:21

Replace tolower with Jtolower (our own implementation).

81cb072

haydenroche5 force-pushed the libc_purge branch from 5578119 to ec5653b Compare December 12, 2023 17:21

haydenroche5 requested a review from zfields December 12, 2023 17:21

zfields approved these changes Dec 12, 2023

haydenroche5 merged commit 2d3f0b1 into blues:master Dec 12, 2023
10 checks passed
