Fork actions must not allocate #593

Merged Jul 28, 2023 (2 commits)

talex5 (Collaborator) commented Jul 27, 2023

The `execve` action allocated the arrays in the forked child process. In a multi-threaded program, however, the fork may happen while another thread holds the malloc lock. The child inherits the locked mutex but not the thread that would unlock it, so its first allocation blocks forever. For example, here the child is stuck in `calloc`, waiting on the `main_arena` lock:

    #0  futex_wait (private=0, expected=2, futex_word=0xffff9509cb10 <main_arena>) at ../sysdeps/nptl/futex-internal.h:146
    #1  __GI___lll_lock_wait_private (futex=futex@entry=0xffff9509cb10 <main_arena>) at ./nptl/lowlevellock.c:34
    #2  0x0000ffff94f8e780 in __libc_calloc (n=<optimized out>, elem_size=<optimized out>) at ./malloc/malloc.c:3650
    #3  0x0000aaaac67cfa68 in make_string_array (errors=errors@entry=37, v_array=281472912006504) at fork_action.c:47
    #4  0x0000aaaac67cfaf4 in action_execve (errors=37, v_config=281472912003024) at fork_action.c:61
    #5  0x0000aaaac67cf93c in eio_unix_run_fork_actions (errors=errors@entry=37, v_actions=281472912002960) at fork_action.c:19
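
To illustrate the general pattern, here is a minimal standalone sketch that builds the argument array in the parent, before `fork`, so the child never touches the allocator. It is not Eio's code: `copy_string_array`, `free_string_array`, and `spawn_preallocated` are hypothetical names invented for this example, and it assumes a POSIX system where `args[0]` is an absolute path.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: free a NULL-terminated array of strings. */
static void free_string_array(char **arr) {
  if (arr == NULL) return;
  for (char **p = arr; *p != NULL; p++) free(*p);
  free(arr);
}

/* Hypothetical helper: copy n strings into a NULL-terminated array.
   This allocates, so it must only be called in the parent. */
static char **copy_string_array(const char *const *src, size_t n) {
  char **arr = calloc(n + 1, sizeof(char *)); /* arr[n] stays NULL */
  if (arr == NULL) return NULL;
  for (size_t i = 0; i < n; i++) {
    arr[i] = strdup(src[i]);
    if (arr[i] == NULL) {
      free_string_array(arr); /* frees the entries copied so far */
      return NULL;
    }
  }
  return arr;
}

/* Hypothetical example: run args[0] with the given arguments.
   Returns the child's exit status, or -1 on error. */
static int spawn_preallocated(const char *const *args, size_t n) {
  assert(n > 0);
  char **argv = copy_string_array(args, n); /* allocate BEFORE fork */
  if (argv == NULL) return -1;

  int status = 0, reaped = 0;
  pid_t pid = fork();
  if (pid == 0) {
    /* Child: no malloc, calloc, or stdio here; only async-signal-safe
       calls, because another thread may have held the malloc lock. */
    execv(argv[0], argv);
    _exit(127); /* only reached if exec failed */
  }
  if (pid > 0) {
    pid_t r;
    do {
      r = waitpid(pid, &status, 0);
    } while (r < 0 && errno == EINTR);
    reaped = (r == pid);
  }
  free_string_array(argv); /* parent owns and releases the array */
  return (reaped && WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
}
```

Calling `spawn_preallocated` with the three arguments `"/bin/sh"`, `"-c"`, `"exit 7"` returns 7. The design point is simply that everything the child needs already exists in memory at the moment of the fork, so the child only has to call `execv` or `_exit`.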

talex5 added the `bug` label Jul 27, 2023
talex5 added a commit to talex5/solver-service that referenced this pull request Jul 27, 2023
talex5 added a commit to talex5/solver-service that referenced this pull request Jul 27, 2023
talex5 added a commit to talex5/solver-service that referenced this pull request Jul 28, 2023
talex5 merged commit 47f4d20 into ocaml-multicore:main Jul 28, 2023
4 checks passed
talex5 added a commit to talex5/opam-repository that referenced this pull request Aug 29, 2023
CHANGES:

New features / API changes:

- Replace objects with variants (@talex5 @patricoferris ocaml-multicore/eio#553 ocaml-multicore/eio#605 ocaml-multicore/eio#608, reviewed by @avsm).
  Some potential users found object types confusing, so we now use an alternative scheme for OS resources.
  For users of the resources, the only thing that changes is the types:

  - Instead of taking an argument of type `#foo`, you should now take `_ foo`.
  - Instead of returning a value of type `foo`, you should now return `foo_ty Eio.Resource.t`.

  To provide your own implementation of an interface, you now provide a module rather than an object.
  For example, to provide your own source flow, use `Eio.Flow.Pi.source (module My_source)`.

  If you want to define your own interfaces, see the `Eio.Resource` module documentation.

- Add `Eio.Pool` (@talex5 @darrenldl ocaml-multicore/eio#602, reviewed by @patricoferris).
  A lock-free pool of resources. This is similar to `Lwt_pool`.

- Add `Eio.Lazy` (@talex5 ocaml-multicore/eio#609, reviewed by @SGrondin).
  If one fiber tries to force a lazy value while another is already doing it,
  this will wait for the first one to finish rather than raising an exception (as `Stdlib.Lazy` does).

- Add `Eio.Path.native` (@talex5 ocaml-multicore/eio#603, reviewed by @patricoferris).
  This is useful when interacting with non-Eio libraries, for spawning sub-processes, and for displaying paths to users.

- Add `Flow.single_write` (@talex5 ocaml-multicore/eio#598).

- Add `Eio.Flow.Pi.simple_copy` (@talex5 ocaml-multicore/eio#611).
  Provides an easy way to implement the `copy` operation when making your own sink.

- Eio_unix: add FD passing (@talex5 ocaml-multicore/eio#522).
  Allows opening a file and passing the handle over a Unix-domain socket.

- Add `Process.run ?is_success` to control definition of success (@SGrondin ocaml-multicore/eio#586, reviewed by @talex5).

- Add `Eio_mock.Domain_manager` (@talex5 ocaml-multicore/eio#610).
  This mock domain manager runs everything in a single domain, allowing tests to remain deterministic.

- Add `Eio.Debug.with_trace_prefix` (@talex5 ocaml-multicore/eio#610).
  Allows prefixing all `traceln` output. The mock domain manager uses this to indicate which fake domain is running.

Bug fixes:

- Fork actions must not allocate (@talex5 ocaml-multicore/eio#593).
  When using multiple domains, child processes could get stuck if they forked while another domain held the malloc lock.

- eio_posix: ignore some errors writing to the wake-up pipe (@talex5 ocaml-multicore/eio#600).
  If the pipe is full or closed, the wake-up should simply be ignored.

Build/test fixes:

- Fix some MDX problems on Windows (@polytypic ocaml-multicore/eio#597).

- The README depends on kcas (@talex5 ocaml-multicore/eio#606).

- Clarify configuration for lib_eio_linux and enable tests on other arches (@dra27 ocaml-multicore/eio#592).

- eio_linux tests: skip fixed buffer test if not available (@talex5 ocaml-multicore/eio#604).

- eio_windows: update available line to win32 (@talex5 ocaml-multicore/eio#588 ocaml-multicore/eio#591).
talex5 added a commit to talex5/opam-repository that referenced this pull request Aug 29, 2023
nberth pushed a commit to nberth/opam-repository that referenced this pull request Jun 18, 2024
talex5 deleted the fork-alloc branch November 11, 2024 17:40