Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: bump to Pyodide 3.12 #938

Merged
merged 8 commits into from
Jun 8, 2024
Merged

ci: bump to Pyodide 3.12 #938

merged 8 commits into from
Jun 8, 2024

Conversation

henryiii
Copy link
Member

@henryiii henryiii commented May 28, 2024

I expect this to fail. (Okay, not that early!)

Signed-off-by: Henry Schreiner <[email protected]>
@github-actions github-actions bot added the needs changelog Might need a changelog entry label May 28, 2024
@hoodmane
Copy link

hoodmane commented Jun 1, 2024

Improvement:

Pyodide has suffered a fatal error. Please report this to the Pyodide maintainers.
The cause of the fatal error was:
CppException pybind11::stop_iteration: The exception is an object of type pybind11::stop_iteration at address 65499896 which does not inherit from std::exception
    at convertCppException (/home/rchatham/Documents/programming/boost-histogram/.pyodide-xbuildenv-0.26.0/0.26.0/xbuildenv/pyodide-root/dist/pyodide.asm.js:10:49457)
    at API.fatal_error (/home/rchatham/Documents/programming/boost-histogram/.pyodide-xbuildenv-0.26.0/0.26.0/xbuildenv/pyodide-root/dist/pyodide.asm.js:10:49750)
    at main (/home/rchatham/Documents/programming/boost-histogram/.pyodide-xbuildenv-0.26.0/0.26.0/xbuildenv/pyodide-root/dist/python:180:17) {
  ty: 'pybind11::stop_iteration',
  pyodide_fatal_error: true
}

@hoodmane
Copy link

hoodmane commented Jun 1, 2024

Oh well the problem I'm having is that I am building without -fexceptions. But also it was not correctly handling the fatal.

@henryiii
Copy link
Member Author

henryiii commented Jun 1, 2024

@hoodmane
Copy link

hoodmane commented Jun 1, 2024

This is a bit of a side track. But it decides that it does not inherit from std::exception because the following code sets message to NULL when given the exception
https://github.com/emscripten-core/emscripten/blob/main/system/lib/libcxxabi/src/cxa_exception_js_utils.cpp#L57

void __get_exception_message(void* thrown_object, char** type, char** message) {
  __cxa_exception* exception_header =
    cxa_exception_from_thrown_object(thrown_object);
  const __shim_type_info* thrown_type =
    static_cast<const __shim_type_info*>(exception_header->exceptionType);
  const char* type_name = thrown_type->name();

  int status = 0;
  char* demangled_buf = __cxa_demangle(type_name, 0, 0, &status);
  if (status == 0 && demangled_buf) {
    *type = demangled_buf;
  } else {
    if (demangled_buf) {
      free(demangled_buf);
    }
    *type = (char*)malloc(strlen(type_name) + 1);
    strcpy(*type, type_name);
  }

  *message = NULL;
  const __shim_type_info* catch_type =
    static_cast<const __shim_type_info*>(&typeid(std::exception));
  int can_catch = catch_type->can_catch(thrown_type, thrown_object);
  if (can_catch) {
    const char* what =
      static_cast<const std::exception*>(thrown_object)->what();
    *message = (char*)malloc(strlen(what) + 1);
    strcpy(*message, what);
  }
}

@hoodmane
Copy link

hoodmane commented Jun 1, 2024

But it sure looks like *message can only be NULL if

const __shim_type_info* catch_type =
    static_cast<const __shim_type_info*>(&typeid(std::exception));
int can_catch = catch_type->can_catch(thrown_type, thrown_object);

returns False. This seems to be asking if it would be caught by a catch(std::exception x)... Maybe the trouble is that can_catch misbehaves if the code that threw the exception was compiled without -fexceptions?

@hoodmane
Copy link

hoodmane commented Jun 1, 2024

Anyways now that I am building with -fexceptions I reproduce the error we see in CI:

Pyodide has suffered a fatal error. Please report this to the Pyodide maintainers.
The cause of the fatal error was:
TypeError: getWasmTableEntry(...) is not a function
    at invoke_viii (/home/rchatham/Documents/programming/boost-histogram/.pyodide-xbuildenv-0.26.0/0.26.0/xbuildenv/pyodide-root/dist/pyodide.asm.js:10:1225094)
    at wasm://wasm/00de1546:wasm-function[1873]:0x1e3b2b
    at invoke_vii (/home/rchatham/Documents/programming/boost-histogram/.pyodide-xbuildenv-0.26.0/0.26.0/xbuildenv/pyodide-root/dist/pyodide.asm.js:10:1224639)
    at wasm://wasm/00de1546:wasm-function[1872]:0x1e3792
    at invoke_ii (/home/rchatham/Documents/programming/boost-histogram/.pyodide-xbuildenv-0.26.0/0.26.0/xbuildenv/pyodide-root/dist/pyodide.asm.js:10:1223536)
    at wasm://wasm/00de1546:wasm-function[803]:0x14b50a
    at invoke_ii (/home/rchatham/Documents/programming/boost-histogram/.pyodide-xbuildenv-0.26.0/0.26.0/xbuildenv/pyodide-root/dist/pyodide.asm.js:10:1223536)
    at wasm://wasm/00de1546:wasm-function[298]:0xf638e
    at _PyEM_TrampolineCall_JS (/home/rchatham/Documents/programming/boost-histogram/.pyodide-xbuildenv-0.26.0/0.26.0/xbuildenv/pyodide-root/dist/pyodide.asm.js:10:125866)
    at wasm://wasm/0267d6e2:wasm-function[2178]:0x20af20 {
  pyodide_fatal_error: true
}

@hoodmane
Copy link

hoodmane commented Jun 1, 2024

Seems that invoke_viii is being called with a null function pointer for some reason.

@hoodmane
Copy link

hoodmane commented Jun 1, 2024

So the problem is that global.get $GOT.func.__cxa_throw is getting a null pointer...

@hoodmane
Copy link

hoodmane commented Jun 1, 2024

If I fill in GOT.__cxa_throw manually the rest of the test suite works:

const func = new WebAssembly.Function({parameters: ["i32", "i32", "i32"], results:[]}, Module.___cxa_throw);
const slot = Module.wasmTable.grow(1);
Module.wasmTable.set(slot, func);
GOT.__cxa_throw.value = slot;

@sbc100

@hoodmane
Copy link

hoodmane commented Jun 1, 2024

Actually, just calling Module.reportUndefinedSymbols() fixes the problem.

@hoodmane
Copy link

hoodmane commented Jun 1, 2024

I think this Emscripten patch should fix the dynamic linking problems in the cli runner. Probably also solves the reason that scipy-pytest.js is needed @lesteve

--- a/src/library_dylink.js
+++ b/src/library_dylink.js
@@ -1143,7 +1143,9 @@ var LibraryDylink = {
     }
 
     try {
-      return loadDynamicLibrary(filename, combinedFlags, localScope, handle)
+      const result = loadDynamicLibrary(filename, combinedFlags, localScope, handle);
+      reportUndefinedSymbols();
+      return result;
     } catch (e) {
 #if ASSERTIONS
       err(`Error in loading dynamic library ${filename}: ${e}`);

@lesteve
Copy link

lesteve commented Jun 1, 2024

Great to hear there is a potential fix for this! I was wondering indeed whether this was not going to be an issue to use cibuildwheel for scikit-learn. I guess this is the issue you have in mind: pyodide/pyodide#3865.

@hoodmane
Copy link

hoodmane commented Jun 1, 2024

Hmm darn that sounds different. Here symbol resolution succeeded but the code to lazily initialize the GOT failed. There it looks like it failed at symbol resolution. So there's several bugs it seems.

We're using this same mechanism for loading shared libs at cloudflare (it's necessary for performance) so there's a lot of reason to fix it...

@hoodmane
Copy link

hoodmane commented Jun 1, 2024

Minimal reproduction here is:

from boost_histogram._core import axis as ca
ax1 = ca.regular_uflow(1, 2, 3)
ax2 = ca.regular_none(1, 2, 3)
ax1 == ax2

Now I guess I'll have to start minimizing the C++ code =/
Presumably I can modify pybind11 in place too?

@henryiii
Copy link
Member Author

henryiii commented Jun 1, 2024

You can copy it locally and swap out the find_package for add_subdirectory.

@henryiii
Copy link
Member Author

henryiii commented Jun 1, 2024

The repl is pretty broken on iOS in 0.26 otherwise I’d check it with the in-tree one. Can do later on a computer.

@hoodmane
Copy link

hoodmane commented Jun 1, 2024

I think I'm making pretty good progress on minimizing. So far have the C++ code in src down to:

#include <bh_python/pybind11.hpp>

#include <bh_python/axis.hpp>
#include <bh_python/register_axis.hpp>

void register_axes(py::module& mod) {
    auto ax = register_axis<axis::regular_none>(mod);
    ax.def(py::init<unsigned, double, double>(), "bins"_a, "start"_a, "stop"_a);
    ax = register_axis<axis::regular_uflow>(mod);
    ax.def(py::init<unsigned, double, double>(), "bins"_a, "start"_a, "stop"_a);
}

trying to reduce register_axis.hpp right now.

@hoodmane
Copy link

hoodmane commented Jun 1, 2024

axis.hpp should directly include #include <boost/histogram/axis/regular.hpp>, it picks it up accidentally from regular_numpy.hpp

@hoodmane
Copy link

hoodmane commented Jun 1, 2024

Currently I have module.cpp:

#include <pybind11/pybind11.h>
#include <boost/histogram/axis/regular.hpp>
#include <boost/histogram/axis.hpp>

namespace py = pybind11;
namespace bh = boost::histogram;

using regular_none
    = bh::axis::regular<double, bh::use_default, int, bh::axis::option::none_t>;
using regular_uflow
    = bh::axis::regular<double, bh::use_default, double, bh::axis::option::none_t>;

PYBIND11_MODULE(_core, m) {
    py::module mod = m.def_submodule("axis");
    mod.def("do_the_thing", [] (const py::object& a) {
        py::cast<regular_none>(a);
        return 77;
    });
    py::class_<regular_uflow> ax(mod, "regular_uflow");
    ax.def(py::init<unsigned, double, double>());
}

and repro.py:

from boost_histogram._core import axis as ca
ca.do_the_thing(ca.regular_uflow(1, 2, 3))

Without the fix, it is a fatal error. With the fix, it raises:

RuntimeError: Unable to cast Python instance of type <class 'boost_histogram._core.axis.regular_uflow'> to C++ type 'boost::histogram::axis::regular<double, boost::use_default, int, boost::histogram::axis::option::bitset<0u>>'

@hoodmane
Copy link

hoodmane commented Jun 1, 2024

@henryiii looks like at this point I should be ready to remove boost/histogram/... entirely. I just need suitable replacements for regular_none and regular_uflow...

@hoodmane
Copy link

hoodmane commented Jun 1, 2024

Okay, no boost/histogram anymore:

#include <pybind11/pybind11.h>

template <class Value>
class regular {};

namespace py = pybind11;


PYBIND11_MODULE(_core, m) {
    py::module mod = m.def_submodule("axis");
    mod.def("do_the_thing", [] (const py::object& a) {
        py::cast<regular<double>>(a);
        return 77;
    });
    py::class_<regular<int>> ax(mod, "Thing");
    ax.def(py::init<>());
}

and:

from boost_histogram._core import axis as ca
print(ca.do_the_thing(ca.Thing()))

@hoodmane
Copy link

hoodmane commented Jun 1, 2024

If I can't reduce this any further, this already would be an acceptable test case. Still, would be nice to get rid of pybind11 and Python and get a C++ only reproducer...

@hoodmane
Copy link

hoodmane commented Jun 2, 2024

Actually, if you delete __attribute__((visibility("default"))) the problem goes away.

@henryiii
Copy link
Member Author

henryiii commented Jun 3, 2024

I've been using --exports whole_archive, are you using that? The default pyodide-build behavior IIRC picks out just the PyInit. Not sure why anything would be affected here, as default visibility shouldn't hide other symbols by itself?

@henryiii
Copy link
Member Author

henryiii commented Jun 3, 2024

FYI, this works fine with the in-tree build of boost-histogram. That is based on the old setuptools build system instead of CMake, but assuming this shouldn't matter much? (the out-of-tree build system worked fine until the latest pyodide)

@hoodmane
Copy link

hoodmane commented Jun 3, 2024

Well it's just the out of tree CLI runner. It uses a lazier dynamic linking strategy. Normal Pyodide does pessimistic eager loads which is easier to get right but is slow. It would be nice to use the lazier approach always. But for now, I took a different tradeoff so that python -c 'print("hi")' doesn't load a bunch of dsos that aren't needed and when dsos are loaded we don't preresolve GOT entries that aren't needed, etc. It really matters to make this work well in the long run.

@hoodmane
Copy link

hoodmane commented Jun 3, 2024

Anyways, the compile flags are:

-O3 -fPIC -fvisibility=hidden -fexceptions -g2

If we remove -fvisibility=hidden then __attribute__((visibility("default"))) stops mattering. The full diff between the dumped crashing object file and the noncrashing object file is just one byte:

-okay.o:	file format wasm 0x1
+crash.o:	file format wasm 0x1
 
 Section Details:
 
@@ -58,7 +58,7 @@ Data[2]:
 Custom:
  - name: "linking"
   - symbol table [count=26]
-   - 0: F <PyInit__core> func=18 [ binding=global vis=hidden ]
+   - 0: F <PyInit__core> func=18 [ binding=global vis=default ]
    - 1: D <__THREW__> [ undefined binding=global vis=default ]
    - 2: F <env.__cxa_allocate_exception> func=0 [ undefined binding=global vis=default ]
    - 3: G <env.__memory_base> global=0 [ undefined binding=global vis=default ]

@hoodmane
Copy link

hoodmane commented Jun 3, 2024

Right now I have:

build.sh:

set -x

em++ \
-isystem \
.pyodide-xbuildenv-0.26.0/0.26.0/xbuildenv/pyodide-root/cpython/installs/python-3.12.1/include/python3.12 \
-c src/module.cpp \
-o module.cpp.o \
-O3 -fPIC -fexceptions -g2 \
#-fvisibility=hidden 
# error happens with -fvisibility=hidden commented out but not if we pass it

em++ \
-o dist/boost_histogram-1.4.2.dev14+g54eb2dd.d20240603/boost_histogram/_core.cpython-312-wasm32-emscripten.so \
module.cpp.o \
-s WASM_BIGINT -s SIDE_MODULE=1 -fexceptions -g2 -O2 \

wheel pack dist/boost_histogram-1.4.2.dev14+g54eb2dd.d20240603 -d dist

and module.cpp:

#include "Python.h"
#include <stdexcept>

extern "C"
PyObject *PyInit__core() { 
    try { 
        throw std::runtime_error("");
    } catch (const std::exception &e) { 
        PyErr_SetString(PyExc_ImportError, "oops"); 
    } 
    return nullptr; 
} 

and repro.py:

print("=" * 20)
try:
    from boost_histogram._core import do_the_thing
except Exception:
    print("ok!")

and run.sh:

rm -rf .venv-pyodide
pyodide venv .venv-pyodide
.venv-pyodide/bin/pip install dist/boost_histogram-1.4.2*.whl
.venv-pyodide/bin/python repro.py

@hoodmane
Copy link

hoodmane commented Jun 3, 2024

Okay I'm being silly, of course when we don't export PyInit__core, we get:

ImportError('dynamic module does not define module export function (PyInit__core)')

instead of the desired crash but this isn't an interesting observation.

@hoodmane
Copy link

hoodmane commented Jun 4, 2024

Emscripten bug:
emscripten-core/emscripten#22052

hoodmane added a commit to hoodmane/pyodide that referenced this pull request Jun 4, 2024
After many hours of debugging, I minimized the problem down to this:
emscripten-core/emscripten#22052

Thanks to @henryiii for reporting. See thread here for my comments while I was
debugging:
scikit-hep/boost-histogram#938
hoodmane added a commit to pyodide/pyodide that referenced this pull request Jun 7, 2024
After many hours of debugging, I minimized the problem down to this: emscripten-core/emscripten#22052

Thanks to @henryiii for reporting. See thread here for my comments while I was debugging:
scikit-hep/boost-histogram#938

Resolves #2964.
hoodmane added a commit to hoodmane/pyodide that referenced this pull request Jun 7, 2024
After many hours of debugging, I minimized the problem down to this: emscripten-core/emscripten#22052

Thanks to @henryiii for reporting. See thread here for my comments while I was debugging:
scikit-hep/boost-histogram#938

Resolves pyodide#2964.
@henryiii henryiii marked this pull request as ready for review June 8, 2024 03:34
pyproject.toml Outdated Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
@henryiii
Copy link
Member Author

henryiii commented Jun 8, 2024

Curious, I naively would have thought:

- run: CFLAGS=-fexceptions LDFLAGS=-fexceptions pyodide build --exports whole_archive
+ run: pyodide build --exports whole_archive

Along with

[tool.pyodide.build]
cxxflags = "-fexceptions"
ldflags = "-fexceptions"

Would have been the same, but now it's reporting

Failed to load lib  /home/runner/work/boost-histogram/boost-histogram/.venv-pyodide/lib/python3.12/site-packages/boost_histogram/_core.cpython-312-wasm32-emscripten.so
Error: need the dylink section to be first
    at failIf (/home/runner/work/boost-histogram/boost-histogram/.pyodide-xbuildenv-0.26.1/0.26.1/xbuildenv/pyodide-root/dist/pyodide.asm.js:10:137604)
    at getDylinkMetadata (/home/runner/work/boost-histogram/boost-histogram/.pyodide-xbuildenv-0.26.1/0.26.1/xbuildenv/pyodide-root/dist/pyodide.asm.js:10:138151)
    at loadWebAssemblyModule (/home/runner/work/boost-histogram/boost-histogram/.pyodide-xbuildenv-0.26.1/0.26.1/xbuildenv/pyodide-root/dist/pyodide.asm.js:10:147672)
    at /home/runner/work/boost-histogram/boost-histogram/.pyodide-xbuildenv-0.26.1/0.26.1/xbuildenv/pyodide-root/dist/pyodide.asm.js:10:154278
    at async Object.loadDynlib (/home/runner/work/boost-histogram/boost-histogram/.pyodide-xbuildenv-0.26.1/0.26.1/xbuildenv/pyodide-root/dist/pyodide.asm.js:10:89059)
    at async main (/home/runner/work/boost-histogram/boost-histogram/.pyodide-xbuildenv-0.26.1/0.26.1/xbuildenv/pyodide-root/dist/python:139:13)
ImportError while loading conftest '/home/runner/work/boost-histogram/boost-histogram/tests/conftest.py'.
tests/conftest.py:15: in <module>
    import boost_histogram as bh
.venv-pyodide/lib/python3.12/site-packages/boost_histogram/__init__.py:3: in <module>
    from . import accumulators, axis, numpy, storage
.venv-pyodide/lib/python3.12/site-packages/boost_histogram/accumulators.py:3: in <module>
    from ._core.accumulators import (  # pylint: disable=import-error,no-name-in-module
E   ImportError: dynamic module does not define module export function (PyInit__core)

I do see this warning when building, which is suspicious:

linking a library with `-shared` will emit a static object file.  This is a form of emulation to support existing build systems.  If you want to build a runtime shared library use the SIDE_MODULE setting

pyproject.toml Outdated Show resolved Hide resolved
@henryiii henryiii merged commit ccda481 into develop Jun 8, 2024
16 checks passed
@henryiii henryiii deleted the henryiii/ci/emscripten4 branch June 8, 2024 04:29
@hoodmane
Copy link

hoodmane commented Jun 8, 2024

now it's reporting ...

Well that sounds like another thing to look into...

@henryiii henryiii removed the needs changelog Might need a changelog entry label Aug 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants