Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add nested dict support #208

Merged
merged 20 commits into from
Feb 21, 2024
Merged

add nested dict support #208

merged 20 commits into from
Feb 21, 2024

Conversation

synodriver
Copy link
Contributor

@synodriver synodriver commented Apr 2, 2022

Closes #199

@synodriver
Copy link
Contributor Author

synodriver commented Apr 2, 2022

By the way, I've fixed some compiler warnings on msvc about type convertion from Py_ssize_t to int.

@synodriver
Copy link
Contributor Author

synodriver commented Apr 2, 2022

Wow the ci fails somehow. Maybe lua should build first and link python extension against lua.dll or liblua.so. See my github action's script

Copy link
Owner

@scoder scoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Seems ok in general. I left some comments.

lupa/_lupa.pyx Outdated Show resolved Hide resolved
lupa/_lupa.pyx Outdated Show resolved Hide resolved
lupa/_lupa.pyx Outdated Show resolved Hide resolved
lupa/_lupa.pyx Outdated Show resolved Hide resolved
lupa/tests/test.py Outdated Show resolved Hide resolved
self.assertEqual(table["a"]["a"], "foo")
self.assertEqual(table["b"]["b"], "bar")
self.lua.globals()["data"] = table
self.lua.eval("assert(data.a.a=='foo', 'failed')")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a test where we make sure that this raises an exception if the condition fails, so that the test actually fails?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I would add it later

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you've added such a test.
I guess you could also move the test assertions out of Lua and write something like this instead:

self.assertEqual(self.lua.eval("data.a.a"), 'foo')

Feel free to add a test helper method to make this shorter:

def assertLuaResult(lua_expression, result):
    self.assertEqual(self.lua.eval(lua_expression), result)

lupa/tests/test.py Outdated Show resolved Hide resolved
lupa/tests/test.py Outdated Show resolved Hide resolved
setup.py Outdated
Comment on lines 336 to 341
ext_modules = cythonize(ext_modules, compiler_directives={
"always_allow_keywords": True,
"cdivision": True,
"boundscheck": False,
"wraparound": False
})
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unrelated to this PR and is worth a discussion.

Copy link
Contributor Author

@synodriver synodriver Apr 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just speed up generated code by disabling extra check.

@synodriver
Copy link
Contributor Author

Conflicts are resolved now.

lupa/_lupa.pyx Outdated Show resolved Hide resolved
lupa/_lupa.pyx Outdated Show resolved Hide resolved
lupa/_lupa.pyx Outdated
Comment on lines 160 to 163
cdef inline int _len_as_int(Py_ssize_t obj) except -1:
if obj > <Py_ssize_t>LONG_MAX:
raise OverflowError
return <int>obj
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could probably still get a C compiler warning here if sizeof(long) == sizeof(Py_ssize_t), because it would detect that the condition is always false and, thus, useless.

BTW, why do you compare to LONG_MAX and not INT_MAX if you want to return an int?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a better way to deal with it or we should direct cast it?

lupa/_lupa.pyx Outdated
o = (<_PyProtocolWrapper> o)._obj
pushed_values_count = py_to_lua_custom(runtime, L, o, type_flags)
elif recursive and isinstance(o, Sequence):
lua.lua_createtable(L, _len_as_int(len(o)), 0) # create a table at the top of stack, with narr already known
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's probably an error to handle if the allocation fails.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that used to prevent from over nesting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lua_createtable has no return value, so how do we know if it fails?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think this applies to a couple of other conversions as well. Lua has an error handler that uses longjmp, that probably applies here, too. That is quite annoying, because Lupa does not handle this currently (and it's really not trivial to handle, especially in Cython).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late reply. I think detect reference cycles in the data structures is not easy, and probably we should just limit the maximum depth of recursion

lupa/_lupa.pyx Outdated Show resolved Hide resolved
Copy link
Owner

@scoder scoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a comment in the original ticket asking about reference cycles in the data structures. And that is definitely an issue. As it stands, it looks like the current PR can run into an infinite loop and crash if you pass a dict with a nested reference to itself. This needs to be handled.

lupa/_lupa.pyx Outdated
o = (<_PyProtocolWrapper> o)._obj
pushed_values_count = py_to_lua_custom(runtime, L, o, type_flags)
elif recursive and isinstance(o, Sequence):
lua.lua_createtable(L, _len_as_int(len(o)), 0) # create a table at the top of stack, with narr already known
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think this applies to a couple of other conversions as well. Lua has an error handler that uses longjmp, that probably applies here, too. That is quite annoying, because Lupa does not handle this currently (and it's really not trivial to handle, especially in Cython).

@scoder
Copy link
Owner

scoder commented Jan 2, 2024

Rather than requiring users to think about a hard depth limit for their data structures, what do you think about keeping a dict of mapped objects (by original object ID) and reusing already mapped object references? That way, we could really map recursive structures 1:1 without the risk to run into infinite loops while mapping.

@synodriver
Copy link
Contributor Author

Rather than requiring users to think about a hard depth limit for their data structures, what do you think about keeping a dict of mapped objects (by original object ID) and reusing already mapped object references? That way, we could really map recursive structures 1:1 without the risk to run into infinite loops while mapping.

Great idea, I've implemented it in the latest commit.

lupa/_lupa.pyx Outdated
Comment on lines 550 to 551
py_to_lua(self, L, key, True, recursive)
py_to_lua(self, L, value, False, recursive)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call arguments like None or False tend to be opaque – unless you know by heart which parameter they refer to, you have to look up the signature in order to understand what is requested. Better use keyword arguments. It's a C function call, so they are resolved at compile time.

Suggested change
py_to_lua(self, L, key, True, recursive)
py_to_lua(self, L, value, False, recursive)
py_to_lua(self, L, key, wrap_none=True, recursive=recursive)
py_to_lua(self, L, value, wrap_none=False, recursive=recursive)

lupa/_lupa.pyx Outdated
lua.lua_rawset(L, -3)
else:
for arg in obj:
py_to_lua(self, L, arg)
py_to_lua(self, L, arg, False, recursive)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
py_to_lua(self, L, arg, False, recursive)
py_to_lua(self, L, arg, wrap_none=False, recursive=recursive)

lupa/_lupa.pyx Outdated
"""Create a new table from Python mapping or iterable.

table_from() accepts either a dict/mapping or an iterable with items.
Items from dicts are set as key-value pairs; items from iterables
are placed in the table in order.

Nested mappings / iterables are passed to Lua as userdata
(wrapped Python objects); they are not converted to Lua tables.
(wrapped Python objects) if `recursive` is False, they are not converted to Lua tables.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is what you mean, right?

Suggested change
(wrapped Python objects) if `recursive` is False, they are not converted to Lua tables.
(wrapped Python objects). If `recursive` is False, they are not converted to Lua tables.

lupa/_lupa.pyx Outdated
Comment on lines 1609 to 1635
elif recursive and isinstance(o, (list, Sequence)):
if mapped_objs is None:
mapped_objs = {}
if id(o) not in mapped_objs:
lua.lua_createtable(L, _len_as_int(len(o)), 0) # create a table at the top of stack, with narr already known
mapped_objs[id(o)] = lua.lua_gettop(L)
for i, v in enumerate(o, 1):
py_to_lua(runtime, L, v, wrap_none, recursive, mapped_objs)
lua.lua_rawseti(L, -2, i)
else: # self-reference detected
idx = mapped_objs[id(o)]
lua.lua_pushvalue(L, idx)
pushed_values_count = 1
elif recursive and isinstance(o, (dict, Mapping)):
if mapped_objs is None:
mapped_objs = {}
if id(o) not in mapped_objs:
lua.lua_createtable(L, 0, _len_as_int(len(o))) # create a table at the top of stack, with nrec already known
mapped_objs[id(o)] = lua.lua_gettop(L)
for key, value in o.items():
py_to_lua(runtime, L, key, wrap_none, recursive, mapped_objs)
py_to_lua(runtime, L, value, wrap_none, recursive, mapped_objs)
lua.lua_rawset(L, -3)
else: # self-reference detected
idx = mapped_objs[id(o)]
lua.lua_pushvalue(L, idx)
pushed_values_count = 1
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd reverse the conditions to put the simple "already mapped" case first.

Suggested change
elif recursive and isinstance(o, (list, Sequence)):
if mapped_objs is None:
mapped_objs = {}
if id(o) not in mapped_objs:
lua.lua_createtable(L, _len_as_int(len(o)), 0) # create a table at the top of stack, with narr already known
mapped_objs[id(o)] = lua.lua_gettop(L)
for i, v in enumerate(o, 1):
py_to_lua(runtime, L, v, wrap_none, recursive, mapped_objs)
lua.lua_rawseti(L, -2, i)
else: # self-reference detected
idx = mapped_objs[id(o)]
lua.lua_pushvalue(L, idx)
pushed_values_count = 1
elif recursive and isinstance(o, (dict, Mapping)):
if mapped_objs is None:
mapped_objs = {}
if id(o) not in mapped_objs:
lua.lua_createtable(L, 0, _len_as_int(len(o))) # create a table at the top of stack, with nrec already known
mapped_objs[id(o)] = lua.lua_gettop(L)
for key, value in o.items():
py_to_lua(runtime, L, key, wrap_none, recursive, mapped_objs)
py_to_lua(runtime, L, value, wrap_none, recursive, mapped_objs)
lua.lua_rawset(L, -3)
else: # self-reference detected
idx = mapped_objs[id(o)]
lua.lua_pushvalue(L, idx)
pushed_values_count = 1
elif recursive and isinstance(o, (list, Sequence)):
if mapped_objs is None:
mapped_objs = {}
if id(o) in mapped_objs:
obj = mapped_objs[id(o)]
lua.lua_pushvalue(L, obj)
else:
lua.lua_createtable(L, _len_as_int(len(o)), 0) # create a table at the top of stack, with narr already known
mapped_objs[id(o)] = lua.lua_gettop(L)
for i, v in enumerate(o, 1):
py_to_lua(runtime, L, v, wrap_none, recursive, mapped_objs)
lua.lua_rawseti(L, -2, i)
pushed_values_count = 1
elif recursive and isinstance(o, (dict, Mapping)):
if mapped_objs is None:
mapped_objs = {}
if id(o) in mapped_objs:
obj = mapped_objs[id(o)]
lua.lua_pushvalue(L, obj)
lua.lua_createtable(L, 0, _len_as_int(len(o))) # create a table at the top of stack, with nrec already known
mapped_objs[id(o)] = lua.lua_gettop(L)
for key, value in o.items():
py_to_lua(runtime, L, key, wrap_none, recursive, mapped_objs)
py_to_lua(runtime, L, value, wrap_none, recursive, mapped_objs)
lua.lua_rawset(L, -3)
pushed_values_count = 1

Actually, I'd even reuse the existing LuaRuntime.table_from() implementation here. Or, rather, extract a cdef function from it and use it in both places. This seems needlessly duplicated code.

lupa/__init__.py Outdated
Comment on lines 66 to 69
try:
_newest_lib = __import__(module_name[0], level=1, fromlist="*", globals=globals())
except ModuleNotFoundError:
_newest_lib = __import__(module_name[1], level=1, fromlist="*", globals=globals())
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unrelated. Why did you change it?

Copy link
Contributor Author

@synodriver synodriver Jan 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this is related to another bug. If you link to your custom lua libs, the original code will fail to find corret module. Reated to this issue

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then please revert it. You can propose it as a separate PR, if you think that it's the right thing to do. But it's unrelated to the change in this "nested dict" PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I will put it in a seperate pr.

@scoder
Copy link
Owner

scoder commented Jan 6, 2024

I've extracted the logic from LuaRuntime.table_from() in the master branch to make it reusable for the recursive container mappings. Could you please adapt your code? Most of the changes in py_to_lua can probably be avoided now.

@synodriver
Copy link
Contributor Author

synodriver commented Jan 6, 2024

Most of the changes in py_to_lua can probably be avoided now.

I'm afraid not because py_to_lua_table still make calls to py_to_lua

@scoder
Copy link
Owner

scoder commented Jan 6, 2024

Most of the changes in py_to_lua can probably be avoided now.

I'm afraid not because py_to_lua_table still make calls to py_to_lua

Sorry, I just meant the additions, not the flags etc. Both functions need to call each other, but you can avoid the dict/list mappings in py_to_lua now.

@synodriver
Copy link
Contributor Author

avoid the dict/list mappings in py_to_lua now.

Could you please give me some advice about that? Perhaps I misunderstood your idea, but I think there might be dict in dict, so we have to check them in py_to_lua again.

@scoder
Copy link
Owner

scoder commented Jan 6, 2024

avoid the dict/list mappings in py_to_lua now.

Could you please give me some advice about that? Perhaps I misunderstood your idea, but I think there might be dict in dict, so we have to check them in py_to_lua again.

I think it mostly boils down to something like this:

     elif recursive and isinstance(o, (list, dict, Sequence, Mapping)):
        # check existing mapped_objs cache, then failing that:
        table = py_to_lua_table([o], recursive=True)
        # update cache

@synodriver
Copy link
Contributor Author

Finally... I made it, looks better now.

Copy link
Owner

@scoder scoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that removes a lot of duplicated functionality. Looks much better now.

lupa/_lupa.pyx Outdated
Comment on lines 1686 to 1687
if id(obj) not in mapped_objs:
mapped_objs[id(obj)] = lua.lua_gettop(L)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this part. Why would the Lua object on the top of the stack be related to the next object that will be mapped? Shouldn't the cache update happen after the mapping in the loop body?

Copy link
Contributor Author

@synodriver synodriver Jan 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the call is recursive. If the cache is moved to the end of the loop body, py_to_lua might trigger a call to
py_to_lua_table again during the loop, which triggers another call and so on, so there might be no chance to set the cache.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And, this function creates a table at the beginning, which will be filled with data from python objects, and that table will be converted into a _LuaTable, so we just need to keep an eye on that table.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you try mapping a Python list like [[3, 3, 3]] to Lua with this? I.e. one that actually repeats the same value multiple times, so that the cache would be hit? It looks like it might end up referencing the mapped table itself as a value in the Lua table at some point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[[3, 3, 3]] have a depth of 2, it will call py_to_lua_table twice. Each time the id of the list is cached, and at the second time, after the inner list is cached, it just goes into the list handler of the loop, without caching the PyLongObject. That means, only list and dict is cached.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A new test have been added. It works as expected.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And, this function creates a table at the beginning, which will be filled with data from python objects, and that table will be converted into a _LuaTable, so we just need to keep an eye on that table.

Ok, so, do I understand correctly that the invariant in this for-loop is that the top of the Lua stack always refers to the Lua table that we are filling? And that is the value that we put into mapped_objs?

Why don't you read the stack top once, before the loop, give it a name (like lua_table_ref) and use that instead? That would be much clearer than reading seemingly randomly from the top of the Lua stack and putting whatever we get into mapped_objs for an object ID that we have never seen before.

self.assertEqual(table["a"]["a"], "foo")
self.assertEqual(table["b"]["b"], "bar")
self.lua.globals()["data"] = table
self.lua.eval("assert(data.a.a=='foo', 'failed')")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you've added such a test.
I guess you could also move the test assertions out of Lua and write something like this instead:

self.assertEqual(self.lua.eval("data.a.a"), 'foo')

Feel free to add a test helper method to make this shorter:

def assertLuaResult(lua_expression, result):
    self.assertEqual(self.lua.eval(lua_expression), result)

lupa/tests/test.py Outdated Show resolved Hide resolved
lupa/tests/test.py Outdated Show resolved Hide resolved
lupa/__init__.py Outdated
Comment on lines 66 to 69
try:
_newest_lib = __import__(module_name[0], level=1, fromlist="*", globals=globals())
except ModuleNotFoundError:
_newest_lib = __import__(module_name[1], level=1, fromlist="*", globals=globals())
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then please revert it. You can propose it as a separate PR, if you think that it's the right thing to do. But it's unrelated to the change in this "nested dict" PR.

lupa/_lupa.pyx Outdated
"""Create a new table from Python mapping or iterable.

table_from() accepts either a dict/mapping or an iterable with items.
Items from dicts are set as key-value pairs; items from iterables
are placed in the table in order.

Nested mappings / iterables are passed to Lua as userdata
(wrapped Python objects); they are not converted to Lua tables.
(wrapped Python objects). If `recursive` is False, they are not converted to Lua tables.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(wrapped Python objects). If `recursive` is False, they are not converted to Lua tables.
(wrapped Python objects). If `recursive` is False (the default),
they are not converted to Lua tables.

lupa/_lupa.pyx Outdated
@@ -1511,7 +1516,7 @@ cdef int py_to_lua_handle_overflow(LuaRuntime runtime, lua_State *L, object o) e
lua.lua_settop(L, old_top)
raise

cdef int py_to_lua(LuaRuntime runtime, lua_State *L, object o, bint wrap_none=False) except -1:
cdef int py_to_lua(LuaRuntime runtime, lua_State *L, object o, bint wrap_none=False, bint recursive=False, dict mapped_objs = None) except -1:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use consistent code style.

Suggested change
cdef int py_to_lua(LuaRuntime runtime, lua_State *L, object o, bint wrap_none=False, bint recursive=False, dict mapped_objs = None) except -1:
cdef int py_to_lua(LuaRuntime runtime, lua_State *L, object o, bint wrap_none=False, bint recursive=False, dict mapped_objs=None) except -1:

lupa/_lupa.pyx Outdated
Comment on lines 1686 to 1687
if id(obj) not in mapped_objs:
mapped_objs[id(obj)] = lua.lua_gettop(L)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And, this function creates a table at the beginning, which will be filled with data from python objects, and that table will be converted into a _LuaTable, so we just need to keep an eye on that table.

Ok, so, do I understand correctly that the invariant in this for-loop is that the top of the Lua stack always refers to the Lua table that we are filling? And that is the value that we put into mapped_objs?

Why don't you read the stack top once, before the loop, give it a name (like lua_table_ref) and use that instead? That would be much clearer than reading seemingly randomly from the top of the Lua stack and putting whatever we get into mapped_objs for an object ID that we have never seen before.

lupa/_lupa.pyx Outdated Show resolved Hide resolved
Comment on lines +604 to +605
def test_table_from_nested(self):
table = self.lua.table_from([[3, 3, 3]], recursive=True)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may not have been the best example, just something that I came up with trying to understand your implementation. However, a more deeply nested example would be good, to make sure that, say, a dict of lists of dicts of dicts of lists of lists also works, or a (list) table of (mapping) tables. You're only testing one level of nesting, always starting with a dict. Maybe you can generate a data structure (or a few), map it back and forth, and then compare the result to itself? That would allow testing something larger and deeper.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accutally there are other tests which are far more deeper, for instance, the test_table_from_self_ref_obj, which is a infinite deep structure.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't so much referring to the depth, rather different combinations of nestings. It might make a difference whether you're mapping a list of dicts or a dict of lists. It might make a difference whether you're mapping a list of simple values or a list of lists of lists. The tests should cover different combinations. Thus, generating data structures and validating the mapping generically would be great, because it would allow quickly testing a larger range of possible structures, making sure that we're not missing edge cases, and making sure we're touching all mapping code. We're probably not covering all of it currently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll try to cover more tests, like List[Dict], Dict[str, Dict], List[List], Dict[str, List]```

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about something like this:

def test():
    from itertools import count
    def make_ds(*children):
        for child in children:
            yield child
        yield list(children)
        yield dict(zip(count(), children))
        yield {chr(ord('A') + i): child for i, child in enumerate(children)}

    for ds1 in make_ds(1, 2, 'x', 'y'):
        for ds2 in make_ds(ds1):
            for ds in make_ds(ds1, ds2):
                table = self.lua.table_from(ds)
                # validate table == ds

I hope it's mostly clear, the idea is to generate 0-2 level data structures of all sorts of nesting combinations, and then validate that the result looks the same in Lua as in Python. The validation would probably involve a couple of type checks and back-mappings Lua->Python, but shouldn't be too complex. The above loops generate somewhat redundant results, AFAICT, but not too bad. Maybe you can come up with something smarter even.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opps... I found a bug, and it can be reproduced in master branch, just use lua.table_from(1), in this way py_to_lua_table got a (1, ), and as it iterate over the tuple, the PyLongObject is passed to

 else:
       for arg in obj:

but int is not iterable. Should fix it somehow so that I can continus with my pr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The structures generated by this function should also be applied to master branch I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, there might be some gc problem with cpython itself, in the loop test there are some lua tables can't match the corresponding pyobject, but when I test them in a individual test, the problem just gone.

Copy link
Owner

@scoder scoder Feb 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My test code proposal was buggy. Here's a revised version that does not generate single elements:

def test():
    from itertools import count
    def make_ds(*children):
        yield list(children)
        yield dict(zip(count(), children))
        yield {chr(ord('A') + i): child for i, child in enumerate(children)}

    elements = [1, 2, 'x', 'y']
    for ds1 in make_ds(*elements):
        for ds2 in make_ds(ds1):
            for ds3 in make_ds(ds1, elements, ds2):
                for ds in make_ds(ds1, ds2, ds3):
                    table = self.lua.table_from(ds)
                    # validate table == ds

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this test to the master branch and merged it into this PR. Since the "validation" in the master branch is really rudimentary, it hopefully won't fail here, but can now be adapted to validate the transitive translation.

lupa/_lupa.pyx Outdated
Comment on lines 1681 to 1682
cdef int lua_table_ref = lua.lua_gettop(L) # the index of the lua table which we are filling
# FIXME: how to check for failure?
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment was referring to the line above. And it's probably outdated, because we can't check for failures, Lua would just jump to the error handler.

Suggested change
cdef int lua_table_ref = lua.lua_gettop(L) # the index of the lua table which we are filling
# FIXME: how to check for failure?
# FIXME: handle allocation errors
cdef int lua_table_ref = lua.lua_gettop(L) # the index of the lua table which we are filling

lupa/_lupa.pyx Outdated
Comment on lines 1692 to 1694
# this object has been cached, just get the corresponding lua table's index
idx = mapped_objs[id(obj)]
return new_lua_table(runtime, L, <int>idx)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks for that comment, that actually helps. Then you're not using mapped_objs as it's named but rather as a mapped_tables, right? Seems worth renaming then to make the code clearer.

lupa/_lupa.pyx Outdated Show resolved Hide resolved
Comment on lines +604 to +605
def test_table_from_nested(self):
table = self.lua.table_from([[3, 3, 3]], recursive=True)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about something like this:

def test():
    from itertools import count
    def make_ds(*children):
        for child in children:
            yield child
        yield list(children)
        yield dict(zip(count(), children))
        yield {chr(ord('A') + i): child for i, child in enumerate(children)}

    for ds1 in make_ds(1, 2, 'x', 'y'):
        for ds2 in make_ds(ds1):
            for ds in make_ds(ds1, ds2):
                table = self.lua.table_from(ds)
                # validate table == ds

I hope it's mostly clear, the idea is to generate 0-2 level data structures of all sorts of nesting combinations, and then validate that the result looks the same in Lua as in Python. The validation would probably involve a couple of type checks and back-mappings Lua->Python, but shouldn't be too complex. The above loops generate somewhat redundant results, AFAICT, but not too bad. Maybe you can come up with something smarter even.

@synodriver
Copy link
Contributor Author

Iet me explaint what I found these days. There is a bug in master branch which can not pass your testcase.

lua.table_from(1)

will always end up with an error: int is not iterable.
At the same time, there might be some gc problem with cpython itself, in the loop test there are some lua tables can't match the corresponding pyobject, but when I test them in a individual test, the problem just disappear, making it very hard to debug.

@scoder
Copy link
Owner

scoder commented Feb 20, 2024

There is a bug in master branch which can not pass your testcase.

lua.table_from(1)

That's not a bug. The method expects one or more iterables or mappings as arguments, not plain values. I.e., you can write lua.from_table([1], [2], [3,4,5]), but not lua.from_table(1,2,3,4,5). That's also what the docstring says, at least as I read it.

Edit: BTW, it's a good thing that we currently reject simple arguments. There's negative precedence with Python's builtin min and max functions, which accept either a single iterable or multiple simple values as arguments to compare. The problem is that a single argument becomes ambiguous. You cannot say max(5), because it gives you the irritating error message "5 is not iterable", whereas max(5,6) just works and does what you expected, without needing any kind of obvious iteration. In lua.table_from(), at least you get this error message for whatever simple (non-iterable) arguments you pass.

@synodriver
Copy link
Contributor Author

Thanks for your help with the nested data structure generator, it looks better now. The last thing to do is figure out a way to compare them without encountering strange garbage collection bugs in CPython.....

@scoder scoder merged commit b204445 into scoder:master Feb 21, 2024
88 checks passed
@scoder
Copy link
Owner

scoder commented Feb 21, 2024

Thanks for all the work on this. I think it's good enough to be released in Lupa 2.1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

nested dict to lua dictionary
2 participants