Dtype mod #327

Draft
wants to merge 20 commits into
base: master

v923z
Owner

@v923z v923z commented Feb 18, 2021

This PR adds the option to extend ulab in a transparent way. This means that the user is able to add their own data container in the C implementation, and if they supply a readout function, then various numpy methods should be able to access the data in the container. Such a facility could be exploited to process data that do not reside in RAM, either because they are not available, or because the amount would be prohibitive.

Two possible use cases are

  1. implementing complicated generator expressions
  2. processing image data that contain megapixels of information (openmv, ulab integration into openmv openmv/openmv#881; pixels in the image can be accessed via https://github.com/openmv/openmv/blob/master/src/omv/modules/py_image.c#L402)

The type definition of ndarray is extended with a blocks_block_obj_t structure:

typedef struct _blocks_block_obj_t {
    mp_obj_base_t base;
    uint8_t ndim;
    void *ndarray;
    void *arrfunc;
    uint8_t *subarray;
    size_t shape[ULAB_MAX_DIMS];
    void *origin;
} blocks_block_obj_t;

typedef struct _ndarray_obj_t {
    mp_obj_base_t base;
    dtype_dtype dtype;
    uint8_t itemsize;
    uint8_t boolean;
    uint8_t ndim;
    size_t len;
    size_t shape[ULAB_MAX_DIMS];
    int32_t strides[ULAB_MAX_DIMS];
    void *array;
    #if ULAB_HAS_BLOCKS
    uint8_t flags;
    blocks_block_obj_t *block;
    #endif
} ndarray_obj_t;

blocks_block_obj_t holds a pointer to the readout function, *arrfunc, and a pointer to a temporary container, *subarray. The subarray has to be able to hold a single line of data, i.e., it must be at least as long as the longest axis of the tensor. This single line can then be passed to the innermost loop of all numerical functions, binary operators, etc. An example is the summation macro

#define RUN_SUM1(type, ndarray, array, results, rarray, ss)\
({\
    type sum = 0;\
    uint8_t *barray = (array);\
    int32_t increment = (ss).strides[0];\
    if((ndarray)->flags) {\
        void (*arrfunc)(ndarray_obj_t *, void *, int32_t *, size_t) = (ndarray)->block->arrfunc;\
        arrfunc((ndarray), (array), &increment, (ss).shape[0]);\
        barray = (ndarray)->block->subarray;\
    }\
    for(size_t i=0; i < (ss).shape[0]; i++) {\
        sum += *((type *)(barray));\
        barray += increment;\
    }\
    (array) += (ss).shape[0] * (ss).strides[0];\
    memcpy((rarray), &sum, (results)->itemsize);\
    (rarray) += (results)->itemsize;\
})
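
The line-by-line readout that RUN_SUM1 relies on can be modelled in a few lines of plain Python. This is only a sketch of the idea, not the ulab implementation: LazyBlock, square_reader, and sum_last_axis are hypothetical names, and the reader mimics the dummy square numbers used in this PR, without the uint8 cast.

```python
class LazyBlock:
    """Hypothetical stand-in for an ndarray whose data live outside of RAM."""

    def __init__(self, shape, reader):
        self.shape = shape
        self.reader = reader
        # scratch buffer for a single line, as long as the longest axis
        self.subarray = [0] * max(shape)


def square_reader(block, row, count):
    # fill the scratch buffer with dummy data for the requested row
    x = row * block.shape[0]
    for i in range(count):
        block.subarray[i] = (x + i) * (x + i)


def sum_last_axis(block):
    # outer loop over the rows; the inner summation only sees the scratch buffer
    rows, cols = block.shape
    results = []
    for row in range(rows):
        block.reader(block, row, cols)
        results.append(sum(block.subarray[:cols]))
    return results


f = LazyBlock((3, 4), square_reader)
print(sum_last_axis(f))
```

At no point does the full 3x4 array exist in memory; only one line of four values is held at a time, which is the property the subarray is meant to provide.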

The user can then simply attach their readout function by defining a type

extern const mp_obj_type_t imreader_type;

void imreader_imreader(ndarray_obj_t *ndarray, void *array, int32_t *strides, size_t count) {
    blocks_block_obj_t *block = (blocks_block_obj_t *)ndarray->block;
    uint8_t *barray = (uint8_t *)block->subarray;
    // if necessary, get the coordinates in the original reference frame, i.e.,
    // in the coordinates used at the time of the creation of the object
    size_t *coords = blocks_coords_from_pointer(array, ndarray);
    uint8_t x = (uint8_t)coords[ULAB_MAX_DIMS - 2] * (uint8_t)block->shape[ULAB_MAX_DIMS - 2];
    for(size_t i = 0; i < count; i++) {
        // fill up the array with dummy data
        *barray++ = (uint8_t)((x + i) * (x + i));
    }
    // The subarray is a forward propagating dense array, so set the strides to the itemsize
    *strides = ndarray->itemsize;
}

mp_obj_t imreader_make_new(const mp_obj_type_t *type, size_t n_args, size_t n_kw, const mp_obj_t *args) {
    (void)type;
    mp_arg_check_num(n_args, n_kw, 0, 1, true);
    mp_map_t kw_args;
    mp_map_init_fixed_table(&kw_args, n_kw, args + n_args);
    static const mp_arg_t allowed_args[] = {
        { MP_QSTR_, MP_ARG_OBJ, { .u_obj = mp_const_none } },
    };
    mp_arg_val_t _args[MP_ARRAY_SIZE(allowed_args)];
    mp_arg_parse_all(n_args, args, &kw_args, MP_ARRAY_SIZE(allowed_args), allowed_args, _args);
    blocks_transformer_obj_t *transformer = m_new_obj(blocks_transformer_obj_t);
    transformer->base.type = &blocks_transformer_type;
    transformer->arrfunc = imreader_imreader;
    transformer->array = NULL;
    return MP_OBJ_FROM_PTR(transformer);
}

const mp_obj_type_t imreader_type = {
    { &mp_type_type },
    .name = MP_QSTR_imreader,
    .make_new = imreader_make_new,
};

The reader above fills the subarray with square numbers, so summing the array yields sums of squares.

In Python, the mock-up looks like this

from ulab import blocks
from ulab import user
from ulab import numpy as np

f = blocks.ndarray(shape=(5,5), transformer=user.imreader(), dtype=np.uint8)

print(f)
print(np.sum(f, axis=1))
for i in f:
    print(i, np.sum(i, axis=0))

Slicing, indexing and the like happen in the usual way, since even in the standard case, such operations only update the array header and move the position pointer.
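
To illustrate why slicing needs no access to the data, here is a small Python sketch of a header-only slice. It is a simplified model, not ulab code: Header and slice_rows are hypothetical names, offsets and strides are in bytes, and only non-negative indices with a positive step on the first axis of a 2D array are handled.

```python
from dataclasses import dataclass


@dataclass
class Header:
    """Hypothetical array header: no data, only bookkeeping (units: bytes)."""
    offset: int     # position of the first element relative to the origin
    shape: tuple
    strides: tuple


def slice_rows(h, start, stop, step=1):
    # take rows start:stop:step of a 2D header without reading any element
    n_rows = len(range(start, stop, step))
    return Header(
        offset=h.offset + start * h.strides[0],
        shape=(n_rows, h.shape[1]),
        strides=(h.strides[0] * step, h.strides[1]),
    )


full = Header(offset=0, shape=(5, 5), strides=(5, 1))   # dense uint8 array
view = slice_rows(full, 1, 5, 2)                         # rows 1 and 3
print(view)
```

Since only offset, shape, and strides change, the same bookkeeping applies whether the elements sit in RAM or are produced on demand by a readout function.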

In a numerical operation, a tensor is always traversed along an axis. Given the position of the data pointer, the coordinates of the pointer position can be calculated with the help of the size_t *blocks_coords_from_pointer(void *p1, ndarray_obj_t *ndarray) function:

size_t *blocks_coords_from_pointer(void *p1, ndarray_obj_t *ndarray) {
    // Calculates the coordinates in the original tensor from the position of the pointer
    // The original view is assumed to be dense, i.e., the strides can be computed from the shape
    // This is a utility function, and is not exposed to the python interpreter
    blocks_block_obj_t *block = ndarray->block;
    size_t diff = (uint8_t *)p1 - (uint8_t *)block->origin;
    size_t stride = ndarray->itemsize;
    size_t *coords = m_new(size_t, ULAB_MAX_DIMS);
    // first, calculate the very first stride
    for(uint8_t i = 0; i < block->ndim - 1; i++) {
        stride *= block->shape[ULAB_MAX_DIMS - i - 1];
    }
    for(uint8_t i = block->ndim; i > 1; i--) {
        coords[ULAB_MAX_DIMS - i] = diff / stride;
        // remove the bytes accounted for by this coordinate (index times stride)
        diff -= coords[ULAB_MAX_DIMS - i] * stride;
        stride /= block->shape[ULAB_MAX_DIMS - i + 1];
    }
    // the coordinate along the last axis is not computed here
    return coords;
}
With these coordinates, the imreader can easily fill up the subarray.
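
The arithmetic behind the coordinate lookup can be checked with a short Python sketch. It assumes a dense, row-major layout, as the C function does; coords_from_offset is a hypothetical helper, and unlike the C version it also returns the coordinate along the last axis.

```python
def coords_from_offset(offset, shape, itemsize):
    # strides of a dense, row-major array, in bytes
    strides = []
    stride = itemsize
    for length in reversed(shape):
        strides.insert(0, stride)
        stride *= length
    # peel off one coordinate per axis, starting with the slowest one
    coords = []
    for stride in strides:
        index, offset = divmod(offset, stride)
        coords.append(index)
    return tuple(coords)


print(coords_from_offset(13, (5, 5), 1))
```

For a 5x5 uint8 array, a pointer 13 bytes past the origin corresponds to row 2, column 3, since 13 = 2 * 5 + 3.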

Such a construct should not support the buffer protocol, since there might not be an easy way of resolving what should happen:
#327 (comment), #327 (comment)

Passing arguments to the type (imreader above) should be possible as, e.g., with ndarray.

Outstanding issues:

  1. sort out those functions that do not make sense for such a structure (e.g., rolling might not be relevant)
  2. what should happen with overflows? These could be handled by declaring the image to be of float type, but I am not sure whether this would lead to problems later on.
  3. memory footprint, reducing the size of the extra payload in dtype. This last question is probably sorted out by attaching the extra structure to the ndarray only when it is needed. The ndarray always carries the pointer, but RAM for the structure itself is reserved only in
    blocks_block_obj_t *block = m_new_obj(blocks_block_obj_t);
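
To make the overflow question in item 2 concrete: the dummy reader in this PR stores (x + i) * (x + i) in a uint8_t, so the stored value wraps around as soon as x + i reaches 16. A minimal Python sketch of that C cast, where as_uint8 is a hypothetical helper that is valid for non-negative integers only:

```python
def as_uint8(value):
    # emulate the C cast (uint8_t)value for non-negative integers
    return value & 0xFF


for v in (15, 16, 24):
    print(v, v * v, as_uint8(v * v))
```

A float dtype would avoid the wrap-around in the stored values, at the cost of a larger subarray.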

Functions and features

  • binary operators, broadcasting
  • sum
  • mean
  • std
  • sorting
  • diff
  • flip
  • fft
  • convolve
  • interp, scipy.signal?
  • numerical functions
  • polyfit, polyval
  • comparison functions

@kwagyeman

This is more or less what we need to support images. The data type we'd use would be a float, since users have often asked to do weird things with images that you can't do with ints.

What questions do you need answered from me? Given a row/color_channel it's very easy for me to fill a float array with pixel values. I can also easily fill a column.

@v923z
Owner Author

v923z commented Feb 19, 2021

@kwagyeman

This is more or less what we need to support images. The data type we'd use would be a float, since users have often asked to do weird things with images that you can't do with ints.

I don't need to know which data type you want to hold, int or float. This is resolved automatically when the function pointer is called.

What questions do you need answered from me? Given a row/color_channel it's very easy for me to fill a float array with pixel values. I can also easily fill a column.

I think we are pretty much on the same page. Let me iron out the implementation, and we can then pick it up from there.

@v923z
Owner Author

v923z commented Feb 21, 2021

@kwagyeman @iabdalkader What should happen with this construct if one wants to use the buffer protocol? Some context can be found in #335 and #328.

The problem is that in

mp_int_t ndarray_get_buffer(mp_obj_t self_in, mp_buffer_info_t *bufinfo, mp_uint_t flags) {
    ndarray_obj_t *self = MP_OBJ_TO_PTR(self_in);
    if(!ndarray_is_dense(self)) {
        return 1;
    }
    bufinfo->len = self->itemsize * self->len;
    bufinfo->buf = self->array;
    bufinfo->typecode = self->dtype;
    return 0;
}
we have to set a pointer to the underlying data, which I cannot provide, because self->array will not point to actual data.

I think the cleanest solution is to simply bail out, if the special flag is set for an ndarray. Do you agree? But you still have to point this out in your documentation.

@kwagyeman

We have a buffer protocol for the image object already. There is no need to duplicate it. So bailing makes sense.

@v923z
Owner Author

v923z commented Feb 22, 2021

We have a buffer protocol for the image object already. There is no need to duplicate it. So bailing makes sense.

OK, thanks!

@kwagyeman

@v923z Would you like to get on our slack? Email [email protected]

@v923z
Owner Author

v923z commented Feb 23, 2021

@v923z Would you like to get on our slack? Email [email protected]

@kwagyeman Thanks for the invitation! Sure!

@kwagyeman

You need to email me since you hide all contact info on your public profile.

@v923z
Owner Author

v923z commented Mar 4, 2021

@kwagyeman, @iabdalkader I have updated my original comment, and uploaded a working prototype. At the moment, only sum, std, and mean are supported, and you can iterate over the tensor elements. Adding the rest is not hard, but we should first converge on an interface function. The example

from ulab import blocks
from ulab import user
from ulab import numpy as np

f = blocks.ndarray(shape=(5,5), transformer=user.imreader(), dtype=np.uint8)

print(f)
print(np.sum(f, axis=1))
for i in f:
    print(i, np.sum(i, axis=0))

creates an ndarray header with shape=(5,5), attaches the function pointer via user.imreader, and sets the dtype to uint8. From this point on, f behaves like an ndarray, except that it fetches the data by means of the function pointer whenever data are needed. I simply return square numbers in

for(size_t i = 0; i < count; i++) {
    // fill up the array with dummy data
    *barray++ = (uint8_t)((x + i) * (x + i));
}
This is where one would implement a function that reads actual pixels.

I think this solution is quite flexible (by which I mean that the ulab core doesn't have to know anything about the implementation of imreader, it can completely be detached), but I might very well have overlooked something. Let me know what you think.

It would perhaps be great if we could call the .print function of imreader, so that in

void blocks_block_print(const mp_print_t *print, mp_obj_t self_in, mp_print_kind_t kind) {
    (void)kind;
    blocks_block_obj_t *self = MP_OBJ_TO_PTR(self_in);
    ndarray_obj_t *ndarray = (ndarray_obj_t *)self->ndarray;
    mp_printf(print, "block(shape=(%ld,", ndarray->shape[ULAB_MAX_DIMS - ndarray->ndim]);
    for(uint8_t i = 1; i < ndarray->ndim - 1; i++) {
        mp_printf(print, " %ld,", ndarray->shape[ULAB_MAX_DIMS - ndarray->ndim + i]);
    }
    if(ndarray->ndim > 1) {
        mp_printf(print, " %ld", ndarray->shape[ULAB_MAX_DIMS - 1]);
    }
    mp_print_str(print, "), dtype=");
    ndarray_print_dtype(print, ndarray);
    mp_print_str(print, ")");
}
, we could also indicate which transformer is used in a particular block. I haven't yet found a way of doing this, however.

@dhalbert
Contributor

dhalbert commented Apr 2, 2021

@v923z This is very interesting. I am unsure: is this only for reading out-of-RAM data? Do you need a corresponding write function?

@v923z
Owner Author

v923z commented Apr 2, 2021

@dhalbert

is this only for reading out-of-RAM data? Do you need a corresponding write function?

At the moment, this would only read; we could think about adding a write function.


3 participants