null or none value #1025

Open
ustcfdm opened this issue Apr 25, 2024 · 23 comments

@ustcfdm

ustcfdm commented Apr 25, 2024

I have noticed that there has been a lot of discussion about null or none values in TOML (e.g. #30 and #802), but I have a question about my own case.

In my case, users will set parameters to crop an image (top, bottom, left, right). If they don't want to crop that side, it seems that setting it to None is the best choice. To simplify the example, I will use a 1D image to illustrate it.

>>> import numpy as np
>>> a = np.arange(10)    # 1D image
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a[1:]         # Only crop the left by one pixel, do not crop the right side
array([1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a[1:None]   # None can be the parameter
array([1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a[1:0]          # Cannot use 0
array([], dtype=int32)
>>> a[1:-1]        # Cannot use -1, because the last element is lost
array([1, 2, 3, 4, 5, 6, 7, 8])
>>> a[1:np.inf]  # Cannot use inf, since it is not an integer
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: slice indices must be integers or None or have an __index__ method

In the above case, if we can have a toml such as

[config]
crop = [1, none]

it would let users easily leave the right side of the image uncropped. 0, -1, and inf do not work for this case. The value 10 works here, but it requires users to know the image size, which may be different every time. This is not convenient.

If none is not allowed in TOML, is there a good way to solve my problem? Any help is appreciated.

@arp242
Contributor

arp242 commented Apr 25, 2024

a[0:a.index(-1)], or something along those lines.

This is really a Python question, not a TOML question.

@ustcfdm
Author

ustcfdm commented Apr 25, 2024

Could you be more specific? a[0:a.index(-1)] returns an error.

>>> a[0:a.index(-1)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'numpy.ndarray' object has no attribute 'index'

@arp242
Contributor

arp242 commented Apr 25, 2024

It works for regular lists:

[~]% python
Python 3.12.3 (main, Apr 12 2024, 12:34:12) [GCC 13.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> l = [1, 2, 3, -1, 9, 9]
>>> l[0:l.index(-1)]
[1, 2, 3]
>>>

I don't know about numpy.

@ustcfdm
Author

ustcfdm commented Apr 25, 2024

This is not what I need. The index method finds the index of a certain value, but what I want is to set the index directly instead of calculating it from an element value. In addition, it is possible that the array is of float type. For example:

>>> a = np.random.rand(10)
>>> a
array([0.10020089, 0.60065517, 0.48733387, 0.64033642, 0.67891021,
       0.87084298, 0.61960618, 0.44778654, 0.47544359, 0.11425436])

@arp242
Contributor

arp242 commented Apr 25, 2024

Guess I don't really understand what you want then.

But like I said, this is a Python/numpy question, not a TOML specification issue.

@ustcfdm
Author

ustcfdm commented Apr 26, 2024

Well, to keep it short: what I want is a none or null value in TOML that can be mapped to None in Python. This is not a Python question but a TOML issue, because Python has None but TOML does not.

@eksortso
Contributor

Do you have a way to set the crop parameters without relying on an array to do it? Surely someone coded the logic between your TOML and the program's settings. Have them change it so you can name each component of that erstwhile array, then use an inline table to supply the values from your TOML and default to None where they are not provided.

For instance, say you set left and right values this way:

[config]
crop = {left = 1, right = 10}

You can then make the right open-ended by not providing a value for right, like so:

[config]
crop = {left = 1}

Or, more succinctly:

[config]
crop.left = 1

It could be coded so that crop will correctly handle both arrays and tables. But only with tables and explicit keys would you be able to set the right side to None in your Python.

@ustcfdm
Author

ustcfdm commented Apr 28, 2024

This is a workaround. It solves my problem to a certain extent, but not as well as I had hoped. In my opinion, parameters should be provided explicitly. Hidden default values are not good, especially when there are a lot of them. They may confuse users. In addition, in my example, if we want to set both values to None, then we have to write:

[config]
crop = {}

This is not very informative. If we could write

[config]
crop = {left=none, right=none}

it would be more obvious and much clearer.

@ChristianSi
Contributor

So what would be the difference between these four statements?

crop = {}

crop = {left=none}

crop = {right=none}

crop = {left=none, right=none}

If there wouldn't be any difference, surely the shortest would be the best one.

@ustcfdm
Author

ustcfdm commented Apr 30, 2024

Well, the difference is readability, as I have explained. crop = {} is not a good format. It violates the Zen of Python: "Explicit is better than implicit". For example, can you tell whether crop = {} is for a 1D image or a 2D image? Well, it is actually

crop = {left = 5, right = 5, top = 5, bottom = 5, front = 5, rear = 5}

Oh, it is for a 3D image, and the default is 5, not none! Isn't this surprising? Did you expect this answer before I told you? In addition, when the default is 5, how do you write it if you want a none value?

@eksortso
Contributor

First off, TOML is not Python. But table keys are more explicit than anonymous tuples in Python, and math.inf is a thing that you can check for from TOML already, instead of just blindly feeding configuration values straight into your slices.

They were right. This isn't a TOML issue. It is a Python issue, and you're looking for someone else to make logic shortcuts for you.

You want explicit, so use inf and -inf in your TOML.

crop.left = 5
crop.right = inf  # left side is 5, right side is unbound
crop.top = -inf
crop.bottom = 5   # bottom side is 5, top side is unbound
crop.front = -inf
crop.rear = inf   # no need for this, but who expected three dimensions before you moved the goal posts?

In your Python code, let's say conf came directly from your TOML, and a is your one-dimensional array. Try this.

from math import inf

crop = conf.get("crop", {})
left = None if (left_val := crop.get("left", None)) == (-inf) else left_val
right = None if (right_val := crop.get("right", None)) == inf else right_val
result = a[left:right]

If your users are informed of how to do this, then you just need to convert inf and -inf to None when you read them in. Or you could just do what we said before: if keys are not defined, use None in your dict as default values for them.

Use these tips, rely on smarter configuration logic, stop insisting that you don't have what you need, and you'll be fine.

@ustcfdm
Author

ustcfdm commented May 6, 2024

Thanks for your suggestion. You are right. I am indeed looking for a logic shortcut, otherwise I wouldn't have posted the question here. inf is one of the workarounds. In fact, any value except an integer can be used as a substitute, such as a bool, a string, etc. This is what I have already done. In general, we have to use a value of a different type to stand in for none, right? It is workable, but not good. As another example, I want to set a number as a parameter for the function in https://scikit-image.org/docs/stable/api/skimage.restoration.html#skimage.restoration.denoise_wavelet. The parameter 'sigma' can be a float or None. How should I set none in a TOML config file? inf? Well, inf has its own physical meaning and is a valid parameter. Maybe I have to use a value of another type, such as the string 'none'. Then it becomes confusing. The parameter is supposed to be a number, so why can it also be a string? In my opinion, none has its own special place. It can have a substitute in a specific application, but nothing else can truly replace its function and place.

@ChristianSi
Contributor

Like I and others here have said before: Just omit the key/value pairs (whether parameters or anything else) for which you don't want a non-null value.

[params]
denoise_wavelet = { image = "myfile.jpg", wavelet = "db2", convert2ycbcr = true }
# Other params are left at their default values (usually none)

@ustcfdm
Author

ustcfdm commented May 8, 2024

Then it goes back to what I have said: it's implicit, not explicit. If there are many hidden parameters, you can't see them and don't know what they are. From your example, how can I know there are also other parameters like 'sigma', 'wavelet_levels', 'mode', 'method', 'rescale_sigma', etc.? Too many are hidden. Users don't even know they exist, let alone how to modify them. Even if they know they exist, it will be troublesome to modify them. For example, if they want to change one parameter from its default to a different value, they will have to manually add a "new" key/value pair. When they want to change it back to the default, they have to delete it entirely. Isn't this a bad design? And again, how can you set it to none when the default is not none?

@ChristianSi
Contributor

ChristianSi commented May 9, 2024

Well, TOML is not usable as a software documentation tool, nor is it meant to be used so. Don't expect it to replace your docstrings. Assuming you had sigma = null, mode = null in your parameter list, how would that help the user? It doesn't show the type of the parameter, the values it can take, nor what it's good for. So without looking it up, they won't be able to change that null to anything else anyway.

@ustcfdm
Author

ustcfdm commented May 10, 2024

First, please answer my question: how can you set it to none when the default is not none?
I didn't say it's a documentation tool. The example is to show you the disadvantages of hidden defaults: too many parameters are invisible and you don't know they exist. You don't need to worry about the type of the parameter. null is of type null: not a number, not a string, not anything else. It is its own type. Other formats (such as YAML and JSON) have null. Do you mean their users are in trouble when they see a null value?

@ChristianSi
Contributor

@ustcfdm:

how can you set it to none when the default is not none?

Frankly, you shouldn't do that. If it's necessary, it's a signal of bad design. Any nullable parameter should have null as its default value, while for any parameter with a different default value, the function should be able to expect that it'll never be null – after all, that's just why you specify a default value.

Also, I meant exactly what I wrote. If you didn't understand it, read it again. From my viewpoint, there is nothing more to add.

@ustcfdm
Author

ustcfdm commented May 11, 2024

Therefore, we need to delete a parameter if we want to set it to null, since "any nullable parameter should have null as its default value". Do I understand it correctly? If so, could you please read this again and answer the omitted question?

Then it goes back to what I have said: it's implicit, not explicit. If there are many hidden parameters, you can't see them and don't know what they are. From your example, how can I know there are also other parameters like 'sigma', 'wavelet_levels', 'mode', 'method', 'rescale_sigma', etc.? Too many are hidden. Users don't even know they exist, let alone how to modify them. Even if they know they exist, it will be troublesome to modify them. For example, if they want to change one parameter from its default to a different value, they will have to manually add a "new" key/value pair. When they want to change it back to the default, they have to delete it entirely. Isn't this a bad design?

@eksortso
Contributor

Therefore, we need to delete a parameter if we want to set it to null,

Or comment it out. When I write config templates, that's what I do.

[options]
# rarely_touched_option =  # Default None.

since "any nullable parameter should have null as its default value".

It needs to be said that TOML does not define default values. That is because default values are defined in your Python code, where your logic is. Also, in any documentation that you have for your code, the defaults ought to be specified there as well. You can do that in TOML comments, even though documentation is not TOML's job either.

rarely_touched_option = options.get("rarely_touched_option", None)

TOML is a configuration language at its heart. If you want to pass around data, or the explicit lack thereof, then there are formats more suited to that task. That's why JSON has null and TOML doesn't. If you don't want to provide a value to a key, comment it out. If you want to use default values, put that in your code, and not in your configuration.

@ustcfdm
Author

ustcfdm commented May 14, 2024

If I understand it correctly, you agree that hidden default is not a good design. You said

If you want to use default values, put that in your code, and not in your configuration.

I totally agree with you! We shouldn't rely on defaults in the configuration. Therefore, whether we delete a parameter or comment it out, something like the following

[options]
# rarely_touched_option =  # Default None.

is not as good as

[options]
rarely_touched_option =  none

which is more explicit and does not rely on a default in the configuration. Do you agree?

In addition, commenting out a parameter is not always better than deleting it. For example,

crop = {left=1, right=-2, top=3, bottom=-4}

If I want to set right to none, how can I comment out only the right option without affecting the others? It seems that I have to comment out the entire line and write a new line without the right parameter. By contrast, simply deleting right=-2 is much easier and won't clutter the configuration with messy comments. However, neither is as good as setting right=none, which is clear, concise, and explicit.

@Paalon

Paalon commented May 16, 2024

There's nothing wrong with the TOML project's design decisions, but there are many cases where null values are necessary. Rust and Julia, which use TOML for configuration files, have null (None in Rust and nothing in Julia), and it's an important concept. Julia and R have not only null but also missing (missing in Julia and NA in R). Such an essential singleton has an important meaning in statistical data. A null value is the only element of the null type, so an empty table {} can't be an alternative to null because it is not a singleton. People who need null or missing will use other file formats like JSON or YAML, or a binary format like Apache Arrow, instead of TOML. Of course, I can appreciate the minimalism of TOML, so I can understand both sides.

@Paalon

Paalon commented May 16, 2024

I'm a Julian, so I might be biased, but the Julia documentation explains the concepts clearly, even if you are Pythonic or TOMLic.

@MikeHart85

A configuration file format, the purpose of which is to set values in applications, refusing to support setting a specific value, which is broadly used and supported, is quite an absurd situation.

The various workarounds suggested are akin to refusing to support nan and telling people to use 0 instead, or refusing to support inf and claiming that using 999999999 is just as good since it's pretty much the same thing.

It is not the same thing.

There is a difference, both semantic and functional, between not setting a value and actively setting it to an unset state.
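
The distinction can be sketched in Python. JSON is used here only because it has a null literal; the `_MISSING` sentinel, the `read_sigma` helper, and the default of 0.5 are all hypothetical choices made for the illustration:

```python
import json

_MISSING = object()  # private sentinel meaning "the key was not present at all"


def read_sigma(raw: str, default: float = 0.5):
    """Return the configured sigma, telling 'absent' apart from 'explicitly null'."""
    value = json.loads(raw).get("sigma", _MISSING)
    if value is _MISSING:
        return default  # key absent: fall back to the application default
    return value        # key present: may be a number or an explicit None


absent = read_sigma("{}")
explicit_null = read_sigma('{"sigma": null}')
explicit_value = read_sigma('{"sigma": 1.5}')

print(absent, explicit_null, explicit_value)
```

With a non-None default, only a format that can spell null lets the file ask for None directly; with TOML the application has to pick its own convention for that case.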


7 participants