Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move brToString(BytesRef) to ToStringUtils #13068

Merged
merged 11 commits into from
Feb 15, 2024
Merged

Conversation

sabi0
Copy link
Contributor

@sabi0 sabi0 commented Feb 3, 2024

I noticed there are brToString() methods in 16 different classes.
What do you think of providing a shared implementation in the ToStringUtils class?

If you find the proposal useful I will proceed with removing the other copies.

Otherwise the ToStringUtils class could probably be deleted - the two methods there byteArray() and longHex() are not used anywhere.

@almogtavor
Copy link

almogtavor commented Feb 3, 2024

@sabi0 i think you should also migrate these 16 classes to use the method from the ToStringUtils in the same PR. Otherwise there will be 17 classes with brToString

@sabi0
Copy link
Contributor Author

sabi0 commented Feb 3, 2024

That's what I meant with

I will proceed with removing the other copies

Did not want to waste time on this in case the suggestion would be rejected.

I will push the changes shortly.

@uschindler uschindler self-assigned this Feb 4, 2024
@uschindler uschindler added this to the 9.10.0 milestone Feb 4, 2024
@uschindler uschindler requested a review from mikemccand February 4, 2024 20:18
@sabi0
Copy link
Contributor Author

sabi0 commented Feb 8, 2024

Thank you @cpoerschke!

Copy link
Member

@mikemccand mikemccand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @sabi0 -- I can't believe how many (badly) forked copies of this method had proliferated.

I just left one naming comment, otherwise this looks awesome. Naming is the hardest part :)

public static String longHex(long x) {
char[] asHex = new char[16];
for (int i = 16; --i >= 0; x >>>= 4) {
asHex[i] = HEX[(int) x & 0x0F];
}
return "0x" + new String(asHex);
}

@SuppressWarnings("unused")
public static String brToString(BytesRef b) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we rename it to bytesRefToString?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe just toString(BytesRef b)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think toString(br) will look confusingly similar to br.toString().

How about format(br)? Or prettyPrint(br) maybe?
(bytesRefToString(br) does look a bit long)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall I rename it to bytesRefToString() ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please do. Better a bit more verbose than cryptic.

public static String longHex(long x) {
char[] asHex = new char[16];
for (int i = 16; --i >= 0; x >>>= 4) {
asHex[i] = HEX[(int) x & 0x0F];
}
return "0x" + new String(asHex);
}

@SuppressWarnings("unused")
public static String brToString(BytesRef b) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe also add a javadoc explaining how this differs from BytesRef.toString? E.g. this one does not assume valid UTF-8 but the other one days (I think)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this method could be the default toString on BytesRef, actually... in most cases when you wish to dump it using toString it's for debugging purposes: an array of bytes would always be there, utf8 as an additional helpful hint... why add another method? Even the fact that this additional method exists points at the need to actually dump BytesRef in a more verbose way...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought of that too. But reckoned that could be too much of a change if all BytesRef's start spitting out text.

I guess the format will have to be changed somewhat to avoid the BytesRef's text "blending in" with the preceding text:

... previous " + bytesRef;  // ... previous term [ww xx yy zz]

Also there is an unfortunate side-effect of an exception being thrown for "binary" data.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Small steps. We can always redirect and inline this method later on.

Copy link
Contributor

@uschindler uschindler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me.

We must move the changes entry to next version as the release process already started.

@uschindler
Copy link
Contributor

Please move entry to 9.11.0

@uschindler
Copy link
Contributor

Will take care of this tomorrow. Sorry for delay.

@uschindler
Copy link
Contributor

To the other reviewers: I would like to merge this as a first step. To me this looks fine as it removes the code duplication and we have a better method name. Ok?

If we remove the static method and instead use toString() for that, should possibly another issue. I like the discussion started by @mikemccand and @dweiss.

I plan to merge this late afternoon German time.

@uschindler uschindler merged commit 9206bde into apache:main Feb 15, 2024
4 checks passed
@sabi0 sabi0 deleted the brToString branch February 15, 2024 17:21
asfgit pushed a commit that referenced this pull request Feb 15, 2024
@uschindler
Copy link
Contributor

Backport was easy, no conflicts. I also checked with fgrep -R brToString * but no occurrences found.

Thanks @sabi0 !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants