Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(to_string): change the behaviour of bytes to string #1141

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

qazxcdswe123
Copy link
Contributor

BREAKING CHANGE for bytes to string

Copy link

peter-jerry-ye-code-review bot commented Oct 18, 2024

From the provided git diff output, here are three observations that might indicate potential issues or areas for improvement in the code:

  1. Hexadecimal Representation in Multiline Strings:

    • File: builtin/bigint_wbtest.mbt
    • Lines: Multiple occurrences (e.g., @@ -709,7 +709,7 @@ test "to_octets" {)
    • Observation: The changes replace the hexadecimal byte sequences with more readable ASCII representations. While this might be intended for better readability, it could potentially obscure the exact byte values being tested, which might be crucial for certain tests. It's important to ensure that the tests are still accurately verifying the intended byte sequences.
  2. String Construction Logic in to_string Function:

    • File: builtin/bytes.mbt
    • Lines: @@ -14,7 +14,55 @@
    • Observation: The original implementation directly converted the byte sequence to a string. The new implementation introduces a more complex logic that includes escaping special characters and converting non-printable characters to their hexadecimal representation. This change could potentially impact the performance and might not be necessary if the primary use case does not involve escaping. It's important to verify that this new logic is indeed required for the intended functionality and does not introduce unintended side effects.
  3. Redundant Code in Show Implementation for Bytes:

    • File: builtin/show.mbt
    • Lines: @@ -39,24 +39,7 @@ pub impl Show for Byte with output(self, logger) {
    • Observation: The original implementation of the Show trait for Bytes included a detailed method to convert bytes to their hexadecimal representation for logging purposes. The new implementation simply calls the to_string method on the Bytes object. This change reduces redundancy but relies on the to_string method, which has been significantly altered. It's crucial to ensure that the to_string method provides the correct output format for logging purposes and that this change does not inadvertently simplify logging beyond its intended scope.

In summary, while the changes aim to improve readability and maintainability, they also introduce more complex logic and reliance on newly modified methods. It's essential to thoroughly test these changes to ensure they do not break existing functionality or introduce performance issues.

Char::from_int('a'.to_int() + (i - 10))
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we print ascii as is?
e.g, b"a\0b\0"

Copy link
Contributor Author

@qazxcdswe123 qazxcdswe123 Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it's kinda hard to find the ascii inside a bunch of \00.

Also encounter: Fatal error: exception File "inlined_snapshot/_build/Core_format.ml", line 200, characters 8-14: Assertion failed. Investigating

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we print ascii as is? e.g, b"a\0b\0"

displaying some byte as ASCII may not be a good idea. Bytes are not always converted from a String, they may come from data of an image or some pieces of memory. ASCII is meaningless in such cases.

Copy link
Contributor Author

@qazxcdswe123 qazxcdswe123 Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second this as UTF-16 encoded string has lots of \x00 and printing them is very ugly.

And Bytes should just be Bytes, it should print in Byte format, not readble format.

@coveralls
Copy link
Collaborator

coveralls commented Oct 21, 2024

Pull Request Test Coverage Report for Build 3658

Details

  • 19 of 19 (100.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.05%) to 82.791%

Totals Coverage Status
Change from base Build 3652: 0.05%
Covered Lines: 4272
Relevant Lines: 5160

💛 - Coveralls

inspect!(
"\n\t\"\\\r\b".to_bytes(),
content=
#|b"\n\x00\t\x00\"\x00\\\x00\r\x00\b\x00"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not quite sure whether \n should yield \\n or \n like this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

\\n should never be used unless the user absolutely wants to see the backslash, which is typically not the case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants