Salvage #13772 #13935

Swiftb0y · 2024-11-25T12:48:26Z

Simplified alternative of #13772 which implements the original scope I proposed.

Allow converting from UTF-8 to whatever encoding the device supports

acolombier

LGTM!

acolombier · 2024-11-25T23:22:56Z

src/test/controllerscriptenginelegacy_test.cpp

+    case ISO_10646_UCS_2:
+        return 68;
+    }
+    // unreachable (TODO assert false?)


That assert would make sense IMO

yeah, still TODO

Returning part of a usable result instead of nullbytes since those likely terminate the string early or even corrupt the underlying binary message format the buffer is embedded in.
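The "unreachable" spot discussed above could be handled roughly as follows. This is only a sketch: the `Charset` enum and `mibFor` function are hypothetical names, the numeric values are IANA MIBenum numbers picked for illustration rather than the PR's actual table, and standard `assert` stands in for whatever assertion macro the project prefers.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical enum for illustration; not the enum from the PR.
enum class Charset { UsAscii, Latin1, Latin9 };

// Maps each enumerator to a numeric identifier.
// The values are IANA MIBenum numbers, used here only as sample data.
int mibFor(Charset charset) {
    // No `default:` label on purpose, so the compiler can warn
    // when a newly added enumerator is not handled.
    switch (charset) {
    case Charset::UsAscii:
        return 3;
    case Charset::Latin1:
        return 4;
    case Charset::Latin9:
        return 111;
    }
    // Only reached if `charset` holds a value outside the enum.
    assert(false && "unhandled Charset enumerator");
    std::abort(); // still fails loudly when assertions are compiled out
}
```

Leaving out `default:` keeps the compiler's missing-case warning useful, while the trailing assert documents that falling out of the switch is a bug rather than a normal code path.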

src/controllers/scripting/legacy/controllerscriptinterfacelegacy.h

Blocking issue resolved

christophehenry · 2024-11-27T21:57:22Z

res/controllers/engine-api.d.ts

+     * @param value The string to encode
+     * @returns The converted String as an array of bytes. Will return an empty buffer on conversion error or unavailable charset.
+     */
+    function convertCharset(targetCharset: string, value: string): ArrayBuffer


Wait this is still documented even though it's now an internal API?

forgot to remove. thanks for the reminder.

JoergAtGithub · 2024-11-27T22:12:04Z

src/controllers/scripting/legacy/controllerscriptinterfacelegacy.cpp

+    const char* encoderName = encoderNameArray.constData();
+#endif
+    QStringEncoder fromUtf16 = QStringEncoder(encoderName);
+    if (!fromUtf16.isValid()) {


Why did you remove the flags here?

see the commit message, writing the replacement char is better than replacing with null bytes IMO.

This may be a problem. During the tests I noticed that the replacement char varies between Ubuntu and Fedora (maybe between Qt versions). Replacing invalid chars with \x00 is the most predictable option.

Are you sure? I think it depends on the encoding. The Qt docs say they use QChar::ReplacementCharacter or a question mark.

This is what I noticed here. The relevant commit is probably this one. Although I tested your branch and the tests pass here too so I'm not sure anymore.
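The trade-off in this thread can be illustrated without Qt. The sketch below is a simplified stand-in, not the PR's implementation: `encodeLatin1` is a hypothetical helper that maps code points up to U+00FF to single bytes and writes a caller-chosen substitute byte for everything else.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: encodes code points to Latin-1 bytes.
// Any code point above U+00FF is replaced by `substitute`.
std::string encodeLatin1(const std::u32string& text, char substitute) {
    std::string out;
    out.reserve(text.size());
    for (char32_t codePoint : text) {
        if (codePoint <= 0xFF) {
            out.push_back(static_cast<char>(static_cast<unsigned char>(codePoint)));
        } else {
            out.push_back(substitute);
        }
    }
    return out;
}

// Length as seen by a consumer that treats the buffer as a C string.
std::size_t lengthAsCString(const std::string& bytes) {
    return std::strlen(bytes.c_str());
}
```

With the input `U"a\u0645z"`, substituting `'?'` yields the three bytes `a?z`, whereas substituting `'\0'` still yields three bytes but `lengthAsCString` reports 1, because a null-terminated consumer stops at the first zero byte. That is the truncation risk the commit message describes; the predictability concern raised here is a separate argument that this sketch does not settle.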

christophehenry · 2024-11-28T13:42:26Z

src/test/controllerscriptenginelegacy_test.cpp

+TEST_F(ControllerScriptEngineLegacyTest, convertCharsetCorrectValueStringCharset) {
+    const auto result = evaluate("engine.convertCharset('ISO-8859-15', 'Hello!')");
+
+    // ISO-8859-15 encoded 'Hello!'
+    EXPECT_EQ(qjsvalue_cast<QByteArray>(result),
+            QByteArrayView::fromArray({'\x48', '\x65', '\x6c', '\x6c', '\x6f', '\x21'}));
+}
+
+TEST_F(ControllerScriptEngineLegacyTest, convertCharsetUnsupportedChars) {
+    auto result = qjsvalue_cast<QByteArray>(
+            evaluate("engine.convertCharset('ISO-8859-15', 'مايأ نامز')"));
+    char sub = '\x1A'; // ASCII/Latin9 SUB character
+    EXPECT_EQ(result,
+            QByteArrayView::fromArray(
+                    {sub, sub, sub, sub, '\x20', sub, sub, sub, sub}));
+}


Those tests aren't relevant anymore since it's an internal API now, are they?

mostly yes. I should translate them to their enum equivalent so we can at least be reasonably sure about the output data.

christophehenry · 2024-11-28T13:43:08Z

src/test/controllerscriptenginelegacy_test.cpp

+TEST_F(ControllerScriptEngineLegacyTest, convertCharsetUndefinedOnUnknownCharset) {
+    const auto result = evaluate("engine.convertCharset('NULL', 'Hello!')");
+
+    EXPECT_EQ(qjsvalue_cast<QByteArray>(result), QByteArrayView(""));
+}


yeah... Not quite sure what to do here honestly. It seems that there are implicit conversions to enums happening, which isn't great if it just silently succeeds. I wonder what value plain undefined results in? I'd guess undefined->0->US_ASCII?
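One possible way to avoid a silent fallback is to validate the raw integer before treating it as an enumerator. The sketch below assumes a hypothetical `Charset` enum whose values start at 1, so that a value coerced to 0 never matches a valid entry; neither the enum nor `charsetFromScriptValue` comes from the PR, and whether `undefined` really arrives as 0 is the unverified guess from the comment above.

```cpp
#include <optional>

// Hypothetical enum; 0 is deliberately left unused so that a value
// coerced to 0 cannot be mistaken for a real charset.
enum class Charset : int {
    UsAscii = 1,
    Latin1 = 2,
    Latin9 = 3,
};

// Returns the matching enumerator, or std::nullopt for any other integer.
std::optional<Charset> charsetFromScriptValue(int raw) {
    switch (raw) {
    case static_cast<int>(Charset::UsAscii):
    case static_cast<int>(Charset::Latin1):
    case static_cast<int>(Charset::Latin9):
        return static_cast<Charset>(raw);
    default:
        return std::nullopt;
    }
}
```

A caller would then report an error or return an empty buffer when the result is `std::nullopt`, instead of encoding with whichever charset happens to have the value 0.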

Expose convertCharset convenience function to controller

58d8b98

Allow converting from UTF-8 to whatever encoding the device supports

github-actions bot added the controller mappings, code quality, and controller backend labels on Nov 25, 2024

Swiftb0y force-pushed the review/christophehenry/gh13772-charset-encoding-salvage branch 4 times, most recently from f4219e9 to bd5996c, on November 25, 2024 14:25

simplify ControllerScriptInterfaceLegacy::convertCharset

d131d1c

Swiftb0y force-pushed the review/christophehenry/gh13772-charset-encoding-salvage branch from bd5996c to d131d1c on November 25, 2024 22:04

acolombier approved these changes Nov 25, 2024


Swiftb0y added 2 commits November 27, 2024 20:34

feat: add a couple more encodings to enum, don't throw on invalid codec

57007c8

fix: don't convert invalid chars to null

01b80a5

Returning part of a usable result instead of nullbytes since those likely terminate the string early or even corrupt the underlying binary message format the buffer is embedded in.

Swiftb0y requested a review from acolombier November 27, 2024 19:39

JoergAtGithub previously requested changes Nov 27, 2024


src/controllers/scripting/legacy/controllerscriptinterfacelegacy.h (outdated, resolved)

Swiftb0y added 2 commits November 27, 2024 22:48

fix: test

f876724

align closer with upstream

e2c74a9

christophehenry reviewed Nov 27, 2024


JoergAtGithub reviewed Nov 27, 2024


christophehenry reviewed Nov 28, 2024
