[Feature request] handling of UTF-16 and UTF-32 encoding #16

Open
ohault opened this issue May 20, 2024 · 7 comments

Comments


ohault commented May 20, 2024

As a follow-up to #15, some level of support for the UTF-16 LE encoding would be desirable.

A first step would be to recognize a file encoded in UTF-16 or UTF-32 and to raise a specific error that instructs the user to convert the input file to an encoding supported by initool.

The inicomp tool displays an error message with an instruction for the user:
"Error: File .................. is encoded as UTF16. Please convert it to ANSI or UTF8 first."
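
For readers who hit such an error, the conversion step it asks for can be sketched in a few lines of Python. This is only an illustration, not part of initool or inicomp: the function name `convert_to_utf8` is hypothetical, and it assumes the source file starts with a BOM and that UTF-8 output is acceptable to the tool that will read it.

```python
def convert_to_utf8(src: str, dst: str, src_encoding: str = "utf-16") -> None:
    """Re-encode a text file as UTF-8 without a BOM (hypothetical helper)."""
    # Python's "utf-16" codec reads the BOM to pick the byte order and
    # drops it from the decoded text. newline="" disables newline
    # translation, so CRLF line endings pass through unchanged.
    with open(src, "r", encoding=src_encoding, newline="") as f:
        text = f.read()
    with open(dst, "w", encoding="utf-8", newline="") as f:
        f.write(text)
```

A call such as `convert_to_utf8("input.reg", "input-utf8.reg")` (file names are placeholders) writes a UTF-8 copy and leaves the original untouched.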

Owner

dbohdan commented May 20, 2024

Good idea to detect the BOM. I have implemented it and released v0.16.0.

The programming language Standard ML and the compiler I use currently have very limited support for Unicode. I know of no library for working with encodings like UTF-16 and UTF-32 or for detecting them using heuristics. This means opening a UTF-16 or UTF-32 file without a BOM will keep generating an "invalid line" error for the foreseeable future. There is also no way to pass Unicode command-line arguments to initool on Windows.
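
As an illustration of what BOM-based detection involves, here is a minimal Python sketch. It is not initool's implementation (initool is written in Standard ML); the function name `detect_bom` and the returned labels are assumptions made for this example. The BOM constants come from Python's standard `codecs` module.

```python
import codecs
from typing import Optional

# Longer BOMs are checked first: the UTF-32 LE BOM (FF FE 00 00) begins
# with the UTF-16 LE BOM (FF FE), so testing UTF-16 LE first would
# misreport every UTF-32 LE file.
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


def detect_bom(data: bytes) -> Optional[str]:
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None
```

A file without a BOM yields `None` here, which matches the limitation described above: nothing in this sketch guesses an encoding from content. One known ambiguity remains: a UTF-16 LE file whose first character after the BOM is U+0000 would be reported as UTF-32 LE.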

I will keep this issue open in case the Unicode situation changes.

Author

ohault commented May 22, 2024

I have just tested version 0.16.0 with a file, HCU-Test.reg, encoded as UTF-16 LE.

Here are the results:

C:>initool version
0.16.0

C:>initool -p get HCU-Test.reg HKEY_CURRENT_USER\Test\subkey1

C:>echo %errorlevel%
1

C:>initool -p get HCU-Test.reg HKEY_CURRENT_USER\Test\subkey1 """test_string"""

C:>echo %errorlevel%
1

C:>initool -p get HCU-Test.reg
■W i n d o w s R e g i s t r y E d i t o r V e r s i o n 5 . 0 0

[ H K E Y _ C U R R E N T _ U S E R \ T e s t ]
" b i n " = h e x : 0 0 , 0 1 , 0 0 , 1 0 , 1 0 , 1 0 , 1 0

[ H K E Y _ C U R R E N T _ U S E R \ T e s t \ s u b k e y 1 ]
" D W O R D 6 4 " = h e x ( b ) : 2 3 , f e , 5 3 , 0 0 , 0 0 , 0 0 , 0 0 , 0 0
" t e s t _ s t r i n g " = " b l a b l a b l a "

[ H K E Y _ C U R R E N T _ U S E R \ T e s t \ s u b k e y 2 ]
" M u l t i " = h e x ( 7 ) : 4 6 , 0 0 , 6 f , 0 0 , 6 f , 0 0 , 0 0 , 0 0 , 4 2 , 0 0 , 6 1 , 0 0 , 7 2 , 0 0 , 0 0 , 0 0 , 0 0 , 0 0

I guess the last command should also return errorlevel 1.
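
The spaced-out text in the output above is what UTF-16 LE looks like when it is passed through byte by byte. The following Python sketch reproduces the effect. It assumes the console renders each NUL byte as a blank and uses code page 437, which would also explain the leading "■" as the second BOM byte; neither assumption was confirmed in this thread.

```python
line = '"test_string"="blablabla"'
raw = line.encode("utf-16-le")  # the "-le" codec does not add a BOM

# Each ASCII character becomes two bytes: the character, then 0x00.
even_bytes = raw[0::2]  # the ASCII characters themselves
odd_bytes = raw[1::2]   # all NUL bytes

# Assumption: the console shows every NUL byte as a blank.
shown = raw.decode("latin-1").replace("\x00", " ")

# In code page 437 the BOM byte 0xFE maps to U+25A0 (black square).
bom_glyph = b"\xfe".decode("cp437")

print(shown)
print(repr(bom_glyph))
```

This is also why a parser that expects a single-byte encoding or UTF-8 cannot match section or key names in such a file: the NUL bytes sit between the characters it is looking for.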

Owner

dbohdan commented May 22, 2024

Could you attach HCU-Test.reg to a comment? You may need to change the extension to .txt or put it in a ZIP archive.

Author

ohault commented May 22, 2024

HCU-Test.reg.txt.zip
Please find attached the requested file.

Owner

dbohdan commented May 22, 2024

Thanks.

Owner

dbohdan commented May 23, 2024

This was a bug in BOM detection. Thanks for reporting it. I have fixed the bug and released version 0.17.0, which includes other improvements.

Author

ohault commented May 23, 2024

Thank you @dbohdan

C:>initool.exe version
0.17.0

C:>initool -p get HCU-Test.reg
Error: unsupported encoding: UTF-16 LE

C:>echo %errorlevel%
1

C:>initool -p get HCU-Test.reg HKEY_CURRENT_USER\Test\subkey1
Error: unsupported encoding: UTF-16 LE

C:>echo %errorlevel%
1

C:>initool -p get HCU-Test.reg HKEY_CURRENT_USER\Test\subkey1 """test_string"""
Error: unsupported encoding: UTF-16 LE

C:>echo %errorlevel%
1
