ANSI to UTF8 conversion #2
I've found some issues, so this is not ready to be added to trunk yet.
UTF8 without BOM works properly. UTF8 with BOM has issues (the very first field is not recognized, which can cause problems if the user does not specify the field in the GUI).
UTF8 without BOM works properly. UTF8 with BOM has issues (the very first field is not recognized automatically and must be set by the user in the match view).
Conflicts: app/controllers/importer_controller.rb test/ansi2utf8/utf8_accent.csv test/ansi2utf8/utf8_noaccent.csv
ANSI issues are now resolved and fully tested. On the subject of charsets, one issue remains, which is not related to what I added. Sorry for the previous issues. My addition is now fully tested and working. (By the way, I'm new to GitHub and don't know how to add my new commit to this pull request.)
* unwanted separators at end of lines are removed
* unwanted spaces at start of line and around separators are removed
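The two clean-up rules above can be sketched as follows. This is only an illustration, not the code in this pull request: `clean_csv_lines` is a hypothetical name, the `;` default separator is an assumption based on the Excel exports discussed in this thread, and the sketch ignores quoted fields, so blanks or separators inside quotes would be touched as well.

```ruby
# Hypothetical helper (not the plugin's chomp_separators!).
# Assumes a single-character separator and no quoted fields.
def clean_csv_lines(data, sep = ";")
  quoted_sep = Regexp.escape(sep)
  cleaned = data.each_line.map do |line|
    line = line.chomp                                   # drop the line ending
    line = line.sub(/\A[ \t]+/, "")                     # blanks at start of line
    line = line.gsub(/[ \t]*#{quoted_sep}[ \t]*/, sep)  # blanks around separators
    line.sub(/(?:#{quoted_sep})+\z/, "")                # separators at end of line
  end
  # Lines that consisted only of separators are now empty; drop them.
  cleaned.reject(&:empty?).join("\n")
end

puts clean_csv_lines("  id ; subject;status;;;\n;;;;;\n1;Fix import;New\n")
```

Note that removing trailing separators also removes genuinely empty trailing columns, which is acceptable only if missing trailing fields are treated as blank by the importer.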
Thanks for the pull request! I'm sorry, I just haven't had a chance to review and pull it. I'm also out of the country for a couple of weeks, but I'll get to it, I promise!
No worries! For later, let me explain why we needed this and what I've done. We use MS Excel a lot, and we wanted to be able to fill in the issue fields in Excel and export them to CSV. The way I implemented it was to add the ANSI choice and convert the ANSI data to UTF-8 (with iconv). It's just preprocessing of the data before FasterCSV parsing.
Because Excel makes any cell exist once it has been clicked on, this generated a lot of irrelevant data (for example, lines of separators ";;;;;;;;;..." up to the clicked cell). I added some processing to handle that.
A last point was the presence of blanks (" ") around separators, which made the plugin return a 500 error, or before the first field (which could make it unrecognised in the match view of the plugin). I also handled those cases.
As I said before, there is one last charset issue: UTF-8 with BOM makes the very first field unrecognised (I don't understand why FasterCSV wouldn't handle that). I didn't take care of this in my commits since I wasn't really affected by the issue, but I guess something like this might do the trick:
(there must be a cleaner way to do this, but here is the idea) I'll also add some notes to the code so you can have a clearer view of what I've done.
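For reference, the BOM idea can be sketched as below. This is not the snippet from the comment above, only a minimal illustration under stated assumptions: `strip_utf8_bom` is a hypothetical name, and the input is assumed to be the raw file contents read as a string. The first field goes unrecognised because the three BOM bytes (EF BB BF) end up glued to the first header name, so it no longer matches the expected field name.

```ruby
# The UTF-8 byte order mark as raw bytes.
UTF8_BOM = "\xEF\xBB\xBF".b

# Hypothetical helper: returns a UTF-8 copy of raw without a leading BOM.
def strip_utf8_bom(raw)
  bytes = raw.b  # binary copy, so the comparison is byte-wise
  bytes = bytes.byteslice(3, bytes.bytesize - 3) if bytes.start_with?(UTF8_BOM)
  bytes.force_encoding(Encoding::UTF_8)
end

with_bom = "\xEF\xBB\xBFid;subject\n1;Fix import\n"
first_header_raw   = with_bom.lines.first.split(";").first
first_header_clean = strip_utf8_bom(with_bom).lines.first.split(";").first

puts first_header_raw == "id"    # the BOM is still part of the header name
puts first_header_clean == "id"  # the header matches once the BOM is gone
```

Doing this on the raw data before it reaches the CSV parser keeps the fix in the same preprocessing step as the other clean-ups.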
end

# preparing the import
chomp_separators!(iip)
Here is the call to the function that removes the irrelevant separators from the csv_data.
Hi Thib, I'm finally getting a chance to look at this in detail, sorry for the long delay. First off, could you let me know what version/flavor of Excel you're using? Most of my work with the plugin has been with Excel-generated CSV files on Windows, but I'm confused on two points.
A couple other comments:
Sorry to pull you back to something you did so long ago! Leo
Russian locale
I've added support for CSV files encoded in ANSI.
Importing ANSI files in UTF8 mode into Redmine replaces accented characters such as é, à or è with an unrecognized character.
I've implemented transcoding from ANSI to UTF8 before parsing (FasterCSV does not parse ANSI files). This uses 'iconv' from the Ruby standard library.
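As a rough illustration of the transcoding step, here is a minimal sketch. It uses `String#encode` rather than iconv, because iconv is no longer bundled with Ruby since 2.0; that is a swap-in, not what this pull request does. `ansi_to_utf8` is a hypothetical name, and treating "ANSI" as Windows-1252 is an assumption that only holds for Western European Windows setups (other locales use other code pages).

```ruby
# Hypothetical helper: interpret raw bytes as a Windows code page and
# return a UTF-8 string. The Windows-1252 default is an assumption.
def ansi_to_utf8(raw, source_encoding = "Windows-1252")
  raw.dup
     .force_encoding(source_encoding)  # label the bytes, no conversion yet
     .encode("UTF-8", invalid: :replace, undef: :replace)
end

ansi_bytes = "caf\xE9;d\xE9j\xE0 vu\n".b  # "café;déjà vu" as Windows-1252 bytes
utf8 = ansi_to_utf8(ansi_bytes)

puts utf8
puts utf8.valid_encoding?
```

Bytes with no mapping in the source code page are replaced rather than raising, so a bad input degrades to a visible replacement character instead of failing the whole import.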
This is convenient for Windows users. Feel free to ask questions or give some feedback.
Thank you!