Allow switching the encoding to UTF-8 when exporting to CSV
Assignee: Go MAEDA
In the current implementation, the encoding of exported CSV is fixed for each language and users cannot change it.
This can be problematic for teams that work in multiple languages. Suppose the issues in a project are written in Japanese or Chinese (a mixed-language project in which some issues are written in Japanese and others in Chinese). If you want to export those issues to CSV, UTF-8 is the only encoding that yields a readable (not garbled) CSV file. However, the encoding is fixed to CP932 for Japanese users (source:tags/3.3.3/config/locales/ja.yml#L164) and gb18030 for Chinese users (source:tags/3.3.3/config/locales/zh.yml#L147). No one can get a CSV file in UTF-8 without modifying the Redmine source code.
I think the problem can be resolved if users can override the general_csv_encoding setting in the CSV export options window, as in the following picture. The drop-down list defaults to general_csv_encoding, and users can change it to an arbitrary encoding. We already have a similar drop-down in the CSV import feature.
#2 Updated by Felix Schäfer over 1 year ago
Thank you for this suggestion, I think this would be a great improvement to the CSV export function.
I can corroborate the problem. We (at Planio) have had numerous customers in recent months, for example Russian users who use Planio/Redmine in English, who get ??? in the export instead of Cyrillic characters because the export encoding for English does not support Cyrillic characters.
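The ??? effect described above can be reproduced in plain Ruby. This is only a sketch: `lossy_encode` is a hypothetical helper, and both ISO-8859-1 as the English export encoding and the replace-with-"?" options are assumptions for illustration, not a quote of Redmine's code.

```ruby
# Hypothetical helper: convert UTF-8 text to a target encoding and
# replace every character the target cannot represent with "?".
def lossy_encode(text, encoding)
  text.encode(encoding, invalid: :replace, undef: :replace, replace: "?")
end

cyrillic = "Привет"

# Assumed single-byte Western encoding: it has no Cyrillic characters,
# so every character is replaced.
latin1 = lossy_encode(cyrillic, "ISO-8859-1")

# UTF-8 can represent the text, so nothing is lost.
utf8 = lossy_encode(cyrillic, "UTF-8")
```

Once the characters have been replaced, they cannot be recovered by re-opening the file with a different encoding, which is why the choice has to be made at export time.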
#3 Updated by Go MAEDA over 1 year ago
Thanks, Felix. I want your advice.
At first I wrote that "users can change to arbitrary encoding". I thought I could build the list of encodings from the Setting::ENCODINGS constant. But now I think it is useless to display such a long list in the CSV export options, and that displaying only two choices, UTF-8 and the value of general_csv_encoding, is enough, because many encodings are incompatible with any given language. For example, Japanese text is compatible with only a small number of encodings, such as CP932, Shift_JIS, ISO-2022-JP, and EUC-JP.
What do you think about this? Is displaying two choices (UTF-8 and general_csv_encoding) enough?
#5 Updated by Tatsuya Saito over 1 year ago
IMHO, this option doesn't need to be shown EVERY TIME.
If we could change the encoding in the account or system settings, that would probably be fine in almost all cases.
An example of a bad case: an environment that mixes several languages and uses Excel on Mac, which cannot recognize a UTF-8 CSV by default when opening it.
But that case would not be solved by the option in the CSV export dialog either, because the export would still need to be in UTF-8.
Is there any other bad case if only the default encoding can be changed?
#8 Updated by Go MAEDA over 1 year ago
- Target version set to Candidate for next major release
Mizuki Ishikawa, thanks for writing the patch. I tried it out and it works fine on Issues and Spent time.
With the patch, we can override general_csv_encoding with UTF-8. The "Encoding" drop-down is not displayed if general_csv_encoding for a user is UTF-8.
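The behavior described here (two choices, and no drop-down when the default is already UTF-8) could be sketched as follows. The method names `csv_encoding_choices` and `show_encoding_dropdown?` are hypothetical and are not taken from the patch.

```ruby
# Hypothetical helper: the locale's default encoding plus UTF-8,
# without listing UTF-8 twice when it is already the default.
def csv_encoding_choices(default_encoding)
  [default_encoding, "UTF-8"].uniq { |name| name.upcase }
end

# Hypothetical helper: a drop-down is only useful when there is
# more than one encoding to choose from.
def show_encoding_dropdown?(default_encoding)
  csv_encoding_choices(default_encoding).size > 1
end

japanese_choices = csv_encoding_choices("CP932")
utf8_choices = csv_encoding_choices("UTF-8")
```

Putting the default first keeps the current behavior for users who never touch the option.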
#12 Updated by Holger Just over 1 year ago
This is a common issue for Redmine installations using multiple languages from different language families, like English with Japanese or German with Russian. Since stored content might be mixed between the languages, it is generally very hard to find a common encoding for those. Forcing users to choose a Japanese locale in order to allow them to export Japanese text in their Redmine even if they usually use English is probably not the best user experience.
In addition to that, I understand that the current language-dependent encodings were chosen to match whatever the default encoding of Excel on Windows was for the respective language. However, this default doesn't even hold for other platforms. E.g. Excel 2016 for Mac with en-US locale defaults to a semicolon instead of a comma for the expected separator. In any case, both the separator and the encoding can be changed when opening the file (File -> Import -> CSV file in Excel). This, however, requires that the exported CSV is saved with an encoding that is able to represent all of the characters in the fields.
The approach of allowing the selection of the default language and UTF-8 is in my opinion the right one. As such, I think the patch of Mizuki Ishikawa looks fine on first check, since it allows using the default encoding for the common case (as it is now) and also allows the use of UTF-8 for the more complex case. In any case, I think it's a good idea to show the user an indication of the encoding the file will be in so that they can configure their readers accordingly.
In the patch, I'd only add a check for a valid encoding in lib/redmine/export/csv.rb to avoid an exception in case an invalid encoding was specified.
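One possible shape for such a check, as a sketch only: `safe_csv_encoding` is a hypothetical name, and falling back to the default encoding is an assumption about the desired behavior rather than what the patch does. It relies on Ruby's `Encoding.find`, which raises `ArgumentError` for an unknown encoding name.

```ruby
# Hypothetical helper: return the requested encoding name if Ruby
# knows it, otherwise fall back to the given default.
def safe_csv_encoding(requested, default)
  Encoding.find(requested) # raises ArgumentError for unknown names
  requested
rescue ArgumentError, TypeError
  # Unknown name (or a non-string value): use the default instead
  # of letting the export fail with an exception.
  default
end

valid = safe_csv_encoding("UTF-8", "CP932")
invalid = safe_csv_encoding("no-such-encoding", "CP932")
```

An alternative would be to accept the parameter only if it appears in the list of choices offered in the export dialog.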
- Subject changed from Allow to overide general_csv_encoding in CSV export options window to Allow switching the encoding to UTF-8 when exporting to CSV
- Status changed from New to Closed
- Assignee set to Go MAEDA
- Target version changed from 4.1.0 to 4.0.0
- Resolution set to Fixed
Committed. Thank you for improving Redmine.