Defect #29442
closed
Vendor-defined characters in ISO-2022-JP email subject break issue's subject
Description
When you create an issue via email, the subject field in the issue will be broken if the subject line in the email is encoded with ISO-2022-JP and contains vendor-defined characters such as "①".
Suppose a user tries to create an issue via email with the subject "①丸数字テスト", encoded in ISO-2022-JP. Instead of that text, the subject field of the issue becomes "$B-!(B $B4]?t;z%F%9%H(B".

This happens because the email's subject contains the character "①". CIRCLED DIGIT ONE (U+2460 in Unicode) is not defined in ISO-2022-JP, but it is defined in some vendor-extended variants such as ISO-2022-JP-MS and is widely used in the real world. Ruby presumably cannot convert "①" from ISO-2022-JP and raises Encoding::UndefinedConversionError.
The undefined character "①" should be replaced with a replacement character instead of breaking the whole subject. Please see the ":undef" and ":replace" options of String#encode at http://ruby-doc.org/core-2.5.1/String.html#method-i-encode.
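For illustration, here is a minimal sketch of how the ":undef" and ":replace" options could be applied to the subject bytes from this report. The helper name to_utf8_lenient and the choice of "?" as the replacement string are assumptions made for this example, not Redmine's actual code.

```ruby
# Hypothetical helper (not taken from Redmine): convert a string to UTF-8,
# replacing characters that have no mapping instead of raising an error.
def to_utf8_lenient(str, from_encoding)
  str.dup.force_encoding(from_encoding)
     .encode("UTF-8", invalid: :replace, undef: :replace, replace: "?")
end

# Raw subject bytes from this report, written as ISO-2022-JP escape sequences.
raw = "\e$B-!\e(B \e$B4]?t;z%F%9%H\e(B"

# A strict conversion is expected to fail on the vendor-defined character.
begin
  raw.dup.force_encoding("ISO-2022-JP").encode("UTF-8")
rescue Encoding::UndefinedConversionError => e
  puts "strict conversion failed: #{e.message}"
end

# The lenient conversion keeps the convertible part of the subject.
puts to_utf8_lenient(raw, "ISO-2022-JP")
```

With this approach only the unmappable character is lost, so the rest of the subject ("丸数字テスト") should stay readable.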
The attached patch is a test to detect the problem.
$ ruby test/unit/mail_handler_test.rb
.
(snip)
.
Encoding conversion failed "\xAD\xA1" to UTF-8 in conversion from ISO-2022-JP to stateless-ISO-2022-JP to EUC-JP to UTF-8
Encoding conversion failed "\xAD\xA1" to UTF-8 in conversion from ISO-2022-JP to stateless-ISO-2022-JP to EUC-JP to UTF-8
Encoding conversion failed "\xAD\xA1" to UTF-8 in conversion from ISO-2022-JP to stateless-ISO-2022-JP to EUC-JP to UTF-8
Encoding conversion failed "\xAD\xA1" to UTF-8 in conversion from ISO-2022-JP to stateless-ISO-2022-JP to EUC-JP to UTF-8
F
Failure:
MailHandlerTest#test_add_issue_with_iso_2022_jp_ms_subject [test/unit/mail_handler_test.rb:753]:
Expected /丸数字テスト/ to match "\e$B-!\e(B \e$B4]?t;z%F%9%H\e(B".
bin/rails test test/unit/mail_handler_test.rb:742