Redmine - Defect #24616: Should not replace all invalid utf8 characters (e.g in mail) https://www.redmine.org/issues/24616?journal_id=75453 2016-12-29T07:14:55Z Go MAEDA
<ul><li><strong>File</strong> <a href="/attachments/17361">defect-24616.diff</a> <a class="icon-only icon-download" title="Download" href="/attachments/download/17361/defect-24616.diff">defect-24616.diff</a> added</li><li><strong>Target version</strong> set to <i>3.3.2</i></li></ul><p>Looks good to me.</p>
<pre><code class="ruby syntaxhl"><span class="c1"># valid UTF-8 string</span>
<span class="n">text</span> <span class="o">=</span> <span class="s2">"こんにちは"</span>
<span class="nb">p</span> <span class="n">text</span><span class="p">.</span><span class="nf">valid_encoding?</span> <span class="c1"># => true</span>
<span class="c1"># making invalid UTF-8 string</span>
<span class="n">text</span><span class="p">.</span><span class="nf">force_encoding</span><span class="p">(</span><span class="s1">'ASCII-8BIT'</span><span class="p">)</span>
<span class="n">text</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="mh">0xff</span><span class="p">.</span><span class="nf">chr</span>
<span class="n">text</span><span class="p">.</span><span class="nf">force_encoding</span><span class="p">(</span><span class="s2">"UTF-8"</span><span class="p">)</span>
<span class="nb">p</span> <span class="n">text</span><span class="p">.</span><span class="nf">valid_encoding?</span> <span class="c1"># => false</span>
<span class="nb">p</span> <span class="n">text</span> <span class="c1"># => "こんにち\xE3\x81\xFF" </span>
<span class="c1"># Current code of Redmine</span>
<span class="nb">p</span> <span class="n">text</span><span class="p">.</span><span class="nf">encode</span><span class="p">(</span><span class="s2">"US-ASCII"</span><span class="p">,</span> <span class="ss">:invalid</span> <span class="o">=></span> <span class="ss">:replace</span><span class="p">,</span> <span class="ss">:undef</span> <span class="o">=></span> <span class="ss">:replace</span><span class="p">,</span> <span class="ss">:replace</span> <span class="o">=></span> <span class="s1">'?'</span><span class="p">).</span><span class="nf">encode</span><span class="p">(</span><span class="s2">"UTF-8"</span><span class="p">)</span>
<span class="c1"># => "??????" </span>
<span class="c1"># Fixed code by Pavel Rosický</span>
<span class="nb">p</span> <span class="n">text</span><span class="p">.</span><span class="nf">encode</span><span class="p">(</span><span class="s2">"UTF-8"</span><span class="p">,</span> <span class="ss">:invalid</span> <span class="o">=></span> <span class="ss">:replace</span><span class="p">,</span> <span class="ss">:undef</span> <span class="o">=></span> <span class="ss">:replace</span><span class="p">,</span> <span class="ss">:replace</span> <span class="o">=></span> <span class="s1">'?'</span><span class="p">)</span>
<span class="c1"># => "こんにち??" </span>
</code></pre> Redmine - Defect #24616: Should not replace all invalid utf8 characters (e.g in mail) https://www.redmine.org/issues/24616?journal_id=75454 2016-12-29T08:31:56Z Toshi MARUYAMA
<ul><li><strong>Target version</strong> deleted (<del><i>3.3.2</i></del>)</li></ul><p>Did you run the whole test suite?<br />Especially this test:<br /><a class="source" href="https://www.redmine.org/projects/redmine/repository/svn/entry/tags/3.3.1/test/unit/lib/redmine/codeset_util_test.rb">source:tags/3.3.1/test/unit/lib/redmine/codeset_util_test.rb</a></p> Redmine - Defect #24616: Should not replace all invalid utf8 characters (e.g in mail) https://www.redmine.org/issues/24616?journal_id=75455 2016-12-29T08:34:17Z Toshi MARUYAMA
<ul></ul><p>Pavel Rosický wrote:</p>
<blockquote>
<p>Hello,<br />I have an email that is encoded in UTF-8 but contains an invalid character. In this case, Redmine converts the content to US-ASCII and then to UTF-8. This step replaces every non-ASCII character with "?". Why?</p>
</blockquote>
<p>You can see the purpose of this function in this test:<br /><a class="source" href="https://www.redmine.org/projects/redmine/repository/svn/entry/tags/3.3.1/test/unit/lib/redmine/codeset_util_test.rb#L68">source:tags/3.3.1/test/unit/lib/redmine/codeset_util_test.rb#L68</a></p> Redmine - Defect #24616: Should not replace all invalid utf8 characters (e.g in mail) https://www.redmine.org/issues/24616?journal_id=75456 2016-12-29T09:41:34Z Pavel Rosický
<ul></ul><p>Thanks Toshi, I rechecked it and all tests pass.</p>
<p><a class="source" href="https://www.redmine.org/projects/redmine/repository/svn/entry/tags/3.3.1/test/unit/lib/redmine/codeset_util_test.rb#L68">source:tags/3.3.1/test/unit/lib/redmine/codeset_util_test.rb#L68</a><br />In this case, my change has no effect on the result, because the string contains just one invalid utf-8 character.</p>
<pre>
# current code
s1.encode('us-ascii', :invalid => :replace, :undef => :replace, :replace => '?').encode('utf-8')
# => "Texte encod? en ISO-8859-1."
</pre>
<pre>
# patched
s1.encode('utf-8', :invalid => :replace, :undef => :replace, :replace => '?')
# => "Texte encod? en ISO-8859-1."
</pre>
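<p>For anyone who wants to run the comparison, here is a minimal self-contained sketch. It assumes <code>s1</code> is a UTF-8-tagged string containing a single ISO-8859-1 byte, and it reuses the mixed string from the first comment. The helper names <code>via_us_ascii</code> and <code>via_utf8</code> are made up for this illustration and are not part of Redmine. The results shown are the ones reported in this thread for Ruby 2.x.</p>

```ruby
# encoding: utf-8

# Assumption: s1 stands in for the test string, one ISO-8859-1 byte (\xE9)
# inside otherwise plain ASCII text, tagged as UTF-8.
s1 = "Texte encod\xE9 en ISO-8859-1."
# The string from the first comment: valid Japanese text followed by invalid bytes.
mixed = "こんにち\xE3\x81\xFF"

# Hypothetical helper: the current approach, round-tripping through US-ASCII.
def via_us_ascii(str)
  str.encode('us-ascii', :invalid => :replace, :undef => :replace, :replace => '?').encode('utf-8')
end

# Hypothetical helper: the patched approach, encoding straight to UTF-8.
# Note: this only replaces invalid bytes on Ruby 2.0 or later.
def via_utf8(str)
  str.encode('utf-8', :invalid => :replace, :undef => :replace, :replace => '?')
end

p via_us_ascii(s1)    # => "Texte encod? en ISO-8859-1."
p via_utf8(s1)        # => "Texte encod? en ISO-8859-1."
p via_us_ascii(mixed) # => "??????"
p via_utf8(mixed)     # => "こんにち??"
```

<p>With only ASCII around the bad byte the two approaches agree, so the existing test cannot tell them apart.</p>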
<p>But when a string combines valid non-ASCII characters with invalid UTF-8 bytes, the current code replaces both. Try Go MAEDA's example.</p> Redmine - Defect #24616: Should not replace all invalid utf8 characters (e.g in mail) https://www.redmine.org/issues/24616?journal_id=76345 2017-01-28T04:19:53Z Toshi MARUYAMA
<ul></ul><pre>
$ irb
1.9.3-p551 :001 > text = "こんにち\xE3\x81\xFF"
=> "こんにち\xE3\x81\xFF"
1.9.3-p551 :002 > text = text.encode("UTF-8", :invalid => :replace, :undef => :replace, :replace => '?')
=> "こんにち\xE3\x81\xFF"
1.9.3-p551 :003 > text.valid_encoding?
=> false
</pre>
<pre>
$ irb
2.3.3 :001 > text = "こんにち\xE3\x81\xFF"
=> "こんにち\xE3\x81\xFF"
2.3.3 :002 > text = text.encode("UTF-8", :invalid => :replace, :undef => :replace, :replace => '?')
=> "こんにち??"
2.3.3 :003 > text.valid_encoding?
=> true
</pre> Redmine - Defect #24616: Should not replace all invalid utf8 characters (e.g in mail) https://www.redmine.org/issues/24616?journal_id=76347 2017-01-28T06:07:58Z Toshi MARUYAMA
<ul></ul><p>Pavel Rosický wrote:</p>
<blockquote>
<p>Hello,<br />I have an email that is encoded in UTF-8 but contains an invalid character. In this case, Redmine converts the content to US-ASCII and then to UTF-8. This step replaces every non-ASCII character with "?". Why?</p>
</blockquote>
<p>For compatibility with Ruby 1.8.7 behavior.<br /><a class="source" href="https://www.redmine.org/projects/redmine/repository/svn/entry/tags/2.6.9/lib/redmine/codeset_util.rb">source:tags/2.6.9/lib/redmine/codeset_util.rb</a></p> Redmine - Defect #24616: Should not replace all invalid utf8 characters (e.g in mail) https://www.redmine.org/issues/24616?journal_id=76348 2017-01-28T06:09:20Z Toshi MARUYAMA
<ul><li><strong>Subject</strong> changed from <i>encoding error if email contains an invalid utf8 character</i> to <i>Should not replace all invalid utf8 characters</i></li><li><strong>Category</strong> changed from <i>Email receiving</i> to <i>I18n</i></li></ul> Redmine - Defect #24616: Should not replace all invalid utf8 characters (e.g in mail) https://www.redmine.org/issues/24616?journal_id=76349 2017-01-28T06:10:15Z Toshi MARUYAMA
<ul><li><strong>Subject</strong> changed from <i>Should not replace all invalid utf8 characters</i> to <i>Should not replace all invalid utf8 characters (e.g in mail)</i></li></ul> Redmine - Defect #24616: Should not replace all invalid utf8 characters (e.g in mail) https://www.redmine.org/issues/24616?journal_id=76351 2017-01-28T07:56:08Z Toshi MARUYAMA
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Closed</i></li><li><strong>Target version</strong> set to <i>3.4.0</i></li><li><strong>Resolution</strong> set to <i>Fixed</i></li></ul><p>I have committed <a class="changeset" title="do not replace all invalid utf8 (#24616)" href="https://www.redmine.org/projects/redmine/repository/svn/revisions/16273">r16273</a> to pass on Ruby 1.9.3.<br />I don't want to change behavior on stable.</p>
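<p>The irb sessions earlier in this thread show why Ruby 1.9.3 needs care: there, encoding a UTF-8 string to UTF-8 leaves the invalid bytes in place. One way to get the same replacement on old and new Rubies is to round-trip through a different Unicode encoding. The sketch below illustrates that idea only. It is not necessarily what r16273 does, and the method name <code>replace_invalid_utf8_bytes</code> is an assumption made for this example.</p>

```ruby
# encoding: utf-8

# Hypothetical helper, not Redmine's API: replaces invalid UTF-8 byte
# sequences with '?' while keeping valid non-ASCII characters.
def replace_invalid_utf8_bytes(str)
  return str if str.nil?
  utf8 = str.dup.force_encoding('UTF-8')
  return utf8 if utf8.valid_encoding?
  # A UTF-8 -> UTF-8 encode does nothing on Ruby 1.9.3, so transcode through
  # UTF-16LE to make Ruby actually inspect and replace the invalid bytes.
  utf8.encode('UTF-16LE', :invalid => :replace, :undef => :replace, :replace => '?').encode('UTF-8')
end

cleaned = replace_invalid_utf8_bytes("こんにち\xE3\x81\xFF")
p cleaned                 # valid Japanese text is kept, invalid bytes become '?'
p cleaned.valid_encoding? # => true
```

<p>Strings that are already valid UTF-8 are returned unchanged, so only broken input pays for the extra transcoding step.</p>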