Defect #11089

UTF-8 encoding not showing correctly when looking highlighted php file contents

Added by Troex Nevelin over 5 years ago. Updated over 4 years ago.

Status: Closed
Start date:
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: Text formatting
Target version: -
Resolution:
Affected version: 2.0.1

Description

Ruby version              1.9.3 (x86_64-linux)
RubyGems version          1.8.11
Rack version              1.4
Rails version             3.2.5
Database adapter          mysql2
Database schema version   20120422150750
Git version               1.7.2.5

When I request a file to view its contents (repository/revisions/HASH/entry), I get '???' instead of the UTF-8 text.
I'm using Git SCM and my files are valid UTF-8 (without BOM). I have this problem with Chinese, Russian, Thai and other non-Latin scripts.
However, viewing diffs and attached UTF-8 files works fine.

utf-8-not-shown-in-file-contents-view.png (50.4 KB) Troex Nevelin, 2012-06-04 19:13

diff-view-is-okay.png (56.6 KB) Troex Nevelin, 2012-06-04 19:13

gh-new-d7e2a66d.png (53.8 KB) Toshi MARUYAMA, 2012-06-05 15:13

git-show-component.php.txt (2.61 KB) Troex Nevelin, 2012-06-05 15:36

git-show-component.php - the same file but with different extension (2.61 KB) Troex Nevelin, 2012-06-05 15:39

issue-attached-files.png (15.2 KB) Troex Nevelin, 2012-06-05 15:45

test-file-with-php-ext.png (145 KB) Troex Nevelin, 2012-06-05 15:45

test-file-with-txt-ext.png (93.5 KB) Troex Nevelin, 2012-06-05 15:45

def.php (31 Bytes) gehao liu, 2012-06-11 11:09

def.py (38 Bytes) gehao liu, 2012-06-11 11:09

def.txt (31 Bytes) gehao liu, 2012-06-11 11:09

def.php.png (3.93 KB) Toshi MARUYAMA, 2013-06-04 15:40


Related issues

Duplicated by Redmine - Defect #11131: repository View and Annotate code Utf-8 show ??? ,diff is... Closed
Duplicated by Redmine - Defect #14445: <code> inside <pre> destroy polish diacritic of PHP Closed

History

#1 Updated by Etienne Massip over 5 years ago

Did you set a value for the "Attachments and repositories encodings" setting (Administration/Settings, General tab)?

If not, could you try setting it?

#2 Updated by Troex Nevelin over 5 years ago

Yes, I have tried setting it to UTF-8, but it has no effect.

#3 Updated by Toshi MARUYAMA over 5 years ago

  • Subject changed from UTF-8 encoding not showing correctly when looking file contents to Git: encoding not showing correctly when looking file contents

#4 Updated by Toshi MARUYAMA over 5 years ago

Redmine uses "git show".
source:tags/2.0.1/lib/redmine/scm/adapters/git_adapter.rb#L372

In Git 1.7.3.4, "git show --help" says:

The contents of the blob objects are uninterpreted sequences of bytes. 
There is no encoding translation at the core level.

#5 Updated by Troex Nevelin over 5 years ago

I understand that git stores files in binary form, but calling this from the console:

git show --no-color HEAD:.../lang/ru/component.php

returns valid UTF-8 text. As I understand it, Redmine tries to guess the encoding and sanitise the content to make sure no invalid characters are passed to the view.

For example, source:trunk/config/locales/ja.yml displays correctly (but it uses SVN).

I think there is an encoding-guessing problem in source:tags/2.0.1/lib/redmine/codeset_util.rb#L84: does calling .to_utf8_by_setting_internal(str) set the ASCII-8BIT encoding on line 94?
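To illustrate the kind of relabelling discussed in this note, here is a minimal Ruby sketch. It is not Redmine's code: relabel_as_utf8 is a hypothetical helper, and the sample text is made up. It assumes the bytes arrive tagged as ASCII-8BIT (binary), as output from an external command may be, and shows how they could be relabelled as UTF-8 when they are valid, with a '?' fallback otherwise.

```ruby
# Hypothetical helper (not part of Redmine): relabel bytes as UTF-8 when they
# form valid UTF-8; otherwise replace every non-ASCII byte with "?".
def relabel_as_utf8(str)
  utf8 = str.dup.force_encoding(Encoding::UTF_8)
  return utf8 if utf8.valid_encoding?

  # Fallback: work on the raw bytes and keep only the ASCII part readable.
  str.b.gsub(/[\x80-\xFF]/n, "?").force_encoding(Encoding::UTF_8)
end

# Valid UTF-8 bytes that are merely tagged as binary (ASCII-8BIT).
raw = "Привет, мир".b

fixed = relabel_as_utf8(raw)
puts fixed.encoding        # UTF-8
puts fixed                 # the original text, unchanged

# Bytes that are not valid UTF-8 degrade to "?" instead of raising later.
puts relabel_as_utf8("\xFFabc".b)
```

Note that force_encoding only changes the label on the string; it does not transcode any bytes, which is why the valid_encoding? check matters.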

#6 Updated by Toshi MARUYAMA over 5 years ago

I cannot reproduce.
https://github.com/redmine/redmine/commit/d7e2a66d

Could you attach this "git show" output file?

#7 Updated by Troex Nevelin over 5 years ago

git show --no-color HEAD:.../lang/ru/component.php > git-show-component.php.txt

I'm running Redmine on Debian 6 with ruby 1.9.3p125 (2012-02-16) [x86_64-linux], using a package compiled from the Debian ruby repository and the unicorn rack server.

I'm almost sure this problem is specific to my local setup. Can you guide me on how to debug it? I'm familiar with Ruby and RoR. I have tried to output the raw content in app/views/common/_file.html.erb, but it gives me an ActionView::Template::Error (incompatible character encodings: UTF-8 and ASCII-8BIT) error.
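The "incompatible character encodings" error above can be reproduced in plain Ruby, outside Rails. The following is a small sketch under assumptions: the strings are made-up stand-ins for a template buffer and for file content tagged as binary, and concat_error is a hypothetical helper. Ruby raises this error only when both strings contain non-ASCII bytes and carry different encodings.

```ruby
# Returns the exception raised when joining two strings, or nil if the
# concatenation succeeds. Hypothetical helper for illustration only.
def concat_error(left, right)
  left + right
  nil
rescue Encoding::CompatibilityError => e
  e
end

buffer = "Содержимое: "   # UTF-8 text with non-ASCII characters
blob   = "файл".b         # non-ASCII bytes tagged as ASCII-8BIT (binary)

error = concat_error(buffer, blob)
puts error.class          # Encoding::CompatibilityError

# Relabelling the bytes (no transcoding) makes the two strings compatible.
relabelled = blob.dup.force_encoding(Encoding::UTF_8)
puts concat_error(buffer, relabelled).inspect  # nil
puts buffer + relabelled
```

Printing str.encoding and str.valid_encoding? for the content just before it reaches the view is one way to see which side carries the binary label.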

#9 Updated by Troex Nevelin over 5 years ago

I've made one more test on my setup: I attached the same file to an issue with two different extensions, .txt and .php. When viewing the attached files, I get the problem with the syntax-highlighted file. So this is not only a Git-related problem.

But there is no such issue here in this ticket.

# grep coderay Gemfile.lock 
    coderay (1.0.6)
  coderay (~> 1.0.6)

#10 Updated by Toshi MARUYAMA over 5 years ago

  • Subject changed from Git: encoding not showing correctly when looking file contents to UTF-8 encoding not showing correctly when looking file contents
  • Category deleted (SCM)

#11 Updated by gehao liu over 5 years ago

  • Status changed from New to Resolved

Same problem!
The .txt extension is OK,
the Python .py extension is OK,
the .php extension is wrong.
The problem is in syntax-highlighted viewing!

#12 Updated by gehao liu over 5 years ago

gehao liu wrote:

Same problem!
The .txt extension is OK,
the Python .py extension is OK,
the .php extension is wrong.
The problem is in syntax-highlighted viewing!

#13 Updated by Toshi MARUYAMA over 5 years ago

  • Status changed from Resolved to New

#14 Updated by gehao liu over 5 years ago

This is a CodeRay 1.0.6 bug; it affects only PHP files.
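One way to narrow a problem like this down to the highlighter is to check whether the highlighted output still contains every non-ASCII character of the input. The sketch below is highlighter-agnostic and does not call CodeRay: lost_characters is a hypothetical helper, and the two lambdas are made-up stand-ins for a working and a broken highlighter. It assumes the highlighter emits non-ASCII characters as-is rather than as HTML entities.

```ruby
# Returns the non-ASCII characters of +source+ that no longer appear in
# +output+. An empty array means nothing was lost.
def lost_characters(source, output)
  source.scan(/[^[:ascii:]]/).uniq.reject { |ch| output.include?(ch) }
end

# Stand-in highlighters (hypothetical; neither of these is CodeRay).
intact = ->(src) { "<span>#{src}</span>" }
broken = ->(src) { "<span>#{src.gsub(/[^[:ascii:]]/, '?')}</span>" }

sample = "echo 'Привет';"

puts lost_characters(sample, intact.call(sample)).inspect  # []
puts lost_characters(sample, broken.call(sample)).join     # the characters that were replaced
```

Running the same sample through the real highlighter with different language settings would show whether only one scanner drops characters.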

#15 Updated by András Kolesár over 4 years ago

The CodeRay PHP encoding issue has been solved:
https://github.com/rubychan/coderay/issues/40

Checked: it works fine with the updated coderay/scanners/php.rb file.

#16 Updated by Etienne Massip over 4 years ago

  • Subject changed from UTF-8 encoding not showing correctly when looking file contents to UTF-8 encoding not showing correctly when looking highlighted file contents
  • Category set to Text formatting
  • Status changed from New to Confirmed
  • Target version set to Candidate for next minor release

Upgrade the CodeRay dependency to 1.0.9 or 1.1.

#17 Updated by Toshi MARUYAMA over 4 years ago

  • Related to Defect #13692: warning: already initialized constant on Ruby 1.8.7 added

#18 Updated by Toshi MARUYAMA over 4 years ago

  • Related to deleted (Defect #13692: warning: already initialized constant on Ruby 1.8.7)

#19 Updated by Toshi MARUYAMA over 4 years ago

  • File def.php.png added
  • Status changed from Confirmed to Closed

The CodeRay version is defined as "~> 1.0.6" in source:tags/2.3.1/Gemfile#L6
So CodeRay 1.0.9 is installed with 2.3.1.

The attached image (def.php.png) shows the PHP file from note 12.
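For readers unfamiliar with the "~>" operator in the Gemfile constraint quoted in this note, here is a short sketch using RubyGems' own classes. The allowed? helper name is made up for illustration. It shows that "~> 1.0.6" accepts 1.0.9 but not 1.1.0.

```ruby
require "rubygems"

# Hypothetical helper: does +version+ satisfy the Gemfile-style +constraint+?
def allowed?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

%w[1.0.5 1.0.6 1.0.9 1.1.0].each do |version|
  puts "#{version}: #{allowed?('~> 1.0.6', version)}"
end
# 1.0.5: false
# 1.0.6: true
# 1.0.9: true
# 1.1.0: false
```

In other words, "~> 1.0.6" means at least 1.0.6 and below 1.1, so a fresh install resolves to the newest 1.0.x release available.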

#20 Updated by Toshi MARUYAMA over 4 years ago

  • Subject changed from UTF-8 encoding not showing correctly when looking highlighted file contents to UTF-8 encoding not showing correctly when looking highlighted php file contents

#21 Updated by Toshi MARUYAMA over 4 years ago

  • Target version deleted (Candidate for next minor release)

#22 Updated by Toshi MARUYAMA over 4 years ago

  • Duplicated by Defect #14445: <code> inside <pre> destroy polish diacritic of PHP added
