Defect #7134

is_binary_data uses wrong heuristic to decide if a string is binary

Added by Jérémy Lal almost 7 years ago. Updated almost 7 years ago.

Status: Closed
Start date: 2010-12-19
Priority: Low
Due date:
Assignee: -
% Done: 0%
Category: SCM
Target version: -
Resolution: Duplicate
Affected version:

Description

The is_binary_data? method, used in repositories_controller.rb#124, is:

def is_binary_data?
  ( self.count( "^ -~", "^\r\n" ) / self.size > 0.3 || self.count( "\x00" ) > 0 ) unless empty?
end

Applied to the attached file:

str=''
File.open("test_color.sh", "r") { |f|
    str = f.read
}
(str.count("^ -~", "^\r\n") +0.0)/ str.length
=> 0.311345646437995

This ratio is clearly above the 0.3 threshold, so the file is treated as binary and offered for download instead of being displayed when one wants to view it.
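To make the behaviour easier to reproduce without the attachment, here is a minimal sketch of the same decision rule. The helper names `non_printable_ratio` and `looks_binary?` are hypothetical and are not Redmine's code. The sketch assumes the count is done per byte, and it uses `fdiv` to get a float ratio, as the `+0.0` in the snippet above does.

```ruby
# Hypothetical helpers for illustration only; not Redmine's actual code.

# Fraction of bytes that are neither printable ASCII (space..~) nor CR/LF.
def non_printable_ratio(str)
  bytes = str.b                # binary copy, so counting is per byte (assumption)
  return 0.0 if bytes.empty?
  # With two sets, String#count counts characters in their intersection:
  # "not in space..~" AND "not CR or LF".
  bytes.count("^ -~", "^\r\n").fdiv(bytes.size)  # fdiv forces float division
end

# Same rule as the quoted heuristic: ratio above 0.3, or any NUL byte.
def looks_binary?(str)
  non_printable_ratio(str) > 0.3 || str.b.include?("\x00")
end

looks_binary?("echo hello\n")  # plain ASCII text
looks_binary?("h\u00e9llo")    # UTF-8 text: 2 of its 6 bytes fall outside space..~
```

Under these assumptions the short UTF-8 string is already classified as binary, because each non-ASCII character contributes several bytes to the count. Escape sequences in a shell script add to the count in the same way, since control bytes such as ESC also fall outside the space..~ range.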

test_color.sh - Not binary file (758 Bytes) Jérémy Lal, 2010-12-19 14:00


Related issues

Duplicates Redmine - Defect #6256: Redmine considers non ASCII and UTF-16 text files as bina... Closed 2010-08-31

History

#1 Updated by Jérémy Lal almost 7 years ago

Ideally it should get the binary attribute from the SCM, but I don't know whether all SCMs have an interface to get that information.

#2 Updated by Jean-Philippe Lang almost 7 years ago

  • Category set to SCM
  • Priority changed from Normal to Low

#3 Updated by ylgod jo almost 7 years ago

  • Assignee set to Jim Mulholland
  • % Done changed from 0 to 80

#4 Updated by Jean-Philippe Lang almost 7 years ago

  • Assignee deleted (Jim Mulholland)
  • % Done changed from 80 to 0

#5 Updated by Jean-Philippe Lang almost 7 years ago

  • Status changed from New to Closed
  • Resolution set to Duplicate

Same as #6256.
