Defect #20730
closed
  
Fix tokenization of phrases with non-ascii chars
 
        
        Added by Jens Krämer about 10 years ago.
        Updated about 10 years ago.
        
  
  
  
  Description
  
  \w only matches ASCII characters, so a quoted phrase containing non-ASCII characters is not tokenized as a single phrase. We should either use [:alnum:] instead or simply match all non-" characters for the phrase. Test case included.
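  The difference between the character classes mentioned above can be checked with a short sketch. The sample strings and constant names here are assumptions chosen for illustration; they are not taken from the attached test case.

```ruby
# Illustrative only: compares the character classes discussed in the description.
# SAMPLE and the constant names are hypothetical, not part of Redmine.
SAMPLE = '日本語'

ASCII_WORD   = /\A\w+\z/           # \w is ASCII-only in Ruby: [A-Za-z0-9_]
UNICODE_WORD = /\A[[:alnum:]]+\z/  # POSIX bracket classes are Unicode-aware
NON_QUOTE    = /\A[^"]+\z/         # anything except a double quote

puts SAMPLE.match?(ASCII_WORD)     # false
puts SAMPLE.match?(UNICODE_WORD)   # true
puts SAMPLE.match?(NON_QUOTE)      # true
puts 'abc_123'.match?(ASCII_WORD)  # true
```

  Note that [:alnum:] has to appear inside a bracket expression, e.g. [[:alnum:]] or [\s[:alnum:]].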
   
 
  
  
  
    
    
    
    
       - Tracker changed from Patch to Defect
 
       - Target version set to 3.1.2
 
    
    +1
	The search keyword '"日本語 テスト"' (a quoted Japanese phrase) matches both "日本語 テスト" and "日本語テスト" in the current trunk, but it should not match the latter.
	Expected:
Redmine::Search::Fetcher.new('"日本語 テスト"', ...).tokens => ['日本語 テスト']
	Actual:
Redmine::Search::Fetcher.new('"日本語 テスト"', ...).tokens => ['日本語', 'テスト']
	This patch fixes the behavior.
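	The expected and actual results above can be reproduced with a minimal sketch, assuming a scan-based tokenizer that treats a double-quoted run as one phrase. The `tokenize` helper and the pattern constants are hypothetical names for illustration; this is not Redmine's actual `Redmine::Search::Fetcher` code.

```ruby
# Hypothetical stand-in for a search tokenizer, for illustration only.
# phrase_pattern is the regexp fragment allowed between the double quotes.
def tokenize(query, phrase_pattern)
  regexp = /((\s|^)"#{phrase_pattern}"(\s|$)|\S+)/
  # Take the outermost capture of each match and strip the surrounding quotes.
  query.scan(regexp).map { |m| m.first.gsub(/(^\s*"\s*|\s*"\s*$)/, '') }
end

ASCII_ONLY = '[\s\w]+'          # \w matches ASCII word characters only
ALNUM      = '[\s[:alnum:]]+'   # Unicode-aware letters and digits
NON_QUOTE  = '[^"]+'            # anything except a double quote

p tokenize('"hello world"', ASCII_ONLY)     # ["hello world"]
p tokenize('"日本語 テスト"', ASCII_ONLY)   # ["日本語", "テスト"]
p tokenize('"日本語 テスト"', ALNUM)        # ["日本語 テスト"]
p tokenize('"日本語 テスト"', NON_QUOTE)    # ["日本語 テスト"]
```

	With the ASCII-only pattern the quoted phrase falls through to the `\S+` branch and is split on the space, which is the reported behavior. Both alternatives keep it as a single token; the non-quote pattern additionally accepts punctuation inside a phrase.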
 
     
   
  
  
    
    
    
    
       - Status changed from New to Closed
 
       - Assignee set to Jean-Philippe Lang
 
       - Target version changed from 3.1.2 to 3.0.6
 
       - Resolution set to Fixed
 
    
    
     
   
  
 
  
  
  
 