Git: use --encoding=UTF-8 in "git log"
Assignee: Toshi MARUYAMA
The global setting for repository log encoding is useless for git.
git has the config option i18n.logOutputEncoding: if it is empty, the log encoding is UTF-8;
otherwise git uses the value specified by the option.
scm: git: prepare version string unit lib test and git log encoding (#3396).
This file includes UTF-8 literal.
We need to consider Ruby 1.9 compatibility.
#2 Updated by Vitaliy Ischenko over 10 years ago
git has a per-repository config option, i18n.logOutputEncoding, which stores the encoding used for git-log output.
i18n.logOutputEncoding Character encoding the commit messages are converted to when running git-log and friends.
if it is empty or unset, then output will be UTF-8 encoded
else value specified in this option will be used
you can get this value with `git config i18n.logOutputEncoding`
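The rule described above could be sketched in Ruby roughly as follows. This is only an illustration under stated assumptions: `effective_log_encoding` and `configured_log_encoding` are hypothetical helper names, not part of Redmine, and the sketch implements only the empty-or-unset rule as stated in this comment.

```ruby
require 'open3'

# Hypothetical helper: applies the rule from the comment above to a raw
# config value. Empty or unset means UTF-8, otherwise the value is used.
def effective_log_encoding(config_value)
  value = config_value.to_s.strip
  value.empty? ? 'UTF-8' : value
end

# Hypothetical helper: asks git for the per-repository setting.
# `git config <key>` prints nothing when the key is unset, so the
# output is passed straight to the rule above.
def configured_log_encoding(repo_path)
  out, _status = Open3.capture2('git', '-C', repo_path,
                                'config', 'i18n.logOutputEncoding')
  effective_log_encoding(out)
end
```

`configured_log_encoding` needs git on the PATH and a real repository path; `effective_log_encoding` can be checked on its own.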
#5 Updated by Toshi MARUYAMA over 8 years ago
#6 Updated by Toshi MARUYAMA over 8 years ago
git -c <name>=<value>: Pass a configuration parameter to the command. The value given will override values from configuration files. The <name> is expected in the same format as listed by git config (subkeys separated by dots).
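As a rough sketch of how this could be applied, the two ways of forcing UTF-8 log output mentioned in this ticket (the `-c` override of i18n.logOutputEncoding and the `--encoding` flag from the title) might be built as argument lists like this. The helper names are hypothetical and this is not the code that was committed to Redmine.

```ruby
# Hypothetical helper: overrides the repository's i18n.logOutputEncoding
# for a single invocation via git's -c option.
def git_log_with_config_override(revision = nil, encoding = 'UTF-8')
  cmd = ['git', '-c', "i18n.logOutputEncoding=#{encoding}", 'log']
  cmd << revision if revision
  cmd
end

# Hypothetical helper: uses the --encoding flag of git log instead.
def git_log_with_encoding_flag(revision = nil, encoding = 'UTF-8')
  cmd = ['git', 'log', "--encoding=#{encoding}"]
  cmd << revision if revision
  cmd
end
```

Returning an array rather than a single string avoids shell quoting problems when the list is later passed to something like `Open3.capture2(*cmd)`.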
#7 Updated by Jean-François Dagenais over 8 years ago
I wrote an answer to Weverton Morais about how I patched a problem we had that I believe is related to this ticket. I maintain a modified Linux kernel git repo, so there are lots of international names in there. I narrowed it down to a simple scenario that reproduces the problem.
Try making a dummy git commit with this name:
git commit -am"dummy test character encoding" --allow-empty --author="blaŻbla <email@example.com>"
Then do the changeset fetch, I use
ruby script/runner "Repository.fetch_changesets"
or the /sys/fetch_changesets with the key.
The logs will show a collation error on a query. We use git on Linux platforms and never worried about encoding, so I believe our platforms default to UTF-8.
As my answer said, the problem seemed to be that all of the tables created by Redmine (or by TurnKey Linux, the base of our install) defaulted to latin1. In any case, the fetch_changesets code should account for the difference in encoding if needed.
#9 Updated by Vitaliy Ischenko over 8 years ago
Jean-François Dagenais wrote:
... so the point is, it's not just the file paths inside the repo, or the commit logs, but all text contained within the repo it seems.
According to the docs this is false: i18n.commitEncoding relates only to the log message. All other parts (file paths, author, committer and other commit object headers) should be treated as uninterpreted sequences of non-NUL bytes.
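If author names and paths really are uninterpreted bytes, code that stores them in a UTF-8 database column has to decide what to do with bytes that are not valid UTF-8. A minimal Ruby sketch of one possible policy follows. `to_utf8_lossy` is a hypothetical name, replacing bad bytes with `?` is an assumption made for illustration, and `String#scrub` requires Ruby 2.1 or later.

```ruby
# Hypothetical helper: takes raw bytes read from git (for example an
# author header) and returns a string that is safe to store as UTF-8.
def to_utf8_lossy(bytes)
  # Work on a copy so the caller's string is not relabelled.
  str = bytes.dup.force_encoding('UTF-8')
  return str if str.valid_encoding?

  # Assumption: invalid byte sequences are replaced with '?'.
  str.scrub('?')
end
```

With this policy a name such as the `blaŻbla` author from the reproduction above passes through unchanged when its bytes are valid UTF-8, and only invalid bytes are replaced.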