Feature #5864

Regex Text on Receiver Email

Added by Sam Bo over 7 years ago. Updated 4 months ago.

Status: Closed
Start date: 2010-07-10
Priority: Normal
Due date:
Assignee: Jean-Philippe Lang
% Done: 0%
Category: Email receiving
Target version: 3.4.0
Resolution: Fixed

Description

Modify the "Truncate Email after these lines" section, or add a new section, so that a regex can be entered. This would make truncating emails much easier and would let anyone cover their own requirements, for example truncating after the reply line that an email system inserts ("On Wed June 24th 2010 Johnny wrote:").

This is related to #2852

I modified the controller for the mail handler to look for both the text delimiters and, hard-coded, the reply line that my email system uses. It could easily be adapted to check multiple regular expressions to handle different scenarios.

My Hard Coded Example:
regex = Regexp.new("(^(#{ delimiters.join('|') })\s*[\r\n].*)|(On (.*)wrote:[\r\n].*)", Regexp::MULTILINE)

A proposed generic pseudo-code controller example:
regex = Regexp.new("(^(#{ delimiters.join('|') })\s*[\r\n].*)|(#{ delimitersregex.join('|') })", Regexp::MULTILINE)
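To make the proposal above concrete, here is a minimal plain-Ruby sketch of combining escaped literal delimiters with raw regex delimiters. It is not the code that was eventually committed; the helper names truncation_regex and truncate_body and the sample delimiter text are assumptions for illustration only.

```ruby
# Hypothetical helpers, not Redmine code.
# delimiters:       plain-text lines, escaped before use
# delimiters_regex: raw regular expressions, used as written
def truncation_regex(delimiters, delimiters_regex)
  parts = []
  unless delimiters.empty?
    literal = delimiters.map { |s| Regexp.escape(s) }
    # In a double-quoted Ruby string "\s" is a literal space, so the
    # whitespace classes are spelled out with escaped backslashes here.
    parts << "^(?:#{literal.join('|')})[ \\t]*[\\r\\n].*"
  end
  parts << "(?:#{delimiters_regex.join('|')})" unless delimiters_regex.empty?
  return nil if parts.empty?
  # MULTILINE lets "." match newlines, so ".*" swallows the rest of the body.
  Regexp.new(parts.join('|'), Regexp::MULTILINE)
end

def truncate_body(body, regex)
  regex ? body.gsub(regex, '').strip : body.strip
end

regex = truncation_regex(
  ["--- Reply above this line ---"],
  ['^On [^\r\n]*wrote:[\r\n].*']
)

reply = "Thanks, that fixed it.\n\nOn Wed June 24th 2010 Johnny wrote:\n> original text\n"
puts truncate_body(reply, regex)
```

The regex delimiter keeps the "On ... wrote:" part on a single line with [^\r\n]* so that it cannot run across the whole message.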

redmine_regex.png (23.6 KB) Ben Blanco, 2016-01-27 16:09

redmine_regex_incoming_emails.png (20.7 KB) Ben Blanco, 2016-02-15 18:52

allow_regex_delimiters.patch (3.02 KB) Marius BALTEANU, 2016-10-19 23:50

regex_delimiter_setting.png (57.8 KB) Marius BALTEANU, 2016-11-23 21:49

allow_regex_delimiters_v2.patch (5.21 KB) Marius BALTEANU, 2016-11-23 21:50

allow_regex_delimiters_v3.patch (9.29 KB) Marius BALTEANU, 2016-12-11 18:23

make_text_clickable.patch (1.41 KB) Marius BALTEANU, 2016-12-15 22:34


Related issues

Related to Redmine - Patch #11684: Truncate incoming emails New
Duplicated by Redmine - Patch #10069: delimiter improvments Closed

Associated revisions

Revision 16065
Added by Jean-Philippe Lang 10 months ago

Optional Regex delimiters to truncate incoming emails (#5864).

Revision 16066
Added by Jean-Philippe Lang 10 months ago

Adds :setting_mail_handler_enable_regex_delimiters i18n string (#5864).

Revision 16067
Added by Jean-Philippe Lang 10 months ago

Adds :not_a_regexp i18n string (#5864).

Revision 16079
Added by Jean-Philippe Lang 10 months ago

Makes the text "Enable regular expressions" clickable (#5864).

History

#1 Updated by Bart Stuyckens about 7 years ago

+1

#2 Updated by Terence Mill about 7 years ago

+1

#3 Updated by Akshat Pradhan about 7 years ago

Sam, where did you put this? I'm looking in mail_handler_controller.rb and that doesn't look like the correct place to put this. I also want to be able to regex out a certain portion of all incoming emails. I use google apps/smtp for all received emails.

Sam Bo wrote:

My Hard Coded Example:
regex = Regexp.new("(^(#{ delimiters.join('|') })\s*[\r\n].*)|(On (.*)wrote:[\r\n].*)", Regexp::MULTILINE)

A proposed generic pseudo-code controller example:
regex = Regexp.new("(^(#{ delimiters.join('|') })\s*[\r\n].*)|(#{ delimitersregex.join('|') })", Regexp::MULTILINE)

#5 Updated by Sam Bo almost 6 years ago

I'm going to 1 (not the G kind mind you) this since it has been over a year. I'd love to see this so we don't have to manually update the model on each upgrade! If I knew anything about Ruby I'd submit a patch.

#6 Updated by Sam Bo almost 6 years ago

Ouch, the plus 1 didn't come through correctly. Sorry for the double posting here.

#7 Updated by Nick Caballero over 5 years ago

diff -r 146377aeb5a8 app/models/mail_handler.rb
--- a/app/models/mail_handler.rb    Sun May 13 19:09:35 2012 +0000
+++ b/app/models/mail_handler.rb    Fri May 25 23:00:22 2012 +0000
@@ -415,9 +415,9 @@

   # Removes the email body of text after the truncation configurations.
   def cleanup_body(body)
-    delimiters = Setting.mail_handler_body_delimiters.to_s.split(/[\r\n]+/).reject(&:blank?).map {|s| Regexp.escape(s)}
+    delimiters = Setting.mail_handler_body_delimiters.to_s.split(/[\r\n]+/).reject(&:blank?)
     unless delimiters.empty?
-      regex = Regexp.new("^[> ]*(#{ delimiters.join('|') })\s*[\r\n].*", Regexp::MULTILINE)
+      regex = Regexp.new("^(#{ delimiters.join('|') })\s*[\r\n].*", Regexp::MULTILINE)
       body = body.gsub(regex, '')
     end
     body.strip

#8 Updated by Jens Schneider over 4 years ago

+1

This would be extremely helpful for filtering multilingual email replies.

Depending on the user's email program, the sender could be displayed as:

Von: [mailto:]
or
From: [mailto:]

With a regular expression, this could be filtered out with a single line in the redmine configuration.
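As a rough illustration of that single line, the following plain-Ruby sketch cuts a body at the first "Von:" or "From:" header line. The pattern, the helper name cut_at_header and the sample messages are assumptions, not an actual Redmine setting.

```ruby
# Hypothetical pattern covering the German and English header variants.
HEADER = /^(?:Von|From):/

# Cut the body at the first line that starts with one of the headers.
def cut_at_header(body)
  m = HEADER.match(body)
  m ? body[0...m.begin(0)].strip : body.strip
end

german  = "Danke, das passt.\n\nVon: Redmine [mailto:redmine@example.com]\n> alter Text\n"
english = "Thanks, works for me.\n\nFrom: Redmine [mailto:redmine@example.com]\n> old text\n"

puts cut_at_header(german)
puts cut_at_header(english)
```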

#9 Updated by Chris Birchall over 4 years ago

The use of multiline regex can make it quite tricky to write delimiters correctly. e.g. if you start your delimiter with .* then it can match the whole message, resulting in the whole message being deleted.

I went for a slightly safer fix: use a normal regex, and if you find a line matching any of the delimiters, delete that line and anything after it.

diff --git a/app/models/mail_handler.rb b/app/models/mail_handler.rb
index c84672b..82fd5fe 100644
--- a/app/models/mail_handler.rb
+++ b/app/models/mail_handler.rb
@@ -483,10 +483,16 @@ class MailHandler < ActionMailer::Base

   # Removes the email body of text after the truncation configurations.
   def cleanup_body(body)
-    delimiters = Setting.mail_handler_body_delimiters.to_s.split(/[\r\n]+/).reject(&:blank?).map {|s| Regexp.escape(s)}
+    delimiters = Setting.mail_handler_body_delimiters.to_s.split(/[\r\n]+/).reject(&:blank?)
     unless delimiters.empty?
-      regex = Regexp.new("^[> ]*(#{ delimiters.join('|') })\s*[\r\n].*", Regexp::MULTILINE)
-      body = body.gsub(regex, '')
+      # Combine all delimiters into one regex
+      regex = Regexp.new("^(#{ delimiters.join('|') })")
+
+      # If the regex matches a line
+      regex.match(body) { |m|
+        # Delete the matched line and everything after it
+        body = body[0 ... m.begin(0)]
+      }
     end
     body.strip
   end
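The difference between the two approaches can be seen in a small standalone sketch. The sample body and the delimiter '.*wrote:' are made up for illustration; the two regex shapes follow the removed and added lines of the diff above, with the whitespace classes written as escaped backslashes.

```ruby
body = "Looks good to me.\n\nOn Mon, Feb 1, 2016, someone wrote:\n> quoted text\n"
delimiter = '.*wrote:'

# Old shape: with MULTILINE, "." also matches newlines, so the leading ".*"
# starts at the first line and the match covers the entire message.
multiline = Regexp.new("^(#{delimiter})\\s*[\\r\\n].*", Regexp::MULTILINE)
multiline_result = body.gsub(multiline, '').strip

# Safer shape: "." stops at line ends, so the match starts on the line that
# actually contains the delimiter, and only that line onward is dropped.
per_line = Regexp.new("^(#{delimiter})")
m = per_line.match(body)
per_line_result = (m ? body[0...m.begin(0)] : body).strip

puts multiline_result.inspect
puts per_line_result.inspect
```

In the first case nothing of the reply survives; in the second the text above the quoted reply is kept.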

#10 Updated by David Aaron Fendley about 4 years ago

Chris Birchall wrote:

The use of multiline regex can make it quite tricky to write delimiters correctly. e.g. if you start your delimiter with .* then it can match the whole message, resulting in the whole message being deleted.

I went for a slightly safer fix: use a normal regex, and if you find a line matching any of the delimiters, delete that line and anything after it.

[...]

This worked perfectly. Thank you, Chris!

#11 Updated by Toshi MARUYAMA over 3 years ago

#12 Updated by Antonio García-Domínguez over 3 years ago

The patch by Chris worked beautifully in Redmine 2.3.2. Thanks!

+1

#13 Updated by Andrew Hills over 3 years ago

I did not want to maintain a patch for my installation of Redmine, so I've created a plugin (tested only on 2.4.3 thus far) changing the behavior of the truncation field from text snippets per line to regular expressions per line.

http://www.redmine.org/plugins/redmine_mail_handler_clean_body_regexp

#14 Updated by Massimo Rossello over 3 years ago

+1
Thank you for this life saving patch, the plugin works like a charm!

#15 Updated by Kevin Palm over 3 years ago

+1

#16 Updated by Patrice Bonhomme over 2 years ago

+1

#17 Updated by Michael Schaefer about 2 years ago

+1 I'd so much like to see this integrated in redmine!

#18 Updated by Alexander Ryabinovskiy about 2 years ago

+1, nice feature.

#19 Updated by Ismael Barros² almost 2 years ago

+1, please do

#20 Updated by Sebastian Paluch almost 2 years ago

+10!

#21 Updated by Ben Blanco over 1 year ago

Thanks for this!

I nearly have email truncation working the way I'd like; it wasn't working at all for me with stock redmine 3.2.0.

However, with the above tweak of app/models/mail_handler.rb it finally performs some truncation.

Now, the last thing I'd like to do is get rid of the first line that email clients add to replies (as mentioned by Jens Schneider), i.e.

On Mon, Feb 1, 2016 at 12:35, redmine@foo.com wrote:
> Feature #5864: Regex Text on Receiver Email
> blabla
> blabblabla

So I'm looking for how to write a regex that would find & select the first line where it finds the mention of the redmine server's email address - so as to delete it, as well as any line following it.

For now I have come up with: (redmine@foo\.com)

Which sort of works ok, when I test it with a sample text - here: http://rubular.com/r/OowzIArxPf

But it's not working when I declare it in redmine.

Is it because the regex is not good (enough)?

Or is it because I'm not writing/putting it in correctly in redmine's Admin interface? Should something be prefixed|appended to the regex for it to be taken into account?
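One plausible explanation, assuming the patch from comment #9 is in use, is that every delimiter is wrapped as ^(...) and therefore has to match from the start of a line. The sketch below is plain Ruby with a hypothetical truncate helper, not Redmine code, and it does not rule out other causes such as settings not being reloaded.

```ruby
# Hypothetical stand-in for the patched cleanup_body: one delimiter,
# anchored at the start of a line, cut from the match onward.
def truncate(body, delimiter)
  m = Regexp.new("^(#{delimiter})").match(body)
  (m ? body[0...m.begin(0)] : body).strip
end

body = "Sounds good.\n\nOn Mon, Feb 1, 2016 at 12:35, redmine@foo.com wrote:\n> blabla\n"

# No line starts with the address, so nothing is cut.
unanchored = truncate(body, '(redmine@foo\.com)')

# Allowing any text before the address lets the anchored match succeed.
with_prefix = truncate(body, '.+redmine@foo\.com.+')

puts unanchored
puts with_prefix
```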

#23 Updated by Ben Blanco over 1 year ago

For anyone interested, I finally have configured my redmine 3.2.0 to cleanly truncate incoming emails.

Amend mail_handler.rb

The default mail_handler.rb was not performing any truncation for me (not sure why; never got an answer/much help to try and troubleshoot it, cf. #21746).

So, reading the thread of comments on #5864, I finally found that the following works:

def cleanup_body(body)
  delimiters = Setting.mail_handler_body_delimiters.to_s.split(/[\r\n]+/).reject(&:blank?)
  unless delimiters.empty?
    # Combine all delimiters into one regex
    regex = Regexp.new("^[> ]*(#{ delimiters.join('|') })")
    # If the regex matches a line
    regex.match(body) { |m|
      # Delete the matched line and everything after it
      body = body[0 ... m.begin(0)]
    }
  end
  body.strip
end

Administration / Settings / Incoming Emails

In redmine's incoming emails settings, I then put the following values:

.+redmine@foo\.com.+
^-{2}.\n
.+image:\scid:.+

The first one, .+redmine@foo\.com.+, wipes out the top line that email clients add to replies, such as:

On Mon, Feb 1, 2016 at 12:35, redmine@foo.com wrote:
> Feature #5864: Regex Text on Receiver Email
> blabla
> blabblabla

So that's very cool.

The other two settings are:

  • ^-{2}.\n removes Gmail's appended signatures, which apparently are always preceded by a line containing -- followed by a single invisible character. Gmail.com's online raw email dump shows that character as =20, but it isn't; it is a single character of whatever they append to the double dash, hence the "." used in the regex.
  • .+image:\scid:.+ removes the image logo our company uses in our staff's email signature.

Also, for that signature image, I also declared the full file name, in the Exclude attachments by name setting.

I might have to add file exclusions if/when we allow non-staff, ie. clients, to create/comment issues via email, but for now this setup works flawlessly!
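For anyone who wants to try these three values outside redmine, here is a minimal plain-Ruby sketch. The method mirrors the cleanup_body above but takes the delimiters as an argument instead of reading Setting.mail_handler_body_delimiters; the sample messages, including the "[image: cid:logo.png]" line, are invented for illustration.

```ruby
# Plain-Ruby stand-in for the patched cleanup_body; `delimiters` replaces
# the value normally read from the redmine settings.
def cleanup_body(body, delimiters)
  return body.strip if delimiters.empty?
  regex = Regexp.new("^[> ]*(#{delimiters.join('|')})")
  m = regex.match(body)
  body = body[0...m.begin(0)] if m
  body.strip
end

# Single quotes keep the backslashes intact for the regex engine.
DELIMITERS = [
  '.+redmine@foo\.com.+',
  '^-{2}.\n',
  '.+image:\scid:.+'
]

reply  = "Agreed, let's ship it.\n\nOn Mon, Feb 1, 2016 at 12:35, redmine@foo.com wrote:\n> blabla\n"
signed = "Done.\n-- \nBen\n"
logo   = "See attached.\n[image: cid:logo.png]\n"

[reply, signed, logo].each { |mail| puts cleanup_body(mail, DELIMITERS) }
```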

Caveats

I've noticed that the regex truncation and/or attachment exclusion rules I specify in redmine's Administration are not taken into account upon Save. I have to restart the redmine application, and only then are they properly taken into account.

I'm not sure if that's normal, but I'm mentioning it in case someone else reads this, as it'll save you a lot of time and pain.

If someone knows that this is normal behaviour, then we should probably add a notice in Administration saying "Please restart redmine application for these settings to be taken into account".

If it's not normal, please let me know, and I can provide info on server setup (nginx+passenger+rbenv4rubies basically).

Final note

I tried using the redmine_mail_handler_clean_body_regexp plugin, but it didn't work for me. I'm not sure whether that's because I'm running redmine 3.2.0.

#24 Updated by Marius BALTEANU about 1 year ago

We made a patch with tests that implements this feature. Without regex delimiters, we weren't able to truncate received emails correctly. I think that this feature is really needed in core.

#25 Updated by Toshi MARUYAMA 11 months ago

Marius BALTEANU wrote:

We made a patch with tests that implements this feature. Without regex delimiters, we weren't able to truncate received emails correctly. I think that this feature is really needed in core.

Does this patch break if existing setting has regexp special characters?
I think it is better to switch regexp on or off.

#26 Updated by Marius BALTEANU 11 months ago

Toshi MARUYAMA wrote:

Marius BALTEANU wrote:

We made a patch with tests that implements this feature. Without regex delimiters, we weren't able to truncate received emails correctly. I think that this feature is really needed in core.

Does this patch break if existing setting has regexp special characters?

From my tests, no, it doesn't break, but if you have a specific scenario in your mind, please tell me and I'll test it.

I think it is better to switch regexp on or off.

Yes, I agree with you. I've updated the patch to add a new setting that enables or disables this feature.

#27 Updated by Peter Petrik 11 months ago

Excellent patch, thanks for contributing it!

#28 Updated by Toshi MARUYAMA 11 months ago

Marius BALTEANU wrote:

Toshi MARUYAMA wrote:

Marius BALTEANU wrote:

We made a patch with tests that implements this feature. Without regex delimiters, we weren't able to truncate received emails correctly. I think that this feature is really needed in core.

Does this patch break if existing setting has regexp special characters?

From my tests, no, it doesn't break, but if you have a specific scenario in your mind, please tell me and I'll test it.

For example, "***cut below lines***".

#29 Updated by Toshi MARUYAMA 11 months ago

  • Target version set to 3.4.0

#30 Updated by Marius BALTEANU 11 months ago

Toshi MARUYAMA wrote:

For example, "***cut below lines***".

You're right, without the new setting to enable/disable this feature, the existing delimiters with special regex characters will behave differently.
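A short plain-Ruby sketch shows why that example matters. It is not taken from the patch; the sample body is made up, and only standard Regexp calls are used.

```ruby
delimiter = "***cut below lines***"

# Used as a raw regex, the leading "*" has nothing to repeat,
# so Ruby refuses to compile the pattern.
raw_error = begin
  Regexp.new("^(#{delimiter})")
  nil
rescue RegexpError => e
  e
end

# Escaped first (the pre-existing behaviour), the same text is a literal.
escaped = Regexp.new("^(#{Regexp.escape(delimiter)})")
body = "My reply\n***cut below lines***\nold quoted text\n"
m = escaped.match(body)
truncated = (m ? body[0...m.begin(0)] : body).strip

puts raw_error.class
puts truncated
```

Other existing delimiters would not raise but would quietly change meaning, for example a "." that starts matching any character, which is the case an on/off setting protects against.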

#31 Updated by Jean-Philippe Lang 10 months ago

I think that this patch would raise an error when receiving an email if "Enable regexp delimiters" is checked and the entered delimiter is an invalid regexp.

#32 Updated by Marius BALTEANU 10 months ago

Jean-Philippe Lang wrote:

I think that this patch would raise an error when receiving an email if "Enable regexp delimiters" is checked and the entered delimiter is an invalid regexp.

Thanks for your feedback. I'll modify the patch to validate each regex on save when "Enable regexp delimiters" is checked.

#33 Updated by Marius BALTEANU 10 months ago

I've updated the patch to validate each regex delimiter on save. The user won't be able to save new settings with invalid regex delimiters and will receive an error message for each invalid entry.

Please let me know if more changes are required to have this committed.
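The validation described above could look roughly like the following plain-Ruby sketch. It is not the code from the attached patch or from the committed revision; the helper names valid_regex? and invalid_regex_delimiters are hypothetical, and the input string stands in for the saved setting value.

```ruby
# Hypothetical helper: true if the line compiles as a regular expression.
def valid_regex?(source)
  Regexp.new(source)
  true
rescue RegexpError
  false
end

# Returns the delimiter lines that would have to be rejected on save.
def invalid_regex_delimiters(setting_value)
  setting_value.to_s
               .split(/[\r\n]+/)
               .reject { |line| line.strip.empty? }
               .reject { |line| valid_regex?(line) }
end

entered = "^On .* wrote:\n***cut below lines***\n(unclosed group\n"
invalid_regex_delimiters(entered).each do |line|
  puts "not a valid regular expression: #{line}"
end
```

Reporting each offending line separately matches the described behaviour of one error message per invalid entry.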

#34 Updated by Marius BALTEANU 10 months ago

Thanks for implementing this feature. Attached is a small patch that makes the text "Enable regular expressions" clickable (as a label for the checkbox).

#35 Updated by Jean-Philippe Lang 10 months ago

  • Status changed from New to Closed
  • Assignee set to Jean-Philippe Lang
  • Resolution set to Fixed

I've fixed it using an existing class, thanks for pointing this out.

#36 Updated by Go MAEDA 10 months ago

#37 Updated by Mischa The Evil 4 months ago

  • Subject changed from Regex Text on Receiver Email to Regex Text on Receiver Email
