Project

General

Profile

Actions

Patch #35104

closed

Code blocks - consistent rendering and retaining user-supplied language name in rendered HTML

Added by Martin Cizek about 3 years ago. Updated over 2 years ago.

Status:
Closed
Priority:
Normal
Category:
Text formatting
Target version:
Start date:
Due date:
% Done:

0%

Estimated time:

Description

This patch:

  • Makes code block rendering consistent across formats, which is important for plugins, client-side scripts and styling.
  • Makes sure rendered language name is escaped.
  • Exposes user-supplied language name in an HTML attribute, so that client-side scripts and Redmine plugins can always access it.

The patch applies to the current Textile and Markdown formatters. The very same change to the planned CommonMark formatter will be attached to #32424.

Please see below for design decisions rationale.

Code block rendering

<pre><code> vs. just <pre>:
Textile supports blocks without code (<pre>...</pre>), while code can be part of such blocks. Typically, there is just one and only code block per <pre> block, but it's possible to have more than one code block in a <pre> block. So in general, we cannot avoid the code tag in the HTML rendered from Textile.

While it is possible to render only the pre tag for Markdown code blocks and rendering libraries do have option for that, we should use the code tag too to be consistent. Fortunatelly, this is already the case and it is even what CommonMark spec suggests.

No-code blocks in Markdown
Markdown does not have preformatted blocks except for code blocks. Code blocks are either idented or fenced, but both are meant to represent code (spec).
The fenced code blocks can also carry an info string, which is used to determine the languge of the code.

Comparison of different renderers:

Formatter No-code block → HTML Code block w/o lang → HTML Code block with supported lang → HTML Code block with unsupported lang → HTML
Redmine Textile <pre>...</pre> <pre><code>...</code></pre> <pre><code class="ruby syntaxhl">...</code></pre> <pre><code>...</code></pre>
CommonMark reference renderer N/A <pre><code>...</code></pre> <pre><code class="language-ruby">...</code></pre> <pre><code class="language-foo">...</code></pre>
Redmine Markdown N/A <pre>...</pre> <pre><code class="ruby syntaxhl">...</code></pre> <pre>...</pre>
Redmine CommonMark #32424 N/A <pre><code>...</code></pre> <pre><code class="ruby syntaxhl">...</code></pre> <pre>...</pre>

It's quite obvious that we should follow the Textile behavior to be consistent.

Original data language

The user-supplied language information is currently dropped if the particular language does not have any lexer for syntax highlighting. However, there are several use cases when this information is valuable:

For the reasons above, the information has to be carried on the code tag. I've examined a few projects and libraries related to the topic to find an appropriate name for such attribute. The conclusion is to use <code data-language="foo" ...>.

Resulting code block rendering:
Formatter No-code block → HTML Code block w/o lang → HTML Code block with supported lang → HTML Code block with unsupported lang → HTML
All Redmine formatters <pre>...</pre> <pre><code>...</code></pre>
<pre><code class="ruby syntaxhl" data-language="ruby">...</code></pre>
<pre><code data-language="foo">...</code></pre>

Quality Assurance, Security & Compatibility

  • QA: Added new unit tests and updated the existing ones.
  • Security: Previously, Textile formatter did not escape language name and relied on validating it against syntax highlighter, while expecting language names to be safe. Language names are now always subject to escaping.
  • Compatibility:
    • The rendered HTML is now the same as the HTML from Textile.
    • The added data-language attribute should be indeed ignored by unconcerned components.
    • I don't really expect that anybody to rely on getting <pre> and not <pre><code>, especially when it is the current behavior of Textile. I've verified this on redmine_wysiwyg_editor as a rare example of a potentially concerned plugin.

So I'm convinced it is safe to include this in any minor release.


Files


Related issues

Related to Redmine - Feature #32424: CommonMark Markdown Text FormattingClosedMarius BĂLTEANU

Actions
Actions #2

Updated by Martin Cizek about 3 years ago

...accidentaly marked this as a defect, it should have been patch. Thank you.

Actions #3

Updated by Go MAEDA almost 3 years ago

  • Tracker changed from Defect to Patch
Actions #4

Updated by Marius BĂLTEANU over 2 years ago

  • Assignee set to Marius BĂLTEANU
  • Target version set to 5.0.0
Actions #5

Updated by Marius BĂLTEANU over 2 years ago

  • Related to Feature #32424: CommonMark Markdown Text Formatting added
Actions #6

Updated by Marius BĂLTEANU over 2 years ago

  • Status changed from New to Closed

Patches committed, thanks again Martin for your work on this.

Actions

Also available in: Atom PDF