Feature #35037

Make wiki text section extraction less fragile

Added by Martin Cizek over 1 year ago.

Status: New
Start date:
Priority: Normal
Due date:
Assignee: -
% Done:

Category: Text formatting
Target version: -


The current approach to per-section text editing is inherently fragile:
  • Section links are generated from the rendered HTML headings (before macro injection).
  • Section extraction from the markup source relies on simplified regexp-based parsing of the markup, independent of the fully featured markup parser.

The issue will become more significant once the restrictions on markup syntax are relaxed (#32424, #35035).
But as #35036 shows, even the current restrictions do not guarantee correct section extraction. See also the example below.

I can imagine two approaches to the solution:
  1. For CommonMark: use the sourcepos feature of the renderer. This is also applicable to any other formatter that offers a similar feature.
  2. For any markup: introduce validation of the section extraction results, which would detect when extraction fails and disable the per-section edit links.
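To illustrate the validation idea (approach 2), here is a minimal Python sketch rather than Redmine code. It assumes the rendered HTML is available as a string and compares the headings found by a deliberately simplified regexp scan of the source with the headings actually present in the HTML. All names (`naive_source_headings`, `rendered_headings`, `section_edit_links_safe`) are hypothetical, and the regexps are only a stand-in for the real extraction logic.

```python
import re
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

# Deliberately simplified patterns, standing in for regexp-based extraction.
ATX = re.compile(r"^(#{1,6})\s+(.*?)\s*#*\s*$")
SETEXT = re.compile(r"^(=+|-+)\s*$")


class HeadingCollector(HTMLParser):
    """Collects (level, text) for every heading in rendered HTML."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None


def rendered_headings(html):
    """Headings as the full renderer produced them."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return collector.headings


def naive_source_headings(source):
    """Headings as a line-based regexp scan of the source sees them."""
    headings = []
    lines = source.splitlines()
    for i, line in enumerate(lines):
        atx = ATX.match(line)
        if atx:
            headings.append((len(atx.group(1)), atx.group(2)))
        elif i > 0 and lines[i - 1].strip() and SETEXT.match(line):
            if not ATX.match(lines[i - 1]):
                level = 1 if line.startswith("=") else 2
                headings.append((level, lines[i - 1].strip()))
    return headings


def section_edit_links_safe(source, html):
    """True only if both views agree on the sequence of headings."""
    return naive_source_headings(source) == rendered_headings(html)
```

If the two heading lists differ, the per-section edit links would simply not be shown for that page. Comparing both level and text is a design choice in this sketch; comparing only the heading count would be cheaper but would miss some mismatches.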

I can offer to create a PoC of the sourcepos approach for the CommonMark format once #32424 is incorporated.
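As a rough illustration of what the sourcepos approach could look like, the following Python sketch (not the proposed PoC, and not Redmine code) assumes the renderer annotates block elements with a `data-sourcepos="startline:startcol-endline:endcol"` attribute, as cmark-style renderers do when source positions are enabled. The function and class names are hypothetical. The section boundaries are taken from the heading positions reported by the renderer, so the source is never re-parsed.

```python
import re
from html.parser import HTMLParser

# Assumed attribute format: "startline:startcol-endline:endcol" (1-based).
SOURCEPOS = re.compile(r"^(\d+):(\d+)-(\d+):(\d+)$")


class SourceposHeadings(HTMLParser):
    """Collects (level, start_line) for headings carrying data-sourcepos."""

    def __init__(self):
        super().__init__()
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if re.fullmatch(r"h[1-6]", tag):
            pos = dict(attrs).get("data-sourcepos")
            match = SOURCEPOS.match(pos or "")
            if match:
                self.headings.append((int(tag[1]), int(match.group(1))))


def extract_section(source, html, index):
    """Return the source text of the section opened by the index-th heading.

    The section runs from the heading's first line up to, but not
    including, the next heading of the same or a higher level.
    """
    parser = SourceposHeadings()
    parser.feed(html)
    parser.close()
    level, start = parser.headings[index]
    lines = source.splitlines(keepends=True)
    end = len(lines) + 1  # default: section extends to the end of the text
    for next_level, next_start in parser.headings[index + 1:]:
        if next_level <= level:
            end = next_start
            break
    return "".join(lines[start - 1:end - 1])
```

Because the heading positions come from the same parser that produced the HTML, the edit links and the extracted text cannot disagree about where a section starts, which is the property the regexp-based extraction lacks.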

A difficult-to-solve example of Markdown with broken section extraction follows (copied from the skipped unit test in #35036):

# Title

## Heading 2

- item
not a heading

## Heading 2
Nulla nunc nisi, egestas in ornare vel, posuere ac libero.

Related issues

Related to Redmine - Defect #35036: Markdown text sections broken by thematic breaks (horizon... Closed


#1 Updated by Go MAEDA over 1 year ago

  • Related to Defect #35036: Markdown text sections broken by thematic breaks (horizontal rules) added
