Needs review
Project:
Pathauto
Version:
8.x-1.x-dev
Component:
Code
Priority:
Normal
Category:
Bug report
Assigned:
Reporter:
Created:
6 Jun 2019 at 08:13 UTC
Updated:
18 Feb 2026 at 08:49 UTC
Comments
Comment #2
japor commented
Comment #3
oknate
This is causing near-constant warnings:
Comment #4
larvymortera commented
Fixed constant warnings.
Comment #5
shubham.prakash commented
Hope this patch fixes the issue.
Comment #9
mably commented
Problem
When punctuation characters are set to "Do nothing" (kept as-is in aliases), the ignored words regex used \b as the word boundary, which treats those punctuation characters as boundaries.
This caused ignored words adjacent to kept punctuation to be incorrectly stripped (e.g., "a.b" with "a" ignored and "." kept would become ".b").
Fix applied to AliasCleaner::cleanString():
Track the kept punctuation characters in $kept_punctuation, escaped using preg_quote()
Build a custom word boundary ($wb) using lookaround assertions that treat kept punctuation as part of words, wrapped in a non-capturing group to avoid conflicts with the | alternation in the ignored words regex
Use preg_replace instead of mb_eregi_replace when kept punctuation exists, since POSIX ERE doesn't support lookarounds
Comment #10
mably commented
Code review of MR #127
The commit fixes the issue where punctuation characters configured to "Do nothing" (kept in aliases) were incorrectly treated as word boundaries by the ignored words regex. For example, "a.b" with "a" as an ignored word and "." kept would become ".b" because \b treats punctuation as a word boundary.
Changes
AliasCleaner::cleanString(): The fix tracks which punctuation characters are kept as-is using preg_quote(). When kept punctuation exists, the standard \b word boundary is replaced with a custom lookaround pattern that treats kept punctuation as part of words rather than boundaries: (?:(?<![\w...])(?=[\w...])|(?<=[\w...])(?![\w...])). Since POSIX ERE does not support lookarounds, the fix also forces preg_replace instead of mb_eregi_replace when custom boundaries are needed, and adds the /u (unicode) modifier.
Kernel test: testIgnoredWordsWithKeptPunctuation covers the key scenarios.
The fix is correct and well-targeted. The custom word boundary pattern only activates when there are kept punctuation characters, so the default behavior with \b and mb_eregi_replace is preserved when no punctuation is kept.
Comment #11
mably commented
Comment #12
anybody
This looks closely related to, or even overlapping with, #3311669: Punctuation processed before replacing strings
Comment #13
mably commented
It's related but quite different in fact. AFAIUI at least ;)
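For readers following the thread, the custom word boundary described in comments #9 and #10 can be illustrated with a small standalone sketch. This is not the code from MR #127: the function name strip_ignored_words() and its parameters are hypothetical, and it only assumes the behavior stated above (kept punctuation is escaped with preg_quote() and counted as part of a word).

```php
<?php

/**
 * Hypothetical helper, not Pathauto's actual implementation.
 *
 * Removes ignored words from $string. Characters listed in
 * $kept_punctuation are treated as part of words, not as boundaries.
 */
function strip_ignored_words(string $string, array $ignore_words, array $kept_punctuation): string {
  // Escape each ignored word for use inside a "/"-delimited pattern.
  $words = implode('|', array_map(fn($w) => preg_quote($w, '/'), $ignore_words));

  if ($kept_punctuation) {
    // Character class: word characters plus the escaped kept punctuation.
    $escaped = implode('', array_map(fn($c) => preg_quote($c, '/'), $kept_punctuation));
    $class = '[\w' . $escaped . ']';
    // A boundary is a position where exactly one side is in the class.
    // The non-capturing group keeps the "|" from leaking into $words.
    $wb = "(?:(?<!$class)(?=$class)|(?<=$class)(?!$class))";
  }
  else {
    // No kept punctuation: the standard word boundary is enough.
    $wb = '\b';
  }

  // Fall back to the input if the regex engine reports an error (null).
  return preg_replace("/$wb(?:$words)$wb/iu", '', $string) ?? $string;
}

// With "." kept, "a" is no longer a standalone word in "a.b".
echo strip_ignored_words('a.b', ['a'], ['.']), "\n"; // prints "a.b"
// Without kept punctuation, \b sees a boundary at "." and strips "a".
echo strip_ignored_words('a.b', ['a'], []), "\n";    // prints ".b"
```

The second call reproduces the behavior reported in this issue; the first shows the result the lookaround-based boundary is meant to give.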