devbox@COMPUTEC The Computec development blog

4 Feb/10

A little ReReplace CFConfusion (solved)

The following two snippets differ only in the iSomeVal string, which is appended to a regex replacement string. In the first it starts with a letter ('fortytwo'), in the second with a digit ('42'). The first example works fine, whereas the second is missing something quite essential:

<cfscript>
 variables.strSource = 'The answer is: Hmm.';
 variables.iSomeVal = 'fortytwo';
 variables.strTarget = ReReplace(variables.strSource,'^(.*:\s).*$','\1'&variables.iSomeVal,'ALL');
 writeOutput('~~' & variables.strTarget & '~~');
</cfscript>

Output: ~~The answer is: fortytwo~~

<cfscript>
 variables.strSource = 'The answer is: Hmm.';
 variables.iSomeVal = '42';
 variables.strTarget = ReReplace(variables.strSource,'^(.*:\s).*$','\1'&variables.iSomeVal,'ALL');
 writeOutput('~~' & variables.strTarget & '~~');
</cfscript>

Output: ~~The answer is:~~

WTF?

A few brain processor cycles later you realize what's going on: the replacement string now reads '\142', and there's no back reference with that number.

So we need a little ugly trick, because backslash-escaping won't help us here either. We'll use the \u operator, which normally uppercases the following character. In our case that character is a digit, so it does no harm:

<cfscript>
 variables.strSource = 'The answer is: Hmm.';
 variables.iSomeVal = '42';
 variables.strTarget = ReReplace(variables.strSource,'^(.*:\s).*$','\1\u'&variables.iSomeVal,'ALL');
 writeOutput('~~' & variables.strTarget & '~~');
</cfscript>

Output: ~~The answer is: 42~~

Now in this case we're lucky. Any suggestions on what could be done if we don't know whether the variable part after the first back reference begins with a letter or a digit? With \u in place, a leading letter would be uppercased. I don't want to hack around it with a separating space or something similar. That could of course be removed in a second pass, but it doesn't seem elegant... Feels like I'm missing something extremely obvious here...
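One way to sidestep the ambiguity is to keep the variable out of the replacement string altogether: capture the prefix, then concatenate it with the value in ordinary code, so no digit can ever follow a back reference. Below is a minimal sketch in Java (the platform ColdFusion runs on), not a ColdFusion-native solution. The class name PrefixReplace and the method replaceAfterPrefix are made up for illustration, and returning the source unchanged when nothing matches is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PrefixReplace {
    // Same pattern as in the post: group 1 is everything up to and including ": ".
    private static final Pattern PREFIX = Pattern.compile("^(.*:\\s).*$");

    // Hypothetical helper: keeps the captured prefix and appends the value verbatim.
    static String replaceAfterPrefix(String source, String value) {
        Matcher m = PREFIX.matcher(source);
        if (!m.matches()) {
            return source; // assumption: leave the input untouched when nothing matches
        }
        // Plain string concatenation, so the value is never parsed as part of a group reference.
        return m.group(1) + value;
    }

    public static void main(String[] args) {
        System.out.println(replaceAfterPrefix("The answer is: Hmm.", "42"));
        System.out.println(replaceAfterPrefix("The answer is: Hmm.", "fortytwo"));
    }
}
```

Because the value never passes through the regex engine, it makes no difference whether it starts with a letter, a digit, or a special character.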

Update: It seems I'm not missing anything ColdFusion-wise. There's a regex feature described on regular-expressions.info as '$10 through $99 treated as $1 through $9 (and a literal digit) if fewer than 10 groups'. The site gives no clue as to how ColdFusion implements this, but judging from my example code, I'd say it's fair to assume that CF does not deliver the desired result in this category. But all is not lost: we're running on top of Java, a fact I can't celebrate often enough under such circumstances. For behold:

<cfscript>
 variables.objRegex = createObject('component','JavaRegExp');
 variables.strSource = 'The answer is: Hmm.';
 variables.iSomeVal = '42';
 variables.strTarget = variables.objRegex.regExpReplace('^(.*:\s).*$',variables.strSource,'$1'&variables.iSomeVal,true);
 writeOutput('~~' & variables.strTarget & '~~');
</cfscript>

Output: ~~The answer is: 42~~

Yay! This snippet doesn't use ReReplace but the Java RegEx Component from massimocorner.com, which I mentioned in an earlier post, 'UDF to strip certain chars, but leave UBB tags alone'.
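For readers without that component, the same behaviour can be reproduced with plain java.util.regex. The sketch below is not the component's implementation. It only illustrates the rule Java documents for replacement strings: the first digit after $ always belongs to the group number, and further digits are added only while they still form a valid group number. The class name DigitSafeReplace and the method replaceValue are hypothetical. Matcher.quoteReplacement is an extra precaution not discussed in the post, so that a $ or backslash inside the value is taken literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DigitSafeReplace {
    // Hypothetical helper: replaces everything after the "...: " prefix with the given value.
    static String replaceValue(String source, String value) {
        // quoteReplacement escapes '$' and '\' in the value so they are not parsed as group references.
        String replacement = "$1" + Matcher.quoteReplacement(value);
        // The pattern has only one group, so "$142" is read as group 1 followed by the literal "42".
        return Pattern.compile("^(.*:\\s).*$").matcher(source).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String source = "The answer is: Hmm.";
        System.out.println(replaceValue(source, "42"));       // The answer is: 42
        System.out.println(replaceValue(source, "fortytwo")); // The answer is: fortytwo
        System.out.println(replaceValue(source, "$42"));      // The answer is: $42
    }
}
```

From CFScript, the same classes should be reachable through createObject('java', ...), though that route is not shown here.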
