The following problem came up during a file caching implementation: We've got a directory /var/www/MYCACHE, and our file caching mechanism uses a key-based directory structure to store files there. Suppose our key is 123456789: we'd like to store the file 123456789.cache under /var/www/MYCACHE/123/123456/123456789.cache. This makes sure that no directory needs to hold more than 1,000 nodes.
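To make the layout concrete, here is a minimal Java sketch of the key-to-path mapping described above. The class and method names (CachePath, forKey) are made up for illustration, and the sketch assumes keys are at least six characters long; it is not the code from the original implementation.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class CachePath {
    // Hypothetical helper: maps a key to
    // <root>/<first 3 chars>/<first 6 chars>/<key>.cache
    static Path forKey(String root, String key) {
        if (key.length() < 6) {
            // Assumption: shorter keys are not expected in this scheme.
            throw new IllegalArgumentException("key must have at least 6 characters");
        }
        return Paths.get(root, key.substring(0, 3), key.substring(0, 6), key + ".cache");
    }

    public static void main(String[] args) {
        // On a Unix-like system this prints the path from the example above.
        System.out.println(forKey("/var/www/MYCACHE", "123456789"));
    }
}
```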
All would be well if we could be sure that the user jrun (i.e. the user that owns our ColdFusion process) was indeed the only user ever to access this directory structure. In our case we want to be able to access this structure with PHP, too, which runs as mod_php on the webserver, thus as user www-data. To avoid permission problems, we want to assign a permission of 0777 to all directories in the structure upon creation.
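One way to get there from the Java side is sketched below, assuming a POSIX file system. Each missing directory is created one level at a time and then explicitly set to rwxrwxrwx, since the mode passed at creation time is usually cut down by the process umask. OpenDirs, create and mode are hypothetical names for this illustration, not part of ColdFusion or of the original solution.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

class OpenDirs {
    static final Set<PosixFilePermission> RWX_ALL =
            PosixFilePermissions.fromString("rwxrwxrwx");

    // Creates every missing directory between root and target (both must be
    // absolute or both relative) and sets each newly created one to 0777.
    static void create(Path root, Path target) {
        Path current = root;
        try {
            for (Path part : root.relativize(target)) {
                current = current.resolve(part);
                try {
                    Files.createDirectory(current);
                    // Explicit chmod: not affected by the umask.
                    Files.setPosixFilePermissions(current, RWX_ALL);
                } catch (FileAlreadyExistsException alreadyThere) {
                    // Created earlier, possibly by the other user; leave it alone.
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the permissions of dir in "rwxrwxrwx" notation.
    static String mode(Path dir) {
        try {
            return PosixFilePermissions.toString(Files.getPosixFilePermissions(dir));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Demo under the temp directory instead of /var/www/MYCACHE.
        Path root = Paths.get(System.getProperty("java.io.tmpdir"), "MYCACHE-demo");
        root.toFile().mkdirs();
        Path target = root.resolve("123").resolve("123456");
        create(root, target);
        System.out.println(target + " " + mode(target));
    }
}
```

Only directories created by the call are touched, because changing the mode of a directory owned by the other user would fail anyway.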
A quick follow-up on a previous post, ColdFusion UDF to get Unix timestamp from date: Here's a one-liner that provides you with the complementary function to get a date from a Unix timestamp - as I've discovered that the dateAdd() route mostly recommended on the net is not only quite clumsy, its result is off by one hour, too - at least when DST is on.
So to get a date from a Unix timestamp in ColdFusion, you can use this one-liner:
<cfset dtMyDate = createObject('java','java.util.Date').init(javaCast('long',iUnixTS*1000)) />
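For readers who want to see what the one-liner does underneath, here is roughly the same thing in plain Java. The class and method names (UnixTime, toDate) are invented for this sketch. The point is that java.util.Date counts milliseconds since the epoch in UTC, so no time zone or DST arithmetic is involved in the conversion itself.

```java
import java.util.Date;

class UnixTime {
    // Unix timestamps count seconds, java.util.Date counts milliseconds.
    // The 1000L forces the multiplication to happen in 64 bits.
    static Date toDate(long unixSeconds) {
        return new Date(unixSeconds * 1000L);
    }

    public static void main(String[] args) {
        // toInstant() prints the moment in UTC, independent of the local zone.
        System.out.println(toDate(1234567890L).toInstant()); // 2009-02-13T23:31:30Z
    }
}
```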
You may know from previous blog posts that I strongly advise every ColdFusion developer to familiarize himself/herself with the thing that actually makes ColdFusion tick, i.e. with Java. Everybody who writes a single line of CFML should know about the possibilities of extending ColdFusion by directly accessing the underlying Java methods of certain objects. One of the datatypes where actually using Java may make a lot of sense is the string object.
ColdFusion string literals are just plain old Java strings. If you grab a string from e.g. a query object like variables.qMyQuery.myTextColumn, you need to be careful though - even if you think you just have one tuple returned, you've got something other than a string object on your hands. In such a case you need to either specifically target a certain row (like variables.qMyQuery.myTextColumn[1]) or you wrap it up in a JavaCast like Javacast('string',variables.qMyQuery.myTextColumn).
I finally found a moment to actually do some benchmarking on some of the built-in ColdFusion functions against their Java counterparts. This is not a benchmark of Java vs. ColdFusion performance, mind you, it's about deciding whether to use Java-methods inside of ColdFusion vs. ColdFusion's built-in string functions.
Full text searching is and probably will be for a long time an interesting challenge for any database driven application. Of course ColdFusion already offers a couple of options, though I found most of them somewhat lacking in features or quite complicated to set up.
As we're running mostly on PostgreSQL as database backend, we used to rely solely on the built-in TSearch2 full text search methods of that database. But over the years we have accumulated so much data, some of which is nicely distributed over several tables (the dark side of normalization), that we were really yearning for a less table based and more document focused indexing mechanism - and more speed than TSearch2 could deliver.
Verity never really quite met all of our needs and was a real pain to set up and maintain. CF 9's Solr, which is based on Lucene, might be a mighty step forward, but we're still running on ColdFusion 8, so I really cannot say a lot about handling and performance of the new indexing beast.
For our use cases (i.e. indexing of articles and products in our CMS as well as our forums), Sphinx (for SQL Phrase Index) has shown some amazing results - and we've been using it for a couple of months now. In this article I'll show you how to compile, set up and use Sphinx in your ColdFusion application to retrieve search results from documents stored in a PostgreSQL or MySQL database.
Just a quick one: I have a method that takes a list argument; there is a discrete list of legal values for this list. I want to filter the passed argument list by throwing out all the values which are not contained in the list of legal values.
Of course I could use a nested loop to do this - but for longer lists this is neither fast nor elegant. Again I'll turn to Java for this. ColdFusion's arrays are in fact java.util.Lists, so after converting our ColdFusion lists to ColdFusion arrays, we can make use of the Java-API for lists.
Here's a quick UDF that does what I want:
<cffunction name="listIntersect" output="no" returntype="string"
    hint="returns values from list 1 which are contained in list 2">
    <cfargument name="lstSand" type="string" required="yes" />
    <cfargument name="lstSieve" type="string" required="yes" />
    <cfargument name="chDelimiter" type="string" required="no"
        default="," />
    <cfscript>
        var aLstSand = listToArray(arguments.lstSand,arguments.chDelimiter);
        var aLstSieve = listToArray(arguments.lstSieve,arguments.chDelimiter);
        // retainAll() is a java.util.List method: it removes every element
        // from aLstSand that is not contained in aLstSieve
        aLstSand.retainAll(aLstSieve);
        return arrayToList(aLstSand,arguments.chDelimiter);
    </cfscript>
</cffunction>
Usage:
<cfset lstSand = 'foo,bar,illegalparam,whatever' />
<cfset lstSieve = 'bar,foo,someotherval,whatever' />
<cfset lstSieved = listIntersect(lstSand,lstSieve) />
<cfoutput>#lstSieved#</cfoutput>
This will output foo,bar,whatever.
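For comparison, the same filtering written directly against the Java collections API might look like the sketch below. The names are illustrative only. It also assumes a single-string delimiter and uses String.split(), which does not treat empty elements or multi-character delimiters the way ColdFusion's listToArray() does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

class ListIntersect {
    // Keeps the values of sand that also occur in sieve, in sand's order.
    static String listIntersect(String sand, String sieve, String delimiter) {
        String splitOn = Pattern.quote(delimiter); // treat the delimiter literally
        List<String> kept = new ArrayList<>(Arrays.asList(sand.split(splitOn)));
        // A HashSet makes each membership check inside retainAll() cheap.
        Set<String> legal = new HashSet<>(Arrays.asList(sieve.split(splitOn)));
        kept.retainAll(legal);
        return String.join(delimiter, kept);
    }

    public static void main(String[] args) {
        System.out.println(listIntersect(
                "foo,bar,illegalparam,whatever",
                "bar,foo,someotherval,whatever",
                ",")); // foo,bar,whatever
    }
}
```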
On January 19th 2038 I'll be 63 years, 9 months and 23 days old. So unfortunately there are still a couple of days until I can think about retirement. What's wrong with this date?
The Unix timestamp of 2038-01-19 03:14:07 UTC is 2147483647. This is the largest value that fits into a signed 32-bit integer (the int4 data type). One second later, any operation that handles Unix timestamps as 32-bit integers will overflow, for example getting the actual date from that Unix timestamp via dateAdd() in ColdFusion.
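The wrap-around is easy to reproduce in Java, whose int is a signed 32-bit integer just like int4. The following is only an illustration of the arithmetic with invented names; it says nothing about how dateAdd() is implemented internally.

```java
import java.time.Instant;

class Year2038 {
    // Adds one second using 32-bit arithmetic (wraps at the maximum).
    static int nextSecond32(int unixSeconds) {
        return unixSeconds + 1;
    }

    // Adds one second using 64-bit arithmetic (no wrap in this range).
    static long nextSecond64(long unixSeconds) {
        return unixSeconds + 1L;
    }

    public static void main(String[] args) {
        int last = Integer.MAX_VALUE; // 2147483647
        System.out.println(Instant.ofEpochSecond(last));               // 2038-01-19T03:14:07Z
        System.out.println(nextSecond32(last));                        // -2147483648
        System.out.println(Instant.ofEpochSecond(nextSecond32(last))); // 1901-12-13T20:45:52Z
        System.out.println(Instant.ofEpochSecond(nextSecond64(last))); // 2038-01-19T03:14:08Z
    }
}
```

Keeping the timestamp in a 64-bit long avoids the problem.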
I recently started implementing a couple of our full text search requirements using Sphinx. I am extremely happy with this search engine, as it's lightning fast and provides some quite easy integration with the data we store in our PostgreSQL databases, is highly scalable and fairly easy to implement in ColdFusion via the Sphinx Client API.
Last night I came across an article about regular expression performance tuning. Since we use regex extensively to parse article & community content, I decided to see whether we can do something to boost performance on that side. So, here we go.