Searching the FOLKDJ-L Archive

One of our members reported that her playlists didn't come up when she searched the folkdj-l archives for artists that she'd played.  We figured out the problem, and it seemed useful to mention it to everybody, since it might affect how you format your playlists as well as how you use the search engine.
 
When you search on a "string" (i.e., a word or name), the folkdj-l search engine locates posts that contain *exactly* that word.  If you check the "Substring" box, it will also include posts where your string appears inside a longer word.  But if you *don't* check the Substring box, the only matches will be posts that have your string as a free-standing word.
 
For example, if you searched on the string "John", you'd match any playlists with John Denver or John Prine or John Lennon or, for that matter, Elton John.  If you checked the Substring box, you'd *also* match playlists with Robert Johnson, since the string "John" is a substring within "Johnson."
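
For those who like to see things spelled out, here is a rough Python sketch of the two kinds of search.  It is only a toy model of the behavior described above, not the archive's actual code.  The function names are made up, and it assumes that words are separated by spaces and that upper and lower case don't matter.

```python
def whole_word_match(query, text):
    # Toy model of a search with the Substring box unchecked.
    # Assumption: words are whatever sits between spaces, and case is ignored.
    return query.lower() in text.lower().split()


def substring_match(query, text):
    # Toy model of a search with the Substring box checked:
    # the query may appear anywhere, even inside a longer word.
    return query.lower() in text.lower()


for artist in ["John Denver", "Elton John", "Robert Johnson"]:
    print(artist, whole_word_match("John", artist), substring_match("John", artist))
```

In this sketch, "Robert Johnson" is the only one of the three names that fails the whole-word search but passes the substring search.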
 
Normally, the search engine seems to ignore punctuation when it looks for strings.  So commas and quote marks don't cause a problem.  If you search on "Train", for example, without checking the Substring box, you'll still match a playlist with this:

"Freight Train," Libba Cotten, LIVE (Arhoolie)

The search engine matches the string "Train" without worrying about the comma and quote mark that come right after it.  For that matter, parentheses are okay too, so a non-Substring search on "Arhoolie" would also match the above example.
 
So far, so good.  But the person who reported the problem uses slashes in her playlists, like this:
 
Libba Cotten/ Freight Train/ LIVE/ Arhoolie
 
Well, it turns out that the search engine thinks that "Train" and "Train/" are different words.  It doesn't ignore the slash, as it does commas and quote marks and parentheses.  So if you search on "Train" (or "Cotten") without checking the Substring box, that playlist would NOT show up as a match.  If you *do* check the Substring box, it *will* match, since "Train" is a substring of "Train/". 
 
Dashes are also a problem, in an entry like this:
 
--Woody Guthrie/ This Land Is Your Land
 
The non-Substring search wouldn't find either "Woody" (because of the dashes) or "Guthrie" (because of the slash). 
 
The solution is easy:  If you use punctuation like slashes or dashes, simply make sure that there's always a SPACE on BOTH SIDES of the punctuation mark.  The above example would successfully match non-Substring searches on both "Woody" and "Guthrie" if it were formatted like this:
 
-- Woody Guthrie / This Land Is Your Land
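
Here is a small Python sketch of why the spaces help.  It is a guess at a model that reproduces what we observed, not the search engine's real logic: it assumes the engine splits a line at spaces, ignores case, and trims commas, quote marks, and parentheses from each word, but leaves slashes and dashes attached.  The names IGNORED, words, and whole_word_match are invented for this example.

```python
# Assumption: these are the marks the search engine ignores.
IGNORED = ",\"'()"


def words(line):
    # Split at spaces, then trim the ignored marks from both ends of each piece.
    # Slashes and dashes are NOT trimmed, so "Guthrie/" stays "guthrie/".
    return [piece.strip(IGNORED).lower() for piece in line.split()]


def whole_word_match(query, line):
    # Toy model of a search with the Substring box unchecked.
    return query.lower() in words(line)


unspaced = "--Woody Guthrie/ This Land Is Your Land"
spaced = "-- Woody Guthrie / This Land Is Your Land"

for name in ["Woody", "Guthrie"]:
    print(name, whole_word_match(name, unspaced), whole_word_match(name, spaced))
```

Under these assumptions, neither name is found in the unspaced line, and both are found in the spaced one.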
 
You might ask, "Why not just always check the Substring box when you do a search?"  Well, the Substring search can be handy, but it can also lead to a far larger number of matches than you really want.  If you're searching on "Rush" (maybe looking for posts about Tom Rush) and check the Substring box, you'll also match posts with the words Paintbrush, Sagebrush, Goldrush, Thrushes, Brushfire, etc.
 
So, it's best if we all format playlists in a way that will reliably work with the non-Substring searches.  If you use slashes or dashes to separate parts of your playlist, please put a space on both sides of every mark.  If you're not sure whether your format is okay or not, try searching on a name in one of your own playlists.  If the Substring search finds it but the non-Substring search doesn't, then you have a problem and should modify your format accordingly.
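
If you build your playlists with a script, something like the following Python sketch could add the spaces for you.  The function name add_spaces is made up for this example.  It only touches slashes and runs of two or more dashes; single hyphens inside a word are left alone, which is an assumption, since we haven't tested how the search engine treats those.

```python
import re


def add_spaces(line):
    # Put exactly one space on each side of every slash
    # and every run of two or more dashes.
    spaced = re.sub(r"\s*(/|-{2,})\s*", r" \1 ", line)
    # Drop the extra space left at the start or end of the line.
    return spaced.strip()


print(add_spaces("--Woody Guthrie/ This Land Is Your Land"))
print(add_spaces("Libba Cotten/ Freight Train/ LIVE/ Arhoolie"))
```

A line that already has the spaces comes back unchanged, so it should be safe to run this on every line of a playlist.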

Bob Blackman, August 2008