One of the problems with using the Unix/Linux fortune (aka fortune-mod) command in web pages is making its output readable in HTML. You can get something that mostly works by escaping the HTML special characters (&, <, >, and the double quote) as entities and converting each linefeed to <br />.
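As a rough sketch of that simple approach, PHP's built-in htmlspecialchars() and nl2br() cover the entity escaping and the linefeed conversion. The function name fortune_to_html_naive is made up for illustration, and treating the fortune text as UTF-8 is an assumption.

```php
<?php
// Hypothetical helper for illustration only.
function fortune_to_html_naive(string $s): string {
    // htmlspecialchars() with ENT_COMPAT escapes &, <, > and the double quote.
    // ENT_SUBSTITUTE swaps invalid byte sequences for U+FFFD rather than
    // returning an empty string (assumes the text is meant to be UTF-8).
    $escaped = htmlspecialchars($s, ENT_COMPAT | ENT_SUBSTITUTE, 'UTF-8');
    // nl2br() inserts <br /> before each newline and keeps the newline itself.
    return nl2br($escaped);
}

echo fortune_to_html_naive("Q: Is 1 < 2 & 2 > 1?\nA: \"Yes\"\n");
```

Backspace characters pass straight through this version untouched.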
However, you will still get a lot of fortunes with unprintable characters whose original intent is lost. Many of them used 'backspace hacks' to produce underlines and accent marks and, in really old fortune databases, to strike out text and replace it with something more amusing.
Here is a function that should make almost all of the fortunes that rely on these ASCII tricks display in web pages mostly as originally intended.
<?php
function fortune_to_html($s) {
    // First pass - escape all the HTML entities, and while we're at it
    // get rid of any MS-DOS end-of-line characters and expand tabs to
    // 8 non-breaking spaces, and translate linefeeds to <br />.
    // We also get rid of ^G which used to sound the terminal beep or bell
    // on ASCII terminals and were humorous in some fortunes.
    // We could map these to autoplay a short sound file but browser support
    // is still sketchy and then there's the issue of where to locate the
    // URL, and a lot of people find autoplay sounds downright annoying.
    // So for now, just remove them.
    $s = str_replace(
        array("&",
              "<",
              ">",
              '"',
              "\007",
              "\t",
              "\r",
              "\n"),
        array("&amp;",
              "&lt;",
              "&gt;",
              "&quot;",
              "",
              "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;",
              "",
              "<br />"),
        $s);
    // Replace pseudo diacritics
    // These were used to produce accented characters. For instance an accented
    // e would have been encoded by '^He - the backspace moving the cursor
    // backward so both the single quote and the e would appear in the same
    // character position. Umlauts were quite clever - they used a double quote
    // as the accent mark over a normal character.
    // Note: the first pass has already turned every double quote into
    // &quot; so the umlaut pattern has to match the entity, not the raw quote.
    $s = preg_replace("/'\010([a-zA-Z])/","&\\1acute;",$s);
    $s = preg_replace("/&quot;\010([a-zA-Z])/","&\\1uml;",$s);
    $s = preg_replace("/\`\010([a-zA-Z])/","&\\1grave;",$s);
    $s = preg_replace("/\^\010([a-zA-Z])/","&\\1circ;",$s);
    $s = preg_replace("/\~\010([a-zA-Z])/","&\\1tilde;",$s);
    // Ignore multiple underlines for the same character. These were
    // most useful when sent to a line printer back in the day as it
    // would type over the same character a number of times making it
    // much darker (e.g. bold). I think there are only one or two
    // instances of this in the current (2008) fortune cookie database.
    $s = preg_replace("/(_\010)+/","_\010",$s);
    // Map the characters which sit underneath a backspace.
    // If you can come up with a regex to do all of the following
    // madness - be my guest.
    // It's not as simple as you think. We need to take something
    // that has been backspaced over an arbitrary number of times
    // and wrap a forward looking matching number of characters in
    // HTML, whilst deciding if it's intended as an underline or
    // strikeout sequence.
    // Essentially we produce a string of '1' and '0' characters
    // the same length as the source text.
    // Any position which is marked '1' has been backspaced over.
    $cursor = 0;
    $dst = $s;
    $bs_found = false;
    for($x = 0; $x < strlen($s); $x ++) {
        if($s[$x] == "\010" && $cursor) {
            // A backspace: step the cursor back and mark the position
            // it lands on as overstruck.
            $bs_found = true;
            $cursor --;
            $dst[$cursor] = '1';
            $dst[$x] = '0';
            continue;
        }
        else {
            if($bs_found) {
                // First character after a run of backspaces:
                // bring the cursor back in line with the read position.
                $bs_found = false;
                $cursor = $x;
            }
            $dst[$cursor] = '0';
            $cursor ++;
        }
    }
    $out = '';
    $strike = false;
    $bold = false;
    // Underline sequence, convert to bold to avoid confusion with links.
    // These were generally used for emphasis so it's a reasonable choice.
    // Please note that this logic will fail if there is an underline sequence
    // and also a strikeout sequence in the same fortune.
    if(strstr($s,"_\010")) {
        $len = 0;
        for($x = 0; $x < strlen($s); $x ++) {
            if($dst[$x] == '1') {
                $len ++;
                $bold = true;
            }
            else {
                if($bold) {
                    $out .= '<strong>';
                    while($s[$x] == "\010")
                        $x ++;
                    $out .= substr($s,$x,$len);
                    $out .= '</strong>';
                    $x = $x + $len - 1;
                    $len = 0;
                    $bold = false;
                }
                else
                    $out .= $s[$x];
            }
        }
    }
    // These aren't seen very often these days - simulation of
    // backspace/replace. You could occasionally see the original text
    // on slower terminals before it got replaced. Once modems reached
    // 4800/9600 baud in the late 70's and early 80's the effect was
    // mostly lost - but if you find a really old fortune file you might
    // encounter a few of these.
    else {
        for($x = 0; $x < strlen($s); $x ++) {
            if($dst[$x] == '1') {
                if($strike)
                    $out .= $s[$x];
                else
                    $out .= '<strike>'.$s[$x];
                $strike = true;
            }
            else {
                if($strike)
                    $out .= '</strike>';
                $strike = false;
                $out .= $s[$x];
            }
        }
    }
    // Many of the underline sequences are also wrapped in asterisks,
    // which was yet another way of marking ASCII as 'bold'.
    // So if it's an underline sequence, and there are asterisks
    // on both ends, strip the asterisks as we've already emboldened the text.
    $out = preg_replace('/\*(<strong>[^<]*<\/strong>)\*/',"\\1",$out);
    // Finally, remove the backspace characters which we don't need anymore.
    return str_replace("\010","",$out);
}
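The bookkeeping in the middle of that function is easier to follow in isolation. The sketch below pulls just the mask-building loop out into a hypothetical helper, overstrike_mask(). The name is made up for illustration and is not part of the function above. It returns the string of '1' and '0' characters directly instead of overwriting a copy of the input.

```php
<?php
// Hypothetical helper for illustration only.
// Returns a mask the same length as $s: '1' at every position that a later
// backspace (\010) moves the cursor back over, '0' everywhere else.
function overstrike_mask(string $s): string {
    $mask = str_repeat('0', strlen($s));
    $cursor = 0;        // index the cursor would write to next
    $bs_found = false;  // true while inside a run of backspaces
    for ($x = 0; $x < strlen($s); $x++) {
        if ($s[$x] === "\010" && $cursor > 0) {
            // Step back one position and mark it as overstruck.
            $bs_found = true;
            $cursor--;
            $mask[$cursor] = '1';
        } else {
            if ($bs_found) {
                // First character after the backspaces:
                // resynchronise the cursor with the read position.
                $bs_found = false;
                $cursor = $x;
            }
            $cursor++;
        }
    }
    return $mask;
}

// Three underscores, three backspaces, then the word they underline.
echo overstrike_mask("___\010\010\010foo"), "\n"; // 111000000
```

The three leading '1' characters tell the later pass how many of the characters that follow the backspaces belong inside the <strong> or <strike> wrapper.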
Some interesting available domains for today, courtesy of NameThingy
UseArea.com
AbstractDocument.com
UseAnt.com
NiceEffect.com
RealCriminal.com
OnePiano.com
ReservedMan.com
UseLamp.com
ExoticOrange.com
WideModel.com
LessVirus.com
RapLady.com
LonelyWeek.com
WeakPresident.com
TopShadow.com
BestRockers.com
KriZit.com
BodyClaim.com
OldCircle.com
StuckCan.com
RegularBurger.com
YoungHam.com
RadioactiveHeat.com
DoctorIssue.com
PredatorAnimal.com
WarSunday.com
FriendlyTuna.com
OneMaiden.com
FunnyDrug.com
RoundChin.com
BetaApple.com
BaySummer.com
LowSquare.com
While doing some data analysis on NameThingy, I came across something interesting.
The boys' and girls' names therein were taken mostly from recent US census data (and adapted, modified, and otherwise mangled for my own use).
What I found interesting was that once a particular name has gotten some bad press, that can keep it out of use for centuries. Just think, when was the last time you met somebody named:
Cain
Goliath
Judas
Hansel
Gretel
Benedict
Napoleon
Adolf
The first Tuesday in November. Everybody remembers what's important about that, right?
Right. Melbourne Cup Day. The entire nation comes to a screeching halt for a five minute horse race.
Oh yeah, there's that little presidential election in America, which usually lands on the same Tuesday (strictly, it is the Tuesday after the first Monday in November), except that by then it's actually Wednesday, Sydney time.
other causes combined."
-- Fred Brooks, Jr., _The Mythical Man Month_

cheers les