I've got a bit more testing to do, but the discussion forum feature should be landing soon. Weblogs are more or less one-way communication (though some folks have quite a bit of communication through the comments). Forums or Rooms are designed to foster two-way communication over one-way ranting. You can call it whatever you like. I chose rooms because I'm already using that metaphor in the chat mechanism (which incidentally was re-architected a few days ago).
A room is nothing more than a message filter, just like categories. In fact, most everything on a modern website reduces to nothing more than a message filter. Hmmm.
In this case, the only difference from a weblog is that multiple people can write entries in a forum or room. If you had multiple authors capability on a weblog, I suppose you could call it a forum. Hmmm. I'm getting some ideas. People like me spend a lot of time implementing weblogs, forums, comments, email, etc. Just like I did with email many years ago, I've come to realize that this whole issue boils down to a (single) categorization function. They're all messages. All of the different features of modern systems can be reduced to finding different ways of choosing which messages to look at. Most of the time, this is hard-wired into the website design. Folders, rooms, forums, blogs - they're merely hardwired collections of messages. Why should they be hard-wired? Why couldn't one have a logical view of a website? Virtual dynamic document collections. Hmmm. Deja vu. I've done this all before.
Why would anybody care?, you might ask. Well imagine this. Right now you can find this particular message filed in various ways. Messages by Mike, messages in the category 'software', messages written in March, 2006, etc. All of these categorizations are performed on the website. What if this categorization was under your control? You might choose to categorize it differently. Messages about messages, for instance. Messages written by Mike on a Thursday. Messages that weren't addressed to a particular person. Messages that contain both the words 'kinky' and 'sex'. And what if the entire website was your collections of messages instead of my collections of messages? Now do you see what I mean? The entire concept is highly addicting.
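To make the "everything is a message filter" idea concrete, here is a rough sketch in Python. It is not the code behind this site; the `Message` fields, the filter helpers, and the sample data are all made up for illustration. The point is only that a category, a room, or "Mike's Thursdays" are the same operation with a different predicate.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Message:
    # Hypothetical message record; the field names are assumptions.
    author: str
    category: str
    written: date
    recipient: Optional[str]  # None means not addressed to anyone in particular
    body: str


Filter = Callable[[Message], bool]


def view(messages: Iterable[Message], *filters: Filter) -> List[Message]:
    """A 'virtual collection': the messages that pass every filter."""
    return [m for m in messages if all(f(m) for f in filters)]


def by_author(name: str) -> Filter:
    return lambda m: m.author == name


def on_weekday(day: int) -> Filter:
    # date.weekday() numbers Monday as 0, so Thursday is 3.
    return lambda m: m.written.weekday() == day


def contains_all(*words: str) -> Filter:
    return lambda m: all(w.lower() in m.body.lower() for w in words)


def unaddressed(m: Message) -> bool:
    return m.recipient is None


MESSAGES = [
    Message("mike", "software", date(2006, 3, 23), None, "Rooms are just message filters."),
    Message("mike", "software", date(2006, 3, 21), "john", "Chat rewrite is done."),
    Message("john", "music", date(2006, 3, 23), None, "New guitar strings."),
]

THURSDAY = 3
mikes_thursdays = view(MESSAGES, by_author("mike"), on_weekday(THURSDAY))
```

The website's hard-wired folders and categories would just be a few of these filters chosen in advance. Handing the choice of filters to the reader is the whole idea.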
Found this cute little package on the net. It's a z-80 emulator running under a CP/M emulator running on Linux. You can use dd to dump your old disks and have a working CP/M environment. I'm not sure any of my CP/M disks are readable anymore. But it's quite a time warp to find yourself staring at an A> prompt and be able to drop into a z-80 monitor.
Woohoo! Check this out...
A>mbasic
BASIC-80 Rev. 5.21
[CP/M Version]
Copyright 1977-1981 (C) by Microsoft
Created: 28-Jul-81
34872 Bytes free
Ok
For the uninitiated, the z-80 was a microprocessor manufactured by a company called Zilog back in the late '70s and early '80s. It was a lot like the Intel 8080, but much, much better. [It turns out that the z-80 is still being manufactured; details at zilog.com.] CP/M is 'Control Program for Microcomputers', written by Gary Kildall at Digital Research - one of the most successful of the early operating systems. It is no coincidence that Microsoft's IBM DOS version 1 looked a whole lot like CP/M.
Hmm, maybe I can bring back the old Crossroads Bar and Grill...
Hey, computer, would ya' make me some coffee... By the way, how many people called today? Any private mail? Who's it from? Read it, please. Thanks...
Microsoft's new operating system 'Vista' has been re-scheduled to January 2007 instead of Q3-2006.
Microsoft said Vista is delayed because it wants to improve overall quality, particularly in security, and that PC makers didn't want the operating system introduced in the middle of holiday sales, because a new version would create instability in the market.
Talk about creating instability in the market. What this does is guarantee that there will be no PC's given as Xmas gifts this year. Why would you give somebody a rather pricey gift that will need to be replaced next month?
But every silver lining has a cloud... if you're looking for a hot new Linux server, the PC makers will be dumping their old stock at Xmas time at ridiculous discounts both to make way for the Vista boxes and to boost the anemic sales.
Just don't try and run Vista on one of these puppies. I've heard through the grapevine that this is going to be the most bloated Windows yet. You're going to need a new 20GHz PC with 100 gigs of RAM and a few zotabytes of hard disk space if you want to see Vista work...
Now that the buglist is getting manageable, perhaps it's time to tackle another software feature or three. Lesseee, what's on the todo/wish list? Personal feed collections. Messages (public and private). And I've still got a bunch of work to do on the XML-RPC interface. Or perhaps I'll take a break for a few days.
Let's put this in perspective - I took a single user weblog running on a flat filesystem and converted it to a multi-user portal with logins and session management running on an SQL backend with avatars and chat and tag clouds with a new editor and an extensible plugin architecture and code-free user-defined templates...
...in about six weeks.
Got hit by a mountain of comment spam last night. In this case I'm certain the content was coming from compromised sites as there were several hundred duplicate entries all coming from individual machines located across the planet and all at the same time.
Hackers have it easy these days. Modern tools give them a point-n-click interface into a bank of compromised sites. Send spam from all drones. Launch virus from all drones. Or you can select specific drones.
So you might find it strange that I'm not cursing the spammers. Sure, it's an annoyance. But this gives me a real-world test of my anti-spam tools instead of simply theoretical scenarios. None of it got through because comments on this site are moderated anyway - but I was able to uncover some patterns that I will put to use so that ultimately, more of the junk will be rejected outright and won't even show up in the moderation queue.
I'm less than happy with a few aspects of the chat module. I'll need to rewrite it. But none of those concerns prevent it from being used. It's functional. Companies like Microsoft have gotten extremely wealthy selling basically functional software that needed to be re-written - so I'm not embarrassed.
If you want to create a private room, you do it in Unix fashion - start the name with a period. That's it. Those are the rules.
So it turned out to be easy enough to build an AJAX chat module that I've gone ahead and built one. I'll plug it in once I've finished a couple more features and tested it all. The basic chat works fine right now. But chat isn't much fun without rooms - so I made sure it would support multiple rooms. Right now it'll chat in multiple rooms, but the missing ingredient is room presence - to answer the question "how many people are here right now?". That brings up the issue of private rooms, because some conversations are best done behind closed doors. That's an issue with room presence because if you've got room presence, you've got to know which rooms not to show. Yada, yada, yada. Feature creep. Maybe I should just open one public room and be done with it, but that's hardly intellectually inspiring.
After some long thought about the systems issues, I think I've got a novel concept. Since there are to be multiple rooms, who gets to create a room? Just the admin? Logged in users? On most sites you have to suggest a new room name and wait a few days/weeks or else they are created for you based on marketing research.
As it turns out, in my implementation as it exists right now a room is nothing more than a tag on a message. Maybe that's all it should be. You want a room called 'Lesbian Buddhist secret agents from Norway'? Fine by me. All you have to do is go to a room by that name - and that act alone makes it exist. Rooms cease to exist when all their messages expire. This is actually pretty cool from an administration viewpoint. Rooms appear magically as they're needed, and they vanish when they stop being used.
Zero maintenance.
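For the curious, here's roughly what "a room is just a tag on a message" looks like, sketched in Python rather than lifted from the actual chat module. The `Chat` class, its method names, and the one-hour expiry are all assumptions for illustration. Rooms are never created or deleted; the room list is simply computed from whatever unexpired messages exist, and names starting with a period are left out of the public listing.

```python
import time
from typing import List, Optional, Tuple


class Chat:
    """Hypothetical chat store where a room exists only while it has live messages."""

    def __init__(self, ttl_seconds: float = 3600.0) -> None:
        self.ttl = ttl_seconds
        # Each message is (timestamp, room, author, text); the room is just a tag.
        self.messages: List[Tuple[float, str, str, str]] = []

    def post(self, room: str, author: str, text: str, now: Optional[float] = None) -> None:
        # Posting to a room name is all it takes to make that room exist.
        stamp = time.time() if now is None else now
        self.messages.append((stamp, room, author, text))

    def expire(self, now: Optional[float] = None) -> None:
        # Drop messages older than the time-to-live.
        now = time.time() if now is None else now
        self.messages = [m for m in self.messages if now - m[0] < self.ttl]

    def rooms(self, now: Optional[float] = None, include_private: bool = False) -> List[str]:
        # Derive the room list from surviving messages; hide dot-rooms by default.
        self.expire(now)
        names = {m[1] for m in self.messages}
        return sorted(n for n in names if include_private or not n.startswith("."))


chat = Chat(ttl_seconds=60)
chat.post("lobby", "mike", "anybody home?", now=0)
chat.post(".backroom", "mike", "psst", now=30)
print(chat.rooms(now=40))  # the private room is not listed
```

Nothing here answers "how many people are here right now?". That would need a record of who polled which room recently, which is the part still missing.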
It's probably more of an academic exercise than anything else. There aren't enough people that actually hang out at the BADDCAFE to be much use here. Then again, I wasn't expecting 60,000 visitors the other day, and social communications ware is the kind of software I've always enjoyed doing. AJAX chat fits nicely into the portfolio and looks good on the resume if nothing else.
Good thing I got those sessions under control yesterday morning. Yesterday afternoon this site got hammered. It wasn't search engines, but the similarity of requests is highly suspect. Over 60,000 unique sites requested my home page yesterday between noon and 2PM. None of them went any further. They all were redirected from my old site. Then just as suddenly, it all stopped. Hmmm. That's certainly strange.
It wouldn't surprise me at all if it turned out to be a coordinated DoS (denial of service) attack on my site after my mentioning a vulnerability to overloading. Somebody trained a firehose of hacked drone sites at me. That's my suspicion anyway. I can't quite picture 60,000 visitors doing exactly the same thing under other circumstances unless my site was mentioned on a major news channel as a place to get a free Lexus or download videos of Catherine Zeta Jones having sex with Brad Pitt or something like that.
The FBI is getting close to awarding a contract for Project Sentinel, the showcase of their technology upgrade. Looks like the bidding is only open to two companies - Lockheed Martin being one of them. Cost of the project is estimated to be around a half a billion dollars.
And what are they buying for all this money? A relational database. Lockheed Martin can build some pretty good airplanes and satellites, but databases? Server farms? I wonder what the cost would be if they opened the bidding to folks who actually have experience building scalable information systems...
This is precisely why I didn't want to use sessions for managing authentication state...
I started to notice my webserver getting really slloooooooow. Sometimes it would time out completely (I've got it set up with 60 seconds max execution time per page). How could this be? I designed the pages for maximum SQL performance. Why is it taking so long to get the database open?
Then I looked closer at the timeout message. Timeout on line 3. Line 3 hasn't even touched a database yet. It belongs to a function called 'session_open()'. Gak.
By default, PHP stores its session data in the /tmp directory. One file per session. I bop over to the session directory and do an 'ls'. It takes about fifteen minutes to list all the session files. I can't even easily count them because the shell chokes on the wildcard expansion. On a Unix filesystem, anything more than 1000 files in a single directory is bad news. It's gotten better over the years, but if you're talking about say 20,000 or 100,000 files, the operating system can't deal with it, plain and simple.
I shortened the session expiration time and made a mental note that this is going to have to be fixed. A couple of days later, timeouts again. This is right after Google did a deep scan of my site. OK, something's got to give. So I look at alternatives. PHP can also use 'memory mapped' sessions. No files. OK. That sounds like a winner. This is a small site, I shouldn't need that much memory. I recompile PHP with the necessary configuration and run it. Great. At least it works...
Then I make a mental note that I really need to figure out a way to measure the memory use because running out of memory is not a good thing. And I let it run. I figure about 100 bytes per session. The problem is that I don't know how much memory the mapper reserves for itself and what its built-in limits are. Might be 100 megabytes (roughly a million sessions). Might be a megabyte (roughly 10,000 sessions). I've got about 50 megs free. If it's well written, I should be able to manage a half a million sessions. But I didn't take the time to review the source and find the answers. I just wanted to get it working - pronto.
Big mistake. I come back the next day and I've got a blank screen. Boy this memory manager doesn't degrade gracefully. When the 'files' driver filled up, the program just hung up for a while. Using the memory driver, the page actually comes up, but there's nothing on it. The dreaded White Screen of Death. At least I now know it's reserving something considerably less than a megabyte - and not asking for more.
OK - now it's done gone and made me mad. Time to write an SQL session driver. Yeah, I know how to do it. I didn't think I'd need it, but it turns out I do. Took about twenty minutes to get it running.
The lesson here is that you can't use the built-in PHP session drivers even on a small site these days with all the crawlers running around. A few years back you might get a few hundred hits a day if you were lucky. Now it's in the thousands and tens of thousands. If you've got a weblog that pings one of the pinging services, you're guaranteed another 500-1000 hits every time you change a page. Since these are all coming from robots and aren't storing cookies, each page is going to get a new session.
So all you developers out there, if you want to use PHP sessions - hack, beg, borrow, or steal a database session driver and figure out how to use it.
Attached is a little driver I found on the web a couple years back. It's a good starting point if you've never dealt with session drivers before.
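The attached driver isn't reproduced here. For readers who have never seen one, this is a rough sketch of what a database session driver does, written in Python against SQLite purely for illustration. In PHP the equivalent callbacks get registered with `session_set_save_handler()`. The class name, table layout, and the `now` parameters below are my own assumptions, not taken from the attachment.

```python
import sqlite3
import time
from typing import Optional


class SqlSessionStore:
    """Illustrative session store: one table row per session instead of one file."""

    def __init__(self, conn: sqlite3.Connection, lifetime: float = 1440.0) -> None:
        # 1440 seconds mirrors PHP's default session.gc_maxlifetime.
        self.conn = conn
        self.lifetime = lifetime
        conn.execute(
            "CREATE TABLE IF NOT EXISTS sessions ("
            " id TEXT PRIMARY KEY, data TEXT NOT NULL, touched REAL NOT NULL)"
        )

    def read(self, sid: str, now: Optional[float] = None) -> str:
        # Return the stored data, or an empty string for unknown or stale sessions.
        now = time.time() if now is None else now
        row = self.conn.execute(
            "SELECT data FROM sessions WHERE id = ? AND touched > ?",
            (sid, now - self.lifetime),
        ).fetchone()
        return row[0] if row else ""

    def write(self, sid: str, data: str, now: Optional[float] = None) -> None:
        # Insert or overwrite the row and refresh its timestamp.
        now = time.time() if now is None else now
        self.conn.execute(
            "INSERT OR REPLACE INTO sessions (id, data, touched) VALUES (?, ?, ?)",
            (sid, data, now),
        )
        self.conn.commit()

    def destroy(self, sid: str) -> None:
        self.conn.execute("DELETE FROM sessions WHERE id = ?", (sid,))
        self.conn.commit()

    def gc(self, now: Optional[float] = None) -> int:
        # One indexed DELETE replaces walking a directory of 100,000 files.
        now = time.time() if now is None else now
        cur = self.conn.execute(
            "DELETE FROM sessions WHERE touched <= ?", (now - self.lifetime,)
        )
        self.conn.commit()
        return cur.rowcount


store = SqlSessionStore(sqlite3.connect(":memory:"), lifetime=60)
store.write("abc123", "user=mike", now=0)
```

The win is in `gc()`: every robot that ignores cookies still creates a row, but clearing out the dead ones is a single statement instead of a directory scan.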
AJAX is being overused for some really simple and silly stuff. Most of it doesn't really matter - you could just as easily do it with a new web page. But there's one application they've never been able to do quite right with web pages - and that's chat or instant messaging. Every page refresh (which is every few seconds) loads a new page. It looks horrible. Nobody likes to use it. It'll sometimes refresh while you're typing something, losing what you typed - unless it is set up with two windows open. No matter how you do it, it looks horrible and works horribly.
But AJAX is the perfect solution for chat. All the data is being refreshed in the background - and the page doesn't have to be reloaded - until you sign off. It's the killer app for AJAX.
I've glanced at a few open source AJAX chat packages - most of them are really, really primitive. So far I've only found one that looks promising. But open source isn't the only source out there. It only takes a couple of hours to write all the skeleton code. You're going to see chat on pretty much every major social web site soon. It can also be integrated pretty quickly into any of the leading IM applications. The interesting thing is that it can be integrated into any web page. I could have chat running on this page, for instance. I don't know that I'll end up doing so. I'm really not fond of chat. I'm just alerting you to the possibilities...
Oh, and with the ability to do things interactively without the annoying flicker of page reloads, you're going to see a whole lot of AJAX games pretty soon. I mean really. Games make up a huge chunk of the digital economy. But if you're looking for games online, seems like there's only one - Texas Hold'em. You can't even find a decent game of chess online.
The other killer app for AJAX is search. Coincidentally I'm looking for employment and the search companies seem to be the ones with openings... Of course they've probably already set people loose exploring possibilities. But picture this - you've got back 2,000,000 results on your search term. Hmmm... What if you could just hover your mouse anywhere on the page and have the result set refined in real time based on the word or entry you're hovering over? Let's use the keys example. Locksmiths, Florida Keys. Encryption keys. Alicia Keys. You want to take a vacation and typed 'keys'. Hover the cursor over the word Florida for a couple of seconds and all of a sudden you're looking at just Florida Keys and down to half a million results. Now hover over the word hotels. ...The Hilton. ...Reservations. Pretty soon (10-20 seconds) you're down to a few pages of possible results matching precisely what you're looking for and you haven't clicked anything.
See what I mean? There is absolutely nothing preventing this web application from being available to you tomorrow. The technology to do this exists today; and in fact has existed for several years - it's JASMOP (Just A Simple Matter Of Programming).
The Badd Cafe is now open.
There's still a lot of code cleanup to do, tons of documentation, and no doubt some bugs will arise.
Welcome to 0x BADD CAFE.
Oh, for those of you wondering why oxen have a coffee shop, it's geekspeak. In the 'C' programming language (and several others), prefixing a numeric value with 0x (that's a zero, not an 'oh') indicates that the value is written in hexadecimal (base 16) notation. BADDCAFE is the number 3135097598 in hexadecimal. There isn't anything terribly interesting about this number, except that it is exactly one less than BADDCAFF and precisely 44102FD0 less than FEEDFACE. Amazing! BADDCAFE is also a musical progression, but unlikely to make the top 40 hit list. I know, I played it. Not quite the same magic as ADADADEA.
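If you don't trust my arithmetic, it's easy to check. Python happens to use the same 0x prefix as C, so a few lines will do it (the variable names are just mine):

```python
# Hexadecimal literals use the 0x prefix, same as in C.
BADDCAFE = 0xBADDCAFE
FEEDFACE = 0xFEEDFACE

print(BADDCAFE)                     # 3135097598
print(f"{BADDCAFE + 1:X}")          # BADDCAFF
print(f"{FEEDFACE - BADDCAFE:X}")   # 44102FD0
print(int("BADDCAFE", 16) == BADDCAFE)  # True
```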
I promise to make this easier... but I've just plugged in comment avatars. To use them, first go to register and set up a basic account (I'm not quite ready with advanced accounts).
Then login.
Then go to avatar and select an avatar. You will only be able to do this if you have an account and are logged in.
From there on in, if you're logged in and make comments on this board, your avatar will be displayed.
There is still a lot to do, so don't hassle me about feature requests like uploading your own avatars and viewing profiles and stuff like that. All of that will come in time.
Oh and for folks who have been here awhile and made comments in the past, if I notice you have an avatar, I'll do what I can (manually) to add it to your existing (prior) posts.
The patience of my family and friends is probably wearing thin. My rantings are getting less and less entertaining and more and more technical lately. That's the way it has to be at the present time. I'm grappling with some tough architectural issues. If you're dealing with technical architecture issues, it always means you're doing something fundamentally wrong. Or perhaps a better description would be 'painting oneself into a corner'.
This is not always a bad thing - it's a learning experience. How do you get out of the corner without getting your feet messy? Always look at the bigger picture. My software is primarily a weblog. But the bigger picture is a content management system, with web 2.0 connections.
What's web2 about it? Hang on, I'm still quite a ways from being finished with this upgrade. I have yet to add my secret sauce. This particular web address used to be my XML playground, and I used to have a copy of Drupal on it. Drupal is gone now, and I've grown my own software to the point that I can start to use it as a content framework. I've started plugging in the XML-RPC layers. I've started plugging in uhm - plugins. Welcome to the new XML playground.
Anyway, the URL framework wasn't extensible enough. Let's throw some URL's out and see what happens....
This all starts with http://baddcafe.com
mike
mike/DEC-2003
article/1052
post
tags
feed/mike
There's a problem here. It's the URL's starting with 'mike'. You see, everything is function-based. If it starts with 'article' I know I'm viewing an article. If it starts with 'post', I know I'm writing something. I can extend this to arbitrary actions. Profile, shop, whatever. But 'mike'? What the heck am I doing? This is a problem. I need an action (or more specifically, target page) here. Otherwise as the list of actions grows, and the list of people grows, there could be conflicts. So it looks like it all has to move down a level. http://baddcafe.com/weblog/mike - since this is all attached to weblogs. I refuse to use the word 'blog'. More on that in a moment.
That looks like the way out of the painted corner.
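Here's the function-based scheme boiled down to a sketch. It's in Python rather than the PHP the site runs on, and the handler names and return strings are invented for illustration. The first path segment always names an action; once weblogs get their own 'weblog' action, a bare author name like 'mike' no longer competes with the action names.

```python
from typing import Callable, Dict, List

# Hypothetical handlers; each receives the path segments after the action name.
def show_article(args: List[str]) -> str:
    return f"article {args[0]}" if args else "article index"

def show_weblog(args: List[str]) -> str:
    if not args:
        return "weblog index"
    period = args[1] if len(args) > 1 else "latest"
    return f"weblog of {args[0]} ({period})"

def show_feed(args: List[str]) -> str:
    return f"feed for {args[0]}" if args else "site feed"

ROUTES: Dict[str, Callable[[List[str]], str]] = {
    "article": show_article,
    "weblog": show_weblog,
    "feed": show_feed,
    "post": lambda args: "post form",
    "tags": lambda args: "tag cloud",
}

def dispatch(path: str) -> str:
    """Pick a handler from the first path segment; everything else is an argument."""
    segments = [s for s in path.strip("/").split("/") if s]
    if not segments:
        return "home page"
    handler = ROUTES.get(segments[0])
    if handler is None:
        return "404 not found"
    return handler(segments[1:])

print(dispatch("weblog/mike/DEC-2003"))
```

With this layout an author could even be named 'post' or 'tags' and nothing breaks, because author names only ever appear after 'weblog' or 'feed'.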
Oh, about the word blog... I wrote this in a friend's weblog back east last week.... You can view the original context at dustingmybrain.com
"I also have fallen into disfavor with anything beginning with 'blog'. I was writing online for 20 years before I heard the term used. Say it out loud. Isn't that the sound you make when you're puking up a bottle of Southern Comfort? Blllooooooghhhhh! Consequently, I have to laugh everytime I hear 'blogosphere'. No, offisher, I'm not drunkenut... my blogosphere ish woggly.... oooopsh... Blllooooooghhhh!"
There's a lot of debate going on these days about the so-called MVC programming paradigm. I've mentioned it before. Have a look at this page, for example. MVC stands for Model, View, Controller.
Basically, this all came about because of the LAMP programming environment. That's Linux, Apache, MySQL, and PHP. When you are writing web applications with PHP and MySQL, you've got essentially three computer languages intertwined. HTML, PHP, and SQL. Up until a couple of years ago, you just stuck these all into a file and wrote your application. The software purists who like to write pure, clean, reusable code found this to be an abhorrent practice. Three languages in one file. Then along came RSS and added a fourth. By now most of them were frothing at the mouth. Then everybody started playing with Ajax, which relies heavily on Javascript. The software purists were now having irregular heartbeats and threatening to go postal. In fact, a lot of this debate started in 1995 when Netscape added Javascript into HTML pages - and has only escalated since then. We've come full circle.
So the MVC paradigm was born. What it does is split the application into three roles and, as much as possible, separate the different languages into their own files. The model is the data and the business logic around it, which is where the SQL lives. The view is the visible representation or output - HTML (with Javascript) or RSS or whatever. The controller takes the input (the request), calls on the model, and picks the view. Wherever possible, they've tried to keep the languages pure and separate.
This is all well and good, except you've got PHP in all three sections. That's the glue which makes these web-apps work after all.
I think it's all a bunch of hooey. Yeah, I understand the principle, and no, I've got nothing against structured code. In fact I think it was all designed intentionally after the dot-com bust to put programmers back to work. Now you need three programmers where before you only needed one. Three times as many files to deal with.
This is supposed to make programming less complicated and more easily understood. I've worked on a few MVC packages now. Let's say you want to fix a bug in the bogga.html web page. In the old days, you open up bogga.html with a text editor and there it is. You need to fix something - piece of cake. Now you've got to figure out what function it belongs to, and whether or not the bug is in the SQL code or the PHP code or the HTML/RSS/whatever code. Bogga.html might not even exist as a file, even though you can access it on the web. Once you've done this then you have to look through all the files to find the right place to tweak the code. Trust me, I've spent a lot of time fixing badly written code. Buggy code is buggy code. Whether or not the HTML is mixed in with the SQL doesn't make one bit of difference. All it does is take more time to find the bug - and then even more time to fix it. That's because you can't just glance at the surrounding code to see what effects a change might have anymore. The surrounding code is now spread across the filesystem.
I guess what it all boils down to is that if you're not multi-lingual in computer languages, you've got no business writing web-apps. If you are multi-lingual, you should be able to switch context between SQL and PHP and XHTML and XML and Javascript all within the same paragraph. Drop into assembly language to do a left shift a bit faster? Sure, why not? If this hurts your poor little brain, you've got no business writing web-apps.
Now that I've got the major part of the software working, I've been digging into all the little stuff. A lot of this you won't be able to see until I turn on registrations, so it may look like nothing has been done recently.
You might notice that there's a login block now. Yeah, I'm using sessions. It solves a lot of problems and makes the site work like any other site that uses logins. Those menus on the left -- click on one. They expand and collapse. If you're on my weblog page (not the main page) you'll see a guitar in the Author block. It's not just an image that I linked. It's my avatar and can follow me around the site. I've got an avatar selector and a modest but adequate collection to choose from.
So why haven't I turned on registrations? In a word, permissions. When you've only got one person, you don't have to worry much about who can access what. Once you allow a second or third person, permissions become critical. So I'm building in an entire infrastructure. On the first round, it will have four levels.
- Anonymous (not logged in) - Can only read allowed stuff and maybe make comments
- Registered - Has a login account and can maybe send private messages (consumer)
- Author - Can write weblogs and do lots of stuff (producer)
- Admin - Can do anything at all.
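As a sketch of how four ordered levels can be checked (Python for illustration, not the site's actual code), each action gets a minimum level and a user may do anything at or below their own. The action names and which level each one requires are assumptions; the list above deliberately says "maybe" for a couple of them.

```python
from enum import IntEnum


class Level(IntEnum):
    # Ordered so that a higher level includes everything below it.
    ANONYMOUS = 0
    REGISTERED = 1
    AUTHOR = 2
    ADMIN = 3


# Hypothetical action table; the real assignments are still undecided.
REQUIRED = {
    "read": Level.ANONYMOUS,
    "comment": Level.ANONYMOUS,
    "private_message": Level.REGISTERED,
    "write_weblog": Level.AUTHOR,
    "manage_users": Level.ADMIN,
}


def allowed(user_level: Level, action: str) -> bool:
    """True if the user's level meets the action's minimum.

    Unknown actions default to admin-only, so forgetting to register
    an action fails closed rather than open.
    """
    return user_level >= REQUIRED.get(action, Level.ADMIN)


print(allowed(Level.REGISTERED, "private_message"))
print(allowed(Level.REGISTERED, "write_weblog"))
```

A strict ladder like this is the simplest thing that could work. It stops being enough the moment someone needs a permission that doesn't fit the ordering, such as an author who may not comment.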
I'll probably turn on the 'registered' section first. There are still a few kinks to be worked out on the author level. Most of these have to do with changing the look and feel of their weblog. When this was a single user system, this could all be done with PHP and CSS. There won't be any public PHP (code) access, so whatever can't be done with CSS will have to be done with database tables (or not at all).
Oh, and I tossed the Kevin Roth editor (basically a script to turn on the browser editor) and went whole hog with tinyMCE. The difference is that I don't have to keep coming up with regex's to make it emit proper XHTML.
-- Andrew Tanenbaum

yaze-1.14.tar.gz
sess.php.txt