January 4, 2004

Perl Experiment 1

[Image: samology on google.JPG]

Cool, I finally got my "Google referrals" idea working, as evidenced in the righthand sidebar under "RECENT GOOGLE LOVE." Basically, if you arrive at Samology via a Google search, I add your query to a database and link it back here for all to see. The search terms listed are clickable, too, so go ahead and try it, it is fun! It is my first Perl script; here, have a look! Short but sweet. I also got a nice introduction to MySQL, DBI, CGI, Server Side Includes and CGI environment variables while I was at it! (UPDATE: Link goes to the current version of the script, with various additions and cleanup...)

At present, various loopholes exist, paving the way for hackers to really screw everything up, but oh well.

Things to work on (***begin extreme geekiness***):

--A Google-referred client that "reloads" the page adds a duplicate data record (various ways to fix that).
--Google referrals to my old ".html" archives fail to show up (my host provider requires a .shtml extension to trigger the Server Side Includes). It would be nice if changing the .html files to symbolic links to the new files would circumvent this problem, but guess what? It doesn't. Hopefully the new .shtml pages will eventually replace the old ones in the Google database. UPDATE: Done! Thanks, Joe!
--Add results from other search engines (MSN, Yahoo, etc.). UPDATE: Done!
--Translate the ASCII escape character stuff to something more legible. UPDATE: Done!
--Append the "result rank" in parentheses (this will be fun).
--Develop a graphical interface to the resultset table for easier management.

Whee! By the way, I accept steak dinners or cash as payment for consulting work, your choice...

UPDATE: Boy, lots more stuff to do. Namely:

--Investigate AOL "encrypted" queries to see if they are easily decrypted to display properly.
--Get rid of recursive behavior (i.e., clicking on a Netscape result generates a duplicate record for that result, because Netscape by default loads search results in a frame). Should be fun...
--Disallow multiple record additions from a single client in any short period of time (5 minutes?). Anti-hacking stuff...
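
For the curious, here is roughly what the core of the idea looks like. This is a Python sketch, not the actual Perl script: a CGI program gets the referring URL in the HTTP_REFERER environment variable, pulls the search terms out of its query string, and decodes the escaped characters. The names google_query and RecentHits are made up for illustration, the Google host check is deliberately loose, and the in-memory dictionary stands in for the real database table. The 300-second window is the "5 minutes?" guess from the list above.

```python
from urllib.parse import urlsplit, parse_qs


def google_query(referer):
    """Return the decoded search terms from a Google referrer URL, or None."""
    parts = urlsplit(referer or "")
    host = (parts.hostname or "").lower()
    # Assumption: any hostname with a "google" label counts as Google.
    # This is loose on purpose; a real script should be stricter.
    if "google" not in host.split("."):
        return None
    # parse_qs decodes %XX escapes and turns "+" into spaces.
    terms = parse_qs(parts.query).get("q")
    if not terms:
        return None
    return terms[0].strip() or None


class RecentHits:
    """Skip repeat (client, query) pairs that arrive inside a time window."""

    def __init__(self, window_seconds=300):
        self.window = window_seconds
        self._last_seen = {}  # hypothetical stand-in for the database table

    def should_record(self, client, query, now):
        key = (client, query.lower())
        last = self._last_seen.get(key)
        if last is not None and now - last < self.window:
            return False  # a reload or a repeat click: do not add a record
        self._last_seen[key] = now
        return True
```

In a CGI setting you would call google_query(os.environ.get("HTTP_REFERER")) and pass the client address and time.time() to should_record before inserting anything. The same check would cover both the "reload" duplicates and the framed Netscape results, since both show up as the same client repeating the same query.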


Hmm, it occurs to me there are more search engines than just Google out there giving me hits, so I am on it! Yahoo was easy; the referral string is almost exactly like Google's. MSN, AOL and Netscape will be slightly trickier...
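
One way to handle several engines is a small lookup table that maps each engine to the query-string parameter holding the search terms. The Python sketch below is only an illustration: the function name search_terms is made up, and the parameter names in the table are assumptions that should be checked against real referrer strings in the logs before trusting them.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed mapping from a hostname label to the parameter that carries
# the search terms. Verify each entry against real referrer strings.
ENGINE_PARAMS = {
    "google": "q",
    "yahoo": "p",
    "msn": "q",
    "aol": "query",
    "netscape": "query",
}


def search_terms(referer):
    """Return (engine, decoded terms) for a known engine, else None."""
    parts = urlsplit(referer or "")
    labels = (parts.hostname or "").lower().split(".")
    params = parse_qs(parts.query)  # decodes %XX escapes and "+"
    for engine, param in ENGINE_PARAMS.items():
        if engine in labels and param in params:
            return engine, params[param][0]
    return None
```

Adding another engine is then a one-line change to the table rather than a new branch in the script.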

On the SSI tip, you might be able to override your host provider's settings by placing the following in a file named .htaccess in your web root directory:

AddType text/html .html
AddHandler server-parsed .html

(If the .htaccess file doesn't exist, create it.) This will tell the web server to parse .html files so you can do SSI stuff with them.

Great, thanks Jason, I appreciate the tip. Will try it when I get a chance; that would really help things considerably.

Okay, done! Whew, thanks again, Jason, now links to my archives are showing up just fine. Neat!

