Misc. Ramblings

Week of 31 July through 4 August 2000
Jump to Last Update: Friday 7:00 am HST

Monday - 31 July 2000

Storm Update. Cross our fingers. Right now it looks like tropical storm Daniel (downgraded from a hurricane on Friday) will pass just north of us tomorrow. It was touch and go over the weekend. Some of the projected storm tracks put the storm passing directly over Maui and just south of O'ahu. But as it gets closer, the projected track (see it here, 287K gif) has moved north and the wind speeds, which had been steady at 50-60 mph (80-97 kph) with gusts to 70 mph (113 kph), are projected to decrease into the 30-40 mph (48-64 kph) range. We hope this too shall pass.

Serendipity Doo Dah. One of the interesting sidelights of getting the new search engine running (see last week) was all of the bits and pieces that needed to be in place for it to work. Namely, a shell script, cron, and crontab. I will explain what each of them does and why I needed to use them below. For those who desire additional information, there is a list of sources that I used at the bottom of this posting. But for now, as Regis would say; "Let's get to it!"

Shell Scripts. Most of us come from the DOS world of config.sys and autoexec.bat. Each is a kind of script. Or, more familiarly in the DOS kingdom, a batch file. That is, a file which contains a list of commands that execute sequentially. A simple example of a shell script, which displays the words "Hello World" on your screen is (don't try executing this yet, I'll explain why below):

#!/bin/csh
echo "Hello World"

However, one of the differences between DOS and *nix (referring to all variations of Unix, including Linux) is how the operating system determines what type of file you are trying to execute and how the OS is to handle it. In the DOS world, having a file extension .bat on a file tells DOS that the file is a batch file and that the commands contained therein should be executed.

In *nix, script files can execute even though they may not have an extension at all. For example, suppose we use our favorite text editor (remembering to press the enter key after each and every line) to save the above snippet of code as "test" (without an extension), and then issue the *nix command chmod u+x test (where chmod is the command, u+x is the permission change to apply, and test is the file being acted on). We could then execute the file by typing ./test and pressing the enter key. Note that without setting the permission to executable (the "x" part of u+x), you can use almost any extension you want (except for perhaps .pl or .cgi), but *nix will probably not execute it.
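The steps above can be sketched as a short session. This is only a minimal sketch with a couple of assumptions of my own: it uses /bin/sh rather than csh (csh is not installed everywhere), and it works in a scratch directory made with mktemp so nothing in your home directory is touched. The file name test comes from the text.

```shell
#!/bin/sh
# Sketch: create a script with no extension, make it executable, run it.
# Assumption: /bin/sh stands in for csh so this runs on most systems.

workdir=$(mktemp -d)          # scratch directory (hypothetical location)
cd "$workdir" || exit 1

# Write the two-line script to a file named "test" (no extension).
cat > test <<'EOF'
#!/bin/sh
echo "Hello World"
EOF

# A freshly created file does not have the execute bit set.
if [ -x ./test ]; then before="executable"; else before="not executable"; fi

chmod u+x test                # give the owner (u) execute (x) permission

output=$(./test)              # run it from the current directory
echo "before chmod: $before"
echo "script said:  $output"
```

The ./ in front of the name matters because the current directory is usually not on the search path.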

Okay, you can now create and execute a shell script. Why do I need to do so in this case? Because ht://Dig uses multiple programs, which need to be run in sequence, to create the index to be searched. My script file may be different from yours depending on what options you want to have. But you will need to run at least htdig, which does the scan, and htmerge, which merges the results of the scan into the existing index (as I understand it). I also run htfuzzy, which creates a file for use in doing fuzzy searches. In my case, I've saved the file with the name foo.scr (note below that # is like the DOS REM statement):

#!/bin/csh

# foo.scr
# 24 July 2000
# called by a cron job to update
#  the htdig index and merge
#  the new file into the current
#  searchable one.

/usr/foo/cgi-bin/bin/htdig -i
/usr/foo/cgi-bin/bin/htfuzzy metaphone synonyms
/usr/foo/cgi-bin/bin/htmerge

Cron - Clock Daemon. Now, if you don't mind having to run the above script file each time you do an update to your site (and can remember to do so), then you don't need to know about cron or crontab. But if you are like me (i.e. lazy, ahem, time challenged), then you will want some way of automating the process.

The cron command starts a process that will execute your shell script at a specified date and time. Usually, the system administrator is the one who gets cron going on your host server, and the only thing you need to do is create the command file (called a crontab file) in which cron will find the dates and times you specify. If you are interested in how to set up cron, you can check out the link at the end of this posting. But for now, that's all we need to know about cron.

Crontab - User Crontab File. The crontab file consists of one or more lines, with six fields on each line. The six fields are shown below (with the permissible range of entries shown in parentheses). Note that there should not be any blank lines or fields, although it is acceptable to have more than one number per field (as long as you separate the numbers by a comma, or give a range with a hyphen). An asterisk in a field means every permissible value.

minute (0-59)
hour (0-23 with 0=midnight) 
day of the month (1-31)
month of the year (1-12)
day of the week (0-6 with 0=Sunday)
string to be executed (in my case, a shell script)

My particular crontab file looks like this:

0 15,19 * * 1-5 /usr/bin/nice -20 /usr/foo/cgi-bin/bin/foo.scr

In English, this means at 3:00 pm and 7:00 pm (0 15,19) host server time, every day of the month (*), every month of the year (*), every weekday (1-5), the shell script foo.scr will be executed. The part about /usr/bin/nice -20 is specific to pair.com and tells the server to give this process a lower priority than serving pages (essentially making it a background process). And again, only for pair.com users, note that they require a literal "tab" character between the fifth field and the /usr/bin/nice string. Don't ask me why. They just do.
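To see how the six fields line up, here is a minimal sketch that splits the entry above with the shell's read command. The variable names are my own and purely illustrative; cron itself does its own parsing.

```shell
#!/bin/sh
# Sketch: split one crontab entry into its six fields.
entry='0 15,19 * * 1-5 /usr/bin/nice -20 /usr/foo/cgi-bin/bin/foo.scr'

# read splits on whitespace (spaces or tabs). The last variable receives
# the rest of the line, so the command keeps its own embedded spaces.
read -r minute hour dom month dow cmd <<EOF
$entry
EOF

echo "minute:       $minute"
echo "hour:         $hour"
echo "day of month: $dom"
echo "month:        $month"
echo "day of week:  $dow"
echo "command:      $cmd"
```

Note that the asterisks are kept in quotes when printed, so the shell does not try to expand them into file names.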

To create the file above, start up your favorite *nix editor and type in the line(s) you need. Then save the file calling it, for example, crontab.scr. Then from the command line, issue the following command to load your script into crontab: crontab crontab.scr. To check to see if your file is loaded, issue the command crontab -l. The response to the screen will list what is currently loaded for your username.
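If your editor makes it hard to type the literal tab that pair.com wants, one way around it is to let printf write the line. This is a rough sketch, assuming a scratch file from mktemp in place of crontab.scr; the two crontab commands are left as comments because loading a file replaces whatever crontab you already have.

```shell
#!/bin/sh
# Sketch: write a one-line crontab file with a literal tab before the command.
cronfile=$(mktemp)    # hypothetical scratch file standing in for crontab.scr

# printf turns \t into a real tab character and \n into the line ending.
printf '%s\t%s\n' '0 15,19 * * 1-5' \
    '/usr/bin/nice -20 /usr/foo/cgi-bin/bin/foo.scr' > "$cronfile"

# Count tab characters and lines to confirm what was written.
tabs=$(tr -cd '\t' < "$cronfile" | wc -c | tr -d ' ')
lines=$(wc -l < "$cronfile" | tr -d ' ')
echo "tabs: $tabs, lines: $lines"

# To load and check it for real (this REPLACES your current crontab):
#   crontab "$cronfile"
#   crontab -l
```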

And that's it! So, we've learned how to create and execute (remember the chmod u+x command!) a shell script so we don't have to type in these commands every time we want to index our site. And we've learned how to create a crontab file which tells the cron daemon when to execute our script every weekday. Nothing to it, right? YMMV.

Links to additional sources:

Shell Scripts: http://www.kcl.ac.uk/kis/support/cit/gum/b1_3-5.html
Cron: http://hoth.stsci.edu/man/man1M/cron.html
Crontab: http://hoth.stsci.edu/man/man1M/crontab.html
Pair.com Scheduling Programs: http://www.support.pair.com/tutorials/cron.html
Pair.com Cron Usage: www.support.pair.com/policies/cronusage.html

Mail Call

From: J. H. RICKETSON
To: Dan Seto
Sent: Thursday, July 27, 2000 6:38 PM
Subject: Your 07/27 Dissertation on Search Engines

Dan,

I am in 110% agreement with your take on search engines. I detest the idea of having ANY of my data dependent on some money-hungry stranger's server. I looked long and hard, and finally found search.cgi. I have to figure it out, and then try it on the duplicate of my website on my HD. Then, if all goes well, up it goes.

A 5000 page limitation on effective search is no problem for me!<BG>[Actually, Dan said 500 pages <g> - Ed.]

What I can't understand is that the people that are using AtomZ are intelligent people, well aware of the risks, not only to themselves & their systems, but more importantly to their readers. I think that's kind of irresponsible. Or maybe I'm overly paranoid about privacy.

Anyway, keep up the good work.

Regards,

JHR
--
J. H. RICKETSON
[JHR@WarlockLltd.com]
27/07/2000 9:27:41 PM

----- Original Message -----
From: Dan Seto
To: JHR
Subject: Re: Your 07/27 Dissertation on Search Engines
Date: Fri, 28 Jul 2000 07:29:27 -1000

JHR,

I am also in agreement with you that some very intelligent people are using external search engines. And I too can not figure out why (other than it is easier to implement). While I wouldn't necessarily go so far as to say they are irresponsible, I would agree that it is at least problematic.

And no, I don't think you are paranoid about this. There are clear examples (e.g. RealAudio, cuteftp, etc.) of companies misusing information gathered from people without their consent, much less knowledge. Whether Atomz is one of them I can't say. But why take the chance?

Thanks for the kind words.

Aloha - Dan



Tuesday - 1 August 2000

Weather Update. If you are reading this, then we must have survived tropical storm Daniel. And if you remember nothing else about these storms, never think you know what they will do. They are like living things sometimes.

For example, at around 11:00 am yesterday, tropical storm Daniel, after traveling 2,500 miles (4,023 km) across the Pacific, and after being downgraded from a hurricane to a tropical storm, became a hurricane again and turned to head directly for the islands. For a couple of hours things looked really bad. But then, just as suddenly, it lost speed and turned away. Sheesh!

Right now, it is moving away from the islands. I say right now because there have been occasions in which storms turned around and came back. But for now, all is well.

Browser Updates. MS updated IE to version 5.5 recently. I haven't had any problems with it and I haven't seen anyone else saying otherwise. So if you've been waiting, perhaps now is the time to update.

Opera 4.01 is also available now. I couldn't find anything at the Opera site to say what was fixed with this 0.01 release but I have not experienced any problems with 4.0 or 4.01.

Meanwhile, Netscape 6.0 alpha build M17 is very late and getting later by the day. There are still over 500 bugs listed in their tracking system. And many are beginning to read like this:

1. Found bug.
2. Tried to figure out what causes bug.
3. Can't figure out so am changing status from P1 B2 for M17 to M18+.
4. Bug 29018 is a duplicate of this bug.

Page Update. After looking at the results of my experimenting with SSI, I decided to scale back what I was doing. Otherwise, the page looked pretty strange. So instead of changing all of the top and bottom of this page, I just inserted a couple of changes (mail and copyright). Those of you who stopped by early yesterday morning saw version 3.0 of this page. Those that came later saw version 4.0, which looks a lot like version 2.0. As it should. Still, it's been interesting working with SSI and what it can do.

French, errr, German Grand Prix Update. A distraught Frenchman, protesting his dismissal by Mercedes-Benz, after working for the German car company for 20 years, ran across and along the Hockenheim racetrack where the German Grand Prix was taking place. Track security eventually grabbed their man and found vays to make him talk.

In the meantime, Rubens Barrichello, in his Ferrari, scored his first F1 victory in what is being described as a wild race (surely they are not referring to the Frenchman?). Featured events included a first-lap shunt which knocked points leader Michael Schumacher out of the race. Schumacher's lead now stands at only two points.

Following the Ferrari in second and third place were the McLaren-Mercedes of Mika Hakkinen and David Coulthard.

Brazilian Update. While Barrichello was winning the German Grand Prix, fellow Brazilian Cristiano da Matta was winning the Target Grand Prix in Cicero, Illinois (Target Grand Prix just does not sound right somehow. And having it in Cicero, Ill. of all places. Sheesh.). Even though da Matta did very well in Indy Lights (winning 7 times in 27 starts in that series), it took him almost two years and 51 starts in CART to finally notch a win.

Mail Call. Tom Syroid, someone whom I admire and try to emulate, has a response posted (see it here, look for Tuesday's post) to JHR's email (see Monday above). I will let my email speak for itself.

From: Dan Seto
To: Tom Syroid
Cc: JHR
Subject: A Slight Clarification
Date: Tue, 1 Aug 2000 08:31:25 -1000

Tom,

If I may make one slight clarification and a few comments. Some of your readers may not follow the link you were so gracious to make to my site regarding an email from JHR. They may also miss the previous week's discussion that brought up the subject of search engines on which JHR was commenting. I believe you are a fair person and someone I admire and wish to emulate (and I have so stated publicly in the past in my postings - http://seto.org/diary/2000/j20000721.html see Wednesday Noon Update).

Given this, I think it may give a truer picture of what happened to point out that it is JHR that is saying it is "kind of irresponsible" to use an external search engine. I said I would not necessarily agree with that but I felt it was at least "problematic."

Notwithstanding that, I still stand by my main points. Namely, Atomz can log, and probably is logging, all search requests coming through their site. That is, IP addresses, search terms, and probably search results (see their FAQ here about their "Report" panel http://www.atomz.com/help/faq.htm#5). Is this a concern to anyone? Perhaps not. Did your readers know this would happen before I mentioned it (and perhaps you gave it wider circulation by mentioning it in your post)? Again, perhaps not. Does it matter to anyone else? Who knows until you mention it? That's what privacy statements are for.

For your own information, you may be interested to know that by signing up with Atomz, you are explicitly agreeing to receive "promotional materials" from third parties (i.e. partners such as, but not limited to, doubleclick.com - see their privacy statement here: http://www.atomz.com/help/privacy.htm). The actual wording they use, from their privacy statement, is: "We use member contact information from the registration form to send the member information about our services and promotional material from partners." [emphasis added] While this applies to you personally, rather than your readers, I think it points out the kind of attitude they seem to have.

In conclusion, I think it is a Good Thing that people know what risks, however small they may deem it to be, they take when using such services. Hence, I was curious as to why someone like yourself, who is obviously so concerned about your site security, would then, figuratively speaking, open a port to the world. And yes, I did mention at least two or three times in my discussion of search engines that perhaps the reason was it was easier to add a few bits of html rather than install a bunch of Perl scripts (which you accurately point out, introduce certain security risks themselves).

If I was somehow misunderstood, I hereby publicly offer to you and everyone else my humble apologies for being a lousy writer and seriously misguided. However, I still believe knowledge is power. And in order to make an informed choice, knowledge is essential. And to that extent, I believe it would be helpful for people to know what risks, however small they may be, they take by using such services.

With warmest Aloha - Dan



Hump Day Wednesday - 2 August 2000

Speaking of security. The discussion on security yesterday got me pointed to one of the ways you can do something to help improve it. Actually two ways. But the first most people know of already: chmod (mentioned in passing in Monday's post above). The other is the .htaccess file. I don't know if this is a *nix only thing. And/or an Apache server thing. But if you are running Apache on *nix, then you can use the .htaccess file to do some things to improve security. I understand that there are ways that some clued-in script kiddies (is that an oxymoron?) have found to get around this, but it's still worth the few minutes to set up.

There's a short tutorial on the .htaccess file here at lava.net. Which just happens to be a local ISP. But I digress. To summarize, you create this file in a directory you wish to protect, either by requiring a password for access or by restricting access based on the Internet address, hostname, or domain name of the Web client.

This file, in combination with the chmod command can help to secure a directory and its files. For example, use .htaccess to restrict access to certain directories. Such as your cgi-bin directory. Then use chmod to restrict read, write, execute access to those who need such access within those directories.
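As a rough sketch of how the two fit together, the snippet below writes a password-protecting .htaccess file and then sets permissions with chmod. The directives are standard Apache basic-authentication directives, but everything else is an assumption for illustration: the scratch directory, the realm name "Private Area", the password file path /usr/foo/.htpasswd (which you would create separately with Apache's htpasswd utility), and the empty search.cgi file. Your host may also need .htaccess overrides enabled before any of this takes effect.

```shell
#!/bin/sh
# Sketch: protect a directory with .htaccess, then tighten file permissions.
# Assumptions: Apache honors .htaccess here; all paths and names are hypothetical.

dir=$(mktemp -d)                 # stand-in for the directory to protect

# Ask the browser for a user name and password before serving anything here.
cat > "$dir/.htaccess" <<'EOF'
AuthType Basic
AuthName "Private Area"
AuthUserFile /usr/foo/.htpasswd
Require valid-user
EOF

# The web server must be able to read .htaccess; only the owner may change it.
chmod 644 "$dir/.htaccess"

# A script in that directory: owner may read, write, and execute;
# everyone else may only read and execute.
: > "$dir/search.cgi"            # empty placeholder file
chmod 755 "$dir/search.cgi"

ls -l "$dir/.htaccess" "$dir/search.cgi"
```

Keeping the password file outside the directories the server publishes is the usual advice, which is why the sketch points AuthUserFile somewhere else.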

On a personal note. I hear both Dr. Keyboard and the lovely Mrs. Keyboard are feeling poorly. Since Dr. Keyboard is an avowed atheist (is that also an oxymoron?), I will only say I am sending my best wishes for a speedy recovery to both.

***** Noon Update *****

As I've noted before, I do my posts prior to reading the other Daynoters. So sometimes, I don't respond or comment on things on their sites until a noon update or perhaps the next day.

Such is the case today. Robert Bruce Thompson appears concerned about an un-named site that is allegedly accusing him of being "irresponsible" for using the Atomz external search engine. Hmmm. A check of the other sites does not find anything except mine. Ipso facto.

Okay. Let's see what the record actually says:

From: Dan Seto
To: RBT [webmaster@ttgnet.com]
Subject: PO'd
Date: Wed, 2 Aug 2000 11:22:41 -1000

1. Irresponsible. I have not, and did not, accuse anyone of being irresponsible. Read the email from JHR in Monday's mail call and you can clearly see that. In fact, I specifically say I do not necessarily agree with that and instead characterize it as being problematic.

I say problematic because I could not understand why anyone would use an external search engine without at least letting their users know that their search would be logged at that external site (which is what Atomz says they do - see the FAQ, and probably all others for all I know).

If you are angry about the irresponsible label, please directly contact the person who used it.

2. Possible Security Threats. First, let me say flat out, I am not accusing Atomz, or any other external search engine, of any illegal activities. Second, my remarks are, and always were, intended to refer to all external search engines in general. And third, my concerns were based more on "philosophical" differences than anything else (read my original discussion from last week).

To wit, let me ask this question. How do you assess what level of security you are comfortable with? Was it possible/probable that RealAudio would track all of your downloads and report such back to them? Until recently, most people would say you must be paranoid to even think of such a thing. You must be a loon. You must be, yes, here's that word, irresponsible to even suggest that.

Does that necessarily mean that any external search engine is doing anything like that? Obviously not. However, neither does it mean one or more aren't in fact doing something that you may prefer they don't do, now does it? So the question is, barring any specific information one way or the other, how comfortable do you feel using an external search engine? Obviously, most of the Daynoters feel quite comfortable.

I do not. I prefer to use an engine that is released under the GNU GENERAL PUBLIC LICENSE, Version 2, June 1991. The source code for this engine is available from http://dev.htdig.org/. Anyone who wants to review the code is free to do so. Does this mean that it is free of what Steve Gibson calls "spyware" (or anything else fishy for that matter)? Perhaps not. But I feel comfortable, given the number of eyeballs looking at the source code, that this is not so.

Well, as I said in the beginning of my original discussion; "To each his own." But I've also said, knowledge is power. And to make an informed choice, knowledge is essential. So what is wrong with a discussion of the pros and cons of using an internal vs. external search engine? I'm really at a loss here as to the virulent reactions to what was said.

Having said that, I will say to you what I said to Tom yesterday: "If I was somehow misunderstood, I hereby publicly offer to you and everyone else my humble apologies for being a lousy writer and seriously misguided." And I followed that with a note to Tom, and now you, that says: "I also want to assure you that I have the utmost respect for you (and RBT, and all of the Daynoters) and did not intend to impugn your good name(s) in any way, shape, or form. I have lost "face" if it was taken that way by anyone and will do anything to try to correct it (except slice my belly open with a knife...)."

Mail Call.

From: Jan Swijsen
To: Dan Seto
Sent: Tuesday, August 01, 2000 11:29 PM
Subject: problems versus bugs

Big misconception here (but it is widespread). A bug is an error in a program. If a bug is found it can be solved or mediated (worked around). Users never find bugs. Programmers spend about 80% of their time searching what is causing problems. No one ever searched what causes a bug.

Figurative :
The user has a piece of wood. After a time he notices that small holes are appearing. The user has found a problem. He has not seen the woodworm so he hasn't found a bug. He calls the carpenter. She inspects the wood, treats it with a 'debugger' (chemical in this case). If she finds the bug she can extract it (with a pair of tweezers).

Of course language is a living thing so maybe in a few years time the word problem or symptom will be replaced by the word bug.

--
Svenson.

From: Dan Seto
To: Jan Swijsen
Subject: Re: problems versus bugs
Date: Wed, 2 Aug 2000 06:27:36 -1000
[This response has been embraced and extended for clarity - Ed.]

As always, you are right. Thanks for the correction. I learn something new every day. Perhaps the wording should have been:

1. Found an unexpected and unwanted behavior, i.e., a problem.
2. Am unable to find the cause of the problem.
3. Don't have time to continue to look, so am reassigning to a later deadline.
4. This effect has already been reported under an earlier report no. 29018.

Of course, in my defense, I was just reporting how Mozilla does it in their "Bug", ahem, Famous Unexpected and Unwanted Behavior Active Reporting System (FUUBAR). <g>

Aloha - Dan



Thursday - 3 August 2000

Survivor of Big Brother. Before I get voted off of the island, let me say that I hope that people can look past some of the things that have been going back and forth here and find the central thesis of what I was trying to say.

Namely; "How comfortable do/should you feel with your current level of security?" Each of us may have a different answer for different reasons. For Tom Syroid, I think he feels best when he has control over as much as possible. Thus his hosting his own site. For others, including myself, we may not be willing or able to spend that kind of time and resources. Such is it with a choice of internal vs. external search engines.

Personally, I choose to use an internal search engine because I can control how it works more than I could if I used an external one. Others are not so disposed. Fine and dandy. Just remember that you are making a choice for your readers also. In any case, you decide what you want to do. YMMV. #30#

Thought for the Day. In Germany, the Nazis first came for the communists, and I didn't speak up because I wasn't a communist. Then they came for the Jews, and I didn't speak up because I wasn't a Jew. Then they came for the trade unionists, and I didn't speak up because I wasn't a trade unionist. Then they came for the Catholics, and I didn't speak up because I was a Protestant. Then they came for me, and by that time there was no one left to speak for me. -- Rev. Martin Niemoeller, German Lutheran pastor arrested by the Gestapo in 1938. He was sent to the concentration camp at Dachau, where he remained until he was freed by the Allied forces in 1945.

MS ME On Sale Now! I thought Microsoft's latest version of Windows 98 would not be available until next month. Wrong Rambus breath! Actually, what is available locally at one of our independent PC stores is the OEM version (price $99 USD). This means you can get it if you are buying a new system. Well, you say, you don't want to buy a complete system just to get Win98 SP3 (aka ME). Cheapskate. I'll let you in on a secret, if you buy a hard drive or motherboard, you qualify for the OEM version. Note that the OEM version is the complete deal, not an upgrade version.

What's the difference? Besides the price (more below), the OEM version allows you to install on a bare drive. The upgrade requires an earlier version of Windows 98 to be there already. So, if you need to do a complete system install on a bare drive, the OEM version is the way to go. Otherwise, you have to first install your earlier version of Windows (after finding the darned CD hiding behind the kitty litter box) and then install your upgrade.

But if you act now, and you don't care that what you will be getting is the upgrade version, Windows 98 ME is available for the low, low price of just $59 USD. This limited time offer is available through MS here. Say Dan sent you and you'll get a free set of Ginzu knives</JOKE>.

Mail Call.

From: Jan Swijsen
To: Dan Seto
Subject: Re: problems versus bugs
Date: Thu, 03 Aug 2000 10:36:22 +0100

>as always, you are right....

Yes, for once I am right <g>

Of course the term Bug is moving out of the technical jargon and invading normal language. Along with that invasion the content shifts. So technically and historically I am right but by next year I will be wrong.

BTW Intel has no bugs, only errata; Microsoft has no bugs either, just some undocumented features. So why should Mozzarella (oeps a keyboard bug, I mean) Mozilla have bugs, they should use their imagination and name it for example Inadvertent Program Operation (yea, IPO's are in these days <G>).



Aloha Friday - 4 August 2000

Thank God It's Friday. Sigh.

It's been a long week (© Brian Bilbrey and Dan Bowman). I'm sorry to see that JHR is taking a break from writing his Daynotes. I don't necessarily agree with everything he says but I respect his opinions nonetheless. I hope he recharges his batteries quickly and is back in the saddle soon. His voice will be missed.

Speaking of which, I'm kind of burned out myself from the events of this week. Not a pleasant time. And for anyone who thinks I do these things just to pump up readership...take a long walk off of a short pier. I am not in the business of writing books. I do not need this site for "shameless self promotion" (© Dana Blankenhorn <g>). Whether I have 10,000 readers or only 10. It matters not to me. I do this because I have something to say. And this is where I say it.

If you find it interesting or helpful to you. Great. Come back for more. If you don't. Fine. Move on to something that is useful to you. I have never tried to promote this site in any way, shape, or form except in the sense that I write what interests me. This is me. All 300 pounds of sumo wrestling thunder.

Now for something completely different. I begin graduate school (Masters in Public Administration at the University of Hawai'i at Manoa) in a little over one week. Posts may get a little short once that begins. But I am hopeful I will still be able to continue to be the same pain in the okole that I've been the last 10 or 11 months. If you are up to it, join me. I can only promise that it will be interesting <SEG>.

Have a Great Weekend Everyone! See you back here on Monday (Lord willing and the crick don't rise). Aloha - Dan.



© 2000 Daniel K. Seto. All rights reserved.