- Talked with Phil about disk performance issues on the online systems. He
agreed that the I/O bottleneck was a problem, and also that we should
upgrade to Solaris 2.7 in order to implement UFS logging.
- Set up a terminal server for Edwards Air Force Base.
- Added a line to the footer for the earthquake mailing lists giving
information about how to subscribe. This is so that if a subscriber
forwards a message to someone else, that person will know how to subscribe
if they want to.
- Added Makalu to Big Brother. This is Katrin's new machine.
- Removed QDDS from the process list that Big Brother checks on Quake.
- Judy Konnert called and said that the Squid server had been installed in
Reston. At first, it would not boot. This turned out to have been caused
by the SCSI cable coming loose during shipment. She reseated the cable and
the machine came up.
- Bbear died on Wednesday with an apparent power problem. Rocking the
power switch back and forth a few times got it to come back on.
- Lander$dka0 died on Wednesday. The only thing on this disk was the page
and swap files. Created new page and swap files on Lander$dka100:.
- Installed the new version of the Simpson map software on Bort as a test.
Made up a first-cut configuration script to automate some of the
installation tasks and mailed it to Bob Simpson.
- QDDS on Agent86 and Bort went senile on Wednesday. The process was
running, but was not receiving new messages. Restarted it.
- Started running QDDS on Bort using kaffe <http://www.kaffe.org/>, which
is an open-source implementation of Java. This is a test to see if it
behaves any differently than it does under the JDK 1.1.8 environment.
- Adjusted kernel parameters on Jet to match the changes made last week on
Spring. http://bort.gps.caltech.edu/stan/mail-archive/msg00000.html
- Assisted Frank Vernon with setting up Galena in the telemetry room.
- Got mail from Peter Schweitzer and John Lahr about mailing list issues.
John was concerned because Majordomo was taking about 4 hours to send mail
to their 2,600 subscribers. I recommended installing either bulk_mailer
or qmail and ezmlm.
- Set up a redirect on Agent86 to point requests for /ciim/ to /shake/.
The /ciim/ directory is the V1 area, and is no longer current.
- The Menlo Park Squid server came up on Friday.
- Set up SNMP monitoring of all three Squid servers, and a system status
page at http://bort.gps.caltech.edu/mrtg-ehzsquids/
- Set up a redirect and some soft links on Terra10 to send requests for
the old station updates archive to the new version.
- Installed the new Simpson map software on Agent86, so we will now display
the Nevada map along with the UNR events.
- Set 'RhostsAuthentication = yes' in /usr/local/etc/sshd_config on the
FreeBSD machines. This allows for the use of the '.shosts' file for access.
- UPS transition planning meeting.
- Disabled snmpdx on Flint.
- Traveled to Golden for the Earthquake Hazards Web Team meeting.
Presented some information about the planned new server configuration, as
well as information about past experiences with earthquake web server
traffic loads.
- Jim Fisher made the DNS change on Wednesday to make the new EHZ site live.
Got a new tar file from Madeline in Golden. Fixed some typos in the files.
The fixes I had to make are documented at
http://bort.gps.caltech.edu/stan/mail-archive/msg00007.html
- Mailed Charlene Fischer to change eqwebback to point to eqweb-north and
eqweb-east. I had originally erred and had it point to east and menlo.
- Set up rsync to synchronise content from ehzmenlo to ehznorth. This is
done every five minutes.
- The Simpson map on Terra10 was dead. The problem was that QDDS had
stopped.
- The 'waveforms' link on the individual event pages under the Simpson map
was broken for Northern California events. The program was making a
relative link, which came up as a 404 error on our server. Fixed the
program and mailed a patch file to Bob Simpson.
- Set up daily statistics reporting for the EHZ web site. The logs from
all three Squid servers are copied and merged into a single file for
analysis. The overall statistics and daily reports are kept at :
- Moved the '4kids' and 'faq' pages from Agent86 to the new Program site.
- Took one day of vacation on 11/10.
- Built and installed Apache 1.3.14 on Ehzmenlo, Ehznorth and Ehzeast.
This was necessary to enable the mod_expires module. By default, Apache
marks server-parsed pages as non-cacheable. As a result, the Squid server
was not caching these pages. Enabling mod_expires is a way to fix this.
- Set up redirects on Pasadena to point /eqhaz/4kids/ and /eqhaz/faq/ to
the EHZ pages.
- Added some logic to the EHZ daily statistics to ignore the requests from
the other EHZ servers checking server availability. This was adding about
10,000 page requests a day.
- Moved all the critical systems to city power on Tuesday, in preparation
for the UPS changeover on Wednesday.
- Checked permissions and group membership for the EHZ Web Team on
Ehzmenlo.
- The UPS switch was done on Wednesday morning, and most machines were
moved back off city power by the end of the day.
- Attended the CUBE Users' Meeting on Thursday.
- Lisa changed .html files with includes on the EHZ page to be .shtml.
Turned off parsing of .html files.
- Helped Hugo with the pinouts for connecting to the Lantronix ETS-16 for
the NEIC display in the media room.
- A sales guy from Compaq visited to tell us about their network storage
products.
- Frank Vernon dropped off 256MB of memory for Galena.
- The web server statistics for the EHZ pages were not correct. This
turned out to be a problem with merging the log files. Made an improved
script to merge and sort the records by timestamp. This script is on Bort
in /home/stan/Squidlogs/squidlogs.sh.
- Removed the old-style station updates archive from the Trinet Internal
web pages on Terra10.
- Changed umask to 002 for the people on Ehzmenlo who will be working on
the National Pages. This will allow for group write access.
- Set up a sed hack to remove lines containing 'Fault' from the
automatically generated mail when it is propagated over the mailing lists.
- Set up an email archive for Trinet RT System change mails. The archive
is at http://terra10.gps.caltech.edu/trinetrt_updates/.
- Made an account for Kate for external access for the Trinet Internal web
pages.
- Cleaned up my office.
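
The log handling described in the items above (merging the three Squid
logs by timestamp, and ignoring the availability checks from the other EHZ
servers) can be sketched roughly as follows. This is a Python illustration
only, not the contents of squidlogs.sh. It assumes Squid's native log
format, where the first field is an epoch timestamp and the third field is
the client address. The function names and the 'ignore' set are
hypothetical.

```python
import heapq
from typing import Iterable, Iterator, Set


def timestamp(line: str) -> float:
    """Return the leading epoch timestamp of a Squid native-format line."""
    return float(line.split(None, 1)[0])


def client(line: str) -> str:
    """Return the client address (third field in the native format)."""
    return line.split()[2]


def merge_logs(logs: Iterable[Iterable[str]], ignore: Set[str]) -> Iterator[str]:
    """Merge per-server logs into one stream ordered by timestamp.

    Each input is sorted on its own first, because heapq.merge assumes
    that every input is already sorted. Blank lines are skipped, and
    lines whose client address is in `ignore` (hypothetical: the
    addresses of the servers doing availability checks) are dropped.
    """
    cleaned = (
        sorted((line for line in log if line.strip()), key=timestamp)
        for log in logs
    )
    for line in heapq.merge(*cleaned, key=timestamp):
        if client(line) not in ignore:
            yield line
```

Sorting each file before merging keeps the merge correct even if one
server's log is slightly out of order, at the cost of holding each file in
memory. For very large logs, an external sort would be a better fit.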
***
- Investigated performance problems on the Trinet online systems. Found
out that the big bottleneck is disk I/O to the RAID-5 array. We are doing
about 40 writes per second to this disk, averaging about 8kB each. On
RAID-5, a small write typically means reading the old data and parity and
then writing both back, so each write costs several disk operations. This
leads to a major performance hit. We are considering options for
reconfiguring the array to avoid this problem.
- Attended the Earthquake Hazards Program Web Team meeting in Golden, CO on
November 7th. Presented information about the new server configuration and
performance issues due to the severe load spikes that earthquake web
servers are subject to.
- Finished setting up the servers for the new Earthquake Hazards Program
web site. Also set up SNMP monitoring to provide real-time status
reporting on the servers. The servers are located in Pasadena, Reston, and
Menlo Park. The new site went live on November 8th.
- Moved all critical systems off the UPS for the installation of the new S.
Mudd building UPS on November 15th. This switch was necessary due to the
stability problems that the old UPS had, which caused power failures on two
occasions this year. The changeover went well, and then the critical
systems were moved back to the UPS circuits.
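
As a rough illustration of the RAID-5 write load mentioned in the first
item of this summary, the sketch below turns the measured figures (about
40 writes per second, about 8kB each) into an estimate of back-end disk
operations. It assumes that every write is a partial-stripe update with no
help from a write cache, so that each one costs four disk I/Os. That
factor of four is an assumption, not a measurement from these systems, and
the function name is hypothetical.

```python
def raid5_small_write_ios(writes_per_sec: float, ios_per_write: int = 4) -> float:
    """Estimate back-end disk I/Os per second for small RAID-5 writes.

    Assumes each write is a read-modify-write: read old data, read old
    parity, write new data, write new parity (four I/Os in total).
    """
    return writes_per_sec * ios_per_write


writes_per_sec = 40  # figure from the log entry
avg_write_kb = 8     # figure from the log entry

backend_ios = raid5_small_write_ios(writes_per_sec)
payload_kb = writes_per_sec * avg_write_kb
print(f"{backend_ios:.0f} disk I/Os per second for {payload_kb} kB/s of data")
# prints "160 disk I/Os per second for 320 kB/s of data"
```

Under these assumptions the array does roughly four times as many disk
operations as the application issues, which is why a modest data rate can
still saturate it.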