- Increased the 'maxusers' parameter on Jet, Spring, and Hotspot. The
primary system was running into limitations on the number of allowed
processes, which was causing netmon to fail.
- Set up a script to compare the lists on the three mailing list servers
in order to check up on the synchronization scripts.
- Finished moving my office.
- Followed up on the plotter supplies we had ordered back in November.
- M4.2 north of Simi Valley at 21:53 on Monday night. Traffic on the
trinet web server peaked at 177 hits/sec.
- After the M4.2, we discovered that the Earthquake in the News function
on the Pasadena internal web page was broken. Had to reinstall Apache with
suexec enabled to get it to work.
- Modified the remove_old_datafiles script on Hotspot to keep 21 weeks of
files in the LOG.L directories. This was requested by Bob Busby.
- Helped Will Prescott with setting up and testing the new QDDS-driven
Earthquake in the News on Horst.
- Fixed an error in the 'upd-cron' script that webmgr runs to put new
special reports on the trinet web page. The script was copying the
graphics to the wrong directory.
- Menlo Park had routing troubles on Tuesday night that prevented them
from being able to get to Caltech.
- Called OES about our problems with sending messages to EDIS. Talked
to Ron Rosenow and gave him the address we were using.
- Went looking for the M3.2 foreshock to Monday night's 4.2. People had
said that it wasn't on the Simpson Map. Turned out it had never come
through QDDS.
- Fixed a problem with ssh for shakemap from Willow to K2.
- Got a new 2GB memory kit for Jet to replace the one that was crashing
the system. Installed it on Wednesday afternoon. The system began
logging recoverable memory errors on one of the new SIMMs. Swapped it
for one of the good SIMMs from the kit we replaced. This fixed the
errors.
- Pluton disk failure on 1:3. The system hung. Had to power cycle and
fsck. The hang turned out to have been caused by the keyboard connector
being knocked loose.
- M3.7 near Julian at 20:39 on Sunday. The EDIS message worked this
time, indicating that OES had fixed the problem.
- Installed ImageMagick on Iron and Willow for the Shakemap group.
- Got a replacement 2GB memory kit from Western Scientific to replace
the one that had repeated correctable errors.
- Installed a DHCP server on Bort.
- Set 30-second expiration for the front pages on the Pasadena and
Program web sites.
- Rewrote the station updates index-generating script in Perl.
- Installed eight new CPU modules and a new clock board in Spring. The
first attempt failed because this upgrade required patch 103346-29 to
upgrade Open Boot to 3.2.29. After applying the patch, the system
self-tested and booted. The load average dropped from 5 to about 1.2.
- Pluton disk failure on 1:2. Swapped in spare.
- Pluton disk failure on 0:7. Put in disk formerly at 1:2 and power-cycled.
It ran for almost 24 hours before going offline again. Swapped the cable
on 0:7 and rebooted.
- Moved /home and /opt on Iron to new disks.
- Increased the 'shmsys' and 'semsys' parameters in /etc/system on Jet
and Spring to fix the netmon errors.
- One of the switches in the basement of 525 died on Monday morning.
Joe from ITS came by and replaced it.
- Set up EDIS to run on Agent86.
- Upgraded Bort and Pluton to FreeBSD 4.5.
- DNS problem on Monday morning. The MX records for eqinfo.wr.usgs.gov
had been deleted. Called Charlene, and she restored them from a previous
version.
- Sent the latest failed disk from Pluton back to Maxtor under
RMA #0200818225.
- Another DNS problem surfaced on Tuesday. The names for the Pasadena
office web site had been deleted. Charlene fixed them. It appears
that the problem occurred when a disk filled up on the main name
server in Menlo Park, and the DNS file became truncated.
- Installed the new clock board and CPU modules in Jet on Tuesday.
- SNMP vulnerability scare on Tuesday. Upgraded UCD-SNMP on the
FreeBSD machines.
http://www.cert.org/advisories/CA-2002-03.html
ftp://ftp.freebsd.org/pub/FreeBSD/CERT/advisories/FreeBSD-SA-02:11.snmp.asc
- Upgraded Eqinfo1 to FreeBSD 4.5.
- Fixed the MRTG CPU load graphs. The change in UCD-SNMP changed how
the systems reported CPU load information.
- Local mail delivery was broken on Bort from the OS upgrade. Had to
add setuid on /usr/libexec/mail.local to get it to work again.
- The Lexmark Optra C in 535 ran out of oil. Ordered a new bottle.
- Helped Karen install DBI and DBD on K2 and Makalu.
- Set up Jet and Spring accounts for Vikki.
- Upgraded Fang to FreeBSD 4.5.
- Built the native-mode JDK 1.3 on Bort for testing.
- The disk on /wavepool/tmp/5 on Spring was logging bad block errors.
Used format to mark off the bad block.
- Brad's machine crashed and came back with a bad filesystem on
/cacheFS. Referred this to Kimo for repair.
- Did the Solaris license renewal and inventory of the Trinet systems.
- The ftp service on Bombay went bad on Sunday morning. Had to
restart UCX to get it working again.
- Got the warranty replacement disk for Pluton on Tuesday.
- Hacked qpage.c to have it trim outgoing messages to 320 characters so
that Patrick's paging client doesn't choke on the long messages that
Big Brother sometimes sends. The disk space messages it sends from Jet
and Spring include the full output from the 'df' command, and can be
quite long.
- Changed the IP address for Hotlava from 65.61 to 66.105 when Ellen
moved into her new office in 535.
- Put the new versions of QDDS and CNSSM on Agent86 and Fang.
- Got the new bottle of silicone oil for the Lexmark Optra C.
- Set up a database to keep track of the machines on the 66 subnet.
- M5.7 event south of Calexico on Friday morning. Traffic peaked at
400 hits/sec in Menlo, 202 hits/sec in Pasadena, and 93 hits/sec on
Trinet.
- The move_entries.pl script that runs under quake on Agent86 and Fang
was falling behind. This turned out to be caused by the inherent
delay in setting up and tearing down an ssh connection for moving each
file. Rewrote it over lunch on Friday to batch up the file transfers
and move all files through a single ssh connection.
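
The batching idea in the move_entries.pl entry above can be sketched
roughly as follows. This is an illustration in Python, not the Perl
script itself: the function names, the choice of tar as the bundling
format, and the host and directory arguments are all assumptions made
for the example. The point is only that one ssh connection carries the
whole batch, so the setup and teardown cost is paid once per run rather
than once per file.

```python
import io
import shlex
import subprocess
import tarfile
from pathlib import Path


def pack_files(paths):
    """Bundle the given files into one in-memory tar archive (bytes)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for p in paths:
            # Store each file under its base name so it unpacks flat.
            tar.add(str(p), arcname=Path(p).name)
    return buf.getvalue()


def ssh_unpack_command(host, dest_dir):
    """Build the ssh command that unpacks a tar stream read from stdin."""
    # ssh passes the remote command through a shell, so quote the path.
    return ["ssh", host, "tar -xf - -C " + shlex.quote(dest_dir)]


def move_batch(paths, host, dest_dir):
    """Send all files over a single ssh connection, then delete them locally.

    Returns the number of files moved. Local copies are removed only
    after the remote tar exits successfully.
    """
    paths = list(paths)
    if not paths:
        return 0  # nothing queued; do not open a connection at all
    archive = pack_files(paths)
    subprocess.run(ssh_unpack_command(host, dest_dir),
                   input=archive, check=True)
    for p in paths:
        Path(p).unlink()
    return len(paths)
```

A hypothetical call would look like move_batch(queued_files, "fang",
"/some/incoming/dir"), where the host and directory are placeholders.
Holding the archive in memory is reasonable for small files; for large
batches, streaming tar output directly into the ssh process would be
the safer choice.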
***
- Upgraded the clock board and all eight CPU modules on Jet and Spring.
The new CPUs are 366MHz, and are considerably faster than the old
167MHz modules. This resulted in the load average for the system dropping
from about 6 to about 1.5.
- Fixed the problem with netmon that was caused by the online system
running out of available process slots.