- Restored some files for Kate.
- Set up a mirror of the Eqexperts database on the internal web page on
Horst. Set up replication between the databases.
- Moved the home and opt partitions on to the new RAID on Jet. Also,
Ellen moved the database onto the new RAID. The old RAID went inactive
on Wednesday.
- Disabled Skey on Ehznorth.
- CISN meeting.
- A power supply on the AA RAID on Spring failed. Got RMA number 141318
from Anacomp [formerly nStor [formerly Andataco]].
- Solaris 2.8 upgrade on Iron. Did it as a fresh install so as to be
able to repartition the disk.
- Upgraded the Sun Forte compilers. The license is broken.
- Removed the nStor GigaRAID AA from Jet.
- Reinstalled qpopper and Xvfb on Iron.
- Installed the recommended patch cluster on Iron. This broke the console,
and required a 'reboot -- -r' to fix it.
- Downloaded Forte 6-2 from Sun. This was really a major pain, as it
meant downloading something like 20 files of 35+ MB each from their
web site. Installed it on Iron, but the licenses are still broken.
- CIIM rebooted mysteriously at 13:25 on Thursday. This is the third
time it has rebooted at this time, and it seems to be doing it on
alternate Thursdays. Wrote a note on the calendar to be in the computer
room at 13:25 on June 27th.
- Fixed Apache on Iron. It had been broken by the upgrade.
- Dog Day on Friday.
- The air conditioning in the S. Mudd computer room failed on Saturday.
Called Physical Plant at 395-4717. They already had someone working on
it.
- There was a brief power failure at 17:54 on Sunday. This seemed to
be city-wide, but the main campus was not affected.
- CIIM crashed and rebooted after the power hit at 17:54 on Sunday.
It also rebooted mysteriously at 22:46 and again at 01:33 Monday.
Eqinfo0 also crashed at 17:54. The fact that both of these systems
are on the same UPS leads to suspicion that there may be something
wrong with that UPS.
- Checked the UPS that CIIM is on. It is an APC BP500UC. CIIM and
Eqinfo0 are both on it. Put their estimated power draw into the APC
runtime calculator and it barfed. [See
http://www.tuxedo.org/~esr/jargon/html/entry/barf.html] CIIM draws too
much power for that little UPS. Tried moving CIIM to the big Fortress
UPS, but when it was turned on, the UPS started screaming. We need to
get another big UPS for this machine. In the meantime, it is not on
any UPS.
- M5.3 near Eureka on Monday.
- Replaced Stannum with a new Ultra-10 and installed Solaris 8 on it.
- M5.0 in southern Indiana. The Program web site was observed to be
a bit slow during the time of peak traffic. Increased the value of
MaxClients to allow Apache to start more server processes. Aside
from that, the earthquake web sites performed well. There is a brief
report on the traffic spike from this event at
http://bort.gps.caltech.edu/spikes/18jun2002/
- Restored the printer configuration on Iron.
- Added Maki Hattori to the paging list.
- Upgraded Apache on Bort, Agent86, Fang, and Flint in response to
CERT Advisory CA-2002-17.
- Met with ITS about the proposal to move the Trinet systems from
the 65 subnet to 61.
- Upgraded Apache on Horst, Graben, Sqehznorth, and Sqehzeast.
- Ordered parts for Dave Wald's new shakemap server from
http://www.newegg.com.
- Physical Plant reported a small flood in the telemetry room on Sunday.
Water in the drain pan in the air conditioner overflowed due to a
blocked drain.
- Ordered a rackmount case, et al., from CalPC.
- Fixed the compiler licenses on Iron.
- Upgraded Dacite and Wacke to Solaris 8.
- All the parts for Dawn [Dave's new machine] arrived on Wednesday. Assembled
the machine and booted it up for testing. Under heavy usage, it
exhibited some instability. This is most likely due to the fact that
the memory in it is not the right kind. Turns out we need registered
memory for it, and both the new machine and CIIM have the wrong
memory. For some reason, this does not seem to be affecting CIIM, but
we made arrangements to return the new memory and replace it with the
correct units. The correct memory arrived on Friday, and the system
appears to be stable now.
- Installed one of the new UPS units for the Trinet online system,
which will be moved into the 525 computer room.
- Solved the mysterious every-other-week crashing problem on CIIM.
The small UPS it had been attached to is an APC unit that automatically
self-tests every two weeks. At 13:23 on Thursday it performed its
self-test on schedule. When it self-tests, it flips over
to battery power briefly. Since CIIM drew too much current for the
UPS, this test would cause CIIM to crash every other Thursday at 13:24.
- Added some addresses to the 'allow' list for the scignmail mailing
list. The list used to be open posting, but it had to be restricted
after it was used to send spam.
- Installed the APC UPS monitoring software on Dawn for testing.
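
The every-other-Thursday pattern in the CIIM reboots noted above could
also be checked mechanically. The following is only a rough sketch in
Python: the function name, the 5-minute tolerance, and the reboot
timestamps are all assumptions made for illustration and are not taken
from any system log.

```python
from datetime import datetime, timedelta


def looks_periodic(times, period=timedelta(days=14),
                   tolerance=timedelta(minutes=5)):
    """Return True if each consecutive pair of timestamps is one
    period apart, give or take the tolerance."""
    ordered = sorted(times)
    if len(ordered) < 3:
        return False  # too few events to call it a pattern
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return all(abs(gap - period) <= tolerance for gap in gaps)


# Illustrative reboot times only (hypothetical, not from a real log).
reboots = [
    datetime(2002, 5, 16, 13, 25),
    datetime(2002, 5, 30, 13, 24),
    datetime(2002, 6, 13, 13, 25),
]

if looks_periodic(reboots):
    # Project the next occurrence one period after the latest event.
    next_expected = max(reboots) + timedelta(days=14)
    print("Next reboot expected around", next_expected)
```

A strict two-week spacing like this points at something on a timer,
such as a scheduled self-test, rather than a random hardware fault.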
***
- Installed a new Arena RAID unit on Jet to replace the old Andataco
array. The new RAID has about three times the capacity of the old
one, as well as being about three times faster.
- Began the operating system upgrades on the Trinet development
systems.
- Built a new machine for the Shakemap project. It is a dual-CPU
Athlon running FreeBSD. This will serve along with Willow for
creating shakemaps for Southern California events.