- Found some problems with the copy of the National Earthquake Hazards web
site on Ehzmenlo. Server-parsed pages were not being processed properly.
Called Will Prescott and had him add the handler for this. The fix
involved adding the following lines to the config file:
AddType text/html .shtml
AddHandler server-parsed .shtml
- There was one more problem with server-parsed HTML on Ehzmenlo. The fix
required adding "Includes" to the "Options" line in the Apache config file.
- Talked to Judy Konnert in Reston about space in their computer room for
the Squid server.
- Called Fred Riggle in Reston to get an IP address and networking
information for the Reston Squid server.
- Talked to Bill Frank at Airtouch about the SNPP server. Apparently, the
'snpp.airtouch.com' name resolves to two different machines that live at a
Southwestern Bell facility in Dallas. From there, they have a dedicated
link to the paging satellite uplink in Chicago. They are in the process of
arranging for the SNPP servers to be colocated at an ISP with multiple
backbone connections.
- There was a widely felt M2.2 in the San Fernando Valley on Monday evening.
This event initially did not show up on the Pasadena Simpson map. This
turned out to have been caused by the hack I put in watch_cnssm to strip
off control characters from QDDS messages. Due to file I/O timing problems,
the translation would sometimes leave the event file empty. Removing the
hack fixed the problem.
- Finished building the third Squid.
- Fixed Nick's pager address on the eqpager mailing list.
- Got back a disk from nStor. This is the replacement for RMA# 23159, which
was sent out on August 30.
- Did some experiments with virtual servers under Apache.
- Experimented with using Squid to accelerate multiple web sites.
- Helped Kate fix a bad map on the Trinet weekly report web page.
- Spring had a problem on Friday night where the rtem account could not
fork off any more processes. Rebooted at 21:32 to fix it. Set up scripts
to monitor and graph CPU load, process counts, and memory usage on Jet and
Spring.
- The system disk on Bigone was full again. Purged a large number of
SHSYS.TMP files in SYS$MANAGER.
- Upgraded Big Brother to version 1.5d1 to fix a security hole that was
discovered over the weekend.
- Configured the SNMP agents on the Squid servers and mailing list server
to restrict access to system information.
- Set up logging and graphing of access counts on the earthquake FAQ and
kids pages.
- The USGS Squid server crashed at 22:48 on Tuesday. The logs showed no
errors, which suggests a possible hardware problem.
- Swapped hardware between Bort and Usgs-squid in hopes of getting the
flaky hardware out of a critical role.
- Upgraded Bort to FreeBSD 4.1.1.
- Installed fetchmail for testing.
- Found a typo in the ml_sub_request CGI script that prevented mailing list
subscription requests from being processed.
- Got a call from the Army Corps of Engineers in Sacramento asking about
how to get email event notifications. Referred them to the mailing list
server.
- The external disk on Avalon failed. Replaced it with another identical
old disk.
- Fixed Kate's .Xdefaults file so that her VMS-style terminal windows will
work properly on Silver.
- Got some information about the new S. Mudd UPS and its SNMP capabilities.
- Did the Earthquake Fair at the Beverly Hills Farmers Market on Sunday.
- Installed two new CPUs in Jet.
- After the CPU install on Jet, self-test reported a failure on Board 0,
SIMM J3801. This SIMM has been reporting intermittent correctable errors
for several months, but it has apparently failed for good this time.
Called Western Scientific and arranged to get a replacement.
- The replacement memory kit for Jet arrived, and it was the wrong kit.
Called Western Scientific and arranged to get the correct kit.
- Added the read_uredas process to the list for Big Brother to check on
Granite.
- Called Omega Travel to make arrangements to go to the Web Team meeting in
Golden on November 6th.
- Got fetchmail to retrieve mail from the POP server in Menlo Park.
- Called Jim Fisher about possibly setting up a name in DNS for the two
back-end servers that will be serving the national pages.
- Had Spencer's appliance repair (449-5800) come out to fix the
refrigerator in 535.
- Sent the incorrect memory kit back to Western Scientific on Wednesday
morning.
- Got the correct memory kit and installed it.
- During the memory kit installation, discovered that the side vents on Jet
are blocked by the rack it is in, and the system is severely overheating.
Bent the cover open on one side of the rack and borrowed a fan to blow air
into the rack while we decide how to fix this.
http://bort.gps.caltech.edu/stan/mail-archive/msg00004.html
- Talked to Network Solutions and got the DNS information for anss.org
corrected.
- The main circuit breaker for the telemetry room tripped at 16:04 on
Wednesday. Jet and Spring did not come back up until 19:03, and both
required human intervention to fix their filesystems.
http://bort.gps.caltech.edu/stan/mail-archive/msg00003.html
- Physical Plant came out on Thursday morning to fix the air conditioner,
which was apparently the cause of the circuit breaker problems on Wednesday.
- Set up a virtual server on Agent86 to act as www.anss.org. Set up
redirects for the two old URLs.
- Registered tsunami2 for 131.215.66.194 for Jocey.
- Talked to Charlene in Menlo Park about setting up Eqwebback as an alias
for 130.118.61.34 and 130.11.44.132. These are the two back-end servers
for earthquake.usgs.gov.
- Sent the defective memory from Jet back to Western Scientific. Used RMA#
S6493.
- Did some research on how to speed up reboots after an online system crash.
Talked to Jeff Weiss at the local Sun office (310-640-7814) and got a copy
of a performance tuning White Paper. Found some kernel parameters that
needed adjustment. Also found out that disk I/O to the RAID is the major
performance bottleneck on the online systems. Wrote up my findings at
http://bort.gps.caltech.edu/stan/mail-archive/msg00006.html
- Tested UFS logging as a way to speed up reboots. This feature seems to
work as advertised. The test results are recorded at
http://bort.gps.caltech.edu/stan/mail-archive/msg00005.html
- Set up monitoring of Spring CPU states.
- Shipped out the new Squid servers to Reston and Menlo Park.
- Researched options for a new laser printer and a new large-format plotter
for the office. The winners appear to be the HP LaserJet 4050N and the HP
DesignJet 1055CM.
- Got mail from a Berkeley computer science grad student who is doing a
paper on scaling Internet services for large loads. He asked for a copy of
our web logs for October 16, 1999 in order to study an actual traffic
surge. Due to privacy concerns, I sanitized the log file by removing all
of the IP addresses before making it available to him.
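
The sanitizing step in the last item can be sketched roughly as follows.
This is a minimal illustration, not the script actually used. It assumes
the web log is in Common Log Format, with the client address as the first
space-delimited field on each line. The names sanitize_line,
sanitize_stream, and PLACEHOLDER are hypothetical.

```python
PLACEHOLDER = "-"  # hypothetical stand-in written in place of the client address


def sanitize_line(line):
    """Return the log line with its first field (the client address) replaced.

    Assumes Common Log Format, where the client address is the first
    space-delimited field. Everything after that field is left untouched.
    """
    host, sep, rest = line.partition(" ")
    if not sep:
        # No space found: either a blank line or a lone token. Pass blank
        # lines through, and drop a lone token so no address can leak.
        return PLACEHOLDER + "\n" if line.strip() else line
    return PLACEHOLDER + sep + rest


def sanitize_stream(src, dst):
    """Copy every line from src to dst with the client address removed."""
    for line in src:
        dst.write(sanitize_line(line))
```

To use it, open the original log for reading and a new file for writing,
then pass both to sanitize_stream. Replacing the field with a placeholder
rather than deleting it keeps the remaining columns in their usual
positions, so standard log analysis tools can still parse the result.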
***
- Built up three new Squid servers for use in serving the national
Earthquake Hazards web pages. They will be located in Reston, Menlo Park,
and Pasadena.
- Got the DNS information for anss.org set up. This is the domain for the
Advanced National Seismic System. The new web site URL is http://www.anss.org
- Dealt with the system crashes when the main circuit breaker in the
computer room tripped. The power spike from this event caused major damage
to several systems, including burning out several devices.