The server is a Sun Netra with a 296 MHz UltraSPARC processor and 256 MB of memory. It has one 4GB internal disk with Solaris 2.6, and three external disks (two 4.2GB and one 9GB) which hold the web pages. One disk is set up as a backup for the other to provide a measure of redundancy. The three external disks are mounted as /webd1, /webd1/ncweb and /webd2. A cron job runs each night to copy the contents of the scweb, ncweb, and scign directories from /webd1 to /webd2. The machine has two Ethernet interfaces, but at the present time only one of them is in use. It is an auto-sensing 10/100 Mb/s card, and is currently hooked up to a 100 Mb/s Ethernet port. The device name is hme0.
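The nightly copy from /webd1 to /webd2 described above could be sketched roughly as follows. This is only an illustration in Python (3.8 or later, for `dirs_exist_ok`); the actual script run by cron is not shown on this page, and the function name `mirror` is made up for the example. Only the directory names and mount points come from the description above.

```python
import shutil
from pathlib import Path

# Directory names taken from the description above.
DIRECTORIES = ("scweb", "ncweb", "scign")


def mirror(source_root, backup_root, names=DIRECTORIES):
    """Copy each named directory from source_root into backup_root.

    Existing files in the backup are overwritten. Files deleted from the
    source are NOT removed from the backup. Returns the names copied.
    """
    source_root = Path(source_root)
    backup_root = Path(backup_root)
    copied = []
    for name in names:
        src = source_root / name
        if not src.is_dir():
            # Skip anything that is missing rather than failing the whole run.
            continue
        shutil.copytree(src, backup_root / name, dirs_exist_ok=True)
        copied.append(name)
    return copied
```

On this machine the call would be `mirror("/webd1", "/webd2")`. Note that a plain copy like this never prunes the backup, so pages removed from /webd1 would linger on /webd2 until cleaned up by hand.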
The Netra is running the Netscape Enterprise server. There are five separate servers defined on the machine. The first is http://scweb-south.gps.caltech.edu, which is the back-end server for the USGS web pages. The actual pasadena.wr.usgs.gov URL points to the USGS Squid server, which acts as an HTTP accelerator to increase our web service capacity. The history of how this configuration came to be can be found in my short article, Web Servers, Earthquakes, and the Slashdot Effect. The second server on ehzsouth is a mirror of the Menlo Park server, and the third is a mirror of the main USGS Earthquake Hazards page. The third server is not enabled at this time. The fourth server is the SCIGN page. A fifth server is currently set up for testing purposes.
These addresses are defined in /etc/inet/hosts and in /etc/hostname.hme0:*.
The directory for the USGS web pages is /webd1/scweb. This directory is exported for other Unix machines to mount. In addition, the scweb server is configured to restrict access to the /webd1/scweb/pasadena directory, which contains the internal pages. Access is allowed for:
It was necessary to define the access control using IP addresses, since DNS lookups are not currently enabled in the server.
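To illustrate what address-based access control amounts to, here is a small Python sketch that checks a client address against a list of permitted networks without any DNS lookup. The networks shown are placeholder documentation ranges (RFC 5737), not the addresses actually permitted on this server, and the function name `is_allowed` is invented for the example. The Netscape server's own access control configuration is not reproduced here.

```python
import ipaddress

# PLACEHOLDER networks from the RFC 5737 documentation ranges.
# These are NOT the real addresses allowed to see the internal pages.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.16/28"),
]


def is_allowed(client_ip):
    """Return True if client_ip lies inside any allowed network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)
```

Because the check works on raw addresses, it does not care whether the client has a reverse DNS entry, which is the situation when DNS lookups are switched off in the server.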
There is a file in the top level directory /webd1/scweb called robots.txt. This is information for web-crawling robots to tell them which directories to ignore in cataloging our site. The contents of this file are:
User-agent: *
Disallow: /cgi-bin/
Disallow: /pasadena/

This tells the 'bots to ignore the cgi-bin and internal web pages directories. More information on this can be found at The Web Robots Pages.
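One way to confirm how a well-behaved robot would read these rules is to feed them to the robots.txt parser in the Python standard library. This is a sketch only: the robot name `ExampleBot` and the page paths in the usage note are made up for illustration, and only the three rule lines and the server name come from this page.

```python
from urllib.robotparser import RobotFileParser

# The rules exactly as they appear in /webd1/scweb/robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /pasadena/
"""

BASE_URL = "http://scweb-south.gps.caltech.edu"


def build_parser(text):
    """Return a parser loaded with the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def robot_may_fetch(parser, path, agent="ExampleBot"):
    """Return True if the named robot is permitted to fetch the path."""
    return parser.can_fetch(agent, BASE_URL + path)
```

With `parser = build_parser(ROBOTS_TXT)`, a call such as `robot_may_fetch(parser, "/pasadena/index.html")` reports whether the internal pages are off limits to a compliant crawler. Keep in mind that robots.txt is advisory; it is the server's access control, not this file, that actually protects the internal pages.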
The directory for the SCIGN pages is /webd1/scign. This is owned by the scignwww account and the scignwww group.
The Netscape server includes an administrative web page. This has options for controlling the server. See me (Stan) for instructions on how to access the admin server.
The server log files are stored in /opt/netscape/suitespot/https-scweb-south/logs. The relevant files are access-new and errors. The default log name is access; the access-new name is used to allow for logging of referring URLs. There is a cron job that runs at 08:00 on Wednesday mornings to rotate the log files. This runs in the root crontab. The relevant crontab line is:
0 8 * * 3 /opt/netscape/suitespot/https-scweb-south/rotate
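For anyone unfamiliar with the crontab format, the five leading fields are minute, hour, day of month, month, and day of week (0 is Sunday), followed by the command. The short Python sketch below splits the line above into those fields; the helper names `describe_cron` and `weekday_name` are made up for this illustration, and it handles only simple single-number fields like the ones used here.

```python
DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]


def describe_cron(line):
    """Split a crontab line into its five schedule fields and the command."""
    minute, hour, day_of_month, month, day_of_week, command = line.split(None, 5)
    return {
        "minute": minute,
        "hour": hour,
        "day_of_month": day_of_month,
        "month": month,
        "day_of_week": day_of_week,
        "command": command,
    }


def weekday_name(field):
    """Translate a single numeric day-of-week field into a name.

    Anything that is not a plain number (such as "*") is returned as-is.
    """
    if field.isdigit():
        return DAY_NAMES[int(field) % 7]
    return field
```

Applied to the rotate line, the hour field is 8, the minute field is 0, and the day-of-week field 3 maps to Wednesday, which matches the schedule described above.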
There is a statistics package on the machine called Wusage ("DUBH-yoosij"). It is in /opt/WEBSTATS, and the relevant configuration file for our web statistics reports is scwebsouth.conf. The package is set up to generate a statistics report every day. The reports go into /webd1/scweb/stats. This runs in the usgswww account crontab, and the relevant line is:
0 1 * * * /opt/WEBSTATS/wusage -c /opt/WEBSTATS/scwebsouth.conf
This can also be run by hand to update the reports during the day. This should be done from the usgswww account, and the applicable commands are:
cd /opt/WEBSTATS
./wusage -c scwebsouth.conf
The manual for wusage is here.
The statistics for the scweb server are here.
Additional software installed on ehzsouth:
Maintained by Stan Schwarz
AlliedSignal Technical Services
Pasadena, California