[Scedc_users] SCEDC Newsletter - Volume 2, Issue 2

scedc_users at hungabee.gps.caltech.edu scedc_users at hungabee.gps.caltech.edu
Wed Dec 28 17:38:24 PST 2005


****************************************************************

Data Center Chronicles - E-News from the SCEDC

****************************************************************

Welcome to the fifth issue of the Southern California Earthquake Data
Center's electronic newsletter. We produce this compilation of news and
information about the SCEDC as part of our continuing efforts to keep users
informed about the Data Center and promote the data, tools and services we
provide at the SCEDC. 

For a web-based version of this newsletter, please click on the link below
or paste the URL into your browser's address bar: 
http://www.data.scec.org/about/chronicle/vol2issue2.html 

If you would like to subscribe to our mailing list, you can sign up (or
unsubscribe) at: http://www.data.scec.org/mailman/listinfo/scedc_users.
Please send your questions, comments and suggestions on this newsletter or
any SCEDC issues to vikki at gps.caltech.edu. 

****************************************************************

Fall, 2005

In This Issue:

A. The Archive
B. What's new with STP (Seismic Transfer Program)?
C. What's new on the SCEDC Website?
D. Searchable Catalog of Scanned Analog Seismic Records
E. Searchable Catalog of Moment Tensor Solutions
F. Google Output Available from the SCEDC
G. Location Codes: Coming Soon to a Seismogram Near You!
H. Highlight: the Station Information System
I. Email Virus Alert

****************************************************************

A. The Archive

The Archive: By the Numbers

Number of earthquakes in the 1932-present Caltech/USGS catalog:  623,872
Total size of the waveform archive:                              6,893 GB
Size of SCEDC parametric and waveform database:                  239,552,775 rows

Data transferred via STP:

Q1: January 1-March 31:
* 17,325,347 waveforms = average of 191,667 waveforms daily = 2.22
waveforms per second!
* 532 gigabytes of waveform data = average of 6,051 megabytes daily =
70 kilobytes per second.
Q2: April 1-June 30:
* 3,156,771 waveforms = average of 34,690 waveforms daily
* 465 gigabytes of waveform data = average of 5,111 megabytes daily =
59 kilobytes per second.
Q3: July 1-September 30:
* 1,671,318 waveforms = average of 18,570 waveforms daily
* 271 gigabytes of waveform data = average of 3,018 megabytes daily =
35 kilobytes per second.
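
For readers who want to check the per-second figures, here is a minimal
Python sketch of the conversion from the daily averages quoted above. The
function names are illustrative only, and the sketch assumes decimal units
(1 megabyte = 1,000 kilobytes), which is what the published numbers appear
to use.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_second(daily_count):
    """Convert a daily average into a per-second rate."""
    return daily_count / SECONDS_PER_DAY


def kilobytes_per_second(megabytes_per_day):
    """Convert MB/day to kB/s, assuming 1 MB = 1,000 kB (decimal units)."""
    return megabytes_per_day * 1000 / SECONDS_PER_DAY


# Q1 waveform rate from the daily average quoted above
print(round(per_second(191_667), 2))

# Q1-Q3 data rates from the daily megabyte averages quoted above
for mb_daily in (6_051, 5_111, 3_018):
    print(round(kilobytes_per_second(mb_daily)))
```

Rounded, these reproduce the 2.22 waveforms per second and the 70, 59 and
35 kilobytes per second listed above.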

From January 1 - Sept 30, 2005, the SCEDC archived:

* 11,975 events
* 3,037,653 waveforms
* 239,855 arrivals
* 890,742 amplitudes

Magnitude    Number of local events (le)
----------------------------------------
0-1           3,118
1-2           6,012
2-3           1,290
3-4             132
4-5              20
5-6               4

# events     Event type
----------------------------------------
10,576       le (local event)
   473       qb (quarry blast)
   936       re (regional event)
    99       sn (sonic blast)
   578       ts (teleseism)
----------------------------------------
12,662       Total

Six-month summary of requests for catalog information:

Jan	74,812 
Feb	60,048 
March	67,981 
April	120,219 
May	166,136 
June	690,487 
-------------------------------------
Total:	1,179,683

Continuous Archiving of High-Sample Rate Data

The SCEDC continuously archived high sample-rate data (HH_, HL_ (80 sps)
and/or EH_, EL_ (100 sps)) for the following significant events:


Obsidian Butte Swarm
EVID: 14179736  Mag = 5.1
Origin date/time: 2005/09/02, 01:27:19
lat/long: 33.1598, -115.6370
EVID: 14178184  Mag = 4.6
Origin date/time: 2005/08/31, 22:47:45
lat/long: 33.1648, -115.6357
channels/time available:	HH_, HL_, EH_
Continuous archive from 2005/08/31,00:00:00 to 2005/09/06,00:00:00

Anza/Yucaipa Events
EVID: 14151344  Mag = 5.2
Origin date/time: 2005/06/12, 15:41:46
lat/long: 33.5288, -116.5727
EVID: 14155260  Mag = 4.9
Origin date/time: 2005/06/16, 20:53:2
lat/long: 34.058, -117.0113
channels/time available: HH_, HL_, EH_
Continuous archive from 2005/06/12,12:00:00 to 2005/06/18,00:00:00

Wheeler Ridge Event
EVID: 14138080  Mag = 5.2
Origin date/time: 2005/04/16, 19:18:13
lat/long: 35.0272, -119.1783
channels/time available: HH_, HL_ / -6h, +12h

More information on this topic is available at
http://www.data.scec.org/about/sigeventsshot.html


B. What's new with STP (Seismic Transfer Program)?

Additional STP Server

When significant events occur and large numbers of users log on to STP, you
may find yourself waiting in line. To accommodate the increasing number of
STP users, we have added a third server that accesses a read-only database,
which is replicated from our two main databases. This addition will increase
the reliability of our service and allow more simultaneous users. If one of
our servers is at full capacity, the STP client will automatically connect
to the next server in a seamless process. 

SAC2000 Module for Macintosh

Last year, the SCEDC released SAC2000 modules for Linux and Solaris that
enable users to issue STP commands directly from within SAC2000. Now our Mac
users will have the same flexibility. We have developed a SAC2000 module for
Mac OS X that we are currently testing. This module will be available for
download very soon.


C. What's new on the SCEDC Website?

3D Velocity Model for Southern California: Version 4 now available

The Three-Dimensional Community Velocity Model for Southern California
provides a unified reference model for the several areas of research that
depend on the subsurface velocity structure in their analysis. These include
strong motion modeling, seismicity location, and tomographic velocity
modeling. It is also hoped that the geologic community will find the basin
models useful because they are based on structures and interfaces that are
largely derived from geologic structure models. 

The Community Velocity Model has been released in progressive versions, and
we recommend using Version 4 rather than earlier versions. Version 4 of the
SCEC model is available at: http://www.data.scec.org/3Dvelocity/


One-Stop SCEDC Software and Downloads Page

The SCEDC has made all of our software tools, catalogs, models and waveform
retrieval tools available from a single download page at:
http://www.data.scec.org/research/downloads.html. This page has the most
recent versions of all of the products and the software we produce and host.


Website Map

The SCEDC has a lot of great information available on our website. To make
sure that you can find what you're looking for, or discover something you
didn't know we had, we've built a website map at:
http://www.data.scec.org/sitemap.html. This map is accessible from most of
the SCEDC's web pages via the red "website map" link below the left-hand
navigation menu.

D. Searchable Catalog of Scanned Analog Seismic Records

The Caltech Seismological Lab recently completed a project to scan
pre-digital analog recordings of major earthquakes recorded in Southern
California. We have scanned records for M>3.5 earthquakes between 1962 and
1992 and other significant teleseisms. These scans are now available for
download through our new search page at
http://www.data.scec.org/research/scans/. Search features include the
ability to search by date, station, instrument, and orientation; the option
to sort search results by date; and the option to download multiple files as
a single zipped archive.

There are two output formats for the scanned results:
1. Raster image format (TIFF) for 1-90 intermediate period record (1 sec
seismometer free period, 90 sec galvanometer free period) and 30-90
long-period record (30 sec seismometer free period, 90 sec galvanometer free
period).
2. High-resolution JPEG format for WA (Wood-Anderson) records with file
sizes ranging from 3-8 Megabytes.

The naming format for the scanned records follows the convention:
NET_STA_BAND_INSTR_DIR_YYYYMMDD_HHMM

Example:
 CI_PAS_30-90_N_19690108_1500.tiff
 The north-south 30-90 record for Pasadena beginning at 1500 UTC on January
8, 1969.
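
If you plan to process many scans, a small parser for these file names can
help. The following Python sketch is only an illustration: the function name
and the returned field names are our own invention. Note that the example
above has one fewer underscore-separated field than the template, so the
sketch treats everything between the station and the direction as a single
band/instrument description rather than assuming a fixed number of fields.

```python
from datetime import datetime
from pathlib import PurePosixPath


def parse_scan_name(filename):
    """Split a scanned-record file name into its parts (hypothetical helper)."""
    stem = PurePosixPath(filename).stem  # drop the .tiff / .jpg extension
    parts = stem.split("_")
    if len(parts) < 6:
        raise ValueError(f"unexpected scan file name: {filename!r}")
    # First two fields are network and station; the last three are
    # direction, date (YYYYMMDD) and start time (HHMM).
    net, sta = parts[0], parts[1]
    direction, date_str, time_str = parts[-3], parts[-2], parts[-1]
    # Whatever lies in between describes the band and/or instrument.
    band_instrument = "_".join(parts[2:-3])
    start = datetime.strptime(date_str + time_str, "%Y%m%d%H%M")
    return {
        "network": net,
        "station": sta,
        "band_instrument": band_instrument,
        "direction": direction,
        "start_utc": start,
    }


info = parse_scan_name("CI_PAS_30-90_N_19690108_1500.tiff")
print(info["station"], info["band_instrument"], info["direction"], info["start_utc"])
```

For the example file this yields station PAS, the 30-90 record, direction N
and a start time of 15:00 on January 8, 1969.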

E. Searchable Catalog of Moment Tensor Solutions

The SCEDC is currently archiving Moment Magnitudes and Moment Tensor
Solutions (MTS) produced by the SCSN, both in real time and in
post-processing, for events dating back to 1999. These solutions are in the
SCEDC searchable database and are available for distribution from the
consolidated catalog search page (Moment Tensors tab) at:
http://www.data.scec.org/catalog_search/CMTsearch.php.

The automatic MTS runs on all local events with Ml>3.0, and all regional
events with Ml>=3.5 identified by the SCSN real-time system. The solution is
emailed to SCSN personnel about 10 minutes after an event. If the quality of
waveform fits is good enough, and the event is within the SCSN reporting
region, it is automatically distributed to the outside world. The
distributed solution automatically creates links from all USGS Simpson Maps
to a text e-mail summary solution, creates a .gif image of the solution, and
updates the moment tensor database tables at the SCEDC. The solution can
also be modified using an interactive web interface, and re-distributed. The
SCSN Moment Tensor Real Time Solution is based on the method developed at UC
Berkeley by Doug Dreger. 
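
As a rough illustration of the magnitude thresholds described above, the
selection rule could be written as follows. This is a sketch, not the SCSN
real-time code: the function name is hypothetical, and we assume the event
type codes "le" (local) and "re" (regional) from the archive tables in
Section A.

```python
def triggers_automatic_mts(event_type, ml):
    """Return True if an event meets the magnitude rule for an automatic MTS.

    Assumed rule, taken from the text: local events need Ml > 3.0,
    regional events need Ml >= 3.5; other event types never qualify.
    """
    if event_type == "le":
        return ml > 3.0
    if event_type == "re":
        return ml >= 3.5
    return False


for etype, ml in (("le", 3.0), ("le", 3.1), ("re", 3.4), ("re", 3.5)):
    print(etype, ml, triggers_automatic_mts(etype, ml))
```

Note the asymmetry at the boundaries: a local Ml 3.0 event does not qualify,
while a regional Ml 3.5 event does.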

F. Google Output Available from the SCEDC

KML (Google Earth) Catalog Output

If you frequently use our catalog search at
http://www.data.scec.org/catalog_search/, you may notice a new output
format, KML. KML, or Keyhole Markup Language, is an XML-based language for
creating files that can be loaded into Google Earth, an application that
functions as an interactive 3D globe, letting you seamlessly zoom in on any
part of the world from a global scale down to a few meters above the
ground.

Viewing your search results in Google Earth offers many exciting features: 
* Zoom in on event epicenters. 
* Tilt the view to study terrain from different angles, with 3D rendering in
some areas. 
* "Fly-by" tour of your search results.  

Google Earth support has been implemented for date/magnitude/location, event
ID, and polygon searches. To use this new feature, select KML in the output
format pull-down box when you search. If you are directly saving your search
results to an output file, make sure that its name ends in .kml. If your
search results are being displayed in a web page, copy and paste the
complete results to a text file whose name ends in .kml. Load your search
results by opening Google Earth and then opening your .kml file. Placemarks
for the search results will be displayed in the left-hand menu under
"Temporary Places" in a subfolder named "SCEDC Catalog Search Results."  

Google Earth can be downloaded from http://earth.google.com. More
information about the KML schema is available at
http://code.google.com/apis.html#earth. Google Earth is currently available
only for Windows. 
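
For users who would like to build their own .kml files, here is a minimal
Python sketch that writes placemarks into a folder. It is not the code
behind our catalog search: the function name and the placemark naming are
hypothetical, and the namespace URI shown is the one used by later KML
versions, so treat it as an assumption. The one detail worth remembering is
that KML coordinates are written longitude first, then latitude.

```python
import xml.etree.ElementTree as ET

# Assumption: KML 2.2 namespace; older Google Earth releases used a different URI.
KML_NS = "http://www.opengis.net/kml/2.2"


def events_to_kml(events, folder_name="SCEDC Catalog Search Results"):
    """Build a KML document from (evid, magnitude, lat, lon) tuples."""
    ET.register_namespace("", KML_NS)  # serialize KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    folder = ET.SubElement(kml, f"{{{KML_NS}}}Folder")
    ET.SubElement(folder, f"{{{KML_NS}}}name").text = folder_name
    for evid, mag, lat, lon in events:
        placemark = ET.SubElement(folder, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = f"{evid} M{mag}"
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML order is longitude,latitude,altitude
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat},0"
    body = ET.tostring(kml, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# The two Obsidian Butte events listed in Section A
doc = events_to_kml([
    (14179736, 5.1, 33.1598, -115.6370),
    (14178184, 4.6, 33.1648, -115.6357),
])
print(doc)
```

Save the returned text (as UTF-8) to a file whose name ends in .kml and open
it in Google Earth as described above.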

Google Map Catalog Output

The second new product available from the drop-down "Output Format" menu on
the catalog search page is "Google Map," which will plot the results of your
query directly onto a Google Map. The map icons are color-coded by
magnitude, i.e., all magnitude 1-2 markers are white, magnitude 2-3 markers
are purple, etc. The earthquake's magnitude is displayed when you mouse over
the icon, and more information (time/date, event ID, latitude/longitude,
depth and magnitude) about the event is displayed when you click on the
event's icon.
This development is ongoing, and we are working to improve the response for
larger queries, which currently take a much longer processing time, so
kindly limit your queries to shorter time periods.


G. Location Codes: Coming Soon to a Seismogram Near You!

The SCEDC uniquely describes the seismograms we archive and distribute using
the FDSN (International Federation of Digital Seismograph Networks) naming
system, which includes the following four fields:

1.  Network (2 characters)
2.  Station (3-5 characters)
3.  SEED Channel (3 characters; see SEED Reference Manual Appendix A)
4.  Location Code (2 characters)

e.g., NN.SSS.CCC.LL

The FDSN standard has always included the Location Code field, but it was
not used by the SCSN until recently. The SCEDC will now use the Location
Code field to uniquely identify SCSN data streams. 

Location Code is used to distinguish between multiple seismograms with
identical station and channel names. For example, a station equipped with
both STS-1 and STS-2 broadband high gain seismometers would produce two data
streams with the same net.sta.cha identification. Also, the SCSN uses
orientation codes of [1,2,3] for data channels from downhole sensors and
[Z,N,E] for traditionally-oriented surface channels. Without Location Codes,
we cannot have multiple downhole sensor packages without changing the
station or channel names on the second to n-th downhole sensors.

Currently, the default value for SCSN Location Codes is blank, i.e., that
field contains two blank spaces. For instances where a different Location
Code is necessary to uniquely describe a data stream, the SCSN will follow
the SEED convention of allowed characters (A-Z, 0-9, space) and identify
streams with "01" for the first non-unique stream, "02" for the second, etc.
Users (or their software) should not assume that Location Codes have a
meaning; the SCSN will not use this field to encode information like
emplacement depth, preferred channel, sensor type, orientation, etc. However,
the full SCNL (station, channel, network, location) description can be used
as a unique key into complete descriptive information about the
characteristics of the data stream.

SCEDC users should be aware that if you do not specify a Location Code in
your data request, the Data Center will provide all seismograms that match
that net.sta.channel, so you may receive multiple seismograms where you only
expect one. For ASCII output where Location Code is a whitespace-delimited
field, a blank-blank Location Code will be assigned "--" and when parsing
ASCII input, "--" should be interpreted as 2 blanks.
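
If your software reads or writes our ASCII output, the blank-blank
convention above amounts to a pair of small conversions. The helper names in
this Python sketch are hypothetical; only the "--" placeholder and the
two-space default come from the description above.

```python
BLANK_LOCATION = "  "     # default Location Code: two blank spaces
ASCII_PLACEHOLDER = "--"  # how a blank code appears in whitespace-delimited text


def location_to_ascii(code):
    """Encode a Location Code for whitespace-delimited ASCII output."""
    return ASCII_PLACEHOLDER if code.strip() == "" else code


def location_from_ascii(field):
    """Decode a Location Code field read from ASCII input."""
    return BLANK_LOCATION if field == ASCII_PLACEHOLDER else field


print(repr(location_to_ascii("  ")))    # blank code written as "--"
print(repr(location_from_ascii("--")))  # "--" read back as two blanks
print(repr(location_from_ascii("01")))  # non-blank codes pass through
```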

The naming convention for seismograms will be to only include the Location
Code in the filename if it is something other than the default value of
blank-blank:

Triggered Waveforms
Now:
14176696.CI.USB.HLE.sac
Future:
14176696.CI.USB.HLE.sac and
14176696.CI.USB.HLE.01.sac (if there is an 01 Loc Code)

Continuous Waveforms 
Now:
20050831000000.CI.USB.HLE.sac
Future: 
20050831000000.CI.USB.HLE.sac and
20050831000000.CI.USB.HLE.01.sac (if there is an 01 Loc Code)
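
The file naming rule above can be sketched in a few lines of Python. The
function name and argument names are hypothetical; the prefix is the event
ID for triggered waveforms or the start time for continuous waveforms, as in
the examples above.

```python
def waveform_filename(prefix, net, sta, chan, loc="  "):
    """Build a SAC file name, adding the Location Code only when it is set.

    A blank Location Code (two spaces, or "--" from ASCII input) is left
    out of the name.
    """
    parts = [prefix, net, sta, chan]
    if loc.strip() not in ("", "--"):
        parts.append(loc)
    return ".".join(parts) + ".sac"


print(waveform_filename("14176696", "CI", "USB", "HLE"))
print(waveform_filename("14176696", "CI", "USB", "HLE", "01"))
print(waveform_filename("20050831000000", "CI", "USB", "HLE", "--"))
```

This prints 14176696.CI.USB.HLE.sac, 14176696.CI.USB.HLE.01.sac and
20050831000000.CI.USB.HLE.sac, matching the examples above.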


H. Highlight: the Station Information System

The Data Center has developed the Station Information System (SIS) to manage
station metadata for the California Integrated Seismic Network (CISN)
Southern California Management Center (SCMC). The goal of this project was
to develop a simplified database-driven system that can interact with a
single database source to enter, update and retrieve station metadata easily
and efficiently. 

Over the course of this project, the SCEDC:
* redesigned the database schema;
* built a dynamic PHP website that lets users view hardware and other
station information held in the SIS database at
http://www.data.scec.org/stations/views/sta_hardware.php (also available
via the "Display Station Hardware Information" link on our main
Stations/Instrumentation page at
http://www.data.scec.org/stationinfo.html);
* built a Graphical User Interface to interact with the database; and
* migrated all of the online broadband, K2 and short-period telemetered
station data into the SIS database.

All station field changes that result in a change of a station's response
have been recorded in the SIS using the SIS GUI since 11/01/2005. Dataless
SEED volumes are now generated from data held in SIS via a stored procedure,
and all SEED volumes created since 11/01/2005 are from the SIS database. The
IRIS DMC has verified the dataless volumes produced by this system.


Redesigned Database Schema

The SCEDC staff has considerable education and experience with databases, so
we knew that we should invest significant time and effort on the data model,
which has had a very positive impact on the end product. The SIS's highly
normalized logical data model (an ERD is available from
http://www.data.scec.org/stations/SIS/SIS_ERDV4.jpg) is implicitly designed
for performance. If a database is not well modeled, the problems quickly
become apparent to both the applications and the users. The SIS's
well-designed data model reduces the need for programming changes and
increases application maintainability.

Normalization in a database design allows for efficient access and storage
of data in a relational database. The purpose of the normalization process
is to reduce redundancy (same information stored more than once) and secure
data integrity (ensure that the database contains valid information). This
is achieved by splitting large entities into several smaller entities,
which together contain the same information without repeating it.
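
To make the idea concrete, here is a toy example in Python using the sqlite3
module from the standard library. It is not the SIS schema (see the ERD
linked above for that): the table names, column names and the XX.DEMO
station are invented for illustration, and SQLite is used only because it
ships with Python. Station attributes are stored once, and each channel row
refers to its station by key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # have SQLite enforce references
conn.executescript("""
CREATE TABLE station (
    station_id INTEGER PRIMARY KEY,
    net        TEXT NOT NULL,
    sta        TEXT NOT NULL,
    site_name  TEXT,
    UNIQUE (net, sta)
);
CREATE TABLE channel (
    channel_id INTEGER PRIMARY KEY,
    station_id INTEGER NOT NULL REFERENCES station(station_id),
    seedchan   TEXT NOT NULL,
    UNIQUE (station_id, seedchan)
);
""")

# One station row, three channel rows that point at it.
cur = conn.execute(
    "INSERT INTO station (net, sta, site_name) VALUES (?, ?, ?)",
    ("XX", "DEMO", "Demo site"),
)
station_id = cur.lastrowid
conn.executemany(
    "INSERT INTO channel (station_id, seedchan) VALUES (?, ?)",
    [(station_id, chan) for chan in ("HHZ", "HHN", "HHE")],
)

# Reduced redundancy: the site name is corrected in exactly one place.
conn.execute(
    "UPDATE station SET site_name = ? WHERE station_id = ?",
    ("Renamed demo site", station_id),
)
rows = conn.execute("""
    SELECT s.net, s.sta, c.seedchan, s.site_name
    FROM channel AS c JOIN station AS s USING (station_id)
    ORDER BY c.channel_id
""").fetchall()

# Data integrity: a channel for a station that does not exist is rejected.
try:
    conn.execute(
        "INSERT INTO channel (station_id, seedchan) VALUES (?, ?)",
        (9999, "HHZ"),
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

for row in rows:
    print(row)
print("orphan channel rejected:", orphan_rejected)
```

In a denormalized design the site name would be repeated on every channel
row, and application code would have to keep those copies in step.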

Every time a decision to denormalize the database is made, a price is paid.
The cost is lost flexibility, future scalability, performance, and data
integrity. By denormalizing, there is data redundancy in the database, which
needs to be managed through program code, either at the GUI or by using
triggers. Denormalization may solve one part of dealing with performance,
but it creates possible performance problems in several other areas and data
integrity is at high risk. A clean, normalized database always gives good
performance and preserves data integrity. 

Database Packages, Procedures and Functions

With this project, the SCEDC requested that embedded SQL statements NOT be
allowed in applications. All SQL that is routinely executed was written as
stored procedures and functions, contained in database packages. Database
stored procedures for most common field operations (sensor swaps, new
station installs) have been written and are used by the GUI. A user can
record a sensor swap for a station and have all updates to gain and epochs
for all relevant channels immediately available in the database in one step.

What are the benefits? 

1. Programmers do not need to worry about the database structure, which is
beneficial because most programmers don't like working with databases that
are as well-designed as the SISDB.
2. Changes in the database structure/access paths do not influence
application logic.
3. Tuning of SQL is done independently of how many times (or places) this
access path is in use.
4. Stored procedures outperform any programmer's SQL - our DBA can write the
statements to utilize indexes and performance improving hints that
programmers typically aren't aware of.

SIS Graphical User Interface

The SIS interface is a Java program that directly accesses and updates the
SIS database. The users requested that the design provide drop-down lists,
radio buttons, and drill-down (nested tree) capabilities. For instance, a
user can type in a minimum set of parameters and then assemble a station
from drop-down lists of components in inventory (i.e., components that are
not installed at other stations). The use of drop-down lists, radio
buttons, and forms that are pre-populated with default configurations has
significantly reduced problems associated with typos and cut-and-paste
errors. When a technician enters a new instrument into inventory, s/he can
accept the nominal values for an instrument of that type, or can modify the
values if desired. Field staff make their changes directly in the SIS
interface and trigger a pre-compiled email to the operators' mailing list
when the changes are submitted. 

Process

During the development of this system, the Data Center met with many users
of metadata to determine what their needs were for a Station Information
System and we discovered that each user community had different priorities
and used station data in different ways. We investigated a number of
existing systems that deal with station metadata. We thieved the best
ideas from the areas that these systems handled particularly well and
developed strategies to avoid methods that did not work well in some
systems.

As our development progressed, we held regular bi-weekly meetings with the
SCSN field technicians, showed them our progress, and received feedback
that helped guide our work to meet their needs. The field staff provides two
different styles of information to the SIS, so they have two function-based
tabs: one for fieldwork (Field Maintenance) and another for lab-based
actions (Hardware Maintenance).

We went through a period of extensive testing, including user testing in
which all users of the SIS GUI were given a worksheet package of scenarios
to work through. These scenarios exercised the interface and helped us
determine whether any modifications were necessary to improve the system.


Future Work:

* The Data Center will populate the SIS with as much information as we
have available for historic stations. We have file cabinets of paper records
for historical stations that have never been looked at, because the systems
that we had in place were not flexible enough to allow entry of incomplete
information or of non-standard station configurations.
* As a direct result of our SIS efforts, the SCEDC is finally in a
position where we can effectively manage our metadata and easily exchange
our metadata with our CISN partners and other organizations. We are looking
forward to this next phase of SIS development.


Acknowledgements:

The Station Information System project was financed with joint special
funding to the Data Center from the USGS/ANSS and IRIS. This project has
been a tremendous success, which we could not have achieved without this
financial sponsorship. The SIS Gang sincerely thanks these organizations for
their support.


I. Email Virus Alert

Recently there has been a worm spreading through emails that claim to be
from the SCEDC. The message may appear to come from an address like
webmgr at quakedc.gps.caltech.edu or postmaster at quakedc.gps.caltech.edu (but is
actually forged) and have a subject line similar to "hi, ive a new email
address." The body of the email may look like:

    hey its me, my old address dont work at time. i dont know why?! in the last
    days ive got some mails. i' think thaz your mails but im not sure!

    plz read and check ...
    cyaaaaaaa

and will include an attachment with a name similar to mailtext.zip or
mail_body.zip.

If you receive such an email, simply delete it. Do NOT download any
attachments.



