The Request Class

The Request class handles all steps from download, through preprocessing, to deconvolution (i.e., the creation of a time-domain receiver function). Since PyGLImER v0.1.0, the database can be saved in one of two formats: either as a combination of MSEED (raw waveforms), SAC (receiver functions), and XML (station response and event database) files, or in hdf5 (hierarchical data format). Both options have advantages and downsides (see below).

MSEED/SAC Database

If the user chooses this option, the Request object will create a folder structure as defined by the user. These folders contain raw (i.e., unprocessed but downsampled) waveform files in miniseed format, preprocessed waveforms (three components in the RTZ coordinate system, filtered/discarded according to signal-to-noise ratio) in miniseed format together with info files (shelve format), and time-domain receiver functions in SAC format. Additionally, a directory with station response files will be created.
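
As a rough illustration of such a layout, the sketch below joins the project directory with the subdirectory names used in the example parameter file further down. That the subdirectories are nested directly below proj_dir is an assumption made here, and the helper `layout` is hypothetical and not part of PyGLImER.

```python
from pathlib import Path

# Subdirectory names copied from the example params.yaml. Joining them
# onto proj_dir is an assumption about how the folders are nested.
PARAMS = {
    'proj_dir': 'database',
    'raw_subdir': 'waveforms/raw',
    'prepro_subdir': 'waveforms/preprocessed',
    'rf_subdir': 'waveforms/RF',
    'statloc_subdir': 'stations',
    'evt_subdir': 'event_catalogs',
    'log_subdir': 'log',
}


def layout(params):
    """Map each *_subdir key to its full path (hypothetical helper)."""
    root = Path(params['proj_dir'])
    return {
        key: root / sub
        for key, sub in params.items() if key.endswith('_subdir')
    }


# Print one line per directory that would belong to the project
for key, path in layout(PARAMS).items():
    print(f'{key:15s} {path.as_posix()}')
```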

Note

Saving your database in this format might be the better option if you create a relatively small database, as computation times then tend to be shorter. However, for large databases (e.g., world-wide), this will create millions of files and potentially overload your file system.

HDF5 Database

Your second option is to save data in hdf5 format. In this case, only two directories will be created: one for the downsampled raw data and another one for the final time-domain receiver functions. The raw data and receiver functions are saved in an hdf5 variant specific to PyGLImER (in the following, we will learn how to use it).

Note

Use this format if you plan to create a large database as it will both save some disk space and, more importantly, create only two files per station.
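
To make the difference in file counts concrete, here is a back-of-the-envelope sketch. Only the "two files per station" figure comes from the note above. The number of files written per record in the MSEED/SAC layout, and the station and record counts, are purely illustrative assumptions and not measurements.

```python
def n_files_hdf5(n_stations):
    """Two files per station, as stated in the note above."""
    return 2 * n_stations


def n_files_mseed(n_stations, records_per_station, files_per_record):
    """One set of files per station-event record (assumed layout)."""
    return n_stations * records_per_station * files_per_record


# Illustrative numbers only: 1000 stations, 500 records each, and an
# assumed 3 files per record (e.g. raw, preprocessed, receiver function).
stations, records, per_record = 1000, 500, 3
print(n_files_mseed(stations, records, per_record))  # 1500000
print(n_files_hdf5(stations))                        # 2000
```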

Note

Once a database has been created, a new Request object will always update the existing raw data if the same rawdir is chosen (i.e., it downloads new data if any are available). This holds for both formats.

Methods of the Request class

A Request object has four public methods:

These methods are responsible for:

  • (1) Downloading the event catalogue, which defines the events for which waveforms should be downloaded

  • (2+3) Downloading station information, such as response data, and raw waveform data

  • (4) Downsampling and preprocessing the raw data, saving the filtered data in a different directory, and creating receiver functions

Note, however, that all parameters are already set when the Request object is initialised.

Note

As you have surely noticed, there are two functions to download data. As the name suggests, download_waveforms_small_db() is the tool of choice if you wish to download smaller databases, consisting of data from only a few stations or networks. This function also has the advantage that you can download data from defined lists of networks and stations instead of having to rely only on wildcards. For smaller databases, download_waveforms_small_db() can download data up to twice as fast as download_waveforms(). The latter will be the better choice if you create databases on continental or even global scales. It utilises the ObsPy Mass Downloader.
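
The rule of thumb from this note can be written down as a small decision helper. This is only a sketch: `choose_download_method` and its `large_scale` flag are hypothetical and not part of PyGLImER, and the function merely returns the name of the method the note would suggest.

```python
def choose_download_method(network, station, large_scale=False):
    """Return the name of the Request method suggested by the note above.

    Hypothetical helper: lists of networks or stations are only
    supported by download_waveforms_small_db(), while continental or
    global databases are better served by download_waveforms().
    """
    uses_lists = isinstance(network, (list, tuple)) \
        or isinstance(station, (list, tuple))
    if uses_lists:
        # Explicit lists cannot be expressed as a single wildcard string
        return 'download_waveforms_small_db'
    if large_scale:
        return 'download_waveforms'
    return 'download_waveforms_small_db'


print(choose_download_method(['YP', 'IU'], '*'))
print(choose_download_method('*', '*', large_scale=True))
```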

Setting the parameters for your request

The parameters for preprocessing and download are set when initialising the Request object. Probably the most convenient way to define them is to create a yaml file with the parameters. An example comes with this repository in params.yaml:

# This file is used to define the parameters used for PyGLImER
# ### Project wide parameters ###
# lowest level project directory
proj_dir : 'database'
# raw waveforms
raw_subdir: 'waveforms/raw'
# preprocessed subdir, only in use if fileformat = 'mseed'
prepro_subdir: 'waveforms/preprocessed'
# receiver function subdir
rf_subdir: 'waveforms/RF'
# statxml subdir
statloc_subdir: 'stations'
# subdir for event catalogues
evt_subdir: 'event_catalogs'
# directory for logging information
log_subdir : 'log'
# levels:
# 'DEBUG', 'INFO', 'WARNING', 'ERROR', or 'CRITICAL'
loglvl: 'WARNING'
# format, either mseed or hdf5
format: 'hdf5'

# The teleseismic phase to use (P or S or also more exotic ones like SKS, PKP, ScS)
phase: 'S'

### Request parameters
## First, everything concerning the download
# waveform client, list of strings
# use None if you want to download from all available FDSN servers
waveform_client: ['IRIS']
# Use an already downloaded event catalog
# If so insert path+filename here.
evtcat: None
# earliest event
starttime: '2009-06-1 00:00:00.0'
# latest event
endtime: '2011-12-31 00:00:00.0'
# Minimum Magnitude
minmag: 5.5
# Network and station to use, unix-style wildcards are allowed
# if you use the Request.download_waveforms_small_db method,
# you can also provide a list of networks and/or a list of stations
network: 'YP'
station: '*'

## concerning preprocessing
# Coordinate system to rotate the seismogram to before deconvolution
# RTZ, LQT, or PSS
rot: 'PSS'
# Polarisation, use v for v/q receiver functions
# and h for transverse (SH)
pol: 'v'
# Deconvolution method to use
# Iterative time domain: 'it'
# Waterlevel Spectral Division: 'waterlevel'
deconmeth: 'it'
# Remove the station response. Set to False if you don't have access to the response
remove_response: False

You can then read the yaml file using pyyaml like so:

import yaml

from pyglimer.waveform.request import Request

# Read the parameter file into a dictionary
with open('/path/to/my/params.yaml') as pfile:
    kwargs = yaml.load(pfile, Loader=yaml.FullLoader)

# Pass every entry of the file as a keyword argument
r = Request(**kwargs)
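
One detail worth keeping in mind: YAML spells its null value `null` or `~`, so PyYAML loads an entry such as `evtcat: None` as the string 'None' rather than as Python's None. Whether Request converts such strings itself is not stated here. If a clean-up step turns out to be needed, a minimal sketch could look as follows (the helper name `none_strings_to_none` is hypothetical):

```python
def none_strings_to_none(params):
    """Return a copy of params with the string 'None' replaced by None."""
    return {
        key: (None if value == 'None' else value)
        for key, value in params.items()
    }


# Stand-in for the dictionary returned by yaml.load
kwargs = {'evtcat': 'None', 'minmag': 5.5, 'waveform_client': ['IRIS']}
kwargs = none_strings_to_none(kwargs)
print(kwargs['evtcat'] is None)
```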

Alternatively, you could of course just set the parameters while initialising the object.
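
A minimal sketch of that alternative, using the same keys and values as the example params.yaml. It assumes that Request accepts these keys as keyword arguments, as the `Request(**kwargs)` call above implies. The helper `make_request` is hypothetical and only wraps the import so that the dictionary can be inspected without PyGLImER installed.

```python
# Same keys as the example params.yaml, written as a plain dictionary
params = {
    'proj_dir': 'database',
    'raw_subdir': 'waveforms/raw',
    'prepro_subdir': 'waveforms/preprocessed',
    'rf_subdir': 'waveforms/RF',
    'statloc_subdir': 'stations',
    'evt_subdir': 'event_catalogs',
    'log_subdir': 'log',
    'loglvl': 'WARNING',
    'format': 'hdf5',
    'phase': 'S',
    'waveform_client': ['IRIS'],
    'evtcat': None,  # no previously downloaded event catalogue
    'starttime': '2009-06-1 00:00:00.0',
    'endtime': '2011-12-31 00:00:00.0',
    'minmag': 5.5,
    'network': 'YP',
    'station': '*',
    'rot': 'PSS',
    'pol': 'v',
    'deconmeth': 'it',
    'remove_response': False,
}


def make_request(params):
    """Create a Request from a parameter dictionary (requires PyGLImER)."""
    # Imported inside the function so that this module loads without PyGLImER
    from pyglimer.waveform.request import Request
    return Request(**params)
```

Calling `r = make_request(params)` should then be equivalent to the yaml-based initialisation shown above.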