pyglimer.waveform#

Waveform download and preprocessing

pyglimer.waveform.download#

copyright:

The PyGLImER development team (makus@gfz-potsdam.de).

license:

GNU Lesser General Public License, Version 3 (https://www.gnu.org/copyleft/lesser.html)

author:

Peter Makus (makus@gfz-potsdam.de)

Created: Tue May 26 2019 13:31:30 Last Modified: Tuesday, 25th October 2022 12:29:11 pm

pyglimer.waveform.download.create_bulk_list(netsta_d: List[dict])[source]#

Takes a list of dictionaries and creates a bulk request list of dictionaries that contain station info and bulk request strings.

pyglimer.waveform.download.download_small_db(phase: str, min_epid: float, max_epid: float, model: TauPyModel, event_cat: Catalog, tz: float, ta: float, statloc: str, rawloc: str, clients: list, network: Union[str, List[str]], station: Union[str, List[str]], channel: str, saveh5: bool)[source]#

See the corresponding method Request.download_waveforms_small_db().

pyglimer.waveform.download.downloadwav(phase: str, min_epid: float, max_epid: float, model: TauPyModel, event_cat: Catalog, tz: float, ta: float, statloc: str, rawloc: str, clients: list, evtfile: str, network: Optional[str] = None, station: Optional[str] = None, saveasdf: bool = False, log_fh: Optional[FileHandler] = None, loglvl: int = 30, verbose: bool = False, fast_redownload: bool = False)[source]#
Downloads the waveforms for all events in the catalogue for a circular domain around the epicentre with defined epicentral distances, from the clients listed in clients. The StationXML files for the corresponding stations are downloaded as well.

Parameters:
  • phase (string) – Arrival phase to be used. P, S, SKS, or ScS.

  • min_epid (float) – Minimal epicentral distance to be downloaded.

  • max_epid (float) – Maximal epicentral distance to be downloaded.

  • model (obspy.taup.TauPyModel) – 1D velocity model to calculate arrival.

  • event_cat (Obspy event catalog) – Catalog containing all events, for which waveforms should be downloaded.

  • tz (float) – time window before first arrival to download (seconds)

  • ta (float) – time window after first arrival to download (seconds)

  • statloc (string) – Directory containing the station xmls.

  • rawloc (string) – Directory containing the raw seismograms.

  • clients (list) – List of FDSN servers. See obspy.Client documentation for acronyms.

  • network (string or list, optional) – Network restrictions. Only download from these networks, wildcards allowed. The default is None.

  • station (string or list, optional) – Only allowed if network != None. Station restrictions. Only download from these stations, wildcards are allowed. The default is None.

  • saveasdf (bool, optional) – Save the dataset in the Adaptable Seismic Data Format (asdf; recommended). Otherwise, the data are saved as .mseed files. The default is False.

  • log_fh (logging.FileHandler, optional) – file handler to be used for the massdownloader logger.

  • loglvl (int, optional) – Use this logging level.

  • verbose (Bool, optional) – Set to True when experiencing issues with the download. The output of the obspy MassDownloader will then be logged in download.log.

Return type:

None
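
The distance restriction described above (min_epid and max_epid) can be pictured with a short sketch. This is not the PyGLImER implementation, which relies on obspy; the helper names and the example distance limits are assumptions chosen for illustration, and the Earth is treated as a sphere.

```python
import math


def epicentral_distance_deg(lat1, lon1, lat2, lon2):
    """Great-circle distance in degrees between two points (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def in_distance_window(evt_lat, evt_lon, sta_lat, sta_lon, min_epid, max_epid):
    """True if the station lies between min_epid and max_epid degrees."""
    dist = epicentral_distance_deg(evt_lat, evt_lon, sta_lat, sta_lon)
    return min_epid <= dist <= max_epid


# Hypothetical limits in degrees, chosen only for this example.
MIN_EPID, MAX_EPID = 28.1, 95.8
quarter_circle = epicentral_distance_deg(0.0, 0.0, 0.0, 90.0)
far_station_ok = in_distance_window(0.0, 0.0, 0.0, 50.0, MIN_EPID, MAX_EPID)
near_station_ok = in_distance_window(0.0, 0.0, 0.0, 10.0, MIN_EPID, MAX_EPID)
```

A station 50 degrees from the event falls inside this example window, while one at 10 degrees does not.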

pyglimer.waveform.download.filter_overlapping_times(d)[source]#

Filters the found dictionary for overlapping windows. Dictionary looks as follows: d = {'event': [], 'startt': [], 'endt': [], 'net': [], 'stat': []}.
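
A minimal sketch of such a filter, using the dictionary layout given above. Which of two overlapping windows the real function keeps is not stated here, so keeping the earlier window per station is an assumption, as is the use of plain floats (seconds) in place of UTCDateTime objects.

```python
def drop_overlapping(d):
    """Keep, per station, only windows that start after the last kept one ends."""
    keys = ('event', 'startt', 'endt', 'net', 'stat')
    out = {k: [] for k in keys}
    last_end = {}  # (net, stat) -> end time of the last window that was kept
    order = sorted(
        range(len(d['startt'])),
        key=lambda i: (d['net'][i], d['stat'][i], d['startt'][i]))
    for i in order:
        sta = (d['net'][i], d['stat'][i])
        if sta in last_end and d['startt'][i] < last_end[sta]:
            continue  # overlaps the previously kept window of this station
        for k in keys:
            out[k].append(d[k][i])
        last_end[sta] = d['endt'][i]
    return out


d = {
    'event': ['e1', 'e2', 'e3', 'e2'],
    'startt': [0.0, 50.0, 200.0, 60.0],
    'endt': [100.0, 150.0, 300.0, 160.0],
    'net': ['IU', 'IU', 'IU', 'GE'],
    'stat': ['HRV', 'HRV', 'HRV', 'DAG'],
}
filtered = drop_overlapping(d)
```

In this example, event e2 is dropped for IU.HRV because its window starts before the e1 window ends, but it is kept for GE.DAG.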

pyglimer.waveform.download.get_mseed_storage(network: str, station: str, location: str, channel: str, starttime: UTCDateTime, endtime: UTCDateTime) str[source]#

Stores the files and checks if files are already downloaded

pyglimer.waveform.download.inv2uniqlists(inv: Inventory)[source]#

Creates a list of unique station and channel lists from an inventory.

pyglimer.waveform.download.wav_in_db(network: str, station: str, location: str, channel: str) bool[source]#

Checks if waveform is already downloaded.

pyglimer.waveform.download.wav_in_hdf5(rawloc: str, network: str, station: str, location: str, channel: str) bool[source]#

Is the waveform already in the Raw hdf5 database?

pyglimer.waveform.download.wav_in_hdf5_no_global(av_data: dict, network: str, station: str, channel: str, evt_id) bool[source]#

Is the waveform already in the Raw hdf5 database?

pyglimer.waveform.errorhandler#

Contains all error handlers for the Glimer to obspy project.

copyright:

The PyGLImER development team (makus@gfz-potsdam.de).

license:

GNU Lesser General Public License, Version 3 (https://www.gnu.org/copyleft/lesser.html)

author:

Peter Makus (makus@gfz-potsdam.de)

Created: Saturday, 21th March 2020 19:16:41 Last Modified: Thursday, 25th March 2021 04:02:16 pm

pyglimer.waveform.errorhandler.NoMatchingResponseHandler(st, network, station, statloc)[source]#

Error handler for when the "No matching response found" error occurs.

pyglimer.waveform.errorhandler.redownload(network, station, starttime, endtime, st)[source]#

Error handler that redownloads the stream for the given input. Used when the stream has fewer than three channels.

pyglimer.waveform.errorhandler.redownload_statxml(st, network, station, statfile)[source]#

Error handler that redownloads the station xml in case it is not found.

pyglimer.waveform.filesystem#

Module that handles loading local seismic data. To use this, feed in a function that yields obspy streams.

copyright:

The PyGLImER development team (makus@gfz-potsdam.de).

license:

GNU Lesser General Public License, Version 3 (https://www.gnu.org/copyleft/lesser.html)

author:

Peter Makus (makus@gfz-potsdam.de)

Created: Friday, 8th April 2022 02:27:30 pm Last Modified: Monday, 30th May 2022 01:30:48 pm

pyglimer.waveform.filesystem.import_database(phase: str, model: TauPyModel, event_cat: Catalog, tz: float, ta: float, statloc: str, rawloc: str, saveasdf: bool, yield_inv: Iterable[Inventory], yield_st: Iterable[Stream])[source]#

See the corresponding method Request.import_database().

pyglimer.waveform.preprocess#

copyright:

The PyGLImER development team (makus@gfz-potsdam.de).

license:

GNU Lesser General Public License, Version 3 (https://www.gnu.org/copyleft/lesser.html)

author:

Peter Makus (makus@gfz-potsdam.de)

Created: Tuesday, 19th May 2019 8:59:40 pm Last Modified: Friday, 21st October 2022 03:31:31 pm

exception pyglimer.waveform.preprocess.SNRError(value)[source]#

Bases: Exception

raised when the SNR is too low

exception pyglimer.waveform.preprocess.StreamLengthError(value)[source]#

Bases: Exception

raised when stream has fewer than 3 components

pyglimer.waveform.preprocess.preprocess(phase: str, rot: str, pol: str, taper_perc: float, event_cat: Catalog, model: TauPyModel, taper_type: str, tz: int, ta: int, statloc: str, rawloc: str, preproloc: str, rfloc: str, deconmeth: str, hc_filt: float, saveasdf: bool = False, netrestr=None, statrestr=None, client: str = 'joblib', remove_response: bool = True)[source]#

Preprocesses waveforms to create receiver functions

  1. Clips waveform to the right length (tz before and ta after theoretical arrival).

  2. Demean & detrend.

  3. Tapering.

  4. Remove instrument response, convert to velocity, and simulate Harvard station.

  5. Rotation to NEZ and, subsequently, to RTZ.

  6. Compute the SNR for highpass-filtered waveforms (highpass frequencies defined in qc.lowco). If the SNR is lower than in qc.SNR_criteria for all filters, the waveform is rejected.

  7. Write finished and filtered waveforms to the folder specified in qc.outputloc.

  8. Write info file with shelf containing station, event, and waveform information.

Only starts after all waveforms of the event have been downloaded by download.py. (checked over the dynamic variables prepro_folder and tmp.folder)

Saves preprocessed waveform files. Creates info file to save parameters.
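
Steps 1 to 3 can be pictured with plain Python lists. This is only a sketch of the signal-processing idea, not the PyGLImER code, which operates on obspy streams. The helper names are hypothetical, and taper_perc is assumed to be a decimal fraction between 0 and 0.5 that is applied at each end of the trace.

```python
import math


def clip(samples, dt, arrival, tz, ta):
    """Keep samples from tz seconds before to ta seconds after the arrival.

    arrival is given in seconds after the first sample.
    """
    i0 = max(0, round((arrival - tz) / dt))
    i1 = min(len(samples), round((arrival + ta) / dt) + 1)
    return samples[i0:i1]


def detrend_linear(samples):
    """Subtract the least-squares straight line (this also removes the mean)."""
    n = len(samples)
    t_mean = (n - 1) / 2
    y_mean = sum(samples) / n
    sxx = sum((i - t_mean) ** 2 for i in range(n))
    sxy = sum((i - t_mean) * (y - y_mean) for i, y in enumerate(samples))
    slope = sxy / sxx
    return [y - (y_mean + slope * (i - t_mean)) for i, y in enumerate(samples)]


def taper_cosine(samples, taper_perc):
    """Apply a cosine ramp over the fraction taper_perc at both ends."""
    n = len(samples)
    m = int(n * taper_perc)
    out = list(samples)
    for i in range(m):
        w = 0.5 * (1 - math.cos(math.pi * i / m))
        out[i] *= w
        out[n - 1 - i] *= w
    return out


# Synthetic trace: linear trend plus offset plus oscillation, 1 s sampling.
raw = [0.5 * i + 3.0 + math.sin(i) for i in range(201)]
clipped = clip(raw, dt=1.0, arrival=100.0, tz=30.0, ta=60.0)
detrended = detrend_linear(clipped)
tapered = taper_cosine(detrended, taper_perc=0.05)
```

The clipped trace spans 30 s before to 60 s after the assumed arrival, and after tapering the first and last samples are zero.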

Parameters:
  • phase (string) – “P” or “S”

  • rot (string) – Coordinate system to cast seismogram in before deconvolution. Options are “RTZ”, “LQT”, or “PSS”.

  • pol (string) – “h” for Sh or “v” for Sv, only for PRFs.

  • taper_perc (FLOAT) – Percentage to be tapered in beginning and at the end of waveforms.

  • taper_type (STRING) – Taper type (see obspy documentation stream.taper).

  • event_cat (event catalogue) – catalogue containing all events of waveforms.

  • model (obspy.taup.TauPyModel) – 1D velocity model to calculate arrival.

  • tz (int) – time window before first arrival in seconds

  • ta (int) – time window after first arrival in seconds

  • logdir (string, optional) – Set the directory to where the download log is saved

  • loglvl (int, optional) – Set logger to this level.

Return type:

None.

pyglimer.waveform.preprocess.write_info(network: str, station: str, dictionary: dict, preproloc: str)[source]#

Writes information dictionary in shelve format in each of the station folders.

Parameters:
  • network (str) – Network Code.

  • station (str) – Station code.

  • dictionary (dict) – Dictionary containing the information.

Return type:

None.
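
For readers unfamiliar with the shelve format, the following sketch writes a dictionary with Python's standard shelve module and reads it back. The folder layout (preproloc/network/station) and the file name info are assumptions made for this example, and write_info_sketch is a hypothetical stand-in rather than the function documented above.

```python
import os
import shelve
import tempfile


def write_info_sketch(network, station, dictionary, preproloc):
    """Store every key of dictionary in a shelve file in the station folder."""
    folder = os.path.join(preproloc, network, station)
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, 'info')  # assumed file name
    with shelve.open(path, flag='n') as info:
        for key, value in dictionary.items():
            info[key] = value
    return path


example = {'network': 'IU', 'station': 'HRV', 'num': 3}
with tempfile.TemporaryDirectory() as tmp:
    path = write_info_sketch('IU', 'HRV', example, tmp)
    # Reopen read-only to show that the content survives on disk.
    with shelve.open(path, flag='r') as info:
        restored = dict(info)
```

Shelve keys must be strings, while the values can be any picklable Python object.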

pyglimer.waveform.preprocessh5#

This is a newer version of preprocess.py meant to be used with pyasdf. Here, files are processed station-wise rather than event-wise, which requires a very different approach than for .mseed files.

copyright:

The PyGLImER development team (makus@gfz-potsdam.de).

license:

GNU Lesser General Public License, Version 3 (https://www.gnu.org/copyleft/lesser.html)

author:

Peter Makus (makus@gfz-potsdam.de)

Created: Thursday, 18th February 2021 02:26:03 pm Last Modified: Tuesday, 25th October 2022 12:26:09 pm

exception pyglimer.waveform.preprocessh5.SNRError(value)[source]#

Bases: Exception

raised when the SNR is too low

exception pyglimer.waveform.preprocessh5.StreamLengthError(value)[source]#

Bases: Exception

raised when stream has fewer than 3 components

pyglimer.waveform.preprocessh5.compute_toa(evt: Event, slat: float, slon: float, phase: str, model: TauPyModel) Tuple[UTCDateTime, float, float, float][source]#

Compute time of theoretical arrival for teleseismic events and a given teleseismic phase at the provided station.

Parameters:
  • evt (obspy.core.event.Event) – Event to compute the arrival for.

  • slat (float) – station latitude

  • slon (float) – station longitude

  • phase (str) – The teleseismic phase to consider.

  • model (obspy.taup.TauPyModel) – Taupymodel to use

Returns:

A Tuple holding: [the time of theoretical arrival (UTC), the apparent slowness in s/km, the ray parameter in s/deg, the back azimuth, the distance between station and event in deg]

Return type:

Tuple[UTCDateTime, float, float, float]
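
The ray parameter in s/deg and the apparent slowness in s/km differ only by the length of one degree of arc. A small worked conversion is shown below, assuming a spherical Earth with a radius of 6371 km and a slowness measured at the surface; the constant and function names are chosen for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius
KM_PER_DEG = 2 * math.pi * EARTH_RADIUS_KM / 360  # length of one degree of arc


def s_per_deg_to_s_per_km(rayp_s_deg):
    """Convert a ray parameter in s/deg to a horizontal slowness in s/km."""
    return rayp_s_deg / KM_PER_DEG


# Example value only: a ray parameter of 6 s/deg.
slowness = s_per_deg_to_s_per_km(6.0)
```

With these assumptions one degree corresponds to roughly 111.19 km, so 6 s/deg is about 0.054 s/km.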

pyglimer.waveform.preprocessh5.preprocessh5(phase: str, rot: str, pol: str, taper_perc: float, model: TauPyModel, taper_type: str, tz: int, ta: int, rawloc: str, rfloc: str, deconmeth: str, hc_filt: float, netrestr: str, statrestr: str, logger: Logger, rflogger: Logger, client: str, evtcat: Catalog, remove_response: bool)[source]#

Preprocess files saved in hdf5 (pyasdf) format. Will save the computed receiver functions in hdf5 format as well.

Processing is done via a multiprocessing backend (either joblib or mpi).

Parameters:
  • phase (str) – The Teleseismic phase to consider

  • rot (str) – The Coordinate system that the seismogram should be rotated to.

  • pol (str) – Polarisation for PRFs. Can be either ‘v’ or ‘h’ (vertical or horizontal).

  • taper_perc (float) – Percentage for the pre deconvolution taper.

  • model (obspy.taup.TauPyModel) – TauPyModel to be used for travel time computations

  • taper_type (str) – type of taper (see obspy)

  • tz (int) – Length of time window before theoretical arrival (seconds)

  • ta (int) – Length of time window after theoretical arrival (seconds)

  • rawloc (str) – Directory, in which the raw data is saved.

  • rfloc (str) – Directory to save the receiver functions in.

  • deconmeth (str) – Deconvolution method to use.

  • hc_filt (float) – Second High-Cut filter (optional, can be None or False)

  • netrestr (str) – Network restrictions

  • statrestr (str) – Station restrictions

  • logger (logging.Logger) – Logger to use

  • rflogger (logging.Logger) – [description]

  • client (str) – Multiprocessing Backend to use

  • evtcat (obspy.catalog) – event Catalogue

Raises:

NotImplementedError – For unknown multiprocessing backends.

pyglimer.waveform.qc#

Contains quality control for waveforms used for receiver function creation.

copyright:

The PyGLImER development team (makus@gfz-potsdam.de).

license:

GNU Lesser General Public License, Version 3 (https://www.gnu.org/copyleft/lesser.html)

author:

Peter Makus (makus@gfz-potsdam.de)

Created: Friday, 10th April 2020 11:38:40 am Last Modified: Wednesday, 7th September 2022 06:08:38 pm

pyglimer.waveform.qc.qcp(st: Stream, dt: float, sampling_f: float, onset: float) tuple[source]#

Quality control for the downloaded waveforms that are used to create PRFs. Works with various filters and SNR criteria.

Parameters:
  • st (obspy.Stream) – Input stream.

  • dt (FLOAT) – Sampling interval [s].

  • sampling_f (FLOAT) – Sampling frequency (Hz).

  • onset (float) – Onset in seconds after trace’s start.

Returns:

  • st (obspy.Stream) – Output stream. If stream was accepted, then this will contain the filtered stream, filtered with the broadest accepted filter.

  • crit (BOOL) – True if stream was accepted, false if it wasn’t.

  • f (FLOAT) – Last used low-cut frequency. If crit=True, this will be the frequency for which the stream was accepted.

  • noisemat (np.array) – SNR values in form of a matrix. Rows represent the different filters and columns the different criteria.
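
The accept/reject logic can be illustrated with a simplified sketch. It takes already filtered traces as plain lists and tests a single RMS-based SNR criterion, whereas the function documented above applies several criteria and does the filtering itself. The window lengths, the threshold, and all names are assumptions for illustration.

```python
import math


def rms(x):
    """Root-mean-square amplitude of a sequence."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def snr(trace, dt, onset, noise_len, signal_len):
    """RMS ratio of the window after the onset to the window before it."""
    i_on = round(onset / dt)
    n_noise = round(noise_len / dt)
    n_sig = round(signal_len / dt)
    noise = trace[max(0, i_on - n_noise):i_on]
    signal = trace[i_on:i_on + n_sig]
    return rms(signal) / rms(noise)


def quality_control(filtered, dt, onset, min_snr=2.0):
    """filtered: list of (lowcut_hz, trace) pairs, broadest filter first.

    Returns (trace, accepted, last used low-cut, list of SNR values).
    """
    snr_values = []
    for lowcut, trace in filtered:
        value = snr(trace, dt, onset, noise_len=10.0, signal_len=10.0)
        snr_values.append(value)
        if value >= min_snr:
            return trace, True, lowcut, snr_values
    return None, False, filtered[-1][0], snr_values


# Two synthetic "filtered" versions of one trace, onset at 10 s, dt = 0.1 s.
noisy = [1.0, -1.0] * 50 + [1.5, -1.5] * 50
cleaner = [0.1, -0.1] * 50 + [1.5, -1.5] * 50
trace, accepted, lowcut, snr_values = quality_control(
    [(0.03, noisy), (0.1, cleaner)], dt=0.1, onset=10.0)
```

Here the first (broadest) filter fails the threshold, so the second one is tried and accepted, and its low-cut frequency is returned.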

pyglimer.waveform.qc.qcs(st: Stream, dt: float, sampling_f: float, onset: float) tuple[source]#

Quality control for waveforms that are used to produce SRFs. In contrast to the criteria used for PRFs, this criterion is very strict and will reject >95% of the waveforms.

Parameters:
  • st (obspy.Stream) – Input stream.

  • dt (FLOAT) – Sampling interval [s].

  • sampling_f (FLOAT) – Sampling frequency (Hz).

  • onset (float) – Onset in seconds after trace’s start.

Returns:

  • st (obspy.Stream) – Output stream. If stream was accepted, then this will contain the filtered stream, filtered with the broadest accepted filter.

  • crit (BOOL) – True if stream was accepted, false if it wasn’t.

  • f (FLOAT) – Last used high-cut frequency. If crit=True, this will be the frequency for which the stream was accepted.

  • noisemat (np.array) – SNR values in form of a matrix. Rows represent the different filters and columns the different criteria.

pyglimer.waveform.request#

Contains the request class that is used to initialise the FDSN request for the waveforms, the preprocessing of the waveforms, and the creation of time domain receiver functions.

copyright:

The PyGLImER development team (makus@gfz-potsdam.de).

license:

GNU Lesser General Public License, Version 3 (https://www.gnu.org/copyleft/lesser.html)

author:

Peter Makus (makus@gfz-potsdam.de)

Created: Monday, 27th April 2020 10:55:03 pm Last Modified: Tuesday, 25th October 2022 12:34:06 pm

class pyglimer.waveform.request.Request(proj_dir: str, raw_subdir: str, rf_subdir: str, statloc_subdir: str, evt_subdir: str, log_subdir: str, phase: str, rot: str, deconmeth: str, starttime: UTCDateTime, endtime: UTCDateTime, prepro_subdir: Optional[str] = None, pol: str = 'v', minmag: float = 5.5, event_coords: Optional[Tuple[float, float, float, float]] = None, network: Optional[Union[str, List[str]]] = None, station: Optional[Union[str, List[str]]] = None, waveform_client: Optional[list] = None, evtcat: Optional[str] = None, continue_download: bool = False, loglvl: int = 30, format: str = 'hdf5', remove_response: bool = True, **kwargs)[source]#

Bases: object

Initialises the FDSN request for the waveforms, the preprocessing of the waveforms, and the creation of time domain receiver functions.

Create object that is used to start the receiver function workflow.

Parameters:
  • proj_dir (str) – parent directory that all project files will be saved in (as subfolders).

  • raw_subdir (str) – Directory, in which to store the raw waveform data.

  • prepro_subdir (str or None) – Directory, in which to store the preprocessed waveform data (mseed). Irrelevant if format is hdf5.

  • rf_subdir (str) – Directory, in which to store the receiver functions in time domain (sac).

  • statloc_subdir – Directory, in which to store the station inventories (xml).

  • evt_subdir (str) – Directory, in which to store the event catalogue.

  • log_subdir – Directory that logs are stored in.

  • phase (str) – Arrival phase that is to be used as source phase. “S” to create S-Sp receiver functions and “P” for P-Ps receiver functions, “SKS” or “ScS” are allowed as well.

  • rot (str) – The coordinate system into which the seismogram should be rotated prior to deconvolution. Options are “RTZ” for radial, transverse, vertical; “LQT” for an orthogonal coordinate system computed by minimising primary energy on the converted component; or “PSS” for a rotation along the polarisation directions using the Litho1.0 surface wave tomography model.

  • deconmeth (str) – The deconvolution method to use for the RF creation. Possible options are: ‘it’: iterative time domain deconvolution (Ligorria & Ammon, 1999); ‘dampedf’: damped frequency deconvolution; ‘fqd’: frequency dependent damping (not a good choice for SRF); ‘waterlevel’: Langston (1977); ‘multit’: multitaper (Helffrich, 2006); False/None: don’t create RFs.

  • starttime (obspy.UTCDateTime or str) – Earliest event date to be considered.

  • endtime (obspy.UTCDateTime or str) – Latest event date to be considered.

  • pol (str, optional) – Polarisation to use as source wavelet. Either “v” for vertically polarised or ‘h’ for horizontally polarised S-waves. Will be ignored if phase=’S’, by default ‘v’.

  • minmag (float, optional) – Minimum magnitude, by default 5.5

  • event_coords (Tuple, optional) – In case you wish to constrain events to certain origins. Given in the form (minlat, maxlat, minlon, maxlon), by default None.

  • network (str, optional) – Limit the download and preprocessing to a certain network. Wildcards are allowed, by default None.

  • station (str, optional) – Limit the download and preprocessing to a certain station or several stations. Use only if network!=None. Wildcards are allowed, by default None.

  • waveform_client (list, optional) – List of FDSN compatible servers to download waveforms from. See obspy documentation for obspy.Client for allowed acronyms. A list of servers by region can be found at https://www.fdsn.org/webservices/datacenters/. None means that all known servers are requested, defaults to None.

  • evtcat (str, optional) – In case you want to use an already existing event catalogue in evtloc. If None, a new catalogue will be downloaded (with the parameters defined before), by default None.

  • continue_download (bool, optional) – Will delete already used events from the event catalogue, so that the download will continue at the same place after being interrupted. This makes the continuation faster, but the old database will not be updated. Only makes sense if you define an old catalogue. Defaults to False.

  • loglvl (str, optional) – Level for the loggers. One of the following: CRITICAL, ERROR, WARNING, INFO, or DEBUG. If the level is DEBUG, joblib will fall back to using only few cores and downloads will be retried. By default WARNING.

  • remove_response (bool, optional) – Correct the traces for station response. You might want to set this to False if you imported data and don’t have access to response information. Defaults to True.

Raises:

NameError – For invalid phases.
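
As an orientation for the many parameters above, the sketch below collects one possible set of arguments in a dictionary. All values (directory names, dates, network, station, client) are illustrative assumptions. The actual call is left commented out so that the snippet runs without PyGLImER installed; creating the object may already trigger the event catalogue download.

```python
# Keyword arguments following the parameter list above; values are examples.
request_kwargs = {
    'proj_dir': 'my_project',
    'raw_subdir': 'waveforms/raw',
    'rf_subdir': 'waveforms/RF',
    'statloc_subdir': 'stations',
    'evt_subdir': 'event_catalogs',
    'log_subdir': 'log',
    'phase': 'P',
    'rot': 'RTZ',
    'deconmeth': 'it',
    'starttime': '2020-01-01T00:00:00',
    'endtime': '2020-12-31T23:59:59',
    'minmag': 5.5,
    'network': 'IU',
    'station': 'HRV',
    'waveform_client': ['IRIS'],
    'format': 'hdf5',
    'remove_response': True,
}

# Arguments without a default value in the signature shown above.
REQUIRED = (
    'proj_dir', 'raw_subdir', 'rf_subdir', 'statloc_subdir', 'evt_subdir',
    'log_subdir', 'phase', 'rot', 'deconmeth', 'starttime', 'endtime')
missing = [name for name in REQUIRED if name not in request_kwargs]

# With PyGLImER installed, the workflow could then be started roughly so:
# from pyglimer.waveform.request import Request
# request = Request(**request_kwargs)
# request.download_waveforms()
# request.preprocess(hc_filt=1.5)
```

The hc_filt value in the commented call follows the recommendation for PRFs given in the description of preprocess() (lower than 2 Hz).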

download_eventcat()[source]#

Download the event catalogue from IRIS DMC.

download_waveforms(verbose: bool = False)[source]#

Start the download of waveforms and response files.

Parameters:
  • verbose (Bool, optional) – Set True if you wish to log the output of the obspy MassDownloader.

See also

You should check whether the method download_waveforms_small_db() might be better suited for your needs. Both methods offer unique advantages.

download_waveforms_small_db(channel: str)[source]#

A different method to download raw waveform data. This method will be faster than download_waveforms() for small databases (e.g., single networks or stations). Another advantage of this method is that the attributes network and station in Request can be lists.

Parameters:

channel (str) – The channel you will want to download, accepts unix style wildcards.

See also

You should check whether the method download_waveforms() might be better suited for your needs. Both methods offer unique advantages.

import_database(yield_st: Iterable[Stream], yield_inv: Iterable[Inventory])[source]#

Import a local database to PyGLImER. This can, for example, be continuous raw waveform data that you collected in your own experiment. The data is fed in via generator functions that yield obspy Inventory and obspy Stream objects.

Note

If you don’t have response information for your stations, make sure to set Request.remove_response to False.

Note

Follow the obspy documentation to create StationXMLs. The StationXML needs to contain the following information:

  1. Network and Station Code

  2. Latitude, Longitude, and Elevation

  3. Azimuth of Channel/Location to do the Rotation

  4. (Optional) Station response information

Note

Make sure that the Traces in the Stream contain the following information in their header:

  1. sampling_rate

  2. start_time, end_time

  3. Network, Station, and Channel Code (Location code arbitrary)

Parameters:
  • yield_st (Generator[Stream]) – Generator that yields the obspy Streams to import.

  • yield_inv (Generator[Inventory]) – Generator that yields the obspy Inventories of the corresponding stations.
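
The two generators can be as simple as a loop over file paths. The sketch below shows the pattern with a stand-in reader so that it runs without obspy; yield_from_files and fake_read are hypothetical names. In practice the reader would be obspy's read for waveforms and read_inventory for StationXML files.

```python
def yield_from_files(paths, read_fn):
    """Lazily yield one loaded object per file path."""
    for path in paths:
        yield read_fn(path)


def fake_read(path):
    """Stand-in reader so that the sketch runs without obspy installed."""
    return f'loaded:{path}'


streams = yield_from_files(['a.mseed', 'b.mseed'], fake_read)
inventories = yield_from_files(['a.xml', 'b.xml'], fake_read)
loaded = list(streams)
first_inventory = next(inventories)

# With obspy and PyGLImER available, this could look roughly like:
# from obspy import read, read_inventory
# request.import_database(
#     yield_st=yield_from_files(mseed_paths, read),
#     yield_inv=yield_from_files(xml_paths, read_inventory))
```

Because generators are lazy, each file is only read when the import reaches it, which keeps memory use low for large local databases.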

preprocess(client: str = 'joblib', hc_filt: Optional[float] = None)[source]#

Preprocess an existing database with the parameters defined in self.

Parameters:

hc_filt (float or int or None, optional) – Highcut frequency to filter with right before deconvolution. Recommended if time domain deconvolution is used. For spectral division, filtering can still be done after deconvolution (i.e. set in compute_stack()). Value for PRFs should usually be lower than 2 Hz and for SRFs lower than 0.4 Hz, by default None.

pyglimer.waveform.rotate#

Contains functions to rotate a stream into different domains

copyright:

The PyGLImER development team (makus@gfz-potsdam.de).

license:

GNU Lesser General Public License, Version 3 (https://www.gnu.org/copyleft/lesser.html)

author:

Peter Makus (makus@gfz-potsdam.de)

Created: Saturday, 21st March 2020 07:26:03 pm Last Modified: Thursday, 10th February 2022 03:53:30 pm

pyglimer.waveform.rotate.rotate_LQT_min(st: Stream, phase: str) tuple[source]#

Rotates stream to LQT by minimising the energy of the S-wave primary arrival on the L component (SRF) or maximising the primary arrival energy on L (PRF).

Parameters:
  • st (obspy.Stream) – Input stream in RTZ.

  • phase (STRING, optional) – “P” for Ps or “S” for Sp.

Returns:

  • LQT (obspy.Stream) – Output stream in LQT.

  • ia (float) – Computed incidence angle in degree. Can serve as QC criterion.
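
The energy criterion described above can be sketched as a grid search over the incidence angle. This is an illustration on plain lists rather than the PyGLImER implementation. The sign convention of the rotation, the 0 to 90 degree search range, the step size, and all function names are assumptions.

```python
import math


def rotate_zr_to_lq(z, r, ia_deg):
    """Rotate vertical and radial components by the incidence angle.

    Assumed convention: L = cos(i) Z + sin(i) R, Q = -sin(i) Z + cos(i) R.
    """
    ia = math.radians(ia_deg)
    l_comp = [math.cos(ia) * zi + math.sin(ia) * ri for zi, ri in zip(z, r)]
    q_comp = [-math.sin(ia) * zi + math.cos(ia) * ri for zi, ri in zip(z, r)]
    return l_comp, q_comp


def energy(x):
    """Sum of squared amplitudes."""
    return sum(v * v for v in x)


def find_incidence_angle(z, r, phase='P', step=0.5):
    """Grid search: maximise energy on L for P, minimise it for S."""
    best_angle, best_score = 0.0, None
    for k in range(int(90 / step) + 1):
        angle = k * step
        l_comp, _ = rotate_zr_to_lq(z, r, angle)
        score = energy(l_comp)
        if phase == 'S':
            score = -score
        if best_score is None or score > best_score:
            best_angle, best_score = angle, score
    return best_angle


# Synthetic primary arrivals with a true incidence angle of 25 degrees.
wavelet = [math.sin(2 * math.pi * i / 20) for i in range(40)]
i0 = math.radians(25.0)
z_p = [math.cos(i0) * s for s in wavelet]   # P motion along the ray
r_p = [math.sin(i0) * s for s in wavelet]
z_s = [-math.sin(i0) * s for s in wavelet]  # SV motion perpendicular to it
r_s = [math.cos(i0) * s for s in wavelet]
ia_p = find_incidence_angle(z_p, r_p, phase='P')
ia_s = find_incidence_angle(z_s, r_s, phase='S')
```

Both searches recover the 25 degrees used to build the synthetic data. On real data only a short window around the primary arrival would be passed in.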

pyglimer.waveform.rotate.rotate_PSV(statlat: float, statlon: float, rayp: float, st: Stream, phase: str) tuple[source]#

Finds the incidence angle of an incoming ray using the weighted average of the lithosphere’s P velocity, taken from a velocity model compiled from Litho1.0.

Parameters:
  • statlat (FLOAT) – station latitude.

  • statlon (FLOAT) – station longitude.

  • rayp (FLOAT) – ray parameter / slowness in s/m.

  • st (obspy.Stream) – Input stream given in RTZ.

  • phase (str) – Primary phase, either P or S.

Returns:

  • avp (FLOAT) – Average P-wave velocity.

  • avs (FLOAT) – Average S-wave velocity.

  • PSvSh (obspy.Stream) – Stream in P-Sv-Sh.
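
The link between ray parameter, velocity, and incidence angle follows from Snell's law, sin(i) = p · v. The sketch below evaluates it for example values. The velocity of 6000 m/s is an assumption and is not taken from Litho1.0, and the function name is hypothetical.

```python
import math


def incidence_angle_deg(rayp_s_per_m, velocity_m_per_s):
    """Incidence angle from Snell's law, sin(i) = p * v."""
    x = rayp_s_per_m * velocity_m_per_s
    if not 0.0 <= x <= 1.0:
        raise ValueError('p * v must lie between 0 and 1')
    return math.degrees(math.asin(x))


# Example: 6 s/deg converted to s/m (one degree is about 111195 m),
# combined with an assumed average P velocity of 6000 m/s.
rayp = 6.0 / 111195.0
angle = incidence_angle_deg(rayp, 6000.0)
```

For these values the incidence angle is about 18.9 degrees. A lower average velocity gives a steeper (smaller) angle for the same ray parameter.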