Application Programming Interface
The ground motion database is equipped with a web interface, which is what you are currently viewing,
and an application programming interface (API) that exposes the data to HTTP requests. The web
interface is useful if you quickly want to look at data in a tabular form, but it is cumbersome if
you want to develop an end-to-end workflow. Copying HTML tables is kind of a pain, and it takes time
to render those tables. The API exposes the underlying data through HTTP requests, providing
the data in JSON format so you can work with it more directly and incoporate it into a workflow.
Users can access the API by authenticating, retrieving an authorization token, and submitting a request
that includes that token in the request header. The authorization token is required because we want to make sure
that we return the information only to people who have an active account, thereby avoiding attacks by
unauthorized users, and that users have access to data consistent with their role (user, modeler, admin).
Authentication / Authorization
The first step in using the API is to authenticate and retrieve an authorization token. To authenticate,
submit an API request to the following uniform resource locator (URL)
https://www.gmdatabase.org/users/login
You must use the Basic Auth method to include your username and password with the HTTP request.
Programs specifically designed for making HTTP requests include curl,
postman, and others. In this documentation, we provide
examples using the Python requests package.
The example below authenticates the user using HTTP Basic Auth and retrieves an authorization token.
We suggest never hard-coding your username and password into a script. You might accidentally share
the script, and divulge your authentication credentials. We have used the getpass
package here facilitate password entry. The token is valid for 120 minutes, after which you will need to
re-authenticate and retrieve a new token. It is important not to share your token with others.
import requests
from requests.auth import HTTPBasicAuth
import getpass
import json
username = input('username: ')
password = getpass.getpass('password: ')
url = 'http://www.gmdatabase.org/users/login'
headers = {"User-Agent":"XY", "Accept":"application/json"}
r = requests.get(url, headers=headers, auth=HTTPBasicAuth(username,password))
token = json.loads(r.text)['token']
Submitting a Request
After retrieving an authorization token, you can now submit a GET request to the desired
URL, and the requested data will be returned in JSON format. Two headers must be
included in your request. The "Accept":"application/json" header is used to identify that
JSON data is being requested. The "Authorization":"Bearer {}".format(token) header contains
the authorization token that was provided after successful authentication.
An example request is provided in the Python script below. In this case, the API requests
50 entries from the events table. The URL structure and available query string parameters
for customizing the request are discussed in the next section.
import pandas as pd
import requests
import json
headers = {"Accept":"application/json","Authorization": "Bearer {}".format(token)}
url = 'http://www.gmdatabase.org/events?limit=50'
r= requests.get(url, headers=headers)
df = pd.DataFrame.from_dict(json.loads(r.text))
URL structure
Requests are submitted to a uniform resource locator (URL) that contains the following components:
url = base_url/endpoint?query_string
base_url
The base_url is the webpage containing the API. In this case, it ishttps://www.gmdatabase.orgSubmitting a request without also specifying an endpoint will produce an error. The endpoint is necessary so we know which query to run to retrieve the data you have requested. The query string is optional, and can be used to customize your queries.
endpoint
An endpoint is a location where an API receives requests about a resource. The endpoints correspond to table names converted to pluralized lower camel case with underscores removed. All of the tables except for the user table are accessible. The user table is not accessible for privacy purposes. Furthermore, a flatfile endpoint is also available. The flatfile endpoint combines tables together in a manner similar to the NGA flatfiles. Tables included in the flatfile endpoint are indicated with a checkmark in the table below. The flatfile can be generated either for the response_spectra table or the fourier_spectra table, but not both at the same time because the resulting table would be large and unwieldy. The response_spectra table is returned by default. If you wish to retrieve data from the fourier_spectra table, you should enter a query string parameter associated with the fourier_spectra table, as described in more detail in the flatfile endpoint section.table name | endpoint | table included in flatfile |
---|---|---|
aftershock_mainshock | aftershockMainshocks | |
basin_model | basinsModels | |
basin_site | basinsSites | |
citation | citations | |
collections | collections | |
collection_motion | collectionsMotions | |
event | events | ✔ |
event_eqid | eventEqids | ✔ |
event_geometry | eventsGeometries | |
event_type | eventTypes | ✔ |
finite_fault | finiteFaults | ✔ |
finite_fault_kinematic_parameter | finiteFaultKinematicParameters | ✔ |
finite_fault_segment | finiteFaultSegments | |
fourier_spectra | fourierSpectra | (✔) |
intensity_measure | intensityMeasures | ✔ |
motion | motions | ✔ |
network | networks | ✔ |
path | paths | ✔ |
response_spectra | responseSpectra | (✔) |
site | sites | ✔ |
site_geometry | geometriesSites | |
station | stations | ✔ |
station_ssn | stationSsns | ✔ |
time_series_data | timeSeriesData | |
time_series_metadata | timeSeriesMetadata | ✔ |
version | versions | |
version_time_series_metadata | timeSeriesMetadataVersions | ✔ |
vs30_citation | vs30Citations | |
vs30_code | vs30Codes | |
z_code | zCodes |
query_string
A query string is a list of parameters used to customize the request. If a query string is not specified, default values will be returned. A list of valid query string parameters and their default values is below. The flatfile, responseSpectra, and fourierSpectra endpoints have additional query string options, as discussed in their respective sections below.query string options for table endpoints
query string parameter | default | description |
---|---|---|
limit | 20 | number of records per page |
sort | primary_key | field to sort by |
direction | asc | sort direction (asc or desc) |
page | 1 | page number |
role | user | Role for users with elevated permissions (i.e., modeler or admin). |
where | - | Used to filter selected data (e.g., ?where=pga_rotd50>0.1+AND+pga_rotd50<0.2 |
contain | - | comma-spearated list of related tables to include in query (e.g., events?contain=Motions.Stations |
matching | - | table to include in query (e.g., ?matching=Motions |
First query string parameter
The first query string parameter comes after a question mark (?) at the end of the url. For example: https://www.gmdatabase.org/intensityMeasures?page=1.Second and subsequent query string parameters
The 2nd and subsequent query string parameters are separated by an ampersand (&). For example: https://www.gmdatabase.org/?page=1&limit=50.Filtering data using "where" statements
Data can be filtered using the "where" query string parameter. The where statement is used to create a SQL statement excerpt "WHERE [field] [operator] [value]". For example, to retrieve events with hypocenter latitude larger than 40.0°, the following should query string parameter would be ?where=hypocenter_latitude>40.0 This query string parameter will be translated to the SQL excerpt WHERE hypocenter_latitude > 40.0 when the query is executed. Available operators depend on data type as described in the table below.type | operators |
---|---|
numeric (e.g., float, int) | >, >=, =, <=, <, BETWEEN, IN, NOT IN |
string | >, >=, =, <=, <, BETWEEN, LIKE, NOT LIKE, IN, NOT IN |
boolean | true, false |
Applying Multiple Conditions
Multiple operators can be applied using an "AND" or "OR" separator. When constructing the query string, spaces should be replaced by the plus sign "+", as shown below. For example,"?where=hypocenter_latitude<40+OR+hypocenter_longitude>-118.0"Furthermore, conditions can be grouped using parenthesis like this:
"?where=(hypocenter_latitude>40.0+AND+hypocenter_latitude<42.0)+OR+(hypocenter_longitude>-118.0+AND+hypocenter_longitude<-117.0)"
Equal sign (=) for float fields
You should avoid using the equal sign (=) operator for float-valued fields because floats are stored in an approximate manner. For example, if a strike appears as 196.2 in the database, you should recognize that its value may not be precisely 196.2 because we format outputs for presentation purposes. Therefore, performing a query "WHERE strike=196.2" is unlikely to return the entry you're looking for. You could use a combination of greater than (>) and less than (<) operators to search for a narrow range instead. It is fine to use the equal sign (=) operator for integer, string, and decimal fields. Decimal fields are used, for example, for latitude and longitude data such that all of the fields contain 5 decimals in the database, and can be compared using the equal operator. The schema page lists the type for each field. Strings are "varchar", integers are "int(N)" where N is the bit depth, floats are "float(N)", and decimal format is denoted "float(M,L)", where M is the total number of characters, and L is the number after the decimal point.BETWEEN
The BETWEEN operator is a shorthand replacement for a greater than and less than operator. For example,"where=hypocenter_latitude>40.0+AND+hypocenter_latitude<42.0"can be replaced by
"where=latitude+BETWEEN+40.0+AND+42.0"
LIKE and NOT LIKE
The LIKE operator is used to search for a specified pattern in a column of strings.The NOT LIKE operator returns the complementary set of entries excluded by the LIKE operator.
Two wildcards are often used with LIKE and NOT LIKE statements:
- The percent sign (%) is a placeholder for an arbitrary number of characters
- The underscore (_) is a placeholder for a single character
Query String | SQL Statement Excerpt | Description |
---|---|---|
where=event_name+LIKE+"v%" | WHERE event_name LIKE 'v%' | Finds any value that starts with "v" |
where=event_name+NOT+LIKE+"v%" | WHERE event_name NOT LIKE 'v%' | Finds any value that does not start with "v" |
where=event_name+LIKE+"%v" | WHERE event_name LIKE '%v' | Finds any value that ends with "v" |
where=event_name+LIKE+"%v%" | WHERE event_name LIKE '%v%' | Finds any value that contains "v" in any position |
where=event_name+LIKE+"_v" | WHERE event_name LIKE '_v' | Finds any value that contains "v" in the 2nd position |
where=event_name+LIKE+"v_%" | WHERE event_name LIKE 'v_%' | Finds any value that starts with "v" and is at least 2 characters in length |
where=event_name+LIKE+"v__%" | WHERE event_name LIKE 'v__%' | Finds any value that starts with "v" and is at least 3 characters in length |
where=event_name+LIKE+"v%n" | WHERE event_name LIKE 'v%n' | Finds any value that starts with "v" and ends with "n" |
IN and NOT IN Operators
The IN operator is used to select values from an array of options, and is shorthand for multiple OR conditions.The NOT IN operator returns the complementary set of entries excluded by the IN operator.
Since the IN and NOT IN operators imply an equal sign (=), you should not use them for float fields for the reasons explained above.
Consider the examples below in which we search on an integer field called "event_id".
Note that the value array is a comma-separated list contained within parenthesis.
Query String | SQL Statement Excerpt | Description |
---|---|---|
where=event_id+IN+(1,2,3) | WHERE event_id IN (1, 2, 3) | Finds entries with id = 1, 2, or 3 |
where=event_id+NOT+IN+(1,2,3) | WHERE event_id NOT IN (1, 2, 3) | Finds entries with id ≠ 1, 2, or 3 |
responseSpectra and fourierSpectra endpoints
The response_spectra and fourier_spectra tables are handled differently from the others. Each response_spectra and fourier_spectra entry has its own primary key. For example, an entry from the response_spectra table has the following fields: response_spectra_id, period, psa_rotd0, psa_rotd50, psa_rotd100, psa_h1, psa_h2, psa_v. We believe that users are more interested in reviewing all of the response spectra ordinates for a single time series rather than viewing each value one at a time. We therefore have "flattened" the view such that each row corresponds to a motion, as represented by the time_series_metadata_id key.sorting
These tables can be sorted by column heading. For example, ?sort=psa_rotd50_0p01 will sort the returned values by the spectral acceleration at a period of 0.01 s for the RotD50 component. We have replaced the decimal point "." with "p" because the decimal point has different meaning in URLs. The response_spectra and fourier_spectra tables contain specific periods / frequencies. You do not need to enter precisely one of those periods / frequencies when sorting or filtering data. The nearest available period / frequency will be selected. For example, if you enter ?sort=psa_rotd50_0p008, you will not receive spectral acceleration at an oscillator period of 0.008s because that's not one of the options. Instead, you will receive spectral acceleration at 0.01s, which is the closest available option.filtering
The tables can be filtered by column heading. For example, ?where=psa_rotd50_0p01>0.2 will return response spectra with spectral acceleration at a natural period of 0.01 seconds with values higher than 0.2 g.components
Valid components for these tables are provided in the table below, along with the default values. You may select as many different components as you'd like, but keep in mind that the amount of data will increase significantly if you include multiple components, and there is a chance that your request will exceed the server limit. You can select multiple components in three different ways. First, you could use a where statement (e.g., ?where=component+IN+('rotd0','rotd50','rotd100','h1','h2','v')) would return data for all componentstable | components | default |
---|---|---|
response_spectra | psa_rotd0, psa_rotd50, psa_rotd100, psa_h1, psa_h2, psa_v | rotd50 |
fourier_spectra | eas, fas_h1, fas_h2 | eas |
flatfile endpoint
The database is organized into many different tables to ensure data integrity, but users often want to query data from multiple tables and join them together in a way that results in repetition of certain fields. For example, users might want to retrieve all event, site, and ground motion data together in the same table, where event and site information is repeated for each ground motion. The flatfile endpoint provides the logic to join together tables that we believe will serve the needs of most users. Specifically, the following tables are queried and presented in a "flattened" form via the flatfile endpoint. By default, the fourier_spectra table is not queried, but can be included by specifying query string parameters as indicated below.
Tables included in flatfile endpoint query
table |
---|
motion |
time_series_metadata |
intensity_measure |
response_spectra |
[fourier_spectra] |
event |
event_type |
finite_fault |
finite_fault_kinematic_parameter |
station |
site |
station |
network |
path |
event_eqid |
station_ssn |
Query string parameters for the flatfile table are a bit more complicated than for the individual tables. A list of available query string parameters and their default values is indicated below. Desired components are specified separately for the intensity_measure, response_spectra, and fourier_spectra tables. It might be cumbersome to specify these separately, but it is not straightforward to infer user intent from a single set of components. For example, if we used a single "components" query string parameter, and a user specified "h1", we would not know if they want fourier_spectra or response_spectra. For clarify, we have separated them, at the expense of longer query strings. Users might not want to return all of the fields in the flatfile. For example, if a user specifies "?fields=psa_rotd50_0p10", they will retrieve only the RotD50 spectral accelerations at T=0.1s. However, we include primary keys from each table because it would otherwise be very difficult to make sense of the resulting table.
query string options for flatfile endpoint
query string parameter | default | description |
---|---|---|
limit | 20 | number of records per page |
sort | primary_key | field to sort by |
direction | asc | sort direction (asc or desc) |
page | 1 | page number |
role | user | Only used for API. Dropdown used to set role for website. |
intensity_measure_components | h1, h2, v, rotd0, rotd50, rotd100 | rotd50 |
response_spectra_components | psa_h1, psa_h2, psa_v, psa_rotd0, psa_rotd50, psa_rotd100 | psa_rotd50 |
fourier_spectra_components | fas_h1, fas_h2, fas_v, eas | none |
fields | [comma-separated list of field names] | all fields from selected tables |
[field name] | value | Used to filter fields (e.g., ?pga_rotd50>0.1 |
Whether data are included from response_spectra, fourier_spectra, or both is inferred from the provided query string parameters. For example, if you specify response_spectra_components, and fourier_spectra_components in the query string, both will be returned. Furthermore, if you include response_spectra or fourier_spectra quantities in the sort, fields, or [field name] query string parameters, those data will be returned. For example, ?psa_rotd50_0p10>0.1&sort=eas_0p10 will return all of the RotD50 response spectral ordinates, and all of the eas Fourier spectra ordinates (unless the "fields" query string parameter is used to reduce the list of returned fields). If you specify "fields", only those fields will be returned, whereas if "fields" is omitted, all fields in each table will be returned.
Example Queries
example 1
Below are a few example queries that show how to specify endpoints and query string parameters.
This is not an exhaustive list, but just a few examples.
example 2
Query 20 events sorted in descending order by magnitude:
https://www.gmdatabase.org/events?limit=20&sort=magnitude&direction=desc
example 3
Query 20 results for the rotd50 response_spectra component sorted in descending order by spectral acceleration at 0.1 second period:
https://www.gmdatabase.org/responseSpectra?limit=20&sort=psa_rotd50_0p100&direction=desc
example 4
Query results 40-60 (i.e., the 3rd page with limit=20) for the RotD50 and h1 response_spectra components sorted in descending order by RotD50 spectral acceleration at 0.1 second period:
https://www.gmdatabase.org/responseSpectra?limit=20&sort=psa_rotd50_0p10&direction=desc&components=psa_rotd50,h1&page=3
example 5
Query results 40-60 for the RotD50 response_spectra components sorted in descending order by RotD50 spectral acceleration at 0.1 second period for values with Rotd50 spectral acceleration at 0.01s less than 0.1g.
https://www.gmdatabase.org/responseSpectra?limit=20&sort=psa_rotd50_0p10&direction=desc&psa_rotd50_0p01>0.1&page=3
We don't know
why anyone would want to limit and sort by spectral acceleration at different natural periods, but we're not judgy. This is just demonstrating how to use the query string fields.
example 6
Query response_spectra flatfile for the RotD50 components sorted in descending order by RotD50 spectral acceleration at 0.1 second period.
https://www.gmdatabase.org/flatfile?sort=psa_rotd50_0p10&direction=desc
We don't know
why anyone would want to limit and sort by spectral acceleration at different natural periods, but we're not judgy. This is just demonstrating how to use the query string fields.