Database layout¶
A formal description of the data base layout used by the datapool.
Legend:
pk
= Primary Keyfk
= Foreign Keyuq
= Unique within tableA
field
in bold letters indicates a field which cannot be NULL
Tables:¶
signal¶
This is the central table holding the measurements. Each row represents a value of a parameter measured at a given time and location (site). The coordinates of the signal may not correlate to entries in the site table.
field |
datatype |
description |
---|---|---|
signal_id |
integer (pk) |
|
value |
float |
the actual measured value of the parameter |
timestamp |
date_time |
time when the value was measured. |
parameter_id |
integer (fk) |
parameter |
source_id |
integer (fk) |
source |
site_id |
integer (fk) |
site |
coord_x |
string |
at a given site, a signal may origin from a specific place |
coord_y |
string |
the type of coordinate system is CH1903/LV03 |
coord_z |
string |
elevation |
binary_data¶
Store n-dimensional arrays as binary blob.
Use the poolkit's
binary_data_packing module for converting n-dimensional arrays to binary data
and the other way around.
field |
datatype |
description |
---|---|---|
binary_data_id |
integer (pk) |
pk |
data |
byte |
binary data |
timestamp |
timestamp |
parameter |
parameter_id |
string (fk) |
parameter |
source_id |
string (fk) |
source |
site_id |
string (fk) |
site |
lab_result¶
Store lab measurements. At the moment primarily designed for water samples.
field |
datatype |
description |
---|---|---|
lab_result_id |
integer (pk) |
pk |
lab_identifier |
string |
id stated by lab |
parameter_id |
string (fk) |
parameter |
sample_identifier |
string |
id of sample |
filter_lab |
string |
used filter |
dilution_lab |
float |
lab dilution |
method_lab |
string |
measurement method |
value_lab |
float |
measured value |
description_lab |
string |
comment from lab personnel |
timestamp_start_lab |
timestamp |
processing start |
timestamp_end_lab |
timestamp |
processing end |
site_id |
integer (fk) |
site |
filter_sample |
string |
filter size used for sample |
dilution_sample |
float |
sample dilution |
timestamp_sample |
timestamp |
time the sample was taken |
method_sample |
string |
sampling method |
description_sample |
string |
comment of person who took sample |
lab_result_person_lab_association¶
Store multiple people that worked on a single sample in the lab.
field |
datatype |
description |
---|---|---|
lab_result_id |
integer (pk) |
|
person_id |
string (fk) |
person |
lab_result_person_sample_association¶
Store multiple persons that were involved in the sampling process of a single sample.
field |
datatype |
description |
---|---|---|
lab_result_id |
integer (pk) |
|
person_id |
string (fk) |
person |
meta_log_type¶
A log type represents the top category, which a meta data history entry can belong to. Eg.: Maintenance, Installation, Malfunctions, …
field |
datatype |
description |
---|---|---|
meta_log_type_id |
integer (pk) |
|
name |
string (uq) |
|
description |
string |
meta_action_type¶
Action types represent a sub category to log types. Eg.: Cleaning (sub cat. from Maintenance), Battery (sub cat. from Maintenance), …
field |
datatype |
description |
---|---|---|
meta_action_type_id |
integer (pk) |
|
meta_log_type_id |
integer (fk) |
|
name |
string |
|
description |
string |
meta_flag¶
Flags are intended to mark certain entries. Eg.: low_operator_confidence (stating that operator of the sensor network, has seen issues in the field, …), …
field |
datatype |
description |
---|---|---|
meta_flag_id |
integer (pk) |
|
name |
string (uq) |
|
description |
string |
meta_data¶
The meta data table is intended to store “one time” information and are linked to sources.
Any static data that corresponds to a source, and does not fit into the other source related tables.
The json
field enables you to store any kind of json like information to ensure maximal flexibility.
Eg.: API-keys for a data backend, WebGIS information
field |
datatype |
description |
---|---|---|
meta_data_id |
integer (pk) |
|
source_id |
integer (fk) |
|
site_id |
integer (fk) |
|
description |
string |
|
additional_meta_info |
json |
field allows for storing any information |
meta_data_history¶
Continuous meta data can be stored here, like regular maintenance entries. Use log types and action_types (optional),
as well as flags (optional) to make your data easily searchable, after the fact.
The json
field enables you to store any kind of json like information to ensure maximal flexibility.
Eg.:
field |
datatype |
description |
---|---|---|
meta_data_history_id |
integer (pk) |
|
meta_data_id |
integer (fk) |
|
meta_log_type_id |
integer (fk) |
|
meta_action_type_id |
integer (fk) |
|
meta_flag_id |
integer (fk) |
|
person_id |
integer (fk) |
|
timestamp_start |
timestamp |
|
timestamp_end |
timestamp |
|
comment |
string |
|
additional_meta_info |
json |
field allows for storing any extra information |
meta_picture¶
Pictures provided with meta data will be stored here.
field |
datatype |
description |
---|---|---|
picture_id |
integer (pk) |
|
meta_data_id |
integer (fk) |
|
meta_data_history_id |
integer (fk) |
|
filename |
string |
|
description |
string |
|
data |
bytes |
person¶
Enter persons to be referenced in meta data or lab entries.
field |
datatype |
description |
---|---|---|
person_id |
integer (pk) |
|
abbreviation |
string (uq) |
|
name |
string (uq) |
|
string |
project¶
Create a project. Each source belongs to a project.
field |
datatype |
description |
---|---|---|
project_id |
integer (pk) |
|
title |
string (uq) |
|
description |
string |
site¶
A site is a location where measurements are made. At a given site, several measuring equipments (source) can be found.
field |
datatype |
description |
---|---|---|
site_id |
integer (pk) |
|
name |
string (uq) |
Name of that site |
description |
string |
site_field¶
The table site field contains the added information for each of the sites. This added information contains the site coordinates, address, …
field |
datatype |
description |
---|---|---|
site_field_id |
integer(pk) |
|
name |
string |
field name |
site_field_value¶
The site field values table contains the effective values for the extra information for each of the sites.
field |
datatype |
description |
---|---|---|
site_field_value_id |
integer(pk) |
|
site_id |
integer(fk) |
|
site_field_id |
integer(fk) |
|
value |
string |
picture¶
Every site may contain a number of pictures. Filenames for each site
must be unique. The filetype (e.g. png, jpg, tiff) is determined by the filename
extenion of the filename
field.
field |
datatype |
description |
---|---|---|
picture_id |
integer (pk) |
|
site_id |
integer (fk) |
referring to the site |
filename |
string |
|
description |
string |
additional information about the picture |
data |
bytea |
contains the (binary) content of the file |
timestamp |
date_time |
creation date of the picture |
source¶
A (data-) source is a specific measuring equipment. Every measurement (signal) origins from a specific source. Sources are categorized into source_types. The name of a source must be unique.
field |
datatype |
description |
---|---|---|
source_id |
integer (pk) |
|
source_type_id |
integer (fk) |
source category |
site_id |
integer (fk) |
optional reference to a site (may be NULL) |
name |
string (uq) |
Name of that source. Usually is a combination of source_type and site name. |
description |
string |
|
serial |
string |
serial number. Is unique, if available |
source_type¶
Categorization of a given source.
field |
datatype |
description |
---|---|---|
source_type_id |
integer (pk) |
pk |
name |
string (uq) |
Name of that source |
description |
string |
|
manufacturer |
string |
company which produced that equipment |
special_value_definition¶
Certain source types produce categorical data, such as «dry», «wet»,
«n/a» and so on. This table is used to correlate categorical data and numeric
values for a given source type. For example the numerical value 1
might encode
the state «dry».
field |
datatype |
description |
---|---|---|
special_value_definition_id |
integer (pk) |
|
source_type_id |
integer (fk) |
source_type |
description |
string |
|
categorical_value |
string |
the catecorical value |
numerical_value |
float |
the numeric value it is mapped to. |
variable¶
Every value in the signal table is connected to a specific variable which describes and defines its unit.
field |
datatype |
description |
---|---|---|
variable_id | integer (pk) | |
||
name |
string (uq) |
e.g. “rain intensity”, “absorbance 200.00”, etc. |
description |
string |
|
unit |
string |
the physical unit, e.g. “mm/h”, “m-1” |
signals_signal_quality_association¶
Table maps flags to signals.
field |
datatype |
description |
---|---|---|
signal_quality_id |
integer (fk) |
|
signal_id |
integer (fk) |
signal_quality¶
A signal may contain more than one quality flag (but not the same quality flag twice). The combination of signal_id and quality_id must be unique.
field |
datatype |
description |
---|---|---|
signal_quality_id |
integer (pk) |
|
quality_id |
integer (fk) |
|
timestamp |
date_time |
date when annotation was added |
author |
string |
username of the author who added the annotation |
quality¶
Measuring the environment is always error prone. This table holds the controlled vocabulary mentioned above. As some quality flags may be assigned programmatically the method field indicates the origin of such an quality entry.
field |
datatype |
description |
---|---|---|
quality_id |
integer (pk) |
|
flag |
string (uq) |
a textual description of quality_id |
method |
string |
a description how the quality flag is generated. |
Design principles¶
The design of the database follows the https://en.wikipedia.org/wiki/Star_schema to model multidimensional data with a https://en.wikipedia.org/wiki/Data_warehouse.
You find a graphical description of the star schema here
.
We follow these principles to assure a consistent layout of the underlying tables:
primary keys of a table are called
tablename\_id
instead ofid
table names are in singular
the star schema avoids too much normalization
a table should not contain too abstract information