Degeneracy in variable name #330
Description
While looking for a mapping from variable name to long_name, standard_name and units, I found some troubling inconsistencies:
ACCESS-NRI/experiment-metadb#3 (comment)
The variables table in the database has the following schema:
CREATE TABLE variables (
id INTEGER NOT NULL,
name VARCHAR NOT NULL,
long_name VARCHAR,
standard_name VARCHAR,
units VARCHAR,
PRIMARY KEY (id)
);
CREATE INDEX ix_variables_name ON variables (name);
CREATE UNIQUE INDEX ix_variables_name_long_name_units ON variables (name, long_name, units);
Arguably this should also have indexed columns for model and realm, in case of variable name clashes between sub-models and models. In the original conception of the database it only stored COSIMA data, so there was a single model, and AFAIK there were no variable name overlaps between CICE and MOM5.
However, if any other experiment types are stored in the DB, variable name clashes become more likely.
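A minimal sketch of what that could look like, using Python's built-in sqlite3 module and an in-memory database. The `model` and `realm` columns, the index name `ix_variables_model_realm_name_long_name_units` and the sample values are all hypothetical; nothing in the current schema defines them.

```python
import sqlite3

# Hypothetical variant of the variables table: `model` and `realm` are
# assumed columns that do not exist in the current schema.
SCHEMA = """
CREATE TABLE variables (
    id INTEGER NOT NULL,
    model VARCHAR NOT NULL,
    realm VARCHAR NOT NULL,
    name VARCHAR NOT NULL,
    long_name VARCHAR,
    standard_name VARCHAR,
    units VARCHAR,
    PRIMARY KEY (id)
);
CREATE INDEX ix_variables_name ON variables (name);
CREATE UNIQUE INDEX ix_variables_model_realm_name_long_name_units
    ON variables (model, realm, name, long_name, units);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

insert = (
    "INSERT INTO variables (model, realm, name, long_name, units) "
    "VALUES (?, ?, ?, ?, ?)"
)
# Same name, long_name and units in two different (made-up) models: allowed.
conn.execute(insert, ("model_a", "ocean", "temp", "temperature", "K"))
conn.execute(insert, ("model_b", "ocean", "temp", "temperature", "K"))

# An exact duplicate within one model and realm is rejected by the index.
try:
    conn.execute(insert, ("model_a", "ocean", "temp", "temperature", "K"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT count(*) FROM variables").fetchone()[0]
```

With this layout a lookup by name alone can still return several rows, so any name-to-metadata mapping would need to be keyed on model and realm as well.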
If you look for variable names that have more than one definition, there are some troubling examples:
sqlite> select * from variables where name not like "%time%" and name in (select name from variables group by name having count(*) > 1);
...
802|vh|Meridional Thickness Flux||m3 s-1
161|vh|Meridional thickness flux||m3 s-1
...
932|zoo|||
515|zoo|zoo||mmol/m^3
698|zoo|zoo||none
897|zoo|zooplankton||mmol/m^3
So vh is defined with long names that differ only in capitalisation!? How does that happen?
There are four distinct versions of the zoo (zooplankton) variable? How does this happen?
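As far as the database is concerned, the unique index on (name, long_name, units) permits all of these rows, because each one differs from the others in long_name or units. The sketch below rebuilds the table in memory with Python's sqlite3 module, loads the rows shown above, and separates names whose definitions differ only by long_name case from the rest. It assumes the empty fields in the output are NULL, and it says nothing about why the source files carry different metadata in the first place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE variables (
    id INTEGER NOT NULL,
    name VARCHAR NOT NULL,
    long_name VARCHAR,
    standard_name VARCHAR,
    units VARCHAR,
    PRIMARY KEY (id)
);
CREATE UNIQUE INDEX ix_variables_name_long_name_units
    ON variables (name, long_name, units);
""")

# Rows copied from the query output above. Empty fields are assumed to be
# NULL; the sqlite shell prints NULL and '' the same way by default.
rows = [
    (802, "vh", "Meridional Thickness Flux", None, "m3 s-1"),
    (161, "vh", "Meridional thickness flux", None, "m3 s-1"),
    (932, "zoo", None, None, None),
    (515, "zoo", "zoo", None, "mmol/m^3"),
    (698, "zoo", "zoo", None, "none"),
    (897, "zoo", "zooplankton", None, "mmol/m^3"),
]
# None of these inserts violates the unique index.
conn.executemany("INSERT INTO variables VALUES (?, ?, ?, ?, ?)", rows)

# Names with more than one definition, as in the query above.
degenerate = conn.execute(
    "SELECT name, count(*) FROM variables GROUP BY name "
    "HAVING count(*) > 1 ORDER BY name"
).fetchall()

# Names whose definitions collapse to one when long_name case is ignored.
case_only = conn.execute(
    "SELECT name FROM variables GROUP BY name "
    "HAVING count(*) > 1 "
    "AND count(DISTINCT lower(ifnull(long_name, '')) || '|' "
    "|| ifnull(units, '')) = 1"
).fetchall()
```

On these sample rows, vh falls into the case-only group, while the four zoo rows differ in long_name or units and stay distinct under this normalisation.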