I noticed that for several disruptive shots from EAST, the EFIT signals fetched fresh from MDSplus (tree=efit_east) do not match those stored in the SQL tables.
I also compared the stored SQL table data retrieved through disruption-py with the same data retrieved through DBeaver for shots 54170, 71003, 71004, and 81630, and they match. This verifies that disruption-py is really returning the stored SQL data, rather than fetching it fresh through MDSplus and labeling it as SQL data (which happens, e.g., when asking for a column that doesn't exist in the SQL table). The question is then which of these EFIT trees is more accurate/reliable.
Judging from the shot numbers, it looks to me that, prior to a certain shot between 71259 and 80009, either:
import logging

from disruption_py.machine.tokamak import Tokamak
from disruption_py.workflow import get_shots_data
from disruption_py.settings import LogSettings, RetrievalSettings
import numpy as np
import matplotlib.pyplot as plt

tokamak = Tokamak.EAST
logger = logging.getLogger("disruption_py")
features = ["ip", "beta_n", "beta_p", "kappa", "li", "q0", "qstar", "q95", "wmhd"]
shotno = 53825

# Get MDSplus data
mdsplus_retrieval_settings = RetrievalSettings(
    efit_nickname_setting="disruption",
    time_setting="disruption_warning",
    run_tags=[],
    run_columns=features,
    only_requested_columns=True,
)
mdsplus_data = get_shots_data(
    tokamak=tokamak,
    shotlist_setting=[shotno],
    retrieval_settings=mdsplus_retrieval_settings,
    output_setting="dataframe",
    num_processes=1,
    log_settings=LogSettings(
        log_to_console=True,
        console_log_level=logging.DEBUG,
    ),
)

# Get SQL DB data
sql_retrieval_settings = RetrievalSettings(
    cache_setting="sql",
    efit_nickname_setting="disruption",
    time_setting="disruption_warning",
    run_tags=[],
    run_columns=features,
    only_requested_columns=True,
)
sql_data = get_shots_data(
    tokamak=tokamak,
    shotlist_setting=[shotno],
    retrieval_settings=sql_retrieval_settings,
    output_setting="dataframe",
    num_processes=1,
)

# Compare data
xrange = (min(mdsplus_data["time"]) - 0.1, max(mdsplus_data["time"]) + 0.1)
fig, axes = plt.subplots(len(features), figsize=(8, 6), dpi=100, sharex=True)
for i, feature in enumerate(features):
    axes[i].plot(mdsplus_data["time"], mdsplus_data[feature], c="b", label="mdsplus")
    axes[i].plot(
        sql_data["time"], sql_data[feature], c="r", linestyle="--", label="sql"
    )
    axes[i].set_ylabel(feature)
    axes[i].set_xlim(*xrange)
axes[0].legend()
plt.suptitle(f"EAST {shotno}")
plt.show()
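To put a number on the mismatch instead of eyeballing the plot, something like the following could be run on the two dataframes. This is only a minimal sketch: `compare_signals` is a hypothetical helper (not part of disruption-py), it assumes both dataframes have a `time` column plus the feature columns as in the script above, and the demo data at the bottom is synthetic, not real EAST data.

```python
import numpy as np
import pandas as pd


def compare_signals(mds_df, sql_df, features, time_col="time"):
    """Interpolate SQL signals onto the MDSplus time base and summarize the gap.

    Hypothetical helper for illustration; not part of disruption-py.
    """
    mds_df = mds_df.sort_values(time_col)
    sql_df = sql_df.sort_values(time_col)
    t_mds = mds_df[time_col].to_numpy(dtype=float)
    t_sql = sql_df[time_col].to_numpy(dtype=float)

    # Only compare where the two time bases overlap, to avoid extrapolation.
    overlap = (t_mds >= t_sql.min()) & (t_mds <= t_sql.max())
    if not overlap.any():
        raise ValueError("MDSplus and SQL time bases do not overlap")

    rows = []
    for feature in features:
        mds = mds_df[feature].to_numpy(dtype=float)[overlap]
        sql = np.interp(
            t_mds[overlap], t_sql, sql_df[feature].to_numpy(dtype=float)
        )
        max_abs = np.nanmax(np.abs(mds - sql))
        scale = np.nanmax(np.abs(mds))
        rows.append(
            {
                "feature": feature,
                "max_abs_diff": max_abs,
                # Relative to the largest MDSplus value of that signal.
                "max_rel_diff": max_abs / scale if scale > 0 else np.nan,
            }
        )
    return pd.DataFrame(rows).set_index("feature")


# Synthetic demo: "li" agrees, "q95" is offset by a constant.
t = np.linspace(0.0, 1.0, 11)
mds_demo = pd.DataFrame({"time": t, "li": 1.0 + t, "q95": 4.0 + 0.0 * t})
sql_demo = pd.DataFrame({"time": t, "li": 1.0 + t, "q95": 4.4 + 0.0 * t})
summary = compare_signals(mds_demo, sql_demo, ["li", "q95"])
print(summary)
```

With the real data this would be called as `compare_signals(mdsplus_data, sql_data, features)`. Running it over a list of shots would give a per-shot table that makes it easier to see at which shot number the two sources start to agree.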
Description of the problem
Example shot data
(branch=wei/east-physics)
Script