TEMPORARY FIX: to avoid incompatibilities that were noticed on platforms other than NERSC, the pyproject.toml file now imposes numpy<2 and the corresponding qp and tables_io versions. This may cause clashes with an existing RAIL installation; please be careful and install RAIL_SHIRE in a separate environment.
SHIRE stands for Star-formation HIstory Redshift Estimator. It is a photometric redshift estimation code that works on the principle of Template Fitting (TF).
Unlike most other TF codes, SHIRE does not use SED templates in ASCII files; rather, it synthesizes reference photometry along a specified redshift grid using Stellar Population Synthesis (SPS). To do so, templates are given as a set of parameters compatible with the SPS tool DSPS (Hearin et al., 2023).
SHIRE can be run in two modes:
- The 'SPS' mode, which computes the star-formation rate and the corresponding SED and photometry at any given redshift when building the reference $\mathrm{photometry} = f(z)$ grid;
- The 'Legacy' mode, which computes the SED once at the native redshift of a template, and then shifts it according to $\lambda_\mathrm{obs} = (1+z)\lambda_\mathrm{em}$ to compute the reference $\mathrm{photometry} = f(z)$ grid at any $z$. This is similar to existing TF codes.
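
As an illustration of the wavelength shift used in the 'Legacy' mode, here is a minimal sketch in plain Python. It is not SHIRE's implementation: the function name, the wavelength values and the redshift grid are hypothetical, and it only covers the $\lambda_\mathrm{obs} = (1+z)\lambda_\mathrm{em}$ relation, not the filter integration needed to obtain photometry.

```python
def redshift_wavelengths(rest_wavelengths, z):
    """Return observed-frame wavelengths, lambda_obs = (1 + z) * lambda_em."""
    if z <= -1.0:
        raise ValueError("redshift must be greater than -1")
    return [(1.0 + z) * lam for lam in rest_wavelengths]


# Hypothetical rest-frame wavelength grid (angstroms) and redshift grid.
rest_wavelengths = [1000.0, 4000.0, 5500.0]
z_grid = [0.0, 0.5, 1.0]

# One shifted wavelength grid per redshift; the template fluxes would stay
# attached to the same indices and be integrated through the filters afterwards.
shifted = {z: redshift_wavelengths(rest_wavelengths, z) for z in z_grid}

for z, wavelengths in shifted.items():
    print(z, wavelengths)
```

In the 'SPS' mode, by contrast, the SED itself would be recomputed at each redshift rather than shifted.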
Caution: SHIRE was designed with extensive LSST-like datasets in mind, and therefore uses JAX to be compatible with GPU architectures. This improves its speed greatly on appropriate machines but may cause crashes due to high memory requirements on other platforms. Therefore, SHIRE is likely not the best suited code to be run on a personal laptop or limited-resources shared installation... sorry about that!
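
To get a rough idea of whether a machine is an "appropriate" one, you can ask JAX which devices it sees. The snippet below is a small sketch, not part of rail_shire; the helper name `available_platforms` is hypothetical, and it assumes that JAX reports platform names such as "cpu", "gpu" or "tpu".

```python
def available_platforms():
    """Return the platform names JAX can see, or [] if JAX is not installed."""
    try:
        import jax
    except ImportError:
        return []
    # jax.devices() lists the devices of the default backend.
    return sorted({device.platform for device in jax.devices()})


platforms = available_platforms()
if platforms in ([], ["cpu"]):
    print("No accelerator visible to JAX: SHIRE may be slow or run out of memory here.")
else:
    print("JAX platforms:", platforms)
```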
Before installing any dependencies or writing code, it's a great idea to create a
virtual environment. LINCC-Frameworks engineers primarily use conda to manage virtual
environments. If you have conda installed locally, you can run the following to
create and activate a new environment.
conda create -n <env_name> python=3.11
conda activate <env_name>

This software is meant to be part of a broader software suite: RAIL.
Once you have created a new environment, you can install RAIL by following the production installation steps as described in RAIL's documentation.
This should be enough, unless you are an active RAIL developer, in which case another installation process may be better suited; see RAIL's documentation for details.
Finally, you can clone this project and install it in your environment <env_name>:
conda activate <env_name>
cd <dir_where_to_clone_the_repo>
git clone git@github.com:lsstdesc/rail_shire.git
cd rail_shire
pip install --no-cache-dir .

Alternatively, if you wish to contribute to RAIL_SHIRE, you can install this project for local
development using the following commands:
./.setup_dev.sh
conda install pandoc

Notes:
./.setup_dev.sh will initialize pre-commit for this local repository, so that a set of tests will be run prior to completing a local commit. For more information, see the Python Project Template documentation on pre-commit.
Examples of how to use rail_shire are provided as jupyter notebooks in the examples/ directory, for several datasets.
Note: it will most likely be necessary to update paths in these notebooks so that they work properly!
- For simulated LSST-like data, start with the LSSTsim minimal example, then perhaps move on to LSSTsim and LSSTxROMANsim. These notebooks also showcase some plotting utilities available in rail_shire; however, these can be time-consuming, so it is recommended to skip the corresponding cells in a first run.
- For a small sample of COSMOS data, it is necessary to generate training samples first (pay attention to the paths and datasets used...), then try to adapt the minimal example to the new datasets!
- If you are in an appropriate environment (e.g. at NERSC), you may be tempted to see how to generate your own dataset with RAIL's functionalities: CosmoDC2 data and RomanXRubin data; then adapt the LSSTsim examples above. You can also try running SHIRE on actual Data Preview 1 data, and even on data cross-matched with Euclid!
- Finally, check out the quick examples of post-processing: comparison of prior distributions and PDZ evaluation.
*Please remember that all examples must be adapted to your environment before they can run properly! That includes changing some paths and variable names, commenting/uncommenting cells, etc.*
The structure of photometric redshift estimation with SHIRE is imposed by that of RAIL. First, you must have two datasets: one for training and one for estimation (the "test" set). Then, the necessary steps are summarized below (for detailed examples, syntax and imports, please refer to the example notebooks):
- Load training and test data into the `DataStore` with `DS.add_data(*args, handle_class=TableHandle)` or `DS.read_file(*args, handle_class=TableHandle)`. This will make the `training_data` and `test_data` available as `TableHandle` objects.
- Set up the `ShireInformer` object with `informer = ShireInformer(**config_dict)`.
- Train the informer (i.e. select appropriate templates and fit the prior): `informer.inform(training_data)`. The templates and prior are then available as `TableHandle` and `ModelHandle` objects respectively.
- Set up the `ShireEstimator` object with `estimator = ShireEstimator(model=informer.get_handle('model'), templates=informer.get_handle('templates'), **other_config_dict)`, where you can see how the outputs of the training are used as inputs for the estimation.
- Run the estimation: `estimator.estimate(test_data)`. The outputs are written to an `HDF5` file that can be opened with qp or loaded into the `DataStore` with `DS.read_file(*args, handle_class=QPHandle)` for analysis.
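
The steps above can be strung together as in the sketch below. It is only an outline of the data flow, not a drop-in script: the function `run_shire` and its parameter names are hypothetical, the RAIL and rail_shire classes are passed in as arguments rather than imported, and the actual imports, `read_file` arguments and configuration keys should be taken from the example notebooks.

```python
def run_shire(datastore, informer_cls, estimator_cls, table_handle_cls,
              train_args, test_args, inform_config, estimate_config):
    """Chain the training and estimation steps listed above.

    All arguments are supplied by the caller: the DataStore, the
    ShireInformer / ShireEstimator / TableHandle classes, the positional
    arguments for read_file, and the two configuration dictionaries.
    """
    # 1. Load both datasets as table handles.
    training_data = datastore.read_file(*train_args, handle_class=table_handle_cls)
    test_data = datastore.read_file(*test_args, handle_class=table_handle_cls)

    # 2-3. Set up the informer, then select templates and fit the prior.
    informer = informer_cls(**inform_config)
    informer.inform(training_data)

    # 4. The outputs of the training become inputs of the estimator.
    estimator = estimator_cls(
        model=informer.get_handle("model"),
        templates=informer.get_handle("templates"),
        **estimate_config,
    )

    # 5. Run the estimation and return whatever estimate() returns.
    return estimator.estimate(test_data)
```

Passing the classes in as arguments keeps the sketch independent of the exact import paths, which may differ between RAIL versions.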
This project was automatically generated using the LINCC-Frameworks python-project-template.
A repository badge was added to show that this project uses the python-project-template, however it's up to you whether or not you'd like to display it!
For more information about the project template see the documentation.