desiInstall
Introduction
This document describes the desiInstall process and the logic behind it.
The primary purpose of desiInstall is to install DESI software at NERSC. Using it to install software at locations other than NERSC is theoretically possible, but not supported.
Basic Invocation of desiInstall
desiInstall is invoked with the product and version to be installed:
desiInstall desiutil 3.6.1
The version should correspond to a tag in the repository corresponding to the product.
Branches can also be installed, and the resulting install will be a
checkout of the repository, set to the requested branch. This is specified
by prepending branches/ to the version name:
desiInstall desiutil branches/branch-compile
Finally, for Subversion repositories, trunk is a shorthand for branches/trunk;
for GitHub repositories, main is a shorthand for branches/main. All
other branch names must be prepended with branches/.
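The naming rules above can be summarized in a short sketch. The function name parse_version and its is_github flag are hypothetical and only illustrate the rules as described; they are not desiInstall's actual interface.

```python
def parse_version(version, is_github=True):
    """Classify a version string as a tag or a branch install.

    Hypothetical helper: returns (is_branch, name).
    """
    # 'main' (GitHub) and 'trunk' (Subversion) are the only shorthands.
    shorthand = 'main' if is_github else 'trunk'
    if version == shorthand:
        return True, shorthand
    prefix = 'branches/'
    if version.startswith(prefix):
        return True, version[len(prefix):]
    # Anything else is assumed to be a tag name.
    return False, version
```

With this sketch, parse_version('branches/branch-compile') reports a branch install of branch-compile, while parse_version('3.6.1') reports a tag install.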
Configuring desiInstall
desiInstall has many options, which are best viewed by typing
desiInstall --help.
In addition, desiInstall both reads and sets several environment variables.
Environment variables that strongly affect the behavior of desiInstall.
- DESICONDA
This variable contains the path to the DESI+Anaconda infrastructure.
- DESICONDA_VERSION
This variable should contain the version of the DESI+Anaconda infrastructure.
- DESIUTIL
This variable contains the path to the installed version of desiutil. It is needed to find the etc/desiutil.module file.
- NERSC_HOST
This will automatically be set on NERSC systems. Although it is fine to manipulate this variable during unit tests, for general user and production purposes, it should be considered strictly read-only.
- USER
This variable is read to determine the username to pass to, e.g., svn.
Environment variables that are set by desiInstall for use by pip install or make.
- INSTALL_DIR
This variable is set by desiInstall to the directory that will contain the final, installed version of the software package.
- PRODUCT_VERSION
This variable is set by desiInstall, with PRODUCT replaced by the actual name of the software being installed, e.g., DESISPEC_VERSION.
- WORKING_DIR
This variable is set by desiInstall to the path containing a downloaded, expanded software package.
Environment variables related to the Modules infrastructure that may be manipulated by setting up Modules, or loading Module files.
- LOADEDMODULES
This variable contains a list of the Module files currently loaded. It may be manipulated by desiutil.modules.
- MODULE_VERSION
This variable is set on some NERSC systems and is needed to determine the full path to modulecmd.
- MODULE_VERSION_STACK
This variable is set on some NERSC systems and may be set by desiutil.modules for compatibility.
- MODULEPATH
This variable contains a list of directories containing Module files. It may be manipulated by desiutil.modules.
- MODULESHOME
This variable points to the Modules infrastructure. If it is not set, it typically means that the system has no Modules infrastructure. This is needed to find the executable program that reads Module files.
- PYTHONPATH
Obviously this is important for any Python package! PYTHONPATH may be manipulated by desiutil.modules.
- TCLSH
May be used to determine the full path to modulecmd.tcl on systems with a pure-TCL Modules infrastructure.
Directory Structure Assumed by the Install
desiInstall is primarily intended to run in a production environment that supports Module files, i.e. at NERSC.
desiInstall does not install a Modules infrastructure for you. You have to do this yourself, if your system does not already have this.
For the purposes of this section, we define $product_root as the
directory that desiInstall will be writing to. For standard NERSC installs it
defaults to a pre-defined value. $product_root may contain the following
directories:
- code/
This contains the installed code, the result of pip install . or make install. The code is always placed in a product/version directory. So for example, the full path to desiInstall might be $product_root/code/desiutil/1.8.0/bin/desiInstall.
- modulefiles/
This contains the Module files installed by desiInstall. A Module file is almost always named product/version. For example, the Module file for desiutil might be $product_root/modulefiles/desiutil/1.8.0.
The --root option can override the built-in default value of $product_root,
which is useful for testing:
desiInstall --root $SCRATCH/test_install desispec 0.20.0
In the example above, desispec would be installed in
$SCRATCH/test_install/code/desispec/0.20.0,
with a corresponding Module file at
$SCRATCH/test_install/modulefiles/desispec/0.20.0
Within a $product_root/code/product/version directory, you might see the
following:
- bin/
Contains command-line executables, including Python or Shell scripts.
- data/
Rarely, packages need data files that cannot be incorporated into the package structure itself; such files are installed here. desimodel is an example of this.
- etc/
Miscellaneous metadata and configuration. In most packages this only contains a template Module file.
- lib/pythonX.Y/site-packages/
Contains installed Python code.
X.Y would be, e.g., 3.6 or 3.8.
- py/
Sometimes we need to install a git checkout rather than an installed package. If so, the Python code will live in this directory, not the lib/ directory, and the product’s Module file will be adjusted accordingly.
Stages of the Install
Input Validation
desiInstall checks the command-line input, verifying that the user has specified a product and a version to install.
Product/Version Parsing
Because of the structures of the DESI code repositories, it is sometimes necessary to specify a directory name along with the product name. desiInstall contains a list of known products, but it is not necessarily complete. desiInstall parses the input to determine the base name and base version to install. At this stage desiInstall also determines whether a branch install [1] has been requested.
The internal list of known products can be added to or overridden on the command line:
desiInstall -p new_product:https://github.com/me/new_product new_product 1.2.3
desiInstall -p desiutil:https://github.com/alternate_repository/desiutil desiutil 1.9.9
The -p option can be specified multiple times, though in practice, it only
matters to the product actually being installed.
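As a rough illustration of how name:url arguments to -p could be merged into a product list, consider the sketch below. The function parse_product_options and the dictionary representation of known products are assumptions made for this example, not desiInstall's internal data structures.

```python
def parse_product_options(options, known_products):
    """Merge 'name:url' strings into a copy of known_products.

    Hypothetical helper; known_products maps product name to URL.
    """
    products = dict(known_products)
    for option in options:
        # Split on the first colon only, because the URL contains colons.
        name, sep, url = option.partition(':')
        if not sep or not name or not url:
            raise ValueError("Expected 'name:url', got {0!r}".format(option))
        # A new name is added; an existing name is overridden.
        products[name] = url
    return products
```

Splitting on the first colon only is the important detail here, since a URL such as https://github.com/me/new_product contains a colon of its own.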
Product Existence
After the product name and version have been determined, desiInstall constructs the full URL pointing to the product/version and runs the code necessary to verify that the product/version really exists. Typically, this will be svn ls, unless a GitHub install is detected.
Download Code
The code is downloaded, using svn export for standard (tag) installs, or
svn checkout for branch installs. For GitHub installs, desiInstall
will look for a release tarball, or do a git clone for tag or branch
installs. desiInstall will set the environment variable WORKING_DIR
to point to the directory containing this downloaded code.
Determine Build Type
The downloaded code is scanned to determine the build type. There are several possible build types that are mutually exclusive. They are derived in this order and the first matching method is used:
- py
If a pyproject.toml or a setup.py file is detected, desiInstall will attempt to execute pip install .. This build type can be suppressed with the command line option --compile-c.
- make
If a Makefile is detected, desiInstall will attempt to execute make install.
- src
If a Makefile is not present, but a src/ directory is, desiInstall will attempt to execute make -C src all.
- plain
If no other build type is detected, the downloaded code is simply copied to the final install directory.
It is the responsibility of the code developer to understand these build types and choose the one appropriate for the package being developed.
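The detection order described above might be sketched as follows. The function detect_build_type is hypothetical, and it takes the top-level names found in WORKING_DIR (for example, from os.listdir) rather than scanning the directory itself; the real implementation may differ in detail.

```python
def detect_build_type(entries, compile_c=False):
    """Return 'py', 'make', 'src' or 'plain' for a downloaded package.

    entries: top-level file and directory names in WORKING_DIR.
    compile_c: mimics --compile-c by suppressing the 'py' build type.
    Hypothetical helper following the order given in the text.
    """
    entries = set(entries)
    has_python = bool(entries & {'pyproject.toml', 'setup.py'})
    if has_python and not compile_c:
        return 'py'
    if 'Makefile' in entries:
        return 'make'
    # Only reached when no Makefile is present at the top level.
    if 'src' in entries:
        return 'src'
    return 'plain'
```

Because the checks are ordered, a package that ships both setup.py and a Makefile is treated as 'py' unless --compile-c is given, in which case it falls through to 'make'.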
Determine Install Directory
The install directory is where the code will live permanently. If the
install is taking place at NERSC, the top-level install directory is
predetermined based on the value of NERSC_HOST:
/global/common/software/desi/${NERSC_HOST}/desiconda/${DESICONDA_VERSION}
The actual install directory is determined by appending /code/product/version
to the top-level directory listed above.
If the install directory already exists, desiInstall will exit, unless the
--force parameter is supplied on the command line.
desiInstall will set the environment variable INSTALL_DIR to point to the
install directory.
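A minimal sketch of this stage, assuming the NERSC layout shown above, is given below. The function names nersc_install_dir and ensure_installable are invented for illustration; passing the host and desiconda version as arguments stands in for reading NERSC_HOST and DESICONDA_VERSION from the environment.

```python
import posixpath


def nersc_install_dir(product, version, nersc_host, desiconda_version):
    """Build the install directory for a NERSC install (hypothetical helper)."""
    top = posixpath.join('/global/common/software/desi', nersc_host,
                         'desiconda', desiconda_version)
    return posixpath.join(top, 'code', product, version)


def ensure_installable(install_dir_exists, force=False):
    """Raise if the install directory exists and --force was not given."""
    if install_dir_exists and not force:
        raise FileExistsError('Install directory already exists; '
                              'use --force to overwrite.')
```

In a real run the result of nersc_install_dir would be checked with something like os.path.isdir before calling ensure_installable, and then exported as INSTALL_DIR.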
Module Infrastructure
desiInstall sets up the Modules infrastructure by running code in
desiutil.modules that is based on the Python init file supplied by
the Modules infrastructure.
Find Module File
desiInstall will search for a module file in $WORKING_DIR/etc. If that
module file is not found, desiInstall will use the file that comes with
desiutil (i.e., this product’s own module file).
Load Dependencies
desiInstall will scan the module file identified in the previous stage, and will module load any dependencies found in the file.
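One plausible way to perform such a scan is sketched below. It assumes that dependencies appear in the module file as uncommented lines of the form module load name; both that assumption and the function name find_dependencies are illustrative and not taken from desiInstall itself.

```python
def find_dependencies(module_text):
    """Return the names following 'module load' in a module file.

    Hypothetical helper; commented-out lines are ignored because their
    first word is '#' (or starts with '#') rather than 'module'.
    """
    dependencies = []
    for line in module_text.splitlines():
        words = line.split()
        if len(words) >= 3 and words[0] == 'module' and words[1] == 'load':
            # 'module load a b' may name several modules at once.
            dependencies.extend(words[2:])
    return dependencies
```

Each returned name could then be passed to the module command, in the same way a user would type module load at the shell.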
Configure Module File
desiInstall will scan WORKING_DIR to determine the details that need
to be added to the module file. The final module file will then be written
into the DESI module directory at NERSC. If --default is specified
on the command line, an appropriate .version file will be created. Module
files are always installed with world-read permissions.
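For readers unfamiliar with .version files: in the Environment Modules convention, such a file sits next to the Module files for a product and names the default version. The sketch below shows the conventional content; the helper version_file_text is hypothetical, and the exact text written by desiInstall may differ.

```python
def version_file_text(version):
    """Return conventional .version file content for a default version.

    Hypothetical helper; the first line is the Modules magic cookie.
    """
    return '#%Module1.0\nset ModulesVersion "{0}"\n'.format(version)
```

Writing this text to a file named .version in the product's Module file directory would make module load product resolve to the given version.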
Load Module
desiInstall will load the module file just created to set up any environment
variables needed by the install. At this point it is also safe to assume that
the environment variables WORKING_DIR and INSTALL_DIR exist.
It will also set PRODUCT_VERSION, where PRODUCT will be replaced
by the actual name of the package, e.g., DESIMODEL_VERSION.
For a few packages (speclite, specsim) that use setuptools-scm to dynamically
set the version, a special environment variable, e.g.
SETUPTOOLS_SCM_PRETEND_VERSION_FOR_SPECLITE, is set to work around
the fact that tarballs downloaded from GitHub do not contain the
git metadata needed to construct the version string.
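The variable names described in this stage can be derived mechanically from the product name, as in the following sketch. The function version_variables and its uses_setuptools_scm flag are assumptions for illustration, and the simple upper-casing assumes a product name without hyphens or dots.

```python
def version_variables(product, version, uses_setuptools_scm=False):
    """Return the version-related environment variables for a product.

    Hypothetical helper; assumes product contains no '-' or '.'.
    """
    name = product.upper()
    env = {name + '_VERSION': version}
    if uses_setuptools_scm:
        # Tells setuptools-scm which version to report when git
        # metadata is unavailable, e.g. in a release tarball.
        env['SETUPTOOLS_SCM_PRETEND_VERSION_FOR_' + name] = version
    return env
```

The resulting dictionary could be applied with os.environ.update before pip install . is run.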
Create site-packages
If the build-type ‘py’ is detected, a site-packages directory will be
created in INSTALL_DIR. If necessary, this directory will be
added to Python’s sys.path.
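A sketch of this step is shown below, assuming the lib/pythonX.Y/site-packages layout described earlier in this document. The helper names are hypothetical, and whether the directory is placed at the front or the end of sys.path is a choice made here for illustration only.

```python
import posixpath
import sys


def site_packages_dir(install_dir, version_info=None):
    """Return INSTALL_DIR/lib/pythonX.Y/site-packages (hypothetical helper)."""
    if version_info is None:
        version_info = sys.version_info
    xy = 'python{0}.{1}'.format(version_info[0], version_info[1])
    return posixpath.join(install_dir, 'lib', xy, 'site-packages')


def add_to_search_path(directory, search_path):
    """Insert directory at the front of search_path if not already present."""
    if directory not in search_path:
        search_path.insert(0, directory)
    return search_path
```

In practice the directory would be created with os.makedirs(path, exist_ok=True), and add_to_search_path would be called with sys.path as the second argument.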
Can We Just Copy the Download?
If the build-type is only ‘plain’, or if a branch install is
requested, the downloaded code will be copied to INSTALL_DIR.
Further Python or C/C++ install steps described below will be skipped.
Run pip
If the build-type ‘py’ is detected, pip install . will be run at this point.
Build C/C++ Code
If the build-type ‘make’ is detected, make install will be run in
WORKING_DIR. If the build-type ‘src’ is detected, make -C src all
will be run in INSTALL_DIR.
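The decisions in the last three stages can be collected into one small dispatch function. This is a sketch only: build_commands is a hypothetical name, the directory labels are returned as strings rather than real paths, and running pip in WORKING_DIR is an assumption, since the text does not state where pip is run.

```python
def build_commands(build_type, is_branch):
    """Return (command, directory) pairs to run after the download.

    Hypothetical helper. An empty list means the download is simply
    copied to INSTALL_DIR and no build step is run.
    """
    if is_branch or build_type == 'plain':
        return []
    if build_type == 'py':
        # Assumption: pip is run from the downloaded source tree.
        return [(['pip', 'install', '.'], 'WORKING_DIR')]
    if build_type == 'make':
        return [(['make', 'install'], 'WORKING_DIR')]
    if build_type == 'src':
        return [(['make', '-C', 'src', 'all'], 'INSTALL_DIR')]
    raise ValueError('Unknown build type: {0!r}'.format(build_type))
```

Each command list is in the form accepted by subprocess.run, with the directory label resolved to the corresponding environment variable and passed as cwd.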
Download Extra Data
If desiInstall detects etc/product_data.sh, where product should be
replaced by the actual name of the package, it will download extra data
not bundled with the code. The script should download data directly to
INSTALL_DIR. The script should only be used
with desiInstall and unit tests. Note that there are other, better ways to
install and manipulate data that is bundled with a Python package.
Compile in Branch Installs
In a few cases (fiberassign, specex) code needs to be compiled even when
installing a branch. If desiInstall detects a branch install and
the script etc/product_compile.sh exists, desiInstall will run this
script, supplying the Python executable path as a single command-line argument.
The script itself is intended to be a thin wrapper on e.g.:
#!/bin/bash
py=$1
${py} setup.py build_ext --inplace
Set Version in Branch Installs
In a few cases (speclite, specsim) the version is set dynamically via setuptools-scm. For branch installs, the version string needs to be constructed. Since branch installs do have the necessary git metadata to construct the version, a simple script is all that is needed.
Set Permissions
The permissions of INSTALL_DIR will be recursively set to standard
values under these circumstances:
- World-read, unless --no-world is specified on the command line.
- Unwriteable to all, unless a branch install is being performed, in which case user-write is set.
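The two rules above translate into permission bits roughly as follows. This sketch makes assumptions that the text does not state: that user and group always get read access, and that directories and executables get the matching execute bits. The function name install_mode is hypothetical.

```python
import stat


def install_mode(is_dir, executable=False, world=True, branch=False):
    """Compute a permission mode for one installed file or directory.

    Hypothetical helper. world=False mimics --no-world; branch=True
    adds user-write, as for a branch install.
    """
    mode = stat.S_IRUSR | stat.S_IRGRP  # assumption: user and group can read
    if world:
        mode |= stat.S_IROTH
    if is_dir or executable:
        mode |= stat.S_IXUSR | stat.S_IXGRP
        if world:
            mode |= stat.S_IXOTH
    if branch:
        mode |= stat.S_IWUSR
    return mode
```

Applying the result recursively would amount to walking INSTALL_DIR with os.walk and calling os.chmod on each entry.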
Clean Up
The original download directory, specified by WORKING_DIR, is removed,
unless --keep is specified on the command line.