Running a Project

Local

Running a project file is straightforward. Call the buildstock_local command line tool as follows:

$ buildstock_local --help
...
usage: buildstock_local [-h] [-j J] [-m]
                        [--postprocessonly | --uploadonly | --validateonly | --samplingonly]
                        project_filename

positional arguments:
  project_filename

options:
  -h, --help           show this help message and exit
  -j J                 Number of parallel simulations. Default: all cores.
  -m, --measures_only  Only apply the measures, but don't run simulations.
                       Useful for debugging.
  --postprocessonly    Only do postprocessing, useful for when the simulations
                       are already done
  --uploadonly         Only upload to S3, useful when postprocessing is
                       already done. Ignores the upload flag in yaml
  --validateonly       Only validate the project YAML file and references.
                       Nothing is executed
  --samplingonly       Run the sampling only.

Warning

In general, you should omit the -j argument, which will use all the CPUs you made available to Docker. Setting -j to a number greater than the number of CPUs you made available in Docker will make the simulations run slower, because the concurrent simulations will compete for CPUs.
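
As a minimal sketch of the rule above, the following Python helper caps a requested -j value at the number of CPUs available before building the command line. The function names safe_parallel_jobs and local_command are hypothetical and not part of buildstockbatch. Note also that os.cpu_count() reports what the Python process sees, which may not match the CPU limit configured in Docker, so pass the limit explicitly if you know it.

```python
import os


def safe_parallel_jobs(requested=None, available=None):
    """Return a value for -j that never exceeds the CPUs available.

    requested: desired number of parallel simulations, or None for "all CPUs".
    available: CPU count to cap at; defaults to os.cpu_count().
    """
    if available is None:
        available = os.cpu_count() or 1  # cpu_count() can return None
    if requested is None:
        return available
    if requested < 1:
        raise ValueError("requested must be at least 1")
    return min(requested, available)


def local_command(project_file, requested=None, available=None):
    """Build an argv list; omit -j entirely when no limit was requested."""
    argv = ["buildstock_local"]
    if requested is not None:
        argv += ["-j", str(safe_parallel_jobs(requested, available))]
    argv.append(project_file)
    return argv


# Asking for 64 parallel simulations with only 8 CPUs available is capped at 8.
print(local_command("project.yml", requested=64, available=8))
# ['buildstock_local', '-j', '8', 'project.yml']
```

The resulting list can be passed to subprocess.run if you want to launch the tool from a script.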

Warning

Running with --postprocessonly when postprocessed results already exist from a previous run will overwrite those results.
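
If you want to keep earlier postprocessed results, one option is to copy them aside before rerunning postprocessing. The sketch below is not part of buildstockbatch. The helper name backup_results is hypothetical, and the location of the results directory inside your output directory is an assumption you should check against your own run.

```python
import shutil
import tempfile
import time
from pathlib import Path


def backup_results(results_dir, stamp=None):
    """Copy results_dir to a timestamped sibling directory and return its path.

    Returns None when results_dir does not exist (nothing to protect).
    """
    src = Path(results_dir)
    if not src.is_dir():
        return None
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    dst = src.with_name(f"{src.name}_backup_{stamp}")
    shutil.copytree(src, dst)  # raises FileExistsError if dst already exists
    return dst


# Demonstration in a throwaway directory; "results" and "example.csv" are
# placeholder names, not paths guaranteed by buildstockbatch.
with tempfile.TemporaryDirectory() as tmp:
    results = Path(tmp) / "results"
    results.mkdir()
    (results / "example.csv").write_text("building_id\n1\n")
    copy = backup_results(results, stamp="demo")
    print(copy.name)  # results_backup_demo
```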

NREL HPC (Eagle or Kestrel)

After you have activated the appropriate conda environment on Eagle or Kestrel, you can submit a project file to be simulated by passing it to the buildstock_eagle or buildstock_kestrel command.

$ buildstock_eagle --help
...
usage: buildstock_eagle [-h] [--hipri] [-m]
                        [--postprocessonly | --uploadonly | --validateonly | --samplingonly | --rerun_failed]
                        project_filename

positional arguments:
  project_filename

options:
  -h, --help          show this help message and exit
  --hipri             Submit this job to the high priority queue. Uses 2x node
                      hours.
  -m, --measuresonly  Only apply the measures, but don't run simulations.
                      Useful for debugging.
  --postprocessonly   Only do postprocessing, useful for when the simulations
                      are already done
  --uploadonly        Only upload to S3, useful when postprocessing is already
                      done. Ignores the upload flag in yaml
  --validateonly      Only validate the project YAML file and references.
                      Nothing is executed
  --samplingonly      Run the sampling only.
  --rerun_failed      Rerun the failed jobs

$ buildstock_kestrel --help
...
usage: buildstock_kestrel [-h] [--hipri] [-m]
                          [--postprocessonly | --uploadonly | --validateonly | --samplingonly | --rerun_failed]
                          project_filename

positional arguments:
  project_filename

options:
  -h, --help          show this help message and exit
  --hipri             Submit this job to the high priority queue. Uses 2x node
                      hours.
  -m, --measuresonly  Only apply the measures, but don't run simulations.
                      Useful for debugging.
  --postprocessonly   Only do postprocessing, useful for when the simulations
                      are already done
  --uploadonly        Only upload to S3, useful when postprocessing is already
                      done. Ignores the upload flag in yaml
  --validateonly      Only validate the project YAML file and references.
                      Nothing is executed
  --samplingonly      Run the sampling only.
  --rerun_failed      Rerun the failed jobs

Warning

Running with --postprocessonly when postprocessed results already exist from a previous run will overwrite those results.

Project configuration

To run a project on Kestrel or Eagle, you will need to make a few changes to your Project Definition. First, the output_directory should be in /scratch/your_username/some_directory or somewhere in /projects. Building stock simulations generate a lot of output quickly, and the /scratch and /projects filesystems are equipped to handle that kind of I/O throughput, whereas your /home directory is not.
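
A quick pre-submission check of the output location could look like the sketch below. It only tests whether a path sits underneath /scratch or /projects. The function name on_fast_filesystem is hypothetical, and the check is an illustration, not something buildstockbatch is stated to perform.

```python
from pathlib import PurePosixPath

# Filesystem roots named in the text above as suitable for simulation output.
ALLOWED_ROOTS = (PurePosixPath("/scratch"), PurePosixPath("/projects"))


def on_fast_filesystem(output_directory):
    """Return True if output_directory is a subdirectory of an allowed root."""
    path = PurePosixPath(output_directory)
    # path.parents lists every ancestor directory, so this is a true
    # path-component comparison ("/scratchy/x" does not match "/scratch").
    return any(root in path.parents for root in ALLOWED_ROOTS)


print(on_fast_filesystem("/scratch/your_username/some_directory"))  # True
print(on_fast_filesystem("/home/your_username/some_directory"))     # False
```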

Next, you will need to add a kestrel or eagle top-level key to the project file (see Kestrel Configuration or Eagle Configuration), which will look something like this:

kestrel:  # or eagle
  account: your_hpc_allocation
  n_jobs: 100  # the number of concurrent nodes to use
  minutes_per_sim: 2
  sampling:
    time: 60  # the number of minutes you expect sampling to take
  postprocessing:
    time: 180  # the number of minutes you expect post processing to take

In general, be conservative with the time estimates. It can be helpful to run a small batch with generous estimates and then look at the output logs to see how long things really took before submitting a full batch simulation.
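
To get a feel for how n_jobs and minutes_per_sim interact, the back-of-the-envelope sketch below estimates the wall time of one array job. It assumes simulations are split evenly across jobs and that each node runs one simulation per core at a time. That model, the function name estimate_job_minutes, and the cores_per_node value used in the example are assumptions for illustration, not a description of how buildstockbatch computes its own time requests.

```python
import math


def estimate_job_minutes(n_sims, n_jobs, minutes_per_sim, cores_per_node):
    """Rough wall-time estimate, in minutes, for a single array job."""
    if min(n_sims, n_jobs, cores_per_node) < 1 or minutes_per_sim <= 0:
        raise ValueError("all inputs must be positive")
    sims_per_job = math.ceil(n_sims / n_jobs)           # even split across jobs
    rounds = math.ceil(sims_per_job / cores_per_node)   # one sim per core per round
    return rounds * minutes_per_sim


# 100,000 simulations over 100 jobs at 2 minutes each; 36 cores per node is
# an illustrative figure, so substitute the value for your system.
print(estimate_job_minutes(100_000, 100, 2, cores_per_node=36))  # 56
```

Comparing an estimate like this with the timings in the logs of a small trial batch shows whether minutes_per_sim needs to be raised.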

Re-running failed array jobs

Running buildstockbatch on HPC breaks the simulation into an array of jobs, the size of which you set with the n_jobs configuration parameter. Each of those jobs runs a batch of simulations on a single compute node. Sometimes a handful of jobs will fail. If most of the jobs succeeded, you can resubmit just the jobs that failed with the --rerun_failed command line argument rather than rerunning everything. This will also clear out and rerun the postprocessing.
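
If you script your submissions, a small helper can assemble the command line for either HPC tool while allowing at most one of the mutually exclusive mode flags listed in the help output above. This is a sketch only: the helper name hpc_command and the mode argument are hypothetical conveniences, and only the executable and flag names come from the help text.

```python
HPC_TOOLS = {"eagle": "buildstock_eagle", "kestrel": "buildstock_kestrel"}

# Mode flags that the help output shows as mutually exclusive.
MODES = {
    "postprocessonly",
    "uploadonly",
    "validateonly",
    "samplingonly",
    "rerun_failed",
}


def hpc_command(system, project_file, mode=None, hipri=False):
    """Build an argv list for buildstock_eagle or buildstock_kestrel."""
    if system not in HPC_TOOLS:
        raise ValueError(f"unknown system: {system!r}")
    if mode is not None and mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    argv = [HPC_TOOLS[system]]
    if hipri:
        argv.append("--hipri")
    if mode is not None:
        argv.append(f"--{mode}")
    argv.append(project_file)
    return argv


# Resubmit only the failed array jobs on Kestrel.
print(hpc_command("kestrel", "project.yml", mode="rerun_failed"))
# ['buildstock_kestrel', '--rerun_failed', 'project.yml']
```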

Amazon Web Services

Running a batch on AWS is done by calling the buildstock_aws command line tool.

$ buildstock_aws --help
...
usage: buildstock_aws [-h] [-c] [--validateonly] project_filename

positional arguments:
  project_filename

options:
  -h, --help        show this help message and exit
  -c, --clean       After the simulation is done, run with --clean to clean up
                    AWS environment
  --validateonly    Only validate the project YAML file and references.
                    Nothing is executed

AWS Specific Project configuration

For the project to run on AWS, you will need to add a section to your config file, something like this:

aws:
  # The job_identifier should be unique, start with a letter, and be at most 10 characters long; otherwise data loss can occur
  job_identifier: national01
  s3:
    bucket: myorg-resstock
    prefix: national01_run01
  region: us-west-2
  use_spot: true
  batch_array_size: 10000
  # To receive email updates on job progress, accept the request to receive emails that Amazon will send you
  notifications_email: your_email@somewhere.com

See AWS Configuration for details.
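
Because a bad job_identifier can lead to data loss, it may be worth checking it before submitting. The sketch below tests the three constraints stated in the configuration comment above (unique, starts with a letter, at most 10 characters). The function name check_job_identifier and the used argument are hypothetical, and uniqueness is only checked against identifiers you supply yourself, not against anything in AWS.

```python
def check_job_identifier(job_identifier, used=()):
    """Return a list of problems found; an empty list means it looks fine."""
    problems = []
    if not job_identifier or not job_identifier[0].isalpha():
        problems.append("must start with a letter")
    if len(job_identifier) > 10:
        problems.append("must be at most 10 characters")
    if job_identifier in used:
        problems.append("must be unique")
    return problems


print(check_job_identifier("national01"))      # []
print(check_job_identifier("01national_run"))
# ['must start with a letter', 'must be at most 10 characters']
```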

Cleaning up after yourself

When the simulation and postprocessing are complete, run buildstock_aws --clean your_project_file.yml. This will clean up all the AWS resources that were created on your behalf to run the simulations. Your results will still be on S3 and queryable in Athena.