seqdec — sequential decision making under uncertainty

seqdec is a small C program that models a classic problem in sequential decision making under uncertainty: a pizza restaurant ordering sausage from a supplier. It evaluates the order-up-to policy, in which inventory reaching a minimum threshold (theta-min) triggers an order that refills it up to a maximum threshold (theta-max). By simulating stochastic demand over many time periods, the program searches the policy parameter space and reports the (theta-min, theta-max) pair that yields the highest profit.

The model also accounts for shipping fees and a free-shipping threshold, reflecting real considerations a small restaurant faces when placing supply orders. The work is inspired by Warren B. Powell's Reinforcement Learning and Stochastic Optimization (2022).
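The policy and the shipping rule described above can be sketched in a few lines of C. This is an illustrative sketch only, not the seqdec source: the struct params fields, the function names, the strict "below theta-min" trigger, and the choice to measure the free-shipping threshold in order value are all assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical cost parameters; names and structure are assumptions. */
struct params {
    double price;         /* revenue per unit sold */
    double unit_cost;     /* purchase cost per unit ordered */
    double ship_fee;      /* flat shipping fee per order */
    double free_ship_min; /* order value at or above which shipping is free */
};

/* Order-up-to rule: when inventory is below theta_min,
 * order enough to bring it back up to theta_max. */
int order_quantity(int inventory, int theta_min, int theta_max)
{
    if (inventory < theta_min && theta_max > inventory)
        return theta_max - inventory;
    return 0;
}

/* Flat fee per order, waived once the order value reaches the threshold. */
double shipping_cost(const struct params *p, int qty)
{
    if (qty <= 0)
        return 0.0;
    if (qty * p->unit_cost >= p->free_ship_min)
        return 0.0;
    return p->ship_fee;
}

/* Profit of one (theta_min, theta_max) pair over a fixed demand scenario.
 * Inventory starts at zero; unmet demand is lost, not backordered. */
double simulate_profit(const struct params *p, const int *demand,
                       size_t periods, int theta_min, int theta_max)
{
    int inventory = 0;
    double profit = 0.0;

    for (size_t t = 0; t < periods; t++) {
        int qty = order_quantity(inventory, theta_min, theta_max);
        profit -= qty * p->unit_cost + shipping_cost(p, qty);
        inventory += qty;

        int sold = demand[t] < inventory ? demand[t] : inventory;
        profit += sold * p->price;
        inventory -= sold;
    }
    return profit;
}
```

Calling simulate_profit for every candidate pair against the same demand array, and keeping the pair with the largest return value, gives the kind of parameter search the program performs. Reusing one demand scenario across all candidates keeps the comparison between pairs fair.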

Goals

How it works

Rank 0 is the leader: it generates the stochastic demand scenarios, broadcasts them to the workers, and partitions the policy parameter ranges. Each worker evaluates its slice of the search space and returns its best (theta-min, theta-max) pair along with the resulting profit. The leader picks the global best and writes it to standard output. See order-up(1) for the full set of flags.
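The partition and pick-the-best steps can be illustrated without any MPI calls. The sketch below is a hedged example under stated assumptions, not the program's actual code: struct slice, struct result, partition, and pick_best are hypothetical names, the search space is split along theta-min only, and workers are numbered from 0 independently of their MPI rank.

```c
/* Half-open slice [begin, end) of theta-min candidates for one worker. */
struct slice {
    int begin;
    int end;
};

/* Split the candidates in [lo, hi) as evenly as possible across
 * nworkers; the first (total % nworkers) workers take one extra. */
struct slice partition(int lo, int hi, int nworkers, int worker)
{
    int total = hi - lo;
    int base = total / nworkers;
    int extra = total % nworkers;
    struct slice s;

    s.begin = lo + worker * base + (worker < extra ? worker : extra);
    s.end = s.begin + base + (worker < extra ? 1 : 0);
    return s;
}

/* What each worker would report back to the leader. */
struct result {
    int theta_min;
    int theta_max;
    double profit;
};

/* Leader-side selection: keep the result with the highest profit.
 * Ties go to the earliest entry. Requires n >= 1. */
struct result pick_best(const struct result *r, int n)
{
    struct result best = r[0];

    for (int i = 1; i < n; i++)
        if (r[i].profit > best.profit)
            best = r[i];
    return best;
}
```

In an MPI program the slice would typically be computed from the process rank, and the per-worker results gathered on rank 0 before a selection like pick_best runs. Those communication calls are left out here so the sketch compiles with a plain C compiler.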

Building

Pick a worker count (typically up to the number of cores per node) and build with FAST=1 for an optimized binary:

Run with sbatch slurm-job.sh on a Slurm system, or invoke mpirun directly (see the run target in the Makefile). The openmpi package must be installed. Note that OpenMPI relies on shmget(2), which precludes the use of pledge(2).

Source