Workflow Syntax and Execution Runtime¶
This section introduces the YAML syntax used by Popper, describes the workflow execution runtime and shows how to execute workflows in alternative container engines.
A Popper workflow file looks like the following:
steps:
- uses: docker://alpine:3.9
  args: ["ls", "-la"]
- uses: docker://alpine:3.11
  args: ["echo", "second step"]
options:
  env:
    FOO: BAR
  secrets:
  - TOP_SECRET
A workflow specification contains one or more steps in the form of a
YAML list named
steps. Each item in the list is a dictionary
containing at least a
uses attribute, which determines the docker
image being used for that step. An
options dictionary specifies
options that are applied to the workflow.
The following table describes the attributes that can be used for a
step. All attributes are optional with the exception of the
||required The Docker image that will be executed for that step. For example,
||optional Assigns an identifier to the step. By default, steps are assigned a numeric id
corresponding to the order of the step in the list, with
the first step.
||optional Specifies the command to run in the docker image. If
command specified in the
or you want to override the
invoke a shell by default. Using
variables with the
the variables, for example:
refers to a local script, the path is relative to the workspace folder (see
The Workspace section below)
||optional The arguments to pass to the command. This is an array of strings. For example,
refers to a local script, the path is relative to the workspace folder (see
The Workspace section below). Similarly to the
referenced, in order for this reference to be valid, a shell must be invoked (in the
in order to expand the value of the variable.
||optional The environment variables to set inside the container's runtime environment. If
you need to pass environment variables into a step, make sure it runs a command
shell to perform variable substitution. For example, if your
executed in a command shell. Alternatively, if your
command shell as well. See
||optional Specifies the names of the secret variables to set in the runtime environment
which the container can access as an environment variable. For example,
||optional Assume that the given container image already exists and skip pulling it.|
||optional Specifies the working directory for a step. By default, the directory is always
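As a rough illustration of how several of these attributes combine in a single step, the sketch below uses only attribute names that appear elsewhere in this section (id, uses, runs, args, env, secrets and dir). The step identifier, the variable name and all values are made up for the example.

```yaml
steps:
- id: greet                    # hypothetical identifier for this step
  uses: docker://alpine:3.11   # image the step runs in
  runs: sh                     # command to run; a shell is invoked so variables get expanded
  args: ["-c", "echo $GREETING from $PWD"]
  env:
    GREETING: hello            # hypothetical variable, set for this step only
  secrets: [TOP_SECRET]        # name of a secret exposed to the container as an environment variable
  dir: /tmp                    # working directory for the step
```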
Referencing images in a step¶
A step in a workflow can reference a container image defined in a
Dockerfile that is part of the same repository where the workflow
file resides. It can also reference a Dockerfile
contained in a public Git repository. A third option is to directly
reference an image published in a container registry such as
DockerHub. Here are some examples of how you can refer to an
image on a public Git repository or Docker container registry:
||The path to the directory that contains the
path with respect to the workspace directory (see
The Workspace section below). Example:
||A specific branch, ref, or SHA in a public Git repository. If
||A subdirectory in a public Git repository at a specific branch, ref,
||A Docker image published on Docker Hub.
||A Docker image in a public registry other than DockerHub. Note
that the container engine needs to be properly configured to
access the referenced registry in order to download from it.
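The reference forms described above could look as follows in a workflow. This is a sketch rather than an authoritative list: the docker://alpine and popperized/bin/sh@master references are taken from other examples in this section, while the local path, the registry host and the image name are hypothetical and only illustrate the assumed shape of each form.

```yaml
steps:
# Dockerfile in a directory of this repository (hypothetical path, relative to the workspace)
- uses: ./containers/myimage
  args: ["echo", "built from a local Dockerfile"]
# Dockerfile in a public Git repository at a given ref
- uses: popperized/bin/sh@master
  args: ["ls", "-l"]
# Image published on Docker Hub, pinned to a tag
- uses: docker://alpine:3.11
  args: ["echo", "pulled from Docker Hub"]
# Image in another registry (hypothetical host and image name)
- uses: docker://registry.example.com/myorg/myimage:1.0
  args: ["echo", "pulled from another registry"]
```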
It’s strongly recommended to include the version of the image you are using by specifying a SHA or Docker tag. If you don’t specify a version and the image owner publishes an update, it may break your workflows or cause unexpected behavior.
In general, any Docker image can be used in a Popper workflow, but keep in mind the following:
- When the runs attribute for a step is used, the ENTRYPOINT of the image is overridden.
- WORKDIR is overridden and /workspace is used instead (see The Workspace section below).
- The ARG instruction is not supported, thus building an image from a Dockerfile (public or local) only uses its default value.
- While it is possible to run containers that specify a USER other than root, doing so might cause unexpected behavior.
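The first two notes can be observed with a step like the one below. It is a minimal sketch that reuses the alpine image from the earlier examples: runs replaces whatever ENTRYPOINT the image defines, and, going by the notes above, pwd should report /workspace rather than the image's own WORKDIR.

```yaml
steps:
- uses: docker://alpine:3.11
  runs: sh              # takes the place of the image's ENTRYPOINT
  args: ["-c", "pwd"]   # prints the working directory the step runs in
```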
Referencing private GitHub repositories¶
You can reference Dockerfiles located in private GitHub
repositories by defining a GITHUB_API_TOKEN environment variable
that the popper run command reads and uses to clone private
repositories. The repository referenced in the uses attribute is
assumed to be private and, to access it, an API token from GitHub is
needed (see instructions here).
The token needs to have permissions to read the private repository in
question. To run a workflow that references private repositories:
export GITHUB_API_TOKEN=access_token_here
popper run -f wf.yml
If the access token doesn’t have permissions to access private
repositories, the popper run command will fail.
The options attribute can be used to specify environment variables and secrets
that are available to all the steps in the workflow. For example:
options:
  env:
    FOO: var1
    BAR: var2
  secrets: [SECRET1, SECRET2]
steps:
- uses: docker://alpine:3.11
  runs: sh
  args: ["-c", "echo $FOO $SECRET1"]
- uses: docker://alpine:3.11
  runs: sh
  args: ["-c", "echo $ONLY_FOR"]
  env:
    ONLY_FOR: this step
The above shows environment variables, defined in the
options dictionary, that are available to all steps; it also shows
a variable that is available only to a single step (the second step).
This attribute is optional.
This section describes the runtime environment where a workflow executes.
When a step is executed, a folder on your machine is bind-mounted
(shared) to the /workspace folder inside the associated container.
By default, the folder being bind-mounted is $PWD, that is, the
working directory from which popper run is invoked. If the
-w (or --workspace) flag is given, then the value for this
flag is used instead. See the official Docker documentation
for more information about how volumes work with containers.
The following diagram illustrates the relationship between the
filesystem namespace of the host (the machine where popper run is
executing) and the filesystem namespace within the container:
                                  Container
                                  +----------------------+
                                  |  /bin                |
                                  |  /etc                |
                                  |  /lib                |
 Host                             |  /root               |
+-------------------+    bind     |  /sys                |
|                   |    mount    |  /tmp                |
| /home/me/my/proj <------+       |  /usr                |
| ├─ wf.yml         |     |       |  /var                |
| └─ README.md      |     +------> /workspace            |
|                   |             |  ├── wf.yml          |
|                   |             |  └── README.md       |
+-------------------+             +----------------------+
For example, let’s look at a workflow that creates files in the workspace:
steps:
- uses: docker://alpine:3.12
  args: [touch, ./myfile]
The above workflow has a single step that creates the myfile
file in the workspace directory if it doesn’t exist, or updates its
metadata if it already exists, using the touch command.
Assuming the above workflow is stored in a
wf.yml file in
/home/me/my/proj/, we can run it by first changing the current
working directory to this folder:
cd /home/me/my/proj/
popper run -f wf.yml
This results in a new file, /home/me/my/proj/myfile.
However, if we invoke the workflow from a different folder, the folder
being bind-mounted inside the container is a different one. For example:
cd /home/me/
popper run -f /home/me/my/proj/wf.yml
In the above, the file will be written to /home/me/myfile, since
we are invoking the command from /home/me/, and this path is treated
as the workspace folder. If we provide a value for the --workspace
flag (or its short version -w), the workspace path changes and
the file is written to the given location. For example:
cd /
popper run -f /home/me/my/proj/wf.yml -w /home/me/my/proj/
The above writes the file to /home/me/my/proj/myfile even though Popper is
being invoked from /. Note that the above is equivalent to the first
example of this subsection, where we first changed the directory to
/home/me/my/proj and ran popper run -f wf.yml.
Changing the working directory¶
To specify a working directory for a step, you can use the dir
attribute in the workflow. This changes where the specified command is
executed. For example, adding dir as follows:
steps:
- uses: docker://alpine:3.9
  args: [touch, ./myfile]
  dir: /tmp/
And assuming that it is stored in
the workflow as:
cd /home/me
popper run -f wf.yml -w /home/me/my/proj
would result in writing myfile in the /tmp folder that is
inside the container filesystem namespace, as opposed to writing it to
/home/me/my/proj/ (the value given for the -w
flag). As is evident in this example, if the directory specified in
the dir attribute resides outside the /workspace folder, then
anything that gets written to it won’t persist after the step ends its
execution (see “Filesystem namespaces and persistence” below for more).
For completeness, we show an example of using
dir to specify a
folder within the workspace:
steps:
- uses: docker://alpine:3.9
  args: [touch, ./myfile]
  dir: /workspace/my/proj/
cd /home/me
popper run -f wf.yml
would result in having a file in /home/me/my/proj/myfile.
Filesystem namespaces and persistence¶
As mentioned previously, for every step Popper bind-mounts (shares) a
folder from the host (the workspace) into the
/workspace folder in
the container. Anything written to this folder persists. Conversely,
anything that is NOT written in this folder will not persist after the
workflow finishes, and the associated containers get destroyed.
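The difference can be sketched with a two-step workflow. The file names are made up for the example, and it assumes, as described above, that each step runs in its own container with the same workspace mounted at /workspace.

```yaml
steps:
# Writes one file inside the workspace and one outside of it
- uses: docker://alpine:3.9
  runs: sh
  args: ["-c", "echo kept > /workspace/kept.txt && echo lost > /tmp/lost.txt"]
# Runs in a fresh container: only the file written under /workspace is expected to be there
- uses: docker://alpine:3.9
  runs: sh
  args: ["-c", "ls /workspace/kept.txt; ls /tmp/lost.txt || echo '/tmp/lost.txt is gone'"]
```

After the run, kept.txt should also be visible in the workspace folder on the host.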
A step can define, read, and modify environment variables. A step
defines environment variables using the env attribute. For example,
you could set the variables FIRST, MIDDLE, and LAST using this:
steps:
- uses: "docker://alpine:3.9"
  args: ["sh", "-c", "echo my name is: $FIRST $MIDDLE $LAST"]
  env:
    FIRST: "Jane"
    MIDDLE: "Charlotte"
    LAST: "Doe"
When the above step executes, Popper makes these variables available to the container and thus the above prints to the terminal:
my name is: Jane Charlotte Doe
Note that these variables are only visible to the step defining them and any modifications made by the code executed within the step are not persisted between steps (i.e. other steps do not see these modifications).
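A small sketch of this scoping, with illustrative values: the first step overwrites FIRST within its own shell, and the second step, which defines no env of its own, is expected to see neither the original value nor the modification.

```yaml
steps:
- uses: docker://alpine:3.9
  runs: sh
  args: ["-c", "export FIRST=Janet && echo first step sees: $FIRST"]
  env:
    FIRST: "Jane"
# No env here, so FIRST is expected to be unset and the fallback text is printed
- uses: docker://alpine:3.9
  runs: sh
  args: ["-c", "echo second step sees: ${FIRST:-nothing}"]
```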
When Popper executes inside a Git repository, it obtains information
related to Git. These variables are prefixed with
GIT_ (e.g. to
Exit codes and statuses¶
Exit codes are used to communicate about a step’s status. Popper uses
the exit code to set the workflow execution status, which can be
||The step completed successfully and other tasks that depend on it can begin.|
||The configuration error exit status (
terminated but did not fail. For example, a filter step can use a
to stop a workflow if certain conditions aren't met. When a step
returns this exit status, Popper terminates all concurrently running steps and
prevents any future steps from starting. The associated check run shows a
as long as there were no failed or cancelled steps.
||Any other exit code indicates the step failed. When a step fails, all concurrent
steps are cancelled and future steps are skipped. The check run and
check suite both get a
By default, Popper workflows run in Docker on the machine where
popper run is being executed (i.e. the host machine). This section
describes how to execute in other container engines. See next
section for information on how to run workflows
on resource managers such as SLURM and Kubernetes.
To run workflows on other container engines, an --engine
flag can be given to the popper run command, whose value is
one of the supported engines. When no value for this flag is given,
Popper executes workflows in Docker. Below we briefly describe each
container engine supported, and lastly describe how to pass
engine-specific configuration options via the --conf flag.
Docker is the default engine used by popper run. All the
container configuration for the Docker engine is supported by Popper.
Popper can execute a workflow in systems where Singularity 3.2+ is available. To execute a workflow in Singularity containers:
popper run --engine singularity
- The use of Dockerfiles is not supported by Singularity.
- The --reuse flag of the popper run command is not supported.
There are situations where a container runtime is not available and
cannot be installed. In these cases, a step can be executed directly
on the host, that is, in the same environment where the popper
command is running. This is done by making use of the special sh
value for the uses attribute. This value instructs Popper to execute
the command or script given in the runs attribute. For example:
steps:
- uses: "sh"
  runs: ["ls", "-la"]
- uses: "sh"
  runs: "./path/to/my/script.sh"
  args: ["some", "args", "to", "the", "script"]
In the first step above, the ls -la command is executed in the
workspace folder (see “The Workspace” section). The
second one shows how to execute a script. Note that the command or
script specified in the runs attribute is NOT executed in a shell.
If you need a shell, you have to explicitly invoke one, for example:
steps:
- uses: sh
  runs: [bash, -c, 'sleep 10 && true && exit 0']
The obvious downside of running a step on the host is that, depending on the command being executed, the workflow might not be portable.
Custom engine configuration¶
Other than bind-mounting the /workspace folder, Popper runs
containers with any default configuration provided by the underlying
engine. However, a --conf flag is provided by the popper run
command to specify custom options for the underlying engine in
question (see here for more).
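A configuration file passed through this flag might look like the sketch below. Its engine section mirrors the one in the config file shown in the Slurm example later in this section; the file name is hypothetical, and the two options are only illustrative of what the underlying engine might accept.

```yaml
# config.yml (hypothetical file name)
engine:
  name: docker
  options:
    privileged: True          # engine-specific option
    hostname: example.local   # engine-specific option
```

It could then be given to Popper as popper run -f wf.yml -c config.yml, assuming -c is the short form of --conf, as the Slurm example suggests.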
Popper can execute steps in a workflow through other resource managers
like SLURM besides the host machine. The resource manager can be specified
either through the --resource-manager/-r option or through the config file.
If neither of them is provided, the steps are run on the host machine
Popper workflows can run on HPC (Multi-Node environments) using Slurm as the underlying resource manager to distribute the execution of a step to several nodes. You can get started with running Popper workflows through Slurm by following the example below.
Let’s consider a workflow
sample.yml like the one shown below.
steps:
- id: one
  uses: docker://alpine:3.9
  args: ["echo", "hello-world"]
- id: two
  uses: popperized/bin/sh@master
  args: ["ls", "-l"]
To run all the steps of the workflow through the Slurm resource manager,
use the -r option of the popper run subcommand to specify the resource manager.
popper run -f sample.yml -r slurm
For finer control over which steps run through the Slurm resource manager, the specifications can be provided through the config file as shown below.
We create a config file called
config.yml with the following contents.
engine:
  name: docker
  options:
    privileged: True
    hostname: example.local
resource_manager:
  name: slurm
  options:
    two:
      nodes: 2
Now, we execute
popper run with this config file as follows:
popper run -f sample.yml -c config.yml
This runs step one locally on the host and step
two through Slurm on 2 nodes.
Popper executes workflows by default using the
host machine as the resource manager. So, when no resource manager is provided, as in the example below, the workflow runs on the local machine.
popper run -f sample.yml
The above assumes
docker as the container engine and
host as the resource manager to be