The Supervised-Staged-Shell Workflow¶
S3 = Supervised + Staged + Shell¶
This type of workflow supervises a staged workflow that involves a shell command.
Let's break down what this means...
A shell command is a single call to some external program. For example, VASP requires that we call the "vasp_std > vasp.out" command in order to run a calculation. We consider calling external programs a staged task made up of three steps:
- setup = writing any input files required for the program
- execute = actually calling the command and running our program
- workup = loading data from output files back into python
Supervising the task means that we monitor the program while the execute stage is running. Once a program has started, Simmate can check output files for common errors/issues, even while the program is still running. If an error is found, we stop the program, fix the issue, and then restart it.
graph LR
A[Start] --> B[setup];
B --> C[execute];
C --> D[workup];
D --> E[Has errors?];
E -->|Yes| B;
E -->|No| F[Done!];
Warning
This diagram is slightly misleading because the "Has errors?" check also happens while the execute step is still running. Therefore, you can catch errors before your program even finishes and exits!
Running S3Workflows is the same as running normal workflows (e.g. using the run method), and this entire process of supervising, staging, and shell execution is done for you!
S3Workflows for common programs¶
For programs that are commonly used in materials science, you should also read
through their guides in the "Third-party Software" section. If your program is
listed there, then there is likely a subclass of S3Workflow already built
for you. For example, VASP users can take advantage of VaspWorkflow to build
workflows.
Building a custom S3Workflow¶
Tip
Before starting a custom S3Workflow, make sure you have read the section
above this (on S3Workflows for common programs like VASP). You should also
have gone through the guides on building a custom Workflow.
Simple command call¶
The most basic example of an S3Workflow is just calling some command, without doing anything else (no input files, no error handling, etc.).
Unlike custom Workflows, where we defined a run_config method, S3Workflows
have a pre-built run_config method that carries out the different stages and
monitoring of a workflow for us. So all the work is already done for us!
As an example, let's just use the command echo to print something:
from simmate.engine import S3Workflow


class Example__Echo__SayHello(S3Workflow):
    use_database = False  # we aren't using a custom table for now
    monitor = False  # there is no error handling yet
    command = "echo Hello"


# behaves like a normal workflow
state = Example__Echo__SayHello.run()
result = state.result()
Tip
Note that we used "Echo" in our workflow name. This helps the user see what commands or programs will be called when a workflow is run.
Custom setup and workup¶
Now what if we'd like to write input files or read output files that are created?
Here, we need to update our setup and workup methods:
from simmate.engine import S3Workflow


class Example__Echo__SayHello(S3Workflow):
    use_database = False  # we aren't using a custom table for now
    monitor = False  # there is no error handling yet
    command = "echo Hello > output.txt"  # adds "Hello" into a new file
    some_new_setting = 123  # a new attribute (example value) read by setup below

    @classmethod
    def setup(cls, directory, custom_parameter, **kwargs):
        # The directory given is a pathlib.Path object for the directory
        # that the command will be called in
        print("I'm setting things up!")
        print(f"My new setting value is {cls.some_new_setting}")
        print(f"My new parameter value is {custom_parameter}")
        return  # no need to return anything. Nothing will be done with it.

    @staticmethod
    def workup(directory):
        # The directory given is a pathlib.Path object for the directory
        # that the command was called in
        # Simply check that we have a new file
        output_file = directory / "output.txt"
        assert output_file.exists()
        print("I'm working things up!")
        return "Done!"


# behaves like a normal workflow; custom_parameter (example value) is
# passed along to setup
state = Example__Echo__SayHello.run(custom_parameter=456)
result = state.result()
There are two important things to note here:
- It's optional to write new setup or workup methods. But if you do...
    - Both setup and workup methods should be either a staticmethod or classmethod
    - Custom setup methods require the directory and **kwargs input parameters.
    - Custom workup methods require the directory input parameter
- It's optional to set/overwrite attributes. You can also add new ones too.
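The sketch below illustrates, in plain Python, why these rules are useful. `StagedTask` and `CustomTask` are hypothetical stand-ins, not Simmate classes, and the `run` method here is an assumption about how a runner could forward parameters. Because the runner calls setup on the class and passes along every parameter it receives, a custom setup needs `**kwargs` to absorb the ones it does not use.

```python
from pathlib import Path


class StagedTask:
    """Hypothetical stand-in for a workflow base class (not Simmate's S3Workflow)."""

    command = "echo Hello"

    @classmethod
    def setup(cls, directory, **kwargs):
        return f"default setup for {cls.command!r}"

    @staticmethod
    def workup(directory):
        return f"default workup in {directory.name}"

    @classmethod
    def run(cls, directory, **kwargs):
        # No instance is created, so setup and workup must be callable on
        # the class itself. All extra parameters are forwarded to setup.
        setup_message = cls.setup(directory=directory, **kwargs)
        workup_message = cls.workup(directory)
        return setup_message, workup_message


class CustomTask(StagedTask):
    command = "echo Goodbye"  # overwrite an existing attribute

    @classmethod
    def setup(cls, directory, custom_parameter, **kwargs):
        # Picks out custom_parameter; **kwargs absorbs everything else.
        return f"{cls.command!r} with custom_parameter={custom_parameter}"


messages = CustomTask.run(Path("my_folder"), custom_parameter=5, unrelated_option=True)
print(messages)
```

If `CustomTask.setup` dropped `**kwargs`, the call above would raise a `TypeError` because of the unexpected `unrelated_option` argument.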
Note: S3Workflows for a commonly used program (such as VaspWorkflow for VASP)
will often have custom setup and workup methods already defined for you.
You can update/override these as you see fit.
For a full (and advanced) example of a subclass, take a look at
simmate.apps.vasp.workflows.base.VaspWorkflow and the tasks that use it, like
simmate.apps.vasp.workflows.relaxation.matproj.
Custom error handling¶
TODO -- Contact our team if you would like us to prioritize this guide
Alternatives to the S3Workflow¶
For experts, this class can be viewed as a combination of Prefect's ShellTask,
a Custodian Job, and Custodian monitoring. When subclassing this, we can absorb
the functionality of pymatgen.io.vasp.sets too. Merging all of these together
into one class makes things much easier for users and for creating new tasks.