Welcome!
Before you begin
This website is your go-to resource for all our tutorials and guides. Before diving in, you might want to explore:
- Our main website at simmate.org
- Our source code at github.com/jacksund/simmate
What is Simmate?
Simmate is a full-stack framework for chemistry research. It helps you calculate properties and explore third-party databases for both molecular and crystalline systems. For experts, it also provides a toolbox for building your own chemistry applications.
The computational side of chemistry research can be intimidating because there are so many programs to choose from, and it's challenging to select and combine them for your specific project. Simmate aims to be a link between the many diverse programs, databases, and utilities out there and, in turn, to simplify the setup and execution of chemistry projects. Third-party datasets and tools are integrated into Simmate as "apps", and the list of supported software and databases keeps growing.
You can mix & match these apps to meet your research needs and to even build out your own custom applications. Simmate includes a core chemical toolkit, workflow management system, database ORM, and web component library -- giving you a framework with essential frontend and backend tools. Our goal is for Simmate to be "batteries-included", so all of these apps & tools are available within the base installation. To learn more about Simmate's scope & design, as well as how it compares to other popular chemistry frameworks, visit our comparisons page.
A Sneak Peek of Features
Prebuilt Workflows
Simmate comes with ready-to-use workflows for most common calculated properties, ranging from simple XRD pattern prediction to intensive dynamic simulations. All workflows can be submitted via a website user interface, the command line, or custom Python scripts:
# in example.yaml
workflow_name: relaxation.vasp.matproj
structure: NaCl.cif
command: mpirun -n 8 vasp_std > vasp.out
simmate workflows run example.yaml
simmate workflows run-quick relaxation.vasp.matproj --structure NaCl.cif
# in example.toml
workflow_name = "relaxation.vasp.matproj"
structure = "NaCl.cif"
command = "mpirun -n 8 vasp_std > vasp.out"
simmate workflows run example.toml
from simmate.workflows.utilities import get_workflow
workflow = get_workflow("relaxation.vasp.matproj")
status = workflow.run(structure="NaCl.cif")
result = status.result()
https://simmate.org/workflows/static-energy/vasp/matproj/submit
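The command-line and Python routes above take the same inputs in different shapes. As a rough sketch of that mapping, the helper below assembles the `simmate workflows run-quick` invocation from a workflow name and keyword parameters. `build_run_quick_command` is a hypothetical name and not part of Simmate. Only the `--structure` option appears on this page, so treating every parameter as a `--name value` pair is an assumption.

```python
import shlex


def build_run_quick_command(workflow_name, **parameters):
    """Build an argument list for `simmate workflows run-quick`.

    Hypothetical helper for illustration only; it is not part of Simmate.
    Assumes each parameter maps to a `--name value` option, which this
    page only confirms for `--structure`.
    """
    args = ["simmate", "workflows", "run-quick", workflow_name]
    for name, value in parameters.items():
        args += [f"--{name}", str(value)]
    return args


command = build_run_quick_command("relaxation.vasp.matproj", structure="NaCl.cif")

# Print the shell-quoted form of the command without running it.
print(shlex.join(command))
```

On a machine where Simmate is installed, the resulting list could be handed to `subprocess.run(command, check=True)`; building a list rather than a single string avoids shell-quoting problems with file names.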
Scalable Orchestration
Simmate can easily scale along with your project, whether you're on a single computer or across thousands of machines. It supports various setups, including university clusters with SLURM or PBS, and cloud platforms using Kubernetes and Docker.
from simmate.engine import workflow

@workflow
def hello(name, **kwargs):  # (1)
    print(f"Hello {name}!")
    print(f"Extra parameters configured for you: {kwargs}")
- We always use **kwargs because Simmate automatically provides extra variables at runtime, such as run_id and directory.
state = workflow.run_cloud(name="Jack") # (1)
result = state.result() # (2)
- On your local computer, schedule your workflow run. This returns a "future-like" object.
- This waits until the job completes and then returns the result. Note that the job won't run until you start a worker that is connected to the same database.
simmate engine start-worker # (1)
- In a separate terminal or even on a remote HPC cluster, you can start a worker that will start running any scheduled jobs in your database.
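The "future-like" object can be unfamiliar, so here is a minimal analogy built on Python's standard `concurrent.futures` module. It is not how Simmate is implemented: here a thread pool stands in for the worker, whereas Simmate's worker is a separate process connected to the same database. The `run_id` value is made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def hello(name, **kwargs):
    # Same shape as the workflow above: extra keyword arguments are accepted
    # because the runner may inject values such as run_id or directory.
    return f"Hello {name}!"


# The executor plays the role of the worker in this analogy.
with ThreadPoolExecutor(max_workers=1) as worker:
    # Scheduling returns immediately with a future, before the job finishes.
    future = worker.submit(hello, name="Jack", run_id="example-run")
    # Asking for the result blocks until the job has completed.
    result = future.result()

print(result)
```

The key point carries over: submitting work and waiting for its result are separate steps, which is what lets the submission happen on one machine and the execution on another.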
Full-Feature Database
Simmate's database can manage your personal data while also integrating with third-party databases such as COD, Materials Project, JARVIS, and others. It automatically constructs tables with common data types by including a wide range of standard columns. You can then access this data through a web interface, REST API, SQL, or Python ORM:
from simmate.database import connect # (1)
from simmate.apps.materials_project.models import MatprojStructure
# Query the database
structures = MatprojStructure.objects.filter(  # (2)
    nsites__gte=3,
    energy__isnull=False,
    density__range=(1, 5),
    elements__icontains='"C"',
    spacegroup__number=167,
).all()
# Convert to excel, a pandas dataframe, toolkit structures, etc.
df = structures.to_dataframe()
structures = structures.to_toolkit()
- Follow the database tutorial to build your initial database with the command simmate database reset
- This filter retrieves structures with at least 3 sites, an energy value, a density between 1 and 5, the element carbon, and spacegroup number 167.
SELECT *
FROM materials_project__structures
WHERE nsites >= 3
AND energy IS NOT NULL
AND density BETWEEN 1 AND 5
AND elements ILIKE '%"C"%'
AND spacegroup_number = 167;
https://simmate.org/third-parties/MatprojStructure/?format=json
https://simmate.org/third-parties/MatprojStructure/
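To see what the filter above selects, the sketch below runs an equivalent query against an in-memory SQLite table. Several things here are assumptions made for illustration: the rows are invented, the `elements` column is assumed to hold a JSON-style list stored as text, and SQLite's `LIKE` (case-insensitive for ASCII by default) replaces PostgreSQL's `ILIKE`.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    """
    CREATE TABLE materials_project__structures (
        id TEXT PRIMARY KEY,
        nsites INTEGER,
        energy REAL,
        density REAL,
        elements TEXT,
        spacegroup_number INTEGER
    )
    """
)

# Invented rows, one per filter condition (not real Materials Project data).
rows = [
    ("demo-1", 10, -35.2, 2.7, '["C", "Ca", "O"]', 167),  # passes every condition
    ("demo-2", 2, -8.1, 2.2, '["C"]', 167),               # fewer than 3 sites
    ("demo-3", 10, None, 2.7, '["C", "Mg", "O"]', 167),   # no energy value
    ("demo-4", 10, -30.0, 3.9, '["Ca", "O"]', 167),       # contains Ca but not C
    ("demo-5", 8, -20.0, 3.5, '["C", "Si"]', 216),        # different spacegroup
]
connection.executemany(
    "INSERT INTO materials_project__structures VALUES (?, ?, ?, ?, ?, ?)", rows
)

matches = connection.execute(
    """
    SELECT id
    FROM materials_project__structures
    WHERE nsites >= 3
      AND energy IS NOT NULL
      AND density BETWEEN 1 AND 5
      AND elements LIKE '%"C"%'
      AND spacegroup_number = 167
    """
).fetchall()

matching_ids = [row[0] for row in matches]
print(matching_ids)
```

Under the storage assumption above, the double quotes inside the pattern `'"C"'` matter: searching for a bare `C` would also match `Ca`, `Cl`, or `Cu`, while the quoted form matches only the standalone symbol.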
Beginner-Friendly Toolkit
Simmate includes a toolkit that builds off popular packages in the chemistry community, specifically rdkit for molecules and pymatgen for periodic crystals. The end result is a toolkit for rapidly prototyping analyses. Here is an example script that is straightforward and clean with Simmate's toolkit, but less intuitive and Pythonic with others:
from simmate.toolkit import Molecule
# STEP 1
molecules = Molecule.from_sdf_file("example.sdf")
# STEP 2
molecules_filtered = []
for molecule in molecules:
    if molecule.num_stereocenters > 3:
        continue
    if molecule.num_atoms_heavy > 30:
        continue
    if "Na" in molecule.elements:
        continue
    molecules_filtered.append(molecule)
# STEP 3
inchi_keys = [m.to_inchi_key() for m in molecules_filtered]
from rdkit import Chem
from rdkit.Chem import FindMolChiralCenters, Descriptors
# STEP 1
molecules = []
with Chem.SDMolSupplier("example.sdf") as supplier:
    for molecule in supplier:
        # The supplier yields None for entries it cannot parse
        if molecule is None:
            continue
        molecules.append(molecule)
# STEP 2
molecules_filtered = []
for molecule in molecules:
    chiral_centers = FindMolChiralCenters(
        molecule,
        force=True,
        includeUnassigned=True,
        useLegacyImplementation=False,
    )
    if len(chiral_centers) > 3:
        continue
    nheavy = Descriptors.HeavyAtomCount(molecule)
    if nheavy > 30:
        continue
    has_na = False  # false until proven otherwise
    for atom in molecule.GetAtoms():
        if atom.GetSymbol() == "Na":
            has_na = True
            break
    if has_na:
        continue
    molecules_filtered.append(molecule)
# STEP 3
inchi_keys = [Chem.MolToInchiKey(m) for m in molecules_filtered]
from openeye import oechem
from openeye import oeiupac
# STEP 1
molecules = []
with oechem.oemolistream("example.sdf") as ifs:
    for mol in ifs.GetOEMols():
        if mol is None:
            continue
        molecules.append(mol)
# STEP 2
molecules_filtered = []
for mol in molecules:
    stereo_count = sum(1 for atom in mol.GetAtoms() if atom.IsChiral())
    if stereo_count > 3:
        continue
    heavy_atom_count = sum(
        1 for atom in mol.GetAtoms() if atom.GetAtomicNum() > 1
    )
    if heavy_atom_count > 30:
        continue
    has_na = any(atom.GetAtomicNum() == 11 for atom in mol.GetAtoms())
    if has_na:
        continue
    molecules_filtered.append(mol)
# STEP 3
inchi_keys = [oeiupac.OECreateInChIKey(m) for m in molecules_filtered]
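All three scripts apply the same STEP 2 rule: keep a molecule only if it has at most 3 stereocenters, at most 30 heavy atoms, and no sodium. As a sketch that runs without any chemistry package, the rule can be written as a single predicate. `MoleculeStub` is a hypothetical stand-in that only borrows the attribute names used in the Simmate script above, and the third entry is an invented molecule.

```python
from dataclasses import dataclass


@dataclass
class MoleculeStub:
    """Hypothetical stand-in exposing the attributes used in STEP 2."""

    name: str
    num_stereocenters: int
    num_atoms_heavy: int
    elements: list


def passes_filter(molecule):
    # Keep molecules with few stereocenters, few heavy atoms, and no sodium.
    return (
        molecule.num_stereocenters <= 3
        and molecule.num_atoms_heavy <= 30
        and "Na" not in molecule.elements
    )


molecules = [
    MoleculeStub("ethanol", 0, 3, ["C", "H", "O"]),
    MoleculeStub("sodium acetate", 0, 5, ["C", "H", "Na", "O"]),
    MoleculeStub("invented-large-molecule", 5, 40, ["C", "H", "N", "O"]),
]

molecules_filtered = [m for m in molecules if passes_filter(m)]
kept_names = [m.name for m in molecules_filtered]
print(kept_names)
```

Because the predicate relies only on attribute access, the same function would presumably work on any object exposing those three attributes.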
Need help?
Post your questions and feedback in our discussion section.