Chemistry Query Agent v1.4.1
name: pharmaclaw-chemistry-query
by cheminem · published 2026-03-22
$ claw add gh:cheminem/cheminem-pharmaclaw-chemistry-query
---
name: pharmaclaw-chemistry-query
description: Chemistry agent skill for PubChem API queries (compound info/properties, structures/SMILES/images, synthesis routes/references) + RDKit cheminformatics (SMILES to molecule props/logP/TPSA, 2D PNG/SVG viz, Morgan fingerprints, retrosynthesis/BRICS disconnects, multi-step synth planning). Use for chemistry tasks involving compounds, molecules, structures, PubChem data, RDKit analysis, SMILES processing, synthesis routes, retrosynthesis, reaction simulation. Triggers on chemistry, compounds, molecules, chemical data/properties, PubChem, RDKit, SMILES, structures, synthesis, reactions, retrosynthesis, synth plan/route.
type: code
dependencies:
  python: ">=3.10"
  python-packages:
    - rdkit
    - gradio
    - pandas
    - Pillow
  system:
    - java (JRE 8+ for OPSIN, optional — only needed for IUPAC name→SMILES conversion)
  external-apis:
    - PubChem REST API (public, no key required)
    - ChEMBL API (public, no key required)
    - PubMed/NCBI E-utilities (public, no key required)
notes: >
  OPSIN JAR (13.8MB) is auto-downloaded on first use of IUPAC name conversion
  with pinned SHA-256 checksum verification (d25bc08f...). All other functionality
  works without Java/OPSIN.
---
# Chemistry Query Agent v1.4.1
## Overview
Full-stack chemistry toolkit combining PubChem data retrieval with RDKit molecule processing, visualization, analysis, retrosynthesis, and synthesis planning. All outputs are structured JSON for easy downstream chaining. Generates PNG/SVG images on demand.
**Key capabilities:**
## Quick Start

```bash
# PubChem compound info
exec python scripts/query_pubchem.py --compound "aspirin" --type info

# Molecular properties from SMILES
exec python scripts/rdkit_mol.py --smiles "CC(=O)Oc1ccccc1C(=O)O" --action props

# Retrosynthesis
exec python scripts/rdkit_mol.py --target "CC(=O)Oc1ccccc1C(=O)O" --action retro --depth 2

# Full chain (name → props + draw + retro)
exec python scripts/chain_entry.py --input-json '{"name": "caffeine", "context": "user"}'
```

## Scripts
### `scripts/query_pubchem.py`
PubChem REST API queries with automatic name→CID resolution and timeout handling.
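
The name→CID resolution and timeout handling described above might look roughly like the following. This is a minimal sketch built on the public PubChem PUG REST endpoint, not the code in `query_pubchem.py`; the function names `cid_lookup_url` and `resolve_cid`, the 10-second default timeout, and the shape of the returned dict are all assumptions made for illustration.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

PUG_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def cid_lookup_url(name: str) -> str:
    """Build the PUG REST URL that maps a compound name to its CIDs."""
    quoted = urllib.parse.quote(name, safe="")
    return f"{PUG_BASE}/compound/name/{quoted}/cids/JSON"


def resolve_cid(identifier: str, timeout: float = 10.0) -> dict:
    """Resolve a name or CID string to a CID (hypothetical helper).

    A purely numeric identifier is treated as a CID and returned without a
    network call; anything else is looked up by name. Failures are reported
    as a dict instead of being raised, so the caller can always emit JSON.
    """
    identifier = identifier.strip()
    if identifier.isdigit():
        return {"status": "success", "cid": int(identifier)}
    try:
        url = cid_lookup_url(identifier)
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            payload = json.load(resp)
        # PUG REST returns {"IdentifierList": {"CID": [...]}} for this query.
        return {"status": "success", "cid": payload["IdentifierList"]["CID"][0]}
    except (urllib.error.URLError, TimeoutError, KeyError, IndexError, ValueError) as exc:
        return {"status": "error", "error": f"{type(exc).__name__}: {exc}"}


if __name__ == "__main__":
    print(cid_lookup_url("acetylsalicylic acid"))
    print(resolve_cid("2244"))  # numeric input, so no request is made
```

Returning an error dict rather than raising keeps the output machine-readable when PubChem is slow or the name is unknown.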
```
--compound <name|CID> --type <info|structure|synthesis|similar> [--format smiles|inchi|image|json] [--threshold 80]
```

### `scripts/rdkit_mol.py`
RDKit cheminformatics engine. Resolves names via PubChem automatically.
```
--smiles <SMILES> --action <props|draw|fingerprint|similarity|substruct|xyz|react|retro|plan>
```

| Action | Description | Key Args |
|--------|-------------|----------|
| props | MW, logP, TPSA, HBD, HBA, rotB, aromRings | `--smiles` |
| draw | 2D PNG/SVG (300×300) | `--smiles --output file.png --format png\|svg` |
| retro | BRICS recursive retrosynthesis | `--target <SMILES\|name> --depth N` |
| plan | Multi-step retro route | `--target <SMILES\|name> --steps N` |
| react | Forward reaction via SMARTS | `--reactants "smi1 smi2" --smarts "<SMARTS>"` |
| fingerprint | Morgan fingerprint bitvector | `--smiles --radius 2` |
| similarity | Tanimoto similarity scoring | `--query_smiles --target_smiles "smi1,smi2"` |
| substruct | Substructure matching | `--query_smiles --target_smiles "smi1,smi2"` |
| xyz | 3D coordinates (MMFF optimized) | `--smiles` |
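
To make the `similarity` row concrete: Tanimoto similarity compares two fingerprints by the ratio of shared on-bits to total distinct on-bits. The sketch below computes it in plain Python over sets of bit indices. It only illustrates the formula and is not how `rdkit_mol.py` or RDKit implements it; the bit sets are hypothetical, and returning 0.0 for two empty fingerprints is a convention chosen here.

```python
def tanimoto(bits_a: set[int], bits_b: set[int]) -> float:
    """Tanimoto coefficient |A ∩ B| / |A ∪ B| over sets of on-bit indices."""
    if not bits_a and not bits_b:
        return 0.0  # convention chosen for this sketch
    shared = len(bits_a & bits_b)
    return shared / (len(bits_a) + len(bits_b) - shared)


def rank_by_similarity(query: set[int], targets: dict[str, set[int]]) -> list[tuple[str, float]]:
    """Score every target against the query, most similar first."""
    scores = [(name, tanimoto(query, bits)) for name, bits in targets.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical on-bit indices standing in for real Morgan fingerprints.
    query = {1, 2, 3}
    targets = {"close": {2, 3, 4}, "far": {7, 8, 9}}
    print(rank_by_similarity(query, targets))
```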
### `scripts/chain_entry.py`
Standard agent chain interface. Accepts `{"smiles": "...", "context": "..."}` or `{"name": "...", "context": "..."}`. Returns unified JSON with props, visualization, and retrosynthesis.
```bash
python scripts/chain_entry.py --input-json '{"name": "sotorasib", "context": "user"}'
```

Output schema:

```json
{
  "agent": "chemistry-query",
  "version": "1.4.0",
  "smiles": "<canonical>",
  "status": "success|error",
  "report": {"props": {...}, "draw": {...}, "retro": {...}},
  "risks": [],
  "viz": ["path/to/image.png"],
  "recommend_next": ["pharmacology", "toxicology"],
  "confidence": 0.95,
  "warnings": [],
  "timestamp": "ISO8601"
}
```

### `scripts/templates.json`
21 named reaction templates with SMARTS, expected yields, conditions, and references. Includes: Suzuki, Heck, Buchwald-Hartwig, Grignard, Wittig, Diels-Alder, Click, Sonogashira, Negishi, and more.
## Chaining
1. **Name → Full Profile:** `chain_entry.py` with `{"name": "ibuprofen"}` → props + draw + retro
2. **Chemistry → Pharmacology:** Output feeds directly into `pharma-pharmacology-agent`
3. **Retro + Viz:** Get precursors, then draw each one
4. **Suzuki Test:** `--action react --reactants "c1ccccc1Br c1ccccc1B(O)O" --smarts "[c:1][Br:2].[c:3][B](O)O>>[c:1][c:3]"`
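
A downstream step could consume the chain output along these lines. This is a sketch under assumptions: it relies only on the `status`, `smiles`, and `recommend_next` fields from the output schema, and the helper name `plan_handoff` and the `"chemistry-query"` context string are hypothetical. The input format expected by the other agents is not documented here.

```python
import json


def plan_handoff(chain_output: str) -> list[dict]:
    """Turn one chain_entry.py result into one payload per recommended agent."""
    result = json.loads(chain_output)
    if result.get("status") != "success":
        raise ValueError(f"chemistry step failed: {result.get('warnings', [])}")
    # Assumption: downstream agents accept the same {"smiles", "context"} shape.
    payload = {"smiles": result["smiles"], "context": "chemistry-query"}
    return [{"agent": name, "input": payload} for name in result.get("recommend_next", [])]


if __name__ == "__main__":
    # Hand-written sample that follows the documented output schema.
    sample = json.dumps({
        "agent": "chemistry-query",
        "smiles": "CN1C=NC2=C1C(=O)N(C(=O)N2C)C",
        "status": "success",
        "recommend_next": ["pharmacology", "toxicology"],
    })
    for step in plan_handoff(sample):
        print(step["agent"], json.dumps(step["input"]))
```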
## Tested With
All features verified end-to-end with RDKit 2024.03+:
| Molecule | SMILES | Tests Passed |
|----------|--------|-------------|
| Caffeine | `CN1C=NC2=C1C(=O)N(C(=O)N2C)C` | info, structure, props, draw, retro, plan, chain |
| Aspirin | `CC(=O)Oc1ccccc1C(=O)O` | info, structure, props, draw, retro, plan, chain |
| Sotorasib | PubChem name lookup | info, structure, props, draw, retro, chain |
| Ibuprofen | PubChem name lookup | info, structure, props, chain |
| Invalid SMILES | `XXXINVALID` | Graceful JSON error |
| Empty input | `{}` | Graceful JSON error |
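
The "Graceful JSON error" rows suggest that bad input yields a JSON result rather than a traceback. One way to get that behaviour is sketched below. It is an illustration, not the logic in `chain_entry.py`: the helper names are hypothetical, and carrying the message in `warnings` is an assumption based on the output schema. SMILES validity itself would still need an RDKit parse, which is left out here.

```python
import json
from datetime import datetime, timezone


def error_envelope(message: str) -> dict:
    """Build an error result using field names from the output schema."""
    return {
        "agent": "chemistry-query",
        "status": "error",
        "warnings": [message],  # assumption: the message is reported here
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def parse_chain_input(raw: str) -> dict:
    """Validate --input-json; return the parsed input or an error envelope."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return error_envelope(f"invalid JSON: {exc.msg}")
    if not isinstance(data, dict) or not (data.get("smiles") or data.get("name")):
        return error_envelope("input must contain 'smiles' or 'name'")
    return {"status": "ok", "input": data}


if __name__ == "__main__":
    print(json.dumps(parse_chain_input("{}")))
    print(json.dumps(parse_chain_input('{"name": "caffeine", "context": "user"}')))
```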
## Resources

### `scripts/advanced_chem.py`
Advanced cheminformatics engine with 6 Tier 1 capabilities.
```
--action <standardize|descriptors|scaffold|mcs|mmpa|chemspace> --smiles <SMILES> [options]
```

| Action | Description | Key Args |
|--------|-------------|----------|
| standardize | Salt stripping, charge normalization, tautomer enumeration | `--smiles` |
| descriptors | 217+ molecular descriptors (RDKit full set), QED, SA Score, Lipinski/Veber rules | `--smiles --descriptor_set all\|druglike\|physical\|topological` |
| scaffold | Murcko scaffold extraction, generic scaffolds, diversity analysis, R-group decomposition | `--smiles` or `--target_smiles "smi1,smi2,..."` `--rgroup_core <SMARTS>` |
| mcs | Maximum Common Substructure across 2+ molecules | `--target_smiles "smi1,smi2,..."` |
| mmpa | Matched Molecular Pair Analysis — find single-point transformations | `--target_smiles "smi1,smi2,..."` |
| chemspace | Chemical space visualization (PCA/t-SNE/UMAP scatter plot PNG) | `--target_smiles "smi1,smi2,..." --method pca\|tsne\|umap --output plot.png` |
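
The Lipinski and Veber rules mentioned under `descriptors` reduce to threshold checks on a handful of properties. The sketch below applies the standard cut-offs to a property dict. The key names mirror the `props` action's column labels (`MW`, `logP`, `HBD`, `HBA`, `rotB`, `TPSA`) and are assumptions about the JSON keys. Allowing one Lipinski violation is a common convention, not something this skill documents, and the input values are hypothetical.

```python
# Standard rule-of-five and Veber thresholds (upper bounds, inclusive).
LIPINSKI_LIMITS = {"MW": 500.0, "logP": 5.0, "HBD": 5, "HBA": 10}
VEBER_LIMITS = {"rotB": 10, "TPSA": 140.0}


def violations(props: dict, limits: dict) -> list[str]:
    """Names of the properties that exceed their limit."""
    return [key for key, limit in limits.items() if props[key] > limit]


def druglikeness(props: dict) -> dict:
    """Summarize Lipinski and Veber compliance for one molecule."""
    lipinski = violations(props, LIPINSKI_LIMITS)
    veber = violations(props, VEBER_LIMITS)
    return {
        "lipinski_violations": lipinski,
        "lipinski_pass": len(lipinski) <= 1,  # convention: one violation tolerated
        "veber_violations": veber,
        "veber_pass": not veber,
    }


if __name__ == "__main__":
    # Hypothetical property values, not computed from a real molecule.
    example = {"MW": 620.0, "logP": 5.6, "HBD": 2, "HBA": 8, "rotB": 12, "TPSA": 95.0}
    print(druglikeness(example))
```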
**Examples:**
```bash
# Standardize a salt form
python scripts/advanced_chem.py --action standardize --smiles "[Na+].CC(=O)[O-]"

# Full descriptors (217+)
python scripts/advanced_chem.py --action descriptors --smiles "CC(=O)Oc1ccccc1C(=O)O" --descriptor_set all

# Scaffold diversity of a set
python scripts/advanced_chem.py --action scaffold --target_smiles "CC(=O)Oc1ccccc1C(=O)O,CN1C=NC2=C1C(=O)N(C(=O)N2C)C,CC(C)Cc1ccc(cc1)C(C)C(=O)O"

# MCS of aspirin and benzoic acid
python scripts/advanced_chem.py --action mcs --target_smiles "CC(=O)Oc1ccccc1C(=O)O,c1ccccc1C(=O)O"

# Matched molecular pairs
python scripts/advanced_chem.py --action mmpa --target_smiles "c1ccc(CC(=O)O)cc1,c1ccc(CCC(=O)O)cc1"

# Chemical space PCA plot
python scripts/advanced_chem.py --action chemspace --target_smiles "CC(=O)Oc1ccccc1C(=O)O,CN1C=NC2=C1C(=O)N(C(=O)N2C)C,c1ccccc1" --method pca --output space.png
```

## Changelog
**v2.0.0** (2026-02-28)
- Molecular Standardization & Tautomer Enumeration (salt stripping, charge normalization, canonical tautomers)
- Extended Descriptors (217+ RDKit descriptors, QED, SA Score, Lipinski, Veber)
- Scaffold Analysis (Murcko, generic scaffolds, diversity ratio, R-group decomposition)
- Maximum Common Substructure (rdFMCS with coverage per molecule)
- Matched Molecular Pair Analysis (rdMMPA fragmentation, transformation detection)
- Chemical Space Visualization (PCA/t-SNE/UMAP with matplotlib scatter plots)
**v1.4.1** (2026-02-25)
**v1.4.0** (2026-02-14)
**v1.3.0**
**v1.2.0**