Skills Authoring
A Skills bundle is the content you provide to the decision engine's agentic mode. It encodes your domain expertise — judgment criteria, escalation policy, evidence evaluation standards, and example decisions — so that an LLM can make good decisions on behalf of your operation.
The core principle: goals, not procedure
Skills should read like guidance to a thoughtful new colleague, not a runbook.
Good Skills content:
- "If the condition assessment shows major structural damage, escalate to the landlord before scoping repairs — structural work changes the budget and timeline for the entire turn."
- "For routine cleaning delays under 24 hours, reschedule with the same vendor rather than reassigning. Reassignment adds coordination overhead that costs more than the delay."
- "When evaluating photo evidence for condition documentation, look for coverage of all rooms and clear visibility of any damage. Missing rooms mean the assessment is incomplete, and blurry photos should be flagged for follow-up."
Bad Skills content:
- "Step 1: Check if the condition assessment is complete. Step 2: If damage severity is 'major', send an escalation notification. Step 3: Wait for landlord approval. Step 4: Proceed to work identification."
The distinction is strategic. Skills describe what matters and why. The procedural reasoning (the how) lives in the LLM. As underlying models improve, the reasoning applied to your declared intent gets sharper without you having to rewrite your Skills.
If a Skills document reads like a flowchart, you've accidentally written code-in-English — and given up the benefit of the platform improving under you.
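One rough way to catch flowchart-style writing before it ships is a small lint pass over your Skills text. The sketch below is only a heuristic, and the function names and threshold are assumptions made for illustration rather than anything the platform provides. It counts numbered "Step N:" markers and flags text that contains several of them.

```python
import re

# Matches markers such as "Step 1:", "step 2." or "Step 3)".
# The pattern is an illustrative assumption, not a platform feature.
STEP_MARKER = re.compile(r"\bstep\s+\d+\s*[:.)]", re.IGNORECASE)


def count_numbered_steps(text: str) -> int:
    """Count numbered step markers in a piece of Skills content."""
    return len(STEP_MARKER.findall(text))


def reads_like_flowchart(text: str, threshold: int = 2) -> bool:
    """Flag text that contains at least `threshold` numbered steps."""
    return count_numbered_steps(text) >= threshold


if __name__ == "__main__":
    procedural = (
        "Step 1: Check if the condition assessment is complete. "
        "Step 2: If damage severity is 'major', send an escalation notification."
    )
    print(reads_like_flowchart(procedural))
```

A check like this only catches the most obvious cases. Procedural content written without step numbers will pass, so treat a flag as a prompt to reread the passage, not as a verdict.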
What goes in a Skills bundle
Role context
Describe who the decider is standing in for and what they care about. This sets the frame for all decisions.
Example for an apartment-turn workflow:
You are making decisions for a property management operation that turns residential units between tenants. Your priorities are: minimize vacancy days, maintain quality standards the landlord expects, stay within budget, and document everything for the landlord's records. You balance speed against quality — a fast turn that fails verification costs more than a slightly slower one that passes the first time.
Judgment criteria
For each type of decision the engine will make, describe what good judgment looks like. Focus on the tradeoffs and the signals that tip the balance.
Example — deciding whether to reassign or reschedule a vendor:
Vendor delay decisions: A same-day delay under four hours is almost always worth waiting for, especially if the vendor is already familiar with the property. Reassignment adds coordination overhead and the new vendor needs a briefing. But if the vendor has been unreliable on prior turns (two or more no-shows in the last quarter), reassign immediately regardless of delay length. Patterns of unreliability don't self-correct.
Escalation policy
Define when the engine should defer to a human rather than deciding autonomously. Be specific about the conditions and the role that should be consulted.
Example:
Escalate to the landlord when:
- Repair estimates exceed $500 per unit
- Structural damage is identified (not cosmetic)
- The turn timeline will exceed 10 business days
- A vendor dispute cannot be resolved by rescheduling or reassignment
Escalate to the operations manager when:
- Three or more units are blocked simultaneously in the same property
- A vendor has been unreliable on three or more turns
Evidence evaluation
Describe how to assess the quality and completeness of evidence at each stage. This guides the engine's decision about whether to advance or request additional evidence.
Example:
Condition documentation evidence: A complete condition assessment needs photos of every room (kitchen, bathroom, bedrooms, common areas), a severity assessment for any damage found, and a completed move-out checklist. Missing rooms mean the assessment is incomplete — request additional photos before advancing to work identification. Blurry or poorly lit photos should be flagged but don't necessarily block advancement if the checklist is clear about the condition.
Example decisions
Provide concrete examples of decisions the engine might face, with reasoning. These serve as few-shot examples for the LLM.
Good examples show the reasoning, not just the outcome:
Scenario: Unit 4B condition assessment shows minor wall damage in the bedroom and heavy cleaning needed throughout. Estimated repair cost is $200 (within budget). Two vendors are available for cleaning: Vendor A (preferred, available tomorrow) and Vendor B (available today).
Decision: Schedule Vendor A for tomorrow. The one-day delay is worth it for a preferred vendor on a unit that needs heavy cleaning. Heavy cleaning with a non-preferred vendor risks a failed verification, which costs more in re-work than one day of vacancy.
Bundle organization
my-skills/
  manifest.json
  role_context.md
  escalation_policy.md
  evidence_evaluation.md
  judgment_criteria.md
  examples/
    vendor_delay.md
    damage_escalation.md
    evidence_quality.md
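If you follow the layout above, a short script can confirm that a bundle directory is complete before you publish it. This is a minimal sketch: the file list is copied from the example layout, and treating every file as required is an assumption, so adjust `EXPECTED_FILES` to match your own bundles.

```python
from pathlib import Path

# Expected files, taken from the example layout. Treating all of them as
# required is an assumption made for this sketch.
EXPECTED_FILES = [
    "manifest.json",
    "role_context.md",
    "escalation_policy.md",
    "evidence_evaluation.md",
    "judgment_criteria.md",
]


def missing_bundle_files(bundle_dir: str) -> list[str]:
    """Return the expected paths that are absent from the bundle directory."""
    root = Path(bundle_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    # The examples/ directory should exist and hold at least one .md file.
    examples = root / "examples"
    if not examples.is_dir() or not any(examples.glob("*.md")):
        missing.append("examples/*.md")
    return missing


if __name__ == "__main__":
    for path in missing_bundle_files("my-skills"):
        print(f"missing: {path}")
```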
The manifest.json declares which decider names the bundle serves, the version, and any model preferences:
{
  "name": "apartment-turn-skills",
  "version": "1.0.0",
  "deciders": ["turn_coordinator"],
  "model_preferences": {
    "default": "claude-sonnet-4-6"
  }
}
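A quick sanity check on the manifest can catch typos before the engine ever loads the bundle. The sketch below assumes that `name`, `version`, and `deciders` are all required and that `deciders` must be a non-empty list. The example manifest suggests this, but this guide does not confirm it, so treat `REQUIRED_FIELDS` as a hypothetical schema.

```python
import json

# Fields shown in the example manifest. Whether the engine requires all of
# them is an assumption made for this sketch.
REQUIRED_FIELDS = {"name": str, "version": str, "deciders": list}


def manifest_problems(manifest_text: str) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    try:
        manifest = json.loads(manifest_text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(manifest, dict):
        return ["manifest must be a JSON object"]

    problems = []
    for field_name, expected_type in REQUIRED_FIELDS.items():
        if field_name not in manifest:
            problems.append(f"missing field: {field_name}")
        elif not isinstance(manifest[field_name], expected_type):
            problems.append(
                f"{field_name} should be of type {expected_type.__name__}"
            )
    # An empty deciders list would leave the bundle serving nothing.
    if isinstance(manifest.get("deciders"), list) and not manifest["deciders"]:
        problems.append("deciders should name at least one decider")
    return problems


if __name__ == "__main__":
    with open("my-skills/manifest.json", encoding="utf-8") as handle:
        for problem in manifest_problems(handle.read()):
            print(problem)
```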
Authoring tips
Write for judgment, not compliance. If a decision is deterministic — always escalate structural damage, always require four-room photo coverage — encode it as a rule in the workflow definition, not as Skills content. Skills are for the gray areas.
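To make the split concrete, here is a sketch of what the deterministic side might look like when expressed as code. Every name in it is hypothetical: this guide does not show the workflow definition format, and the four room names are an assumption based on the rooms listed in the evidence evaluation example. The point is only that hard rules return a fixed outcome, and everything else falls through to the Skills-guided engine.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical names throughout: the platform's actual workflow rule format
# is not shown in this guide.
REQUIRED_ROOMS = {"kitchen", "bathroom", "bedroom", "common_area"}


@dataclass
class ConditionAssessment:
    structural_damage: bool
    rooms_photographed: set = field(default_factory=set)


def deterministic_outcome(assessment: ConditionAssessment) -> Optional[str]:
    """Return a fixed outcome when a hard rule applies, else None.

    None means no hard rule fired, so the decision is a gray area for the
    Skills-guided engine to handle.
    """
    if assessment.structural_damage:
        return "escalate_to_landlord"
    if REQUIRED_ROOMS - assessment.rooms_photographed:
        return "request_additional_photos"
    return None


if __name__ == "__main__":
    partial = ConditionAssessment(False, {"kitchen", "bathroom"})
    print(deterministic_outcome(partial))
```

Keeping rules like these out of the bundle leaves your Skills content free to explain tradeoffs instead of restating conditions that never vary.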
Include the why. "Reassign after two no-shows" is a rule. "Reassign after two no-shows because patterns of unreliability don't self-correct, and the coordination cost of a third attempt exceeds the cost of finding a new vendor" is judgment the engine can generalize from.
Be specific about tradeoffs. The engine will face situations your examples don't cover. If it understands the tradeoffs — speed vs. quality, cost vs. reliability, documentation completeness vs. advancement speed — it can reason about novel situations.
Update from experience. After the engine has been running, review its decision traces. When it makes a judgment call you disagree with, add an example or refine the criteria. The bundle improves over time.
What's next
- Decision Engine — how the engine uses Skills bundles
- Composable Workflows — the state machine grammar that Skills operate within