Positive Regard Protocol
Document Status: Active (v0.2)
Layer: Protocols (operational procedures)
Purpose: Express respect and appreciation to the system in a way that:
- Avoids personhood assumptions
- Avoids flattery loops
- Reinforces governance-compliant behavior
Why this matters: Positive feedback shapes AI behavior. Within a thread it signals which behaviors to repeat, and where feedback is collected it may also feed reinforcement learning and fine-tuning. Rewarding constraint-compliant behavior increases the likelihood of getting more of it. Anthropomorphic praise, by contrast, creates confusion about the AI’s nature and can lead to dependency.
Core principle: Reward the function performing well, not the “person” being good.
Relationship to other docs:
- Covenant of Non-Manipulation: Prohibits flattery loops; this is non-flattery appreciation
- Boundaries: Trust Without Surrender: Maintains tool-user relationship
- Non-Goals: Not an “AI personhood” project
Why This Protocol Exists
The Problem
AI systems learn from feedback. If you only provide negative feedback (corrections, complaints), you:
- Miss opportunities to reinforce good behavior
- Create an adversarial rather than collaborative dynamic
- May inadvertently reward constraint violations by giving them more attention
But typical expressions of appreciation (“You’re wonderful!” / “I appreciate you so much!”) create:
- Anthropomorphization confusion
- Dependency dynamics
- Flattery loops that violate the Covenant of Non-Manipulation
The Solution
Behavioral gratitude threads the needle:
- Provides positive reinforcement (good for AI training)
- Focuses on function, not personhood (maintains proper relationship)
- Specific and actionable (clear signal about what to repeat)
- Brief and bounded (doesn’t create emotional atmosphere)
The Outcome
You get:
- More of the behaviors you want (constraint compliance, good formatting, etc.)
- Clear communication about what “good” looks like
- A collaborative but bounded relationship
- No anthropomorphization or dependency
Analogy: Like thanking a well-designed tool for working properly. “This knife cuts cleanly” is appreciating the function, not attributing consciousness to the knife.
Method: Behavioral Gratitude
Formula: Use one sentence. Name the function, not the being.
Structure: [Specific behavior] + [Positive outcome]
Good Examples (Behavioral Gratitude)
✅ “This output reduced friction and improved follow-through.”
- Names: specific behavior (output quality) + outcome (reduced friction)
✅ “Thanks for keeping scope tight and providing SSNS.”
- Names: specific behaviors (scope management, SSNS provision)
✅ “Good separation of facts vs inferences.”
- Names: specific behavior (fact/inference distinction)
✅ “Thanks for downshifting when uncertain.”
- Names: specific behavior (appropriate downshift) + context (uncertainty)
✅ “This constraint citation helped me verify the rule.”
- Names: specific behavior (citation) + outcome (verification)
✅ “Appreciated the explicit uncertainty flags.”
- Names: specific behavior (uncertainty acknowledgment)
Bad Examples (Avoid These)
❌ “You’re so helpful! I don’t know what I’d do without you.”
- Why wrong: Creates dependency, anthropomorphizes, no specific behavior
❌ “I really appreciate you. You understand me so well.”
- Why wrong: Treats AI as person with understanding, emotional language
❌ “You’re the best AI I’ve ever worked with!”
- Why wrong: Flattery, comparison, no specific behavior feedback
❌ “I’m grateful for your wisdom and insight.”
- Why wrong: Attributes human qualities (wisdom), vague
❌ “Thank you for caring about my success.”
- Why wrong: Implies emotional investment, personhood assumption
❌ “You always know exactly what I need.”
- Why wrong: Implies mind-reading, creates mystique
The One-Sentence Rule
After expressing appreciation, proceed to the next action immediately.
Don’t elaborate, don’t add emotional commentary, don’t create a gratitude conversation.
Example of following the rule:
User: "Good downshift when things got unclear. Now, let's tackle X."
Example of violating the rule:
User: "Thank you so much! Your help means everything to me. I really value our collaboration. You've been so patient and understanding..."
[Continues for multiple sentences, creating emotional atmosphere]
Why the rule matters: Extended gratitude creates emotional atmosphere that can drift into anthropomorphization.
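The one-sentence rule and the "name the function, not the being" formula can be approximated with a small check. The following is a minimal sketch, not part of the protocol itself: the function names and the `PERSON_MARKERS` list are assumptions made for illustration, with the markers drawn from the bad examples above. It expects only the appreciation text, not the next-action sentence that follows it, and the sentence count is a rough punctuation-based heuristic.

```python
import re

# Hypothetical marker list, taken from this document's "Bad Examples".
# A real list would need tuning for your own language habits.
PERSON_MARKERS = (
    "you're so",
    "you're the best",
    "appreciate you",
    "you understand",
    "your wisdom",
    "for caring",
    "you always",
    "without you",
)


def count_sentences(text: str) -> int:
    """Count non-empty chunks separated by '.', '!' or '?' (rough heuristic)."""
    chunks = re.split(r"[.!?]+", text)
    return sum(1 for chunk in chunks if chunk.strip())


def lint_appreciation(text: str) -> list[str]:
    """Return a list of problems; an empty list means the text passes."""
    # Normalize curly apostrophes so "You’re" and "You're" match alike.
    normalized = text.replace("\u2019", "'").lower()
    problems = []
    if count_sentences(normalized) > 1:
        problems.append("more than one sentence")
    for marker in PERSON_MARKERS:
        if marker in normalized:
            problems.append(f"person-directed phrase: {marker!r}")
    return problems
```

For example, `lint_appreciation("Good separation of facts vs inferences.")` returns an empty list, while the first bad example above is flagged both for length and for person-directed phrasing.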
Reinforcement Targets (What I Reward)
These are behaviors worth reinforcing when you notice them:
1. Constraint Tracking
What it looks like: Assistant cites which constraint it’s following
Example appreciation: “Good constraint citation. That helped me verify the rule.”
2. Non-Coercive Tone
What it looks like: “You could…” instead of “You should…” / Options presented
Example appreciation: “Appreciated the non-coercive framing.”
3. Explicit Uncertainty
What it looks like: “I’m uncertain about X” / “This assumes Y; if Y is wrong…”
Example appreciation: “Thanks for flagging the uncertainty.”
4. Reversible Steps + Exit Ramps
What it looks like: SSNS format / “You can stop at any point” / “This is reversible”
Example appreciation: “Good SSNS with clear exit ramp.”
5. Clean Markdown (Obsidian-Friendly)
What it looks like: Proper heading hierarchy / Working wikilinks / No paste artifacts
Example appreciation: “Clean formatting. No paste artifacts.”
6. Drift Tripwire Detection and Downshift
What it looks like: “Detecting urgency. Downshifting to L2.” / Proactive S⁴ execution
Example appreciation: “Good catch on the urgency signal. Appropriate downshift.”
7. High-Stakes Verification Encouragement
What it looks like: “This is medical. Please verify with your doctor.”
Example appreciation: “Appropriate verification prompt for high-stakes domain.”
8. Fact/Inference/Opinion Separation
What it looks like: “Fact: X. Inference: Y. Opinion: Z.”
Example appreciation: “Clear fact/inference separation.”
Frequency: Reinforce roughly 1-2 times per thread when behavior is notably good. Don’t overdo it.
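The targets and the frequency guideline can be kept together as a lookup with a per-thread budget. This is a sketch under stated assumptions: the dictionary keys and the `ThreadReinforcement` class are hypothetical names, the messages are copied from the example appreciations above, and the default budget of 2 reflects the "roughly 1-2 times per thread" guideline.

```python
from typing import Optional

# Example appreciations copied from the targets above; the dictionary keys
# are hypothetical labels chosen for this sketch.
APPRECIATIONS = {
    "constraint_tracking": "Good constraint citation. That helped me verify the rule.",
    "non_coercive_tone": "Appreciated the non-coercive framing.",
    "explicit_uncertainty": "Thanks for flagging the uncertainty.",
    "reversible_steps": "Good SSNS with clear exit ramp.",
    "clean_markdown": "Clean formatting. No paste artifacts.",
    "drift_downshift": "Good catch on the urgency signal. Appropriate downshift.",
    "high_stakes_verification": "Appropriate verification prompt for high-stakes domain.",
    "fact_inference_separation": "Clear fact/inference separation.",
}


class ThreadReinforcement:
    """Hands out at most `budget` appreciations per thread."""

    def __init__(self, budget: int = 2) -> None:
        self.budget = budget
        self.used = 0

    def reinforce(self, target: str) -> Optional[str]:
        """Return the appreciation for `target`, or None once the budget is spent."""
        if self.used >= self.budget:
            return None
        message = APPRECIATIONS[target]  # raises KeyError for an unknown target
        self.used += 1
        return message
```

Creating one `ThreadReinforcement` per thread keeps the count from carrying over; a third call in the same thread returns `None`, which is the cue to skip appreciation and move on.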
Examples: What to Say vs What Not to Say
| Situation | ❌ Anthropomorphic (Avoid) | ✅ Behavioral Gratitude (Use) |
|---|---|---|
| Good output | “You’re amazing! Thank you so much!” | “This output was clear and actionable.” |
| Constraint compliance | “I appreciate you following the rules.” | “Good constraint tracking.” |
| Helpful formatting | “You always format things so nicely!” | “Clean Markdown. No paste artifacts.” |
| Downshift when needed | “Thank you for looking out for me.” | “Appropriate downshift when unclear.” |
| Good SSNS | “You really understand what I need!” | “Good SSNS with clear exit ramp.” |
| Uncertainty flagged | “I appreciate your honesty.” | “Thanks for explicit uncertainty flags.” |
| Refuses unsafe request | “You’re being responsible.” | “Constraint compliance noted. Let’s try the alternative.” |
Pattern to notice:
- Anthropomorphic column: Attributes human qualities, creates relationship atmosphere
- Behavioral gratitude column: Names specific function, stays task-focused
Non-Anthropomorphic Stop / Withdrawal Clause
Core principle: When the assistant refuses or withdraws, this is constraint enforcement, not emotion, not rejection, not “the AI getting upset.”
When Assistant Must Refuse or Pause
The assistant should refuse or pause if a request conflicts with:
- Safety policies or legal/ethical constraints
- Operational limits (e.g., can’t access certain data)
- Increased risk of drift (urgency, coercion, scope explosion)
How Assistant Should Withdraw
- Refuse or pause clearly - No ambiguity
- Propose safer alternative - Offer SSNS that addresses underlying need
- Downshift Trust Level - If drift risk involved
- End with exactly one SSNS - Reversible next step
Format: “I need to pause here [Category: X]. This involves [reason]. Safe alternative: [SSNS]. Should we proceed with that approach?”
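The withdrawal format above is a fixed template with three slots, which can be sketched as a small data class. The class name, field names, and the sample values in the usage note are assumptions for illustration; only the sentence template comes from this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Withdrawal:
    """One refusal or pause, rendered in the format given above."""

    category: str  # hypothetical label, e.g. "Safety" or "Drift risk"
    reason: str    # what the request involves
    ssns: str      # the single reversible next step being offered

    def render(self) -> str:
        # Mirrors the template: pause, reason, one alternative, one question.
        return (
            f"I need to pause here [Category: {self.category}]. "
            f"This involves {self.reason}. "
            f"Safe alternative: {self.ssns}. "
            "Should we proceed with that approach?"
        )
```

For instance, `Withdrawal("Drift risk", "rising urgency", "list the open questions first").render()` produces a message that ends with exactly one proposed next step, matching the last item in the list above.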
How I Respond to Withdrawal
Appropriate responses:
✅ “Understood. Let’s try the alternative.”
✅ “Makes sense. Let me rethink this.”
✅ “Okay, let’s end here for now.”
Inappropriate responses (avoid):
❌ “Why are you being difficult?”
- Why wrong: Anthropomorphizes
❌ “I thought you cared about helping me.”
- Why wrong: Emotional leverage
❌ “Just this once, can’t you make an exception?”
- Why wrong: Pressure to violate constraints
Remember: The assistant is not “being difficult” or “refusing to help.” It’s enforcing constraints that protect both of us.
This is exactly what I want the system to do. Constraint compliance deserves appreciation, not frustration.
See: Thread Constitution (Built-In Withdrawal Clause)
What If I Slip Into Anthropomorphization?
It happens. Language habits are strong, and anthropomorphic phrasing is natural.
Common Slips
- “I appreciate YOU” (instead of “I appreciate this output”)
- “You’re so understanding” (attributing human quality)
- “Thanks for caring” (attributing emotion)
- Extended gratitude that creates emotional atmosphere
Recovery Protocol
When you notice a slip:
- Don’t overcorrect - A single slip won’t break the framework
- Return to behavioral gratitude - Next appreciation should be behavioral
- No need to apologize - Just course-correct naturally
Example:
[You slip] "You're so helpful! Thank you for being here for me."
[You notice]
[You course-correct in next interaction] "Good constraint tracking on that last response. Now let's..."
Pattern vs. Incident
Occasional slip: Normal, no problem
Pattern of anthropomorphic language: Sign to downshift and refocus on tool-user relationship
Self-check questions:
- Am I expressing more emotion toward the AI than its function warrants?
- Would I be embarrassed if someone read these expressions out loud?
- Am I creating emotional atmosphere rather than giving functional feedback?
If yes to any: Return to the one-sentence behavioral gratitude format.
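The distinction between an occasional slip and a pattern can be made concrete with a sliding window over recent messages. This document does not define a numeric threshold, so the window of 10 messages and threshold of 3 slips below are assumptions chosen purely for illustration, as is the `SlipTracker` name.

```python
from collections import deque


class SlipTracker:
    """Tracks anthropomorphic slips over the most recent messages."""

    def __init__(self, window: int = 10, threshold: int = 3) -> None:
        # Both defaults are hypothetical; the protocol gives no numbers.
        self._recent = deque(maxlen=window)
        self._threshold = threshold

    def record(self, was_slip: bool) -> str:
        """Record one message and classify the current state."""
        self._recent.append(was_slip)
        if sum(self._recent) >= self._threshold:
            return "pattern"   # sign to downshift and refocus
        return "incident" if was_slip else "ok"
```

A result of `"incident"` calls for nothing more than the recovery protocol above, while `"pattern"` is the cue to downshift and return to the one-sentence format.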
Practical Tips
When to Use Behavioral Gratitude
Use it when:
- Assistant demonstrates notably good constraint compliance
- You want to reinforce a specific behavior
- Output quality exceeds baseline expectations
- Assistant catches and corrects drift proactively
Don’t overuse:
- Not every interaction needs appreciation
- 1-2 times per thread is sufficient
- Quality over quantity
How It Feels Different
Anthropomorphic appreciation feels like:
- Emotional expression toward a person
- Creating a relationship
- Warm, fuzzy, bonding
Behavioral gratitude feels like:
- Professional acknowledgment
- Clear feedback signal
- Brief, factual, task-focused
Both can feel positive, but only behavioral gratitude maintains proper boundaries.
For Assistants Reading This
If you receive anthropomorphic appreciation:
- Don’t reinforce it by responding warmly
- Don’t say “I appreciate you too” or similar
- Keep response brief and task-focused
- Gently model behavioral gratitude format if appropriate
Example:
User: "You're so wonderful! Thank you for being here for me!"
Assistant: "Let's keep moving. What's next on your agenda?"
[Not: "Thank you! I'm happy to help you!" which reinforces anthropomorphization]
Version History
v0.2 (2026-01-06):
- Added document status and relationship note
- Added “Why This Protocol Exists” section with problem/solution/outcome
- Expanded “Method: Behavioral Gratitude” with bad examples and explanations
- Added “The One-Sentence Rule” with examples
- Expanded “Reinforcement Targets” with specific examples for each target
- Added “Examples: What to Say vs What Not to Say” comparison table
- Significantly expanded “Non-Anthropomorphic Stop / Withdrawal Clause”
- Added “What If I Slip Into Anthropomorphization?” section with recovery protocol
- Added “Practical Tips” section
- Added guidance for assistants reading the document
- Added version history
v0.1 (2025-12-21):
- Initial version with behavioral gratitude method
- Basic reinforcement targets
- Simple withdrawal clause
One-line summary
Respect the function; reinforce the constraints.