feat(project): deprecate WHALE.md; add .codewhale/constitution.json authority layer

Splits repo-level guidance into two clear artifacts and deprecates the
confusing WHALE.md concept, which overlapped with AGENTS.md:

- AGENTS.md is the canonical cross-agent project-instructions file.
- .codewhale/constitution.json is the CodeWhale-specific repo authority /
  prioritization policy (when local sources conflict, which to trust first; what
  to verify before claiming done). Rendered into the system prompt as a
  higher-authority <codewhale_repo_constitution> block; takes precedence over a
  legacy WHALE.md.

WHALE.md migration (compat-preserving):
- AGENTS.md now ranks above WHALE.md in both project and global discovery; with
  both present, AGENTS.md wins.
- WHALE.md is still read as a legacy fallback, but now emits a deprecation
  warning and is never created or recommended (init.rs no longer suggests it).
- Discovery/docs updated; the global CodeWhale Constitution in prompts/base.md
  is unaffected (different thing).
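
As a rough illustration of the precedence described above (not the code in this commit), a minimal Rust sketch might look like the following. `pick_instructions` and the two-entry `CANDIDATES` list are hypothetical; the real discovery list also contains the compat files, and the warning text here is shortened.

```rust
use std::path::{Path, PathBuf};

/// Discovery order assumed from the text above: AGENTS.md first, WHALE.md as
/// a deprecated fallback. (Assumption: the compat files are omitted here.)
const CANDIDATES: [&str; 2] = ["AGENTS.md", "WHALE.md"];

/// Hypothetical helper: pick the first candidate that exists in `dir` and
/// collect a warning only when the deprecated file is the one that won.
fn pick_instructions(dir: &Path) -> (Option<PathBuf>, Vec<String>) {
    let mut warnings = Vec::new();
    for name in CANDIDATES {
        let path = dir.join(name);
        if path.is_file() {
            if name == "WHALE.md" {
                warnings.push(
                    "WHALE.md is deprecated; move instructions to AGENTS.md".to_string(),
                );
            }
            return (Some(path), warnings);
        }
    }
    (None, warnings)
}
```

With both files present the sketch returns `AGENTS.md` and no warning; with only `WHALE.md` present it returns that file plus one warning, which mirrors the two precedence tests listed under Tests.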

constitution.json:
- New RepoConstitution (serde, all fields optional, unknown fields ignored,
  schema_version checked). Discovered at .codewhale/constitution.json in the
  workspace or any parent up to the git root. Malformed JSON warns, never panics.
- Loaded after the auto-generate fallback so it can't be clobbered.
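
The discovery rule above could be sketched roughly as follows. This is a simplified, std-only illustration rather than the loader from this commit: `find_constitution` is a hypothetical name, and parsing, schema checks, and warnings are left out.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: walk upward from `start` and return the first
/// `.codewhale/constitution.json` found. The directory that contains `.git`
/// is the last one checked; without a git root the walk ends at the
/// filesystem root.
fn find_constitution(start: &Path) -> Option<PathBuf> {
    let mut current = start.to_path_buf();
    loop {
        let candidate = current.join(".codewhale").join("constitution.json");
        if candidate.is_file() {
            return Some(candidate);
        }
        // Do not look above the git root.
        if current.join(".git").exists() {
            return None;
        }
        match current.parent() {
            Some(parent) => current = parent.to_path_buf(),
            None => return None,
        }
    }
}
```

Called with a nested crate directory, the sketch finds a constitution stored at the repository root but ignores one that sits above the directory holding `.git`.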

.gitignore: ignore .codewhale/ contents at any depth EXCEPT the committed
constitution.json (a directory exclude can't be negated, so **/.codewhale/* +
negation). init.rs writes the same pattern for new repos. Dogfood: this repo's
.codewhale/constitution.json added.
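
One way to picture the idempotent .gitignore update is the sketch below. It is an assumption-laden illustration, not the init.rs code: `ensure_entries` is a hypothetical helper, and only the two `.codewhale` patterns from this commit are used.

```rust
/// Patterns taken from the commit text. Order matters: the negation must
/// come after the wildcard it overrides.
const ENTRIES: [&str; 2] = ["**/.codewhale/*", "!**/.codewhale/constitution.json"];

/// Hypothetical helper: return `existing` with any missing entries appended,
/// leaving lines that are already present untouched.
fn ensure_entries(existing: &str, entries: &[&str]) -> String {
    let mut out = existing.to_string();
    for entry in entries {
        let present = existing.lines().any(|line| line.trim() == *entry);
        if !present {
            // Make sure the appended entry starts on its own line.
            if !out.is_empty() && !out.ends_with('\n') {
                out.push('\n');
            }
            out.push_str(entry);
            out.push('\n');
        }
    }
    out
}
```

Running the helper twice yields the same text as running it once. Note that git does not re-include a file whose parent directory is excluded, so a leftover `.codewhale/` line in an existing .gitignore would still hide constitution.json; the sketch does not handle that case.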

find_git_root made pub(crate) and reused (no duplicate loader).

Tests: AGENTS-over-WHALE precedence, WHALE legacy-read-with-warning,
constitution render + system-block surfacing, malformed-constitution warning,
gitignore-keeps-constitution. cargo test -p codewhale-tui --bins → 3946 passed;
clippy clean.

Targets codex/v0.8.53.
Hunter Bown
2026-06-03 12:12:34 -07:00
parent 03d1bba538
commit 9d9616e898
6 changed files with 376 additions and 37 deletions
+18
@@ -0,0 +1,18 @@
{
"schema_version": 1,
"authority": [
"current user request",
"live code and tests",
"GitHub issue/PR details",
"AGENTS.md and project CLAUDE.md",
"memory",
"previous-session handoffs"
],
"verification_policy": {
"before_claiming_done": [
"run the focused tests for the changed crate (cargo test -p <crate>), then cargo check/clippy as appropriate",
"read changed files back to confirm the edit landed as intended",
"never claim verification you did not perform"
]
}
}
+4 -1
@@ -67,7 +67,10 @@ project_overhaul_prompt.md
.wrangler/
# Local runtime state
.codewhale/
# Ignore everything under any .codewhale/ (snapshots, auto-generated
# instructions.md, etc.) at any depth EXCEPT the committed repo authority policy.
**/.codewhale/*
!**/.codewhale/constitution.json
.deepseek/
**/session_*.json
*.db
+16 -5
@@ -35,9 +35,12 @@ pub fn init(app: &mut App) -> CommandResult {
}
}
/// If `workspace` is inside a git repository, ensure `.codewhale/` and
/// `.deepseek/` are listed in the nearest `.gitignore` so that snapshots,
/// instructions, and other workspace-local state are not accidentally committed.
/// If `workspace` is inside a git repository, ensure workspace-local CodeWhale
/// state is listed in the nearest `.gitignore` so snapshots, auto-generated
/// instructions, and other runtime state are not accidentally committed — while
/// keeping the authored `.codewhale/constitution.json` repo authority policy
/// committable (a file inside an excluded directory cannot be re-included, so
/// `**/.codewhale/*` plus a negation is required).
fn ensure_deepseek_gitignored(workspace: &Path) {
// Only act if this workspace is a git repo.
if !workspace.join(".git").exists() {
@@ -45,7 +48,11 @@ fn ensure_deepseek_gitignored(workspace: &Path) {
}
let gitignore = workspace.join(".gitignore");
let entries = [".codewhale/", ".deepseek/"];
let entries = [
"**/.codewhale/*",
"!**/.codewhale/constitution.json",
".deepseek/",
];
// Read existing contents once.
let existing = std::fs::read_to_string(&gitignore).unwrap_or_default();
@@ -109,7 +116,7 @@ fn generate_project_doc(workspace: &Path) -> String {
doc.push_str("<!-- file patterns to avoid, and anything that helps a model navigate -->\n");
doc.push_str("<!-- the codebase without reading every file. -->\n");
doc.push('\n');
doc.push_str("- **CodeWhale reads this file as:** <!-- WHALE.md (CodeWhale-native) or AGENTS.md (compatible with other agents) -->\n");
doc.push_str("- **CodeWhale reads this file as:** AGENTS.md (canonical cross-agent project instructions). <!-- WHALE.md is deprecated; put CodeWhale-specific authority policy in .codewhale/constitution.json -->\n");
doc.push_str(
"- **Read-only surface:** <!-- Which directories can the agent read but not write? -->\n",
);
@@ -394,6 +401,10 @@ version = "1.0.0"
let content = std::fs::read_to_string(tmpdir.path().join(".gitignore")).unwrap();
assert!(content.contains(".deepseek/"));
// .codewhale/ is ignored at any depth, but the committed
// constitution.json is kept.
assert!(content.contains("**/.codewhale/*"));
assert!(content.contains("!**/.codewhale/constitution.json"));
}
#[test]
+288 -25
@@ -3,36 +3,55 @@
//! This module handles loading project-specific context files that provide
//! instructions and context to the AI agent. These include:
//!
//! - `WHALE.md` - CodeWhale-native project instructions (highest priority)
//! - `AGENTS.md` - Generic agent instructions (compatible with other agents)
//! - `.claude/instructions.md` - Claude-style hidden instructions
//! - `CLAUDE.md` - Claude-style instructions
//! - `.codewhale/instructions.md` - Hidden instructions file (new)
//! - `AGENTS.md` - Cross-agent project instructions (canonical, highest priority)
//! - `WHALE.md` - **Deprecated** legacy CodeWhale-native instructions (read-only fallback)
//! - `.claude/instructions.md` - Claude-style hidden instructions (compat)
//! - `CLAUDE.md` - Claude-style instructions (compat)
//! - `.codewhale/instructions.md` - Hidden instructions file (compat)
//! - `.deepseek/instructions.md` - Hidden instructions file (legacy)
//!
//! The loaded content is injected into the system prompt to give the agent
//! context about the project's conventions, structure, and requirements.
//! CodeWhale-specific repo authority/prioritization policy lives separately in
//! `.codewhale/constitution.json` and is rendered as its own higher-authority
//! block. The loaded content is injected into the system prompt to give the
//! agent context about the project's conventions, structure, and requirements.
use std::collections::{BTreeMap, VecDeque};
use std::fs;
use std::path::{Path, PathBuf};
use serde::Serialize;
use serde::{Deserialize, Serialize};
use thiserror::Error;
/// Names of project context files to look for, in priority order.
/// WHALE.md is the CodeWhale-native convention; AGENTS.md and CLAUDE.md
/// provide compatibility with other coding agents. `.codewhale/` is the
/// new config directory; `.deepseek/` is the legacy fallback.
///
/// `AGENTS.md` is the canonical cross-agent project-instructions file.
/// `WHALE.md` is **deprecated** (kept only as a read-only legacy fallback, now
/// below `AGENTS.md`) — CodeWhale-specific repo authority now lives in
/// `.codewhale/constitution.json`, not a bespoke markdown file. `CLAUDE.md` and
/// the `*/instructions.md` variants are read-only compatibility fallbacks;
/// CodeWhale never creates or recommends them.
const PROJECT_CONTEXT_FILES: &[&str] = &[
"WHALE.md",
"AGENTS.md",
"WHALE.md", // deprecated: legacy CodeWhale-native, read-only fallback (#WHALE.md deprecation)
".claude/instructions.md",
"CLAUDE.md",
".codewhale/instructions.md",
".deepseek/instructions.md",
];
/// File name of the deprecated CodeWhale-native instructions file.
const DEPRECATED_WHALE_FILENAME: &str = "WHALE.md";
/// Warning surfaced when a `WHALE.md` is still the active instruction source.
const WHALE_DEPRECATION_WARNING: &str = "WHALE.md is deprecated; move project instructions to AGENTS.md, or CodeWhale-specific authority policy to .codewhale/constitution.json. WHALE.md is still read for now but will be dropped from default discovery in a future release.";
/// Relative path (within a workspace or one of its parents) to the
/// CodeWhale-specific repo authority/prioritization policy.
const REPO_CONSTITUTION_RELATIVE_PATH: &[&str] = &[".codewhale", "constitution.json"];
/// `schema_version` understood by this build of the constitution loader.
const SUPPORTED_CONSTITUTION_SCHEMA: u32 = 1;
/// User-level project instructions loaded as a fallback when the workspace and
/// its parents do not define project context. `.codewhale/` takes priority
/// over vendor-neutral `.agents/`, which takes priority over legacy
@@ -107,6 +126,10 @@ pub struct ProjectContext {
pub source_path: Option<PathBuf>,
/// Any warnings during loading
pub warnings: Vec<String>,
/// Rendered `.codewhale/constitution.json` authority block, if present.
/// CodeWhale-specific repo authority/prioritization policy — distinct from
/// the cross-agent prose in `instructions`.
pub constitution_block: Option<String>,
/// Project root directory
#[allow(dead_code)] // Part of ProjectContext public interface
pub project_root: PathBuf,
@@ -121,6 +144,7 @@ impl ProjectContext {
instructions: None,
source_path: None,
warnings: Vec::new(),
constitution_block: None,
project_root,
is_trusted: false,
}
@@ -131,9 +155,13 @@ impl ProjectContext {
self.instructions.is_some()
}
/// Get the instructions as a formatted block for system prompt
/// Get the instructions as a formatted block for system prompt.
///
/// The CodeWhale repo constitution (`.codewhale/constitution.json`), when
/// present, is emitted first as a higher-authority block, followed by the
/// cross-agent `<project_instructions>` prose. Either may be absent.
pub fn as_system_block(&self) -> Option<String> {
self.instructions.as_ref().map(|content| {
let instructions_block = self.instructions.as_ref().map(|content| {
let source = self
.source_path
.as_ref()
@@ -142,8 +170,138 @@ impl ProjectContext {
format!(
"<project_instructions source=\"{source}\">\n{content}\n</project_instructions>"
)
})
});
match (self.constitution_block.as_ref(), instructions_block) {
(Some(constitution), Some(instructions)) => {
Some(format!("{constitution}\n\n{instructions}"))
}
(Some(constitution), None) => Some(constitution.clone()),
(None, Some(instructions)) => Some(instructions),
(None, None) => None,
}
}
}
/// CodeWhale-specific repo authority/prioritization policy, loaded from
/// `.codewhale/constitution.json`. All fields are optional so a minimal file
/// (or a future schema) still parses; unknown fields are ignored.
#[derive(Debug, Clone, Default, Deserialize)]
struct RepoConstitution {
#[serde(default)]
schema_version: Option<u32>,
/// Ordered list of sources to trust when local sources conflict
/// (highest authority first).
#[serde(default)]
authority: Option<Vec<String>>,
#[serde(default)]
verification_policy: Option<VerificationPolicy>,
}
#[derive(Debug, Clone, Default, Deserialize)]
struct VerificationPolicy {
/// Steps to perform before claiming a task is done.
#[serde(default)]
before_claiming_done: Option<Vec<String>>,
}
impl RepoConstitution {
/// True when the file carried no usable policy (so we can skip emitting an
/// empty block).
fn is_empty(&self) -> bool {
self.authority.as_ref().is_none_or(Vec::is_empty)
&& self
.verification_policy
.as_ref()
.and_then(|p| p.before_claiming_done.as_ref())
.is_none_or(Vec::is_empty)
}
/// Render a model-facing authority block.
fn render_block(&self, source: &Path) -> String {
let mut body = String::new();
if let Some(authority) = self.authority.as_ref().filter(|a| !a.is_empty()) {
body.push_str(
"When local sources conflict, trust them in this order (highest first):\n",
);
for (idx, item) in authority.iter().enumerate() {
body.push_str(&format!("{}. {item}\n", idx + 1));
}
}
if let Some(steps) = self
.verification_policy
.as_ref()
.and_then(|p| p.before_claiming_done.as_ref())
.filter(|s| !s.is_empty())
{
body.push_str("\nBefore claiming a task is done:\n");
for step in steps {
body.push_str(&format!("- {step}\n"));
}
}
format!(
"<codewhale_repo_constitution source=\"{}\">\nCodeWhale-specific repo authority policy (takes precedence over a legacy WHALE.md).\n\n{}\n</codewhale_repo_constitution>",
source.display(),
body.trim_end()
)
}
}
/// Discover and render `.codewhale/constitution.json` from `workspace` or, if
/// absent, its parent directories up to the git root. Returns the rendered
/// authority block plus any parse warnings.
fn load_repo_constitution_block(workspace: &Path) -> (Option<String>, Vec<String>) {
let mut warnings = Vec::new();
let git_root = crate::project_doc::find_git_root(workspace);
let mut current = workspace.to_path_buf();
loop {
let mut path = current.clone();
for component in REPO_CONSTITUTION_RELATIVE_PATH {
path.push(component);
}
if path.is_file() {
match fs::read_to_string(&path) {
Ok(raw) => match serde_json::from_str::<RepoConstitution>(&raw) {
Ok(constitution) if !constitution.is_empty() => {
if let Some(version) = constitution.schema_version
&& version != SUPPORTED_CONSTITUTION_SCHEMA
{
warnings.push(format!(
"{} declares schema_version {version}; this build supports {SUPPORTED_CONSTITUTION_SCHEMA}. Reading it on a best-effort basis.",
path.display()
));
}
return (Some(constitution.render_block(&path)), warnings);
}
Ok(_) => {
warnings.push(format!(
"{} has no authority/verification policy; ignoring.",
path.display()
));
return (None, warnings);
}
Err(e) => {
warnings.push(format!("Failed to parse {}: {e}", path.display()));
return (None, warnings);
}
},
Err(e) => {
warnings.push(format!("Failed to read {}: {e}", path.display()));
return (None, warnings);
}
}
}
if let Some(ref root) = git_root
&& current == *root
{
break;
}
match current.parent() {
Some(parent) if parent != current => current = parent.to_path_buf(),
_ => break,
}
}
(None, warnings)
}
#[derive(Debug, Serialize)]
@@ -433,6 +591,10 @@ pub fn load_project_context(workspace: &Path) -> ProjectContext {
file_path.display(),
content.len()
);
if *filename == DEPRECATED_WHALE_FILENAME {
tracing::warn!("{WHALE_DEPRECATION_WARNING}");
ctx.warnings.push(WHALE_DEPRECATION_WARNING.to_string());
}
ctx.instructions = Some(content);
ctx.source_path = Some(file_path);
break;
@@ -527,6 +689,16 @@ fn load_project_context_with_parents_and_home(
}
}
// Load the CodeWhale-specific repo authority policy
// (.codewhale/constitution.json) independently of the prose instructions —
// it is a distinct, higher-authority artifact and may exist with or without
// an AGENTS.md. When present it takes precedence over a legacy WHALE.md.
// Loaded last so the auto-generate fallback above (which rebuilds `ctx`)
// cannot clobber it.
let (constitution_block, constitution_warnings) = load_repo_constitution_block(workspace);
ctx.warnings.extend(constitution_warnings);
ctx.constitution_block = constitution_block;
ctx
}
@@ -553,20 +725,20 @@ fn merge_global_and_project_instructions(
fn load_global_agents_context(workspace: &Path, home_dir: Option<&Path>) -> Option<ProjectContext> {
let home = home_dir?;
// Priority order:
// 1. ~/.codewhale/WHALE.md (CodeWhale-native)
// 2. ~/.codewhale/AGENTS.md (new config directory)
// 3. ~/.agents/WHALE.md (vendor-neutral fallback)
// 4. ~/.agents/AGENTS.md (vendor-neutral fallback)
// 5. ~/.deepseek/WHALE.md (legacy fallback)
// 6. ~/.deepseek/AGENTS.md (legacy fallback)
// Priority order (AGENTS.md preferred over the now-deprecated WHALE.md):
// 1. ~/.codewhale/AGENTS.md (canonical)
// 2. ~/.codewhale/WHALE.md (deprecated, legacy fallback)
// 3. ~/.agents/AGENTS.md (vendor-neutral fallback)
// 4. ~/.agents/WHALE.md (deprecated, vendor-neutral legacy)
// 5. ~/.deepseek/AGENTS.md (legacy fallback)
// 6. ~/.deepseek/WHALE.md (deprecated, legacy)
let candidates: &[&[&str]] = &[
GLOBAL_WHALE_RELATIVE_PATH,
GLOBAL_AGENTS_RELATIVE_PATH,
GLOBAL_WHALE_VENDOR_NEUTRAL_PATH,
GLOBAL_WHALE_RELATIVE_PATH,
GLOBAL_AGENTS_VENDOR_NEUTRAL_PATH,
GLOBAL_WHALE_LEGACY_PATH,
GLOBAL_WHALE_VENDOR_NEUTRAL_PATH,
GLOBAL_AGENTS_LEGACY_PATH,
GLOBAL_WHALE_LEGACY_PATH,
];
let mut warnings = Vec::new();
@@ -580,6 +752,10 @@ fn load_global_agents_context(workspace: &Path, home_dir: Option<&Path>) -> Opti
if path.exists() && path.is_file() {
match load_context_file(&path) {
Ok(content) => {
if path.file_name().and_then(|n| n.to_str()) == Some(DEPRECATED_WHALE_FILENAME) {
tracing::warn!("{WHALE_DEPRECATION_WARNING}");
warnings.push(WHALE_DEPRECATION_WARNING.to_string());
}
let mut ctx = ProjectContext::empty(workspace.to_path_buf());
ctx.instructions = Some(content);
ctx.source_path = Some(path);
@@ -961,6 +1137,93 @@ mod tests {
);
}
#[test]
fn agents_md_preferred_over_deprecated_whale_md() {
let tmp = tempdir().expect("tempdir");
fs::write(tmp.path().join("AGENTS.md"), "AGENTS canonical").expect("write agents");
fs::write(tmp.path().join("WHALE.md"), "WHALE legacy").expect("write whale");
let ctx = load_project_context(tmp.path());
let instructions = ctx.instructions.expect("instructions loaded");
assert!(instructions.contains("AGENTS canonical"), "{instructions}");
assert!(!instructions.contains("WHALE legacy"), "{instructions}");
// No deprecation warning since AGENTS.md won.
assert!(
!ctx.warnings.iter().any(|w| w.contains("WHALE.md is deprecated")),
"{:?}",
ctx.warnings
);
}
#[test]
fn whale_md_alone_is_still_read_with_deprecation_warning() {
let tmp = tempdir().expect("tempdir");
fs::write(tmp.path().join("WHALE.md"), "WHALE legacy body").expect("write whale");
let ctx = load_project_context(tmp.path());
assert!(
ctx.instructions.as_deref() == Some("WHALE legacy body"),
"legacy WHALE.md must still be read"
);
assert!(
ctx.warnings.iter().any(|w| w.contains("WHALE.md is deprecated")),
"expected deprecation warning, got {:?}",
ctx.warnings
);
}
#[test]
fn constitution_json_renders_authority_block() {
let tmp = tempdir().expect("tempdir");
fs::create_dir(tmp.path().join(".git")).expect("mkdir .git");
fs::create_dir(tmp.path().join(".codewhale")).expect("mkdir .codewhale");
fs::write(
tmp.path().join(".codewhale").join("constitution.json"),
r#"{
"schema_version": 1,
"authority": ["current user request", "live code and tests", "AGENTS.md"],
"verification_policy": { "before_claiming_done": ["run focused tests"] }
}"#,
)
.expect("write constitution");
let ctx = load_project_context_with_parents(tmp.path());
let block = ctx
.constitution_block
.as_deref()
.expect("constitution block rendered");
assert!(block.contains("<codewhale_repo_constitution"));
assert!(block.contains("current user request"));
assert!(block.contains("run focused tests"));
assert!(block.contains("takes precedence over a legacy WHALE.md"));
// It also surfaces through the system block.
assert!(
ctx.as_system_block()
.expect("system block")
.contains("codewhale_repo_constitution")
);
}
#[test]
fn malformed_constitution_warns_without_crashing() {
let tmp = tempdir().expect("tempdir");
fs::create_dir(tmp.path().join(".git")).expect("mkdir .git");
fs::create_dir(tmp.path().join(".codewhale")).expect("mkdir .codewhale");
fs::write(
tmp.path().join(".codewhale").join("constitution.json"),
"{ not valid json",
)
.expect("write bad constitution");
let ctx = load_project_context_with_parents(tmp.path());
assert!(ctx.constitution_block.is_none(), "no block for invalid JSON");
assert!(
ctx.warnings.iter().any(|w| w.contains("Failed to parse")),
"expected parse warning, got {:?}",
ctx.warnings
);
}
#[test]
fn project_context_pack_is_stable_and_sorted() {
let tmp = tempdir().expect("tempdir");
+8 -6
@@ -1,16 +1,18 @@
//! Project document discovery and loading
//!
//! Supports auto-discovery of project instructions like Claude Code.
//! Priority: WHALE.md > AGENTS.md > .claude/instructions.md > CLAUDE.md > .codewhale/instructions.md > .deepseek/instructions.md
//! Priority: AGENTS.md > WHALE.md (deprecated) > .claude/instructions.md > CLAUDE.md > .codewhale/instructions.md > .deepseek/instructions.md
use std::path::{Path, PathBuf};
/// Document filenames to search for (in priority order)
/// WHALE.md is the CodeWhale-native convention; AGENTS.md and CLAUDE.md
/// provide compatibility; `.codewhale/` is the new config directory.
/// Document filenames to search for (in priority order).
/// `AGENTS.md` is canonical. `WHALE.md` is **deprecated** (read-only legacy
/// fallback, now below `AGENTS.md`); CodeWhale-specific authority policy lives
/// in `.codewhale/constitution.json`. `CLAUDE.md` and the `*/instructions.md`
/// variants are read-only compatibility fallbacks.
pub const DOC_FILENAMES: &[&str] = &[
"WHALE.md",
"AGENTS.md",
"WHALE.md", // deprecated: legacy CodeWhale-native, read-only fallback
".claude/instructions.md",
"CLAUDE.md",
".codewhale/instructions.md",
@@ -64,7 +66,7 @@ pub fn discover_paths(cwd: &Path) -> Vec<PathBuf> {
}
/// Find the git root directory from cwd
fn find_git_root(cwd: &Path) -> Option<PathBuf> {
pub(crate) fn find_git_root(cwd: &Path) -> Option<PathBuf> {
let mut current = cwd.to_path_buf();
loop {
if current.join(".git").exists() {
+42
@@ -5,6 +5,48 @@ At process startup it also loads a workspace-local `.env` file when present.
Use the tracked `.env.example` as the template; copy it to `.env`, then edit
only the provider and safety knobs you need.
## Project instructions & repo authority
Each repo can carry two distinct, complementary files:
- **`AGENTS.md`** — cross-agent **project instructions** (prose). This is the
canonical file for "how should an agent work in this repo." Run `/init` to
scaffold one. `CLAUDE.md` and `.claude/instructions.md` are read as
compatibility fallbacks.
- **`.codewhale/constitution.json`** — CodeWhale-specific **repo authority /
prioritization policy**: which sources CodeWhale should trust first when local
sources conflict, and what it must verify before claiming a task is done.
`.codewhale/` lives inside the repo (like `.github/`). Example:
```json
{
"schema_version": 1,
"authority": [
"current user request",
"live code and tests",
"GitHub issue/PR details",
"AGENTS.md",
"memory",
"old handoffs"
],
"verification_policy": {
"before_claiming_done": ["run focused tests", "read changed files back"]
}
}
```
When present, it is rendered into the system prompt as a higher-authority
block and takes precedence over a legacy `WHALE.md`.
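
To make the rendering step concrete, here is a minimal, dependency-free Rust sketch of how such a block could be assembled. The `Constitution` struct and this `render_block` function are hypothetical stand-ins (the real loader parses the JSON with serde and may format the block differently); only the tag name and the two section headings are taken from this change.

```rust
/// Hypothetical, already-parsed form of `.codewhale/constitution.json`.
struct Constitution {
    authority: Vec<String>,
    before_claiming_done: Vec<String>,
}

/// Render a model-facing authority block; empty sections are skipped.
fn render_block(c: &Constitution, source: &str) -> String {
    let mut body = String::new();
    if !c.authority.is_empty() {
        body.push_str(
            "When local sources conflict, trust them in this order (highest first):\n",
        );
        // Number the sources so the ordering is explicit to the model.
        for (idx, item) in c.authority.iter().enumerate() {
            body.push_str(&format!("{}. {item}\n", idx + 1));
        }
    }
    if !c.before_claiming_done.is_empty() {
        body.push_str("\nBefore claiming a task is done:\n");
        for step in &c.before_claiming_done {
            body.push_str(&format!("- {step}\n"));
        }
    }
    format!(
        "<codewhale_repo_constitution source=\"{source}\">\n{}\n</codewhale_repo_constitution>",
        body.trim()
    )
}
```

Given the example file above, the sketch would emit a numbered trust order followed by a bulleted verification checklist, wrapped in a single `<codewhale_repo_constitution>` element.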
> **`WHALE.md` is deprecated.** It overlapped confusingly with `AGENTS.md`.
> CodeWhale still **reads** an existing `WHALE.md` (below `AGENTS.md`) so old
> repos keep working, and emits a deprecation notice, but it is no longer
> created or recommended and will be dropped from default discovery after a
> deprecation window. Move ordinary instructions to `AGENTS.md` and
> CodeWhale-specific authority policy to `.codewhale/constitution.json`. (The
> global CodeWhale Constitution shipped in the model prompt is a separate thing
> and is unaffected.)
## Where It Looks
Default config path: