Files
codewhale/CONTRIBUTING.md
T
Hunter B 002f8f0ba1 ci: enforce mappable co-author credit
Add AUTHOR_MAP plus a lightweight co-author trailer checker so harvested commits use numeric GitHub noreply identities, reject bot/tool trailers, and require machine-readable credit when a commit says it was harvested from a PR.

Also normalize the local unpushed v0.9 harvest range so existing contributor authors/trailers for HUQIANTAO, Implementist, jrcjrcc, xyuai, cyq1017, idling11, and shenjackyuanjie use GitHub-mappable identities before the branch is published.

Validation: python3 scripts/check-coauthor-trailers.py --author-map .github/AUTHOR_MAP --range origin/main..HEAD --check-authors; python3 -m py_compile scripts/check-coauthor-trailers.py; ruby -e 'require "yaml"; YAML.load_file(".github/workflows/ci.yml")'; git diff --check; negative in-process validation for raw email, missing harvested credit, and bot author cases.
2026-06-03 21:07:33 -07:00

357 lines
14 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Contributing to codewhale
Thank you for your interest in contributing to codewhale! This document provides guidelines and instructions for contributing.
## Getting Started
### Prerequisites
- Rust 1.88 or later (edition 2024)
- Cargo package manager
- Git
### Setting Up Development Environment
1. Fork and clone the repository:
```bash
git clone https://github.com/YOUR_USERNAME/CodeWhale.git
cd CodeWhale
```
2. Build the project:
```bash
cargo build
```
3. Run tests:
```bash
cargo test --workspace --all-features
```
4. Run with development settings:
```bash
cargo run --bin codewhale
```
## Development Workflow
### Code Style
- Run `cargo fmt` before committing to ensure consistent formatting
- Run `cargo clippy` and address all warnings
- Follow Rust naming conventions (snake_case for functions/variables, CamelCase for types)
- Add documentation comments for public APIs
### Testing
- Write tests for new functionality
- Ensure all existing tests pass: `cargo test --workspace --all-features`
- Colocate unit tests beside the code they cover (standard Rust `#[cfg(test)]`
modules), and add integration tests under the owning crate's `tests/`
directory (for example `crates/tui/tests/` or `crates/state/tests/`). The
repository root `tests/` directory is not used
### Commit Messages
Use clear, descriptive commit messages following conventional commits:
- `feat:` New feature
- `fix:` Bug fix
- `docs:` Documentation changes
- `refactor:` Code refactoring
- `test:` Adding or updating tests
- `chore:` Maintenance tasks
Example: `feat: add doctor subcommand for system diagnostics`
When a commit harvests code from a community PR (see "How Your Contribution
Lands" below), include a `Harvested from PR #N by @author` line in the commit
body. An auto-close workflow watches for this pattern and closes the
referenced PR with credit so the contributor gets a clear signal that
their work shipped.
## How Your Contribution Lands
We follow a deliberate "land what's useful, credit the contributor" model
that occasionally surprises new contributors. Two paths:
### Path 1 — Direct merge
If your PR is well-scoped, passes CI, doesn't touch the trust-boundary
surface (auth / sandbox / publishing / branding), and doesn't conflict
with main, a maintainer merges it directly. This is the most common
outcome for small bug fixes and well-tested feature additions.
### Path 2 — Harvest
If your PR is large, mixes scope, conflicts with main, or needs polish
that's faster for the maintainer to apply than to round-trip with the
contributor, the maintainer may **harvest** the useful commits or hunks
into a new commit on `main` rather than merging the PR directly. This is
**not a rejection** — it means your code landed.
When this happens:
- The harvested commit's message includes `Harvested from PR #N by
@your-handle`. This is the contract: that line is your credit and the
signal that your contribution shipped.
- If the maintainer copies or adapts your code, the harvested commit also
keeps attribution with the original author identity when possible: either by
preserving the commit author on a cherry-pick or by adding a
`Co-authored-by: Name <id+login@users.noreply.github.com>` trailer. This is
what lets GitHub's contribution surfaces recognize more than prose credit.
Maintainers should use `.github/AUTHOR_MAP`, or run
`gh api users/<login> --jq '"\(.id)+\(.login)@users.noreply.github.com"'`,
rather than copying raw, `.local`, or old-style noreply emails from a
contributor's machine.
- The `CHANGELOG.md` entry for the next release credits you by handle.
- The auto-close workflow closes your PR with a templated thank-you and
a link to the commit on `main`.
To make a future contribution land via the faster Direct-Merge path
instead of the Harvest path, the highest-leverage things you can do are:
1. **Keep PRs single-purpose.** One bug fix per PR; one feature per PR.
Don't mix a refactor with a feature.
2. **Rebase onto current `main` before opening the PR**, and after CI
feedback. Conflicts force the harvest path even when the change is
small.
3. **Include tests** with new behavior. The maintainer often harvests
PRs without tests because adding the test is faster than asking the
contributor for one.
4. **Avoid the trust-boundary surface** without prior maintainer
sign-off. That includes auth/credential flows, sandbox policy,
publishing/release plumbing, and `prompts/` content. PRs that touch
these without prior discussion are unlikely to merge directly even
when the change is well-implemented.
## Layered and EPIC-Sized Work
Some architecture work is too large for one PR but still needs to be built in
dependent layers. For those changes, use this workflow:
1. Start with a tracking issue or EPIC when the work spans multiple PRs. Name
the intended slices and state what each slice is not trying to close yet.
2. Keep each implementation PR focused on one behavior boundary.
3. Later layers may stay in your fork or open as draft PRs while the lower
layer is still moving. Draft stacked PR titles or descriptions should say
`Draft / depends on #NNNN`.
4. A dependent PR is not ready for merge review until the lower layer has
landed, the branch has been rebased onto current `main`, and the PR targets
`main`.
5. The PR body should identify which earlier PR it builds on, what is in scope,
what is explicitly out of scope, which issues it references, and which local
commands were run.
6. Use `Closes #...` only when the slice fully satisfies an issue. Use
`Refs #...` with a short `(partial)` note when the PR advances a broad issue
but leaves follow-up work.
7. Structured commits are fine during review. Maintainers may squash or harvest
at merge time, with contributor credit preserved through authorship,
co-author trailers, changelog entries, or PR/issue comments.
Before asking for merge review on a layered PR, check that it is:
- rebased onto current `main`
- marked ready for review, not draft
- focused to one behavior boundary
- backed by local command evidence in the PR body
- green in CI, or has any remaining red lane clearly explained
- covered by round-trip or migration-preservation tests when it changes config
or schema behavior
- referencing broad issues as partial unless it really closes them
For layered work, a useful PR description shape is:
```text
Summary:
Scope:
Not in this slice:
Builds on:
Issues:
Validation:
```
## Contribution Gate
CodeWhale uses a maintainer-managed contribution gate for the community front
door. Maintainers and collaborators bypass this gate automatically. The gate
workflows default to dry-run / comment-only mode so maintainers can observe the
signal before changing contributor flow.
The maintainer posture is documented in
[docs/AGENT_ETHOS.md](docs/AGENT_ETHOS.md): automation should reduce load while
keeping good-faith contributors seen, credited, and able to keep helping.
Issues are never auto-closed by the contribution gate. Unapproved external
issues receive a short welcome note that asks for reproduction details and then
remain open for maintainer triage. CodeWhale depends on real edge cases from
real users, so issue intake should stay warm and open.
Pull requests are different because they can touch code, CI, release plumbing,
auth, sandboxing, provider policy, and other trust-boundary surfaces. The PR
gate can be switched from dry-run to enforcement when maintainers decide they
need that safety control, but it should be treated as a review-load control,
not a judgment on contributor quality. Before enabling PR enforcement, seed the
allowlist broadly enough for active external contributors who should not be
interrupted by the rollout.
The allowlist is scoped:
- `pr:username` allows pull requests.
- `issue:username` allows issues.
- `all:username` allows both.
A maintainer can approve someone by commenting `/lgtm` on a pull request for PR
access, or `/lgtmi` on an issue for issue access. The exact bare commands
`lgtm` and `lgtmi` are also accepted for compatibility, but the prefixed forms
are preferred because they are harder to trigger accidentally in ordinary review
discussion.
Approvals do not edit `main` directly. The approval workflow opens a small
allowlist update PR so the new entry is reviewable before it takes effect.
If the PR gate fires on a good contributor incorrectly, use the same approval
flow to restore them: comment `/lgtm`, merge the generated allowlist PR, then
reopen the affected pull request. If GitHub will not allow the closed PR to be
reopened, ask the contributor to resubmit after the allowlist PR is merged.
## Agent-Assisted Improvements
CodeWhale is allowed to help improve CodeWhale, but the contribution still has
to be shaped for human review. The recommended workflow is the
[recursive self-improvement prompt](docs/RECURSIVE_SELF_IMPROVEMENT.md): run it
from a fresh fork or branch, let the agent find exactly one small friction point,
and stop after one patch. DeepSeek V4 Pro is the first-class path for this loop
today, but the review shape matters more than the provider.
Agents and maintainers should follow the stewardship posture in
[docs/AGENT_ETHOS.md](docs/AGENT_ETHOS.md): use automation for evidence,
verification, and narrow patches while keeping the final community decision
human-reviewed.
The useful output is not "ideas for improvement." The useful output is a
specific reproduction, a minimal diff, focused checks, and a PR description that
explains the trade-off. Do not use an agent to touch auth, credentials, sandbox
policy, publishing/release plumbing, provider policy, telemetry, sponsorship,
branding, or global prompts without prior maintainer sign-off.
## Project Structure
codewhale is a Cargo workspace. The live runtime and the majority of TUI,
engine, and tool code currently live in `crates/tui/src/`. Smaller workspace
crates provide shared abstractions that are being extracted incrementally.
```
crates/
├── tui/ codewhale-tui binary (interactive TUI + runtime API)
├── cli/ codewhale binary (dispatcher facade)
├── app-server/ HTTP/SSE + JSON-RPC transport
├── core/ Agent loop / session / turn management
├── protocol/ Request/response framing
├── config/ Config loading, profiles, env precedence
├── state/ SQLite thread/session persistence
├── tools/ Typed tool specs and lifecycle
├── mcp/ MCP client + stdio server
├── hooks/ Lifecycle hooks (stdout/jsonl/webhook)
├── execpolicy/ Approval/sandbox policy engine
├── agent/ Model/provider registry
└── tui-core/ Event-driven TUI state machine scaffold
```
See [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) for the live data flow across
these crates, including the bottom-up build order.
## Submitting Changes
1. Create a feature branch from `main`:
```bash
git checkout -b feat/your-feature
```
2. Make your changes and commit them
3. Ensure CI passes:
```bash
cargo fmt --all -- --check
cargo clippy --workspace --all-targets --all-features
cargo test --workspace --all-features
```
4. Push your branch and create a Pull Request
5. Describe your changes clearly in the PR description
## Pull Request Guidelines
- Use the [pull request template](.github/PULL_REQUEST_TEMPLATE.md) when opening
a PR — it includes the Summary, Testing, and Checklist sections reviewers
expect
- Keep PRs focused on a single change
- Update documentation if needed
- Add tests for new functionality
- Ensure CI passes before requesting review
## Shape of a Typical PR
A well-structured PR follows a consistent pattern. Recent exemplars include:
- **#386** — `/init` command: new `crates/tui/src/commands/init.rs` module, project-type detection,
AGENTS.md generation, command registration in `commands/mod.rs`, localization strings.
- **#389** — Inline LSP diagnostics: LSP subsystem in `crates/tui/src/lsp/`, engine hooks in
`core/engine/lsp_hooks.rs`, config toggle, test coverage.
- **#387** — Self-update: new `crates/cli/src/update.rs` module, CLI subcommand registration,
HTTP download + SHA256 verification + atomic binary replacement.
- **#393** — `/share` session URL: new `crates/tui/src/commands/share.rs`, HTML rendering,
`gh gist create` integration, command registration.
- **#343/#346** — (v0.8.5) Runtime thread/turn timeline and durable task manager refactors.
Typically each PR touches 13 new files, modifies 25 existing files for wiring
(registries, dispatch matches, localization), and adds or updates tests. Changes
are scoped to a single feature or fix — if you discover related work that needs
doing, open a separate issue rather than expanding the PR scope.
Before submitting, run:
```bash
cargo fmt --check
cargo clippy --workspace --all-targets --all-features 2>&1 | head -50
cargo check
```
## Reporting Issues
When reporting issues, please use one of the issue templates:
- [Bug report](.github/ISSUE_TEMPLATE/bug_report.md) — for reproducible problems
or regressions
- [Feature request](.github/ISSUE_TEMPLATE/feature_request.md) — for ideas and
improvements
Issue reports should include:
- Operating system and version
- Rust version (`rustc --version`)
- codewhale version (`codewhale --version`)
- Steps to reproduce the issue
- Expected vs actual behavior
- Relevant error messages or logs
## Security
If you discover a security vulnerability, please do **not** open a public issue.
See [SECURITY.md](SECURITY.md) for the responsible disclosure process and
contact information.
## Code of Conduct
Be respectful and inclusive. We welcome contributors of all backgrounds and
experience levels. See [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md) for the full
code of conduct.
## License
By contributing to codewhale, you agree that your contributions will be licensed under the MIT License.
## Questions?
Feel free to open an issue for any questions about contributing.