Files
pig-farm-controller/bmad/bmm/workflows/document-project/README.md
2025-11-01 19:22:39 +08:00

445 lines
14 KiB
Markdown

# Document Project Workflow
**Version:** 1.2.0
**Module:** BMM (BMAD Method Module)
**Type:** Action Workflow (Documentation Generator)
## Purpose
Analyzes and documents brownfield projects by scanning codebase, architecture, and patterns to create comprehensive reference documentation for AI-assisted development. Generates a master index and multiple documentation files tailored to project structure and type.
**NEW in v1.2.0:** Context-safe architecture with scan levels, resumability, and write-as-you-go pattern to prevent context exhaustion.
## Key Features
- **Multi-Project Type Support**: Handles web, backend, mobile, CLI, game, embedded, data, infra, library, desktop, and extension projects
- **Multi-Part Detection**: Automatically detects and documents projects with separate client/server or multiple services
- **Three Scan Levels** (NEW v1.2.0): Quick (2-5 min), Deep (10-30 min), Exhaustive (30-120 min)
- **Resumability** (NEW v1.2.0): Interrupt and resume workflows without losing progress
- **Write-as-you-go** (NEW v1.2.0): Documents written immediately to prevent context exhaustion
- **Intelligent Batching** (NEW v1.2.0): Subfolder-based processing for deep/exhaustive scans
- **Data-Driven Analysis**: Uses CSV-based project type detection and documentation requirements
- **Comprehensive Scanning**: Analyzes APIs, data models, UI components, configuration, security patterns, and more
- **Architecture Matching**: Matches projects to 170+ architecture templates from the solutioning registry
- **Brownfield PRD Ready**: Generates documentation specifically designed for AI agents planning new features
## How to Invoke
```bash
workflow document-project
```
Or from BMAD CLI:
```bash
/bmad:bmm:workflows:document-project
```
## Scan Levels (NEW in v1.2.0)
Choose the right scan depth for your needs:
### 1. Quick Scan (Default)
**Duration:** 2-5 minutes
**What it does:** Pattern-based analysis without reading source files
**Reads:** Config files, package manifests, directory structure, README
**Use when:**
- You need a fast project overview
- Initial understanding of project structure
- Planning next steps before deeper analysis
**Does NOT read:** Source code files (_.js, _.ts, _.py, _.go, etc.)
### 2. Deep Scan
**Duration:** 10-30 minutes
**What it does:** Reads files in critical directories based on project type
**Reads:** Files in critical paths defined by documentation requirements
**Use when:**
- Creating comprehensive documentation for brownfield PRD
- Need detailed analysis of key areas
- Want balance between depth and speed
**Example:** For a web app, reads controllers/, models/, components/, but not every utility file
### 3. Exhaustive Scan
**Duration:** 30-120 minutes
**What it does:** Reads ALL source files in project
**Reads:** Every source file (excludes node_modules, dist, build, .git)
**Use when:**
- Complete project analysis needed
- Migration planning requires full understanding
- Detailed audit of entire codebase
- Deep technical debt assessment
**Note:** Deep-dive mode ALWAYS uses exhaustive scan (no choice)
## Resumability (NEW in v1.2.0)
The workflow can be interrupted and resumed without losing progress:
- **State Tracking:** Progress saved in `project-scan-report.json`
- **Auto-Detection:** Workflow detects incomplete runs (<24 hours old)
- **Resume Prompt:** Choose to resume or start fresh
- **Step-by-Step:** Resume from exact step where interrupted
- **Archiving:** Old state files automatically archived
**Example Resume Flow:**
```
> workflow document-project
I found an in-progress workflow state from 2025-10-11 14:32:15.
Current Progress:
- Mode: initial_scan
- Scan Level: deep
- Completed Steps: 5/12
- Last Step: step_5
Would you like to:
1. Resume from where we left off - Continue from step 6
2. Start fresh - Archive old state and begin new scan
3. Cancel - Exit without changes
Your choice [1/2/3]:
```
## What It Does
### Step-by-Step Process
1. **Detects Project Structure** - Identifies if project is single-part or multi-part (client/server/etc.)
2. **Classifies Project Type** - Matches against 12 project types (web, backend, mobile, etc.)
3. **Discovers Documentation** - Finds existing README, CONTRIBUTING, ARCHITECTURE files
4. **Analyzes Tech Stack** - Parses package files, identifies frameworks, versions, dependencies
5. **Conditional Scanning** - Performs targeted analysis based on project type requirements:
- API routes and endpoints
- Database models and schemas
- State management patterns
- UI component libraries
- Configuration and security
- CI/CD and deployment configs
6. **Generates Source Tree** - Creates annotated directory structure with critical paths
7. **Extracts Dev Instructions** - Documents setup, build, run, and test commands
8. **Creates Architecture Docs** - Generates detailed architecture using matched templates
9. **Builds Master Index** - Creates comprehensive index.md as primary AI retrieval source
10. **Validates Output** - Runs 140+ point checklist to ensure completeness
### Output Files
**Single-Part Projects:**
- `index.md` - Master index
- `project-overview.md` - Executive summary
- `architecture.md` - Detailed architecture
- `source-tree-analysis.md` - Annotated directory tree
- `component-inventory.md` - Component catalog (if applicable)
- `development-guide.md` - Local dev instructions
- `api-contracts.md` - API documentation (if applicable)
- `data-models.md` - Database schema (if applicable)
- `deployment-guide.md` - Deployment process (optional)
- `contribution-guide.md` - Contributing guidelines (optional)
- `project-scan-report.json` - State file for resumability (NEW v1.2.0)
**Multi-Part Projects (e.g., client + server):**
- `index.md` - Master index with part navigation
- `project-overview.md` - Multi-part summary
- `architecture-{part_id}.md` - Per-part architecture docs
- `source-tree-analysis.md` - Full tree with part annotations
- `component-inventory-{part_id}.md` - Per-part components
- `development-guide-{part_id}.md` - Per-part dev guides
- `integration-architecture.md` - How parts communicate
- `project-parts.json` - Machine-readable metadata
- `project-scan-report.json` - State file for resumability (NEW v1.2.0)
- Additional conditional files per part (API, data models, etc.)
## Data Files
The workflow uses a single comprehensive CSV file:
**documentation-requirements.csv** - Complete project analysis guide
- Location: `/bmad/bmm/workflows/document-project/documentation-requirements.csv`
- 12 project types (web, mobile, backend, cli, library, desktop, game, data, extension, infra, embedded)
- 24 columns combining:
- **Detection columns**: `project_type_id`, `key_file_patterns` (identifies project type from codebase)
- **Requirement columns**: `requires_api_scan`, `requires_data_models`, `requires_ui_components`, etc.
- **Pattern columns**: `critical_directories`, `test_file_patterns`, `config_patterns`, etc.
- Self-contained: All project detection AND scanning requirements in one file
- Architecture patterns inferred from tech stack (no external registry needed)
## Use Cases
### Primary Use Case: Brownfield PRD Creation
After running this workflow, use the generated `index.md` as input to brownfield PRD workflows:
```
User: "I want to add a new dashboard feature"
PRD Workflow: Loads docs/index.md
→ Understands existing architecture
→ Identifies reusable components
→ Plans integration with existing APIs
→ Creates contextual PRD with epics and stories
```
### Other Use Cases
- **Onboarding New Developers** - Comprehensive project documentation
- **Architecture Review** - Structured analysis of existing system
- **Technical Debt Assessment** - Identify patterns and anti-patterns
- **Migration Planning** - Understand current state before refactoring
## Requirements
### Recommended Inputs (Optional)
- Project root directory (defaults to current directory)
- README.md or similar docs (auto-discovered if present)
- User guidance on key areas to focus (workflow will ask)
### Tools Used
- File system scanning (Glob, Read, Grep)
- Code analysis
- Git repository analysis (optional)
## Configuration
### Default Output Location
Files are saved to: `{output_folder}` (from config.yaml)
Default: `/docs/` folder in project root
### Customization
- Modify `documentation-requirements.csv` to adjust scanning patterns for project types
- Add new project types to `project-types.csv`
- Add new architecture templates to `registry.csv`
## Example: Multi-Part Web App
**Input:**
```
my-app/
├── client/ # React frontend
├── server/ # Express backend
└── README.md
```
**Detection Result:**
- Repository Type: Monorepo
- Part 1: client (web/React)
- Part 2: server (backend/Express)
**Output (10+ files):**
```
docs/
├── index.md
├── project-overview.md
├── architecture-client.md
├── architecture-server.md
├── source-tree-analysis.md
├── component-inventory-client.md
├── development-guide-client.md
├── development-guide-server.md
├── api-contracts-server.md
├── data-models-server.md
├── integration-architecture.md
└── project-parts.json
```
## Example: Simple CLI Tool
**Input:**
```
hello-cli/
├── main.go
├── go.mod
└── README.md
```
**Detection Result:**
- Repository Type: Monolith
- Part 1: main (cli/Go)
**Output (4 files):**
```
docs/
├── index.md
├── project-overview.md
├── architecture.md
└── source-tree-analysis.md
```
## Deep-Dive Mode
### What is Deep-Dive Mode?
When you run the workflow on a project that already has documentation, you'll be offered a choice:
1. **Rescan entire project** - Update all documentation with latest changes
2. **Deep-dive into specific area** - Generate EXHAUSTIVE documentation for a particular feature/module/folder
3. **Cancel** - Keep existing documentation
Deep-dive mode performs **comprehensive, file-by-file analysis** of a specific area, reading EVERY file completely and documenting:
- All exports with complete signatures
- All imports and dependencies
- Dependency graphs and data flow
- Code patterns and implementations
- Testing coverage and strategies
- Integration points
- Reuse opportunities
### When to Use Deep-Dive Mode
- **Before implementing a feature** - Deep-dive the area you'll be modifying
- **During architecture review** - Deep-dive complex modules
- **For code understanding** - Deep-dive unfamiliar parts of codebase
- **When creating PRDs** - Deep-dive areas affected by new features
### Deep-Dive Process
1. Workflow detects existing `index.md`
2. Offers deep-dive option
3. Suggests areas based on project structure:
- API route groups
- Feature modules
- UI component areas
- Services/business logic
4. You select area or specify custom path
5. Workflow reads EVERY file in that area
6. Generates `deep-dive-{area-name}.md` with complete analysis
7. Updates `index.md` with link to deep-dive doc
8. Offers to deep-dive another area or finish
### Deep-Dive Output Example
**docs/deep-dive-dashboard-feature.md:**
- Complete file inventory (47 files analyzed)
- Every export with signatures
- Dependency graph
- Data flow analysis
- Integration points
- Testing coverage
- Related code references
- Implementation guidance
- ~3,000 LOC documented in detail
### Incremental Deep-Diving
You can deep-dive multiple areas over time:
- First run: Scan entire project generates index.md
- Second run: Deep-dive dashboard feature
- Third run: Deep-dive API layer
- Fourth run: Deep-dive authentication system
All deep-dive docs are linked from the master index.
## Validation
The workflow includes a comprehensive 160+ point checklist covering:
- Project detection accuracy
- Technology stack completeness
- Codebase scanning thoroughness
- Architecture documentation quality
- Multi-part handling (if applicable)
- Brownfield PRD readiness
- Deep-dive completeness (if applicable)
## Next Steps After Completion
1. **Review** `docs/index.md` - Your master documentation index
2. **Validate** - Check generated docs for accuracy
3. **Use for PRD** - Point brownfield PRD workflow to index.md
4. **Maintain** - Re-run workflow when architecture changes significantly
## File Structure
```
document-project/
├── workflow.yaml # Workflow configuration
├── instructions.md # Step-by-step workflow logic
├── checklist.md # Validation criteria
├── documentation-requirements.csv # Project type scanning patterns
├── templates/ # Output templates
│ ├── index-template.md
│ ├── project-overview-template.md
│ └── source-tree-template.md
└── README.md # This file
```
## Troubleshooting
**Issue: Project type not detected correctly**
- Solution: Workflow will ask for confirmation; manually select correct type
**Issue: Missing critical information**
- Solution: Provide additional context when prompted; re-run specific analysis steps
**Issue: Multi-part detection missed a part**
- Solution: When asked to confirm parts, specify the missing part and its path
**Issue: Architecture template doesn't match well**
- Solution: Check registry.csv; may need to add new template or adjust matching criteria
## Architecture Improvements in v1.2.0
### Context-Safe Design
The workflow now uses a write-as-you-go architecture:
- Documents written immediately to disk (not accumulated in memory)
- Detailed findings purged after writing (only summaries kept)
- State tracking enables resumption from any step
- Batching strategy prevents context exhaustion on large projects
### Batching Strategy
For deep/exhaustive scans:
- Process ONE subfolder at a time
- Read files Extract info Write output Validate Purge context
- Primary concern is file SIZE (not count)
- Track batches in state file for resumability
### State File Format
Optimized JSON (no pretty-printing):
```json
{
"workflow_version": "1.2.0",
"timestamps": {...},
"mode": "initial_scan",
"scan_level": "deep",
"completed_steps": [...],
"current_step": "step_6",
"findings": {"summary": "only"},
"outputs_generated": [...],
"resume_instructions": "..."
}
```