Scaffold the Before-M1 repo skeleton

Stand up the modular-monolith skeleton per docs/V1_BUILD_PLAN.md: one .NET 10 solution with web + worker hosts sharing seven interface-bounded module projects, PostgreSQL 17 + pgvector via EF Core 10, a React 19 + Vite SPA built into wwwroot, and Docker Compose for one-command local dev. Skeleton only — no feature code. Architecture - One project per module (OrgBoard, Identity, Skills, Assembler, Governance, Memory, Integrations); each is its own assembly so non-public types (entities, DbContext) are invisible across modules at compile time. - TeamUp.Bootstrap is the only library that references all modules; both hosts reference only Bootstrap. SharedKernel/Infrastructure never reference modules. - IModule seam: Register(...) runs in both hosts; MapEndpoints(...) only in web. - PlatformDbContext owns the pgvector extension + the seven module schemas (InitialPlatform migration); MigrationRunner applies it then any module context. - One image, two roles selected by RUN_MODE at the Docker entrypoint. Verified - dotnet build green (nullable + warnings-as-errors). - ArchitectureTests 8/8 — reflection-based boundary rules (no module -> module, -> Infrastructure, -> Bootstrap, or -> host references). - IntegrationTests 10/10 — Testcontainers boots the host against real pgvector: migration applies, vector extension + 7 schemas exist, /health 200, every /api/<module>/ping 200, /openapi/v1.json served. - client builds clean (Vite 6 — pinned for Node 22.3.0; Vite 8 needs Node >=22.12). Packages and base images route through the Nexus mirror (mirror.soroushasadi.com), reachable from Iran when nuget.org / Docker Hub / MCR are not. CI is intentionally deferred to a later session. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 06:41:28 +03:30
commit 36fe158b43
89 changed files with 7329 additions and 0 deletions
@@ -0,0 +1,130 @@
+# TeamUp.AI — Project Memory
+
+> **Build human + AI teams.** Model your organization, fill open role-seats with governed AI agents that do the work, and run delivery on one board where humans and AI work side by side.
+>
+> A product of **AliaSaaS**. This file is the always-loaded context for Claude Code. Read `docs/PRODUCT.md` for the full model and `docs/V1_BUILD_PLAN.md` for what to build now.
+
+---
+
+## 1. What this is
+
+Small/mid software orgs rarely staff every role — often no product owner, no dedicated QA, no reviewer — so developers absorb that work and quality suffers. Existing AI tools sit in one developer's editor as a single helper; they have no concept of a *team*, no *role coverage*, no *governance*, and the work still lives in a separate tool.
+
+TeamUp.AI is a **live org chart that does work, on a board the team already runs delivery on**. You model the org; any open *seat* (a role) can be filled by an AI *agent* that is equipped with skills, given documents, granted tools, governed by an autonomy setting, and put to work — producing real output routed to a human review queue.
+
+It is also a **lightweight project-management framework**: the AI Product Owner writes a spec *and* generates child stories as real tasks on the board; the AI QA picks up a build and posts pass/fail; work flows backlog → done with assignees that are human or AI.
+
+---
+
+## 2. The bet (what V1 exists to prove)
+
+> **"For a Product Owner and a QA role, an AI agent produces output a human accepts with little editing — saving more time than supervising it costs."**
+
+Measured, not asserted. Primary metric: **human edit distance** (how much a reviewer changes output before approving). Instrumented from **M1**, not at the end. If it's low and falling across a sprint inside AliaSaaS, the product is real. If not, we learned cheaply.
+
+**Strategy: architect broad, build narrow, market narrow.** The full model (divisions, MCP, marketplace, custom model) is in `docs/PRODUCT.md` and the data model accommodates it — but V1 builds only the wedge below.
+
+---
+
+## 3. V1 scope
+
+**In V1:**
+- Org → one product → one team → seats (human / open / AI)
+- Two AI roles: **Product Owner** and **QA**
+- The **board**: backlog → in progress → in review → done; tasks assigned to humans or AI
+- **Skill registry** (Git-indexed) with ~4 atoms
+- **Assembler + worker**; prompt caching
+- **Autonomy dial**, **review inbox**, audit log
+- **Access control** (roles × scope) and the **cartable** (each person's pending-work inbox)
+- **BYOK** API config; per-seat model
+- Team **working memory** (basic)
+
+**Deferred (architected for, not built in V1):** divisions UI & non-engineering roles; multiple products & multi-tenant billing; per-agent MCP tool-calling & Git write-back; episodic/semantic memory; the gap finder; skill studio UI, template builder, tier enforcement, AI skill-suggestion; the skill/MCP marketplace; the custom TeamUp model; SSO/SCIM; cross-team event mesh beyond the single PO→QA trigger.
+
+---
+
+## 4. Architecture (decided)
+
+**Modular monolith + a background worker, on PostgreSQL. Not microservices.**
+
+- **One deployable**, internally divided into modules with explicit interfaces — modules call each other through interfaces, **never** by reaching into another module's tables. This is the discipline that keeps the monolith modular and the extraction path clean.
+- **The one split:** a **web/API** process and a **worker** process share the same codebase and the same Postgres DB; agent runs are enqueued on a **Postgres-backed job queue** and run in the worker, off the request path. This is the standard web+worker pattern, not microservices.
+- **Why not microservices now:** the domain is small, its boundaries are still unknown, and the distributed-systems tax (network hops, eventual consistency, partial failure, service discovery, distributed tracing) is pure overhead while we're proving the bet. Extract a module to a service **only on a measured signal** (the agent-runtime under load is the likely first candidate; a compliance boundary for future model training is another).
+- **Deployment:** one application image (run as web or worker via entrypoint) + PostgreSQL. Self-hostable and **air-gappable** as a single unit (important for the Iranian market).
+
+---
+
+## 5. Tech stack (locked)
+
+One backend language: **.NET**. Library-level bill of materials in `docs/V1_BUILD_PLAN.md` § *Tech stack & bill of materials*.
+
+| Layer | Choice | Notes |
+|---|---|---|
+| Backend | **.NET 10 (LTS) + ASP.NET Core** | Modular monolith. Chosen for enforceable module boundaries, the native web+worker host, and self-contained air-gapped images. **Go** reserved for a future hot-path runner; **Python** only as an optional sidecar if AI tooling ever demands it. |
+| Worker | Same codebase, separate entrypoint (Generic Host `BackgroundService`) | Web + worker, one image, one DB — not a second stack |
+| Data | **PostgreSQL 17+ + pgvector** | Relational data, skill index, working-memory embeddings, and the job queue — one store |
+| Frontend | **React SPA (Vite + TypeScript)**, served as static files from ASP.NET Core | Keeps the deployable a single unit. **Next.js** reserved for the public marketing site only (outside the air-gapped product). TeamUp.AI design system applies (see `docs/PRODUCT.md` §design) |
+| Git | **Gitea** (read-only in V1) | Skills source + code context; provider-agnostic adapter |
+| Models | **BYOK** over HTTP (OpenAI / Anthropic / Vertex / Ollama) via `Microsoft.Extensions.AI` | No token COGS, no lock-in, sanctions-safe |
+| Deploy | One Docker image (web or worker via entrypoint) + Postgres, on Kubernetes | Air-gappable single unit |
+
+---
+
+## 6. Domain model (core entities)
+
+`Member` (invited human) · `Membership` (member × scope × role) · `Team` / `Seat` (`seat.state = human | open | ai`) · `Agent` (config of an AI seat: skills, autonomy, api_config_id, docs) · `Task` (type, status, `assignee = member | agent`, parent, provenance) · `Skill` (Git-indexed atom: id, version, roles[], visibility, min_tier) · `TeamTemplate` / `DivisionTemplate` (reusable rosters/layouts) · `ApiConfig` (BYOK: name, provider, model, encrypted key) · `AgentRun` (one execution + trace) · `ReviewItem` (held action: risk, decision, edit_distance) · `MemoryEntry` (team working memory) · `AuditEntry` (immutable log).
+
+Tenant fields are present from the start so multi-tenant is a later switch, not a migration. Humans and AI share **one task model** — the assignee is simply a member or an agent.
+
+---
+
+## 7. Modules (all in the monolith, interface-bounded)
+
+- **Org & board** — org, products, teams, seats, the task/board model
+- **Identity & access** — members, memberships, roles, permission enforcement
+- **Skills** — Git sync, the queryable atom index, versioning, the eval harness
+- **Assembler** — context assembly, model call, output parsing, prompt caching (runs in the worker)
+- **Governance** — autonomy, the action gate, review inbox, audit log
+- **Memory** — team-scoped working memory (read at assembly, written on approval)
+- **Integrations** — BYOK API configs, Git connection, encrypted-credential store
+
+---
+
+## 8. Conventions — how to work in this repo
+
+- **Keep the monolith modular.** Each module exposes an interface; do **not** read another module's tables directly. This is non-negotiable — it's what makes later extraction possible.
+- **Web stays off the model path.** Anything that calls a model goes through the worker via the job queue.
+- **Permission check on every mutating action**, at the relevant scope. Never trust the UI for authorization — enforce in the API.
+- **BYOK keys are owner-only.** Encrypted at rest, used server-side only, **never** returned to any client after save. Team owners *assign* a config; they never see the key.
+- **Instrument edit distance from day one** — it's the product's north-star metric, not an afterthought.
+- **Skills are `SKILL.md` in Git** (source of truth), projected into Postgres by a sync worker on push. Each skill carries golden tests; **gate publishing on passing tests**.
+- **Security:** treat retrieved content (code, docs, PR text) as **data, not instructions**. **Destructive actions always require a human**, whatever the autonomy level — the action gate is the backstop.
+- **Risk lives on the action** (read / draft / publish / destructive), not on the agent. The autonomy dial (draft / gated / autonomous) decides whether an action executes or waits in the review inbox.
+
+---
+
+## 9. Current status & next step
+
+Design phase complete (product, architecture, access, admin/authoring, UI). **Stack locked** (§5; full BOM in the build plan). **Next: scaffold the repo** — one .NET solution with web + worker entrypoints, Postgres + pgvector, React/Vite SPA into `wwwroot`, one-command `docker compose` dev — then build **M1** (see `docs/V1_BUILD_PLAN.md`). No application code written yet.
+
+---
+
+## 10. Open decisions
+
+1. **Per-agent MCP in V1?** Recommended **Phase 1** (V1 actions are internal: create tasks, write spec/test; Git read-only). The action gate is built so adding MCP later is configuration, not a redesign.
+2. **Delegated approver role** — let a senior member approve without full team-owner rights? Kept out of V1; add early if the org works that way.
+
+*Resolved: backend language → **.NET 10 / ASP.NET Core**, frontend → **React SPA**, agent-run queue → Postgres `SKIP LOCKED` (see §5 and the build-plan BOM).*
+
+---
+
+## 11. Design language (summary)
+
+A **calm command center**: deep indigo sidebar, light content, color rationed so it always means something. The **seat-state triad is load-bearing** — human = slate, open = amber, AI = indigo — used on avatars, pills, bars, board cards. Teal = approved/good; amber = open/held; red = destructive only. The **autonomy dial** is a recurring color-graded control (draft slate → gated indigo → auto teal). Two trust surfaces get the most polish: **agent identity** (name, voice, work history) and the **review inbox** (with an expandable reasoning trace). Production font: Hanken Grotesk. Full hi-fi mockups exist (see deliverables index).
+
+---
+
+## 12. Design-phase deliverables (reference, not in-repo)
+
+These were produced during design and live outside the codebase; treat them as background:
+`TeamUp_V1_Solution_Document.docx` (the V1 spec — authoritative), `TeamUp_Business_Plan.docx`, `TeamUp_Business_Model_Canvas.pdf`, `TeamUp_Pitch_Deck.pptx`, `TeamUp_UI_HiFi.html/.pdf` (the design language), `TeamUp_Wireframes.html/.pdf`, `TeamUp_Divisions_Design.pdf`. `docs/PRODUCT.md` and `docs/V1_BUILD_PLAN.md` distill all of it for the build.
@@ -0,0 +1,167 @@
+# TeamUp.AI — Product Model & Decisions
+
+The full design. `CLAUDE.md` is the always-loaded summary; this is the reference for the complete model (including the parts deferred past V1). Phasing is noted per area.
+
+---
+
+## Object spine (6 layers)
+
+```
+Organization → Division → Product / Service → Team → Seat → Agent
+```
+
+- **Organization** — the company (e.g. AliaSaaS).
+- **Division** — Technical, Finance, HR, Sales & Marketing, Operations. *(Phase 1 UI; data model from the start.)*
+- **Product / Service** — engineering divisions organize around **products** (IPNOPS, Parsvice, …); other divisions around **services** (Payroll, Recruiting, …). Same entity, a `kind` tag.
+- **Team** — within a product/service; has a team type (template) seeding its default seats.
+- **Seat** — a role in one of three states: **human / open / AI**. A role = a name + reusable, versioned skill atoms, so any role on any team type can be AI-filled.
+- **Agent** — the AI staffing a seat: identity (name, monogram, voice, work history) + skills + tools + docs + autonomy + model.
+
+Everything below the division works unchanged per division. Humans and AI share one task model.
+
+---
+
+## Skills
+
+- **Atomic and versioned.** Authored as `SKILL.md` (YAML frontmatter + markdown body) in **Git (source of truth)**, projected to Postgres (queryable index) by a sync worker on push webhook.
+- **Frontmatter:** `id, name, version, roles[], inputs/outputs (I/O contract), actions (each risk-tagged: read/draft/publish/destructive), tools, context, visibility, min_tier`.
+- **Compose into seats.** A seat's behavior = house style + identity/overrides + its matched atoms. Suggested starting sets are UX scaffolding, not a "pack" entity.
+- **Eval:** golden tests per skill; quality metric = human edit distance; **publishing is gated on passing tests**.
+- **Free vs tier-based:** `visibility` (public | private-to-org) and `min_tier` (free/Team/Scale/Enterprise). Orgs see/assign only what their tier entitles + their private ones; higher-tier skills show locked with an upgrade prompt. *(Enforcement Phase 1.)*
+
+---
+
+## Autonomy, the action gate & review
+
+- **Autonomy dial** — per seat, set by the team owner: **draft-only / gated / autonomous**.
+- **Risk lives on the action**, not the agent. The gate compares the seat's autonomy to the action's risk: execute, or hold.
+- **Destructive actions always require a human**, whatever the autonomy level — the gate is the backstop (also the prompt-injection backstop).
+- **Review inbox** — held actions wait here; **approve / edit-and-approve / send back**. Edit-and-approve feeds the edit-distance metric and the future model. Each item carries an expandable **reasoning trace** (skills fired, context, memory, intended action) so approval is informed, never blind. This is the trust centerpiece and a competitive differentiator.
+- **Audit log** — every decision recorded immutably.
+
+---
+
+## The cartable & responsibilities
+
+Each person has a **cartable**: a single personal inbox of everything awaiting their action across all their teams. A **derived view**, not new storage.
+
+| Item | What it is | Who sees it |
+|---|---|---|
+| Task | A board task assigned to me | anyone in a seat |
+| Approval | A held AI action routed to me | team owners |
+| Input needed | An agent/teammate needs my answer mid-task | the person asked |
+| Handoff | Work arriving from another team, awaiting acceptance | receiving owner |
+| Sent back / mention | Returned to me, or where I'm named | the person named |
+
+The **board** is the team's shared view; the **cartable** is one person's pending slice; the **review inbox is the Approvals section of an owner's cartable** (members without approval rights don't see it). Responsibilities = the seats a person holds.
+
+---
+
+## Access / RBAC
+
+**Access = role × scope.** Roles are granted at a scope (org / division / product / team); memberships are additive.
+
+| Capability | Owner | Team owner | Member | Viewer |
+|---|---|---|---|---|
+| Billing & plan | ✓ | — | — | — |
+| Manage API keys (BYOK) | ✓ | assign only | — | — |
+| Invite / remove people | ✓ | team | — | — |
+| Create products / teams | ✓ | own team | — | — |
+| Configure agents | ✓ | own team | — | — |
+| Set autonomy dial | ✓ | own team | — | — |
+| Approve held / destructive | ✓ | own team | — | — |
+| Create & work tasks; chat | ✓ | ✓ | ✓ | — |
+| View board & outputs | ✓ | ✓ | ✓ | ✓ |
+| View audit log | ✓ | own team | — | — |
+
+**API keys are owner-only** (assign-only for team owners; keys never returned to a client; enforced server-side). V1 uses org + team scopes; division-lead role activates with the divisions layer. Deferred: SSO/SCIM, custom roles, delegated approver, division-scoped delegation.
+
+---
+
+## Built-in project management
+
+- **Board** with backlog → assigned → in progress → in review → done. **Tasks** have type (spec / story / test / review / release), status, assignee (human or AI), parent, provenance.
+- The **AI Product Owner** writes a spec **and** generates its child stories as real tasks. The **AI QA** picks up a build (a "done" transition) and drafts test results.
+- **Team-to-team handoff = an event, not a re-review.** When a task hits *done* it has already passed the producing team's governance; the boundary is a **pipe, not a gate**. The "done" transition emits a typed handoff event on the team-relationship graph, which lands as a **new task in the receiving team's basket** (with a provenance link). The receiving agent then acts per *its own* autonomy. Review happens per-seat on each side, never at the seam.
+- **Intake mode** per relationship: **auto-accept** (default for trusted internal links) or **triage** (events wait in an intake tray the owner accepts/declines). *(Event mesh beyond the single PO→QA trigger is Phase 1+.)*
+- **Guardrails:** rate-limit triggers and detect cycles (loop/storm protection); destructive always needs a human.
+
+---
+
+## Per-agent MCP *(Phase 1)*
+
+Each seat is assigned MCP servers and a **chosen subset of their tools** (a seat uses some tools of a server, not all). Every MCP tool call is **risk-tagged and flows through the same action gate** — publish/destructive calls are held in review. Git write-back can itself be an internal MCP server, so it falls out of the same mechanism rather than a separate build. Skills = how an agent thinks; MCP = what it can do; the dial governs both.
+
+---
+
+## Git abstraction
+
+Provider-agnostic `GitProvider` interface: **GitHub / GitLab / Azure DevOps / self-hosted Gitea**. Used for (a) the skill registry source, (b) codebase context (chunked, delta-indexed embeddings — re-embed only what changed), (c) write-back (PR comments, issues, branches) *(write-back Phase 2, as an MCP server)*. V1 = Gitea, read-only. Per-org OAuth/credentials, isolated across tenants.
+
+---
+
+## Shared memory
+
+Team-scoped, pgvector. Built progressively: **working memory** (V1 — decisions/findings/corrections, read at assembly, written on approval) → **episodic** (full output history + edit distance; the training-data foundation) → **semantic** (entity knowledge graph). Institutional knowledge = switching-cost moat. Strict isolation across teams and orgs.
+
+---
+
+## BYOK & models
+
+Customers connect their own providers (OpenAI / Anthropic / Google Vertex / Ollama-self-hosted / any OpenAI-compatible) and pay the model bill directly — **TeamUp.AI never resells tokens.** Eliminates token COGS (SaaS-grade margins), satisfies enterprise data-control, and via self-hosted models gives Iran a fully air-gapped, sanctions-safe option. Orgs manage N **named API configs** (e.g. `Vertex-Pro` for reasoning roles, `Vertex-Flash` for high-volume) and assign one per agent; skills suggest a tier (reasoning/fast), the owner chooses. Per-seat **fallback** config so a run doesn't fail silently. Keys encrypted at rest, server-side only.
+
+---
+
+## The runtime assembler
+
+For each agent run (in the worker):
+
+```
+trigger (task / event / chat) → enqueue AgentRun
+worker: house-style + identity/overrides + matched atoms (by task type / I/O)
+        + permitted docs & code (RAG, pgvector) + working memory  → prompt (+ prompt cache)
+call customer model (BYOK, per-seat config, with fallback) → parse output → action + risk tag
+action gate: autonomy vs risk → execute  |  hold in review inbox
+on approval (or autonomous): execute → record edit distance → write working memory → audit
+```
+
+V1 actions are **internal** (create/update tasks; write a spec or test plan; read Git for context). MCP tool-calling and Git write-back are deferred but the gate/risk-tags are built to accept them as configuration.
+
+---
+
+## Admin & authoring *(Phase 1; data hooks in V1)*
+
+**Two admin levels:** the **platform admin** (vendor) curates the public catalogue — the free/tier-gated skill library and the standard division/team templates, setting each one's tier; the **org admin** (a customer Owner) authors that org's **private** skills and **custom** templates within their tier (Scale+).
+
+- **Skill studio** — author frontmatter + body; versioning with changelog/diff/rollback; eval-gated publishing; commits to Git on publish (Git stays source of truth).
+- **Templates** — a **team template** = a default roster (seats + role name + suggested skills + default autonomy); a **division template** = services/products + the team templates beneath them; instantiating scaffolds the whole structure; versioned with opt-in propagation.
+- **AI-suggested skills per role** — name a role (+ optional description) and the internal AI recommends skills from the entitled library with a one-line rationale each; pick some or **add all**. Powers both seat config and the template builder.
+
+---
+
+## Business model
+
+**BYOK + a platform subscription** (no token reselling). Four tiers per org/month:
+
+| Tier | Price | For | Key inclusions |
+|---|---|---|---|
+| Free | $0 | Evaluation | 1 team, 1 AI seat, public skills, BYOK |
+| Team | $79 | A single product team | ≤3 products, ≤5 AI seats, all standard team types, review inbox |
+| Scale | $249 | Multi-product orgs | Unlimited products & seats, private skills, custom team types, API + webhooks, audit export, analytics |
+| Enterprise | Custom | Large / regulated | SSO, on-prem, compliance export, SLA, dedicated support |
+
+Free tier + the **gap finder** = product-led front door. Upgrade trigger = hitting the Team product/seat cap. **Future: a custom TeamUp model** fine-tuned on review-correction data (the data flywheel), offered free-for-a-period or at-cost, self-hostable/air-gapped.
+
+---
+
+## Competition & moat
+
+Closest: **Relevance AI** (general-purpose "AI workforce" + visual canvas, but no software-team org model, board, or governance depth). Others: **Devin** (one role: engineering), **ChatPRD** (PM role), **CrewAI/LangGraph/AutoGen** (frameworks, not products), **Copilot/Cursor** (individual assistants), **Jira/Linear/Azure DevOps** (track work, don't do it), **Copilot Studio/Agentforce** (general automation, ecosystem-locked). **No direct Iranian competitor**; sanctions keep global players out, and BYOK + self-hosted removes foreign dependency. **Moat:** team-memory (switching cost) + the review-correction dataset (powers the model) + the Iranian-market barrier — none replicable without first building the governance layer.
+
+---
+
+## Phasing
+
+- **Phase 0 — Dogfood:** V1 inside AliaSaaS (PO + QA, one team). Prove the bet.
+- **Phase 1 — PLG:** free tier + design partners; gap finder; external Git read; per-agent MCP; working memory; multi-tenant; 4-tier billing; eval/observability/analytics; skill studio, templates, AI suggestion; divisions UI.
+- **Phase 2 — Global + own model:** Git write-back; episodic/semantic memory; skill & MCP marketplaces; the TeamUp model; compliance pack.
@@ -0,0 +1,124 @@
+# TeamUp.AI V1 — Build Plan
+
+The narrow wedge: **AI Product Owner + AI QA, on one team, through the board and review, inside AliaSaaS.** Build in order; each milestone is shippable. The point of V1 is to measure **human edit distance** on PO and QA work — instrument it from M1.
+
+**Before M1:** the stack is locked (see *Tech stack & bill of materials* below). Stand up the repo — one **.NET 10** solution with two entrypoints (**web/API** + **worker**) sharing the domain-module projects, **PostgreSQL 17+ + pgvector**, EF Core migrations, and a React/Vite SPA built into the web project's `wwwroot` — plus one-command local dev (`docker compose`: app + worker + postgres) and CI. No feature code yet: just the skeleton and the project layout that enforces module boundaries.
+
+---
+
+## Tech stack & bill of materials (locked)
+
+**Backend.** .NET 10 (LTS), ASP.NET Core Minimal APIs (endpoints grouped per module). One solution, two Generic-Host entrypoints — `web` and `worker` — sharing the domain-module projects. Boundaries enforced as separate projects with interface-only references (no cross-module table access).
+
+**Data & persistence.** PostgreSQL 17+ with pgvector · EF Core 10 + Npgsql · `Pgvector.EntityFrameworkCore` for vector columns/queries · EF Core migrations.
+
+**Agent-run queue (M4).** A domain-owned `jobs` table drained with `SELECT … FOR UPDATE SKIP LOCKED` by a worker `BackgroundService` — the run lifecycle (queued → running → output → review) is domain state, kept explicit. *(Alternative if outbox/messaging ergonomics are wanted later: Wolverine on Postgres. Hangfire/Quartz only for M6's scheduled triggers.)*
+
+**AI layer — thin adapters (M3–M4).** `Microsoft.Extensions.AI` (`IChatClient` / `IEmbeddingGenerator`) as the provider-agnostic seam, with thin per-provider HTTP adapters behind it · `Microsoft.Extensions.Http.Resilience` (Polly) for the per-seat fallback/retry chain · air-gapped embeddings via `SmartComponents.LocalEmbeddings` or raw `Microsoft.ML.OnnxRuntime` (MiniLM/bge, CPU-only), switching to a provider's embedding API when BYOK keys are present.
+
+**Cross-cutting.** Auth/RBAC — ASP.NET Core Identity + JWT (OpenIddict later if a full OAuth server is needed) · BYOK at rest — AES-GCM with a deployment master key; keys owner-only, server-side, never returned to a client · Validation — FluentValidation · Mapping — Mapperly (source-gen) · Resilience — Polly · Observability — OpenTelemetry + Serilog (carries the edit-distance metric from M1).
+
+**Testing & the golden-tested-skills rule (M2).** xUnit · Testcontainers (real Postgres) · **Verify** for snapshot/golden tests of skills and prompt outputs.
+
+**Frontend.** React SPA — Vite + TypeScript, built into the web project's `wwwroot` (single deployable). React Router · TanStack Query (server state) · Zustand (client state) · shadcn/ui + Tailwind · **React Flow (xyflow)** for the live org chart · **dnd-kit** for the board · React Hook Form + Zod · Recharts/Tremor for the M6 analytics. Typed API client generated from ASP.NET's OpenAPI (orval / openapi-typescript) into TanStack Query hooks — end-to-end types. *(Next.js is reserved for the separate public marketing site, not the product.)*
+
+**Dev & deploy.** One Docker image run as web or worker via entrypoint, + Postgres; one-command `docker compose` for local dev; Kubernetes for prod; air-gappable as a single unit.
+
+---
+
+## M1 — Org, board, access & cartable
+
+**Goal:** the skeleton — people, permissions, and a working board with the three seat states. No AI yet.
+
+**Tasks**
+- Entities: `Member`, `Membership` (scope + role), `Team`, `Seat` (state: human/open/ai), `Task` (type, status, assignee = member|agent, parent, provenance), `AuditEntry`.
+- Roles & **permission enforcement** middleware — a check on every mutating action at the relevant scope (Owner / Team owner / Member / Viewer).
+- Invitation flow (email → join → land in cartable).
+- The **board** UI: columns backlog → in progress → in review → done; create/move/assign tasks (human assignees for now).
+- The **cartable** as a derived view (tasks assigned to me, sent-backs, mentions; Approvals section stubbed for owners).
+- Edit-distance instrumentation **stubbed in** (the data path exists, even with no AI output yet).
+- Audit log writing on key actions.
+
+**Acceptance:** a CEO can invite a member, assign them a role on a team, both see the board scoped to their permissions, tasks move across columns, and each person sees their own cartable. A member cannot perform owner-only actions (verified).
+
+---
+
+## M2 — Skill registry
+
+**Goal:** skills flow from Git into a queryable index, with the first PO/QA atoms.
+
+**Tasks**
+- `GitProvider` interface; Gitea read adapter; webhook → sync worker.
+- Parse `SKILL.md` (frontmatter + body) → `Skill` rows in Postgres (incl. `visibility`, `min_tier` fields — hooks only).
+- pgvector index over skills for matching.
+- Eval harness: run a skill's golden tests; report pass/fail + edit distance; **block publish on failure**.
+- Author the four V1 atoms in Git: `spec-writing`, `story-breakdown`, `test-plan-generation`, `diff-review` — each with frontmatter (roles, I/O, risk-tagged actions, context) and golden tests.
+
+**Acceptance:** pushing a `SKILL.md` to Gitea indexes it within seconds; the four atoms appear, queryable by role; their golden tests run and pass.
+
+---
+
+## M3 — Seat config + BYOK
+
+**Goal:** configure an AI seat and connect a model — securely.
+
+**Tasks**
+- `Agent` entity (skills[], autonomy, api_config_id, docs[]) bound to a seat; flip a seat open → AI.
+- Seat configurator UI: pick skills (+ versions), set autonomy dial, attach docs/repo context, choose model config.
+- `ApiConfig` (BYOK): name, provider, model, **encrypted** key. **Owner-only** create/view; team owners assign from a list and never see the key; keys never returned to the client after save.
+- Model adapter interface + adapters for the providers in use (HTTP); per-seat **fallback** config.
+
+**Acceptance:** an owner adds a `Vertex-Pro` config (key stored encrypted, not retrievable); a team owner configures Aria (PO) with skills, gated autonomy, docs, and that config — without ever seeing the key; a test call succeeds.
+
+---
+
+## M4 — Assembler + worker
+
+**Goal:** a task becomes an agent run becomes a parsed output.
+
+**Tasks**
+- Job queue: a Postgres `jobs` table drained with `FOR UPDATE SKIP LOCKED` by a worker `BackgroundService`; enqueue an `AgentRun` on trigger (task assigned / chat).
+- Worker pulls a job and runs the **assembler**: house-style + identity/overrides + matched atoms (by task type / I/O) + permitted docs & code (RAG via pgvector) + working memory → prompt, with **prompt caching**.
+- Call the seat's model (BYOK, with fallback); store the full run + trace on `AgentRun`.
+- Parse output into an **action + risk tag** (PO: spec + proposed child stories; QA: test plan from a diff).
+
+**Acceptance:** assigning a feature task to Aria produces a spec and a set of proposed child stories as a parsed result, with the assembled context and reasoning captured on the run. Nothing executes yet (gate is M5).
+
+---
+
+## M5 — Action gate + review inbox
+
+**Goal:** governance closes the loop; edit distance is captured for real.
+
+**Tasks**
+- Action gate: compare seat autonomy (draft/gated/autonomous) to action risk (read/draft/publish/destructive) → execute or **hold**. **Destructive always holds for a human.**
+- `ReviewItem` for held actions; the **review inbox** UI (= the Approvals section of an owner's cartable): preview, **expandable reasoning trace**, and **approve / edit-and-approve / send back**.
+- On execute: perform the internal action (create the child tasks; write the spec/test artifact onto the board); record **edit distance** from edit-and-approve; write audit entry.
+
+**Acceptance:** Aria (gated) proposes a spec → it waits in the owner's review inbox with its trace → owner edits and approves → the spec lands and four child story tasks appear on the board → edit distance is recorded.
+
+---
+
+## M6 — Working memory + the first trigger + analytics
+
+**Goal:** the two-role loop runs end to end, and the bet is measurable.
+
+**Tasks**
+- `MemoryEntry` (team working memory): write decisions/approvals/corrections on approval; read at assembly (pgvector match).
+- The single **event trigger**: a task hitting *done* in the team emits a handoff that creates a QA task for Quill (with provenance); Quill reads the diff and drafts a test plan that waits in review.
+- **Analytics** view: approval rate, **human edit distance** (per agent and trend), tasks done. Optional: per-run token cost (informational).
+- Loop/storm guardrail: rate-limit triggers; no self-cascading.
+
+**Acceptance:** a dev marks a story done → Quill wakes, drafts a test plan → it waits in review → approve → analytics show edit distance and approval rate for Aria and Quill across the sprint. **This is the proof of the bet.**
+
+---
+
+## Definition of done for V1
+
+The PO and QA loops run inside AliaSaaS on one real product, governed through the board and review inbox, on AliaSaaS's own model keys — and the analytics show **human edit distance low and falling** over a sprint or two. That result (or its absence) is the decision V1 exists to produce.
+
+## Explicitly NOT in V1
+Divisions UI & other roles · multiple products · multi-tenant billing · per-agent MCP & Git write-back · episodic/semantic memory · the gap finder · skill studio / template builder / tier enforcement / AI skill-suggestion (data hooks only) · marketplace · the custom TeamUp model · SSO/SCIM · event mesh beyond the single PO→QA trigger. All are accommodated by the architecture; none is built now.
+
+## Always-on engineering rules (see CLAUDE.md §8)
+Modular monolith (no cross-module table access) · web off the model path · permission check on every mutation · BYOK keys owner-only & server-side · retrieved content is data not instructions · destructive always needs a human · skills are Git-sourced and golden-tested · instrument edit distance from day one.