Teamup/docs/V1_BUILD_PLAN.md
soroush.asadi 36fe158b43 Scaffold the Before-M1 repo skeleton
Stand up the modular-monolith skeleton per docs/V1_BUILD_PLAN.md: one .NET 10
solution with web + worker hosts sharing seven interface-bounded module projects,
PostgreSQL 17 + pgvector via EF Core 10, a React 19 + Vite SPA built into wwwroot,
and Docker Compose for one-command local dev. Skeleton only — no feature code.

Architecture
- One project per module (OrgBoard, Identity, Skills, Assembler, Governance,
  Memory, Integrations); each is its own assembly so non-public types (entities,
  DbContext) are invisible across modules at compile time.
- TeamUp.Bootstrap is the only library that references all modules; both hosts
  reference only Bootstrap. SharedKernel/Infrastructure never reference modules.
- IModule seam: Register(...) runs in both hosts; MapEndpoints(...) only in web.
- PlatformDbContext owns the pgvector extension + the seven module schemas
  (InitialPlatform migration); MigrationRunner applies it then any module context.
- One image, two roles selected by RUN_MODE at the Docker entrypoint.

Verified
- dotnet build green (nullable + warnings-as-errors).
- ArchitectureTests 8/8 — reflection-based boundary rules (no module -> module,
  -> Infrastructure, -> Bootstrap, or -> host references).
- IntegrationTests 10/10 — Testcontainers boots the host against real pgvector:
  migration applies, vector extension + 7 schemas exist, /health 200, every
  /api/<module>/ping 200, /openapi/v1.json served.
- client builds clean (Vite 6 — pinned for Node 22.3.0; Vite 8 needs Node >=22.12).

Packages and base images route through the Nexus mirror (mirror.soroushasadi.com),
reachable from Iran when nuget.org / Docker Hub / MCR are not. CI is intentionally
deferred to a later session.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 06:41:28 +03:30


# TeamUp.AI V1 — Build Plan
The narrow wedge: **AI Product Owner + AI QA, on one team, through the board and review, inside AliaSaaS.** Build in order; each milestone is shippable. The point of V1 is to measure **human edit distance** on PO and QA work — instrument it from M1.
**Before M1:** the stack is locked (see *Tech stack & bill of materials* below). Stand up the repo — one **.NET 10** solution with two entrypoints (**web/API** + **worker**) sharing the domain-module projects, **PostgreSQL 17+ + pgvector**, EF Core migrations, and a React/Vite SPA built into the web project's `wwwroot` — plus one-command local dev (`docker compose`: app + worker + postgres) and CI. No feature code yet: just the skeleton and the project layout that enforces module boundaries.
---
## Tech stack & bill of materials (locked)
**Backend.** .NET 10 (LTS), ASP.NET Core Minimal APIs (endpoints grouped per module). One solution, two Generic-Host entrypoints — `web` and `worker` — sharing the domain-module projects. Boundaries enforced as separate projects with interface-only references (no cross-module table access).
**Data & persistence.** PostgreSQL 17+ with pgvector · EF Core 10 + Npgsql · `Pgvector.EntityFrameworkCore` for vector columns/queries · EF Core migrations.
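To make the vector-query part concrete, here is a minimal sketch of what a cosine match computes: cosine distance (the quantity pgvector's `<=>` operator returns) and a brute-force top-k. It is written in TypeScript purely for illustration; the `Embedded` shape and the `topK` helper are hypothetical, and under this plan the ranking would run inside Postgres rather than in application code.

```typescript
// Hypothetical shape: any row that carries a precomputed embedding.
interface Embedded {
  id: string;
  embedding: number[];
}

// Cosine distance = 1 - cosine similarity.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) throw new Error("zero vector has no direction");
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force nearest neighbours: smallest distance first.
function topK(query: number[], items: Embedded[], k: number): Embedded[] {
  return items
    .map((item) => ({ item, distance: cosineDistance(query, item.embedding) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, k)
    .map((x) => x.item);
}
```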
**Agent-run queue (M4).** A domain-owned `jobs` table drained with `SELECT … FOR UPDATE SKIP LOCKED` by a worker `BackgroundService` — the run lifecycle (queued → running → output → review) is domain state, kept explicit. *(Alternative if outbox/messaging ergonomics are wanted later: Wolverine on Postgres. Hangfire/Quartz only for M6's scheduled triggers.)*
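A minimal sketch of the explicit run lifecycle and of one possible claim query follows. It is illustrative only: the column names (`id`, `status`, `created_at`) are assumptions, failure and retry states are omitted because the plan does not name them, and the real worker would be a .NET `BackgroundService`, not TypeScript.

```typescript
// The four lifecycle states named in the plan.
type RunStatus = "queued" | "running" | "output" | "review";

// Each state has at most one legal successor; "review" is terminal here.
const NEXT_STATUS: Record<RunStatus, RunStatus | null> = {
  queued: "running",
  running: "output",
  output: "review",
  review: null,
};

// Refuses any transition the table does not list.
function advance(current: RunStatus): RunStatus {
  const next = NEXT_STATUS[current];
  if (next === null) throw new Error(`no transition out of "${current}"`);
  return next;
}

// One way to claim a single queued job in Postgres. Rows already locked by
// another worker are skipped rather than waited on.
// Table layout is hypothetical.
const CLAIM_NEXT_JOB_SQL = `
UPDATE jobs
SET status = 'running'
WHERE id = (
  SELECT id
  FROM jobs
  WHERE status = 'queued'
  ORDER BY created_at
  LIMIT 1
  FOR UPDATE SKIP LOCKED
)
RETURNING id;`;
```

Keeping the transition table in one place means an illegal jump (for example `queued` straight to `review`) fails loudly instead of silently corrupting run state.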
**AI layer: thin adapters (M3–M4).** `Microsoft.Extensions.AI` (`IChatClient` / `IEmbeddingGenerator`) as the provider-agnostic seam, with thin per-provider HTTP adapters behind it · `Microsoft.Extensions.Http.Resilience` (Polly) for the per-seat fallback/retry chain · air-gapped embeddings via `SmartComponents.LocalEmbeddings` or raw `Microsoft.ML.OnnxRuntime` (MiniLM/bge, CPU-only), switching to a provider's embedding API when BYOK keys are present.
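The fallback/retry chain can be pictured roughly as below. This is a language-neutral sketch in TypeScript, not the `Microsoft.Extensions.Http.Resilience` API: the `ChatAdapter` interface, `completeWithFallback`, and the retry count are all hypothetical, and a real chain would add backoff and distinguish retryable from fatal errors.

```typescript
// Hypothetical adapter shape; the plan's actual seam is IChatClient on the .NET side.
interface ChatAdapter {
  name: string;
  complete(prompt: string): Promise<string>;
}

interface FallbackResult {
  provider: string;
  text: string;
  attempts: number;
}

// Tries each adapter in order, retrying it a fixed number of times before
// falling through to the next one. Rethrows the last error if all fail.
async function completeWithFallback(
  chain: ChatAdapter[],
  prompt: string,
  retriesPerProvider = 1,
): Promise<FallbackResult> {
  let attempts = 0;
  let lastError: unknown = new Error("fallback chain is empty");
  for (const adapter of chain) {
    for (let attempt = 0; attempt <= retriesPerProvider; attempt++) {
      attempts++;
      try {
        const text = await adapter.complete(prompt);
        return { provider: adapter.name, text, attempts };
      } catch (err) {
        lastError = err; // remember the failure and keep going
      }
    }
  }
  throw lastError;
}
```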
**Cross-cutting.** Auth/RBAC — ASP.NET Core Identity + JWT (OpenIddict later if a full OAuth server is needed) · BYOK at rest — AES-GCM with a deployment master key; keys owner-only, server-side, never returned to a client · Validation — FluentValidation · Mapping — Mapperly (source-gen) · Resilience — Polly · Observability — OpenTelemetry + Serilog (carries the edit-distance metric from M1).
**Testing & the golden-tested-skills rule (M2).** xUnit · Testcontainers (real Postgres) · **Verify** for snapshot/golden tests of skills and prompt outputs.
**Frontend.** React SPA — Vite + TypeScript, built into the web project's `wwwroot` (single deployable). React Router · TanStack Query (server state) · Zustand (client state) · shadcn/ui + Tailwind · **React Flow (xyflow)** for the live org chart · **dnd-kit** for the board · React Hook Form + Zod · Recharts/Tremor for the M6 analytics. Typed API client generated from ASP.NET's OpenAPI (orval / openapi-typescript) into TanStack Query hooks — end-to-end types. *(Next.js is reserved for the separate public marketing site, not the product.)*
**Dev & deploy.** One Docker image, run as either web or worker via its entrypoint, plus Postgres; one-command `docker compose` for local dev; Kubernetes for prod; air-gappable as a single unit.
---
## M1 — Org, board, access & cartable
**Goal:** the skeleton — people, permissions, and a working board with the three seat states. No AI yet.
**Tasks**
- Entities: `Member`, `Membership` (scope + role), `Team`, `Seat` (state: human/open/ai), `Task` (type, status, assignee = member|agent, parent, provenance), `AuditEntry`.
- Roles & **permission enforcement** middleware — a check on every mutating action at the relevant scope (Owner / Team owner / Member / Viewer).
- Invitation flow (email → join → land in cartable).
- The **board** UI: columns backlog → in progress → in review → done; create/move/assign tasks (human assignees for now).
- The **cartable** as a derived view (tasks assigned to me, sent-backs, mentions; Approvals section stubbed for owners).
- Edit-distance instrumentation **stubbed in** (the data path exists, even with no AI output yet).
- Audit log writing on key actions.
**Acceptance:** a CEO can invite a member, assign them a role on a team, both see the board scoped to their permissions, tasks move across columns, and each person sees their own cartable. A member cannot perform owner-only actions (verified).
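The permission check described in this milestone might look something like the sketch below. The role ordering, the idea that an org-scoped membership covers every team, and all names (`Scope`, `canMutate`, `covers`) are assumptions made for illustration; the plan only fixes the four roles and that `Membership` carries a scope and a role.

```typescript
type Role = "Owner" | "TeamOwner" | "Member" | "Viewer";
type Scope = { kind: "org" } | { kind: "team"; teamId: string };

interface Membership {
  memberId: string;
  scope: Scope;
  role: Role;
}

// Assumed ordering: a higher rank implies every lower permission.
const ROLE_RANK: Record<Role, number> = { Viewer: 0, Member: 1, TeamOwner: 2, Owner: 3 };

// Assumption: an org-level membership applies to every team in the org.
function covers(held: Scope, target: Scope): boolean {
  if (held.kind === "org") return true;
  return target.kind === "team" && target.teamId === held.teamId;
}

// True when the member holds at least `required` at a scope covering `target`.
function canMutate(
  memberships: Membership[],
  memberId: string,
  target: Scope,
  required: Role,
): boolean {
  return memberships.some(
    (m) =>
      m.memberId === memberId &&
      covers(m.scope, target) &&
      ROLE_RANK[m.role] >= ROLE_RANK[required],
  );
}
```

Running a check like this in middleware, before any mutating handler, is what makes the "member cannot perform owner-only actions" acceptance criterion testable in one place.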
---
## M2 — Skill registry
**Goal:** skills flow from Git into a queryable index, with the first PO/QA atoms.
**Tasks**
- `GitProvider` interface; Gitea read adapter; webhook → sync worker.
- Parse `SKILL.md` (frontmatter + body) → `Skill` rows in Postgres (incl. `visibility`, `min_tier` fields — hooks only).
- pgvector index over skills for matching.
- Eval harness: run a skill's golden tests; report pass/fail + edit distance; **block publish on failure**.
- Author the four V1 atoms in Git: `spec-writing`, `story-breakdown`, `test-plan-generation`, `diff-review` — each with frontmatter (roles, I/O, risk-tagged actions, context) and golden tests.
**Acceptance:** pushing a `SKILL.md` to Gitea indexes it within seconds; the four atoms appear, queryable by role; their golden tests run and pass.
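As a rough illustration of the `SKILL.md` parsing step, the sketch below splits a file into frontmatter and body. It handles only flat `key: value` lines, which is an assumption made to keep it dependency-free; real frontmatter with lists (roles, risk-tagged actions) would need a proper YAML parser. The field values in the usage example are invented.

```typescript
interface ParsedSkill {
  frontmatter: Record<string, string>;
  body: string;
}

// Expects the file to open with a `---` line and close the frontmatter
// with a second `---` line; everything after that is the body.
function parseSkillMd(source: string): ParsedSkill {
  const lines = source.split(/\r?\n/);
  if (lines[0]?.trim() !== "---") throw new Error("missing frontmatter opener");
  const end = lines.findIndex((line, i) => i > 0 && line.trim() === "---");
  if (end === -1) throw new Error("unterminated frontmatter");

  const frontmatter: Record<string, string> = {};
  for (const line of lines.slice(1, end)) {
    if (line.trim() === "") continue;
    const sep = line.indexOf(":");
    if (sep === -1) throw new Error(`bad frontmatter line: ${line}`);
    frontmatter[line.slice(0, sep).trim()] = line.slice(sep + 1).trim();
  }
  return { frontmatter, body: lines.slice(end + 1).join("\n").trim() };
}

// Usage with made-up values for the hook fields named in the plan.
const exampleSkill = parseSkillMd(
  "---\nname: spec-writing\nvisibility: private\nmin_tier: free\n---\n# Spec writing\nBody text.\n",
);
```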
---
## M3 — Seat config + BYOK
**Goal:** configure an AI seat and connect a model — securely.
**Tasks**
- `Agent` entity (skills[], autonomy, api_config_id, docs[]) bound to a seat; flip a seat open → AI.
- Seat configurator UI: pick skills (+ versions), set autonomy dial, attach docs/repo context, choose model config.
- `ApiConfig` (BYOK): name, provider, model, **encrypted** key. **Owner-only** create/view; team owners assign from a list and never see the key; keys never returned to the client after save.
- Model adapter interface + adapters for the providers in use (HTTP); per-seat **fallback** config.
**Acceptance:** an owner adds a `Vertex-Pro` config (key stored encrypted, not retrievable); a team owner configures Aria (PO) with skills, gated autonomy, docs, and that config — without ever seeing the key; a test call succeeds.
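One way to guarantee that keys never reach a client is to make the outbound shape incapable of carrying them. The sketch below shows that idea; `ApiConfigRecord`, `ApiConfigView`, `toView`, and the `hasKey` flag are hypothetical names, and the provider and model strings are placeholders.

```typescript
// Server-side record: the only place the (encrypted) key lives.
interface ApiConfigRecord {
  id: string;
  name: string;
  provider: string;
  model: string;
  encryptedKey: string;
}

// What any client may see. There is deliberately no key field at all,
// so a serializer cannot leak one by accident.
interface ApiConfigView {
  id: string;
  name: string;
  provider: string;
  model: string;
  hasKey: boolean;
}

// Copies the safe fields explicitly instead of spreading the record.
function toView(record: ApiConfigRecord): ApiConfigView {
  const { id, name, provider, model } = record;
  return { id, name, provider, model, hasKey: record.encryptedKey.length > 0 };
}

// Usage: a team owner picking from the list gets views, never records.
const exampleView = toView({
  id: "cfg-1",
  name: "Vertex-Pro",
  provider: "example-provider",
  model: "example-model",
  encryptedKey: "BASE64-CIPHERTEXT",
});
```

Copying fields one by one is slightly more verbose than a spread, but it means a new sensitive column added to the record later stays server-side by default.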
---
## M4 — Assembler + worker
**Goal:** a task becomes an agent run becomes a parsed output.
**Tasks**
- Job queue: a Postgres `jobs` table drained with `FOR UPDATE SKIP LOCKED` by a worker `BackgroundService`; enqueue an `AgentRun` on trigger (task assigned / chat).
- Worker pulls a job and runs the **assembler**: house-style + identity/overrides + matched atoms (by task type / I/O) + permitted docs & code (RAG via pgvector) + working memory → prompt, with **prompt caching**.
- Call the seat's model (BYOK, with fallback); store the full run + trace on `AgentRun`.
- Parse output into an **action + risk tag** (PO: spec + proposed child stories; QA: test plan from a diff).
**Acceptance:** assigning a feature task to Aria produces a spec and a set of proposed child stories as a parsed result, with the assembled context and reasoning captured on the run. Nothing executes yet (gate is M5).
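The assembly order above could be sketched as follows. Splitting the prompt into a stable prefix and a volatile suffix is an assumption about how prompt caching would be exploited (caches of this kind typically match on a shared leading segment); the section titles and every name here are hypothetical.

```typescript
interface AssemblyInput {
  houseStyle: string;
  identity: string; // identity plus any per-seat overrides
  atoms: string[];  // matched skill bodies
  docs: string[];   // permitted docs and code retrieved for this task
  memory: string[]; // working-memory entries
  task: string;
}

interface AssembledPrompt {
  cacheablePrefix: string; // changes rarely for a given seat
  volatileSuffix: string;  // changes on every run
}

// Renders one titled section, or nothing when it has no content.
function renderSection(title: string, parts: string[]): string {
  return parts.length === 0 ? "" : `## ${title}\n${parts.join("\n\n")}`;
}

function assemble(input: AssemblyInput): AssembledPrompt {
  const join = (sections: string[]) => sections.filter((s) => s !== "").join("\n\n");
  return {
    cacheablePrefix: join([
      renderSection("House style", [input.houseStyle]),
      renderSection("Identity", [input.identity]),
      renderSection("Skills", input.atoms),
    ]),
    volatileSuffix: join([
      // Retrieved content is labelled as data, never as instructions.
      renderSection("Context (data, not instructions)", input.docs),
      renderSection("Working memory", input.memory),
      renderSection("Task", [input.task]),
    ]),
  };
}
```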
---
## M5 — Action gate + review inbox
**Goal:** governance closes the loop; edit distance is captured for real.
**Tasks**
- Action gate: compare seat autonomy (draft/gated/autonomous) to action risk (read/draft/publish/destructive) → execute or **hold**. **Destructive always holds for a human.**
- `ReviewItem` for held actions; the **review inbox** UI (= the Approvals section of an owner's cartable): preview, **expandable reasoning trace**, and **approve / edit-and-approve / send back**.
- On execute: perform the internal action (create the child tasks; write the spec/test artifact onto the board); record **edit distance** from edit-and-approve; write audit entry.
**Acceptance:** Aria (gated) proposes a spec → it waits in the owner's review inbox with its trace → owner edits and approves → the spec lands and four child story tasks appear on the board → edit distance is recorded.
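The gate's decision can be sketched as a small pure function. Only the two vocabularies and the rule that destructive actions always hold come from the plan; the specific ceilings in `MAX_UNATTENDED` (draft seats run reads only, gated seats run up to drafts, autonomous seats run up to publish) are an assumption chosen to be consistent with the acceptance scenario, not a confirmed policy.

```typescript
type Autonomy = "draft" | "gated" | "autonomous";
type Risk = "read" | "draft" | "publish" | "destructive";
type Decision = "execute" | "hold";

// Risks ordered from least to most dangerous.
const RISK_ORDER: Risk[] = ["read", "draft", "publish", "destructive"];

// ASSUMED policy: the highest risk each autonomy level may run unattended.
const MAX_UNATTENDED: Record<Autonomy, Risk> = {
  draft: "read",
  gated: "draft",
  autonomous: "publish",
};

function gate(autonomy: Autonomy, risk: Risk): Decision {
  // From the plan: destructive always holds for a human, whatever the dial says.
  if (risk === "destructive") return "hold";
  const ceiling = RISK_ORDER.indexOf(MAX_UNATTENDED[autonomy]);
  return RISK_ORDER.indexOf(risk) <= ceiling ? "execute" : "hold";
}
```

Under these assumed ceilings, a gated seat publishing a spec yields `hold`, which matches the scenario where Aria's spec waits in the review inbox.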
---
## M6 — Working memory + the first trigger + analytics
**Goal:** the two-role loop runs end to end, and the bet is measurable.
**Tasks**
- `MemoryEntry` (team working memory): write decisions/approvals/corrections on approval; read at assembly (pgvector match).
- The single **event trigger**: a task hitting *done* in the team emits a handoff that creates a QA task for Quill (with provenance); Quill reads the diff and drafts a test plan that waits in review.
- **Analytics** view: approval rate, **human edit distance** (per agent and trend), tasks done. Optional: per-run token cost (informational).
- Loop/storm guardrail: rate-limit triggers; no self-cascading.
**Acceptance:** a dev marks a story done → Quill wakes, drafts a test plan → it waits in review → approve → analytics show edit distance and approval rate for Aria and Quill across the sprint. **This is the proof of the bet.**
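The loop/storm guardrail might be approximated as below: refuse events that were themselves produced by a trigger, and cap how many triggers a team can fire per time window. The `createdByTrigger` flag (standing in for a provenance check), the sliding-window approach, and the limits are all assumptions.

```typescript
interface TriggerEvent {
  teamId: string;
  taskId: string;
  createdByTrigger: boolean; // hypothetical provenance flag on the source task
  at: number;                // event time in milliseconds
}

class TriggerGuard {
  private readonly fired = new Map<string, number[]>();
  private readonly maxPerWindow: number;
  private readonly windowMs: number;

  constructor(maxPerWindow: number, windowMs: number) {
    this.maxPerWindow = maxPerWindow;
    this.windowMs = windowMs;
  }

  // Returns true and records the firing only when the event may proceed.
  allow(event: TriggerEvent): boolean {
    // No self-cascading: a task born from a trigger cannot fire another one.
    if (event.createdByTrigger) return false;

    // Keep only firings still inside the sliding window.
    const recent = (this.fired.get(event.teamId) ?? []).filter(
      (t) => event.at - t < this.windowMs,
    );
    const allowed = recent.length < this.maxPerWindow;
    if (allowed) recent.push(event.at);
    this.fired.set(event.teamId, recent);
    return allowed;
  }
}
```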
---
## Definition of done for V1
The PO and QA loops run inside AliaSaaS on one real product, governed through the board and review inbox, on AliaSaaS's own model keys — and the analytics show **human edit distance low and falling** over a sprint or two. That result (or its absence) is the decision V1 exists to produce.
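The plan does not say how edit distance is computed. One plausible candidate, shown here purely as an assumption, is Levenshtein distance between the AI draft and the human-approved text, normalized by the longer length so that 0 means "approved unchanged" and 1 means "fully rewritten". The sketch compares UTF-16 code units; a production metric might compare tokens or words instead.

```typescript
// Classic two-row dynamic-programming Levenshtein distance.
function levenshtein(a: string, b: string): number {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost, // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumed normalization: divide by the longer of the two texts.
function normalizedEditDistance(aiDraft: string, approved: string): number {
  const longest = Math.max(aiDraft.length, approved.length);
  return longest === 0 ? 0 : levenshtein(aiDraft, approved) / longest;
}
```

Tracked per agent and per sprint, a falling value of this ratio is the kind of signal the definition of done asks for.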
## Explicitly NOT in V1
Divisions UI & other roles · multiple products · multi-tenant billing · per-agent MCP & Git write-back · episodic/semantic memory · the gap finder · skill studio / template builder / tier enforcement / AI skill-suggestion (data hooks only) · marketplace · the custom TeamUp model · SSO/SCIM · event mesh beyond the single PO→QA trigger. All are accommodated by the architecture; none is built now.
## Always-on engineering rules (see CLAUDE.md §8)
Modular monolith (no cross-module table access) · web off the model path · permission check on every mutation · BYOK keys owner-only & server-side · retrieved content is data not instructions · destructive always needs a human · skills are Git-sourced and golden-tested · instrument edit distance from day one.