ISkillCatalog.GetByKeysAsync now takes the org id and resolves each key within that org's namespace
only — the org's own published skill, else a shared builtin (null org), never another org's. Org-owned
is preferred over the builtin; only Published (golden-tested) skills are injected; the resolved
skill@version is recorded in the prompt heading and run trace. AgentRunExecutor threads
context.OrganizationId. SeatsPage now loads the org library (builtins + authored + installed), dedupes
to one entry per key, and flags drafts (won't run until published).
Verified: ArchitectureTests 8/8, IntegrationTests 48/48 (new SkillRunScopingTests: a run assembles the
org's own skill over the builtin of the same key, and another org's same-key skill never leaks in),
client build green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Orgs can now share skills across the tenant boundary — the next step after the per-org library.
Endpoints (all ManageSkills-gated + audited):
- POST /{key}/publish — list one of your published versions on the marketplace (Visibility→Public;
only a Published/golden-tested skill may be listed). POST /{key}/unpublish reverses it.
- POST /install — copy a publicly-listed skill (by row id) into your org as a private Installed
copy; rejects installing your own skill and duplicate (org+key+version) installs.
- GET /marketplace?organizationId= — other orgs' Authored+Public+Published skills (yours excluded),
each flagged whether that exact (key, version) is already in your library.
- SkillSummary now carries Id (install targets a specific source row). Authored skills default to
private — listing is an explicit publish step, never a side effect of authoring.
UI (Skills page): a Marketplace tab with Install / "In your library"; Publish / Unlist on your own
published skills; a "Listed" badge.
Fixes from the adversarial review (4 confirmed findings, all addressed):
- HIGH — Public⟹Published is now a domain invariant (Skill.Index forces PrivateToOrg whenever the
re-derived status isn't Published), so re-authoring a listed version without golden tests can no
longer leave it Public+Draft or decouple the marketplace gate from the eval gate.
- MEDIUM — install now uses an insert-only indexer path so the (org,key,version) unique index is the
source of truth: a race with a concurrent install/author becomes a clean 409, never an in-place
clobber of an existing row's content/ownership.
- MEDIUM/LOW — AlreadyInLibrary is computed per (key, version) to match the install conflict rule, so
a newer, not-yet-owned version of a key you already hold still shows as installable.
Verified: ArchitectureTests 8/8, IntegrationTests 47/47 (SkillMarketplaceTests: publish gate, own-org
exclusion, cross-org list→install→private copy, duplicate 409, per-version flag, Public⟹Published
invariant, Member 403), client build green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Skills move from a global Git-only registry to a per-company library that orgs author and
version in-app — Git stays as the shared *starter* library.
Domain & persistence:
- Skill gains OrganizationId (null = shared builtin, visible to every org), Origin
(Builtin | Authored | Installed), AuthoredByMemberId. Identity is now
(OrganizationId, SkillKey, Version); the unique index uses NULLS NOT DISTINCT so builtins
stay unique by key+version while each org gets its own namespace (and can fork a builtin).
AddSkillOwnership migration backfills existing rows as Builtin.
- Owned GoldenExample rows are cloned in Skill.Index so a fork can't re-parent the source's
tracked entities.
Authoring (tenant, dynamic):
- POST /api/skills/authored — structured fields → same indexer pipeline (embedding +
publish gate apply identically), tagged org + author. POST /api/skills/{key}/fork copies a
builtin/global skill into your org as an editable Authored draft. List/Get are org-scoped
(your org + shared builtins). New Capability.ManageSkills (Owner + TeamOwner), audited.
- GET /api/skills/marketplace: read-only seam listing public skills across orgs (install is
the next step).
Security (from adversarial review — two confirmed criticals):
- Managing shared builtins is an operator action, not a tenant one. /index (posts arbitrary
content as a global builtin) and /sync (re-indexes the shared library) now require a
platform admin key (X-Skills-Admin-Key, fixed-time compare, fail-closed when unset) via
SkillAdminOptions — previously any authenticated user of any org could inject/poison global
skills. New test asserts an authenticated Owner without the key gets 403 on both.
UI: new /skills library page — browse shared + org skills grouped by key with their versions,
create / new-version / fork, golden-test editor + body, Draft/Published badge and the
publish-gate hint (needs roles + ≥1 golden test).
Verified: ArchitectureTests 8/8, IntegrationTests 46/46 (new SkillLibraryTests: org
isolation, version coexistence, fork, publish gate, Member 403, admin-gate 403), client build
green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The object spine becomes definable (data model was designed-for from day one):
- Division and Product entities (Product carries kind: Product|Service, optional DivisionId);
Team gains nullable ProductId — pre-structure teams keep working. AddDivisionsAndProducts
migration; org-scoped validation; owner-only writes (audited); list endpoints.
- /structure page: define divisions, products/services (with division), teams (under a
product). Org chart now renders the full spine — org → divisions → products → teams →
seats — with parentless layers linking up to the org.
- BYOK custom URL: the SeatsPage model-connection form gains a Base URL field (provider
list: stub/openai/ollama/vllm/custom). Backend already supported it end to end —
ApiConfig.Endpoint flows into the OpenAI-compatible adapter ({base}/v1/chat/completions),
so any OpenAI-compatible gateway or self-hosted model works; the config list shows it.
Verified: ArchitectureTests 8/8, IntegrationTests 45/45 (new OrgStructureTests: spine
creation, kind tags, org-scoped validation 400s, Member 403), client build green.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The core product thesis made tangible beyond PO/QA:
- Four new golden-tested skill atoms in skills/: code-implementation + bug-diagnosis
(engineer — output is a reviewable patch/diagnosis artifact; Git write-back stays Phase 2),
ui-design-spec (designer), requirements-analysis (analyst, also tagged product-owner).
The catalogue now spans five roles with eight atoms.
- Seat configurator: SuggestedSkills — maps the seat's free-text role name to skill role
tags and offers the matching set one click ("Use set"). Any role name → staffed with AI.
- AnyRoleSeatTests: an "Backend Engineer" seat (Edison, gated) runs the same pipeline —
skills assemble, implement-code/Draft parsed, proposal held in the review inbox like any
governed action. SkillSyncTests updated for the larger catalogue.
Verified: IntegrationTests 44/44, client build green.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
UI (daily-drivable now):
- Board: dnd-kit drag-and-drop between columns; click a card → task detail drawer (Sheet)
with status, member assignee picker, send-to-AI-seat dispatch, description/artifact,
parent/children navigation; seat-triad assignee chips (AI indigo monogram / human slate).
- Cartable page (the personal pending slice), Members & invitations page (invite + copy
join token; V1 sends no email), Review inbox now shows a word-level diff of your edits
vs the proposal (lib/diff.ts, LCS), Org chart page (React Flow: org → teams → seats in
the human/open/AI triad). Nav reordered; nothing left "soon".
Accountability & benchmarking:
- Identity: GET /members (directory + org role) and GET /invitations (with join token,
inviter-only) — the directory also resolves names client-side everywhere.
- OrgBoard: work_item_transitions recorded on every status change (AddWorkItemTransitions
migration); GET /performance — per assignee (human and AI on the same scale): pending by
column, done, worked hours (time in InProgress), avg cycle time (start of work → done),
plus the unassigned-pending count. Owner-level capability.
- Performance page: benchmark table merging board metrics with AI trust metrics (approval
rate + edit distance from analytics); flags work with no one accountable.
Verified: build green; ArchitectureTests 8/8; IntegrationTests 43/43 (new: directory,
invitations list + Member 403s, transition-derived worked-hours/cycle-time, unassigned
count); client npm build green (TS strict).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Working memory (Memory module's first real code):
- MemoryEntry (schema "memory", vector(384), InitialMemory migration); TeamMemory implements
the SharedKernel ITeamMemory seam (embed-and-store on write, cosine recall on read);
GET /api/memory/search. HashingTextEmbedder promoted to SharedKernel (pure, deterministic;
swapped for ONNX/BYOK embedders later behind ITextEmbedder).
- Written on approval: Governance's approve stores an Approval/Correction entry per decision.
- Read at assembly: the executor recalls the team's top-3 relevant entries; the prompt gains
a "# Team memory" section (treated as data, not instructions).
The single V1 event trigger:
- IAgentDispatcher (SharedKernel) implemented by Assembler's AgentRunDispatcher (shared by
the API and triggers). OrgBoard's QaHandoffTrigger: a task hitting done creates a QA task
(provenance parent, assigned to the QA agent) and dispatches a run for the team's QA AI
seat. Guardrails: Test/Review tasks never re-trigger (no self-cascade) and a task hands
off at most once. Audited as handoff.triggered.
Analytics — the V1 verdict view:
- IBoardStats (SharedKernel) implemented by OrgBoard; GET /api/governance/analytics returns
approval rate, avg edit distance, per-agent metrics + edit-distance trend, tasks done.
- UI: /analytics — stat cards, per-agent table, recharts edit-distance trend per agent.
Verified: build green; ArchitectureTests 8/8; IntegrationTests 42/42 incl. the M6 acceptance
end to end — a dev marks a story done → Quill wakes via the handoff (QA task with provenance,
assigned to the agent) → drafts a test plan that waits in review → approve records the second
agent's edit distance → analytics show approval rate 100%, avg edit distance > 0, and trends
for BOTH Aria and Quill; memory written on Aria's corrected approval is recalled into her next
prompt; the guardrails hold. Client build green.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The trust centerpiece: /reviews lists held agent actions for the scopes the caller may
approve. Each card shows the agent badge, action kind + risk (destructive flagged red),
an EDITABLE proposed artifact and child-task list (edits feed the edit-distance metric),
an expandable reasoning trace (pretty-printed), and Approve / Send back. Toasts surface
the recorded edit distance. New shadcn-style Textarea; nav gains "Review inbox".
Verified: npm run build green (TS strict, 1893 modules).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A "AI seats" page (shadcn, on the design language): manage BYOK model connections (add +
test; the key is write-only), create seats on a team, and configure an agent per seat — name,
the color-graded autonomy dial (draft slate / gated indigo / auto teal), a model connection,
skill toggles from the registry, and docs. Navigable AppShell sidebar (Board / AI seats).
Verified: client `npm run build` clean (1890 modules, tsc + vite).
A functional React/Vite SPA exercising the M1 API end-to-end:
- Zustand auth store (persisted JWT) + a small fetch client that attaches the bearer
token and logs out on 401.
- LoginPage: sign in, or bootstrap the first owner on first run.
- BoardPage: set org name, create/select a team, create tasks, move them across the
backlog -> in progress -> in review -> done columns, assign to me, and a cartable panel.
- React Router guards routes on the presence of a token.
Mirrors the integration-tested API contracts exactly. Compiles clean (tsc + vite);
still needs a manual click-through (run the web host + Postgres, or `docker compose up
--build`). dnd-kit drag, TanStack Query, and an orval-generated typed client are M1+
polish — buttons/selects drive task moves for now.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Stand up the modular-monolith skeleton per docs/V1_BUILD_PLAN.md: one .NET 10
solution with web + worker hosts sharing seven interface-bounded module projects,
PostgreSQL 17 + pgvector via EF Core 10, a React 19 + Vite SPA built into wwwroot,
and Docker Compose for one-command local dev. Skeleton only — no feature code.
Architecture
- One project per module (OrgBoard, Identity, Skills, Assembler, Governance,
Memory, Integrations); each is its own assembly so non-public types (entities,
DbContext) are invisible across modules at compile time.
- TeamUp.Bootstrap is the only library that references all modules; both hosts
reference only Bootstrap. SharedKernel/Infrastructure never reference modules.
- IModule seam: Register(...) runs in both hosts; MapEndpoints(...) only in web.
- PlatformDbContext owns the pgvector extension + the seven module schemas
(InitialPlatform migration); MigrationRunner applies it then any module context.
- One image, two roles selected by RUN_MODE at the Docker entrypoint.
Verified
- dotnet build green (nullable + warnings-as-errors).
- ArchitectureTests 8/8 — reflection-based boundary rules (no module -> module,
-> Infrastructure, -> Bootstrap, or -> host references).
- IntegrationTests 10/10 — Testcontainers boots the host against real pgvector:
migration applies, vector extension + 7 schemas exist, /health 200, every
/api/<module>/ping 200, /openapi/v1.json served.
- client builds clean (Vite 6 — pinned for Node 22.3.0; Vite 8 needs Node >=22.12).
Packages and base images route through the Nexus mirror (mirror.soroushasadi.com),
reachable from Iran when nuget.org / Docker Hub / MCR are not. CI is intentionally
deferred to a later session.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>