feat: V2 microservices stack — backend services, gateway, JWT auth

Add full V2 architecture: identity, content, studio (.NET 10) and file,
render, notification, gateway (Go) services with vendored deps, plus DB
migrations, event/API contracts, and an init-db script.

Wire the Next.js frontend to the gateway: server-side JWT auth routes
(login/register/refresh/logout/me), gateway fetch helper, and session/
cookie/jwt helpers under src/lib.

Containerize the stack via docker-compose.v2.yml and per-service
Dockerfiles. Base images resolve through a Nexus mirror (Docker Hub) and
MCR directly; npm/NuGet pull from Nexus groups. Self-host fonts via
next/font/local to avoid Google Fonts (geo-blocked).

Add CI workflow and ignore .env.v2, *.stackdump, and .NET bin/obj.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
soroush.asadi
2026-05-29 23:29:31 +03:30
parent 53ea78a00d
commit 90ac0b81d1
7636 changed files with 3707504 additions and 240 deletions
@@ -0,0 +1,199 @@
# WebSocket Protocol — Render Progress
Live render progress is pushed to the browser via WebSocket.
## Connection
```
URL: wss://api.flatrender.ir/ws/v1/render/{job_id}?token={jwt}
Headers: Sec-WebSocket-Protocol: flatrender.v1
```
The JWT carries `user_id` and `tenant_id`. The Gateway validates that the
caller owns the `job_id`; if not, it closes the connection with code `4403`.
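A minimal sketch of how a client might build the connection URL described above. `buildRenderSocketUrl`, `RENDER_SUBPROTOCOL`, and `isOwnershipRejection` are hypothetical names used for illustration; only the URL shape, the subprotocol string, and the `4403` code come from this document.

```typescript
// Hypothetical helper names; the URL shape and subprotocol are taken from the spec above.
const RENDER_SUBPROTOCOL = "flatrender.v1";

function buildRenderSocketUrl(jobId: string, token: string): string {
  // Encode both values so unexpected characters cannot break the path or query string.
  const path = `/ws/v1/render/${encodeURIComponent(jobId)}`;
  return `wss://api.flatrender.ir${path}?token=${encodeURIComponent(token)}`;
}

// True when the Gateway closed the socket because the caller does not own the job.
function isOwnershipRejection(closeCode: number): boolean {
  return closeCode === 4403;
}
```

In a browser this would typically be used as `new WebSocket(buildRenderSocketUrl(jobId, jwt), RENDER_SUBPROTOCOL)`; passing the second argument makes the browser send the `Sec-WebSocket-Protocol` header.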
## Server → Client messages
All server messages are JSON objects with a `type` discriminator.
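Because every server message carries `type`, a client can validate and route incoming frames on that one field. The following is a sketch under that assumption; `parseServerMessage` and `SERVER_MESSAGE_TYPES` are hypothetical names, and the list of types simply mirrors the message headings in this document.

```typescript
// Message type names as listed in this document; the helper itself is illustrative.
const SERVER_MESSAGE_TYPES = [
  "hello", "progress", "step_change", "frame_repair", "node_event",
  "done", "failed", "cancelled", "error", "ping",
] as const;

type ServerMessageType = (typeof SERVER_MESSAGE_TYPES)[number];

interface ServerMessage {
  type: ServerMessageType;
  [field: string]: unknown;
}

// Parses one text frame; returns null for invalid JSON or an unknown `type`.
function parseServerMessage(raw: string): ServerMessage | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch (err) {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;
  const type = (value as { type?: unknown }).type;
  const known =
    typeof type === "string" &&
    (SERVER_MESSAGE_TYPES as readonly string[]).indexOf(type) !== -1;
  return known ? (value as ServerMessage) : null;
}
```

Returning `null` instead of throwing lets the caller decide whether an unrecognized frame should be logged or ignored.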
### `hello` — sent on connect
```json
{
  "type": "hello",
  "job_id": "uuid",
  "current_state": {
    "step": "Rendering",
    "progress": 47,
    "current_frame": 470,
    "total_frames": 720,
    "started_at": "2026-05-27T10:00:00Z"
  }
}
```
### `progress` — frequent (max every 500ms)
```json
{
  "type": "progress",
  "job_id": "uuid",
  "step": "Rendering",
  "progress": 67,
  "current_frame": 482,
  "total_frames": 720,
  "eta_seconds": 45,
  "preview_b64": "data:image/jpeg;base64,/9j/4AAQ...",
  "active_nodes": 3,
  "message": null
}
```
`preview_b64` is sent at most once every 2 s. It holds a thumbnail of the
last rendered frame (~5-15 KB), encoded as a `data:` URI.
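The two intervals (500 ms for `progress`, 2 s for `preview_b64`) can be illustrated with a small throttle. This is only a sketch of one way a sender could enforce them; how the Gateway actually does it is not described here, and `decideSend`, `ThrottleState`, and `SendDecision` are hypothetical names.

```typescript
// Intervals from the spec: progress at most every 500 ms, preview at most every 2 s.
const PROGRESS_INTERVAL_MS = 500;
const PREVIEW_INTERVAL_MS = 2000;

interface ThrottleState {
  lastProgressAt: number; // ms timestamp of the last progress message sent
  lastPreviewAt: number;  // ms timestamp of the last message that carried a preview
}

interface SendDecision {
  sendProgress: boolean;
  includePreview: boolean;
}

// Decides whether a progress message may go out now and whether it may carry a preview.
// Mutates `state` to record what was sent.
function decideSend(state: ThrottleState, nowMs: number): SendDecision {
  const sendProgress = nowMs - state.lastProgressAt >= PROGRESS_INTERVAL_MS;
  const includePreview =
    sendProgress && nowMs - state.lastPreviewAt >= PREVIEW_INTERVAL_MS;
  if (sendProgress) state.lastProgressAt = nowMs;
  if (includePreview) state.lastPreviewAt = nowMs;
  return { sendProgress, includePreview };
}
```

A preview is only ever attached to a message that is being sent anyway, so the preview limit can never cause extra `progress` traffic.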
### `step_change` — pipeline transition
```json
{
  "type": "step_change",
  "job_id": "uuid",
  "from_step": "Rendering",
  "to_step": "Validating",
  "at": "2026-05-27T10:05:00Z"
}
```
### `frame_repair` — frames being re-rendered
```json
{
  "type": "frame_repair",
  "job_id": "uuid",
  "missing_frames": [347, 348, 521],
  "corrupt_frames": [402],
  "attempt": 1,
  "max_attempts": 3
}
```
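As an illustration of how a UI might surface a `frame_repair` message, the sketch below turns it into a one-line status string. The `FrameRepair` interface mirrors the example above; `describeFrameRepair` and the wording of the status text are assumptions, not part of the protocol.

```typescript
// Field names mirror the `frame_repair` example; the helper is hypothetical.
interface FrameRepair {
  type: "frame_repair";
  job_id: string;
  missing_frames: number[];
  corrupt_frames: number[];
  attempt: number;
  max_attempts: number;
}

function describeFrameRepair(msg: FrameRepair): string {
  // A frame listed as both missing and corrupt is counted once.
  const frames = new Set<number>(msg.missing_frames.concat(msg.corrupt_frames));
  return `Re-rendering ${frames.size} frame(s), attempt ${msg.attempt} of ${msg.max_attempts}`;
}
```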
### `node_event` — node-related notice
```json
{
  "type": "node_event",
  "job_id": "uuid",
  "event": "node_crashed",
  "node_id": "uuid",
  "auto_recovered": true,
  "message": "AE crashed on node-7, work reassigned"
}
```
### `done` — terminal success
```json
{
  "type": "done",
  "job_id": "uuid",
  "export_id": "uuid",
  "output_url": "https://cdn.flatrender.ir/exports/abc.mp4",
  "thumbnail_url": "https://cdn.flatrender.ir/exports/abc.jpg",
  "duration_sec": 30,
  "size_bytes": 14523456,
  "compute_seconds": 124
}
```
### `failed` — terminal failure
```json
{
  "type": "failed",
  "job_id": "uuid",
  "failed_at_step": "Rendering",
  "error_message": "AE crashed too many times on this template",
  "error_code": "AE_REPEATED_CRASH",
  "refund_issued": true,
  "trace_id": "uuid"
}
```
### `cancelled` — terminal cancellation
```json
{
  "type": "cancelled",
  "job_id": "uuid",
  "cancelled_at": "2026-05-27T10:03:12Z",
  "progress_when_cancelled": 42
}
```
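The three terminal messages (`done`, `failed`, `cancelled`) suggest a simple status machine on the client. The sketch below assumes that once a terminal message has arrived, any later message for the same job is ignored; that rule, and the names `RenderStatus` and `nextStatus`, are assumptions rather than something this document specifies.

```typescript
type RenderStatus = "running" | "done" | "failed" | "cancelled";

// Maps a server message `type` to the status the UI should show afterwards.
function nextStatus(current: RenderStatus, messageType: string): RenderStatus {
  if (current !== "running") return current; // assumed: terminal states are final
  switch (messageType) {
    case "done":
      return "done";
    case "failed":
      return "failed";
    case "cancelled":
      return "cancelled";
    default:
      return "running"; // progress, step_change, etc. keep the job running
  }
}
```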
### `error` — protocol-level error (connection stays open)
```json
{
  "type": "error",
  "code": "RATE_LIMIT",
  "message": "Too many messages; slow down"
}
```
### `ping` — keepalive
```json
{ "type": "ping", "t": 1714294800 }
```
The client SHOULD respond with `{"type":"pong","t":1714294800}` (echoing the
same `t`) within 30 s, or the server may close the connection.
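The keepalive exchange can be sketched as a pure function that builds the reply for a given `ping`. `pongFor` and `repliedInTime` are hypothetical helper names; the 30-second window is the one stated above.

```typescript
interface PingMessage { type: "ping"; t: number }
interface PongMessage { type: "pong"; t: number }

// Builds the reply for a ping, echoing its `t` value unchanged.
function pongFor(ping: PingMessage): PongMessage {
  return { type: "pong", t: ping.t };
}

// Whether a pong sent at `sentAtMs` still falls inside the 30 s reply window.
function repliedInTime(pingReceivedAtMs: number, sentAtMs: number): boolean {
  return sentAtMs - pingReceivedAtMs <= 30000;
}
```

A client would typically call `socket.send(JSON.stringify(pongFor(ping)))` as soon as the `ping` is parsed, which keeps it well inside the window.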
## Client → Server messages
### `pong`
```json
{ "type": "pong", "t": 1714294800 }
```
### `subscribe_snapshot` — also receive scene snapshot updates over the same socket
```json
{ "type": "subscribe_snapshot", "snapshot_id": "uuid" }
```
### `cancel_job` — request cancellation
```json
{ "type": "cancel_job", "job_id": "uuid" }
```
The server replies with a `cancelled` message once the cancellation is accepted.
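The three client messages can be modelled as one union type, which lets the compiler reject malformed payloads before they are sent. This is a sketch only: `ClientMessage`, `encodeClientMessage`, and `isCancelAck` are hypothetical names, and matching the acknowledgement on `job_id` is an assumption based on the `cancelled` example.

```typescript
type ClientMessage =
  | { type: "pong"; t: number }
  | { type: "subscribe_snapshot"; snapshot_id: string }
  | { type: "cancel_job"; job_id: string };

// Serializes a client message into the JSON text frame to send.
function encodeClientMessage(msg: ClientMessage): string {
  return JSON.stringify(msg);
}

// True once the server's `cancelled` message for this job has arrived.
function isCancelAck(msg: { type: string; job_id?: string }, jobId: string): boolean {
  return msg.type === "cancelled" && msg.job_id === jobId;
}
```

Until `isCancelAck` returns true the job should be treated as still running, since the server only confirms cancellation once it has accepted the request.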
## Close codes
| Code | Meaning |
|-------|------------------------------------------|
| 1000 | Normal close (job completed/failed) |
| 1001 | Server going away (deploy) |
| 1008 | Policy violation (bad message) |
| 4401 | Unauthorized (bad/expired JWT) |
| 4403 | Forbidden (don't own this job) |
| 4404 | Job not found |
| 4429 | Rate limited |
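One possible client policy for these codes is sketched below. The mapping is an assumption for illustration: the table defines what each code means, not what the client must do, and `actionForCloseCode` and `CloseAction` are hypothetical names.

```typescript
type CloseAction = "stop" | "reconnect" | "reauthenticate";

// Assumed policy: which follow-up a client takes for each close code in the table.
function actionForCloseCode(code: number): CloseAction {
  switch (code) {
    case 1001: // server going away (deploy)
    case 4429: // rate limited; assumed retryable after backing off
      return "reconnect";
    case 4401: // bad or expired JWT; refresh the token before retrying
      return "reauthenticate";
    case 1000: // normal close
    case 1008: // policy violation
    case 4403: // caller does not own the job
    case 4404: // job not found
      return "stop";
    default:
      return "reconnect"; // assumption: unlisted closes (e.g. 1006) are transient
  }
}
```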
## Reconnect strategy (client)
- Reconnect with exponential backoff (1s, 2s, 4s, 8s, max 30s)
- On reconnect, `hello` carries `current_state` so the UI can catch up
- WebSocket is best-effort; the UI should also poll `GET /v1/renders/{id}`
if it hasn't received a `progress` message in more than 15s
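The backoff schedule and the polling fallback above can be sketched as two small pure functions. `reconnectDelayMs` and `shouldPoll` are hypothetical names, and the sketch assumes the delay keeps doubling (16 s after 8 s) until it reaches the 30 s cap.

```typescript
// Delay before reconnect attempt `attempt` (0-based): 1 s, 2 s, 4 s, 8 s, ... capped at 30 s.
function reconnectDelayMs(attempt: number): number {
  return Math.min(1000 * Math.pow(2, attempt), 30000);
}

// Whether the UI should fall back to polling GET /v1/renders/{id}:
// true when no `progress` message has arrived for more than 15 s.
function shouldPoll(lastProgressAtMs: number, nowMs: number): boolean {
  return nowMs - lastProgressAtMs > 15000;
}
```

Resetting the attempt counter to 0 after a successful `hello` keeps a later disconnect from starting at the maximum delay.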
## Rate limits
| Direction | Limit |
|-----------|-------|
| Server → Client `progress` | 2 per second |
| Server → Client total messages | 10 per second |
| Client → Server | 5 per second |
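A client can stay under its 5-per-second limit with a local limiter in front of `send`. The sliding-window approach below is one option chosen for illustration; this document does not say how the server measures the rate, and `SendLimiter` is a hypothetical name.

```typescript
// Sliding-window limiter: allows at most `limit` sends in any `windowMs` span.
class SendLimiter {
  private sentAt: number[] = [];
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the send if it fits in the window, false otherwise.
  tryAcquire(nowMs: number): boolean {
    // Drop timestamps that have left the window.
    this.sentAt = this.sentAt.filter((t) => nowMs - t < this.windowMs);
    if (this.sentAt.length >= this.limit) return false;
    this.sentAt.push(nowMs);
    return true;
  }
}
```

For the Client → Server row this would be created as `new SendLimiter(5, 1000)` and consulted with `tryAcquire(Date.now())` before each send.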