Add scrape/ingestion engine + validation, and 24h shift hour-range visualization

Scrape engine (Services/Scraping/): pluggable IListingSource (working sample + Telegram/Divar credential-ready stubs) → IngestionService (content-hash dedupe → parse → validate → review queue) → ListingValidator (completeness score + spam screen) → IngestionWorker (config-gated hosted service). RawListing gains ContentHash/Confidence/ValidationNotes; RawListingStatus.Flagged. Admin /Admin gets run-now, source list, confidence + flagged queue.

Hour-range viz: _HourBar 24h timeline bar (colored by type, overnight wrap) on shift cards, recommendation cards, and detail.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
soroush.asadi
2026-06-03 08:18:19 +03:30
parent 69fa921fbd
commit 931b7b6ffb
24 changed files with 1439 additions and 26 deletions
+15 -5
@@ -75,11 +75,21 @@ Shifts support fixed (مقطوع), hourly (ساعتی), **profit-share (درصد
centralizes the display; `Shift.SharePercent` holds the percentage; the listing-parser detects
"۵۰٪ / درصد / سهم" from raw posts; and `/Shifts` has a "سهم درآمد" filter.
### Listing parser (Stage 1)
`IListingParser` / `HeuristicListingParser` extracts kind (shift vs hire), role, shift type,
employment type, pay, city/district, and phone from a raw Persian post via keyword + regex
heuristics — **no AI dependency** (LLM APIs are blocked from Iran). Admin reviews the prefilled
form and publishes. Swap in an `LlmListingParser` later behind the same interface.
### Scrape / ingestion engine
Pluggable `IListingSource`s (working `SampleListingSource`; credential-ready `Telegram`/`Divar`
stubs) → `IngestionService` **dedupes by content hash → parses → validates → enqueues** as
`RawListing` (status New / Flagged / Discarded-spam) with a confidence score. `ListingValidator`
scores completeness (role, location, pay, phone, length) and screens spam. `IngestionWorker`
(hosted, config-gated `Ingestion:Enabled`) runs it on a timer; admins can also run it on demand
from `/Admin`. `IListingParser` / `HeuristicListingParser` does the field extraction (kind, role,
shift type, employment, pay, **profit-share %**, city/district, phone) — **no AI dependency** (LLM
APIs are blocked from Iran). Admin reviews the prefilled form and publishes. Swap in an
`LlmListingParser` or real sources behind the same interfaces later.
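
The dedupe → parse → validate → enqueue order described above can be sketched roughly as follows. This is an illustrative Python sketch, not the project's C# code: the thresholds, the spam markers, the rule that low confidence maps to Flagged, and the `parse` callback are all assumptions made for the example.

```python
import hashlib
import re
from dataclasses import dataclass, field

# Hypothetical tuning values; the real thresholds and spam rules are not documented here.
MIN_LENGTH = 40
FLAG_BELOW = 0.6
SPAM_MARKERS = ("casino", "giveaway")
REQUIRED_FIELDS = ("role", "location", "pay", "phone")


def content_hash(text: str) -> str:
    """Hash whitespace-normalized text so trivially reformatted reposts collide."""
    normalized = re.sub(r"\s+", " ", text).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


@dataclass
class RawListing:
    text: str
    content_hash: str
    confidence: float
    status: str  # "New", "Flagged", or "Discarded"
    validation_notes: list = field(default_factory=list)


def validate(fields: dict, text: str):
    """Return (completeness score in [0, 1], notes, is_spam)."""
    notes = [f"missing {name}" for name in REQUIRED_FIELDS if not fields.get(name)]
    if len(text) < MIN_LENGTH:
        notes.append("too short")
    total_checks = len(REQUIRED_FIELDS) + 1  # four fields plus the length check
    confidence = (total_checks - len(notes)) / total_checks
    is_spam = any(marker in text.lower() for marker in SPAM_MARKERS)
    return confidence, notes, is_spam


def ingest(posts, seen_hashes: set, parse) -> list:
    """Dedupe by content hash, then parse, validate, and enqueue each new post."""
    queue = []
    for text in posts:
        digest = content_hash(text)
        if digest in seen_hashes:
            continue  # duplicate content: skip before any parsing work
        seen_hashes.add(digest)
        fields = parse(text)  # stand-in for the heuristic parser
        confidence, notes, is_spam = validate(fields, text)
        if is_spam:
            status = "Discarded"
        elif confidence < FLAG_BELOW:
            status = "Flagged"  # assumed rule: low completeness goes to the flagged queue
        else:
            status = "New"
        queue.append(RawListing(text, digest, confidence, status, notes))
    return queue
```

Hashing before parsing keeps repeat runs cheap: a post that was already seen is skipped without touching the parser or validator.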
### Hour-range visualization
Every shift card, recommendation card, and detail page shows a **24-hour timeline bar**
(`_HourBar`) with the shift's hours filled and colored by type; overnight shifts wrap past
midnight into two segments.
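
The overnight wrap can be illustrated with a small sketch that turns a shift's start and end hours into bar segments. This is a Python illustration rather than the actual `_HourBar` partial: the function name, the percentage-based output, and the treatment of equal start and end hours as a full 24-hour shift are assumptions.

```python
def hour_bar_segments(start_hour: float, end_hour: float) -> list[tuple[float, float]]:
    """Return (left %, width %) segments for a shift on a 24-hour bar."""

    def segment(begin: float, end: float) -> tuple[float, float]:
        return (begin / 24 * 100, (end - begin) / 24 * 100)

    if end_hour > start_hour:
        # Same-day shift: one contiguous segment.
        return [segment(start_hour, end_hour)]

    # Overnight shift (or, by assumption, a full day when start == end):
    # fill from the start to midnight, then from midnight to the end.
    segments = [segment(start_hour, 24)]
    if end_hour > 0:
        segments.append(segment(0, end_hour))
    return segments


print(hour_bar_segments(6, 12))   # [(25.0, 25.0)]
print(hour_bar_segments(18, 6))   # [(75.0, 25.0), (0.0, 25.0)]
```

Each tuple maps directly to a positioned, colored block inside the bar, so an 18:00 to 06:00 shift renders as one block at the right edge and one at the left.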
### Auth
Phone OTP via `OtpService` (in-memory codes; dev shows the code on screen — wire Kavenegar/SMS.ir