Where deterministic geocoding gives up (neighborhood not in the TehranGeo table),
fall back to the registered AI model: the auditor now also returns approximate
lat/lng for a recognized Tehran neighborhood (folded into the existing single
audit call — no extra requests), and Publish uses it only after the source ad and
the local table, and only when it falls inside greater Tehran (InTehran bbox
guard rejects hallucinated points). Coords order: Divar point → TehranGeo → AI.
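
The coordinate priority and the bounding-box guard can be sketched roughly as below. This is an illustration in Python, not the project's code: the box limits are assumed approximate values (the real InTehran bounds are not given here), and `in_tehran` / `pick_coords` are hypothetical names.

```python
from typing import Optional, Tuple

Coord = Tuple[float, float]  # (lat, lng)

# ASSUMPTION: rough box around greater Tehran; the actual limits may differ.
TEHRAN_BBOX = (35.45, 35.90, 51.05, 51.70)  # lat_min, lat_max, lng_min, lng_max


def in_tehran(point: Coord) -> bool:
    """Return True when the point lies inside the assumed bounding box."""
    lat, lng = point
    lat_min, lat_max, lng_min, lng_max = TEHRAN_BBOX
    return lat_min <= lat <= lat_max and lng_min <= lng <= lng_max


def pick_coords(source: Optional[Coord],
                table: Optional[Coord],
                ai: Optional[Coord]) -> Optional[Coord]:
    """Priority: point from the source ad, then the local table, then the AI."""
    if source is not None:
        return source
    if table is not None:
        return table
    # Only the AI guess is passed through the bbox guard.
    if ai is not None and in_tehran(ai):
        return ai
    return None
```

An AI point outside the box is treated the same as no point at all, so a hallucinated location never reaches the published listing.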
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Qualified live applicants and found three problems, all fixed:
- Duplicate cards: one ad fanned out into «پرستار» (nurse) + «پرستار کودک» (pediatric nurse) for the same person.
Applicants now publish ONE listing (no role fan-out); secondary roles → tags.
- Role sprawl: modifiers were being turned into roles. The prompt now returns the BASE
profession and pushes age-group/ward/seniority to tags; a new role is created only for a
genuinely new base profession (تکنسین داروخانه, "pharmacy technician" ✓; پرستار کودک, "pediatric nurse" ✗).
- Tag/category noise: categories are pinned to the 5 fixed groups (plus سایر, "other";
never invented); BuildTags drops pay/contact/location/fragment words.
Reprocess action: IngestionService.ReprocessAsync re-runs the current pipeline
over every stored RawListing WITHOUT re-fetching (keeps the raw text, so nothing
is lost to sources only exposing recent posts), deleting the old aggregated
posts and republishing cleanly. The admin dashboard button «پردازش مجددِ آیتم‌های
ذخیره‌شده» ("Reprocess stored items") runs it on a background scope; the result lands in the run-log.
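
A minimal sketch of the reprocess flow described above, in Python for illustration only. `Store`, `RawListing` and the `pipeline` callable are hypothetical stand-ins; the real IngestionService.ReprocessAsync is not shown in this log.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class RawListing:
    id: int
    text: str  # raw ad text kept from the original fetch


@dataclass
class Store:
    raw: List[RawListing]
    posts: Dict[int, dict] = field(default_factory=dict)  # aggregated posts by raw id


def reprocess(store: Store, pipeline: Callable[[str], Optional[dict]]) -> int:
    """Re-run the current pipeline over stored raw text, without any fetch."""
    store.posts.clear()  # delete the old aggregated posts first
    published = 0
    for item in store.raw:
        post = pipeline(item.text)  # None means the pipeline rejected the item
        if post is not None:
            store.posts[item.id] = post
            published += 1
    return published
```

Because the loop reads only `store.raw`, an ad that has since disappeared from its source is still republished from the saved text.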
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Unknown roles from the AI are now resolved or created (deduplicated after Persian normalization) instead of being dropped or replaced by a fallback; a new role gets the AI's category and is assigned to the applicant.
- AI output gains category + tags; AI-detected skills/requirements (ICU, MMT, پروانه‌دار ("licensed")…) now fold into the applicant's searchable Tags.
- System prompt is hardcoded in AppSetting.DefaultPrompt and used directly by the auditor; admin sees it read-only (cannot edit/break it).
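
One way the resolve-or-create step with Persian-normalized dedupe could look, sketched in Python. The folding rules (Arabic yeh/kaf to Persian forms; ZWNJ, tatweel, diacritics and spaces ignored in the key) are assumptions about what "Persian-normalized" covers, and `normalize_fa` / `resolve_or_create` are hypothetical names.

```python
import unicodedata

# ASSUMED folding table: Arabic yeh / alef maksura -> Persian yeh,
# Arabic kaf -> Persian kaf, ZWNJ and tatweel removed.
_FOLD = str.maketrans({
    "\u064a": "\u06cc",
    "\u0649": "\u06cc",
    "\u0643": "\u06a9",
    "\u200c": "",
    "\u0640": "",
})


def normalize_fa(text: str) -> str:
    """Build a dedupe key: fold letter variants, drop marks and whitespace."""
    text = unicodedata.normalize("NFKC", text).translate(_FOLD)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return "".join(text.split()).lower()


def resolve_or_create(roles: dict, name: str, category: str) -> dict:
    """Return the existing role for this name, or create it with the AI's category."""
    key = normalize_fa(name)
    role = roles.get(key)
    if role is None:
        role = {"name": " ".join(name.split()), "category": category}
        roles[key] = role
    return role
```

Keying on the normalized form means two spellings of the same role map to a single entry, and the first category seen is the one kept.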
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The Test-AI button called AuditAsync, which caught every exception and returned
null, and used EnsureSuccessStatusCode() (discarding the response body). So a
failing AI service only ever produced a generic 'no response' message with no
detail — impossible to diagnose.
- Add IAiAuditor.TestAsync: runs the real call and returns a detailed Persian
diagnostic — HTTP status + response body on non-2xx, raw body when the shape
isn't OpenAI-compatible, and network/proxy/timeout specifics on exceptions.
- AuditAsync now logs the actual HTTP status + response body (and proxy state)
instead of a bare warning, so server logs show why a call failed.
- ExtractContent / ParseVerdict no longer throw on unexpected JSON; they return
null so the caller can show the raw body.
- Settings 'Test AI' button uses TestAsync; result box renders multi-line and
switches to alert-error styling when the test fails.
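
The diagnostic behaviour above can be illustrated with a small Python sketch. It is not the project's TestAsync: `send` is a hypothetical callable returning `(status, body)`, the messages are in English rather than Persian, and only the OpenAI-style `choices[0].message.content` shape is checked.

```python
import json
from typing import Callable, Optional, Tuple


def extract_content(body: str) -> Optional[str]:
    """Return the assistant text, or None on any unexpected JSON (never raises)."""
    try:
        content = json.loads(body)["choices"][0]["message"]["content"]
    except (ValueError, KeyError, IndexError, TypeError):
        return None
    return content if isinstance(content, str) else None


def run_diagnostic(send: Callable[[], Tuple[int, str]]) -> Tuple[bool, str]:
    """Run the real call and report why it failed instead of swallowing it."""
    try:
        status, body = send()
    except TimeoutError as exc:  # must precede OSError (it is a subclass)
        return False, f"timeout: {exc}"
    except OSError as exc:
        return False, f"network/proxy error: {exc}"
    if not 200 <= status < 300:
        return False, f"HTTP {status}\n{body}"  # keep the response body
    content = extract_content(body)
    if content is None:
        return False, f"unexpected response shape\n{body}"
    return True, content
```

The multi-line failure message is what makes a multi-line result box useful: the first line names the failure class and the remainder is the raw body.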
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
AI (when enabled, now that the server proxy is up):
- AiStructured gains phone, personName, yearsExperience, isLicensed.
- The auditor appends an authoritative output-schema to the admin prompt
so classification stays correct even with an older stored prompt — it
now classifies kind as shift|job|talent and extracts the contact phone
and talent details.
- Ingestion publish prefers the AI's tags (kind/role/city/facility/phone +
talent fields) over the heuristic parser when present.
- Default prompt updated to describe the three kinds + new fields.
Phone extraction from websites (Medjobs / generic sites), where the
number sits behind a "تماس با این آگهی" ("contact this ad") reveal:
- HtmlUtil.HarvestPhones scans the full markup for tel: links, JSON-LD
"telephone", data-*phone* attributes, and inline Iranian mobile/landline
numbers (Persian digits folded), normalized (mobiles 09…, landlines 0…).
- Medjobs + Website sources append harvested numbers to the ad text so the
parser/AI capture them; manual review then prefills the phone too.
- Parser phone extraction now also captures a landline as a fallback.
Note: if a site loads the number purely via XHR (not in HTML), a
per-source reveal endpoint would be a follow-up.
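
A rough Python sketch of the harvesting idea: fold Persian/Arabic digits, then scan the whole markup for Iranian numbers. The regexes are simplified assumptions (contiguous digits only, no separators, 11-digit landlines with area code) and `harvest_phones` here is a stand-in, not the actual HtmlUtil.HarvestPhones.

```python
import re
from typing import List

# Map Persian (U+06F0..) and Arabic-Indic (U+0660..) digits to ASCII.
_DIGITS = {0x06F0 + i: str(i) for i in range(10)}
_DIGITS.update({0x0660 + i: str(i) for i in range(10)})

# ASSUMED patterns: mobile = optional +98/0098/0 prefix, then 9 and nine digits;
# landline = 0, area code starting 1-8, eleven digits in total.
_MOBILE = re.compile(r"(?<!\d)(?:\+98|0098|0)?(9\d{9})(?!\d)")
_LANDLINE = re.compile(r"(?<!\d)0([1-8]\d{9})(?!\d)")


def harvest_phones(html: str) -> List[str]:
    """Scan full markup (tel: links, JSON-LD, inline text) and normalize to 0…"""
    text = html.translate(_DIGITS)
    found = ["0" + m.group(1) for m in _MOBILE.finditer(text)]
    found += ["0" + m.group(1) for m in _LANDLINE.finditer(text)]
    return list(dict.fromkeys(found))  # de-duplicate, keep first-seen order
```

Scanning the raw markup rather than the visible text is what lets a number hidden in an attribute or script block be picked up; it still finds nothing when the number only arrives via XHR.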
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add «تست اتصال VPN/پروکسی» ("Test VPN/proxy connection": reaches a filtered site through the proxy and reports connected/latency) and «تست هوش مصنوعی» ("Test AI": sends a sample post through the configured model and shows the verdict + extracted fields) to admin Settings. Fix OpenAiCompatibleAuditor.ParseVerdict: TryGetInt32/64 threw on null/string JSON values (the model commonly returns payAmount/sharePercent as null), which silently failed every audit; the calls are now guarded on ValueKind==Number. Verified with the real OpenAI key that the sample post is extracted correctly (approve / role=پرستار / city=تهران / shift=night).
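
The same guard, shown as a language-neutral Python sketch rather than the actual C# fix: read a numeric field only when the JSON value really is a number, and return nothing otherwise. `int_or_none` is a hypothetical helper.

```python
import json
from typing import Optional


def int_or_none(obj: dict, key: str) -> Optional[int]:
    """Return an int only for a JSON number with an integral value."""
    value = obj.get(key)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None  # null, strings, objects, arrays, missing keys
    if isinstance(value, float) and not value.is_integer():
        return None  # fractional numbers are not valid integers here
    return int(value)


verdict = json.loads('{"payAmount": null, "sharePercent": "30", "yearsExperience": 4}')
pay = int_or_none(verdict, "payAmount")          # null is skipped, no exception
share = int_or_none(verdict, "sharePercent")     # string is skipped too
years = int_or_none(verdict, "yearsExperience")  # real number is read
```

Checking the value's kind before converting keeps one odd field from discarding the whole verdict.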
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add AiUseProxy setting + a toggle in the AI settings section. ScrapeHttpClients.ForAi(settings) returns a proxied HttpClient (reusing IngestProxyUrl, 100s timeout) when AiUseProxy is on, otherwise direct; AI-cache keys are protected from the scrape-client cleanup. OpenAiCompatibleAuditor now uses it, so the AI auditor (e.g. api.openai.com) is reachable through the same Xray sidecar that serves Telegram. Migration adds the column.
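
The proxied-or-direct choice can be sketched in Python with the standard library. This is only an illustration of the toggle: the `Settings` fields mirror AiUseProxy and IngestProxyUrl, the helper names are hypothetical, and cache handling is left out.

```python
import urllib.request
from dataclasses import dataclass
from typing import Dict

AI_TIMEOUT_SECONDS = 100  # the timeout quoted for the proxied AI client


@dataclass
class Settings:
    ai_use_proxy: bool      # mirrors AiUseProxy
    ingest_proxy_url: str   # mirrors IngestProxyUrl


def ai_proxies(settings: Settings) -> Dict[str, str]:
    """Proxy map for AI calls: the ingest proxy when enabled, otherwise none."""
    if settings.ai_use_proxy and settings.ingest_proxy_url:
        return {"http": settings.ingest_proxy_url,
                "https": settings.ingest_proxy_url}
    return {}  # an empty map makes ProxyHandler bypass all proxies


def opener_for_ai(settings: Settings) -> urllib.request.OpenerDirector:
    """Build an opener that goes through the proxy only when the toggle is on."""
    return urllib.request.build_opener(
        urllib.request.ProxyHandler(ai_proxies(settings)))
```

A caller would then use something like `opener_for_ai(settings).open(request, timeout=AI_TIMEOUT_SECONDS)`, so the same proxy URL serves both scraping and AI traffic.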
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>