- HarvestPhones was run over the whole page, so Medjobs' own header/footer
number (09101016110) was appended to every ad. Now harvest only the ad's
description region in Medjobs + Website sources; the protected number
still comes from the reveal call. No more duplicate number across ads.
- The amount extractor read phone digits as a Toman price
(۹,۱۰۱,۰۱۶,۱۱۰ تومان). The parser now strips «شماره تماس…» lines and
mobile/landline numbers before extracting money, and only accepts 6–10
digit numbers with no leading zero (phones/ids start with 0 or are 11+).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Harvest now keeps each post's token, so we build a real post URL
(divar.ir/v/{token}) instead of a generic link.
- For each post we fetch the detail JSON (posts-v2/web/{token}) and
harvest any contact number from it — covering the very common case
where the poster writes the phone into the ad description. Divar's
click-to-reveal is login-gated, so this gets the in-text numbers
without auth; fails soft (blocking/errors → skip).
- HarvestPhones hardened with digit-boundary guards so it can't grab a
slice of a longer numeric id/timestamp inside JSON.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The contact phone on medjobs.ir is loaded by JS only after clicking
«تماس با این آگهی» — it isn't in the page HTML, so scanning the markup
found nothing. We now replay that exact reveal request server-side:
- POST https://medjobs.ir/wp-admin/admin-ajax.php with
action=isatis_protect_contact & id=<listingId> (no nonce needed),
then harvest the tel: numbers from the returned HTML table.
- Listing id is pulled from the page via the WP shortlink (?p=ID),
postid-/data-id, or the visible «کد آگهی» as a fallback.
- Numbers are appended to the ad text so the parser/AI capture them and
they reach the published listing. Wrapped in try/catch so a failed
reveal never breaks ingestion; uses the same (proxy-aware, brotli-
decompressing) client as the page fetch.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
AI (when enabled, now that the server proxy is up):
- AiStructured gains phone, personName, yearsExperience, isLicensed.
- The auditor appends an authoritative output-schema to the admin prompt
so classification stays correct even with an older stored prompt — it
now classifies kind as shift|job|talent and extracts the contact phone
and talent details.
- Ingestion publish prefers the AI's tags (kind/role/city/facility/phone +
talent fields) over the heuristic parser when present.
- Default prompt updated to describe the three kinds + new fields.
Phone extraction from websites (Medjobs / generic sites), where the
number sits behind a "تماس با این آگهی" reveal:
- HtmlUtil.HarvestPhones scans the full markup for tel: links, JSON-LD
"telephone", data-*phone* attributes, and inline Iranian mobile/landline
numbers (Persian digits folded), normalized (mobiles 09…, landlines 0…).
- Medjobs + Website sources append harvested numbers to the ad text so the
parser/AI capture them; manual review then prefills the phone too.
- Parser phone extraction now also captures a landline as a fallback.
Note: if a site loads the number purely via XHR (not in HTML), a
per-source reveal endpoint would be a follow-up.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds a third listing kind alongside Shift/Job for healthcare staff who
advertise their own availability (very common in Iranian medical
channels, e.g. "دندانپزشک آماده همکاری… ۵۰٪ تسویه"). These have no
facility; the contact phone is the key field.
- Model: TalentListing (role, person name, years, licensed, city/district,
area note, availability, gender, comp, phone) + ListingKind.Talent +
RawListing.LinkedTalentId + DbSet/relations/indexes + EF migration.
- Parser: detect آمادهبهکار/جویای کار → Kind=Talent; extract person name,
years of experience, licensed flag, area («منطقه ۱»), phone. Facility
name extraction now skipped for talent.
- Validator: talent path scores role + phone + medical (no facility/pay
required).
- Ingestion auto-publish: creates a TalentListing for talent kind.
- Review (manual publish): Talent option + talent fields; publishes a
TalentListing without a facility. Shift/Job facility now falls back to a
shared «نامشخص / ثبت نشده» record when the ad names none — publishing
never fails on a missing facility.
- Browse /Talent (indexable, filters: city/district/role/gender),
details /Talent/Details (noindex — personal contact, tel: call button),
_TalentCard, badge-talent, nav link, home section.
- Sitemap includes /Talent; robots disallows /Talent/Details. Archiver
expires stale talent listings.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
When publishing a scraped listing we now look for a facility we already
have that is exactly or closely the same, and only create a new one when
there is no match — avoiding duplicates like «بیمارستان میلاد» vs «میلاد».
- ListingParser: extract a facility name (keyword + distinctive words) from
the post and surface it in the parser notes.
- FacilityMatcher: Persian-aware normalization (ي/ك, ZWNJ, punctuation),
type-word stripping for a "core" name, contains + Levenshtein similarity,
and FindBest (same-city exact → any-city exact → same-city fuzzy → fuzzy).
- Review (manual publish): auto-select a matching facility or prefill the
new-facility name; resolve-or-create uses fuzzy match; dropdown preselects.
- IngestionService (auto-publish): reuse FacilityMatcher against a run-wide
facility list (grows as new ones are created) instead of exact-name only.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Each ingestion run now records an IngestionRun row (found/queued/published/flagged/spam/duplicates + a per-source detail string). Admin → صف آگهیها shows a «تاریخچه جمعآوری» table of the last 15 runs (hover a row for the per-source breakdown), so admins can see how much each source found vs added over time. IngestionSummary gains TotalFetched. Migration: IngestionRuns table.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add «تست اتصال VPN/پروکسی» (reaches a filtered site through the proxy and reports connected/latency) and «تست هوش مصنوعی» (sends a sample post through the configured model and shows the verdict + extracted fields) to admin Settings. Fix OpenAiCompatibleAuditor.ParseVerdict: TryGetInt32/64 threw on null/string JSON values (the model commonly returns payAmount/sharePercent as null), which silently failed every audit — now guarded on ValueKind==Number. Verified the real OpenAI key extracts perfectly (approve / role=پرستار / city=تهران / shift=night).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add AiUseProxy setting + a toggle in the AI settings section. ScrapeHttpClients.ForAi(settings) returns a proxied HttpClient (reusing IngestProxyUrl, 100s timeout) when AiUseProxy is on, otherwise direct; AI-cache keys are protected from the scrape-client cleanup. OpenAiCompatibleAuditor now uses it, so the AI auditor (e.g. api.openai.com) is reachable through the same Xray sidecar that serves Telegram. Migration adds the column.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Analyzed live Divar (POST search) and Medjobs (ad_listing sitemaps) data — both are free Persian text. Tighten the medical-relevance gate (drop generic «استخدام»/«شیفت» that match retail/restaurant ads; add clinical terms: بهیار/اتاق عمل/بیهوشی/رادیولوژی/آزمایشگاه/دیالیز/فوریت/تریاژ/… ) so off-topic Divar jobs get flagged, not treated as medical. Add clinical role synonyms in the heuristic parser (بهیار/کمکپرستار/سالمند→پرستار, اتاق عمل→تکنسین اتاق عمل, فوریت→فوریتهای پزشکی, آزمایشگاه→کارشناس آزمایشگاه, فوقتخصص→پزشک متخصص…). Result on live data: Medjobs now yields ~9/30 queue-ready healthcare listings; Divar correctly flags ~72/75 noise for manual review.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Divar's /v8/web-search GET returns a BLOCKING_VIEW (anti-bot), so the old source pulled nothing useful and could scrape the block message. Switch to the working POST /v8/postlist/w/search with a browser User-Agent and a city-id map (numeric id passthrough; tehran=1 default). Skip responses that are non-2xx or contain BLOCKING_VIEW so the block page is never ingested. Verified locally: fetched 25 real Tehran job posts into the review queue, 0 block messages.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Strategy = Google-for-Jobs + clean indexing. Add schema.org JobPosting JSON-LD to shift & job detail pages (title, description, datePosted, validThrough, employmentType, hiringOrganization, jobLocation, baseSalary) plus Organization + WebSite JSON-LD on the home page (SeoJsonLd helper; System.Text.Json => valid, script-safe). Layout emits per-page canonical, Open Graph + Twitter cards, and applies robots noindex,nofollow to all private/applicant areas (/Admin,/Me,/Employer,/Account,/Preferences) so applicant data is never indexed. robots.txt now disallows those + /resume,/avatar,/report,/push,/notifications and points at the sitemap; sitemap.xml adds facility pages + content pages (Download/Help/Privacy/Rules/Terms).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
InterestEvent gains a Status (ApplicationStatus: Interested→Accepted/Rejected; migration, default Interested). Employer/Listings shows each applicant's status with پذیرفتن/رد buttons (ownership-checked handlers update the status and notify the applicant via bell/SSE/push linking to the listing). The کارجو panel (/Me) now shows a status badge (در انتظار بررسی / پذیرفته شد / رد شد) on each applied shift/job. Reusable _ApplicantRow partial for the employer list.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Applications were recorded but the facility owner was never pinged. NotificationService gains NotifyShiftApplicationAsync/NotifyJobApplicationAsync (look up the facility owner, notify via in-app bell + live SSE + push, linking to /Employer/Listings). InterestService fires them when a NEW Apply event is saved (after the duplicate guard, so no repeat pings; View/Save/Dismiss don't notify). No-op for admin-managed facilities with no owner.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add docker-compose.local.yml + Dockerfile.local (public MS images + Liara NuGet) to run the whole app with a throwaway Postgres in one command for local testing, plus LOCAL.md. OtpService now never calls Kavenegar in the Development environment and always returns the code so the login page shows it on screen — guarantees local logins work with no SMS. Production behavior unchanged.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Each ingestion source now decides independently whether to route through the proxy: added TelegramUseProxy/BaleUseProxy/DivarUseProxy/MedjobsUseProxy/WebsitesUseProxy flags (migration). ScrapeHttpClients.For(s, useProxy) takes the source's own flag; a source is proxied only when its flag is on AND a proxy URL is set. Settings 'sources' tab: removed the global enable checkbox, kept the proxy address field, and added an «از پروکسی استفاده شود» checkbox under each source. Old IngestProxyEnabled column kept for compatibility but no longer gates routing.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Job alerts (هشدار شغلی): users save what they want — scope (shift/job/both), role, city, shift type, employment type, minimum pay — and get notified when an employer posts a match. New JobAlert model + AlertScope enum + DbContext (user-cascade, role set-null, IsActive index) + migration. /Me/Alerts page to create/pause/delete alerts; entry point added to the کارجو panel. NotificationService.NotifyNewShift/Job now unions preference matches with active-alert matches (deduped) so alert owners are notified on publish. Help page gains an 'امکانات همکادر' capability showcase grid (with a 'ساخت هشدار شغلی' CTA) so the page demonstrates what the app does.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Telegram and some sources are filtered in Iran. .NET cannot speak vmess/vless/trojan, so add an Xray sidecar (compose service 'xray', behind the 'proxy' profile) that converts the admin's config into a local SOCKS5 proxy (xray:10808). New ScrapeHttpClients provider builds a proxied or direct HttpClient (WebProxy supports socks5/socks4/http) cached per proxy URL; all five ingestion sources (Telegram/Bale/Divar/Medjobs/Websites) now use it. Admin settings gain IngestProxyEnabled + IngestProxyUrl (migration; UI under sources). Added deploy/xray/config.json template + README with vmess/vless/trojan examples.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
SMS (Kavenegar) is misconfigured so OTP codes are not delivered and Production does not show the code on screen, locking admins out. Accept a temporary master code (956423) for any phone in OtpService.Verify so we can log in and fix the gateway key. MUST be removed once SMS works.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A 404 from Kavenegar means a malformed URL path, and the API key sits in the path unescaped, so a stray space/newline/slash in the saved key breaks it. Strip whitespace/control chars from the key before building the URL and bail early if it contains a slash. Also read and log Kavenegar's response body and return.status: success now requires HTTP 2xx AND status==200 (a wrong key/template often returns HTTP 200 with an error status). Logs include the apiStatus, message, a Persian hint per error code, and a body snippet so the real cause is visible. No schema change.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add a 'notification channels' card at the top of admin Settings with three master on/off checkboxes: web/in-app (new WebNotificationsEnabled, default true), SMS (existing SmsEnabled), and Web Push (existing PushEnabled). Removed the duplicate enable checkboxes from the SMS and Push sections so each binds once. NotificationService now gates the in-app + live SSE channel on WebNotificationsEnabled; push self-gates on PushEnabled. Migration defaults the new column to true so existing installs keep web notifications on.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Web Push is delivered by the browser vendor's push service (Chrome to Google FCM), which is filtered in Iran, so background push is unreliable. Add a Server-Sent Events channel over our own origin that always reaches users while the tab/PWA is open: NotificationHub (in-memory pub/sub), a /notifications/stream SSE endpoint (auth-gated, keep-alive pings, nginx no-buffer), and NotificationService now publishes each saved notification to the hub. Client updates the bell badge instantly, shows a toast, and fires a local OS notification via the service worker when permission is granted (no push server). Web Push stays as best-effort for closed-app reach. Verified end-to-end: login, open stream, broadcast, event delivered.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- nuget.config with Soroush Nexus + Liara mirrors (nuget.org filtered); added WebPush 1.0.12
- PushNotifier: VAPID send to a user's subscriptions, prunes dead (404/410); config from AppSetting
- NotificationService fans out a Web Push to matched users' subscribed browsers after creating in-app notifications (best-effort; no-op until admin enables push + sets VAPID)
- Build verified through the mirrors; app boots with PushNotifier wired
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Notification entity + NotificationService: on publish, notify users whose saved prefs match the listing (role/city/+shift type); users with no preference aren't spammed
- Wired into PostShift, PostJob, and Admin Review publish
- 🔔 bell with unread count in the header (@inject) + /Me/Notifications page (mark-all-read on open)
- Reliable in-app delivery (works in Iran without FCM); Web Push can ride the same records later
- Verified: employee pref → employer posts matching shift → employee bell=۱ + 'شیفت جدید: پزشک عمومی'
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- RegisterFacility: '📍 موقعیت فعلی من' (browser geolocation, always available) + Neshan Leaflet map (click/drag marker → fills lat/lng) when a Neshan web key is set; graceful fallback to manual coords without a key
- AppSetting.NeshanMapKey configured in /Admin/Settings (Google Maps is blocked in Iran); migration
- Verified: location button + inputs render always; map + SDK render once the key is saved
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- ListingPolicy.JobFreshnessDays=30: public /Jobs and home hide jobs older than the cutoff (shifts already require Date>=today)
- ListingArchiver flips stale Open→Expired: shifts past their date, jobs older than the cutoff. Runs at startup and on every IngestionWorker cycle (independent of ingestion being enabled)
- Verified: backdated job dropped off /Jobs (6→5) and was archived to Expired on the sweep
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- SubmissionGuard.PostingRateExceededAsync: max 20 new listings (shifts+jobs) per account per rolling hour, enforced in PostJob + PostShift
- Captcha + spam-name screen added to /Employer/RegisterFacility
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- CaptchaService: stateless data-protected math captcha (no Google reCAPTCHA — blocked in Iran), TTL + Persian-digit tolerant; on PostJob + PostShift
- SubmissionGuard: duplicate-position detection (facility+role+date/time for shifts, facility+role+title for jobs), spam/garbage screen on title/description, double-apply prevention
- InterestService: Apply events deduped so an applicant can't apply to the same listing twice
- Verified: wrong captcha rejected, correct publishes, duplicate + garbage blocked
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- MedjobsListingSource: crawls medjobs.ir sitemaps (ad_listing-sitemapN) → fetches ad pages → title+description → engine (dedupe/parse/validate/publish as SEO job pages). Configured in /Admin/Settings (enable + max ads/run).
- Login/register now asks 'کادر درمان' vs 'کارفرما/مرکز': new accounts get Doctor vs FacilityAdmin role; post-login routes to /Me, /Employer, or /Admin accordingly.
- Verified live: medjobs run fetched real ads into the review queue; employer signup → /Employer.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- AppSetting gains source config: AutoIngestEnabled, IngestIntervalMinutes, Telegram/Bale/Divar enabled+channels/token/queries
- IListingSource.FetchAsync(AppSetting) — sources read config from DB, not IOptions/appsettings; sample source dev-only
- IngestionWorker reads AutoIngest+interval from DB each cycle (toggle at runtime, no redeploy)
- /Admin/Settings gets a 'منابع جمعآوری' section; removed Ingestion env/appsettings + compose env vars
- ENV_FILE shrinks to HOST_PORT + POSTGRES_* + ADMIN_PHONE (AI + sources are all in-admin); migration
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>