Add iranestekhdam.ir as an ingestion source (clinical job ads at named facilities)

New IranEstekhdamListingSource: reads the site's monthly ad sitemaps (sitemap-ads.xml -> sitemap-ads-YYYY-M.xml), keeps only ad URLs whose Persian slug names a clinical role (veterinary and non-clinical roles are excluded), then extracts each ad's title and description (plus phone). These are employer ads at NAMED facilities, so they directly address the unknown-facility problem in the classifieds content.

Wired in like Medjobs: AppSetting toggles (IranEstekhdamEnabled, IranEstekhdamMaxAds, IranEstekhdamUseProxy) with an EF migration, SettingsService persistence, admin Settings UI, and DI registration. Off by default; the medical-gate validator, AI auditor, and junk filters screen results downstream.

Note: e-estekhdam, jobinja, and jobvision are JS-rendered SPAs whose ad lists are not in the static HTML, so they need API reverse-engineering (a separate effort) rather than this static-scrape path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
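
The crawl described in the message (sitemap index, then monthly sitemaps, then slug filtering, capped per run) could be sketched roughly as below. This is a minimal illustration, not the committed IranEstekhdamListingSource: the class name `SitemapSketch`, its method names, and both keyword lists are hypothetical, and the sketch assumes the sitemaps use the standard sitemaps.org namespace.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration only; not the code added by this commit.
public static class SitemapSketch
{
    // Namespace defined by the sitemaps.org protocol (assumed to be what the site emits).
    private static readonly XNamespace Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Assumed keyword lists; the real filter's terms are not shown in the commit.
    private static readonly string[] ClinicalTerms = { "پرستار", "ماما", "پزشک" };
    private static readonly string[] ExcludedTerms = { "دامپزشک" };

    // Returns every <loc> value from either a sitemap index or a urlset document.
    public static List<string> ReadLocs(string xml) =>
        XDocument.Parse(xml)
            .Descendants(Ns + "loc")
            .Select(e => e.Value.Trim())
            .ToList();

    // Keeps a URL only if its decoded slug names a clinical role and no excluded role.
    // Exclusions are checked first because "دامپزشک" (veterinarian) contains "پزشک" (physician).
    public static bool IsClinicalAd(string url)
    {
        string slug = Uri.UnescapeDataString(url); // slugs may be percent-encoded
        if (ExcludedTerms.Any(t => slug.Contains(t))) return false;
        return ClinicalTerms.Any(t => slug.Contains(t));
    }

    // Applies the slug filter and the per-run cap (compare IranEstekhdamMaxAds).
    public static List<string> SelectAds(IEnumerable<string> adUrls, int maxAds) =>
        adUrls.Where(IsClinicalAd).Take(maxAds).ToList();
}
```

A caller would fetch sitemap-ads.xml, pass its body to `ReadLocs` to obtain the monthly sitemap URLs, call `ReadLocs` again on each monthly file, and hand the combined ad URLs to `SelectAds` before fetching any ad page. Filtering on the slug first keeps the number of page requests within the configured cap.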
@@ -81,6 +81,12 @@ public class AppSetting
     /// <summary>Max ads to fetch per ingestion run (be polite; dedupe skips already-seen).</summary>
     public int MedjobsMaxAds { get; set; } = 40;
 
+    /// <summary>Scrape iranestekhdam.ir clinical job ads (crawled via its monthly ad sitemaps;
+    /// employer ads at named facilities, filtered to clinical-role slugs).</summary>
+    public bool IranEstekhdamEnabled { get; set; } = false;
+    public int IranEstekhdamMaxAds { get; set; } = 40;
+    public bool IranEstekhdamUseProxy { get; set; } = false;
+
     // --- SMS OTP (Kavenegar). When off, the code is shown on screen (dev only). ---
     public bool SmsEnabled { get; set; } = false;
     [MaxLength(200)] public string? SmsApiKey { get; set; }