Fix role + contact mislabels seen on a live iranestekhdam ad
(1) Specialist guard: the AI sometimes labels a clearly specialist ad («پزشک متخصص گوش و
حلق و بینی» "ENT specialist physician", «فلوشیپ» "fellowship", «فوق تخصص» "subspecialty") as
«پزشک عمومی» ("general practitioner"), so an ENT post was published as «استخدام پزشک عمومی»
("Hiring general practitioner"). When the primary role is GP but the ad text names a
specialist, swap it to «پزشک متخصص» ("specialist physician"); the subspecialty stays as a tag.
(2) Phone type: the landline regex prefix 0\d{2,3} also matched 09xx MOBILE numbers and labeled
them «تلفن ثابت» ("landline"). Iranian landline area codes start with 0[1-8] (021/026/…), never
09, so the prefix is restricted to 0[1-8]\d{1,2} and mobiles are no longer mislabeled as landlines.
Both fixes apply to new ingests only; existing mislabeled rows are corrected when they turn over or are reprocessed.
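
The effect of the tightened landline pattern can be checked with a small standalone sketch. The two patterns are copied from the diff in this commit; the class name `LandlinePatternDemo`, the helper `IsLandline`, and the sample numbers are made up for illustration and are not part of `HeuristicListingParser`.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LandlinePatternDemo
{
    // Pattern before this commit: any 0-prefixed code, which includes 09xx mobiles.
    public const string OldPattern = @"(?<!\d)0\d{2,3}[\s-]?\d{7,8}(?!\d)";

    // Pattern after this commit: the digit after the leading 0 must be 1-8.
    public const string NewPattern = @"(?<!\d)0[1-8]\d{1,2}[\s-]?\d{7,8}(?!\d)";

    // Hypothetical helper: true if the text contains something the pattern treats as a landline.
    public static bool IsLandline(string text, string pattern) =>
        Regex.IsMatch(text, pattern);

    public static void Main()
    {
        // Illustrative numbers only: one Tehran-style landline, one 09xx mobile.
        string[] samples = { "021-12345678", "09121234567" };
        foreach (var s in samples)
        {
            Console.WriteLine(
                $"{s}: old={IsLandline(s, OldPattern)}, new={IsLandline(s, NewPattern)}");
        }
        // Prints:
        // 021-12345678: old=True, new=True
        // 09121234567: old=True, new=False
    }
}
```

The mobile sample matches the old pattern because `0\d{2,3}` consumes `0912` and the remaining seven digits satisfy `\d{7,8}`; the new pattern fails on the `9` right after the leading `0`.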
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@@ -361,7 +361,9 @@ public class HeuristicListingParser : IListingParser
         if (d.Length == 10 && d[0] == '9') d = "0" + d;
         Add(ContactType.Mobile, d);
     }
-    foreach (Match m in Regex.Matches(latin, @"(?<!\d)0\d{2,3}[\s-]?\d{7,8}(?!\d)"))
+    // Landline area codes start 0[1-8] (021 Tehran, 026 Karaj, …), never 09, which is a MOBILE.
+    // The old 0\d{2,3} matched 09xx numbers and mislabeled mobiles as «تلفن ثابت».
+    foreach (Match m in Regex.Matches(latin, @"(?<!\d)0[1-8]\d{1,2}[\s-]?\d{7,8}(?!\d)"))
         Add(ContactType.Phone, Regex.Replace(m.Value, @"\D", ""));
 
     return list.Take(8).ToList();