AI qualify: de-dupe applicants, base roles, closed categories, tag hygiene + reprocess-stored action

Qualified live applicants and found three problems, all fixed:
- Duplicate cards: one ad fanned out into «پرستار» ("nurse") + «پرستار کودک»
  ("pediatric nurse") for the same person. Applicants now publish ONE listing
  (no role fan-out); secondary roles → tags.
- Role sprawl: modifiers became roles. Prompt now returns the BASE profession
  and pushes age-group/ward/seniority to tags; new roles only for a genuinely
  new base profession (تکنسین داروخانه, "pharmacy technician" ✓; پرستار کودک,
  "pediatric nurse" ✗).
- Tag/category noise: categories pinned to the 5 fixed groups (plus سایر,
  "other"; never invented); BuildTags drops pay/contact/location/fragment words.
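
A minimal sketch of the tag-hygiene idea above, not the actual BuildTags code
(which this page does not show). The name BuildTagsSketch, the English noise
words, and the one-character fragment cutoff are all assumptions made for
illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stop-list standing in for pay/contact/location words.
// The real word lists are not shown in this commit.
var noise = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
    "salary", "phone", "tehran", "negotiable"
};

// Keep a candidate tag only if it is not a one-character fragment, not a
// noise word, and not a repeat of the base role. Duplicates are collapsed
// case-insensitively, keeping the first spelling seen.
List<string> BuildTagsSketch(IEnumerable<string> candidates, string baseRole) =>
    candidates
        .Select(t => t.Trim())
        .Where(t => t.Length > 1)
        .Where(t => !noise.Contains(t))
        .Where(t => !string.Equals(t, baseRole, StringComparison.OrdinalIgnoreCase))
        .Distinct(StringComparer.OrdinalIgnoreCase)
        .ToList();

var tags = BuildTagsSketch(
    new[] { "pediatric", "salary", " ICU ", "nurse", "icu", "x" },
    "nurse");

Console.WriteLine(string.Join(", ", tags)); // prints "pediatric, ICU"
```

Here the modifier "pediatric" survives as a tag on the single «پرستار» listing
instead of becoming a second role.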

Reprocess action: IngestionService.ReprocessAsync re-runs the current pipeline
over every stored RawListing WITHOUT re-fetching (keeps the raw text, so nothing
is lost to sources only exposing recent posts), deleting the old aggregated
posts and republishing cleanly. Admin dashboard button «پردازش مجددِ آیتم‌های
ذخیره‌شده» ("Reprocess stored items") runs it on a background scope; the result
lands in the run log («تاریخچهٔ اجرا», "Run history").
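
A rough in-memory sketch of the reprocess flow described above, assuming the
old posts are replaced one stored item at a time. ReprocessSketchAsync,
QualifyAsync, and the tuple shapes are hypothetical; the real
IngestionService.ReprocessAsync works against the database and calls the AI
pipeline, neither of which is shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Stored raw text (never re-fetched) and the posts previously built from it.
var raws = new List<(int Id, string RawText)>
{
    (1, "nurse, pediatric ward"),
    (2, "pharmacy technician"),
};
var posts = new List<(int RawId, string Role)>
{
    (1, "nurse"),
    (1, "pediatric nurse"), // duplicate card from the old role fan-out
    (2, "pharmacy technician"),
};

// Placeholder for the AI qualification step (assumption): the text before
// the first comma is treated as the base role.
Task<string> QualifyAsync(string rawText) =>
    Task.FromResult(rawText.Split(',')[0].Trim());

// For each stored item: delete its old posts, then publish exactly one new one.
async Task<int> ReprocessSketchAsync()
{
    var republished = 0;
    foreach (var raw in raws)
    {
        posts.RemoveAll(p => p.RawId == raw.Id);
        var role = await QualifyAsync(raw.RawText);
        posts.Add((raw.Id, role));
        republished++;
    }
    return republished;
}

var count = await ReprocessSketchAsync();
Console.WriteLine($"{count} republished, {posts.Count} posts"); // prints "2 republished, 2 posts"
```

Because the loop reads only the stored text, it can be re-run after any prompt
or pipeline change without touching the original sources.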

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
soroush.asadi
2026-06-20 14:24:20 +03:30
parent 4c0b29addf
commit d62929ca0d
5 changed files with 182 additions and 48 deletions
@@ -13,11 +13,15 @@ public class IndexModel : PageModel
 {
     private readonly AppDbContext _db;
     private readonly IngestionService _ingest;
+    private readonly IServiceScopeFactory _scopes;
+    private readonly ILogger<IndexModel> _log;
-    public IndexModel(AppDbContext db, IngestionService ingest)
+    public IndexModel(AppDbContext db, IngestionService ingest, IServiceScopeFactory scopes, ILogger<IndexModel> log)
     {
         _db = db;
         _ingest = ingest;
+        _scopes = scopes;
+        _log = log;
     }
     public List<RawListing> Queue { get; private set; } = new();
@@ -94,6 +98,26 @@ public class IndexModel : PageModel
         return RedirectToPage();
     }
+    /// <summary>
+    /// Clean up EXISTING aggregated content by re-running the current pipeline over the stored raw
+    /// text (no re-fetch, so nothing is lost to sources only exposing recent posts). Long-running
+    /// (one AI call per item), so it runs on a background scope and returns immediately; the result
+    /// shows up as a new row in the «تاریخچهٔ اجرا» log when it finishes.
+    /// </summary>
+    public IActionResult OnPostReprocessStored()
+    {
+        _ = Task.Run(async () =>
+        {
+            using var scope = _scopes.CreateScope();
+            var svc = scope.ServiceProvider.GetRequiredService<IngestionService>();
+            var log = scope.ServiceProvider.GetRequiredService<ILogger<IndexModel>>();
+            try { await svc.ReprocessAsync(); }
+            catch (Exception ex) { log.LogError(ex, "Background reprocess failed"); }
+        });
+        IngestMessage = "پردازش مجدد آیتم‌های ذخیره‌شده در پس‌زمینه آغاز شد. نتیجه پس از اتمام در «تاریخچهٔ اجرا» نمایش داده می‌شود (بسته به تعداد آیتم‌ها و سرعت هوش مصنوعی، چند دقیقه طول می‌کشد).";
+        return RedirectToPage();
+    }
     private async Task LoadAsync()
     {
         Queue = await _db.RawListings