Siro/docs/ai_document_extraction_prompt.md
2026-06-25 18:05:26 +03:00

# AI Document Extraction Prompt — Country-Specific Field Mapping
## Overview
Extract driver registration fields from uploaded document images. The matrix below maps each field to the document side it appears on, verified against real government documents for each country.
## Field Extraction Matrix
| Field | Jordan | Syria | Egypt |
|---|---|---|---|
| `first_name` + `last_name` | ID front, License front | ID front, License front | ID front, License front |
| `national_number` | ID front, License front | **ID front** (bottom), License front | ID front, License front |
| `birthdate` | ID front, License front | ID front, License front | ID **front** (above national number, left side) |
| `gender` | ID front | ID **back** | ID **back** |
| `address` | ID **back**, License front | ID **back** | ID front, License front |
| `site` (مكان القيد) | ID **back** | ID **back** | — |
| `maritalStatus` | **Not on ID** (`null`) | **Not on ID** (`null`) | ID **back** |
| `license_type` | License front (symbols/numbers at bottom) | License front | License front |
| `license_categories` | — | License **back** (detailed) | — |
| `issue_date` | License front | License front | License front |
| `expiry_date` | License **front** | License front | License front |
| `owner` | Car reg front | Car reg front | Car reg front |
| `car_plate` | Car reg front | Car reg front | Car reg front |
| `make`, `model`, `year` | Car reg front | Car reg front | Car reg **back** |
| `color` | Car reg front | Car reg front | Car reg **back** |
| `vin` | Car reg front | Car reg front | Car reg **back** |
| `fuel` | Car reg front | Car reg front | Car reg **back** |
| `expiration_date` | Car reg front | Car reg front | Car reg **back** |
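
The matrix can also be held as a lookup table so that downstream code checks the right document side for each field. The sketch below is illustrative only: it encodes a subset of the rows, and the side identifiers (`id_front`, `car_reg_back`, and so on) and the `sources_for` helper are hypothetical names, not part of any existing schema.

```python
# Subset of the Field Extraction Matrix: field -> country -> document sides.
# An empty list mirrors a "Not on ID" cell. Side names are assumed labels.
FIELD_SOURCES: dict[str, dict[str, list[str]]] = {
    "national_number": {
        "jordan": ["id_front", "license_front"],
        "syria": ["id_front", "license_front"],
        "egypt": ["id_front", "license_front"],
    },
    "gender": {
        "jordan": ["id_front"],
        "syria": ["id_back"],
        "egypt": ["id_back"],
    },
    "address": {
        "jordan": ["id_back", "license_front"],
        "syria": ["id_back"],
        "egypt": ["id_front", "license_front"],
    },
    "maritalStatus": {
        "jordan": [],
        "syria": [],
        "egypt": ["id_back"],
    },
    "vin": {
        "jordan": ["car_reg_front"],
        "syria": ["car_reg_front"],
        "egypt": ["car_reg_back"],
    },
}


def sources_for(field: str, country: str) -> list[str]:
    """Return the document sides to inspect for a field, or [] if none apply."""
    return FIELD_SOURCES.get(field, {}).get(country.lower(), [])
```

For example, `sources_for("vin", "Egypt")` points at the back of the car registration, while `sources_for("maritalStatus", "Jordan")` returns an empty list.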
## Country-Specific Details
### 🇯🇴 Jordan
- **ID Front:** Full name (one line → split to `first_name` + `last_name`), `national_number`, `gender`, `birthdate`, place of birth, mother's name
- **ID Back:** Place of registration (`site`), card expiry date, place of issue, `address`
- **License Front:** Full name (also printed in English), `national_number`, `birthdate`, `address`, license number, `issue_date`, `expiry_date`, `license_type` (symbols/numbers)
- **License Back:** Blood type, medical restrictions/notes only
- **Car Reg Front:** All vehicle data (`owner`, `car_plate`, `make`, `model`, `year`, `color`, `vin`, `fuel`, `expiration_date`)
- **Note:** `maritalStatus` does NOT appear on Jordanian documents → set to `null`
### 🇸🇾 Syria
- **ID Front:** Full name (first name + father's name + mother's name + last name), place of birth, `birthdate`, `national_number` (bottom)
- **ID Back:** Place of registration (`site`), `address`, `gender`, eye color, complexion, distinguishing marks, issue date
- **License Front:** Name, father's name, `national_number`, `birthdate`, `issue_date`, `expiry_date`
- **License Back:** `license_categories` (detailed categories)
- **Car Reg Front:** All vehicle data (`owner`, `car_plate`, `make`, `model`, `year`, `color`, `vin`, `fuel`, `expiration_date`)
- **Note:** `maritalStatus` does NOT appear on Syrian documents → set to `null`
### 🇪🇬 Egypt
- **ID Front:** Full name (first name + rest), complete `address`, `birthdate` (printed on front, above national number, left side), `national_number` (14 digits)
- **ID Back:** Occupation, `maritalStatus`, `gender` (ذكر/أنثى), religion, issue date, expiry date
- **License Front:** Name, `address`, `national_number`, `issue_date`, `expiry_date`, license type/grade
- **Car Reg Front:** `owner`, `car_plate`
- **Car Reg Back:** All technical vehicle data (`make`, `model`, `year`, `vin`, `fuel`, `color`, `expiration_date`)
- **Note:** `birthdate` is on ID **front**, not back
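
The per-country notes imply a small post-processing step: fields that a country's documents do not carry should end up as `null` rather than a guessed value. The following is a minimal sketch under that assumption; `ABSENT_FIELDS` and `apply_absent_fields` are hypothetical names, and the sets are read off the matrix and notes in this document.

```python
# Fields that, per the matrix and notes, are not printed on each country's
# documents. This mapping is an illustrative reading, not a fixed schema.
ABSENT_FIELDS = {
    "jordan": {"maritalStatus", "license_categories"},
    "syria": {"maritalStatus"},
    "egypt": {"site", "license_categories"},
}


def apply_absent_fields(driver: dict, country: str) -> dict:
    """Return a copy of `driver` with country-absent fields forced to None."""
    cleaned = dict(driver)
    for field in ABSENT_FIELDS.get(country.lower(), set()):
        cleaned[field] = None
    return cleaned
```

Working on a copy keeps the raw model output available for logging or comparison.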
## Criminal Record Verification
- `full_name` — name on document
- `result` — varies: "لا حكم عليه" (Syria), "عدم محكومية" (Jordan), "فيش وتشبيه" (Egypt)
- `is_valid` — true/false (document is valid and current)
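
One way to sanity-check the extracted `result` is to confirm it contains the phrase listed for the document's country. This is a sketch only: the helper name is hypothetical, and it checks for the presence of the phrase, not whether the record itself is clean.

```python
# Phrases listed above, keyed by country. The function name is an assumption.
EXPECTED_PHRASE = {
    "syria": "لا حكم عليه",
    "jordan": "عدم محكومية",
    "egypt": "فيش وتشبيه",
}


def matches_expected_result(result: str, country: str) -> bool:
    """True if the extracted text contains the phrase listed for the country."""
    phrase = EXPECTED_PHRASE.get(country.lower())
    if phrase is None or not result:
        return False
    # Collapse runs of whitespace so OCR spacing does not break the match.
    normalized = " ".join(result.split())
    return phrase in normalized
```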
## Face Matching
- `profile_vs_id` — match/mismatch/unclear
- `profile_vs_license` — match/mismatch/unclear
## Output Format
Return ONLY raw JSON:
```json
{
  "status": "success|failure",
  "face_match_confidence": "high|medium|low",
  "driver": {
    "first_name": "",
    "last_name": "",
    "phone": "",
    "email": "",
    "gender": "Male|Female",
    "birthdate": "YYYY-MM-DD",
    "national_number": "",
    "site": "",
    "address": "",
    "maritalStatus": "",
    "license_type": "",
    "license_categories": "",
    "issue_date": "YYYY-MM-DD",
    "expiry_date": "YYYY-MM-DD",
    "licenseIssueDate": "YYYY-MM-DD"
  },
  "car": {
    "owner": "",
    "car_plate": "",
    "make": "",
    "model": "",
    "year": "",
    "color": "",
    "color_hex": "",
    "fuel": "",
    "vin": "",
    "expiration_date": "YYYY-MM-DD"
  },
  "criminal_record": {
    "full_name": "",
    "result": "",
    "is_valid": true
  },
  "face_matching": {
    "profile_vs_id": "match|mismatch|unclear",
    "profile_vs_license": "match|mismatch|unclear"
  }
}
```
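
A consumer of this output may want to check the enumerated values and date formats before trusting a response. The sketch below shows one possible check; `validate_response` is a hypothetical helper, it covers only some of the fields, and it treats `null` dates as acceptable.

```python
import json
import re

DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")
MATCH_VALUES = {"match", "mismatch", "unclear"}
DATE_FIELDS = [
    ("driver", "birthdate"),
    ("driver", "issue_date"),
    ("driver", "expiry_date"),
    ("car", "expiration_date"),
]


def validate_response(raw: str) -> list[str]:
    """Return the names of fields that break the output format (empty if none)."""
    data = json.loads(raw)
    problems = []
    if data.get("status") not in {"success", "failure"}:
        problems.append("status")
    if data.get("face_match_confidence") not in {"high", "medium", "low"}:
        problems.append("face_match_confidence")
    for section, key in DATE_FIELDS:
        value = (data.get(section) or {}).get(key)
        # null is allowed; anything else must be a YYYY-MM-DD string.
        if value is not None and not (isinstance(value, str) and DATE_RE.match(value)):
            problems.append(f"{section}.{key}")
    for key in ("profile_vs_id", "profile_vs_license"):
        if (data.get("face_matching") or {}).get(key) not in MATCH_VALUES:
            problems.append(f"face_matching.{key}")
    return problems
```

Returning a list of field names, instead of raising on the first problem, makes it easier to log everything wrong with a single response.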
## Rules
1. Convert Eastern-Arabic digits (٠١٢٣٤٥٦٧٨٩) to Western (0-9).
2. Dates in ISO format: `YYYY-MM-DD`.
3. If a field is unreadable or missing, set it to `null`; do NOT fail the whole extraction.
4. Set `status` to `failure` only on: face mismatch, forged/fake documents, or missing primary identity.
5. `national_number` and `vin` must contain only Latin digits/characters.
6. Normalize color names to English and fill `color_hex` with the matching hex code, e.g. "أبيض" → "White", `#FFFFFF`.
7. Return ONLY raw JSON, with no markdown formatting or code fences.
8. Refer to the Field Extraction Matrix above: extract each field from the correct document side.
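
Rules 1, 2, 3, and 6 can also be enforced in post-processing. The following is a minimal sketch, not a definitive implementation: the function names are hypothetical, the list of accepted date layouts is an assumption (day-first when the year comes last), and the color table contains only two illustrative entries.

```python
from datetime import datetime
from typing import Optional

# Rule 1: map Eastern-Arabic digits to Western digits.
_DIGITS = str.maketrans("٠١٢٣٤٥٦٧٨٩", "0123456789")

# Rule 2: date layouts tried in order (assumed; extend as needed).
_DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%d/%m/%Y", "%d-%m-%Y")

# Rule 6: illustrative entries only; a real table would be much longer.
_COLORS = {
    "أبيض": ("White", "#FFFFFF"),
    "أسود": ("Black", "#000000"),
}


def to_western_digits(text: str) -> str:
    """Replace Eastern-Arabic digits with their Western equivalents."""
    return text.translate(_DIGITS)


def to_iso_date(text: Optional[str]) -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None if it cannot be parsed (Rule 3)."""
    if not text:
        return None
    cleaned = to_western_digits(text).strip()
    for fmt in _DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None


def normalize_color(name: str) -> tuple[Optional[str], Optional[str]]:
    """Return (English name, hex code), or (None, None) for unknown colors."""
    return _COLORS.get(name.strip(), (None, None))
```

Unparseable input yields `None` instead of an exception, which matches the rule that missing or unreadable values should not fail the extraction.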