Scoring Models · LLPA Overlay (within-cell residual risk-rank)

Category: Capital Markets

What it does

Two-stage HistGradientBoosting regressor + isotonic calibration predicting the within-cell residual credit-loss expectation that the Fannie/Freddie LLPA grid leaves unpriced. Output is an ordinal band (low / baseline / elevated / high) and decile rank (Q0-Q10) per loan — not a continuous bps charge.

Trained on the GSE single-family acquisition data (Fannie SFP + Freddie STACR), 2014-2020 originations (~12M loans), with out-of-time (OOT) validation on 2021-2022 originations (1.25M loans). Overlay AUC is 0.67 and combined-prediction AUC is 0.74 on credit-event ranking. A per-loan calibrated upfront overlay in bps is also surfaced as overlay_bps_raw, but it is informational only: the Phase 4 OOT validation showed a ~60% magnitude under-prediction (the model rank-orders risk correctly but undershoots realized loss magnitudes on the rate-shock 2021-2022 vintages). v2 will close the calibration gap once those vintages season further.
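Why ranking quality can hold up while the bps level is off can be seen in a small sketch. AUC depends only on the ordering of scores, so shrinking every prediction by the same factor leaves it unchanged even though the magnitudes are now too low. The scores, event flags, and the 0.4 shrink factor below are made-up illustration values, not model output, and pairwise_auc is a hypothetical helper name.

```python
def pairwise_auc(scores, labels):
    """AUC as the share of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical overlay scores (bps) and credit-event flags for six loans.
scores = [5.0, 12.0, 8.0, 20.0, 3.0, 15.0]
labels = [0, 1, 0, 1, 0, 0]

# Shrink every score by one illustrative factor to mimic magnitude under-prediction.
shrunk = [0.4 * s for s in scores]

auc_full = pairwise_auc(scores, labels)
auc_shrunk = pairwise_auc(shrunk, labels)  # same ordering, so same AUC
```

This is the sense in which the band and decile outputs remain usable while overlay_bps_raw stays informational: the former depend on ordering, the latter on level.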

Architecture. Stage 1 trains on pass-through features that mirror the LLPA grid (FICO bucket, LTV bucket, loan purpose, occupancy, property type, units, product type, high-balance flag, subordinate-financing flag). Stage 1 predicts what the grid's implicit risk model would say. Stage 2 trains on the residual (realized loss − Stage 1 prediction) using the full feature set, including DTI, state proxies (HPA volatility, unemployment volatility, FEMA disaster index, BLS employment HHI), channel, origination rate, seller name, first-time-homebuyer flag, and number of borrowers. Stage 2's output is the overlay, which is orthogonal to the grid by construction.
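The two-stage residual structure can be sketched in a few lines. This is a toy illustration only: per-group averages stand in for the HistGradientBoosting regressors, isotonic calibration is omitted, and the loans, bucket labels, and loss values are invented for the example. It shows that the Stage 1 residual averages to zero within each grid cell, while a non-grid feature (DTI band here) still separates the residual.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy loans. fico_bucket and ltv_bucket play the role of grid
# features; dti_band is a non-grid feature; loss_bps is the realized loss.
LOANS = [
    {"fico_bucket": "720-739", "ltv_bucket": "75-80", "dti_band": "low", "loss_bps": 10.0},
    {"fico_bucket": "720-739", "ltv_bucket": "75-80", "dti_band": "high", "loss_bps": 30.0},
    {"fico_bucket": "720-739", "ltv_bucket": "75-80", "dti_band": "low", "loss_bps": 14.0},
    {"fico_bucket": "720-739", "ltv_bucket": "75-80", "dti_band": "high", "loss_bps": 26.0},
    {"fico_bucket": "660-679", "ltv_bucket": "90-95", "dti_band": "low", "loss_bps": 40.0},
    {"fico_bucket": "660-679", "ltv_bucket": "90-95", "dti_band": "high", "loss_bps": 80.0},
]


def grid_key(loan):
    """Features the grid already prices (Stage 1 inputs)."""
    return (loan["fico_bucket"], loan["ltv_bucket"])


def full_key(loan):
    """Grid features plus the extra non-grid feature (Stage 2 inputs)."""
    return grid_key(loan) + (loan["dti_band"],)


def fit_group_means(keys, targets):
    """Stand-in 'model': predict the mean target of each feature group."""
    groups = defaultdict(list)
    for key, target in zip(keys, targets):
        groups[key].append(target)
    return {key: mean(values) for key, values in groups.items()}


# Stage 1: what the grid features alone explain.
stage1 = fit_group_means([grid_key(l) for l in LOANS], [l["loss_bps"] for l in LOANS])

# Residual = realized loss minus the Stage 1 prediction.
residuals = [l["loss_bps"] - stage1[grid_key(l)] for l in LOANS]

# Stage 2: fit the residual on the full feature set; this is the overlay.
stage2 = fit_group_means([full_key(l) for l in LOANS], residuals)

# Within every grid cell the residual averages to zero.
cell_residual_means = fit_group_means([grid_key(l) for l in LOANS], residuals)
```

In this toy data the 720-739 / 75-80 cell has a Stage 1 prediction of 20 bps, and the overlay splits it into -8 bps for low DTI and +8 bps for high DTI.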

Why it matters. The May 2026 LLPA risk article ("The LLPA grid prices a politically tolerable subset of mortgage risk") showed empirically that within a single LLPA cell — same FICO band, same LTV bucket, same base bps — modification rate varies 8.8× across DTI sub-bands and 21× across states. The unpriced residual doesn't vanish; it migrates to MSR strips, spec pool pay-ups, lender overlays, and ultimately to borrower-facing rate spread. This model gives lender rate-sheet engineers, secondary-marketing desks, and capital-markets teams a quantified pre-funding signal of where the grid is wrong, so the residual can be priced explicitly rather than absorbed as noise.

What it is NOT. The overlay does not replicate spec-pool pay-ups, which include liquidity premium, convexity adjustment, and investor risk premium beyond pure credit loss. The overlay isolates the incremental credit-loss expectation only. Per the Phase 4 NARROW verdict, the raw bps output is approximate; use overlay_band and overlay_decile for operating decisions.

Fair-lending audit. The Phase 5 disparate-impact audit (Path A, state-level vs ACS demographics) returned ACCEPTABLE: AIR(Q4/Q1) = 0.946 (within EEOC 4/5ths rule), regression effect after FICO/LTV/DTI controls = +0.067 bps (immaterial). One state proxy (FEMA disaster index) correlates r = +0.63 with minority concentration — documented as a known limitation. Mitigation playbook (cap disaster-index contribution, re-fit) is available if a future fair-lending exam requires it.
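For readers unfamiliar with the 4/5ths check, a minimal sketch follows. It assumes the adverse impact ratio is the focal group's favorable-outcome rate divided by the reference group's rate, with 0.8 as the pass threshold; how Phase 5 defined the rates behind AIR(Q4/Q1) is not stated here, and the function names are hypothetical.

```python
FOUR_FIFTHS_THRESHOLD = 0.8  # the "4/5ths" in the rule


def adverse_impact_ratio(focal_rate, reference_rate):
    """Ratio of the focal group's favorable-outcome rate to the reference group's."""
    if reference_rate <= 0:
        raise ValueError("reference_rate must be positive")
    return focal_rate / reference_rate


def passes_four_fifths(air):
    """True when the ratio is at or above the 0.8 threshold."""
    return air >= FOUR_FIFTHS_THRESHOLD


# The audit's reported ratio clears the threshold.
reported_air_ok = passes_four_fifths(0.946)
```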

Try it on the home page (Loan-level model scoring → LLPA Overlay)

API connector

Programmatic access. The response carries the risk band, decile rank, an informational raw bps figure, and an operating recommendation.

POST /api/score_llpa_overlay
Content-Type: application/json

{
  "borrower_fico":   720,
  "original_ltv":    80,
  "loan_purpose":    "P",            // 'P' (purchase) or 'R' (any refi)
  "property_state":  "CA",           // 2-letter postal abbreviation
  "dti":             38,             // optional — missing OK
  "original_interest_rate": 6.5,
  "original_cltv":   80,             // optional — defaults to LTV
  "first_time_homebuyer": "N",       // optional
  "number_of_borrowers": 2,          // optional
  "channel":         "R",            // optional: R / C / B / 9
  "product_type":    "FRM30"         // optional
}

Response shape: overlay_band (string), overlay_decile (0-10), overlay_bps_raw (float — informational), recommendation (operating action), and a caveats array documenting the Phase 4 magnitude-calibration limitation + the Phase 5 disaster-index correlation.
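A minimal client sketch for the endpoint above, using only the Python standard library. The base URL is a placeholder, the function names are hypothetical, and error handling and any authentication the service may require are left out because they are not documented here. operating_signal deliberately drops overlay_bps_raw, following the guidance to make operating decisions from the band and decile.

```python
import json
import urllib.request

ENDPOINT_PATH = "/api/score_llpa_overlay"

# Request body using the field names from the example above.
loan = {
    "borrower_fico": 720,
    "original_ltv": 80,
    "loan_purpose": "P",
    "property_state": "CA",
    "dti": 38,
    "original_interest_rate": 6.5,
}


def build_request(base_url, loan_fields):
    """Build the POST request without sending it."""
    return urllib.request.Request(
        base_url.rstrip("/") + ENDPOINT_PATH,
        data=json.dumps(loan_fields).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def score_loan(base_url, loan_fields, timeout=10):
    """Send the request and return the decoded JSON response."""
    request = build_request(base_url, loan_fields)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


def operating_signal(response):
    """Keep the fields meant for decisions; leave out the informational raw bps."""
    keys = ("overlay_band", "overlay_decile", "recommendation", "caveats")
    return {key: response[key] for key in keys if key in response}


# Usage (base URL is a placeholder, not a real host):
#   result = score_loan("https://your-host.example", loan)
#   signal = operating_signal(result)
```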

Schema reference (request / response shape): GET /api/score_llpa_overlay/schema

Model metadata (training cohort, AUC, calibration): GET /api/score_llpa_overlay/info