Scoring Models · LLPA Overlay (within-cell residual risk-rank)
What it does
A two-stage HistGradientBoosting regressor with isotonic calibration that predicts the within-cell residual credit-loss expectation the Fannie/Freddie LLPA grid leaves unpriced. The output is an ordinal band (low / baseline / elevated / high) and a decile rank (Q0-Q10) per loan, not a continuous bps charge.
Trained on the GSE single-family acquisition data (Fannie SFP + Freddie STACR), 2014-2020 originations (~12M loans), with 2021-2022 out-of-time (OOT) validation (1.25M loans). Overlay AUC is 0.67 and combined-prediction AUC is 0.74 on credit-event ranking. A per-loan calibrated upfront overlay in bps is also surfaced as overlay_bps_raw, but it is informational only: the Phase 4 OOT validation showed a ~60% magnitude under-prediction (the model rank-orders risk correctly but undershoots realized loss magnitudes on the rate-shock 2021-2022 vintages). v2 will close the calibration gap once those vintages season further.
Architecture. Stage 1 trains on pass-through features that mirror the LLPA grid (FICO bucket, LTV bucket, loan purpose, occupancy, property type, units, product type, high-balance flag, subordinate-financing flag), so it predicts what the grid's implicit risk model would say. Stage 2 trains on the residual (realized loss − Stage 1 prediction) using the full feature set, which adds DTI, state proxies (HPA volatility, unemployment volatility, FEMA disaster index, BLS employment HHI), channel, origination rate, seller name, first-time-homebuyer flag, and number of borrowers. Stage 2's output is the overlay, which is orthogonal to the grid by construction.
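The two-stage split can be sketched in a few lines. This is a minimal illustration, not the production model: per-group means stand in for the two HistGradientBoosting stages so the example runs on the standard library alone, and the loans, bucket labels, and loss values are invented for the example.

```python
from collections import defaultdict
from statistics import mean

# Toy loans: (fico_bucket, ltv_bucket, dti_band, realized_loss_bps).
# All values are made up for illustration.
loans = [
    ("720-739", "75-80", "low", 10.0),
    ("720-739", "75-80", "high", 30.0),
    ("720-739", "75-80", "low", 14.0),
    ("720-739", "75-80", "high", 26.0),
    ("680-699", "80-85", "low", 40.0),
    ("680-699", "80-85", "high", 80.0),
]

def fit_group_means(rows, key, target):
    """Return {key(row): mean of target(row)}; a stand-in for a fitted model."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(target(row))
    return {k: mean(v) for k, v in groups.items()}

def grid_key(row):
    # Stage 1 sees only grid-style features (here: FICO and LTV buckets).
    return (row[0], row[1])

def full_key(row):
    # Stage 2 also sees a non-grid feature (here: a DTI band).
    return (row[0], row[1], row[2])

# Stage 1: what the grid cell alone would predict.
stage1 = fit_group_means(loans, grid_key, lambda r: r[3])

# Stage 2: fit the residual = realized loss - Stage 1 prediction.
stage2 = fit_group_means(loans, full_key, lambda r: r[3] - stage1[grid_key(r)])

def overlay(row):
    """Within-cell residual for a loan described by (fico, ltv, dti_band, ...)."""
    return stage2[full_key(row)]

for cell, base in stage1.items():
    print(cell, "stage-1 prediction:", base)
print("overlay, high-DTI loan in first cell:",
      overlay(("720-739", "75-80", "high")))
```

Because Stage 2 is fit to what Stage 1 leaves over, the in-sample residuals average to zero within each grid cell in this toy setup, which is the sense in which the overlay adds information the cell itself does not carry.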
Why it matters. The May 2026 LLPA risk article ("The LLPA grid prices a politically tolerable subset of mortgage risk") showed empirically that within a single LLPA cell — same FICO band, same LTV bucket, same base bps — modification rate varies 8.8× across DTI sub-bands and 21× across states. The unpriced residual doesn't vanish; it migrates to MSR strips, spec pool pay-ups, lender overlays, and ultimately to borrower-facing rate spread. This model gives lender rate-sheet engineers, secondary-marketing desks, and capital-markets teams a quantified pre-funding signal of where the grid is wrong, so the residual can be priced explicitly rather than absorbed as noise.
What it is NOT. The overlay does not replicate spec-pool pay-ups, which include liquidity premium, convexity adjustment, and investor risk premium beyond pure credit loss. The overlay isolates the incremental credit-loss expectation only. Per the Phase 4 NARROW verdict, the raw bps output is approximate; use overlay_band and overlay_decile for operating decisions.
Fair-lending audit. The Phase 5 disparate-impact audit (Path A, state-level vs ACS demographics) returned ACCEPTABLE: AIR(Q4/Q1) = 0.946 (within EEOC 4/5ths rule), regression effect after FICO/LTV/DTI controls = +0.067 bps (immaterial). One state proxy (FEMA disaster index) correlates r = +0.63 with minority concentration — documented as a known limitation. Mitigation playbook (cap disaster-index contribution, re-fit) is available if a future fair-lending exam requires it.
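The 4/5ths check behind the AIR figure is simple arithmetic. The text does not spell out how AIR(Q4/Q1) is computed, so this sketch assumes it is the ratio of a favorable-outcome rate in one group to the same rate in a reference group; the function names and the example rates are hypothetical.

```python
FOUR_FIFTHS_THRESHOLD = 0.8  # the EEOC 4/5ths rule of thumb

def adverse_impact_ratio(rate_focal: float, rate_reference: float) -> float:
    """Ratio of the focal group's favorable-outcome rate to the reference group's."""
    if rate_reference <= 0:
        raise ValueError("reference rate must be positive")
    return rate_focal / rate_reference

def passes_four_fifths(air: float) -> bool:
    """True when the ratio is at or above the 4/5ths threshold."""
    return air >= FOUR_FIFTHS_THRESHOLD

# Hypothetical rates chosen only so the ratio matches the reported 0.946.
air = adverse_impact_ratio(0.473, 0.5)
print(round(air, 3), passes_four_fifths(air))
```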
API connector
Programmatic access. The response carries the risk band, decile rank, and an operating recommendation; overlay_bps_raw is included for information only.
POST /api/score_llpa_overlay
Content-Type: application/json
{
"borrower_fico": 720,
"original_ltv": 80,
"loan_purpose": "P", // 'P' (purchase) or 'R' (any refi)
"property_state": "CA", // 2-letter postal abbreviation
"dti": 38, // optional — missing OK
"original_interest_rate": 6.5,
"original_cltv": 80, // optional — defaults to LTV
"first_time_homebuyer": "N", // optional
"number_of_borrowers": 2, // optional
"channel": "R", // optional: R / C / B / 9
"product_type": "FRM30" // optional
}
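A minimal sketch of building that request from Python with only the standard library. The host in BASE_URL is a placeholder, and authentication is not described here, so none is shown. The `//` annotations in the example body are explanatory and are not valid JSON, so the payload below omits them.

```python
import json
import urllib.request

BASE_URL = "https://example.com"  # placeholder host (assumption); substitute the real one

payload = {
    "borrower_fico": 720,
    "original_ltv": 80,
    "loan_purpose": "P",
    "property_state": "CA",
    "dti": 38,                      # optional
    "original_interest_rate": 6.5,
    "original_cltv": 80,            # optional
    "first_time_homebuyer": "N",    # optional
    "number_of_borrowers": 2,       # optional
    "channel": "R",                 # optional
    "product_type": "FRM30",        # optional
}

def build_request(body: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request for the overlay endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/api/score_llpa_overlay",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request(payload)
print(req.get_method(), req.full_url)

# To actually send it (requires network access and the real host):
# with urllib.request.urlopen(req, timeout=10) as resp:
#     result = json.load(resp)
```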
Response shape: overlay_band (string), overlay_decile (0-10), overlay_bps_raw (float, informational), recommendation (operating action), and a caveats array documenting the Phase 4 magnitude-calibration limitation and the Phase 5 disaster-index correlation.
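One way a client might consume that response, keeping the band and decile as the decision inputs and carrying the raw bps along as information only. This is a sketch under assumptions: the exact band strings (lowercase here), the integer type of the decile, and the sample response values are not confirmed by the text and are invented for illustration.

```python
VALID_BANDS = ("low", "baseline", "elevated", "high")  # casing is an assumption

def parse_overlay_response(resp: dict) -> dict:
    """Validate the decision fields and separate them from the informational bps."""
    band = resp["overlay_band"]
    decile = resp["overlay_decile"]
    if band not in VALID_BANDS:
        raise ValueError(f"unexpected overlay_band: {band!r}")
    if not isinstance(decile, int) or not 0 <= decile <= 10:
        raise ValueError(f"unexpected overlay_decile: {decile!r}")
    return {
        "band": band,
        "decile": decile,
        "recommendation": resp.get("recommendation"),
        "caveats": list(resp.get("caveats", [])),
        # Informational only; not meant to drive operating decisions.
        "bps_informational": resp.get("overlay_bps_raw"),
    }

# Invented sample response, for illustration only.
sample = {
    "overlay_band": "elevated",
    "overlay_decile": 8,
    "overlay_bps_raw": 12.5,
    "recommendation": "review pricing",
    "caveats": ["magnitude calibration", "disaster-index correlation"],
}
parsed = parse_overlay_response(sample)
print(parsed["band"], parsed["decile"])
```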
Schema reference (request / response shape): GET /api/score_llpa_overlay/schema
Model metadata (training cohort, AUC, calibration): GET /api/score_llpa_overlay/info
themortgagellm™