themortgagellm

Scoring Models

A portfolio of calibrated loan-level risk and pricing models, built from the ground up on the full public mortgage record. Each model ships with an out-of-time test AUC, a calibrated probability you can drop straight into expected-value math, and an interpretable recommendation.

60M+ GSE loan acquisitions (Fannie + Freddie, 2013-2025)
1.3B HMDA application records (2018-2025 LAR)
11+ years of monthly performance (GSE + GNMA disclosures)
15 production-grade scoring models (calibrated, OOT-validated)

How the models are built

Every model is a gradient-boosting classifier (scikit-learn HistGradientBoosting) trained on full-population data, with no sampling and no synthetic cohorts. Hyperparameters are tuned via randomized search over a parameter grid, with early stopping; class imbalance (e.g. denials, repurchases) is handled with class-weight balancing rather than oversampling so the recovered probabilities stay interpretable.

Raw classifier scores are then isotonically calibrated against the empirical positive rate in each score decile. The result: a predicted 5% probability corresponds to an observed 5% event rate in the historical data, not just "this loan is riskier than that loan." Calibration quality is reported on every model page as Brier score and expected calibration error (ECE) on the held-out test set.
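To make the calibration step concrete, here is a small sketch on synthetic data. It uses scikit-learn's standard `IsotonicRegression` fitted on individual scores, which may differ from the decile-based procedure described above, and the ECE helper is one common equal-frequency-bin definition written for this example. The distorted `raw` scores are an assumption chosen to rank-order correctly while overstating magnitudes.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.metrics import brier_score_loss

def expected_calibration_error(y_true, p, n_bins=10):
    """Size-weighted mean |observed rate - mean prediction| over equal-frequency bins."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.asarray(p, dtype=float)
    order = np.argsort(p)
    ece = 0.0
    for idx in np.array_split(order, n_bins):  # deciles when n_bins=10
        if len(idx) == 0:
            continue
        ece += len(idx) / len(p) * abs(y_true[idx].mean() - p[idx].mean())
    return ece

rng = np.random.default_rng(1)
n = 20000
true_p = rng.beta(1.0, 15.0, size=n)      # true event probabilities, mean about 6%
y = (rng.random(n) < true_p).astype(int)
raw = np.sqrt(true_p)                     # monotone distortion: same ranking, inflated levels

# Fit the calibrator on one half, evaluate on the other half.
fit_idx = np.arange(0, n // 2)
eval_idx = np.arange(n // 2, n)
iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
iso.fit(raw[fit_idx], y[fit_idx])
calibrated = iso.predict(raw[eval_idx])

brier_raw = brier_score_loss(y[eval_idx], raw[eval_idx])
brier_cal = brier_score_loss(y[eval_idx], calibrated)
ece_raw = expected_calibration_error(y[eval_idx], raw[eval_idx])
ece_cal = expected_calibration_error(y[eval_idx], calibrated)
```

Because isotonic regression is monotone, it preserves the rank order of the scores (up to ties) while pulling the absolute levels toward observed event rates, which is why both Brier score and ECE improve in this toy setup.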

Validation is out-of-time: we hold out the latest 1–2 vintages from training and report AUC on those. That's a stricter bar than random-split AUC (also reported, and usually 5-10 AUC points higher) because the test set carries the distribution shift the model will face in production. Every number on this site reflects the out-of-time figure unless explicitly tagged otherwise.
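An out-of-time split of this kind might look like the sketch below. The helper name `out_of_time_split`, the vintage values, and the scores are all hypothetical; in practice the model would be fitted only on the rows selected by `train_mask` before scoring the held-out vintages.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def out_of_time_split(vintages, n_holdout=2):
    """Return boolean masks (train, test); the latest n_holdout distinct vintages form the test set."""
    vintages = np.asarray(vintages)
    latest = np.sort(np.unique(vintages))[-n_holdout:]
    test_mask = np.isin(vintages, latest)
    return ~test_mask, test_mask

# Toy data: origination vintage, outcome label, and a hypothetical model score per loan.
vintages = np.array([2019, 2019, 2020, 2020, 2021, 2021, 2022, 2022, 2023, 2023])
y = np.array([0, 1, 0, 0, 1, 0, 0, 1, 0, 1])
scores = np.array([0.1, 0.6, 0.2, 0.1, 0.7, 0.3, 0.2, 0.8, 0.4, 0.9])

train_mask, test_mask = out_of_time_split(vintages, n_holdout=2)

# AUC is computed only on the held-out (latest) vintages.
oot_auc = roc_auc_score(y[test_mask], scores[test_mask])
```

Splitting on vintage rather than at random keeps every test loan strictly later than every training loan, so the reported AUC reflects performance under drift instead of interpolation within known vintages.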

Why calibrated probabilities matter

Rank-ordering ("riskier than baseline") is enough for triage but not for pricing. A loan with a calibrated 8.0% repurchase probability and one with a 2.0% repurchase probability should produce expected R&W reserves that differ by a factor of 4 — and they do, on this platform. Uncalibrated scores can rank-order correctly while being 10× off on absolute magnitudes, which silently breaks any downstream EV calculation.

Practical consequence: the probabilities here can be multiplied by loss assumptions (e.g. $200K per repurchase, $10K per EPD signal) to produce expected-loss numbers that the pricing desk can use directly. That's how the Fannie vs Freddie channel-choice model works under the hood — a compound expected-loss calc using calibrated probabilities from the underlying repurchase + EPD models.
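As a worked illustration of that arithmetic, the sketch below multiplies calibrated probabilities by the loss assumptions quoted above. The per-channel probabilities are invented for the example, and simply adding the repurchase and EPD terms is a simplifying assumption; how the actual channel-choice model combines them is not documented here.

```python
from dataclasses import dataclass

# Loss assumptions quoted in the text; treat them as illustrative inputs.
LOSS_PER_REPURCHASE = 200_000
LOSS_PER_EPD = 10_000

@dataclass
class ChannelScores:
    p_repurchase: float  # calibrated repurchase probability
    p_epd: float         # calibrated early-payment-default probability

def expected_loss(s: ChannelScores) -> float:
    """Expected loss in dollars: sum of probability x loss severity (additivity assumed)."""
    return s.p_repurchase * LOSS_PER_REPURCHASE + s.p_epd * LOSS_PER_EPD

# Hypothetical calibrated scores for the same loan delivered to each channel.
scores = {
    "Fannie": ChannelScores(p_repurchase=0.020, p_epd=0.030),
    "Freddie": ChannelScores(p_repurchase=0.015, p_epd=0.045),
}
losses = {channel: expected_loss(s) for channel, s in scores.items()}
best_channel = min(losses, key=losses.get)

# The 8% vs 2% repurchase example: reserves scale linearly with probability.
reserve_ratio = (0.08 * LOSS_PER_REPURCHASE) / (0.02 * LOSS_PER_REPURCHASE)
```

With these made-up inputs the expected loss is about $4,300 for Fannie and $3,450 for Freddie, so the lower-loss channel is Freddie, and the reserve ratio between an 8% and a 2% repurchase probability comes out at 4.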

Pre-funding decisioning

For loan officers, secondary marketing, and pipeline managers — score an application before it funds.

Credit Denial Probability AUC 0.91

Gradient-boosting classifier rating the probability that an application is denied for credit reasons (action_taken = 3).

Higher Priced Loan (HPML) AUC 0.87

Gradient-boosting classifier rating a HMDA-style loan application's probability of pricing into a Higher-Priced Mortgage Loan under Reg Z: rate_spread ≥ 1.5 pp for first liens (≥ 3.5 pp for subordinate liens), O…

Pricing & channel

For best-execution, pricing optimization, and delivery routing — compare expected outcomes across channels.

LLPA Overlay (within-cell residual risk-rank) AUC 0.67

Two-stage HistGradientBoosting regressor + isotonic calibration predicting the within-cell residual credit-loss expectation that the Fannie/Freddie LLPA grid leaves unpriced.

Pipeline & post-funding

For QC, repurchase reserves, MSR valuation, and pool composition — score loans through the post-funding window.

Repurchase-risk scoring (v4) AUC 0.72

Gradient-boosting model rating a Fannie or Freddie loan's probability of being repurchased for rep-and-warranty defect.

Prepayment 12-mo

Gradient-boosting classifier rating a Fannie or Freddie loan's probability of prepayment (zero_balance_code = '01') within the first 12 months of loan age.

Prepayment 24-mo

Same loan-feature schema as Prepayment 12-mo, but predicts cumulative prepayment probability within the first 24 months.

Prepayment 36-mo

Predicts cumulative prepayment probability within the first 36 months — the bulk of the refi-burnout window for typical 30-year fixed product.
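The three prepayment targets differ only in horizon, so the labels could be built with one helper along these lines. The monthly-record layout (a list of dicts with `loan_age` and `zero_balance_code` keys) and the function name are assumptions for illustration; only the `'01'` prepayment code and the 12/24/36-month horizons come from the descriptions above.

```python
PREPAID_CODE = "01"  # zero_balance_code value for prepayment, per the model descriptions

def prepayment_labels(monthly_records, horizons=(12, 24, 36)):
    """Return {horizon: 0/1} for cumulative prepayment within each horizon (months of loan age)."""
    # Loan age at which the prepayment code first appears, or None if it never does.
    prepay_ages = [
        r["loan_age"] for r in monthly_records
        if r.get("zero_balance_code") == PREPAID_CODE
    ]
    prepay_age = min(prepay_ages) if prepay_ages else None
    return {
        h: int(prepay_age is not None and prepay_age <= h)
        for h in horizons
    }

# Toy history: the loan stays active for 17 months, then prepays at loan age 18.
history = [{"loan_age": m, "zero_balance_code": ""} for m in range(1, 18)]
history.append({"loan_age": 18, "zero_balance_code": "01"})

labels = prepayment_labels(history)
```

Because the targets are cumulative, a loan flagged at 24 months is also flagged at 36 months. Loans observed for less than a given horizon without prepaying would need separate handling, which this sketch does not attempt.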

EPD 12-mo (v2) AUC 0.83

Gradient-boosting model rating a loan's probability of reaching 60+ days delinquent within the first 12 months post-origination, the industry-standard Early Payment Default definition.

EPD 24-mo AUC 0.78

Same loan-level feature schema as EPD 12-mo, but predicts the probability of 60+ DQ within the first 24 months post-origination.

EPD 36-mo AUC 0.75

Same feature schema, predicts 60+ DQ within the first 36 months.
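For the EPD family, the target can be expressed as a single flag parameterized by horizon. The sketch below assumes each monthly record carries the loan age and the number of months delinquent as an integer, with 2 or more taken to mean 60+ days delinquent; both the input layout and the function name are assumptions for illustration.

```python
def epd_flag(history, horizon_months, dq_threshold=2):
    """1 if the loan reaches dq_threshold+ months delinquent at or before horizon_months, else 0.

    history: iterable of (loan_age_months, months_delinquent) tuples (assumed layout).
    """
    return int(any(
        age <= horizon_months and months_dq >= dq_threshold
        for age, months_dq in history
    ))

# Toy loan: current through month 13, 30 days late at month 14, 60 days late at month 15.
late_loan = [(m, 0) for m in range(1, 14)] + [(14, 1), (15, 2), (16, 0)]
# Toy loan that never misses a payment over 36 months.
clean_loan = [(m, 0) for m in range(1, 37)]

flags_late = {h: epd_flag(late_loan, h) for h in (12, 24, 36)}
flags_clean = {h: epd_flag(clean_loan, h) for h in (12, 24, 36)}
```

The flag is "ever 60+ within the window", so a loan that later cures (as the first toy loan does at month 16) still counts as a positive for the 24- and 36-month targets.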

Try it

Pick a model, describe a loan, get a calibrated probability + risk band + recommendation in under a second. Signed-in users also get Compare across all pipeline models — same loan scored against every pipeline / post-funding model in parallel for a side-by-side scorecard.

› Go to Loan-level scoring