Scoring Models
A portfolio of calibrated loan-level risk and pricing models, built from the ground up on the full public mortgage record. Each model ships with an out-of-time test AUC, a calibrated probability you can drop straight into expected-value math, and an interpretable recommendation.
(Fannie + Freddie, 2013-2025)
(2018-2025 LAR)
(GSE + GNMA disclosures)
(calibrated, OOT-validated)
How the models are built
Every model is a gradient-boosting classifier (sklearn HistGradientBoosting) trained on full-population data — no sampling, no synthetic cohorts. Hyperparameters are tuned via random-grid search with early stopping; class imbalance (e.g. denials, repurchases) is handled with class-weight balancing rather than oversampling so the recovered probabilities stay interpretable.
Raw classifier scores are then isotonically calibrated against the empirical positive rate in each score decile. The result: a predicted 5% probability matches an actual 5% rate in the historical data, not just "this loan is riskier than that loan." Calibration quality is reported on every model page as Brier score and expected calibration error (ECE) on the held-out test set.
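For readers unfamiliar with the two calibration metrics, here is a small standard-library sketch. The Brier score is the mean squared gap between predicted probability and outcome; ECE is the bin-weighted mean absolute gap between average predicted probability and observed positive rate. The ten equal-width bins and the toy data are assumptions for illustration, not necessarily the binning used on the model pages.

```python
def brier_score(probs, labels):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted mean |avg predicted - observed rate| over equal-width bins."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_pred = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        ece += len(members) / total * abs(mean_pred - observed)
    return ece


# Toy data: 20 loans scored 5% (1 positive), 10 loans scored 50% (5 positives).
probs = [0.05] * 20 + [0.5] * 10
labels = [1] + [0] * 19 + [1] * 5 + [0] * 5

well_calibrated_ece = expected_calibration_error(probs, labels)

# Doubling every score keeps the ranking but breaks calibration.
doubled = [min(1.0, 2 * p) for p in probs]
doubled_ece = expected_calibration_error(doubled, labels)
```

On this toy data the original scores have an ECE of essentially zero, while the doubled scores do not, even though both order the loans identically.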
Validation is out-of-time: we hold out the latest 1–2 vintages from training and report AUC on those. That's a stricter bar than random-split AUC (which is also reported, usually 5-10 points higher) because the test set carries the distribution shift the model will face in production. Every number on this site reflects the out-of-time figure unless explicitly tagged otherwise.
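An out-of-time split of this kind can be sketched in a few lines. The record layout (a dict with a "vintage" year) and the function name are hypothetical, chosen only for illustration.

```python
def out_of_time_split(loans, vintage_key="vintage", n_holdout=2):
    """Hold out the latest n_holdout vintages as the test set."""
    vintages = sorted({loan[vintage_key] for loan in loans})
    if len(vintages) <= n_holdout:
        raise ValueError("need more distinct vintages than n_holdout")
    holdout = set(vintages[-n_holdout:])
    train = [loan for loan in loans if loan[vintage_key] not in holdout]
    test = [loan for loan in loans if loan[vintage_key] in holdout]
    return train, test


# Hypothetical records: two loans per origination year, 2019 through 2024.
loans = [
    {"loan_id": f"{year}-{i}", "vintage": year}
    for year in range(2019, 2025)
    for i in range(2)
]

train, test = out_of_time_split(loans, n_holdout=2)
```

Unlike a random split, no loan in the training set is newer than any loan in the test set, so the reported AUC reflects performance on vintages the model has never seen.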
Why calibrated probabilities matter
Rank-ordering ("riskier than baseline") is enough for triage but not for pricing. At equal loss severity, a loan with a calibrated 8.0% repurchase probability and one with a 2.0% repurchase probability should produce expected rep-and-warranty (R&W) reserves that differ by a factor of 4, and on this platform they do. Uncalibrated scores can rank-order correctly while being 10× off on absolute magnitudes, which silently breaks any downstream EV calculation.
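The gap between ranking and magnitude is easy to demonstrate. The sketch below uses a pairwise AUC written in plain Python and made-up scores: dividing every score by 10 leaves AUC untouched but shrinks the implied number of repurchases tenfold. All values are hypothetical.

```python
def auc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical scores and outcomes for six loans.
calibrated = [0.01, 0.02, 0.02, 0.05, 0.08, 0.20]
labels = [0, 0, 0, 1, 0, 1]

# A monotone distortion: same ordering, magnitudes 10x too small.
shrunk = [p / 10 for p in calibrated]

auc_calibrated = auc(calibrated, labels)
auc_shrunk = auc(shrunk, labels)

# Expected number of repurchases implied by each set of scores.
expected_count_calibrated = sum(calibrated)
expected_count_shrunk = sum(shrunk)
```

Both score sets earn the same AUC, so AUC alone cannot reveal the distortion; only a calibration check can.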
Practical consequence: the probabilities here can be multiplied by loss assumptions (e.g. $200K per repurchase, $10K per early payment default (EPD) signal) to produce expected-loss numbers that the pricing desk can use directly. That's how the Fannie vs Freddie channel-choice model works under the hood: a compound expected-loss calculation using calibrated probabilities from the underlying repurchase + EPD models.
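As a rough illustration of that arithmetic, the sketch below combines the two example loss assumptions from the text with per-channel probabilities. The additive combination, the function names, and the probabilities are all assumptions made for the example; the actual channel-choice model may compound the terms differently.

```python
# Loss assumptions taken from the examples in the text above.
LOSS_PER_REPURCHASE = 200_000
LOSS_PER_EPD = 10_000


def expected_loss(p_repurchase, p_epd):
    """Expected dollar loss: probability times severity, summed (an assumed form)."""
    return p_repurchase * LOSS_PER_REPURCHASE + p_epd * LOSS_PER_EPD


def cheaper_channel(probs_by_channel):
    """Return the channel with the lowest expected loss, plus all the losses."""
    losses = {
        channel: expected_loss(**probs)
        for channel, probs in probs_by_channel.items()
    }
    return min(losses, key=losses.get), losses


# Hypothetical calibrated probabilities for one dual-eligible loan.
best, losses = cheaper_channel({
    "fannie": {"p_repurchase": 0.004, "p_epd": 0.02},
    "freddie": {"p_repurchase": 0.003, "p_epd": 0.03},
})
```

With these made-up inputs the Freddie channel comes out cheaper, because its lower repurchase probability outweighs its higher EPD probability at the stated severities. The comparison is only meaningful if both sets of probabilities are calibrated.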
Pre-funding decisioning
For loan officers, secondary marketing, and pipeline managers — score an application before it funds.
Pull-through (application → origination) AUC 0.92
Gradient-boosting classifier rating the probability that a HMDA-style application closes as an originated loan (action_taken = 1).
Credit Denial Probability AUC 0.91
Gradient-boosting classifier rating the probability that an application is denied for credit reasons (action_taken = 3).
Credit Approval Probability
Positive-framing sibling of Credit Denial Probability.
Higher Priced Loan (HPML) AUC 0.87
Gradient-boosting classifier rating a HMDA-style loan application's probability of pricing into a Higher-Priced Mortgage Loan under Reg Z: rate_spread ≥ 1.5 pp for first liens (≥ 3.5 pp for subordinate liens), O…
Pricing & channel
For best-execution, pricing optimization, and delivery routing — compare expected outcomes across channels.
Appraisal Waiver Probability (PIW / Value Acceptance / ACE) AUC 0.85
Gradient-boosting classifier rating the probability that a conventional conforming loan will be granted an appraisal waiver — either Fannie Mae's Value Acceptance (formerly Property Inspection Waiver / PIW) or F…
LLPA Overlay (within-cell residual risk-rank) AUC 0.67
Two-stage HistGradientBoosting regressor + isotonic calibration predicting the within-cell residual credit-loss expectation that the Fannie/Freddie LLPA grid leaves unpriced.
Pipeline & post-funding
For QC, repurchase reserves, MSR valuation, and pool composition — score loans through the post-funding window.
Repurchase-risk scoring (v4) AUC 0.72
Gradient-boosting model rating a Fannie or Freddie loan's probability of being repurchased for rep-and-warranty defect.
Prepayment 12-mo
Gradient-boosting classifier rating a Fannie or Freddie loan's probability of prepayment (zero_balance_code = '01') within the first 12 months of loan age.
Prepayment 24-mo
Same loan-feature schema as Prepayment 12-mo, but predicts cumulative prepayment probability within the first 24 months.
Prepayment 36-mo
Predicts cumulative prepayment probability within the first 36 months — the bulk of the refi-burnout window for typical 30-year fixed product.
EPD 12-mo (v2) AUC 0.83
Gradient-boosting model rating a loan's probability of reaching 60+ days delinquent within the first 12 months post-origination, the industry-standard Early Payment Default definition.
EPD 24-mo AUC 0.78
Same loan-level feature schema as EPD 12-mo, but predicts the probability of 60+ DQ within the first 24 months post-origination.
EPD 36-mo AUC 0.75
Same feature schema, predicts 60+ DQ within the first 36 months.
GNMA EPD (FHA / VA / USDA / PIH) AUC 0.76
Government-insured EPD model.
Fannie vs Freddie channel choice
For a given loan that's eligible for both GSEs, predicts which channel produces lower expected loss.
Try it
Pick a model, describe a loan, and get a calibrated probability + risk band + recommendation in under a second. Signed-in users also get Compare across all pipeline models: the same loan scored against every pipeline / post-funding model in parallel for a side-by-side scorecard.
themortgagellm™