themortgagellm

Scoring Models · Higher Priced Loan (HPML)

Category: Origination

What it does

Gradient-boosting classifier that estimates the probability that a HMDA-style loan application prices as a Higher-Priced Mortgage Loan (HPML) under Reg Z. The target is positive when rate_spread ≥ 1.5 pp for first liens (≥ 3.5 pp for subordinate liens) or when the loan is HOEPA-flagged. Trained on the 2018-2023 HMDA Snapshot LAR (60.9M originated loans) and tested on 2024-2025 originations (13.0M). Test AUC is 0.87 on this cross-cycle holdout; a random split within 2018-2023 gives AUC 0.93. Probabilities are isotonic-calibrated, and calibration on the test set is near-perfect (ECE = 0.0003, Brier 0.035). The empirical HPML rate in the training data is about 5.8%.
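The target definition above can be written out as a small rule. This is a minimal sketch of the label as described on this page, not the model's actual training code: the function name, the boolean `hoepa_flagged` argument, and the choice to treat a missing rate spread as not HPML are all assumptions made for illustration.

```python
from typing import Optional

# Thresholds in percentage points, as stated in the model description.
FIRST_LIEN_THRESHOLD_PP = 1.5
SUBORDINATE_LIEN_THRESHOLD_PP = 3.5


def is_hpml(rate_spread: Optional[float], lien_status: str,
            hoepa_flagged: bool = False) -> bool:
    """Return True when a loan meets the HPML label described on this page.

    lien_status uses the codes from the request example: "1" = first lien,
    "2" = subordinate lien. Hypothetical helper, for illustration only.
    """
    if lien_status not in ("1", "2"):
        raise ValueError(f"unexpected lien_status: {lien_status!r}")
    # A HOEPA flag is sufficient on its own.
    if hoepa_flagged:
        return True
    # Assumption: a missing rate spread is treated as "not HPML".
    if rate_spread is None:
        return False
    threshold = (FIRST_LIEN_THRESHOLD_PP if lien_status == "1"
                 else SUBORDINATE_LIEN_THRESHOLD_PP)
    return rate_spread >= threshold
```

For example, a first-lien loan with a 1.5 pp spread meets the label, while a subordinate-lien loan with a 3.0 pp spread does not unless it is HOEPA-flagged.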

Why it matters. HPML status triggers Reg Z operational requirements (escrow account mandate, full appraisal protections, expanded ATR documentation) that add roughly $300-500 per file in build cost and, if missed, create CFPB / examiner compliance exposure (findings can run $5-25K per loan). Use the prediction before funding to flag the file for the HPML-specific build on day one and to verify that the pricing tier supports the APR build. This is particularly material for FHA / VA channels, where ~14% of originations price into HPML.

› Try it on the home page (Loan-level model scoring → Higher-Priced loan)

API connector

Programmatic access. The response contains the calibrated probability, a risk band, and an operating recommendation.

POST /api/score_higher_priced
Content-Type: application/json

{
  "loan_type": "2",                 // 1=Conv, 2=FHA, 3=VA, 4=USDA-RD
  "lien_status": "1",               // 1=first, 2=subordinate
  "loan_purpose": "1",              // 1=purchase, 31=refi, 32=cash-out refi
  "occupancy_type": "1",
  "cltv": 95,
  "debt_to_income_ratio": ">60",
  "loan_amount": 285000,
  "property_value": 300000,
  "state_code": "TX",
  "lei": "549300...",
  ...
}

Schema reference (request / response shape): GET /api/score_higher_priced/schema

Model metadata (training cohort, AUC, calibration): GET /api/score_higher_priced/info
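
The three endpoints above could be called from Python roughly as follows. This is a sketch under stated assumptions: `BASE_URL` is a placeholder (the page does not give a host), the helper names are hypothetical, and the payload contains only the fields shown in the request example, which elides further fields and truncates `lei`. The response is returned as parsed JSON without assuming any field names; check the schema endpoint for the actual shape.

```python
import json
import urllib.request

# Placeholder host: the real base URL is not stated on this page.
BASE_URL = "https://example.invalid"

# Only the fields visible in the request example; "lei" is omitted because
# the example truncates it, and the example elides additional fields.
EXAMPLE_PAYLOAD = {
    "loan_type": "2",          # 1=Conv, 2=FHA, 3=VA, 4=USDA-RD
    "lien_status": "1",        # 1=first, 2=subordinate
    "loan_purpose": "1",       # 1=purchase, 31=refi, 32=cash-out refi
    "occupancy_type": "1",
    "cltv": 95,
    "debt_to_income_ratio": ">60",
    "loan_amount": 285000,
    "property_value": 300000,
    "state_code": "TX",
}


def build_score_request(payload: dict,
                        base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for /api/score_higher_priced without sending it."""
    return urllib.request.Request(
        url=f"{base_url}/api/score_higher_priced",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def score_higher_priced(payload: dict, base_url: str = BASE_URL,
                        timeout: float = 10.0) -> dict:
    """Send the scoring request and return the parsed JSON response."""
    request = build_score_request(payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def get_json(path: str, base_url: str = BASE_URL,
             timeout: float = 10.0) -> dict:
    """GET a JSON document, e.g. the /schema or /info endpoint."""
    with urllib.request.urlopen(f"{base_url}{path}", timeout=timeout) as response:
        return json.load(response)
```

With a real host substituted for the placeholder, `score_higher_priced(EXAMPLE_PAYLOAD)` would send the request, and `get_json("/api/score_higher_priced/schema")` or `get_json("/api/score_higher_priced/info")` would fetch the schema and model metadata. Building the request separately from sending it makes the request easy to inspect or log before any network call.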

See also: How to read these AUC numbers.