Methodology

How we build forecasts you can defend.

Every step, showable: from which public data we ingest, to how we turn it into features, to the exact formula that produces the forecast, to the test we put the forecast through before shipping. A forecast you can’t explain to an investment committee isn’t worth publishing.

Data ingestion

Eight federal and open sources pulled on their native cadence (daily, weekly, monthly, annual). All raw responses versioned to S3 before any transformation.

BLS · FRED · Census ACS + BPS · FHFA HPI · HUD FMR · Zillow ZHVI/ZORI · BEA regional GDP · Redfin Data Center

Feature engineering

Raw series are cleaned, normalized to MSA boundaries, seasonally adjusted where appropriate, and combined into ~120 features per market per month.

Affordability · permits · migration · jobs · rent-to-price · inventory

Model fit

Baseline: a mean-reverting momentum blend (local YoY × national drift × national mean). Research track: per-metro regressions on the observable feature panel, refit monthly. All model versions are retained so any historical forecast is reproducible.

Baseline v0.2 · research v0.3-multi · monthly refit

Forecast + intervals

12-month-ahead point forecasts with 80% prediction bands. Forecasts are versioned, so every historical forecast is recoverable for audit.

Point + 80% band · 12-month horizon · full version history
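
The page does not say how the 80% band is computed. One common approach, sketched here purely as an assumption, takes the 10th and 90th percentiles of past out-of-sample forecast errors and adds them to the point forecast. The function name `empirical_band` and its inputs are hypothetical.

```python
import statistics

def empirical_band(point_forecast, past_errors, coverage=0.80):
    """Return (lower, upper) around a point forecast.

    past_errors: realized-minus-forecast errors from earlier
    out-of-sample forecasts, in the same units as the forecast.
    Assumption: bands come from empirical error quantiles.
    """
    tail_pct = round((1.0 - coverage) / 2.0 * 100)  # 10 for an 80% band
    if not 0 < tail_pct < 50:
        raise ValueError("coverage must leave a whole-percent tail on each side")
    # With n=100 the function returns the 1st..99th percentile cut points.
    cuts = statistics.quantiles(past_errors, n=100, method="inclusive")
    lower_err = cuts[tail_pct - 1]          # e.g. 10th percentile
    upper_err = cuts[100 - tail_pct - 1]    # e.g. 90th percentile
    return point_forecast + lower_err, point_forecast + upper_err
```

With past errors of -2, -1, 0, 1 and 2 and a point forecast of 3.0, this sketch returns a band of roughly 1.4 to 4.6. A band built this way inherits whatever asymmetry the historical errors had.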

Validation

Walk-forward backtests refit the model on data strictly prior to each holdout period, then score MAPE and direction hit-rate against naive-persistence and trailing-mean baselines. Both expanding and rolling-window variants are published per market.

Walk-forward · expanding + rolling 36mo · vs. dual baselines
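
The walk-forward loop described above can be sketched as follows. This is a minimal illustration, not the production pipeline: it forecasts one step ahead for brevity (the published forecasts use a 12-month horizon), and `walk_forward`, `fit_predict`, and `min_train` are hypothetical names. Passing `window=None` gives the expanding variant; `window=36` gives the rolling 36-month variant.

```python
def walk_forward(series, fit_predict, min_train=36, window=None):
    """Return a list of (forecast, actual, last_seen) per holdout point.

    series      : list of floats in time order
    fit_predict : hypothetical callable(train) -> one-step-ahead forecast
    window      : None for expanding, or an int for a rolling window
    """
    if min_train < 1:
        raise ValueError("min_train must be at least 1")
    results = []
    for t in range(min_train, len(series)):
        start = 0 if window is None else max(0, t - window)
        train = series[start:t]  # strictly prior to the holdout point t
        results.append((fit_predict(train), series[t], series[t - 1]))
    return results

def naive_persistence(train):
    # Baseline 1: the next value equals the last observed value.
    return train[-1]

def trailing_mean(train, k=12):
    # Baseline 2: the next value equals the mean of the last k observations.
    tail = train[-k:]
    return sum(tail) / len(tail)

def mae(results):
    return sum(abs(f - a) for f, a, _ in results) / len(results)

def direction_hit_rate(results):
    # A hit means forecast and actual moved the same way from the last
    # observed value. Naive persistence never moves, so it scores 0 here.
    hits = [(f - last) * (a - last) > 0 for f, a, last in results]
    return sum(hits) / len(hits)
```

Because `train` is always sliced up to but excluding `t`, no holdout observation can leak into the fit, which is the property that makes the comparison against the two baselines fair.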

Publication

Forecasts published through the product, API, and research notes. Every number cites its underlying feature weights and data lineage.

Web · API · research notes · CSV export

Accuracy

We publish our forecast errors.

Out-of-sample mean absolute error against the FHFA HPI release, on 12-month horizons, refreshed monthly.

MAE (v1 target): ≤ 2.0% · 12m national HPI

WMAPE (v1 target): ≤ 18% · Relative error, all MSAs

Skill score (target): ≥ 0.12 · vs AR(1) baseline

Top-quartile (target): ≤ 1.2% · Best-calibrated MSAs

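Why WMAPE? Traditional MAPE explodes when realized HPI growth is near zero, which happens regularly in smaller MSAs during flat markets. WMAPE (Σ|y−ŷ| / Σ|y|) divides total absolute error by total absolute realized growth, so each market's percentage error is weighted by the magnitude of its realized value, and the metric stays well-defined as long as the realized values are not all zero.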
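
A small worked example makes the difference concrete. The three markets and their numbers below are invented for illustration; every forecast misses by the same 0.5 percentage points, but one market is nearly flat.

```python
def mape(actual, forecast):
    # Mean of per-market |error| / |actual|; blows up when actual is near 0.
    return sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)

def wmape(actual, forecast):
    # Total |error| over total |actual|; one near-zero actual cannot dominate.
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / sum(abs(a) for a in actual)

# Hypothetical realized vs. forecast 12-month HPI growth, in percent.
actual = [4.0, 5.0, 0.1]     # third market is nearly flat
forecast = [3.5, 5.5, 0.6]   # every forecast is off by 0.5pp

print(f"MAPE  = {mape(actual, forecast):.1%}")
print(f"WMAPE = {wmape(actual, forecast):.1%}")
```

Here MAPE comes out at roughly 174% because the flat market alone contributes a 500% error, while WMAPE is roughly 16%.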

Why skill score? It reports improvement over the AR(1) baseline directly: if a model can't cut MAE by at least 12% relative to a baseline that extrapolates from last year's change alone, it's not worth deploying.
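
The deployment bar can be written as a short check. This is a sketch using the skill-score definition from the glossary; the function names and the example MAE values are hypothetical.

```python
def skill_score(mae_model, mae_baseline):
    # 1 - MAE_model / MAE_baseline; positive means the model beats the baseline.
    if mae_baseline <= 0:
        raise ValueError("baseline MAE must be positive")
    return 1.0 - mae_model / mae_baseline

def clears_bar(mae_model, mae_baseline, threshold=0.12):
    # threshold=0.12 mirrors the 12% target stated on this page.
    return skill_score(mae_model, mae_baseline) >= threshold

# Hypothetical MAEs in percentage points.
print(clears_bar(2.0, 2.5))  # 20% reduction in MAE
print(clears_bar(2.4, 2.5))  # 4% reduction in MAE
```

The first call clears the 12% bar and the second does not.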

Live metrics land here on the first production run. Full method + definitions in the whitepaper.

Glossary

Terms we use.

HPI

House Price Index. We use FHFA All-Transactions, seasonally adjusted, as our anchor series at the MSA level.

MSA / CBSA

Metropolitan Statistical Area / Core-Based Statistical Area, as defined by the OMB. 410 covered.

MAE

Mean Absolute Error — mean(|y − ŷ|). Robust, in the same units as the target.

WMAPE

Weighted Mean Absolute Percentage Error — Σ|y−ŷ| / Σ|y|. Stays well-defined near zero, unlike vanilla MAPE.

MdAPE

Median Absolute Percentage Error — median(|y−ŷ|/|y|). Outlier-robust; half of markets beat this.

Skill score

1 − (MAE_model / MAE_baseline). Positive means the model beats AR(1); 0.12 means a 12% reduction in MAE relative to it.

Bias

mean(ŷ − y). Positive = systematic over-forecast. Target: |bias| < 0.3pp on the holdout.

Signal: Expand / Hold / Contract

Portfolio-oriented rollup of forecast, risk score, and valuation.

Risk score (A–F)

Forward-looking concentration of downside risk drivers: affordability, supply, migration, jobs.

ZHVI / ZORI

Zillow Home Value Index and Zillow Observed Rent Index — used as validation series.
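
For readers who want to check the error-metric definitions in this glossary, MAE, MdAPE, and bias can be computed in a few lines. This is an illustrative sketch with hypothetical function names and made-up numbers, not the production scoring code.

```python
from statistics import mean, median

def mae(actual, forecast):
    # mean(|y - y_hat|), in the same units as the target
    return mean(abs(a - f) for a, f in zip(actual, forecast))

def mdape(actual, forecast):
    # median(|y - y_hat| / |y|); requires every actual value to be nonzero
    return median(abs(a - f) / abs(a) for a, f in zip(actual, forecast))

def bias(actual, forecast):
    # mean(y_hat - y); positive means systematic over-forecast
    return mean(f - a for a, f in zip(actual, forecast))

# Hypothetical realized vs. forecast growth, in percent.
actual = [2.0, 4.0, 5.0]
forecast = [2.5, 3.0, 5.5]

print(mae(actual, forecast), mdape(actual, forecast), bias(actual, forecast))
```

In this example the over- and under-forecasts cancel, so bias is zero even though MAE is not, which is why the two are reported separately.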

The model card

One-page summary

The equation, the inputs, the live coefficients, the latest accuracy numbers, the data freshness. For prospects who want the read in 60 seconds.

Read the model card →

Whitepaper

Full technical spec

Features, training protocol, full accuracy metrics (MAE, RMSE, WMAPE, MdAPE, skill score), and reproducibility instructions.

Read the whitepaper