Berkeley Housing Pipeline Analysis

This is the technical starting point. To see all 174 projects quickly, click the **Pipeline Explorer** button below, then **Spatial**.

These websites track every step of housing development in Berkeley, from permit application to certificate of occupancy to actual occupancy. They were originally built to prepare Berkeley's Annual Progress Report (APR) to the California Department of Housing and Community Development (HCD), using an independent, completely open public database constructed from the City of Berkeley Open.Gov data. We have since outlined a new way to enlist thousands of data science students and instructors to help overburdened planning offices in every city tell their housing story clearly and accurately. These pages are a first attempt.

In building our independent APR for 2023, 2024, and 2025, it became clear that the official APR suffered from cumulative double counting and from errors in the dating of entitlements, building permit issuance, and inspections. Berkeley submitted its report to HCD on March 26, 2026, covering 202; the differences are shown here. We corrected for accounting errors and redundant project entries in both Table A and Table A-1. The more accurate number of residential units issued building permits is as follows:

Residential units issued building permits after correcting accounting errors and redundant project entries

| Year | Reported Units Permitted | Deductions (Inaccuracies) | Actual Unique Units |
|------|--------------------------|---------------------------|---------------------|
| 2025 | 463 units | -200 units | 263 units |
| 2024 | 731 units | -486 units | 245 units |
| 2023 | 431 units | -126 units | 305 units |
| 2022 | 887 units | 0 units | 887 units |
| 2021 | 659 units | 0 units | 659 units |
| 2020 | 733 units | -267 units | 466 units |
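
As a quick sanity check, the arithmetic in the table above can be replayed in a few lines of Python. This is only a sketch: the rows are copied from the table, while the names `PERMITTED`, `check_rows`, and `totals` are ours for illustration and are not taken from the pipeline notebooks.

```python
# Rows copied from the table above:
# (year, reported units permitted, deductions, actual unique units).
PERMITTED = [
    (2025, 463, -200, 263),
    (2024, 731, -486, 245),
    (2023, 431, -126, 305),
    (2022, 887, 0, 887),
    (2021, 659, 0, 659),
    (2020, 733, -267, 466),
]


def check_rows(rows):
    """Return the years where reported + deductions != actual unique units."""
    return [
        year
        for year, reported, deduction, actual in rows
        if reported + deduction != actual
    ]


def totals(rows):
    """Sum the reported, deduction, and actual columns across all years."""
    reported = sum(row[1] for row in rows)
    deductions = sum(row[2] for row in rows)
    actual = sum(row[3] for row in rows)
    return reported, deductions, actual


if __name__ == "__main__":
    print("inconsistent years:", check_rows(PERMITTED))
    print("totals (reported, deductions, actual):", totals(PERMITTED))
```

Every row is internally consistent, and over the six years the table sums to 3,904 reported units, 1,079 deducted, and 2,825 unique units.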

These are not huge errors, but Berkeley is tasked by the State with enabling over 1,000 new units each year until 2031, and we are behind in permit applications, behind in entitlements, behind in building permits issued, and, most importantly, behind in buildings completed and ready for occupancy. The State tracks building permits issued; the real issue is that almost all actual construction is stalled by increased building costs, a shrinking labor force, higher interest rates, and lower levels of bank lending and investor interest.

An offsetting new supply comes from UC Berkeley, which has almost 4,000 new dormitory beds under construction, putting downward pressure on rental costs.

Residential units completed after removing identified duplicate entries

| Year | Reported Units Completed | Identified Duplicates | Actual Unique Units |
|------|--------------------------|-----------------------|---------------------|
| 2025 | 493 units | -11 units | 482 units |
| 2024 | 708 units | 0 identified | 708 units |
| 2023 | 716 units | 0 identified | 716 units |
| 2022 | 828 units | 0 identified | 828 units |
| 2021 | (Broken Summary) | N/A | N/A |
| 2020 | 399 units | 0 identified | 399 units |
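
To make the idea of cumulative double counting concrete, here is a minimal sketch of one way repeated entries could be separated from first occurrences. It assumes each entry carries a stable identifier; the field names `permit_id`, `year`, and `units`, the helper names, and the sample rows are all made up for illustration and do not come from the APR tables or from our notebooks.

```python
from collections import defaultdict


def split_duplicates(entries):
    """Split entries into first occurrences and repeats, keyed by permit_id.

    The earliest year for each permit_id is kept as the unique entry;
    any later entry with the same permit_id is treated as a repeat.
    """
    unique, repeats = [], []
    seen = set()
    for entry in sorted(entries, key=lambda e: e["year"]):
        if entry["permit_id"] in seen:
            repeats.append(entry)
        else:
            seen.add(entry["permit_id"])
            unique.append(entry)
    return unique, repeats


def units_by_year(entries):
    """Total units per reporting year."""
    result = defaultdict(int)
    for entry in entries:
        result[entry["year"]] += entry["units"]
    return dict(result)


# Made-up rows: permit P-001 is counted in 2023 and again in 2024.
SAMPLE = [
    {"permit_id": "P-001", "year": 2023, "units": 50},
    {"permit_id": "P-002", "year": 2024, "units": 30},
    {"permit_id": "P-001", "year": 2024, "units": 50},
]

unique, repeats = split_duplicates(SAMPLE)
reported = units_by_year(SAMPLE)    # totals as reported, repeats included
corrected = units_by_year(unique)   # totals after dropping repeats
deductions = units_by_year(repeats) # units removed per year
```

In this toy case the 2024 total drops from 80 reported units to 30 unique units. Real entries rarely share a clean identifier, which is why the matching in practice also has to lean on addresses and parcel numbers.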

Pipeline Explorer

Interactive dashboard with sortable tables, timeline visualization, APR comparison, and height analysis.

Explore Data

Live Data Map

Interactive map of all Berkeley housing projects with status, units, and timeline data.

View Map

Run Analysis (Colab)

Execute the full pipeline in Google Colab. No local setup required.

Open in Colab

GitHub Repository

Source code, data files, and documentation for the complete analysis pipeline.

View Code

5-Stage Pipeline

A: Collection (API + Manual)
B: Tracking (Timeline)
C: Analysis (Statistics)
D: Reporting (APR Export)
F: Feasibility (Pro Forma)

Notebooks

Click any notebook to open it directly in Google Colab:

| Stage | Notebook | Description | Run |
|-------|----------|-------------|-----|
| 00 | 00A_tour_of_the_pipeline | Non-technical tour of housing pipeline concepts | Colab |
| 00 | 00B_first_notebook_in_colab | Your first hands-on notebook: create data and charts | Colab |
| A | A1_data_sources_setup | Connect to Berkeley Open Data API, handle WAF blocks | Colab |
| A | A2_address_standardization | Normalize addresses (FIFTH ↔ 5TH, Ave ↔ AV) | Colab |
| A | A3_geocoding_pipeline | Match to lat/lon using 563K address lookup | Colab |
| A | A4_apn_enrichment | Match projects to Assessor Parcel Numbers | Colab |
| A | A5_buildingeye_import | Import permit dates from BuildingEye planning portal | Colab |
| A | A6_community_map_import | Parse Gellerman KML, extract news links from 203 projects | Colab |
| A | A7_comprehensive_integration | Fuzzy match and merge official + community data | Colab |
| A | A9_city_profile_builder | Generate city profile and setup checklist for adaptation | Colab |
| B | B1_lifecycle_tracking | Track permit stages: Zoning → Building → CO | Colab |
| B | B2_status_classification | Classify status into pipeline categories | Colab |
| B | B3_progress_indicators | Identify stalled projects (>180 days) | Colab |
| C | C0_methods_overview | Core methods: joins, aggregation, plotting | Colab |
| C | C1_pipeline_analysis | Projects by status, conversion rates | Colab |
| C | C2_timeline_analysis | Processing times, bottleneck identification | Colab |
| C | C3_proposal_vs_reality | RHNA tracking, "Making It Pencil" analysis | Colab |
| C | C4_quality_checks | Data validation: soft asserts, null checks, bounds | Colab |
| D | D1_monthly_report_generator | HTML/PDF/JSON report generation | Colab |
| D | D2_dashboard_data_export | SQLite + Datasette deployment | Colab |
| D | D3_alerts_monitoring | Status change alerts, milestone tracking | Colab |
| D | D4_hcd_apr_tables | Map pipeline data to HCD APR Table A2 fields | Colab |
| F | F1_development_math | Basic feasibility: TDC, NOI, Return on Cost | Colab |
| F | F2_pro_forma_transparent | Full pro forma with IRR, draw schedule, policy scenarios | Colab |
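
Two of the rules named in the list above are concrete enough to sketch: the address normalization in A2 (FIFTH ↔ 5TH, Ave ↔ AV) and the stalled-project test in B3 (>180 days). The code below is a simplified illustration, not the notebooks' actual implementation. The lookup tables are a small illustrative subset, the choice of `5TH` and `AV` as the canonical forms is our assumption, and the function names and the `completed` flag are hypothetical.

```python
import re
from datetime import date

# Illustrative subset only; a real lookup would need many more entries.
ORDINALS = {
    "FIRST": "1ST", "SECOND": "2ND", "THIRD": "3RD", "FOURTH": "4TH",
    "FIFTH": "5TH", "SIXTH": "6TH", "SEVENTH": "7TH", "EIGHTH": "8TH",
    "NINTH": "9TH", "TENTH": "10TH",
}
SUFFIXES = {"AVENUE": "AV", "AVE": "AV", "STREET": "ST", "STR": "ST"}

STALL_DAYS = 180  # threshold taken from the B3 description


def normalize_address(raw):
    """Uppercase, strip punctuation, and map spelled-out ordinals and
    street-type variants to one assumed canonical form."""
    tokens = re.sub(r"[^\w\s]", " ", raw.upper()).split()
    normalized = []
    for token in tokens:
        token = ORDINALS.get(token, token)
        token = SUFFIXES.get(token, token)
        normalized.append(token)
    return " ".join(normalized)


def is_stalled(last_activity, today, completed=False):
    """True when an unfinished project has had no recorded activity
    for more than STALL_DAYS days."""
    return (not completed) and (today - last_activity).days > STALL_DAYS


if __name__ == "__main__":
    print(normalize_address("1234 Fifth Ave."))
    print(is_stalled(date(2025, 1, 1), date(2025, 7, 1)))
```

With these tables, `1234 Fifth Ave.` and `1234 5th av` both normalize to `1234 5TH AV`, so the two spellings can be joined on the same key. The stalled test is strict: exactly 180 days without activity does not count as stalled, 181 days does.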

APR Compliance Status

Progress toward California HCD Annual Progress Report requirements:

| Category | Coverage | Status |
|----------|----------|--------|
| Direct mappings (year, address, permits) | 5 fields | Ready |
| Derivable (unit_category, SB35 flag) | 7 fields | Ready |
| APN (Assessor Parcel Number) | 96.5% | Ready |
| Income breakdown (VLI, LI, MOD) | 0% | Needs Data |
| Permit dates (entitlement, BP, CO) | 0% | In Progress |
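
A coverage percentage like the ones above can be read as the share of project records in which a field is filled in. The sketch below shows that calculation under our own assumptions: the function name `field_coverage`, the field names `apn` and `income_level`, and the four sample records are hypothetical and are not drawn from the project database.

```python
def field_coverage(records, field):
    """Percent of records where `field` is present and non-empty."""
    if not records:
        return 0.0
    filled = sum(1 for record in records if record.get(field) not in (None, ""))
    return 100.0 * filled / len(records)


# Made-up records: three of four have a parcel number, none has income data.
SAMPLE_RECORDS = [
    {"address": "100 EXAMPLE ST", "apn": "000-0000-001"},
    {"address": "200 EXAMPLE ST", "apn": "000-0000-002"},
    {"address": "300 EXAMPLE ST", "apn": ""},
    {"address": "400 EXAMPLE ST", "apn": "000-0000-004"},
]

if __name__ == "__main__":
    for name in ("apn", "income_level"):
        print(f"{name}: {field_coverage(SAMPLE_RECORDS, name):.1f}%")
```

Treating empty strings as missing matters here, because exported spreadsheets often carry blank cells rather than true nulls.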

Quick Start

Three ways to explore the data:

  1. Live Map: berkeley-housing.fly.dev - Browse projects interactively
  2. Colab: Click "Open in Colab" above - Run full analysis in browser
  3. Local: git clone https://github.com/blockXblock/berkeley-housing-analysis && cd berkeley-housing-analysis && jupyter lab MASTER_ANALYSIS.ipynb

Roadmap

Planned additions to expand the course:

| Module | Description | Status |
|--------|-------------|--------|
| E1: Environmental Overlays | CEQA, flood zones, historic districts | Planned |
| E2: Zoning Analysis | Parcel-level zoning constraints and allowed density | Planned |
| G1: Comparative Analysis | Cross-city benchmarking for Bay Area jurisdictions | Planned |
| Multi-City Network | Adapt pipeline for Oakland, SF, San Jose using city_template.yaml | Seeking Contributors |

Want to adapt this for your city? | Contribute on GitHub