The Problem
Asset managers need weekly visibility into leasing performance across the portfolio. Each Monday requires: pulling occupancy data, aggregating leasing activity, calculating conversion rates, comparing scheduled rent to market, and benchmarking against competitor surveys.
The data exists in multiple places: property management system exports for occupancy and applications, market survey PDFs showing competitor activity, and historical tracking spreadsheets. Building the executive summary manually means opening 9 separate reports, copying metrics into Excel, calculating week-over-week changes, and creating individual property snapshots with time series charts.
This takes 4-5 hours weekly. Manual aggregation delays distribution and creates transcription errors when copying between systems.
The Data Sources
PMS Exports (9 reports):
- Leasing activity
- Application pipeline
- Availability
- Retention
- Economic occupancy
- Rent positioning
- Move-ins and move-outs
- Prospects and tours
- Lease expirations
Market Survey PDFs:
- Competitor property performance published by third-party survey firms
- Applications completed by comp properties (last 7 days)
- Total unit counts for competitive set
- Published weekly, one PDF per market
Historical Tracking:
- Prior weeks stored in master workbook
- Time series data for trending (12+ weeks of history per property)
Extracting Competitor Data from PDFs
Market survey PDFs contain tables, but the formatting varies. Text extraction with position-based parsing handles this:
import pdfplumber
import pandas as pd
import re
from pathlib import Path

INPUT_DIR = r"C:\path\to\surveys"
OUTPUT_FILE = "Survey_Parsed_Clean.xlsx"

# Text normalization for inconsistent formatting
DASHES = "\u2212\u2010\u2011\u2012\u2013\u2014\u2015"
THIN_SPACES = "\u2009\u202f"

def normalize_text(s: str) -> str:
    if s is None:
        return ""
    s = str(s)
    s = s.translate(str.maketrans({c: "-" for c in DASHES}))
    s = s.replace("\xa0", " ").translate(str.maketrans({c: " " for c in THIN_SPACES}))
    s = re.sub(r"[\x00-\x1F]", " ", s)
    s = re.sub(r"\s+", " ", s).strip()
    return s
Extract total units from "Property Details" section:
def parse_units_text(pdf_path: Path) -> pd.DataFrame:
    rows = []
    with pdfplumber.open(pdf_path) as pdf:
        comp_mode, buf = False, ""
        for page in pdf.pages:
            raw = page.extract_text(x_tolerance=1, y_tolerance=3) or ""
            lines = [normalize_text(l) for l in raw.splitlines() if normalize_text(l)]
            for l in lines:
                U = l.upper()
                if "PROPERTY DETAILS" in U:
                    comp_mode = True; buf = ""; continue
                if comp_mode and ("AVERAGE / TOTAL" in U or "AVERAGE/TOTAL" in U):
                    if buf:
                        # Helper (defined elsewhere) that turns a buffered row
                        # into a (Property, Total Units) record
                        rec = _parse_comp_units_row(buf)
                        if rec: rows.append(rec)
                    buf = ""
                    comp_mode = False
                    continue
                if not comp_mode:
                    continue
                # Buffer rows that wrap across lines
                if re.match(r"^[A-Z][.)]?\s+\S", l):
                    if buf:
                        rec = _parse_comp_units_row(buf)
                        if rec: rows.append(rec)
                    buf = l
                else:
                    buf += " " + l
    return pd.DataFrame(rows, columns=["Property", "Total Units"]).drop_duplicates("Property")
Extract applications from "Property Performance" section:
def parse_apps_text(pdf_path: Path) -> pd.DataFrame:
    rows = []
    # Row must show four percent tokens (leased % and exposure %, each with
    # a delta) before the apps count, which is itself followed by a percent
    ROW_APPS = re.compile(
        r"(?P<prop>.+?)\s+"
        r"\d{1,3}%\s+[^\d\s]?\s*\d{1,3}%\s+"  # leased % and delta
        r"\d{1,3}%\s+[^\d\s]?\s*\d{1,3}%\s+"  # exposure % and delta
        r"(?P<apps>\d+|[-—–])\s+\d{1,3}%",    # apps followed by percent
        re.I
    )
    with pdfplumber.open(pdf_path) as pdf:
        perf_mode, buf = False, ""
        for page in pdf.pages:
            raw = page.extract_text(x_tolerance=1, y_tolerance=3) or ""
            lines = [normalize_text(l) for l in raw.splitlines() if normalize_text(l)]
            for l in lines:
                U = l.upper()
                if "PROPERTY PERFORMANCE" in U:
                    perf_mode = True; buf = ""; continue
                if perf_mode and ("AVERAGE / TOTAL" in U or "AVERAGE/TOTAL" in U):
                    if buf:
                        m = ROW_APPS.search(buf)
                        if m:
                            # normalize_prop_name: helper defined elsewhere
                            prop = normalize_prop_name(m.group("prop").strip())
                            apps = 0 if m.group("apps") in ("-", "—", "–") else int(m.group("apps"))
                            rows.append([prop, apps])
                    buf = ""
                    perf_mode = False
                    continue
                if perf_mode:
                    buf = (buf + " " + l).strip() if buf else l
    return pd.DataFrame(rows, columns=["Property", "Applications_Last_7D"]).drop_duplicates("Property")
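To make the pattern concrete, here is a standalone copy of `ROW_APPS` run against a synthetic row (the sample text is invented, not taken from a real survey):

```python
import re

# Same pattern as in parse_apps_text: four percent tokens, then the apps
# count, then one more percent token
ROW_APPS = re.compile(
    r"(?P<prop>.+?)\s+"
    r"\d{1,3}%\s+[^\d\s]?\s*\d{1,3}%\s+"
    r"\d{1,3}%\s+[^\d\s]?\s*\d{1,3}%\s+"
    r"(?P<apps>\d+|[-—–])\s+\d{1,3}%",
    re.I,
)

m = ROW_APPS.search("Maple Ridge 95% +2% 5% -1% 7 63%")
# m.group("prop") -> "Maple Ridge"
# m.group("apps") -> "7"
# A dash in the apps column would be converted to 0 by the caller.
```

The lazy `(?P<prop>.+?)` is what lets property names with internal spaces survive: the name absorbs everything up to the first percent column.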
Join units and applications data:
frames = []
for pdf in Path(INPUT_DIR).glob("*.pdf"):
    df_units = parse_units_text(pdf)
    df_apps = parse_apps_text(pdf)
    # Build join keys (keyify normalizes names so both sources match)
    df_units["__k"] = df_units["Property"].astype(str).map(keyify)
    df_apps["__k"] = df_apps["Property"].astype(str).map(keyify)
    # Left join: keep only comps from the Units list
    out = pd.merge(
        df_units[["__k", "Property", "Total Units"]],
        df_apps[["__k", "Applications_Last_7D"]],
        on="__k", how="left"
    ).drop(columns="__k")
    # Fill missing with 0
    out["Applications_Last_7D"] = (
        pd.to_numeric(out["Applications_Last_7D"], errors="coerce")
        .fillna(0)
        .astype(int)
    )
    # Attach market name
    market = infer_market(pdf)
    out.insert(0, "Market", market)
    frames.append(out)

final = pd.concat(frames, ignore_index=True)
final.to_excel(OUTPUT_FILE, index=False)  # write the combined table
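Two helpers used above, `keyify` and `infer_market`, aren't shown. Minimal sketches under assumed rules (the real implementations may differ):

```python
import re
from pathlib import Path

def keyify(name: str) -> str:
    """Sketch: lowercase and strip everything except letters and digits so
    naming variants ("Maple Ridge Apts." vs "MAPLE-RIDGE APTS") collide."""
    return re.sub(r"[^a-z0-9]", "", str(name).lower())

def infer_market(pdf_path: Path) -> str:
    """Sketch: assume the survey filename leads with the market name,
    e.g. "Austin_Weekly_Survey.pdf" -> "Austin"."""
    return pdf_path.stem.split("_")[0]
```

Key-based joining is what makes the merge survive the naming drift between the survey's "Property Details" and "Property Performance" tables.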
Building the Weekly Report
VBA macros handle the weekly refresh. The process:
Data Import: Read the 9 PMS export files and the parsed competitor data. Map property names between systems (some reports use abbreviations, others use full legal names).
Metric Calculations: Compute week-over-week changes for all KPIs. Calculate conversion rates and rent positioning variances.
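The metric math itself is simple. In pandas terms (the macro does this with worksheet formulas; the column names below are illustrative, not the workbook's actual fields):

```python
import pandas as pd

# Illustrative KPI table for two properties
kpis = pd.DataFrame({
    "Property": ["Alpha", "Beta"],
    "Tours": [20, 15],
    "Applications": [5, 6],
    "LeasedPct": [93.0, 95.5],
    "LeasedPct_Prior": [94.0, 95.0],
})

# Conversion rate: applications per tour
kpis["TourToApp"] = kpis["Applications"] / kpis["Tours"]

# Week-over-week change in leased percentage, in points
kpis["LeasedPct_WoW"] = kpis["LeasedPct"] - kpis["LeasedPct_Prior"]
```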
Time Series Updates: Append current week's metrics to historical arrays. Shift chart data ranges forward one week. Remove oldest week if the rolling window exceeds 12 weeks.
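The rolling-window update can be sketched with pandas on an assumed layout (the macro does the equivalent by shifting chart ranges in the workbook):

```python
import pandas as pd

WINDOW = 12  # weeks retained for the time series charts

# Assumed history layout: one row per week per property
history = pd.DataFrame({
    "WeekEnding": pd.date_range("2024-01-07", periods=12, freq="W"),
    "LeasedPct": [94.0] * 12,
})

current_week = pd.DataFrame({
    "WeekEnding": [history["WeekEnding"].max() + pd.Timedelta(weeks=1)],
    "LeasedPct": [93.5],
})

# Append the new week, then drop rows older than the rolling window
history = pd.concat([history, current_week], ignore_index=True).tail(WINDOW)
```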
Portfolio Summary: Aggregate all properties into a single-page view. Color-code cells based on thresholds (red/yellow/green for performance against targets).
Property Snapshots: Generate individual sheets for each property with 12-week time series. Include leasing trends chart and rent chart. Add data tables showing weekly metrics and competitor comparison when available.
Automated Observations: Formula-driven "This Week's Observations" section flags significant changes from prior week across three categories: Leasing & Traffic (prospect volume, tour conversion), Rent + App Funnel (scheduled rent changes, application conversion), and Operations & Availability (unit readiness, economic occupancy variance). Thresholds trigger automatic callouts when metrics cross defined limits.
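The threshold logic behind the observations can be sketched in Python (the workbook drives it with formulas; the thresholds and metric names here are invented for illustration):

```python
def observations(metrics: dict, prior: dict) -> list:
    """Flag significant week-over-week changes across the three categories."""
    notes = []
    # Leasing & Traffic: prospect volume dropped by more than a quarter
    if metrics["prospects"] < prior["prospects"] * 0.75:
        notes.append("Leasing & Traffic: prospect volume down sharply")
    # Rent + App Funnel: scheduled rent moved down week over week
    if metrics["scheduled_rent"] < prior["scheduled_rent"]:
        notes.append("Rent + App Funnel: scheduled rent declined")
    # Operations & Availability: under half of vacant units are rent-ready
    if metrics["ready_units"] / max(metrics["available_units"], 1) < 0.5:
        notes.append("Operations & Availability: low unit readiness")
    return notes
```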
Formatting: Apply conditional formatting rules. Lock formulas. Protect sheets. Add date stamp showing report generation time.
The macro runs in under 10 seconds for the full portfolio.
The Output
Portfolio Summary (1-pager):
- Grid showing all properties with current week metrics
- Color-coded cells highlighting underperformance
- Week-over-week delta indicators (arrows for improving/declining)
- Competitor benchmarking where survey data exists
- Sorted by occupancy or leasing activity
Property Snapshots (individual tabs):
- 12-week time series table with metrics by week
- Leasing trends chart (3 lines: leased %, occupied %, trend %)
- Rent positioning chart (2 lines: market rent, scheduled rent)
- Current week summary (prospects, tours, applications, conversion %)
- Available units breakdown (ready vs. not ready, days vacant)
- Competitor context (when survey data available for that market)
- "This Week's Observations" callout box with automated alerts on significant changes
Files are saved with date stamp and distributed via email Monday morning.
What This Enables
Weekly reporting previously consumed 4-5 hours every Monday: opening 9 exports, copying data between systems, calculating week-over-week changes, updating charts, and creating property tabs.
Processing now takes about 10 minutes: drop the PMS exports and survey PDFs in the input folder, run the Python script to parse competitor data, then run the VBA macro to refresh the workbook. Four-plus hours of weekly work, reduced to minutes.
The competitor data integration provides context that didn't exist before. When a property shows 3 applications last week while the competitive set averaged 8, that signals a marketing or pricing issue. Without the survey parsing, this comparison required manually transcribing competitor tables from PDFs.
Time series trending reveals patterns invisible in point-in-time snapshots. A property at 92% leased looks healthy until the 12-week chart shows consistent decline from 95%. Early visibility allows intervention before occupancy craters.
The automation enables Monday morning distribution. Asset managers and executives see prior week performance before the week begins. Decisions about pricing adjustments, marketing spend, or staffing happen days earlier than manual reporting allowed.