
Plotwise Permit Contact Matching Overview

Discover how Plotwise unifies messy permit contact data using normalization, vectors, and human review so you never miss an important relationship.

Alexander Katrukhin, Manager of Flowmize AI LLC
5 minute read

Why Contact Matching Matters

Plotwise reads thousands of public permit records to help users keep track of projects and people. Every agency publishes data differently, so the same person can show up with different spellings, phone numbers, or even a shared company email. Without careful cleanup, you might miss permits connected to the same contractor or property contact.

Example: On Monday you might see “Sofia Ramirez, Contractor” with phone 602-555-0199. On Thursday a different city uploads the same person as “Ramirez Construction – Sofia R.” with a shared office line and no email. Plotwise’s job is to recognize those entries as the same contact so you never miss a permit.

Where the Data Comes From

  1. Permit ingestion: Operators capture permit pages and enter key facts (contacts, addresses, dates) into our review dashboard.
    • Real data point: A reviewer types the applicant as “J P OBrien” because the permit PDF is smudged.
  2. Human verification: Reviewers check that the typed contact information mirrors the permit image and flag oddities (missing fields, obvious typos, mismatched phone numbers).
    • Real data point: The reviewer notices the phone 480.555.0100 is copied with dots, not dashes, and leaves a note.
  3. Automated validation: Our backend checks for formatting issues (bad phone layouts, duplicate email placeholders) before the record is accepted.
    • Real data point: Validation rewrites the phone to 480-555-0100 and warns that the email info@obrienbuilders.com looks like a company inbox.
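
The validation step above can be sketched in a few lines of Python. This is a minimal illustration, not Plotwise's actual backend: the function names and the list of "shared inbox" local parts are assumptions made for the example, and it only handles 10-digit US numbers.

```python
import re
from typing import Optional

# Hypothetical list of mailbox names that often indicate a shared company inbox.
SHARED_LOCAL_PARTS = {"info", "permits", "office", "admin", "contact", "sales"}


def normalize_phone(raw: str) -> Optional[str]:
    """Return a US phone number as NNN-NNN-NNNN, or None if it cannot be parsed."""
    digits = re.sub(r"\D", "", raw)  # drop dots, dashes, spaces, parentheses
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # strip a leading country code
    if len(digits) != 10:
        return None
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


def looks_like_shared_inbox(email: str) -> bool:
    """Flag addresses such as info@... that probably belong to a team, not a person."""
    local, _, domain = email.strip().lower().partition("@")
    return bool(domain) and local in SHARED_LOCAL_PARTS


print(normalize_phone("480.555.0100"))
print(looks_like_shared_inbox("info@obrienbuilders.com"))
```

A check like this only produces a warning. Whether a flagged address really is a company inbox is still a reviewer's call.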

Building Trustworthy Contacts

Step 1: Normalize the Raw Fields

  • Names are broken into components (first, middle, last, suffix) and normalized for capitalization and punctuation.
    • Before → After: "J P OBrien" becomes first: "John", middle: "P.", last: "O'Brien" after comparing the permit text with past appearances.
  • We strip extra whitespace, standardize abbreviations ("St" vs. "Street"), and split multi-value fields so every phone number and email is stored separately.
    • Before → After: "602-555-0199 / 602-555-0110" turns into two distinct phone entries tied to the same contact profile.
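
As a rough sketch of this normalization step, the Python below splits a multi-value field and breaks a name into components. The helper names and the `known_last_names` lookup (standing in for "past appearances") are hypothetical. The sketch does not expand initials such as "J" into "John", because that needs the contact's history.

```python
import re

SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}


def split_multi_value(raw: str) -> list:
    """Split a field like '602-555-0199 / 602-555-0110' into separate values."""
    parts = re.split(r"\s*[/;,]\s*", raw)
    return [p.strip() for p in parts if p.strip()]


def _format_given(token: str) -> str:
    """Render single letters as initials ('P' -> 'P.'), otherwise capitalize."""
    return token.upper() + "." if len(token) == 1 else token.capitalize()


def parse_name(raw: str, known_last_names: dict) -> dict:
    """Break a raw name into first/middle/last/suffix components."""
    tokens = [t.strip(".,") for t in raw.split()]
    tokens = [t for t in tokens if t]
    suffix = ""
    if len(tokens) > 1 and tokens[-1].lower() in SUFFIXES:
        suffix = tokens.pop()
    first = _format_given(tokens[0]) if tokens else ""
    middle = " ".join(_format_given(t) for t in tokens[1:-1])
    last = tokens[-1] if len(tokens) > 1 else ""
    # Prefer a spelling already seen for this surname (assumed lookup table).
    last = known_last_names.get(last.lower(), last.capitalize())
    return {"first": first, "middle": middle, "last": last, "suffix": suffix}


print(split_multi_value("602-555-0199 / 602-555-0110"))
print(parse_name("J P OBrien", {"obrien": "O'Brien"}))
```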

Step 2: Turn Contacts into Vectors

We transform every contact into a vector that captures:

  • Name tokens in multiple orders so "Sofia Ramirez" and "Ramirez, Sofia" line up.
    • Example comparison: ['sofia', 'ramirez'] and ['ramirez', 'sofia'] map to the same point in vector space.
  • Alternate spellings and middle names by comparing initials, nicknames, or missing parts.
    • Example comparison: The system gives a high similarity score between “Mike T. Nguyen” and “Michael Nguyen” even when the permit skips the middle initial.
  • Structured identifiers like phone numbers, emails, and company names, each embedded so that similar values sit close together.
    • Example comparison: Two contacts that both list info@obrienbuilders.com land near each other even if their names differ slightly, signaling a shared business inbox.
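
To make the idea of order-independent name vectors concrete, here is a small Python sketch. It uses a sparse bag of tokens and character trigrams as a simple stand-in for whatever embedding a production system would use. The feature scheme and function names are assumptions for illustration, and nickname handling (such as "Mike" vs. "Michael") is out of scope.

```python
import math
import re
from collections import Counter


def name_vector(name: str) -> Counter:
    """Build a sparse feature vector that ignores token order and punctuation."""
    features = Counter()
    for tok in re.findall(r"[a-z]+", name.lower()):
        features[f"tok:{tok}"] += 1
        padded = f"^{tok}$"  # mark token boundaries
        for i in range(len(padded) - 2):
            features[f"tri:{padded[i:i + 3]}"] += 1
    return features


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Word order and punctuation do not change the vector.
print(name_vector("Sofia Ramirez") == name_vector("Ramirez, Sofia"))
# A missing middle initial lowers the score only slightly.
print(cosine(name_vector("Michael Nguyen"), name_vector("Michael T Nguyen")))
```

Because the features are counted rather than ordered, "Sofia Ramirez" and "Ramirez, Sofia" produce identical vectors, which is the property the first bullet describes.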

Step 3: Compare Against Existing Contacts

Each new contact vector is compared to vectors from our existing database. Instead of exact string matching, we use similarity scores to handle:

  • Different name orderings or missing middle names.
    • Example outcome: “Nguyen, Michael” from a Phoenix permit is paired with “Michael T Nguyen” already stored from Mesa, producing a score of 0.91 and surfacing the existing profile.
  • Multiple phone numbers or emails that belong to the same person.
    • Example outcome: A contact with a new mobile number still links to the existing record that holds their office line and email, so both numbers stay together.
  • Company emails shared by a team.
    • Example outcome: The system groups everyone using permits@desertsolar.com and flags the contact as a potential shared inbox so reviewers can decide whether to connect it to a company profile.

We rank potential matches by similarity score and pass the top candidates to the next step. For matches that need review, side-by-side snapshots show reviewers the original permit snippet, the normalized fields, and the existing contact’s history.
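
The ranking step might look like the following Python sketch, which scores a new contact vector against stored vectors and keeps the best candidates. The contact IDs, the toy three-dimensional vectors, and the `top_k` and `min_score` parameters are all made up for the example.

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_candidates(query: list, existing: dict, top_k: int = 3,
                    min_score: float = 0.5) -> list:
    """Return up to top_k (contact_id, score) pairs, best match first."""
    scored = [(cid, cosine(query, vec)) for cid, vec in existing.items()]
    scored = [(cid, s) for cid, s in scored if s >= min_score]  # drop weak matches
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors; real contact vectors would have many more dimensions.
existing_contacts = {
    "mesa-nguyen": [0.9, 0.1, 0.0],
    "phoenix-ramirez": [0.0, 0.2, 0.9],
    "tempe-obrien": [0.5, 0.5, 0.5],
}
for contact_id, score in rank_candidates([1.0, 0.2, 0.0], existing_contacts):
    print(contact_id, round(score, 2))
```

A linear scan like this is fine for illustration. With a large contact database, an approximate nearest-neighbor index would be the usual replacement.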

Step 4: Human-in-the-Loop Review

High-confidence matches are merged automatically. Ambiguous cases go to reviewers who see side-by-side comparisons, review the permit snippet, and confirm whether to merge, create a new contact, or update details. This keeps the system accurate without overwhelming reviewers.

  • Example decision: A reviewer confirms that “Ramirez Construction – Sofia R.” should merge with “Sofia Ramirez” and chooses to keep the shared office line alongside her direct phone number.
  • Example decision: When “J. Patrick O'Brien” appears with a residential address, the reviewer creates a fresh contact because the existing “O'Brien Builders” record is clearly a company entry.
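
The split between automatic merges and human review can be expressed as a simple threshold rule. The Python sketch below is illustrative only: the cutoff values 0.95 and 0.70 are assumptions, as is the choice to create a new contact when no candidate scores high enough to be worth a reviewer's time.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    AUTO_MERGE = "auto_merge"
    NEEDS_REVIEW = "needs_review"
    NEW_CONTACT = "new_contact"


def route_match(best_score: Optional[float],
                auto_merge_at: float = 0.95,
                review_at: float = 0.70) -> Decision:
    """Decide what happens to a new contact given its best similarity score."""
    if best_score is None or best_score < review_at:
        return Decision.NEW_CONTACT  # assumed behavior when nothing plausible matches
    if best_score >= auto_merge_at:
        return Decision.AUTO_MERGE  # high confidence: merge without a reviewer
    return Decision.NEEDS_REVIEW  # ambiguous: show side-by-side comparison


for score in (0.97, 0.80, 0.40, None):
    print(score, route_match(score).value)
```

Lowering `auto_merge_at` reduces reviewer workload at the cost of more wrong merges. Raising it does the opposite.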

Role of Language Models

We use language models to interpret the context around each contact:

  • They summarize permit text so we can confirm the contact’s role (owner, applicant, contractor).
    • Example output: “This contact is listed as the general contractor and provided the construction cost.”
  • They suggest normalized versions of messy free-text entries (e.g., "J. P. O'Brien" vs. "John Patrick OBrien").
    • Example output: Proposes last: "O'Brien" after noticing the apostrophe is missing in the typed text but present in the PDF snippet.
  • They help rank candidate matches, especially when names and companies appear in unfamiliar formats.
    • Example output: Highlights that “Desert Solar LLC (Attn: M. Rivera)” shares address and email traits with “Maria Rivera” already on file, nudging reviewers toward a merge.

These models work alongside vector comparison and human checks, never replacing them.

What This Means for Plotwise Users

  • Faster discovery: Contacts you care about show up across permits even when agencies spell them differently.
    • Real outcome: A developer following “Sofia Ramirez” gets alerts from both Phoenix and Tempe without needing to know her company uses an office-wide inbox.
  • Cleaner insights: Consolidated contact profiles list every phone, email, and role we have seen.
    • Real outcome: The contact card for “Michael Nguyen” shows two phone numbers, separate emails for residential and commercial work, and a history of every permit role.
  • Transparent review trail: Every match was either merged automatically at high confidence or approved by a reviewer.
    • Real outcome: Users can audit why “John Patrick O'Brien” was kept separate from the company account, seeing the reviewer note and the original permit image.

In short, Plotwise combines manual review, vector-based similarity, and language understanding to keep your contact list accurate, searchable, and up to date.