I poll three cam platforms every two minutes. This page documents how, what I store, what I do not track, and how to cite it.
The narrative below is written in first person because the pipeline is a personal project, not a committee product. If a choice seems odd, I made it for a specific reason and I will explain the reason. If you spot a bug in this document or in the data, email [email protected].
The stack
I track Chaturbate, Stripchat, and Streamate. One API per platform:
- Chaturbate – affiliate API, returns the active room list in full. I call it every poll cycle.
- Stripchat – StripCash bulk API, returns the top 400 rooms per call. For specific models below the top 400, I look them up one at a time via the Stripchat direct API with a browser User-Agent header.
- Streamate – SMLive XML endpoint, returns up to 500 live models per call. For named models I use the SMLive name-search endpoint.
OVH datacenter IPs get 403 from the Chaturbate edge roughly 4 percent of the time. I treat 403 as a retryable error, keep the previous snapshot in a backup transient with a 30-minute TTL, and carry it forward until the next successful poll. Cloudflare WAF 403s follow the same rule.
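A minimal sketch of the carry-forward rule described above, not the pipeline's actual code. The `BackupSnapshot` class is a hypothetical in-memory stand-in for the backup transient, `fetch` is a hypothetical callable returning `(status, payload)`, and the sketch assumes the 30-minute TTL is measured from the last successful poll and is not refreshed by a carry-forward.

```python
import time
from typing import Callable, Optional, Tuple

BACKUP_TTL_SECONDS = 30 * 60  # 30-minute TTL, as stated in the text


class BackupSnapshot:
    """Hypothetical in-memory stand-in for the backup transient."""

    def __init__(self, ttl: float = BACKUP_TTL_SECONDS) -> None:
        self.ttl = ttl
        self._value: Optional[dict] = None
        self._stored_at = 0.0

    def put(self, value: dict, now: float) -> None:
        self._value, self._stored_at = value, now

    def get(self, now: float) -> Optional[dict]:
        # Expired or never written: nothing to carry forward.
        if self._value is None or now - self._stored_at > self.ttl:
            return None
        return self._value


def poll_once(
    fetch: Callable[[], Tuple[int, Optional[dict]]],
    backup: BackupSnapshot,
    now: Optional[float] = None,
) -> Optional[dict]:
    """Return fresh data on 200, or the previous snapshot on a 403."""
    now = time.time() if now is None else now
    status, payload = fetch()
    if status == 200:
        backup.put(payload, now)  # refresh the backup on every success
        return payload
    if status == 403:
        return backup.get(now)  # retryable: reuse the last good snapshot
    raise RuntimeError(f"unexpected HTTP status {status}")
```

With this shape, a 403 one minute after a good poll returns the earlier payload, and a 403 more than 30 minutes after the last good poll returns nothing.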
The cadence
- Every 2 minutes – poll all three platforms. Write totals and top models to transient storage with a 180-second TTL.
- Every 10 minutes – write one persistent row to the wp_macksc_snapshots table. One row per 10-minute window.
- Once a day at 02:15 UTC – roll up raw snapshots older than 30 days into hourly averages. Raw rows are deleted after aggregation.
- Once a week at 03:00 UTC Monday – publish an auto-generated weekly stats post with this week vs last week deltas.
The 2-minute cadence is the shortest interval I can sustain before rate limits kick in. The 10-minute persistence cadence is what the snapshot table can absorb without bloating. The daily rollup keeps query latency flat as the table ages.
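The daily rollup can be sketched roughly as follows. This is an illustration under assumptions, not the production job: rows are plain dicts, only `total_rooms` is averaged, the average is rounded to an integer, and the function name `roll_up` is hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from statistics import mean


def roll_up(rows, now, max_age_days=30):
    """Split rows into (kept_raw_rows, hourly_average_rows).

    rows: dicts with 'snapshot_time' (timezone-aware UTC datetime)
    and 'total_rooms' (int).
    """
    cutoff = now - timedelta(days=max_age_days)
    kept, buckets = [], defaultdict(list)
    for row in rows:
        if row["snapshot_time"] >= cutoff:
            kept.append(row)  # recent rows stay at 10-minute resolution
        else:
            # Truncate to the top of the hour to pick the bucket.
            hour = row["snapshot_time"].replace(minute=0, second=0, microsecond=0)
            buckets[hour].append(row["total_rooms"])
    hourly = [
        {"snapshot_time": hour, "total_rooms": round(mean(vals)), "is_hourly": 1}
        for hour, vals in sorted(buckets.items())
    ]
    return kept, hourly
```

The raw rows that fed each bucket are simply not returned, which mirrors the delete-after-aggregation step.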
The snapshot schema
Every row in wp_macksc_snapshots has these columns. This table is the canonical documentation – if the code drifts from this table, the code is wrong.
| Column | Type | What it holds |
|---|---|---|
| id | BIGINT | Primary key. Auto-increment row identifier. |
| snapshot_time | DATETIME (UTC) | UTC timestamp of when the poll cycle completed. |
| total_rooms | INT | Sum of live cam rooms across all tracked platforms at snapshot time. |
| total_viewers | INT | Sum of concurrent viewers across all tracked platforms at snapshot time. |
| sc_rooms | INT | Stripchat live cam room count at snapshot time. |
| sc_viewers | INT | Stripchat concurrent viewer count at snapshot time. |
| cb_rooms | INT | Chaturbate live cam room count at snapshot time. |
| cb_viewers | INT | Chaturbate concurrent viewer count at snapshot time. |
| gender_data | JSON | Room and viewer counts keyed by f (female), m (male), c (couples), s (trans). |
| top_tags | JSON | Array of top tag entries each containing a tag name and a room count. |
| top_countries | JSON | Array of top performer country entries each containing a label and a room count. |
| is_hourly | TINYINT | Flag indicating whether the row is a raw 10-minute snapshot (0) or a daily-aggregated hourly average (1). |
Streamate room counts are tracked in a sibling transient and are not folded into sc_rooms or cb_rooms. Streamate concurrent viewer counts are not exposed by the SMLive API, so Streamate contributes to total_rooms only, never to total_viewers.
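As a small illustration of how the totals relate to the per-platform columns: the sketch below assumes the totals are plain sums and uses a hypothetical `streamate_rooms` argument for the value held in the sibling transient.

```python
def combine_totals(sc_rooms, sc_viewers, cb_rooms, cb_viewers, streamate_rooms):
    """Build the two total columns from per-platform counts."""
    return {
        # Streamate rooms count toward the room total...
        "total_rooms": sc_rooms + cb_rooms + streamate_rooms,
        # ...but there is no reported Streamate viewer figure to add.
        "total_viewers": sc_viewers + cb_viewers,
    }
```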
What I do not collect
Things I deliberately leave out of the pipeline:
- Viewer identity. I never see who is watching. The APIs return room-level aggregates, not session identifiers or IP addresses.
- Payment or tipping data. I can see that a room has a token goal. I cannot see who funded it or how much.
- Private show content. When a model goes into private, the API hides the room. I lose visibility and I do not try to recover it.
- Personally identifying model data. I record the public username a model broadcasts under. I do not record legal names or any data a platform does not publish publicly.
- Chat transcripts. I do not read, store, or index chat content.
Known gaps and caveats
- Chaturbate location is a free-text field. I map it to country tags via a keyword dictionary. Edge cases like “south of Bogota” should map to co and may not.
- The Stripchat top-400 bulk call cuts off mid-size broadcasters. Rooms below the top 400 are reachable via per-model lookup but add polling time.
- Streamate viewer counts are not public. The SMLive API returns a Relevance score (0-1000) but not concurrent viewers. For display purposes, I derive a synthetic viewer estimate: max(20, min(500, round(relevance * 0.5))). This is a popularity proxy, not a claim of actual viewers. Model cards showing Streamate models are marked with data-estimated="true". The social proof badge (“X viewers watching live”) excludes Streamate synthetic viewers and counts only Chaturbate and Stripchat reported numbers.
- A poll that fires during the 02:15 UTC rollup (rare) may see a locked table. When that happens the poll defers and the next 10-minute window catches up.
- The 2-minute cadence drifts by a few seconds per cycle because WP-Cron is not a real cron. Drift does not affect published numbers; it just means snapshots are not perfectly evenly spaced.
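The Streamate synthetic viewer formula from the list above, written out as a runnable sketch. The function names are hypothetical. One assumption to flag: Python's built-in round() rounds halves to the nearest even number, so the sketch rounds halves up explicitly; which rounding mode the pipeline itself uses is not stated in this document.

```python
import math


def synthetic_viewers(relevance: float) -> int:
    """Popularity proxy: half the Relevance score, clamped to 20..500."""
    half_up = math.floor(relevance * 0.5 + 0.5)  # assumed half-up rounding
    return max(20, min(500, half_up))


def badge_viewers(cb_viewers: int, sc_viewers: int) -> int:
    """Social proof badge: reported numbers only, no synthetic estimates."""
    return cb_viewers + sc_viewers
```

The clamp means a Relevance of 0 still displays 20 and a Relevance of 1000 displays 500, which is why the figure should be read as a ranking signal rather than an audience count.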
Where the numbers live
If you need a specific number, I built dedicated answer pages for the most common questions. Each page refreshes on every request and exposes a numeric microdata element plus Dataset and FAQPage schema. Browse the full catalogue at macksc.com/ask/.
Quick links:
- Live observatory dashboard – real-time totals, market share, tag cloud, country distribution.
- Machine-readable JSON endpoint – full snapshot plus historical aggregates.
- REST API – versioned namespace, CORS enabled.
- /ask/ answer URLs – 20 canonical single-number questions.
- Cam industry stats reference – evergreen page that refreshes its numbers live.
Bulk CSV downloads
The snapshot table is available as four normalized CSV files at macksc.com/data/v1/. Each file starts with a 5-line citation comment block (the pandas comment="#" argument strips these on import). Default window is the last 30 days. Narrow with ?since=YYYY-MM-DD&until=YYYY-MM-DD.
- timeseries.csv – one row per 10-minute snapshot with all flat columns.
- countries.csv – country distribution normalized from the top_countries JSON column.
- tags.csv – top 20 tag rankings per snapshot with rank column.
- gender.csv – gender-code breakdown of rooms and viewers per snapshot.
The JSON manifest at macksc.com/data/v1/ lists all four endpoints with machine-readable metadata.
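A rough sketch of requesting and parsing one of the CSV files using only the standard library. The assumption that each file sits directly under the /data/v1/ directory, the helper names, and the sample column names are illustrative and not confirmed by this page.

```python
import csv
import io
from urllib.parse import urlencode

# Directory named in the text; the per-file path layout is an assumption.
BASE_URL = "https://macksc.com/data/v1/"


def build_url(filename, since=None, until=None):
    """Build a download URL, adding ?since=...&until=... when given."""
    params = {k: v for k, v in (("since", since), ("until", until)) if v}
    query = f"?{urlencode(params)}" if params else ""
    return f"{BASE_URL}{filename}{query}"


def parse_csv(text):
    """Drop the leading '#' citation lines, then parse rows as dicts."""
    data_lines = [ln for ln in text.splitlines() if not ln.startswith("#")]
    return list(csv.DictReader(io.StringIO("\n".join(data_lines))))
```

In pandas, passing comment="#" to read_csv achieves the same stripping of the citation block.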
How to cite
When you reference a number from the observatory, cite the specific URL you pulled it from and the timestamp shown on that page. Examples:
- For a total from the dashboard: MackSC Observatory. (2026). Live Cam Industry Statistics. Retrieved from https://macksc.com/stats/
- For a specific answer URL: cite the question URL plus the timestamp rendered on the page.
- For raw data: MackSC. (2026). Cam Industry Snapshot Dataset. Version 1.1. https://macksc.com/data/
Everything on this site is published under CC BY 4.0. Free to use with attribution. No email request needed.
Methodology version
This document is version 1.1, first published 2026-04-11 and last revised 2026-04-13. Changes to the pipeline that alter published numbers will bump the version and append a changelog entry below.
- 1.0 (2026-04-11) – Initial version. Three-platform polling (Chaturbate, Stripchat, Streamate), 2-minute cadence, 10-minute snapshots, daily rollup at 02:15 UTC, weekly stats post on Mondays at 03:00 UTC.
- 1.1 (2026-04-13) – Added Streamate synthetic viewer count disclosure. Streamate model cards now marked with data-estimated attribute. Social proof badge excludes Streamate synthetic viewers.
Corrections, data requests, or partnership questions: [email protected].