Local-first survey analytics for macOS

Survey analytics, built for market researchers.

Statsflow reads and writes SPSS files without losing a byte of metadata, builds weighted banner tables with proper significance testing, and runs 20 statistical models — fast enough that a 50,000-interview study feels instant. All on your Mac, all private.

Download for macOS · See what it does

Free · v0.8.0 · Apple Silicon · macOS 12 or later

From a market researcher with 30 years on the buyer side · MSc Applied Statistics, University of Oxford

A workbench, not a spreadsheet

Statsflow is metadata-native from the ground up. It wraps best-in-class statistics with a survey-aware interface — built for studies of 1k–100k interviews and thousands of variables.

Lossless SPSS files

Read and write .sav and .zsav with complete metadata fidelity — variable and value labels, measurement levels, user-missing ranges, multi-response sets. CSV, Excel, Parquet, Stata and SAS too.

Weighted banner tables

Publication-grade cross-tabs with proper within-banner significance testing — column-proportions z-tests and Welch's t, computed on the Kish effective base.
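For readers who want to see what "computed on the Kish effective base" means in practice, here is a minimal Python sketch. It is not Statsflow's code: the function names are hypothetical, the z-test is the textbook pooled version, and it assumes the two banner columns do not share respondents.

```python
import math

def kish_effective_base(weights):
    """Kish effective sample size: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return (total * total) / total_sq if total_sq > 0 else 0.0

def weighted_proportion(hits, weights):
    """Weighted share of respondents whose entry in `hits` is True."""
    total = sum(weights)
    return sum(w for h, w in zip(hits, weights) if h) / total

def column_proportions_z(p1, n1, p2, n2):
    """Pooled z statistic for two column proportions.

    n1 and n2 are the effective bases of the two columns, not the raw counts.
    Assumption: the columns are independent (no overlapping respondents).
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se if se > 0 else 0.0
```

Uneven weights shrink the effective base below the raw count, which widens the standard error and makes a column difference harder to flag as significant.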

20 statistical models

The full regression family, CHAID, clustering, PCA, EFA, correspondence analysis, driver analysis, TURF, ANOVA — plus latent class / profile analysis and multiple correspondence analysis, new in this release. Survey-weighted wherever the mathematics allows.

Native survey weighting

IPF raking, post-stratification and weight trimming built in, parity-tested against R's survey::rake. Effective sample size is surfaced everywhere it matters.
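As an illustration of the idea behind IPF raking, the sketch below adjusts weights one variable at a time until every weighted margin matches its target share. It is a simplified stand-in, not Statsflow's implementation or R's survey::rake: the function names, the row format (one dict per respondent) and the stopping rule are all assumptions made for the example.

```python
def weighted_share(rows, weights, var, category):
    """Weighted share of respondents with rows[i][var] == category."""
    total = sum(weights)
    return sum(w for r, w in zip(rows, weights) if r[var] == category) / total

def rake(rows, targets, max_iter=100, tol=1e-8):
    """Iterative proportional fitting on unit weights.

    rows:    list of dicts, e.g. {"sex": "M", "age": "young"}
    targets: {variable: {category: target share}}, shares summing to 1
             and covering every category present in the data
    Returns weights that sum to len(rows).
    """
    weights = [1.0] * len(rows)
    for _ in range(max_iter):
        max_change = 0.0
        for var, shares in targets.items():
            total = sum(weights)
            # Current weighted total per category of this variable.
            current = {}
            for row, w in zip(rows, weights):
                current[row[var]] = current.get(row[var], 0.0) + w
            # Scale each respondent so the category hits its target total.
            for i, row in enumerate(rows):
                factor = shares[row[var]] * total / current[row[var]]
                max_change = max(max_change, abs(factor - 1.0))
                weights[i] *= factor
        if max_change < tol:  # every margin already matches
            break
    return weights
```

Raking can produce a few very large weights, which is why trimming and the effective sample size are usually checked alongside it.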

Local-first and private

Your data never leaves your Mac. No account, no cloud, no telemetry — open a file and start working.

AI interpretations

One click turns any model, table or chart into a plain-language write-up — what's significant, how strong the effect is, which bases are too thin to trust. Bring your own OpenAI or Anthropic key; everything else works fully offline.

Built for studies that break spreadsheets

Blazing fast on real survey sizes

Statsflow's engine is built on Polars and Apache Arrow — columnar, multi-core, zero-copy. The work that makes you wait in other tools happens before you've let go of the mouse.

50,000 interviews

The bundled European Social Survey demo opens, filters and cross-tabs without a spinner — a study size that grinds a spreadsheet to a halt.

575 variables

Carried with full SPSS metadata — value labels, missing rules, multi-response sets — and still navigable in real time.

Seconds, not minutes

A full K-Means run across all 50k respondents finishes while you're still reading the config panel. Regression and driver analysis return on the spot.

Stack up a dozen weighted banner tables, toggle a weight, swap a filter, flip significance testing on — the whole grid re-renders as fast as you can click. No batch runs, no progress bars, no waiting for the spreadsheet to recalculate.

Seven banner columns, 50,118 interviews, weighted and significance-tested — built and rebuilt in real time.

Twenty analyses, one workbench

Everything a survey study needs — from a quick frequency table to model-based segmentation — survey-weighted wherever the mathematics allows, and parity-tested against R and SPSS.

Cluster analysis — K-Means segments named and colour-coded, with a genuinely weighted silhouette behind the scenes.

Regression & prediction

Linear regression

OLS and WLS with robust and survey-design standard errors, VIF diagnostics and conformal prediction intervals — the workhorse, done properly.

Logistic regression

Binary outcomes with odds ratios, a ROC curve and conformal prediction sets — on a genuinely weighted GLM path, not the broken Logit shortcut.

Ordered logit / probit

Likert and rating-scale outcomes modelled with their ordering intact, a Brant proportional-odds check and an optional Bayesian mode.

Multinomial logistic

Unordered choice outcomes with a proper Hausman-McFadden IIA test and per-contrast interpretation.

Driver analysis

Six importance methods on one screen — Shapley, Johnson's RWA, Kruskal, Bayesian RWA, permutation, std-β² — so collinear drivers can't fool you.

Segmentation & trees

CHAID

The classic market-research decision tree, with survey-weighted splits and an interactive zoom-and-pan tree, rules table and sunburst.

Cluster analysis

K-Means and hierarchical clustering with a genuinely weighted silhouette and respondent-friendly cluster profiles.

Latent class / profile analysis (New)

Model-based segmentation that uncovers hidden respondent types from categorical, continuous or mixed batteries — defensible with BIC and entropy.

TURF

Reach-and-frequency portfolio optimisation — the smallest set of products or messages that covers the most respondents.
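To make the reach part of TURF concrete, here is a small Python sketch that tries every portfolio of a given size and keeps the one reaching the largest weighted share of respondents. It is an illustration under stated assumptions, not Statsflow's algorithm: the function name and input format are hypothetical, the search is exhaustive (fine for small item lists only), and frequency is not computed.

```python
from itertools import combinations

def turf(responses, items, k, weights=None):
    """Best k-item portfolio by weighted reach.

    responses: one set per respondent, holding the items they would accept
    items:     candidate items to choose from
    k:         portfolio size
    weights:   optional survey weights (defaults to 1 per respondent)
    Returns (best_combo, reach), reach being the weighted share of
    respondents who accept at least one item in the portfolio.
    """
    if weights is None:
        weights = [1.0] * len(responses)
    total = sum(weights)
    best_combo, best_reach = None, -1.0
    for combo in combinations(items, k):
        chosen = set(combo)
        reached = sum(w for r, w in zip(responses, weights) if r & chosen)
        reach = reached / total
        if reach > best_reach:  # ties keep the first portfolio found
            best_combo, best_reach = combo, reach
    return best_combo, best_reach
```

Running it for k = 1, 2, 3 and so on shows where adding another item stops adding meaningful reach.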

Dimension reduction & latent structure

Principal component analysis

Component extraction with Varimax and Promax rotation and rotation-aware saved scores.

Factor analysis (EFA)

True common-factor EFA with a polychoric kernel for Likert items — loadings that don't under-state.

Correspondence analysis

Brand-image maps from a single cross-tab, with a Rao-Scott survey-design χ² correction.

Multiple correspondence analysis (New)

Perceptual maps across many categorical questions at once, with a corrected eigenvalue scree and supplementary-variable projection.

Hypothesis testing

One-way ANOVA

Group-mean comparison with Welch's F, η² and ω², and Tukey / Games-Howell post-hoc tests — weighted where the mathematics allows.

Chi-square / Cramér's V

Independence testing with a Rao-Scott survey correction and honest, uncorrected effect sizes.
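For reference, the plain Pearson χ² statistic and Cramér's V can be sketched in a few lines of Python. This is the unweighted textbook calculation with no Rao-Scott correction and no bias correction on V (our reading of "uncorrected", which is an assumption); the function names are hypothetical and it is not Statsflow's code.

```python
import math

def chi_square(table):
    """Pearson chi-square for a contingency table given as a list of rows.

    Returns (chi2, n) where n is the table total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip cells in empty rows or columns
                chi2 += (observed - expected) ** 2 / expected
    return chi2, n

def cramers_v(table):
    """Cramér's V = sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    chi2, n = chi_square(table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * k)) if n > 0 and k > 0 else 0.0
```

V runs from 0 (no association) to 1 (perfect association), so it reads as an effect size regardless of how large the sample is.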

Descriptives & exploration

Frequencies

Value-label-aware frequency tables with one-click nets and Save-to-Data recodes.

Descriptives

Means, dispersion, weighted medians and any percentile you ask for, across a whole numeric battery.

Explore (by group)

Side-by-side group comparisons — distributions, normality and dispersion tests in a single pass.

Correlation explorer

An auto-dispatching correlation matrix — Pearson, polychoric, polyserial or tetrachoric, picked per variable pair.

Crossplot

Scatterplots with OLS, polynomial and LOWESS fits that respect value labels and missing values.

Need to compare subgroups? Any eligible model can be run as a split-sample analysis — the same model fitted once per subgroup, with a cross-subgroup interaction test and a coefficient forest chart.

AI interpretation

Every result, explained in plain language

Click Interpret on any model, table or chart and Statsflow turns the numbers into a written read-out — what reached significance, how large the effect is, which bases are too thin to trust, and what to flag for the reader. It's tuned for survey work, not generic data science.

Base-size aware: It refuses to interpret an n<30 cell and caveats anything thin — it won't hand you a confident story about 40 respondents.
Speaks effect sizes: Written in the language of magnitude and confidence, not p-value worship — the read-out a researcher would actually write.
Yours to edit: Every interpretation is fully editable and saved with the run, so it travels with the project and into your report.

Interpretation is everywhere you analyse

The same one-click write-up sits on every result surface in the app:

Regression results — coefficients, fit, robust SEs
Weighted banner tables — what the sig testing actually says
Driver analysis — which drivers to act on, and why
Cross-subgroup interaction tests — does the effect really differ?
PCA, EFA & clustering — naming and reading the structure
Correlation explorer & verbatim text analysis

Bring your own API key

AI interpretation runs on OpenAI or Anthropic models. You supply your own API key in Settings; there's no Statsflow account and no Statsflow subscription. The key is stored locally on your Mac, and the result summary is sent directly to the provider you choose. You pay that provider for usage, typically a fraction of a cent per interpretation.

No key, no problem. Every other feature in Statsflow — file I/O, tables, all 20 models, weighting, exports — works fully offline. AI interpretation is the one feature that reaches out, and only with a key you control.

A latent-class result, read back in plain language — the big picture, the key insights, and what each base size can and can't support.

See it in action

Every surface is designed for survey data — value labels, weights and missing values are first-class, not afterthoughts.

Banner tables — weighted cross-tabs with within-banner sig testing and the effective base shown per cell.

Model results — summary diagnostics, coefficient tables and robust standard errors.

Driver analysis — importance decomposition with Shapley, relative weights and bootstrap CIs.

CHAID — the decision tree as an interactive sunburst, colour-coded by splitting predictor and drillable to any branch.

Multiple correspondence analysis — dozens of categorical responses placed on one perceptual map.

Explore — frequencies, descriptives and crossplots that respect value labels and missing values.

Download Statsflow

Free for macOS. One signed disk image — no installer, no account.

Download for macOS v0.8.0

Statsflow_0.8.0_aarch64.dmg · ~456 MB · Apple Silicon · macOS 12+

Version: 0.8.0
Released: 2026-05-20
SHA-256: 3eaf54865c7e0e04…
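If you want to check the download against the published checksum, a short Python sketch like the one below will do it. The function names and the example path are assumptions. Because the digest above is shown truncated, the sketch can only compare the leading characters, which is a weaker check than matching the full value.

```python
import hashlib

def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hex SHA-256 digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def digest_starts_with(path, expected_prefix):
    """True if the file's digest begins with the published (truncated) value."""
    return sha256_of_file(path).startswith(expected_prefix.lower())

# Example call (the path is hypothetical; adjust to where the file was saved):
# digest_starts_with("/Users/you/Downloads/Statsflow_0.8.0_aarch64.dmg",
#                    "3eaf54865c7e0e04")
```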

Opening Statsflow the first time

Statsflow isn't notarized by Apple yet, so macOS tags the download as "quarantined" and may warn that the app is damaged or from an unidentified developer. The app is safe; it just hasn't been notarized. On recent macOS versions the right-click → Open trick no longer clears this, so you remove the quarantine flag once, from Terminal.

Drag Statsflow into your Applications folder.
Open Terminal (Applications → Utilities, or ⌘-Space and type "Terminal").
Paste the command below, press Return, then open Statsflow as normal.

xattr -dr com.apple.quarantine /Applications/Statsflow.app

Installed somewhere other than Applications? Replace the path with wherever Statsflow.app lives. You only need to do this once — updates from inside the app open normally.