Two acts on a parish-admin platform
Provstiskyen: optimising then rewriting a 10-year SaaS
Two acts on a 44,000-line R Shiny platform that runs about half of Denmark's deaneries. Act I cut cold start from 50s to 18s and deploys from 35min to 80s on the existing codebase. Act II, once the architecture itself was the ceiling, is a full rewrite onto FastAPI, Polars, and React: performant by default, far more maintainable, with the legacy app retiring as the last module ports across.
50s → 18s
Cold start, optimised (Act I)
35 min → 80 s
Deploy time, 26× (Act I)
44k
Lines of R, now retiring (Act II)
1,000+
Tests on the new platform
The starting point
Provstiskyen is a Danish administration platform for parish councils: the bookkeeping, reporting, and appropriations workflow that runs about half of the country's deaneries. It had been built and run single-handed for about ten years, in R and Shiny, with ShinyProxy hosting one R process per user on a 16-core / 64 GB host. It worked, and it had steady paying customers. The problem was one of success at scale: every new user added a fixed per-user resource bill, and the cold-start experience was rough.
I came on in late 2023 as its first engineer, formally from January 2024, originally to move the platform to Kubernetes. The remit grew. Over the next two years the work split cleanly into two acts. Act I made the existing R Shiny platform cheaper, faster, and more reliable, without rewriting it. Act II, once the architecture itself had become the ceiling, was a full rewrite onto a modern stack built to carry the platform another decade. They are two different kinds of engineering, and the judgement that connects them is the point of this case: optimise the system you have first, and rewrite only once you have proven it is the architecture, not the code, that is holding you back.
Act I: a cheaper, faster platform on the existing stack
Kubernetes migration (completed April 2024)
The flat 16-core host was replaced with Google Kubernetes Engine. Pods scale up on demand and scale to zero when nobody is logged in. The application didn't change yet, but the bill did: from a fixed monthly cost regardless of usage, to paying only for what the platform actually serves. Same app, same code, much lower floor.
Kubernetes solved cost and elasticity. Cold start was still the visible problem. Users felt every second of the 50-second login wait, and deploys took long enough that we shipped monthly instead of daily. The next stretch, through 2024 and into 2025, was a sustained optimisation pass on the legacy stack.
Pre-warmed pod pool
ShinyProxy normally spins up a fresh R process on user login. That process is the 50-second cost. I changed the cluster topology to keep a small pool of warm pods ready ahead of demand: when a user logs in, an already-running pod is claimed instantly and a new warm one starts in the background. The login latency the user experiences becomes the speed of the load-balancer redirect, not the speed of R booting. The trade-off is real (you pay for warm pods that aren't being used yet), but the pool is small relative to total capacity and the UX win is dramatic.
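The claim-and-replace behaviour can be sketched as a toy model. This is not the cluster configuration itself: `WarmPool`, `claim`, and the pod names are hypothetical, and the top-up runs synchronously here, whereas in the real cluster the replacement pod starts in the background.

```python
from collections import deque
from dataclasses import dataclass, field
from itertools import count


@dataclass
class WarmPool:
    """Toy model of a pre-warmed pod pool (all names are hypothetical)."""

    target_size: int
    ready: deque = field(default_factory=deque)
    _ids: count = field(default_factory=lambda: count(1))

    def __post_init__(self) -> None:
        # Fill the pool ahead of demand, before any user logs in.
        self._top_up()

    def _boot_pod(self) -> str:
        # Stand-in for the slow part: starting R and loading the app.
        return f"pod-{next(self._ids)}"

    def _top_up(self) -> None:
        # Start new warm pods until the pool is back at its target size.
        while len(self.ready) < self.target_size:
            self.ready.append(self._boot_pod())

    def claim(self) -> str:
        """Hand a pod to a user who has just logged in."""
        # Warm path: take a pod that is already running.
        # Cold path: the pool is empty, so the user waits for a fresh boot.
        pod = self.ready.popleft() if self.ready else self._boot_pod()
        self._top_up()
        return pod


pool = WarmPool(target_size=2)
first = pool.claim()
```

After `pool.claim()` the user holds the first pod and the pool is back at two warm pods, which is the trade-off described above: the cost of `target_size` idle pods buys a login that never waits for R to boot.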
Base image: 35-minute builds to 80 seconds
Every deploy of the legacy app reinstalled the entire R-package dependency tree from source: about 35 minutes per build, which made shipping anything during the workday painful. I split the Dockerfile into two layers: a base image with R and all package dependencies installed once (rebuilt only when the dependency manifest changes), and a thin app layer on top that contains only the source code. Cold base-image rebuilds still take 35 minutes; routine app rebuilds run in roughly 80 seconds. That is a 26× drop on the hot path, and it unblocked the deploy cadence. We went from monthly to multiple times per day without trying.
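One way to express "rebuild the base only when the dependency manifest changes" is to derive the base-image tag from a hash of the manifest. The sketch below is an illustration under that assumption, not the actual build script: `base_image_tag`, `plan_build`, and the tag format are hypothetical.

```python
import hashlib


def base_image_tag(manifest_text: str) -> str:
    """Derive the base-image tag from the dependency manifest contents."""
    digest = hashlib.sha256(manifest_text.encode("utf-8")).hexdigest()
    return f"base-{digest[:12]}"


def plan_build(manifest_text: str, existing_tags: set[str]) -> list[str]:
    """Return the steps a deploy needs; the slow step appears only when required."""
    tag = base_image_tag(manifest_text)
    steps = []
    if tag not in existing_tags:
        steps.append(f"build base image {tag}")  # slow path, about 35 minutes
    steps.append(f"build app layer on {tag}")    # hot path, about 80 seconds
    return steps


# Speed-up on the hot path, from the figures in the text: 2100 s / 80 s.
speedup = (35 * 60) / 80
```

An unchanged manifest hashes to a tag that already exists, so the plan contains only the thin app-layer step. The arithmetic gives 26.25, which is where the 26× figure comes from.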
Flame graph, Polars, and a single stored procedure
With the pod pool in place, the user-facing wait was no longer "wait for R to start"; it was "wait for the newly-claimed pod to load the app's data into memory before serving the first page", about 32 seconds. I profiled that startup path with R's flame-graph tooling and found two things worth fixing:
- Many separate database round-trips during initialisation. Each was small and harmless on its own, but they were sequential, and sequential network round-trips against MariaDB stack up fast. I consolidated them into a single stored procedure that returns every table the app needs in one response.
- Two compute-heavy R functions on the startup path: financial data transformations the app needed before any UI could render. I ported both to Polars (the Rust-based dataframe library, called from R via its Arrow integration). More than 10 seconds saved on those two functions alone.
Together those drop pod-claim-to-first-page from 32 seconds to 18. Combined with the pod pool already delivering an instant pod, the end-to-end cold start the user experienced went from 50 seconds to 18, on the existing codebase, before a single line of the rewrite existed.
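The round-trip consolidation is easy to see with a stand-in database that counts calls instead of talking to MariaDB. Everything here is illustrative: `FakeDatabase`, the table names, and the latency figure are assumptions, and the real stored procedure is not shown in the text.

```python
from dataclasses import dataclass


@dataclass
class FakeDatabase:
    """Counts round-trips instead of talking to a real MariaDB server."""

    tables: dict[str, list[dict]]
    round_trips: int = 0

    def query_table(self, name: str) -> list[dict]:
        self.round_trips += 1  # one network round-trip per table
        return self.tables[name]

    def call_load_all(self) -> dict[str, list[dict]]:
        self.round_trips += 1  # one round-trip returning every result set
        return dict(self.tables)


def load_sequentially(db: FakeDatabase, names: list[str]) -> dict[str, list[dict]]:
    """Original shape: one query per table, each waiting on the previous one."""
    return {name: db.query_table(name) for name in names}


def load_in_one_call(db: FakeDatabase) -> dict[str, list[dict]]:
    """Consolidated shape: a single call that returns all tables at once."""
    return db.call_load_all()


def estimated_wait_ms(round_trips: int, latency_ms: float) -> float:
    # Sequential round-trips add up linearly; the latency value is an assumption.
    return round_trips * latency_ms
```

Both loaders return the same data; only the number of round-trips differs. With a hypothetical 5 ms per round-trip, twelve sequential queries cost 60 ms of pure waiting before any query time is counted, and the single call costs 5 ms.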
Act II: the rewrite (started July 2025, first release January 2026)
After a year of optimising R Shiny, the architecture itself was the ceiling. Per-user R
processes don't scale to ten times the user count. The R ecosystem for web-first concerns
(auth, tenancy, caching) is thinner than the Python or TypeScript equivalents. And a
44,000-line R codebase (one large app.R plus surrounding modules and helper
scripts) was slowing every new feature. By mid-2025 I was convinced the platform needed a
clean rebuild, and in July I got the go-ahead to start one. I did not argue that R Shiny
had been a mistake; it had run a real business for ten years. I argued that the features
customers would want over the next decade needed a foundation it could not give them.
The stack I chose, with the reasoning:
| Layer | Choice | Why |
|---|---|---|
| Backend | FastAPI + Polars + Pydantic | Async-friendly, automatic OpenAPI, dataframe operations that beat anything R offers for the analytics workloads. Pydantic validates at the boundaries. |
| Frontend | React 19 + Vite + TanStack Query/Router | Tailwind and shadcn/ui for the component layer. A typed, modular SPA in place of one monolithic Shiny UI. |
| Database | MariaDB (Cloud SQL) | Same engine as the legacy app, so a shared production database bridges the two while modules move across. |
| Cache | DragonflyDB | Drop-in Redis protocol, much higher throughput per node. Cache-aside with Arrow IPC serialisation. |
| Auth | Auth0 | Separate staging and production tenants, so local development can never touch real user data. |
| Hosting | GKE | A managed control plane buys a tiny team what it could not cheaply build or keep running: high availability, automatic security patches, node auto-repair. Self-hosting that reliability is a full-time job, and one thunderstorm should not be able to take the product down. |
The win is maintainability, not raw speed
The rewrite needed no clever performance work at all. Polars on FastAPI is fast enough by
default: response times stay sub-second across every migrated module without anyone tuning
for it. What it bought instead was a codebase that is larger than the 44k-line
original and far easier to live in. It is modular, typed, and independently testable, and a
new feature now lands in hours rather than after a wrestling match with one giant app.R.
Act I was targeted optimisation on a system I could not replace yet: a pre-warmed pod pool,
a Polars port of the hot startup path, one stored procedure in place of a dozen sequential
round-trips. Act II was the opposite discipline, choosing an architecture solid enough that
none of that was necessary. Both are the same call made twice: put the effort where the
leverage actually is.
Architecture
Dumb router, smart service. Routers do HTTP semantics only (request
parsing, status codes, content types). Services own business logic and call data loaders.
Data loaders are layered: an abstract BaseDataLoader, a MariaDBDataLoader
for SQL, and a CachedDataLoader wrapping it with DragonflyDB cache-aside. Each layer
is independently testable; tests mock at the data-loader boundary so real service and
router code runs against known fixtures.
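A minimal sketch of that layering, using only the standard library so it runs without FastAPI or a database. The class names BaseDataLoader and CachedDataLoader come from the text; the `load` method, `FixtureDataLoader`, `BudgetService`, `get_total`, and the row shape are hypothetical, and a plain dict stands in for DragonflyDB.

```python
from abc import ABC, abstractmethod


class BaseDataLoader(ABC):
    @abstractmethod
    def load(self, key: str) -> list[dict]:
        """Return the rows for a logical dataset."""


class FixtureDataLoader(BaseDataLoader):
    """Test double that takes the place of MariaDBDataLoader."""

    def __init__(self, fixtures: dict[str, list[dict]]) -> None:
        self.fixtures = fixtures
        self.calls = 0

    def load(self, key: str) -> list[dict]:
        self.calls += 1
        return self.fixtures[key]


class CachedDataLoader(BaseDataLoader):
    """Cache-aside wrapper; a dict stands in for DragonflyDB."""

    def __init__(self, inner: BaseDataLoader) -> None:
        self.inner = inner
        self.cache: dict[str, list[dict]] = {}

    def load(self, key: str) -> list[dict]:
        if key not in self.cache:  # miss: read from the inner loader, then store
            self.cache[key] = self.inner.load(key)
        return self.cache[key]


class BudgetService:
    """Business logic only; knows nothing about HTTP."""

    def __init__(self, loader: BaseDataLoader) -> None:
        self.loader = loader

    def total_for(self, parish: str) -> int:
        rows = self.loader.load("budget_lines")
        return sum(r["amount"] for r in rows if r["parish"] == parish)


def get_total(service: BudgetService, parish: str) -> tuple[int, dict]:
    """Router-shaped function: turn the service result into a status and body."""
    return 200, {"parish": parish, "total": service.total_for(parish)}
```

Swapping `FixtureDataLoader` in at the loader boundary means the cache wrapper, the service, and the router-shaped function all run as real code in a test, while the data stays a known fixture.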
One React tree, app-wide. Under TanStack Router the whole platform is a single React root. Auth0, the React Query cache, and the nanostore state are created once at the root route, so they survive navigation: the cache stays warm, sidebar and filter state persist, and route changes are client-side and sub-100ms.
Snapshot-immutable analysis runs. Each Analyse run persists its input
configuration and its output together in an analysis_runs table (config as
JSON, results gzip-compressed), so any two runs can be diffed to explain why their numbers
differ. The new engines reproduce the legacy R calculations module by module, each checked
against a fixture from the legacy R app as it lands, with the last few still in progress.
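The snapshot idea can be sketched with an in-memory SQLite table in place of MariaDB. The table name analysis_runs and the JSON-plus-gzip storage follow the text; the column names, `save_run`, `load_run`, `diff_dicts`, and the sample values are assumptions for illustration.

```python
import gzip
import json
import sqlite3


def save_run(conn: sqlite3.Connection, config: dict, results: dict) -> int:
    """Persist one run: config as JSON text, results as gzip-compressed JSON."""
    cur = conn.execute(
        "INSERT INTO analysis_runs (config, results) VALUES (?, ?)",
        (
            json.dumps(config, sort_keys=True),
            gzip.compress(json.dumps(results, sort_keys=True).encode("utf-8")),
        ),
    )
    return cur.lastrowid


def load_run(conn: sqlite3.Connection, run_id: int) -> tuple[dict, dict]:
    """Read a run back as (config, results)."""
    config, blob = conn.execute(
        "SELECT config, results FROM analysis_runs WHERE id = ?", (run_id,)
    ).fetchone()
    return json.loads(config), json.loads(gzip.decompress(blob).decode("utf-8"))


def diff_dicts(a: dict, b: dict) -> dict:
    """Keys whose values differ, mapped to (value in a, value in b)."""
    keys = sorted(a.keys() | b.keys())
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


conn = sqlite3.connect(":memory:")  # SQLite stands in for MariaDB here
conn.execute(
    "CREATE TABLE analysis_runs (id INTEGER PRIMARY KEY, config TEXT, results BLOB)"
)
```

Because each row carries both the inputs and the outputs, explaining a changed number is a matter of loading two rows and diffing their configs: the differing keys are the candidate causes.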
The cutover: bifurcated, not a migration weekend
The cutover was not a gradual, module-by-module migration. In practice it was a bifurcation. At the first release in January 2026, an NGINX ingress in front of both apps sent everything to the new platform except the Analyse module, which kept routing to the legacy R app. Analyse is the rarer, more complex, slower-to-port use case, so it was deliberately left for last. Users never saw a migration weekend; they saw the whole product move to the new stack while one module stayed where it was.
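The routing rule itself is small. The sketch below models the decision in Python rather than reproducing the ingress configuration; the `/analyse` path prefix and the upstream names are hypothetical.

```python
LEGACY_PREFIXES = ("/analyse",)  # hypothetical path for the Analyse module


def upstream_for(path: str) -> str:
    """Pick the backend for a request path, mirroring the bifurcated ingress rule."""
    for prefix in LEGACY_PREFIXES:
        # Match the prefix itself or anything beneath it, but not look-alikes
        # such as "/analysed".
        if path == prefix or path.startswith(prefix + "/"):
            return "legacy-shiny"
    return "new-platform"
```

In this model, retiring the legacy app amounts to emptying `LEGACY_PREFIXES`: once no path maps to the old upstream, it can be switched off.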
Where things stand
The full product runs on the new platform today, with over 1,000 tests passing, including a parametrised access-matrix suite that checks every user persona against every endpoint group so no permission regression can ship undetected. The one piece still on the legacy app is Analyse, and it is the active work: several engines are in place, a couple still run a simpler interim model, and each is checked against the legacy R output as it is finished. When the last engine and the cross-module summary land, the NGINX route to the old app is removed and the legacy R Shiny platform is switched off for good. That release is 2.0.
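An access-matrix check of this kind can be sketched as a cross product of personas and endpoint groups compared against an expected table. The personas, groups, and `is_allowed` below are hypothetical stand-ins; the real lists and the real authorisation check are not given in the text.

```python
from itertools import product

# Hypothetical expectation table: which endpoint groups each persona may reach.
EXPECTED = {
    "deanery_admin": {"reports", "appropriations", "user_admin"},
    "parish_treasurer": {"reports", "appropriations"},
    "read_only": {"reports"},
}
ENDPOINT_GROUPS = ["reports", "appropriations", "user_admin"]


def is_allowed(persona: str, group: str) -> bool:
    """Stand-in for the platform's real authorisation check."""
    rules = {
        "reports": {"deanery_admin", "parish_treasurer", "read_only"},
        "appropriations": {"deanery_admin", "parish_treasurer"},
        "user_admin": {"deanery_admin"},
    }
    return persona in rules[group]


def access_matrix_failures() -> list[tuple[str, str]]:
    """Check every persona against every endpoint group; return the mismatches."""
    return [
        (persona, group)
        for persona, group in product(EXPECTED, ENDPOINT_GROUPS)
        if is_allowed(persona, group) != (group in EXPECTED[persona])
    ]
```

The point of the cross product is that denials are asserted as well as grants: a change that accidentally opens `user_admin` to `read_only` shows up as a mismatch. Under pytest, each pair would typically become its own case via `pytest.mark.parametrize`.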
What this case shows
- Optimise first, rewrite once the architecture is the ceiling. The R Shiny app got a year of focused performance and infrastructure work before any rewrite began. The 50-second login was 18 seconds before a single FastAPI route existed. That year bought the runway, and the rewrite was the right call only once per-user R processes could no longer scale, not before.
- The rewrite's value was maintainability, not speed. No Zig, no hand-tuning. A boring, modern, modular stack that is performant by default and that a small team can extend for years.
- A 26× build-time win changes how a team works. Shipping monthly versus shipping daily is not 30× faster delivery; it is a different culture. For a small team it is the most valuable single optimisation available.
Related work
This site
Tachyon
The same haversine kernel walked from a naïve pandas `.apply` through C++, Rust, Zig SIMD, and finally an analyzer-driven V7 in Zig that reads its own compiled assembly to land at 150 GB/s, plus a WebGPU compute lab in the browser. End-to-end demo of the optimisation work I do for clients.
Optimization DevOps Fullstack
Enterprise CI cluster
Jenkins pipeline right-sizing
Took 2,600 production pipelines from 8% to ~60% memory utilisation by building per-build telemetry, then designing bins from real percentile data. Same hardware, several multiples more headroom, no rewrite of any pipeline required.
DevOps Observability Optimization