Heap Blog

Engineering

Analyzing Millions of Postgres Query Plans

Making Heap fast is a unique and particularly difficult adventure in performance engineering. Our customers run hundreds of thousands of queries per week and each one is unique. What’s more, our product is designed for rich, ad hoc analyses, so the resulting SQL is unboundedly complex. For some background, Heap is a tool for analyzing […]

Migrating To React + MobX While Shipping New Features

A year ago our front-end was written in a cumbersome combination of Backbone, TypeScript, and a custom state management layer. It was maintainable, but we wanted to ship features faster than it would let us. We wanted to migrate to a React + MobX architecture, but we couldn’t afford to spend six months rewriting most […]

Terraform Gotchas And How We Work Around Them

Heap’s infrastructure runs on AWS, and we manage it using Terraform. This post is a collection of tips and gotchas we’ve picked up along the way. Terraform and infrastructure as code Terraform is a tool from Hashicorp to help manage infrastructure declaratively. Instead of manually creating instances, networks, and so on in your cloud provider’s […]

How Basic Performance Analysis Saved Us Millions

This is the story of how I applied basic performance analysis techniques to find a small change that resulted in a 10x improvement in CPU use for our Postgres cluster and will save Heap millions of dollars over the next year. Indexing Data for Customer Analytics Heap is a customer analytics tool that automatically captures […]

Redshift Pitfalls And How To Avoid Them

Amazon Redshift is a data warehouse that’s orders of magnitude cheaper than traditional alternatives. Many companies use it because it has made data warehousing viable for smaller companies with limited budgets. Since so many Heap customers use Redshift, we built Heap SQL to allow them to sync their Heap datasets to their own Redshift clusters. […]

When To Avoid JSONB In A PostgreSQL Schema

PostgreSQL introduced the JSONB type in 9.4 with considerable celebration. (Well, about as much as you can expect for a new data type in an RDBMS.) It’s a wonderful feature: a format that lets you store blobs in the lingua franca of modern web services, without requiring re-parsing whenever you want to access a field, […]

Goodbye CoffeeScript, Hello TypeScript

Web apps are becoming increasingly feature-rich, and the Heap frontend is no different. We expose an interface that lets users organize their data and build custom visualizations. Nearly every interaction changes an underlying model and there are subtle rules around how the UI behaves. Our previous stack of Backbone and CoffeeScript wasn’t scaling well to […]

Speeding Up PostgreSQL With Partial Indexes

Did you know PostgreSQL supports indexing a subset of your table? This enables very fast reads from that subset with almost no index overhead. It’s often the best way to index your data if you want to repeatedly analyze rows that match a given WHERE clause. This makes PostgreSQL a great fit for workflows that […]
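
As a rough illustration of the idea, here is a minimal sketch of a partial index. It makes several assumptions: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (SQLite accepts a similar CREATE INDEX ... WHERE form), and the events table, its columns, the index name, and the helper functions are all hypothetical, not taken from the post.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table with a partial index (hypothetical schema)."""
    # Assumption: SQLite stands in for PostgreSQL so the sketch is self-contained.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        " id INTEGER PRIMARY KEY,"
        " type TEXT NOT NULL,"
        " time INTEGER NOT NULL)"
    )
    # One row in every hundred is a 'signup'; the rest are 'pageview' rows.
    rows = [("signup" if i % 100 == 0 else "pageview", i) for i in range(10_000)]
    conn.executemany("INSERT INTO events (type, time) VALUES (?, ?)", rows)
    # The partial index only contains rows matching its WHERE clause,
    # so it stays small even though the table is mostly pageviews.
    conn.execute(
        "CREATE INDEX idx_signup_time ON events (time) WHERE type = 'signup'"
    )
    return conn


def recent_signups(conn, since):
    """Return (id, time) for signups at or after `since`, oldest first."""
    # The literal predicate type = 'signup' matches the index's WHERE clause,
    # which is what makes the partial index eligible for this query.
    return conn.execute(
        "SELECT id, time FROM events"
        " WHERE type = 'signup' AND time >= ?"
        " ORDER BY time",
        (since,),
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(recent_signups(conn, 9_000))
    # Ask the planner how it would run the query.
    for row in conn.execute(
        "EXPLAIN QUERY PLAN SELECT id, time FROM events"
        " WHERE type = 'signup' AND time >= 9000"
    ):
        print(row)
```

The query repeats the index's predicate verbatim. A query that filters on a different type, or omits the predicate, cannot be answered from this index.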

PostgreSQL’s Powerful New Join Type: LATERAL

PostgreSQL 9.3 has a new join type! Lateral joins arrived without a lot of fanfare, but they enable some powerful new queries that were previously only tractable with procedural code. In this post, I’ll walk through a conversion funnel analysis that wouldn’t be possible in PostgreSQL 9.2. What is a LATERAL join? The best description […]

Building Automated Analytics Logging for iOS Apps

Analytics is often the first tool developers add to their iOS app. A standard approach is to write logging code like this: Let’s call this manual event-tracking. With manual event-tracking, you write logging code for each analytics event you care about. A logEvent: for signing in, a logEvent: for inviting a friend, a logEvent: for […]

Creating PostgreSQL Arrays Without A Quadratic Blowup

At Heap, we lean on PostgreSQL for most of the backend heavy lifting.[1] We store each event as an hstore blob, and we keep a PostgreSQL array of events done by each user we track, sorted by time. Hstore lets us attach properties to events in a flexible way, and arrays of events give us […]