Reach out with questions, bugs, or suggestions. We'll respond ASAP. Your message is sent! We'll respond via email shortly.

Heap SQL

Heap SQL lets you directly access your Heap data using SQL. You can run ad-hoc analyses, connect to BI tools such as Tableau, or join the raw Heap data with your own internal data sources.

Best of all, we automatically keep the SQL data up-to-date and optimize its performance for you. Define an event within the Heap interface, and in just a few hours, you'll be able to query it retroactively in a clean SQL format.

Heap SQL is available in our enterprise tier. To get started, email us at support@heapanalytics.com, and we'll be in touch shortly. You can also check out our guide here to look at some common use cases and queries.

This recorded webinar provides everything you need to know about getting data out of Heap.

 

(Total running time: 31 minutes.)

Setup

Heap SQL is powered by Amazon Redshift. In order to get started, you'll need to choose one of two options:

In both scenarios, you need to be careful that any changes you make to the cluster do not interfere with Heap's dynamic schema updates.

Schema

All data in your account is hosted within a single Redshift cluster.

Data in separate environments are stored in separate schemas. The schemas are named using the project name, then an underscore, followed by the environment name. For instance, each table in your main production environment will be part of the main_production schema, while the tables in the development environment of a project named "My iOS App" will be in a schema called my_ios_app_development.

For each environment, Heap will create the following tables:

  1. One users table.
  2. One sessions table.
  3. One table for each event you've defined in the Heap interface or logged via our custom API.
  4. One all_events table which contains all defined or custom events.

These tables have a dynamic schema that auto-update to reflect any new events or properties you've tracked.

Users Table

This table is called users and resides under the projectname_environmentname schema. It contains a row for each user in the environment. The column schema is:

Column Name Type Description
user_id BIGINT Unique ID of user, randomly generated by Heap.
identity TEXT User's username or other unique token, passed via heap.identify API. Must be unique.
handle TEXT User's username or other unique token, passed in via heap.addUserProperties API.
email TEXT User's email address, passed in via heap.addUserProperties API.
joindate TIMESTAMP UTC timestamp when the user was first seen.
last_modified TIMESTAMP UTC timestamp when the user's data was last modified.
user properties… TEXT There will be one column for every unique user property you've sent via the heap.addUserProperties API (name, gender, account status, etc). The column type is automatically inferred from the underlying property values.

Sessions Table

For web, a session ends after 30 minutes of inactivity from the user. For iOS, a session ends after your app has entered the background. The sessions table has the following schema:

Column Name Type Library Description
event_id BIGINT All Unique ID of associated session event, randomly generated by Heap used internally by Heap for data syncing.
user_id BIGINT All Unique ID of associated user, randomly generated by Heap.
session_id BIGINT All Unique ID of associated session, randomly generated by Heap.
time TIMESTAMP All UTC timestamp when session started.
library TEXT All Version of Heap library which initiated the session. Can be one of "web" or "iOS".
platform TEXT Web iOS User's operating system.
device_type TEXT Web iOS Device type, which can be one of "Mobile", "Tablet", or "Desktop".
country TEXT Web iOS Country in which user session occurred, based on IP.
region TEXT Web iOS Region in which user session occurred, based on IP.
city TEXT Web iOS City in which user session occurred, based on IP.
IP TEXT Web iOS The IP address for the session, which is used for determining geolocation.
referrer TEXT Web URL that linked to your site and initiated the session. If the user navigated directly to your site, or referral headers were stripped, then this value will appear as direct.
landing_page TEXT Web URL of the first pageview of the session.
browser TEXT Web User's browser.
search_keyword TEXT Web Search term that brought the user to your site.
utm_source TEXT Web GA-based utm_source tag associated with the session's initial pageview.
utm_campaign TEXT Web GA-based utm_campaign tag associated with the session's initial pageview.
utm_medium TEXT Web GA-based utm_medium tag associated with the session's initial pageview.
utm_term TEXT Web GA-based utm_term tag associated with the session's initial pageview.
utm_content TEXT Web GA-based utm_content tag associated with the session's initial pageview.
device TEXT iOS User's device model.
carrier TEXT iOS User's mobile carrier.
app_name TEXT iOS Current name of iOS app, as determined by CFBundleName.
app_version TEXT iOS Current version of iOS app, as determined by CFBundleShortVersionString.

Event Tables

Heap will create one table for each of the following:

  1. Every defined event you've created within the Heap interface.
  2. Every custom event you've tracked via our custom API.

For each event table, the columns are determined by the version of the Heap library that sent the event, along with any custom properties you may have added via Snapshots or the custom API. For instance, if an event was sent by the Heap iOS library, its SQL table will contain iOS-specific columns like app_version or carrier, but not web-specific columns like landing_page or search_keyword. This helps keep your schema as clean as possible.

The name of these tables will be the event_name itself and reside under the projectname_environmentname schema. To produce the event_name portion, we strip the event name of any non-alphanumeric characters and snake-case the result. For example, an event named "Sign Up - Click Link (Any)" becomes sign_up_click_link_any. (Note that this means it's possible for events to collide into the same table. Make sure your event names are sufficiently distinct!)

Each of these event tables will be fully retroactive, meaning it will contain a row for every occurrence of the event since the day you installed Heap. The column schema is:

Column Name Type Libraries Description
event_id BIGINT All Unique ID of associated event, randomly generated by Heap.
user_id BIGINT All Unique ID of associated user, randomly generated by Heap.
session_id BIGINT All Unique ID of associated session, randomly generated by Heap.
time TIMESTAMP All UTC timestamp when event happened.
session_time TIMESTAMP All Timestamp when session started.
type TEXT All For web auto-tracked events, can be any of view page, click, submit, change, with push state events registered as view page events. For iOS auto-tracked events, can be touch, edit field, or a gesture recognizer you've defined. For custom events, this will be the custom event name.
library TEXT All Version of Heap library on which event occurred. Can be one of "web", "iOS", or "server".
platform TEXT Web iOS User's operating system.
device_type TEXT Web iOS Device type, which can be one of "Mobile", "Tablet", or "Desktop".
country TEXT Web iOS Country in which user session occurred, based on IP.
region TEXT Web iOS Region in which user session occurred, based on IP.
city TEXT Web iOS City in which user session occurred, based on IP.
IP TEXT Web iOS The IP address for the session, which is used for determining geolocation.
referrer TEXT Web URL that linked to your site and initiated the session. If the user navigated directly to your site, or referral headers were stripped, then this value will appear as direct.
landing_page TEXT Web URL of the first pageview of the session.
browser TEXT Web User's browser.
search_keyword TEXT Web Search term that brought the user to your site.
utm_source TEXT Web GA-based utm_source tag associated with the session's initial pageview.
utm_campaign TEXT Web GA-based utm_campaign tag associated with the session's initial pageview.
utm_medium TEXT Web GA-based utm_medium tag associated with the session's initial pageview.
utm_term TEXT Web GA-based utm_term tag associated with the session's initial pageview.
utm_content TEXT Web GA-based utm_content tag associated with the session's initial pageview.
domain TEXT Web Domain including subdomain, e.g. blog.heapanalytics.com.
path TEXT Web Portion of the current URL following your domain, e.g. /docs for heapanalytics.com/docs.
hash TEXT Web Portion of the current URL following the hash sign, e.g. #install for heapanalytics.com/docs#install.
query TEXT Web Query params of the page's current URL, e.g. ?utm_id=1234 for heapanalytics.com?utm_id=1234.
title TEXT Web Title of the current page.
href TEXT Web href property of link (used for clicks on anchor tags).
device TEXT iOS User's device model.
carrier TEXT iOS User's mobile phone carrier.
app_name TEXT iOS Current name of iOS app, as determined by CFBundleName.
app_version TEXT iOS Current version of iOS app, as determined by CFBundleShortVersionString.
action_method TEXT iOS Name of the action method triggered by this event, e.g. loginButtonWasPressed.
view_controller TEXT iOS Name of the current view controller.
screen_a11y_id TEXT iOS accessibilityIdentifier for the current view controller.
screen_a11y_label TEXT iOS accessibilityLabel for the current view controller.
target_view_class TEXT iOS Underlying class name of an iOS action's target, e.g. UITableCellView.
target_view_name TEXT iOS Instance variable name of an iOS action's target, e.g. loginButtonView.
target_a11y_id TEXT iOS accessibilityIdentifier of an iOS action's target.
target_a11y_label TEXT iOS accessibilityLabel of an iOS action's target.
target_text TEXT Web iOS Button text of the event target.
Event properties… TEXT All There will be one column for every unique event property you've attached, either via the heap.track API, the addEventProperties API, or Snapshots. The column type is automatically inferred from the underlying property values.

All Events Table

Heap will also create a table called all_events, which contains all events you've defined in the Heap interface or logged via our custom API. The all_events table will have the following columns.

Column Name Type Description
event_id BIGINT Unique ID of associated event, randomly generated by Heap.
user_id BIGINT Unique ID of associated user, randomly generated by Heap.
session_id BIGINT Unique ID of associated session, randomly generated by Heap.
time TIMESTAMP UTC timestamp when event happened.
event_table_name TEXT Name of the table where similar events can be found.

Data Syncing

Your Redshift cluster periodically syncs to your raw Heap data.

As you define new events and modify existing events, we seamlessly update your cluster schema. This ensures Heap SQL is always a clean, accurate representation of your raw data.

In particular, the following changes can happen on each sync:

Performance Optimization

Under the hood, we tune your Redshift schema to ensure that Heap SQL is as performant as possible. In particular, this means:

To learn more about concepts such as column compression and node distribution, take a look at the Amazon Redshift docs.

FAQ

Why is a given event_id included more than once on the all_events table?

A given event_id across all Redshift tables corresponds with a unique event recorded by Heap. That said, the same event_id may exist in multiple event tables as Heap provides the flexibility to create multiple event definitions that may correspond to the same raw event. For instance, you may define the following two events in the product:

If a user clicks the CTA on the homepage, a new event will be recorded on both event tables that correspond with both event definitions. As a result, two events with the same event_id will be included in the all_events table as it contains every recorded instance of all defined and custom events.

Why are constraints not applied to the tables?

Primary and foreign key constraints are not applied to the Heap SQL tables because Amazon Redshift does not enforce constraints.