Last modified January 14, 2021 by Shelly Wolfe

Automating user DB imports

Tutorial objective: Automate the download and import of User DB data sets from Swrve.
If you are using the Swrve Insights hosted solution, you can skip this tutorial as Swrve already loads all of this data.

This tutorial provides examples of how to automate the download and import of User DB data sets from Swrve.

The examples below demonstrate how to load a User DB using the Swrve Export API for User DBs. The API is called from scripts that you can run periodically. Swrve supports both Amazon Redshift and MySQL databases.


Prerequisites

The following prerequisites are required to complete this tutorial:

  • Experience using the command line shell.
  • Experience with basic SQL commands such as SELECT, WHERE and JOIN. As the data format is straightforward, advanced SQL knowledge is not essential.
  • A MySQL installation or an active Amazon Redshift cluster.

Scripted import

You can easily create a script that downloads the user database files for a given app and populates a Redshift or MySQL database. The script should complete the following steps (a skeleton sketch is shown after the list):

  1. Create the required database tables.
  2. Download the CSV data files to your local computer.
  3. Import the CSV data files into the database.
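
One possible shape for such a script, sketched in Python. The function bodies correspond to the three steps below, and the API key, personal key, and app ID are placeholders for your own values.

# Skeleton of an automated User DB import script (Python).
# API_KEY, PERSONAL_KEY, and APP_ID are placeholders for your own values;
# the function bodies are sketched in the steps that follow.

API_KEY = "<app api_key>"
PERSONAL_KEY = "<personal_key>"
APP_ID = 12345

def create_tables():
    """Step 1: download the schema file and execute it against the database."""

def download_csv_files():
    """Step 2: download the CSV files (and, for Redshift, upload them to S3)."""

def import_csv_files():
    """Step 3: load the CSV data into Redshift or MySQL."""

if __name__ == "__main__":
    create_tables()
    download_csv_files()
    import_csv_files()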

Step 1: Create the required database tables

Create a database called “app_12345” for Swrve app 12345 using the following command:

Redshift

createdb --host=<redshift cluster> --port=5439 --username=<redshift master> app_12345

MySQL

mysqladmin -u mysql_user -p create app_12345

You can download the schema file from the User DB Downloads screen. To automate this, download the schema file referenced by the Export API.

This article references Swrve’s URLs for all data and content stored in both our US and EU data centers. Use the relevant example based on your app configuration. For more information, see How do I configure the Swrve SDK for EU data storage?

US

https://dashboard.swrve.com/api/1/userdbs.json?api_key=<app api_key>&personal_key=<personal_key>

EU

https://eu-dashboard.swrve.com/api/1/userdbs.json?api_key=<app api_key>&personal_key=<personal_key>

More details on the User DB Export API are available in the Export API guide.

The JSON returned by this API contains the address of your schema file. Next, download this file. In pseudo-code (again for Swrve app 12345):

Redshift

download(user_db_info["schemas"]["redshift"] + '?api_key=<app api_key>&personal_key=<personal_key>')

MySQL

download(user_db_info["schemas"]["mysql"] + '?api_key=<app api_key>&personal_key=<personal_key>')
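
The same download can be written as a small, runnable Python sketch. It assumes the requests library and the US endpoint (swap in eu-dashboard.swrve.com for EU apps), and the local file name it writes to is arbitrary.

import requests

API_KEY = "<app api_key>"
PERSONAL_KEY = "<personal_key>"
CREDS = f"api_key={API_KEY}&personal_key={PERSONAL_KEY}"

# Fetch the User DB description for the app (US endpoint shown).
user_db_info = requests.get(
    f"https://dashboard.swrve.com/api/1/userdbs.json?{CREDS}"
).json()

# Download the schema file; use ["schemas"]["mysql"] for a MySQL database.
schema_response = requests.get(user_db_info["schemas"]["redshift"] + "?" + CREDS)
with open("all-users_schema.sql", "wb") as schema_file:
    schema_file.write(schema_response.content)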

This file is a SQL document, which you can execute to create the required tables for your CSV data:

Redshift

psql --host=<redshift cluster> --port=5439 --username=<redshift master> app_12345 < all-users_12345_redshift.sql

MySQL

mysql -u mysql_username -p app_12345 < all-users_12345_mysql.sql
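
To run this step from the same script rather than the shell, a minimal sketch using the standard subprocess module (Redshift shown; substitute the mysql command above for MySQL, and use whatever file name you actually downloaded):

import subprocess

# Execute the downloaded schema file to create the tables.
# "all-users_schema.sql" is the name used in the download sketch above.
with open("all-users_schema.sql") as schema_file:
    subprocess.run(
        ["psql", "--host=<redshift cluster>", "--port=5439",
         "--username=<redshift master>", "app_12345"],
        stdin=schema_file,
        check=True,
    )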

Step 2: Download the CSV data files to your local computer

You can download the CSV data files from the User DB Downloads screen. To automate this, download the CSV data files using the Export API.

US

https://dashboard.swrve.com/api/1/userdbs.json?api_key=<app api_key>&personal_key=<personal_key>

EU

https://eu-dashboard.swrve.com/api/1/userdbs.json?api_key=<app api_key>&personal_key=<personal_key>
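
The same userdbs.json response also lists the download URLs for the individual CSV data files. The sketch below is illustrative only: the "data_files" key is an assumed name, so check the Export API guide for the actual response structure before relying on it.

import os
import requests

API_KEY = "<app api_key>"
PERSONAL_KEY = "<personal_key>"
CREDS = f"api_key={API_KEY}&personal_key={PERSONAL_KEY}"

user_db_info = requests.get(
    f"https://dashboard.swrve.com/api/1/userdbs.json?{CREDS}"
).json()

# "data_files" is an assumed key name for the list of CSV download URLs;
# consult the Export API guide for the real response structure.
for file_url in user_db_info["data_files"]:
    local_name = os.path.basename(file_url)
    response = requests.get(file_url + "?" + CREDS)
    with open(local_name, "wb") as data_file:
        data_file.write(response.content)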

Redshift

Redshift requires that the files being imported exist in an Amazon S3 bucket.

Upload each of these files to an S3 bucket. If you have enabled raw event exports, you can use s3://swrveexternal-<company_name>/app-<app_id>/userdb/. If not, use a bucket of your choosing and adjust the commands below accordingly.

You can upload the files as is (that is, still compressed), because Redshift supports gzip compression.

s3cmd put <file_name> s3://swrveexternal-<company_name>/app-12345/userdb/
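
If you would rather keep the upload inside the same script, a sketch using the boto3 library (instead of s3cmd) could look like this. It assumes your AWS credentials are already configured in the environment and uses the same bucket and prefix as above.

import glob
import boto3

# Upload each compressed CSV file to S3 for the Redshift COPY step.
s3 = boto3.client("s3")
for file_name in glob.glob("*.gz"):
    s3.upload_file(file_name, "swrveexternal-<company_name>",
                   f"app-12345/userdb/{file_name}")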

MySQL

The data for your users is contained in compressed CSV files. MySQL import does not support gzip compression, so you must decompress each CSV file before use.

gunzip *.gz
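
If you prefer to decompress from the same script rather than the shell, the Python standard library is enough; a minimal sketch:

import glob
import gzip
import shutil

# Decompress each downloaded .gz file so MySQL can load the plain CSV.
for gz_name in glob.glob("*.gz"):
    with gzip.open(gz_name, "rb") as gz_file, \
         open(gz_name[:-3], "wb") as csv_file:  # strip the ".gz" suffix
        shutil.copyfileobj(gz_file, csv_file)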

Step 3: Import the CSV data files into the database

Redshift

From the psql terminal, run this command per data set.

psql --host=<redshift cluster> --port=5439 --username=<redshift master> app_12345
COPY <table_name>
FROM 's3://swrveexternal-<company_name>/app-12345/userdb/all-users_12345_<yyyy-mm-dd>_<table_name>.1_0'
CREDENTIALS 'aws_access_key_id=<access_key_id>;aws_secret_access_key=<secret_access_key>'
DELIMITER ','
IGNOREHEADER 1
ESCAPE
GZIP;
Avoid duplicate data. Redshift does not enforce UNIQUE constraints. If you run the above commands multiple times, data will be duplicated. We suggest dropping all relevant tables before importing new data.
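
If you run the import from a script instead of the psql terminal, the same COPY can be issued through a PostgreSQL driver such as psycopg2. The sketch below uses TRUNCATE as a stand-in for the drop-and-recreate suggested above so that repeated runs do not duplicate data; table names, paths, and credentials are placeholders.

import psycopg2

# Connect to the Redshift cluster and load one table.
conn = psycopg2.connect(
    host="<redshift cluster>", port=5439, user="<redshift master>",
    password="<password>", dbname="app_12345",
)
with conn, conn.cursor() as cur:
    # Redshift does not enforce UNIQUE constraints, so clear the table first.
    cur.execute("TRUNCATE <table_name>")
    cur.execute("""
        COPY <table_name>
        FROM 's3://swrveexternal-<company_name>/app-12345/userdb/all-users_12345_<yyyy-mm-dd>_<table_name>.1_0'
        CREDENTIALS 'aws_access_key_id=<access_key_id>;aws_secret_access_key=<secret_access_key>'
        DELIMITER ',' IGNOREHEADER 1 ESCAPE GZIP
    """)
conn.close()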

MySQL

From the MySQL terminal, run this command per data set.

mysql -u mysql_username -p app_12345
LOAD DATA LOCAL INFILE '<csv filename>'
INTO TABLE <table name>
FIELDS
 TERMINATED BY ','
 OPTIONALLY ENCLOSED BY '\''
IGNORE 1 LINES;
Because of foreign key constraints, you must import the swrve_properties table before importing the other tables.
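
To script the MySQL side, you can issue the same LOAD DATA statement per table through the mysql client, loading swrve_properties first to satisfy the foreign key constraints. In this sketch the other table names and CSV file names are placeholders.

import subprocess

# swrve_properties must be loaded first because of the foreign key
# constraints; the remaining table names are placeholders.
TABLES = ["swrve_properties", "<other_table>"]

for table in TABLES:
    load_sql = (
        f"LOAD DATA LOCAL INFILE '<csv filename for {table}>' "
        f"INTO TABLE {table} "
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\\'' "
        "IGNORE 1 LINES;"
    )
    subprocess.run(
        ["mysql", "-u", "mysql_username", "-p", "--local-infile=1",
         "app_12345", "-e", load_sql],
        check=True,
    )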

Next steps

That’s it! With user properties loaded, you can query the current state of every user who has used your app in the last year.


Need a hosted solution?
Most Swrve customers can self-host this pipeline; all you have to do is follow the steps in these tutorials. However, if you prefer a turn-key hosted solution, we do offer one as a professional service. There is an additional fee associated with this service. For more information, contact your CSM at support@swrve.com.
Need help with queries?
Swrve support can help you with basic setup and configuration of the pipeline described above. If you need help with your queries, contact our Data Services team at support@swrve.com. They will help you get the most out of your data with dedicated support, pre-built and custom reports, and dedicated data science hours.