Automating user DB imports
This tutorial provides examples of how to automate the download and import of User DB data sets from Swrve.
The examples below demonstrate how to load a User DB using the Swrve Export API for User DBs. The API calls are made from scripts that you can run periodically. Swrve supports both Amazon Redshift and MySQL databases.
Prerequisites
The following prerequisites are required to complete this tutorial:
- Experience using the command line shell.
- Experience with basic SQL commands such as SELECT, WHERE and JOIN. As the data format is straightforward, advanced SQL knowledge is not essential.
- You have installed MySQL or have an active Amazon Redshift Cluster.
Scripted import
You can easily create a script to download the user database files for a given app and populate a Redshift or MySQL database. This script should complete the following steps, as shown in the sketches below:
- Create the required database tables.
- Download the CSV data files to your local computer.
- Import the CSV data files into the database.
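The examples in the rest of this tutorial assume a handful of shared shell variables, collected here in a minimal sketch. Every value is a placeholder for your own keys, hosts, and IDs:
#!/bin/bash
# Shared settings for the import job -- all values are placeholders.
set -euo pipefail

API_KEY="<app api_key>"           # Swrve app API key
PERSONAL_KEY="<personal_key>"     # Swrve personal key
APP_ID="12345"                    # Swrve app ID
DB="app_${APP_ID}"                # target database name
# Use https://eu-dashboard.swrve.com/api/1 instead for EU-hosted apps
EXPORT_API="https://dashboard.swrve.com/api/1"
# Redshift only: S3 staging bucket for the CSV files
S3_BUCKET="s3://swrveexternal-<company_name>/app-${APP_ID}/userdb"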
Step 1: Create the required database tables
Create a database called “app_12345” for Swrve app 12345 using the command for your database type:
Redshift
createdb --host=<redshift cluster> --port=5439 --username=<redshift master> app_12345
MySQL
mysqladmin -u mysql_user -p create app_12345
You can download the schema file manually from the User DB Downloads screen. To automate this, download the schema file referenced by the Export API.
US
https://dashboard.swrve.com/api/1/userdbs.json?api_key=<app api_key>&personal_key=<personal_key>
EU
https://eu-dashboard.swrve.com/api/1/userdbs.json?api_key=<app api_key>&personal_key=<personal_key>
More details on the User DB Export API are available in the Export API guide.
The JSON returned by this API includes the address of your schema file, which you then need to download. In pseudo-code (again for Swrve app 12345):
Redshift
download(user_db_info["schemas"]["redshift"] + '?api_key=<app api_key>&personal_key=<personal_key>')
MySQL
download(user_db_info["schemas"]["mysql"] + '?api_key=<app api_key>&personal_key=<personal_key>')
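As a concrete version of that pseudo-code, the sketch below fetches the manifest with curl and extracts the schema address with jq (assumed to be installed), using the shared variables defined earlier:
# Fetch the User DB manifest for the app
curl -s "${EXPORT_API}/userdbs.json?api_key=${API_KEY}&personal_key=${PERSONAL_KEY}" -o userdbs.json
# Pick .schemas.redshift or .schemas.mysql to match your database
SCHEMA_URL=$(jq -r '.schemas.mysql' userdbs.json)
# Download the schema SQL under the filename used in the next command
curl -s -o "all-users_${APP_ID}_mysql.sql" "${SCHEMA_URL}?api_key=${API_KEY}&personal_key=${PERSONAL_KEY}"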
This file is a SQL document, which you can execute to create the required tables for your CSV data:
Redshift
psql --host=<redshift cluster> --port=5439 --username=<redshift master> app_12345 < all-users_12345_redshift.sql
MySQL
mysql -u mysql_username -p app_12345 < all-users_12345_mysql.sql
Step 2: Download the CSV data files to your local computer
You can download the CSV data files from the User DB Downloads screen. To automate this, download the CSV data files using the Export API.
US
https://dashboard.swrve.com/api/1/userdbs.json?api_key=<app api_key>&personal_key=<personal_key>
EU
https://eu-dashboard.swrve.com/api/1/userdbs.json?api_key=<app api_key>&personal_key=<personal_key>
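A hedged sketch of automating the download: pull each CSV address from the same manifest. This tutorial only documents the schemas key of the JSON, so the data_files key below is an assumption; inspect userdbs.json for your app and adjust the jq filter to the actual structure:
# Reuse (or re-fetch) the manifest from step 1
curl -s "${EXPORT_API}/userdbs.json?api_key=${API_KEY}&personal_key=${PERSONAL_KEY}" -o userdbs.json
# NOTE: "data_files" is an assumed key name -- check your actual JSON
jq -r '.data_files[]' userdbs.json | while read -r url; do
  # Keep the remote filename, authenticating with the same keys
  curl -s -o "$(basename "$url")" "${url}?api_key=${API_KEY}&personal_key=${PERSONAL_KEY}"
done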
Redshift
Redshift requires that the files being imported exist in an Amazon S3 bucket.
Upload each of these files to an S3 bucket. If you have enabled raw event exports, you can use s3://swrveexternal-<company_name>/app-<app_id>/userdb/. If not, use a bucket of your choosing and adjust the commands below appropriately.
You can upload the files as is, still compressed, because Redshift's COPY command supports gzip compression.
s3cmd put <file_name> s3://swrveexternal-<company_name>/app-12345/userdb
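To stage every downloaded file in one pass, a simple loop works (s3cmd must already be configured with your AWS credentials; the glob assumes the gzipped CSVs are in the current directory):
# Upload all gzipped CSV data files to the staging bucket
for f in *.gz; do
  s3cmd put "$f" "${S3_BUCKET}/"
done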
MySQL
The data for your users is contained in compressed CSV files. MySQL import does not support gzip compression, so you must decompress each CSV file before use.
gunzip *.gz
Step 3: Import the CSV data files into the database
Redshift
Connect to the database with psql, then run the following command once per data set:
psql --host=<redshift cluster> --port=5439 --username=<redshift master> app_12345
COPY <table_name> FROM 's3://swrveexternal-<company_name>/app-12345/userdb/all-users_12345_<yyyy-mm-dd>_<table_name>.1_0' CREDENTIALS 'aws_access_key_id=<access_key_id>;aws_secret_access_key=<secret_access_key>' DELIMITER ',' IGNOREHEADER 1 ESCAPE GZIP;
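To run this unattended rather than from an interactive session, you can issue one COPY per table with psql -c. The table names below are hypothetical; take the real list from the schema file you downloaded, and set PGPASSWORD or a ~/.pgpass entry for password-less runs:
# Hypothetical table names -- read the real ones from the schema SQL file
TABLES="users user_properties"
for t in $TABLES; do
  psql --host=<redshift cluster> --port=5439 --username=<redshift master> "$DB" \
    -c "COPY $t FROM '${S3_BUCKET}/all-users_${APP_ID}_<yyyy-mm-dd>_${t}.1_0' CREDENTIALS 'aws_access_key_id=<access_key_id>;aws_secret_access_key=<secret_access_key>' DELIMITER ',' IGNOREHEADER 1 ESCAPE GZIP;"
done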
MySQL
Connect to the database with the mysql client, then run the following command once per data set:
mysql -u mysql_username -p app_12345
LOAD DATA LOCAL INFILE '<csv filename>' INTO TABLE <table name> FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\'' IGNORE 1 LINES;
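The same step can be looped over the decompressed files, deriving each table name from its filename. The pattern below is an assumption based on the naming shown above, so adjust it to your actual downloads; note that LOAD DATA LOCAL also requires the local_infile option to be enabled on both client and server:
# Filename pattern is assumed -- adjust to match your decompressed files
for f in all-users_${APP_ID}_*; do
  # Extract the trailing <table_name> portion of the filename (assumed layout)
  table=$(echo "$f" | sed -E "s/^all-users_${APP_ID}_[0-9-]+_([A-Za-z_]+).*/\1/")
  # -p prompts each time; use a ~/.my.cnf option file for unattended runs
  mysql --local-infile=1 -u mysql_username -p "$DB" \
    -e "LOAD DATA LOCAL INFILE '$f' INTO TABLE $table FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\\'' IGNORE 1 LINES;"
done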
Next steps
That’s it! With user properties loaded, you can query the current state of every user who has used your app in the last year.
- Run some simple sample user DB queries.
- For more advanced sample queries, see Top five user lifecycle queries and Top five monetization queries.
Most Swrve customers can self-host this pipeline; all you have to do is follow the steps in these tutorials. However, if you prefer a turn-key hosted solution, we do offer one as a professional service. There is an additional fee associated with this service. For more information, contact your CSM at support@swrve.com.
Swrve support can help you with basic setup and configuration of the pipeline described above. If you need help with your queries, contact our Data Services team at support@swrve.com. They will help you get the most out of your data with dedicated support, pre-built and custom reports, and dedicated data science hours.