The Definitive Setup Guide for AWS Athena Analytics

The 60 Second Setup, Zero Administration Guide to AWS Athena

Openbridge
Feb 12, 2018

Since Amazon introduced Athena, an interactive query service that uses standard SQL to query data directly within an S3 data lake, interest in serverless database and query technologies has gained momentum.

Teams are attracted to a solution that has no infrastructure to maintain and uses pay-per-query pricing. Teams are also building innovative, cost-effective serverless business intelligence stacks with Apache Parquet, Tableau, and Amazon Athena.

This article explains how to quickly set up the service to save time and money on data optimization.

What is AWS Athena with Zero Administration Data Pipelines?

If you frequently query a large amount of data, there are performance optimization and cost reduction techniques recommended by Amazon.

We launched our code-free, zero administration AWS Athena data pipeline service to simplify the adoption of AWS Athena. It automates those AWS Athena optimization techniques and a few more:

  • 60-second setup
  • fully automated configuration with database and table creation
  • data partitioning
  • conversion to columnar format (Parquet)
  • data compression
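To make the partitioning item above concrete, here is a minimal sketch of a Hive-style key layout (`key=value` path segments), a convention Athena can use to skip partitions a query does not need. The table name, file name, and the choice of year/month/day partition columns are hypothetical; this is not necessarily the layout Openbridge produces.

```python
from datetime import date


def partitioned_key(table: str, day: date, filename: str) -> str:
    """Build a Hive-style S3 key: table/year=YYYY/month=MM/day=DD/filename."""
    return (
        f"{table}/year={day.year:04d}/month={day.month:02d}/"
        f"day={day.day:02d}/{filename}"
    )


# Hypothetical table and file names, for illustration only.
print(partitioned_key("orders", date(2018, 2, 12), "part-0000.snappy.parquet"))
# prints: orders/year=2018/month=02/day=12/part-0000.snappy.parquet
```

A query that filters on the partition columns only needs to read objects under the matching prefixes, which is where the cost savings of partitioning come from under pay-per-query pricing.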

Get Started with Automated AWS Athena Data Pipelines

Getting started is quick and easy. There are two steps: the first is setting up your Amazon account, and the second is configuring Openbridge.

Step 1: Set Up Your Amazon Account

First, create an S3 bucket to be used for Openbridge and Amazon Athena.

Log into Amazon: https://console.aws.amazon.com. If you already have a bucket you want to use, skip ahead to the IAM setup below.

  1. Name and region: Create an S3 bucket with a name like “mycompany001-openbridge-athena”. The name can be anything you want, but bucket names must be globally unique (for more info about bucket names, visit AWS docs). When you are ready, click next…

  2. Set properties: No additional properties are required for us. If you want to set them for your own purposes, please feel free to do so. When you are ready, click next…

  3. Set permissions: No additional permissions are required at this step. When you are ready, click next…

  4. Review: Take a look at the setup, and if all looks well, select “Create bucket.”
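Before typing a name into the console, it can help to check it locally. The sketch below covers only a subset of the S3 naming rules (3 to 63 characters; lowercase letters, digits, hyphens, and dots; starting and ending with a letter or digit; no consecutive dots; not shaped like an IP address). The function name is hypothetical, and a local check cannot tell you whether the name is already taken by another AWS account.

```python
import re

# Subset of the S3 bucket naming rules; see the AWS docs for the full list.
_BUCKET_PATTERN = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
_IP_PATTERN = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def looks_like_valid_bucket_name(name: str) -> bool:
    """Return True if the name passes the basic checks described above."""
    if _BUCKET_PATTERN.fullmatch(name) is None:
        return False
    if ".." in name or _IP_PATTERN.fullmatch(name) is not None:
        return False
    return True


print(looks_like_valid_bucket_name("mycompany001-openbridge-athena"))  # True
print(looks_like_valid_bucket_name("MyCompany_Athena"))  # False
```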

Next, set up the IAM user.

Sign in to the IAM console at https://console.aws.amazon.com/iam/ and create an IAM user with the username “openbridge-athena.” Next, select the access type “Programmatic access.” Once complete, click “Next…”

Skip “2. Permissions” and go to “3. Review.”

You may see an alert stating that the user has no permissions; ignore that. Click “Create User.”

Make sure to download the credentials .csv file and keep it in a safe place. The .csv contains the AWS Access Key ID and Secret Access Key for the user you just created. You will need them in Step 2.
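If you would rather not copy the keys by hand later, a small script can read them from the downloaded file. This is a sketch under assumptions: the column headers `Access key ID` and `Secret access key` are assumed and may differ in your file, so they are exposed as parameters, and the sample values are placeholders rather than real credentials.

```python
import csv
import io


def read_access_keys(csv_text, id_column="Access key ID",
                     secret_column="Secret access key"):
    """Return (access_key_id, secret_access_key) from credentials CSV text.

    The default column names are assumptions; pass your file's actual headers.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if len(rows) != 1:
        raise ValueError(f"expected exactly one credential row, found {len(rows)}")
    row = rows[0]
    return row[id_column].strip(), row[secret_column].strip()


# Placeholder values only; never commit or share a real credentials file.
sample = (
    "User name,Access key ID,Secret access key\n"
    "openbridge-athena,EXAMPLEKEYID,EXAMPLESECRET\n"
)
key_id, secret = read_access_keys(sample)
print(key_id)  # print only the key ID, never the secret
```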

Last, link your policy, user, and S3 bucket.

You should still be in IAM, so go to IAM > Users:
https://console.aws.amazon.com/iam/home#/users

Find the openbridge-athena user and click “Add inline policy.”

Name this policy “openbridge-athena-policy.”

In the dialog, paste the policy below into the “Policy Document” field.

Here is the policy document to paste:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": [
        "arn:aws:s3:::mycompanyname-openbridge-athena/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketLocation",
        "s3:CreateBucket",
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads"
      ],
      "Resource": [
        "arn:aws:s3:::mycompanyname-openbridge-athena"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "athena:*"
      ],
      "Resource": [
        "*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "glue:CreateDatabase",
        "glue:DeleteDatabase",
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:UpdateDatabase",
        "glue:CreateTable",
        "glue:DeleteTable",
        "glue:BatchDeleteTable",
        "glue:UpdateTable",
        "glue:GetTable",
        "glue:GetTables",
        "glue:BatchCreatePartition",
        "glue:CreatePartition",
        "glue:DeletePartition",
        "glue:BatchDeletePartition",
        "glue:UpdatePartition",
        "glue:GetPartition",
        "glue:GetPartitions",
        "glue:BatchGetPartition"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}

IMPORTANT: Remember to change the “mycompanyname-openbridge-athena” in the policy document to the AWS S3 bucket you set up in Step 1.
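To avoid missing one of the two places where the bucket name appears, the substitution can be scripted. The sketch below parses the policy text, swaps the placeholder in every `Resource` entry, and re-serializes it; parsing first also catches a malformed paste. The function name and the shortened sample policy are for illustration only; in practice you would pass in the full policy document shown above.

```python
import json

PLACEHOLDER = "mycompanyname-openbridge-athena"


def customize_policy(policy_text: str, bucket: str) -> str:
    """Replace the placeholder bucket name in every Resource ARN."""
    policy = json.loads(policy_text)  # raises if the pasted JSON is malformed
    for statement in policy["Statement"]:
        statement["Resource"] = [
            resource.replace(PLACEHOLDER, bucket)
            for resource in statement["Resource"]
        ]
    return json.dumps(policy, indent=2)


# Shortened sample for illustration; use the full policy document in practice.
sample_policy = """
{
  "Version": "2012-10-17",
  "Statement": [
    {"Effect": "Allow", "Action": ["s3:GetObject"],
     "Resource": ["arn:aws:s3:::mycompanyname-openbridge-athena/*"]},
    {"Effect": "Allow", "Action": ["athena:*"], "Resource": ["*"]}
  ]
}
"""
print(customize_policy(sample_policy, "mycompany001-openbridge-athena"))
```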

Click “Apply Policy.”

Step 2: Configure Amazon Athena in the Openbridge Console

  1. Log in to your Openbridge account (if you don’t have one, create one for free), navigate to the Amazon Athena product page, and click the “Configure Amazon Athena” button.
  2. On the next page, start the configuration by entering a descriptive Openbridge Storage Name. Choose it carefully, as you won’t be able to rename it later.

Continue configuration. Enter Athena connection details, which can be found in your AWS console. You will need to provide the following details:

  • Region and bucket name you configured for Athena in Step 1.
  • Provide a name for the database that Openbridge will create for you to use with Athena. Note that Athena table, database, and column names allow only the underscore (_) as a special character and cannot contain any other special characters. See Amazon docs for reference.
  • Provide the AWS Access Key ID and Secret Access Key from the .csv you downloaded in Step 1.
  • Lastly, review your order, read and agree to the terms of service, and click “Connect Storage”!
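The database naming rule in the list above can be checked before you submit the form. This is a minimal sketch with hypothetical helper names: it tests only the character rule stated here (letters, digits, and underscores), not length limits or reserved words, and the lowercase suggestion is a convention assumed for illustration.

```python
import re


def is_valid_athena_name(name: str) -> bool:
    """True if the name contains only letters, digits, and underscores."""
    return re.fullmatch(r"[A-Za-z0-9_]+", name) is not None


def suggest_athena_name(raw: str) -> str:
    """Lowercase the input and replace runs of other characters with '_'."""
    return re.sub(r"[^a-z0-9_]+", "_", raw.lower()).strip("_")


print(is_valid_athena_name("marketing_data"))      # True
print(is_valid_athena_name("marketing-data"))      # False
print(suggest_athena_name("Marketing Data-2018"))  # marketing_data_2018
```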

That’s it! Your automated, zero administration AWS Athena data pipeline is configured!

Taking Advantage of Data Integration Marketplace with AWS Athena

With fully-managed Amazon Athena in place, you can leverage our rich catalog of social media, advertising, support, e-commerce, analytics, and other marketing technology categories. Send data to Athena from 600+ data sources like Google Analytics 360, DoubleClick, Instagram, YouTube, Adobe Analytics, Facebook, Salesforce, Marketo, Zendesk, HubSpot, and many more and start querying!

With data in AWS Athena, you can use your favorite data visualization, reporting, and analysis tools, such as Tableau Software, Looker, AWS QuickSight, and many others.

Want to discuss how to leverage AWS Athena for your organization? Need a platform and team of experts to kickstart your data and analytic efforts? We can help! Getting traction adopting new technologies, especially if your team is working in different and unfamiliar ways, can be a roadblock for success. This is especially true in a self-service only world. If you want to discuss a proof-of-concept, pilot, project or any other effort, the Openbridge platform and team of data experts are ready to help.

Reach out to us at hello@openbridge.com. Prefer to talk to someone? Set up a call with our team of data experts.

Visit us at www.openbridge.com to learn how we are helping other companies with their data efforts.


Openbridge is a Data Logistics Platform (DLP) designed to collect, discover and act upon data simply, quickly and smartly.