Front-end Google BigQuery with an 4.5 application - Part 1


This three part article shows how to set up a Google BigQuery project, how to front-end that project with a sample application available for download as a GitHub repository, and how the engineering behind that sample application works.

Part one here explains how to set up a Google Cloud Platform account, and how to build a basic BigQuery project.

Part two will first describe the sample application user experience, and will then focus on application configuration and the security set-up.

Part three will explain the engineering behind the application.

I want to say thank you to Eyal Peled, of Google, for taking time to answer some questions I had about BigQuery.
Google Cloud Platform offers a wide range of products that move computing from an ownership model to a rental model. In the Platform product line, Google BigQuery handles large data volumes and queries. Although BigQuery lacks some features of products like Oracle or SQL Server, it's perfect for many of the problems traditionally solved by these products, without the hassles. It avoids the hardware and dedicated DBA overhead, at prices competitive with more "established" RDBMS tools. Part one here explains how to set up and load data into a Google BigQuery account.

Relational database products traditionally group tables, stored procedures, etc. in "databases." In contrast, BigQuery groups tables in a user-defined "dataset" within a user-defined project. Note that a dataset has only tables, not the other machinery RDBMS products provide. The application we'll explore in Part 2 uses this dataset for its queries. While BigQuery provides some sample datasets, I will show how to load a sample CSV file into the dataset. Note that for user-sourced data, BigQuery requires project billing activation.

I started at the Google BigQuery page and I signed in with a Google account. Just above the black up arrow towards the top of the page I clicked "Go To My Console". 

1. Set up Google Cloud Platform - first page
To set up the BigQuery project for this article, I clicked "CREATE PROJECT" in the Console

2. Create Project
to open this window:

3. Set up the new BigQuery project
Cloud Platform maps billing data to the paying customer at the project level. I named the project "BigQueryDemoApp", I read / agreed to the terms, and I clicked Create. The Developers Console auto-generated "glass-mantra-589" as the Project ID. I could have changed this value, but I left it as is. This Project ID value will become important later. The Project Dashboard opened next:

4. Project Dashboard
At the left, I clicked "APIs & auth" to open the list of Google APIs I could add to the BigQueryDemoApp project:

5.  Add one or more Google APIs to the Google Cloud Platform project
Finally, I toggled off all defaulted picks except BigQuery API, although I could have picked other APIs. In Figure 5, I clicked "Settings" at the upper left to open this page:

6. Set up project billing
Here, I clicked "Enable Billing" as indicated. This opened a page to set up the billing machinery for the "glass-mantra-589" project. Google Billing sends an email address confirmation email to the customer contact address entered in this step. With appropriate changes, we can use this approach to configure a project with any Google Cloud Platform product.

Next, I loaded a data file into BigQueryDemoApp. BigQuery accepts CSV or JSON files at this step. I used zip file "Complete ZIP Code Totals Filefrom the U.S.Census Bureau for the data. It unzips into CSV file "zbp11totals.txt". This 3 meg CSV file has about 39 K rows - enough to show the basics of BigQuery without hitting the structural limits or price points. At the Project Dashboard, I clicked "BigQuery" to the left of the black arrow here:

7. To load the data, first pick BigQuery at the Developers Console
I clicked the indicated blue down arrow here:

8. Pick the BigQueryDemoApp project
and in the popup, I picked Create new dataset. This opens a popup where I type the name of the new dataset. I called it "BigQueryDemoAppDS". The new dataset appears to the left of the page here:

9. Drill down to the BigQueryDemoAppDS dataset
When I mouse-hovered over the dataset, a blue plus appeared; I clicked this plus to open a new popup to handle the data load process. First, I typed the name of the BigQueryDemoAppDS target table: "zbp11totals". Next, I picked source file "zbp11totals.txt". After that, the popup wanted the source data schema, in this form:

name:string, age:integer

and fortunately, the Census Bureau provided the source ZIP Code Totals data file layout here. This layout maps to a format compatible with BigQuery:

ZIP:string, NAME:string, EMPFLAG:string, EMP_NF:string, EMP:integer, QP1_NF:string, QP1:integer, AP_NF:string, AP:integer, EST:integer, CITY:string, STABBR:string, CTY_NAME:string

I proceeded through the Create and Import steps, and at Advanced Options as shown here I set the "Header rows to skip" value to 1.

10. Configure the BigQuery data import
I accepted the other default values and I clicked Submit.

This completed the Google Cloud account and BigQuery project set-up. Part two will cover the sample application that front-ends that project.