Build Google BigQuery "Stored Procedures" With Google Cloud SQL - Part 1

______________________

To analyze large data volumes, Google BigQuery is a great tool. Through 
Google Apps Scripts, we can easily build universal web applications to front-end BigQuery. The JavaScript engineering behind these web applications certainly works well enough, but a major pain point remains: BigQuery does not handle stored procedures.

This article describes a sample Google Apps Script solution that front-ends both BigQuery and Cloud SQL, using Cloud SQL to simulate BigQuery stored procedures. In this approach, a JavaScript function takes a Cloud SQL / MySQL stored procedure name and an array of stored procedure arguments as parameters. The function returns a BigQuery result set in a JavaScript array. Of course, when / if Google provides a BigQuery stored procedure feature, it will have a completely different user / developer experience and completely different engineering. However, this new approach works really well and it makes BigQuery front‑end development much easier.

This article focuses on a solution with the same front-end user experience as the one I described in this earlier article. However, this new solution scraps the earlier dynamic SQL approach. This article and sample solution show that a Google Apps Scripts application can front-end and integrate multiple Google Cloud products, and at the same time mirror the web form behavior on a Google Drive file.

This article builds on two articles I published here at BitVectors. The
first article shows how to build a Google Apps Script application to front-end BigQuery. The second article shows how to build a Google Apps Script application to front-end Google Cloud SQL.

Part one here will describe the Google Cloud SQL database the sample application uses for the BigQuery "stored procedure" technique.

Part two will briefly describe important details about the BigQuery project behind the solution.

Part three has an overview of the Google Apps Script solution engineering.

Part four will explain the JavaScript objects behind the Google Apps Script solution.

Download the five Google Apps Script component files and scripts for the MySQL stored procedure and the MySQL user-defined function from the GitHub repository.
________________

First, set up a Cloud SQL instance as described here. Note that the necessary procedures and steps to do this might have diverged somewhat from what I described earlier, and some of the web setup forms and panels might now look and work somewhat differently. However, the overall ideas and concepts have stayed the same.

In the Cloud SQL instance, create a database named "DEMO". Next, point a MySQL workbench to the Cloud SQL instance, and configure that workbench as needed. With that workbench, first build stored procedure usp_return_BigQuery_SP in the DEMO database. This stored procedure takes one comma-delimited string parameter holding seven comma-delimited values. Later sections of this article will explain these values. This stored procedure has a simple goal: starting with the input parameter, extract the comma-delimited values and build a BigQuery query string that will query the BigQuery project associated with the sample solution. The stored procedure input parameter

     COMMA_DELIMITED_PARAM

will have these example formats

3,10 AND 100,100 AND 1000,ANY,0 AND 9, ,2

3,10 AND 100,ANY,ANY,10000 AND 100000,54,4

which map to this structure:

FIRST_ZIP_DIGIT
EMP
QP1
AP
EST
QUANTILES_VALUE
     DDL_QUERY_VALUE

In this list, the last item, DDL_QUERY_VALUE, spans an integer value range of 1 to 4 [1 . . 4]. A value of 4 maps to the BigQuery Quantiles function, called "Quantiles Employee Count" in this application. The Quantiles function requires a quantiles value as a separate integer. In this list, item six, QUANTILES_VALUE, covers this requirement. Here, parsed item six (the quantiles value) will have a non-blank value only if parsed item seven (the drop down list query value) has a value of 4, because 4 maps to the BigQuery Quantiles function as "Quantiles Employee Count" in the web page dropdown. The value for QUANTILES_VALUE stays blank for any other value for DDL_QUERY_VALUE. The first five parameter items can have a value of "ANY".

Stored procedure usp_return_BigQuery_SP first declares a number of VARCHAR variables. Then, it uses MySQL string functions to parse input parameter COMMA_DELIMITED_PARAM into those declared VARCHAR variables. The stored procedure begins to assemble the finished BigQuery query string at line 78. The line 87 CASE statement maps the DDL_QUERY_VALUE value to a BigQuery function within the BigQuery query string, covering a quantiles function pick at lines 98 and 99. Line 108 adds the "FROM" clause to the query string. The CASE blocks between lines 134 and 168 place the first five variable values in the query string. The blocks will place a value in the query string only if a variable has a value different from "ANY". This way, "ANY" can wildcard the variable values.

The CASE blocks use the CONCAT string function to add the variable values to the assembled query string. Starting with the EMP variable block, the CONCAT functions call user-defined function endOfQueryString. As the CASE blocks build the query string, this user-defined function takes the assembled query string as a parameter, returning " AND " if the query string does not end with "WHERE ".


At line 170, the stored procedure adds a ";" character and returns the assembled BigQuery query string.

_______________

Part one here explained the Cloud SQL / MySQL part of the solution. Part two will describe important details about the BigQuery project for the solution.