The flowfunctions structure is designed to be as simple, clear and robust as possible, mirroring the native BigQuery functions while permitting programmatic definition of all arguments. It abstracts away any need for string templating or complex variable structure definition for data exchange.
The code structure separates configuration from execution, which is often not possible with the native BigQuery functions. An initialisation section (also known as the header) is also required; this is where script variables are declared and system variables are set.
Naming conventions are applied consistently across all functions to minimise ambiguity. The _name suffix refers to the lowest-level name for a resource, and the _id suffix refers to the full address which identifies a resource, e.g.:
- project_id = the top-level identifier for a project
- dataset_name = the name for a dataset
- table_name = the name for a table
- function_name = the name for a function
- table_function_name = the name for a table function
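To illustrate the _name versus _id distinction, the following sketch uses placeholder resource names (my-project, my_dataset and my_table are not from the library itself):

```sql
DECLARE project_id STRING DEFAULT 'my-project';    -- top-level project identifier
DECLARE dataset_name STRING DEFAULT 'my_dataset';  -- dataset name only
DECLARE table_name STRING DEFAULT 'my_table';      -- table name only

-- An _id is the full address built from the levels above:
DECLARE table_id STRING DEFAULT 'my-project.my_dataset.my_table';
```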
Datasets are named either according to the action taken by the function (e.g. DELETE) or by the functional area. Functions within datasets are named so that the function_id mirrors the BigQuery statement (e.g. LOAD DATA), except in datasets relating to a functional area, where the function_name describes the action.
Arguments consistently use the following prefix naming conventions:
- source_ for data sources
- destination_ for data destinations
- target_ for functions operating on a resource for which destination_ is not applicable, as no data is added to the target table
Function arguments are implemented in different ways, depending on whether they are required or optional. Required arguments are named, separately typed variables. In the following function, the query_statement, export_format and export_uri arguments are all required:
CALL EXPORT.DATA( query_statement STRING, export_format STRING, export_uri STRING, optional_arguments JSON );
Optional arguments are implemented as JSON variables, from which the argument values are parsed. This has a number of benefits:
- It allows a single function to be used for every possible logical variant
- It clearly communicates which arguments are required
- It allows forward-portability in case additional variables are added in the future
- It enables pre-validation as improperly formatted JSON will error prior to function execution
- It allows us to pass arbitrary additional variables to the function to vary function behaviour (e.g. test mode described below)
If no optional arguments are needed, then an empty JSON object (JSON '{}') should be passed.
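Putting the required and optional arguments together, a minimal invocation might look like the following sketch. The query, bucket and table names are placeholders, not values from the library:

```sql
-- Illustrative call: export query results to GCS with no optional arguments.
DECLARE query_statement STRING DEFAULT 'SELECT * FROM `my-project.my_dataset.my_table`';
DECLARE export_format STRING DEFAULT 'CSV';
DECLARE export_uri STRING DEFAULT 'gs://my-bucket/exports/my_table_*.csv';
DECLARE optional_arguments JSON DEFAULT JSON '{}';  -- empty: no optional arguments

CALL EXPORT.DATA(query_statement, export_format, export_uri, optional_arguments);
```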
Running any query in test mode will return, but not execute, the query, which is useful for development and testing. Test mode is enabled by setting the boolean argument test_mode to true in the root of the optional_arguments JSON (i.e. "test_mode": true). This will SELECT the compiled SQL for inspection. In all other cases (i.e. test_mode does not exist or is not true), the query will execute using the EXECUTE IMMEDIATE statement.
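For example, to inspect the compiled SQL without executing it, test_mode can be included in the optional arguments. The call below reuses the EXPORT.DATA signature shown earlier, with placeholder values:

```sql
-- With test_mode set, the function SELECTs the compiled SQL instead of
-- running it via EXECUTE IMMEDIATE.
DECLARE optional_arguments JSON DEFAULT JSON '{"test_mode": true}';

CALL EXPORT.DATA(
  'SELECT * FROM `my-project.my_dataset.my_table`',  -- placeholder query
  'CSV',
  'gs://my-bucket/exports/my_table_*.csv',           -- placeholder destination
  optional_arguments);
```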
Since the optional arguments are passed as JSON, it is useful to be able to combine JSON variables in a simple and readable manner. The JSON.MERGE_VARIABLES() function performs this action, returning a JSON variable which is the combination of the input variables, merged at the root. The syntax to call this function on one example static input JSON variable static_arguments and two dynamic arguments dynamic_argument1 and dynamic_argument2, which are dynamically computed and converted to JSON as part of the script, would therefore be:
SET optional_arguments = JSON.MERGE_VARIABLES('static_arguments', ['dynamic_argument1', 'dynamic_argument2'])
Note that this can take an arbitrary array of JSON variables, which will be merged in order of declaration.
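A fuller sketch of this pattern is shown below. The specific keys (compression, export_date, requested_by) are illustrative assumptions, not options defined by the library; only the JSON.MERGE_VARIABLES() call follows the documented syntax:

```sql
-- Static options declared up front in the initialisation section.
DECLARE static_arguments JSON DEFAULT JSON '{"compression": "GZIP"}';
DECLARE dynamic_argument1 JSON;
DECLARE dynamic_argument2 JSON;
DECLARE optional_arguments JSON;

-- Compute values at run time and convert them to JSON.
SET dynamic_argument1 = TO_JSON(STRUCT(CURRENT_DATE() AS export_date));
SET dynamic_argument2 = TO_JSON(STRUCT(SESSION_USER() AS requested_by));

-- Merge at the root, in order of declaration.
SET optional_arguments = JSON.MERGE_VARIABLES('static_arguments', ['dynamic_argument1', 'dynamic_argument2']);
```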
Query Execution Location
Also note that you may have to manually set the query location in the BigQuery console (
More -> Query Settings -> Advanced Options -> UNCHECK: Automatic Location Selection -> SELECT: Location type -> SELECT: Region or Multi-region). This needs to match the location of your data.