Getting Started With BigQuery

Robert Thas John
Aug 1, 2018


BigQuery is Google’s cloud offering for data warehousing. You can find a non-technical introduction here.

In this article, we will go through how to get started with BigQuery.

You can access BigQuery from the Google Cloud Console. From the menu, scroll down to the Big Data section and click on BigQuery. You will see a window similar to the one below.

BigQuery Window

The Query editor is where you type your queries. The left side of the window has some useful features: below the search box is your project, highlighted in blue, and the drop-down lets you select datasets. When you click on your project name, the section below the Query editor lets you create datasets. A dataset is the equivalent of a database on other platforms. The window for creating a dataset is shown below.

Create Dataset
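
If you prefer the command line, a dataset can also be created from Cloud Shell with bq mk. A minimal sketch, assuming a placeholder dataset name of my_dataset:

bq mk --dataset $DEVSHELL_PROJECT_ID:my_dataset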

After creating a dataset, you can go ahead and create tables. Click on the name of the dataset, and you will be presented with the option to create a table.

Create table

When creating a table, you can specify how you would like to populate it: you can specify a table schema and leave the table empty, connect to a file on Google Cloud Storage, upload a file from your computer, or connect to a file on Google Drive.
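
An empty table with just a schema can also be created from the command line. A minimal sketch, assuming the placeholder names my_dataset and my_table with a made-up three-column schema:

bq mk --table $DEVSHELL_PROJECT_ID:my_dataset.my_table name:STRING,code:STRING,value:FLOAT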

In this example, we will upload a CSV file from our computer.

Upload CSV file

We will need to specify our source data, fill in the name of our table in the space provided under Destination table, and define our table schema, which is the list of column names and their types.

Under Advanced options, we can specify how our text file is delimited and how many rows to skip from the top of the file, which is useful for skipping table headers. When done, we can click on Create table.
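
The same CSV upload can be done with bq load, where --skip_leading_rows and --field_delimiter mirror the Advanced options above. A minimal sketch, assuming a placeholder local file my_data.csv and the my_dataset.my_table names used earlier:

bq load --source_format=CSV --skip_leading_rows=1 --field_delimiter=',' my_dataset.my_table ./my_data.csv name:STRING,code:STRING,value:FLOAT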

You can also load files into BigQuery from the command line, for example from Cloud Shell, using bq load:

bq load --source_format=NEWLINE_DELIMITED_JSON $DEVSHELL_PROJECT_ID:cpb101_flight_data.flights_2014 gs://mybucket/domestic_2014_flights_*.json ./schema_flight_performance.json

The command above states that the source format is newline-delimited JSON, names the destination table, points to the source files in a Cloud Storage bucket, and supplies a local schema file. Be sure to check the documentation for more about the CLI commands. You can list the tables in your dataset using the following command:

bq ls $DEVSHELL_PROJECT_ID:cpb101_flight_data

Job History

The Job History link will show you any jobs that you have initiated. This is useful when you are loading really large files.
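
Jobs can also be listed from the command line. A minimal sketch; the -j flag tells bq ls to list jobs rather than datasets:

bq ls -j -n 10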

Viewing a table

On the left of the window, you can see your tables under your dataset. Clicking on a table name will let you see the details of the table, the schema, and a preview. You can also export a table.
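
The schema and a quick preview are also available from the command line with bq show and bq head. A minimal sketch against the flight data table loaded earlier:

bq show $DEVSHELL_PROJECT_ID:cpb101_flight_data.flights_2014
bq head -n 10 $DEVSHELL_PROJECT_ID:cpb101_flight_data.flights_2014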

Create bucket while exporting table

You typically export a table into Google Cloud Storage. If you don't have any buckets, you will be given the option to create one. Fill in the required details and click on Create.
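
A bucket can also be created ahead of time with gsutil. A minimal sketch, assuming a placeholder bucket name; bucket names must be globally unique:

gsutil mb gs://my-unique-bucket-name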

Export file to Cloud Storage

You can export your data from the command line using the following format:

BUCKET="<your unique bucket name (Project ID)>" 
echo $BUCKET
bq extract cpb101_flight_data.AIRPORTS gs://$BUCKET/bq/airports2.csv
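
By default bq extract writes CSV; other formats can be requested with --destination_format. A minimal sketch, exporting the same table as newline-delimited JSON:

bq extract --destination_format=NEWLINE_DELIMITED_JSON cpb101_flight_data.AIRPORTS gs://$BUCKET/bq/airports2.json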

When done, you will find your file in the specified bucket. Have fun using BigQuery.
