Copying BigQuery Data#

There are two relational databases that we will use for the first part of this course. They are stored in my BigQuery instance, and you must copy them into your own BigQuery project before you can use them.

🚨 Permissions Needed! 🚨#

You MUST have permission to copy the data! Ensure you complete the “Google Cloud Platform Setup” in Brightspace or you WILL NOT be able to proceed with these instructions.

Copy the Datasets#

These instructions are specific to the class_demos dataset used in classroom demonstrations. Repeat these instructions for each dataset you will use. Simply substitute the name of the dataset where necessary in the instructions below.

Instructions#

Step 1: Access BigQuery Console#

  1. Navigate to BigQuery in the Google Cloud Console: https://console.cloud.google.com/bigquery

Step 2: Go to Data Transfers#

In the left navigation pane:

  1. Click “Data Transfers” (under the “Pipelines & Integrations” section)

Screenshot of BigQuery navigation
  2. Then click “Create a Transfer”

Screenshot of Data Transfers screen

Note

If you are asked to enable the BigQuery Data Transfer API, you should do so.

Step 3: Configure the Transfer#

In the Create transfer dialog, do the following:

  1. In the Source dropdown, choose “Dataset Copy”

Choose Dataset Copy from the Source dropdown
  2. Enter a value for “Transfer config name”

  3. In the Schedule options dropdown, choose “On-demand”

  4. In Destination settings, click “Dataset” and then “CREATE NEW DATASET”

Create transfer settings as specified in text

Step 4: Create a Dataset#

In the Create dataset dialog, do the following:

  1. Enter a Dataset ID

     a. Use class_demos for the dataset in this example. (You will use a different dataset name for later data transfers.)

  2. For Location type, choose “Region”

  3. In the “Region” dropdown, choose us-central1 (Iowa)

  4. Click “Create dataset”

Create dataset settings as specified in text
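
Behind the scenes, the “Create dataset” dialog amounts to a small JSON request to the BigQuery REST API. The sketch below shows roughly what that request body looks like; `your-project-id` is a placeholder, and the field names are my reading of the `datasets.insert` API, not something you need to type for this assignment.

```python
import json

# Rough sketch of the JSON body the "Create dataset" dialog submits
# (BigQuery REST API, datasets.insert). "your-project-id" is a placeholder.
dataset_body = {
    "datasetReference": {
        "projectId": "your-project-id",  # placeholder: your own GCP project
        "datasetId": "class_demos",      # the Dataset ID from Step 4
    },
    "location": "us-central1",           # must match the region in the dialog
}

print(json.dumps(dataset_body, indent=2))
```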

Step 5: Select your Dataset#

Back in the Create transfer dialog, do the following:

  1. Choose class_demos from the Destination settings dropdown

Select `class_demos` from Destination settings dropdown
  2. For Source dataset enter class_demos

  3. For Source project enter elliott-purdue-development

  4. Click Save

Destination settings as specified in text
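
For reference, the transfer you just configured can also be described programmatically through the BigQuery Data Transfer Service. The sketch below expresses the same settings as the fields that API uses; the field names, the `cross_region_copy` source id, and the display name are my assumptions about that API, shown here only to clarify what the dialog is doing.

```python
# Sketch of the Data Transfer Service config corresponding to Steps 3-5.
# Field names and the "cross_region_copy" id are assumptions about the API;
# the display name is an arbitrary example.
transfer_config = {
    "display_name": "copy-class-demos",       # your "Transfer config name"
    "data_source_id": "cross_region_copy",    # the "Dataset Copy" source
    "destination_dataset_id": "class_demos",  # the dataset created in Step 4
    "schedule_options": {"disable_auto_scheduling": True},  # "On-demand"
    "params": {
        "source_dataset_id": "class_demos",
        "source_project_id": "elliott-purdue-development",
    },
}

print(transfer_config["display_name"])
```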

Step 6: Run the Transfer#

On the Transfer details page, do the following:

  1. Click the “Run transfer now” button at the top of the screen

  2. In the “Run transfer now” dialog, choose “Run one time transfer”

  3. Click OK

Run transfer now button

Step 7: Verify the Copy#

  1. In the Explorer panel, navigate to your own project

  2. Expand your project to see the datasets

  3. Verify that the class_demos dataset (or whatever name you chose) appears

  4. Click on the dataset and expand it to confirm all tables have been copied

  5. You can click on individual tables and preview the data to ensure the copy was successful
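
If you prefer to verify the copy with a query instead of clicking through each table, the helper below builds a SQL statement that lists every table and its row count. The `__TABLES__` metadata view is my assumed way to get row counts in one query (`INFORMATION_SCHEMA` views are an alternative); `your-project-id` is a placeholder for your own project id.

```python
def row_count_query(project_id: str, dataset_id: str) -> str:
    """Build a SQL query that lists each table's row count for a dataset,
    using BigQuery's __TABLES__ metadata view (an assumption; the
    INFORMATION_SCHEMA views would work as well)."""
    return (
        f"SELECT table_id, row_count "
        f"FROM `{project_id}.{dataset_id}.__TABLES__` "
        f"ORDER BY table_id"
    )

# Paste the printed query into the BigQuery console's query editor.
print(row_count_query("your-project-id", "class_demos"))
```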

Troubleshooting Tips#

If you cannot see the instructor’s project:

  • Verify you have been granted BigQuery Data Viewer access

  • Check that you’re using the correct project ID: elliott-purdue-development

  • Try refreshing the BigQuery console

If the copy fails:

  • Check that you have sufficient permissions in your destination project

  • Ensure your project has BigQuery API enabled

  • Verify you have enough storage quota in your project

  • Check that the destination dataset name doesn’t already exist

If tables appear empty after copying:

  • The copy job might still be in progress; check Job History

  • Refresh the BigQuery console

  • Check the “Details” tab of each table for row count

Important Notes#

  • The copied dataset will belong to your project and will incur storage costs based on your project’s billing

  • Any queries you run on the copied dataset will be billed to your project. (Remember, you are using GCP credits, so there is no cost to you, and the charges for this course will be minimal in any case.)

  • The copy creates a snapshot at the time of copying; it will not sync future updates from the instructor’s dataset