Cloud Data Fusion Private Instance Guide


Pricing for the service is broken down into:

  • Cloud Dataproc VMs to execute the transformations prescribed by Cloud Data Fusion

How to Create a Private Instance

Before creating a Cloud Data Fusion private instance, create a VPC network and a private subnetwork. Cloud Data Fusion requires Private Google Access to establish a private connection with the Dataproc cluster. You also need to allocate an IP range for the instance. To allocate it, follow the steps below:

  1. Click on the Private Service Connection tab.
  2. If asked, enable the Service Networking API.
  3. Allocate an IP range of size /22 by clicking on the Allocate IP Range button.
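
The same preparation can probably be done from the command line. The following is a minimal sketch, assuming the gcloud CLI is installed and authenticated. All resource names (cdf-network, cdf-subnet, cdf-ip-range) and the subnet range are hypothetical placeholders, not values from this guide. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/usr/bin/env bash
# Sketch only: prints the gcloud commands unless DRY_RUN=0 is set.
set -e

# Hypothetical placeholder values; replace them with your own.
PROJECT="${PROJECT:-my-project}"
LOCATION="${LOCATION:-us-east1}"
NETWORK="${NETWORK:-cdf-network}"
SUBNET="${SUBNET:-cdf-subnet}"
SUBNET_RANGE="${SUBNET_RANGE:-10.10.0.0/24}"
IP_RANGE_NAME="${IP_RANGE_NAME:-cdf-ip-range}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

setup_network() {
  # Custom-mode VPC network, so the subnet is created explicitly.
  run gcloud compute networks create "$NETWORK" \
    --project="$PROJECT" --subnet-mode=custom
  # Subnet with Private Google Access enabled.
  run gcloud compute networks subnets create "$SUBNET" \
    --project="$PROJECT" --network="$NETWORK" --region="$LOCATION" \
    --range="$SUBNET_RANGE" --enable-private-ip-google-access
  # Service Networking API (step 2 above).
  run gcloud services enable servicenetworking.googleapis.com \
    --project="$PROJECT"
  # Allocated range of size /22 (step 3 above).
  run gcloud compute addresses create "$IP_RANGE_NAME" \
    --project="$PROJECT" --global --purpose=VPC_PEERING \
    --prefix-length=22 --network="$NETWORK"
}

setup_network
```

Reviewing the printed commands before running them with DRY_RUN=0 helps catch a wrong project or region early.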

Command to create an instance

Export the following variables for ease of use, and refer to them in the commands that follow:
export PROJECT=<project-id>
export LOCATION=<region>   # for example, us-east1 or asia-east1
export DATA_FUSION_API_NAME=datafusion.googleapis.com

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://$DATA_FUSION_API_NAME/v1beta1/projects/$PROJECT/locations/$LOCATION/instances?instanceId=<INSTANCE_NAME>" \
  -d '{"description": "Private CDF instance created through REST.", "type": "BASIC", "privateInstance": true, "networkConfig": {"network": "VPC_NETWORK", "ipAllocation": "IP_RANGE"}}'
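
Because the request mixes shell variables with a JSON body, it can help to assemble and inspect both parts before sending anything. The sketch below is one possible way to do that. The helper names instance_url and instance_body, and all default values, are hypothetical and used only for illustration. It assumes ipAllocation is given in CIDR notation.

```shell
#!/usr/bin/env bash
# Sketch only: builds and prints the request; it does not call the API.
set -e

# Hypothetical placeholder values; replace them with your own.
DATA_FUSION_API_NAME="${DATA_FUSION_API_NAME:-datafusion.googleapis.com}"
PROJECT="${PROJECT:-my-project}"
LOCATION="${LOCATION:-us-east1}"
INSTANCE_NAME="${INSTANCE_NAME:-my-private-cdf}"
NETWORK="${NETWORK:-cdf-network}"
IP_RANGE="${IP_RANGE:-10.89.48.0/22}"

# Request URL for creating the instance named in $1.
instance_url() {
  printf 'https://%s/v1beta1/projects/%s/locations/%s/instances?instanceId=%s' \
    "$DATA_FUSION_API_NAME" "$PROJECT" "$LOCATION" "$1"
}

# JSON body for a private instance on network $1 with IP allocation $2.
instance_body() {
  printf '{"description": "Private CDF instance created through REST.", "type": "BASIC", "privateInstance": true, "networkConfig": {"network": "%s", "ipAllocation": "%s"}}' \
    "$1" "$2"
}

echo "POST $(instance_url "$INSTANCE_NAME")"
instance_body "$NETWORK" "$IP_RANGE"
echo

# To send the request once the output looks right:
# curl -X POST \
#   -H "Authorization: Bearer $(gcloud auth print-access-token)" \
#   -H "Content-Type: application/json" \
#   "$(instance_url "$INSTANCE_NAME")" \
#   -d "$(instance_body "$NETWORK" "$IP_RANGE")"
```

Quoting the URL matters here: an unquoted ? can be treated as a glob character by some shells.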

Peering With Cloud Data Fusion Network

Cloud Data Fusion uses VPC Peering to provide private instances. VPC Peering must be set up independently on both ends, that is, on each of the two networks. The peering from the Cloud Data Fusion tenant project network to your network is set up automatically. To connect to the private instance, you must set up the peering from your network to the Cloud Data Fusion network yourself.

Creating VPC Peering

Steps to create a VPC Peering with the tenant project are as follows:

  1. Select VPC Network Peering
  2. Click on Create Connection
  3. Give your peering a name, for example datafusion-peering.
  4. Make sure that Your VPC network lists the network you selected while creating the Cloud Data Fusion instance.
  5. In Peered VPC network, select In another project, then provide the tenant project ID in the Project ID field.
  6. In VPC network name, provide <instance-region>-<instanceid>. The network in the tenant project is named in the format <instance-region>-<instanceid>, which is why you provide this name.
  7. Click on Exchange custom routes and select both Import custom routes (so you can access the CDF UI) and Export custom routes (so CDF can access on-prem systems connected to your VPC network).
  8. Click on Create and wait for the operation to complete.
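
The console steps above should correspond roughly to a single gcloud command. The following is a minimal sketch, assuming the gcloud CLI is installed and authenticated. TENANT_PROJECT, INSTANCE_ID, and the other default values are hypothetical placeholders, and peer_network_name is an illustrative helper that only applies the <instance-region>-<instanceid> naming format described above. The script prints the command by default; set DRY_RUN=0 to execute it.

```shell
#!/usr/bin/env bash
# Sketch only: prints the gcloud command unless DRY_RUN=0 is set.
set -e

# Hypothetical placeholder values; replace them with your own.
PROJECT="${PROJECT:-my-project}"
LOCATION="${LOCATION:-us-east1}"
NETWORK="${NETWORK:-cdf-network}"
INSTANCE_ID="${INSTANCE_ID:-my-private-cdf}"
TENANT_PROJECT="${TENANT_PROJECT:-my-tenant-project-id}"
PEERING_NAME="${PEERING_NAME:-datafusion-peering}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Tenant network name in the format <instance-region>-<instanceid>.
peer_network_name() {
  printf '%s-%s' "$1" "$2"
}

create_peering() {
  # Import routes to reach the CDF UI; export routes so CDF can reach
  # on-prem systems connected to your VPC network.
  run gcloud compute networks peerings create "$PEERING_NAME" \
    --project="$PROJECT" \
    --network="$NETWORK" \
    --peer-project="$TENANT_PROJECT" \
    --peer-network="$(peer_network_name "$LOCATION" "$INSTANCE_ID")" \
    --import-custom-routes \
    --export-custom-routes
}

create_peering
```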



