Google Dataflow Utility Pipelines (File Conversion and Streaming Data Generation)

Dataflow Streaming Data Generator

Source: Google official documentation

Running the Streaming Data Generator template

  1. Go to the Dataflow page in the Cloud Console.
  2. Click Create job from template.
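
As an alternative to the console steps above, the same template can be launched from the command line. The following is a minimal sketch that only assembles a `gcloud dataflow flex-template run` command and does not execute it. The template path, the `latest` version segment, and the parameter names (`schemaLocation`, `topic`, `qps`) are assumptions based on how the template is commonly documented, so verify them against the official documentation. The job, project, bucket, and topic names are hypothetical placeholders.

```python
import shlex


def streaming_generator_command(job_name, project, region,
                                schema_location, topic, qps,
                                template_version="latest"):
    """Build the argv list for launching the Streaming Data Generator.

    Nothing is executed here; the caller decides whether to run it.
    """
    if qps <= 0:
        raise ValueError("qps must be a positive number of messages per second")

    # Assumed template location; confirm the exact path in the official docs.
    template = (
        f"gs://dataflow-templates-{region}/{template_version}"
        "/flex/Streaming_Data_Generator"
    )

    # Assumed parameter names for this template.
    parameters = {
        "schemaLocation": schema_location,  # GCS path of the message schema file
        "topic": topic,                     # Pub/Sub topic that receives messages
        "qps": str(qps),                    # messages generated per second
    }
    param_arg = ",".join(f"{key}={value}" for key, value in parameters.items())

    return [
        "gcloud", "dataflow", "flex-template", "run", job_name,
        f"--project={project}",
        f"--region={region}",
        f"--template-file-gcs-location={template}",
        f"--parameters={param_arg}",
    ]


# Hypothetical values, for illustration only.
cmd = streaming_generator_command(
    job_name="demo-generator",
    project="my-project",
    region="us-central1",
    schema_location="gs://my-bucket/schema.json",
    topic="projects/my-project/topics/my-topic",
    qps=100,
)
print(shlex.join(cmd))
```

Building the command as a list keeps quoting explicit, so it can be printed for review or passed to `subprocess.run(cmd, check=True)` once the values are confirmed.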

File Format Conversion

  • CSV to Avro
  • CSV to Parquet
  • Avro to Parquet
  • Parquet to Avro

Pipeline Requirements

  • The input files exist in a GCS bucket and are accessible to the Dataflow pipeline.
  • The output GCS bucket exists and is accessible to the Dataflow pipeline.

Running File Format Conversion Pipelines
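
A conversion job can be sketched as a command builder that first checks the requested format pair against the four conversions listed above. This is an illustrative sketch, not the official procedure: the template path and the parameter names (`inputFileFormat`, `outputFileFormat`, `inputFileSpec`, `outputBucket`, `schema`, `outputFilePrefix`) are assumptions to verify against the official documentation, and all project and bucket names are hypothetical. The command is only assembled, not run.

```python
# Conversion pairs taken from the list in this article.
SUPPORTED_CONVERSIONS = {
    ("csv", "avro"),
    ("csv", "parquet"),
    ("avro", "parquet"),
    ("parquet", "avro"),
}


def file_conversion_command(job_name, project, region,
                            input_format, output_format,
                            input_spec, output_bucket, schema,
                            output_prefix="converted",
                            template_version="latest"):
    """Build the argv list for a File Format Conversion job (not executed)."""
    pair = (input_format.lower(), output_format.lower())
    if pair not in SUPPORTED_CONVERSIONS:
        raise ValueError(f"unsupported conversion: {pair[0]} to {pair[1]}")

    # Cheap local check only; it does not prove the bucket exists or is
    # accessible to the Dataflow pipeline.
    for uri in (input_spec, output_bucket, schema):
        if not uri.startswith("gs://"):
            raise ValueError(f"expected a gs:// path, got: {uri}")

    # Assumed template location; confirm the exact path in the official docs.
    template = (
        f"gs://dataflow-templates-{region}/{template_version}"
        "/flex/File_Format_Conversion"
    )

    # Assumed parameter names for this template.
    parameters = {
        "inputFileFormat": pair[0],
        "outputFileFormat": pair[1],
        "inputFileSpec": input_spec,      # e.g. a GCS glob of input files
        "outputBucket": output_bucket,    # existing GCS folder for output
        "schema": schema,                 # GCS path of the Avro schema file
        "outputFilePrefix": output_prefix,
    }
    param_arg = ",".join(f"{key}={value}" for key, value in parameters.items())

    return [
        "gcloud", "dataflow", "flex-template", "run", job_name,
        f"--project={project}",
        f"--region={region}",
        f"--template-file-gcs-location={template}",
        f"--parameters={param_arg}",
    ]


# Hypothetical values, for illustration only.
conversion_cmd = file_conversion_command(
    job_name="csv-to-parquet-demo",
    project="my-project",
    region="us-central1",
    input_format="CSV",
    output_format="Parquet",
    input_spec="gs://my-bucket/input/*.csv",
    output_bucket="gs://my-bucket/output/",
    schema="gs://my-bucket/schema/records.avsc",
)
print(" ".join(conversion_cmd))
```

Rejecting unsupported pairs such as Avro to CSV before submission avoids launching a job that would fail only after workers start.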

7x GCP | 2X Oracle Cloud| 1X Azure Certified | Cloud Data Engineer

Harshad Patel