Harshad Patel

17 Followers

Jan 26, 2021

GCP Data Engineer Certification Prep Part-1

This post is a step-by-step guide to preparing for the Google Professional Data Engineer certification. A Professional Data Engineer enables data-driven decision making by collecting, transforming, and publishing data. A Data Engineer should be able to design, build, operationalize, secure, and monitor data processing systems with a particular emphasis on…

Data Engineering

4 min read


Jan 20, 2021

Data Pipeline Orchestration

Google Cloud Workflows: orchestrate and automate Google Cloud and HTTP-based API services with serverless workflows. You can use Workflows to create serverless workflows that link a series of serverless tasks together in an order you define. Combine the power of Google Cloud’s APIs, serverless products like Cloud Functions and Cloud Run, and calls to…

Orchestration

2 min read


Aug 16, 2020

Google Dataflow utility pipelines (File Conversion and Streaming Data Generation)

Dataflow Streaming Data Generator: This pipeline takes a QPS parameter and a path to a schema file, and publishes fake JSON messages matching the schema (sample messages for load testing and system integration testing) to a Pub/Sub topic at the QPS rate. The JSON Data Generator library used by the pipeline allows various faker…
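To make the idea concrete, here is a minimal standard-library sketch of generating schema-shaped fake JSON messages at a target QPS. It is not the Dataflow template itself: the `Schema` mapping, `EXAMPLE_SCHEMA`, and `generate_messages` are hypothetical names invented for illustration, and the messages are simply yielded rather than published to Pub/Sub.

```python
import json
import random
import time
from typing import Callable, Dict, Iterator

# Hypothetical schema format: field name -> zero-argument value generator.
# The real pipeline reads a schema file; this in-memory mapping is an assumption.
Schema = Dict[str, Callable[[], object]]

EXAMPLE_SCHEMA: Schema = {
    "event_id": lambda: random.randint(1, 1_000_000),
    "user": lambda: random.choice(["alice", "bob", "carol"]),
    "amount": lambda: round(random.uniform(1.0, 100.0), 2),
}


def generate_messages(
    schema: Schema,
    qps: int,
    seconds: int,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Yield qps * seconds fake JSON messages, paced at roughly qps per second."""
    interval = 1.0 / qps
    for _ in range(qps * seconds):
        # Build one message by calling every field generator in the schema.
        yield json.dumps({name: make() for name, make in schema.items()})
        # Crude pacing; a real load generator would correct for timing drift.
        sleep(interval)
```

For example, `for msg in generate_messages(EXAMPLE_SCHEMA, qps=5, seconds=1): print(msg)` prints five messages over about one second. In a real load test each message would be sent to a Pub/Sub topic instead of printed.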

Dataflow

2 min read


Aug 14, 2020

BigQuery New Features

Multi-tab query editing; 99.99% SLA; pricing recommendation to select models (on-demand, flat-rate, flex); native query Admin Console UI; real-time resource information with INFORMATION_SCHEMA; automated slot management with BigQuery Slots Autoscaling. …

Bigquery

1 min read


Jun 26, 2020

Committed Use Discount on Google Cloud SQL

Fully managed relational database service for MySQL, PostgreSQL, and SQL Server. Features: ensure business continuity with reliable and secure services backed by a 24/7 SRE team; reduce maintenance costs with fully managed relational databases in the cloud; automate database provisioning, storage capacity management, and other time-consuming tasks; easy integration with existing…

Google Cloud Platform

2 min read


Jun 19, 2020

Cloud Dataflow

If you’re new to Cloud Dataflow, I suggest starting here and reading the official docs first. Develop locally using the DirectRunner rather than on Google Cloud using the DataflowRunner. The DirectRunner lets you run your pipeline locally, without paying for worker pools on GCP. When you want to shake out a pipeline on Google Cloud using the DataflowRunner, use a subset of data…

Google Cloud Platform

4 min read


Jun 12, 2020

BigQuery Fun Facts!

If you’re new to BigQuery, I suggest starting here and reading the official docs first. Export all your audit and billing logs back to BigQuery for analysis. I don’t know how many times this pattern has saved my butt. Don’t be lazy with your SQL. Avoid SELECT * on big…

Bigquery

5 min read


Jun 12, 2020

Python: Getting started with pandas

Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python. Installation or Setup: Installing pandas with Anaconda. Installing…

Python

3 min read


May 30, 2020

Cloud Data Fusion Private Instance Guide

Cloud Data Fusion is a fully managed, cloud-native data integration service that helps users efficiently build and manage ETL/ELT data pipelines. With a graphical interface and a broad open-source library of preconfigured connectors and transformations, Cloud Data Fusion shifts an organization’s focus away from code and integration to insights and…

Google Cloud Platform

3 min read

7x GCP | 2x Oracle Cloud | 1x Azure Certified | Cloud Data Engineer
