Migrating a Hadoop infrastructure to GCP

Tudip Technologies
1 min read · Jun 14, 2021

Migrating an on-premises Hadoop solution to Google Cloud requires a shift in approach. A typical on-premises Hadoop system consists of a monolithic cluster that supports many workloads, often across multiple business areas. As a result, the system grows more complex over time and can require administrators to make compromises to get everything working in the monolithic cluster. When you migrate your Hadoop infrastructure to Google Cloud, you can reduce that administrative complexity. However, to get that simplification and the most efficient processing in Google Cloud at minimal cost, you need to rethink how you structure your data and jobs.

Because the Dataproc service of GCP runs Hadoop, using a persistent Dataproc cluster to replicate your on-premises setup might seem like the easiest solution. However, there are some limitations to that approach:

  • Keeping your data in a persistent HDFS cluster using Dataproc is more expensive than storing your data in Cloud Storage, which is what we recommend, as explained later. Keeping data in an HDFS cluster also limits your ability to use your data with other Google Cloud products.
  • Augmenting or replacing some of your open-source-based tools with related Google Cloud services can be more efficient or economical for particular use cases.
  • A single, persistent Dataproc cluster for all your jobs is more difficult to manage than targeted clusters that serve individual jobs or job areas.
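
To make the first and last points more concrete, here is a minimal Python sketch, not a definitive implementation. It assumes you keep the same directory layout when moving data from HDFS to Cloud Storage, and that each job gets its own short-lived cluster driven through the `gcloud dataproc` CLI. The bucket, cluster, region, and job file names are hypothetical placeholders, and the helper functions are illustrative only. The code just builds the paths and command lines; it does not run anything.

```python
from typing import List
from urllib.parse import urlparse


def hdfs_to_gcs(hdfs_uri: str, bucket: str) -> str:
    """Map an hdfs:// URI to the same path under a Cloud Storage bucket.

    Assumption: the directory layout is kept as-is; only the scheme and
    the NameNode authority are replaced by gs://<bucket>.
    """
    parsed = urlparse(hdfs_uri)
    if parsed.scheme != "hdfs":
        raise ValueError(f"not an HDFS URI: {hdfs_uri}")
    return f"gs://{bucket}/{parsed.path.lstrip('/')}"


def ephemeral_job_commands(cluster: str, region: str, job_file: str) -> List[List[str]]:
    """Build the gcloud commands for one job on a short-lived cluster.

    The three steps are: create the cluster, submit a PySpark job to it,
    then delete the cluster so it stops incurring cost.
    """
    return [
        ["gcloud", "dataproc", "clusters", "create", cluster,
         f"--region={region}"],
        ["gcloud", "dataproc", "jobs", "submit", "pyspark", job_file,
         f"--cluster={cluster}", f"--region={region}"],
        ["gcloud", "dataproc", "clusters", "delete", cluster,
         f"--region={region}", "--quiet"],
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(hdfs_to_gcs("hdfs://namenode:8020/data/sales/2021", "example-bucket"))
    for cmd in ephemeral_job_commands(
        "sales-report-cluster", "us-central1", "gs://example-bucket/jobs/report.py"
    ):
        print(" ".join(cmd))
```

Because the data lives in Cloud Storage rather than in the cluster's HDFS, deleting the cluster after the job does not delete the data, which is what makes per-job clusters practical.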

Read more: https://tudip.com/blog-post/migrating-a-hadoop-infrastructure-to-gcp/
