
GCP Data Engineer

Triunity Software
Full-time
On-site
Phoenix, Arizona, United States


Nature:              Day One On-site
Duration:            24 Months
Candidates Required: 4
Experience:          5 to 8 Years

Following is the job description for the Data Engineer role.

Mandatory Skill Set: Apache Spark, Hive, Hadoop, BigQuery, BigTable, Cloud Composer, Dataflow, Google Cloud Storage, Python, SQL, Shell Scripting, Git.
Good to have Skill Set: CI/CD, Jenkins, Security and Networking, Scala, GCP Identity and Access Management (IAM).
Responsibilities:
1.   Data Processing: Design, develop, and maintain scalable and efficient data processing pipelines using technologies such as Apache Spark, Hive, and Hadoop.
2.   Programming Languages: Apply proficiency in Python, Scala, SQL, and Shell Scripting for data processing, transformation, and automation.
3.   Cloud Platform Expertise: Hands-on experience with Google Cloud Platform (GCP) services, including but not limited to BigQuery, BigTable, Cloud Composer, Dataflow, Google Cloud Storage, and Identity and Access Management (IAM).
4.   Version Control and CI/CD: Implement and maintain version control using Git and establish continuous integration/continuous deployment (CI/CD) pipelines for data processing workflows.
5.   Jenkins Integration: Use Jenkins to automate the building, testing, and deployment of data pipelines.
6.   Data Modeling: Work on data modeling and database design to ensure optimal storage and retrieval of data.
7.   Performance Optimization: Identify and implement performance optimization techniques for large-scale data processing.
8.   Collaboration: Collaborate with cross-functional teams, including data scientists, analysts, and other engineers, to understand data requirements and deliver solutions.
9.   Security and Networking: Possess basic knowledge of GCP Networking and GCP IAM to ensure secure and compliant data processing.
10. Documentation: Create and maintain comprehensive documentation for data engineering processes, workflows, and infrastructure.
Qualifications:
1.   Proven experience with Apache Spark, Hive, and Hadoop.
2.   Strong programming skills in Python, Scala, SQL, and Shell Scripting.
3.   Hands-on experience with GCP services, including BigQuery, BigTable, Cloud Composer, Dataflow, Google Cloud Storage, and Identity and Access Management (IAM).
4.   Familiarity with version control using Git and experience in implementing CI/CD pipelines.
5.   Experience with Jenkins for automating data pipeline processes.
6.   Basic understanding of GCP Networking.
7.   Excellent problem-solving and analytical skills.
8.   Strong communication and collaboration skills.