Data Engineer – Java

ProViso Consulting

Story Behind the Need:

• Business Group: Global Data & Analytics Solutions is seeking a Data Engineer to develop and maintain ETL data pipelines built on Hadoop and Google Cloud Platform
• 60% maintaining current data sets
• 40% working on new data sets
• This is a backfill position

Typical Day in the Role:

• Hands-on Engineering of Data Processing applications
• Experience with automated deployment of code using CI/CD tools such as Jenkins, Bitbucket, Docker, and Artifactory
• Experience with Big Data technologies, assisting clients in building software solutions that are distributed, highly scalable, and deployed across multiple data centers
• Hands-on experience architecting Big Data applications using Hadoop technologies such as Spark, MapReduce, YARN, HDFS, Hive, Impala, Pig, Sqoop, Oozie, HBase, Elasticsearch, Cassandra
• Database setup/configuration/troubleshooting
• Automating work schedules with Airflow
• Strong experience with event stream processing technologies such as Spark Streaming, Storm, Akka, Kafka
• Experience with at least one programming language (Java, Scala, Python)
• Extensive experience with at least one major Hadoop platform (Cloudera, Hortonworks, MapR)
• Experience working with Business Intelligence teams, Data Integration developers, Data Scientists, Analysts, and DBAs to deliver a well-architected and scalable Big Data & Analytics ecosystem
• Proven track record of architecting distributed solutions handling very high volumes of data (petabytes)
• Strong troubleshooting and performance tuning skills.
• Experience with SQL and scripting languages (such as Python, R)
• Deep understanding of cloud computing infrastructure and platforms. Google Cloud preferred.
• Good understanding of Big data design patterns
• Ability to analyze business-requirement user stories and model them into domain-based services
• Experience working under agile delivery methodology
• Experience designing scalable solutions with proficiency in the use of data structures and algorithms
• Experience in cloud-based environment with PaaS & IaaS
• Capability to architect highly scalable distributed data pipelines using open source tools and big data technologies such as Hadoop, HBase, Spark, Storm, ELK, etc. (Hadoop & Spark mandatory; HBase and ELK are nice-to-haves)

Must Have Skills:

• Data Engineer – 8+ years of professional experience
• 3+ years’ experience deploying data engineering solutions in a production setting
• Experience in Python, Java, or Scala (production-level coding; Python preferred)
• 3+ years of hands on experience with HDFS, MapReduce, YARN, Pig, Hive, HBase, Zookeeper, Oozie
• 3+ years of hands on ETL Experience, including:
o Data Modeling (Basic Level) experience
o Data Standardization/Canonical Data Format definition
o Data Integration Patterns
o Data Quality/Lineage Solutions
o Performance Benchmarking and Tuning
o Data Conversion and Migration between different databases
• 3+ years of hands on GCP/Azure Platform Experience (GCP preferred)
• 3+ years of hands on experience with Spark, Spark Streaming
• 3+ years of hands on experience with Storm, Kafka, Flume
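As a purely illustrative sketch of the data-standardization and canonical-format work listed above (the field names, source formats, and rules here are hypothetical assumptions, not part of this posting):

```python
# Hypothetical example: mapping a raw source record onto a canonical schema,
# the kind of ETL standardization step described in the requirements.
# All field names and formats below are illustrative assumptions.
from datetime import datetime

def to_canonical(raw: dict) -> dict:
    """Standardize a raw record into the (assumed) canonical format."""
    return {
        # Trim whitespace and coerce the identifier to a string.
        "customer_id": str(raw.get("cust_id") or raw.get("customer_id")).strip(),
        # Normalize a US-style date string to ISO 8601.
        "event_ts": datetime.strptime(raw["date"], "%m/%d/%Y").date().isoformat(),
        # Round monetary amounts to two decimal places.
        "amount": round(float(raw["amt"]), 2),
    }

record = to_canonical({"cust_id": " C-42 ", "date": "01/05/2020", "amt": "19.999"})
print(record)
```

In a production pipeline this logic would typically run inside a Spark job and be validated by data-quality checks, but the transformation pattern is the same.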

Nice to have Skills:

• Tableau, QlikView
• RabbitMQ
• Oracle, MySQL, SQL, Stored Procedures
• Experience working with Google Analytics & Digital Marketing

Soft Skills:

• Work iteratively in a team with continuous collaboration
• Strong communication skills – written & verbal (engage in requirement gathering, architecture review sessions)


• BE/BTech in Computer Science, or MCA with sound industry experience (10–14 years) in various roles

Job Details

• Duration: 6 Months




© 2020 ProViso Consulting - Toronto Recruitment and Staffing Agency
