Roles & Responsibilities
Experience in all phases of the Software Development Life Cycle (SDLC), with skills in data analysis, design, development, testing, and deployment of software systems. A strong team player with the ability to work both independently and on a team, adapt to a rapidly changing environment, and a commitment to continuous learning; excellent communication, project management, documentation, and interpersonal skills.
- Strong experience working with Hadoop ecosystem components such as MapReduce, HDFS, HBase, Hive, Pig, Oozie, ZooKeeper, Flume, and Spark, along with Python, on CDH4 and CDH5 distributions and EC2 cloud computing with AWS. Well versed and hands-on in UNIX, Linux, and NoSQL.
- Key participant in all phases of the software development life cycle, including analysis, design, development, integration, implementation, debugging, and testing of software applications in client-server environments, object-oriented technology, and web-based applications.
- Strong in developing MapReduce applications, configuring the development environment, tuning jobs, and creating MapReduce workflows (see the MapReduce sketch after this list). Experienced in performing data enrichment, cleansing, analytics, and aggregations using Hive and Pig.
- Knowledge of the Cloudera Hadoop distribution and other widely used distributions such as Hortonworks and MapR.
- Hands-on experience working with the Cloudera CDH3 and CDH4 platforms.
- Proficient in big data ingestion and streaming tools such as Flume, Sqoop, Kafka, and Storm (see the Kafka producer sketch after this list).
- Experience with data formats such as JSON, Avro, Parquet, RC, and ORC, and with compression codecs such as Snappy and bzip2.
- Experienced in analyzing data using HQL and Pig Latin, and in extending Hive and Pig core functionality with custom UDFs (see the UDF sketch after this list).
- Good knowledge/understanding of:
  - NoSQL databases, with hands-on experience writing applications on NoSQL databases such as Cassandra and MongoDB.
  - Scripting languages such as Linux/Unix shell scripting and Python.
  - Data warehousing concepts and ETL processes.
- Involved in importing streaming data into HDFS using Flume sources and sinks, transforming the data in flight using Flume interceptors (see the interceptor sketch after this list), and analyzing it using Pig and Hive.
- Configured ZooKeeper to coordinate the servers in clusters and maintain data consistency.
- Used the Oozie and Control-M workflow engines for managing and scheduling Hadoop jobs.
- Diverse experience working with a variety of databases, including Teradata, Oracle, MySQL, IBM DB2, and Netezza.
- Good knowledge of Core Java and J2EE technologies such as Hibernate, JDBC, EJB, Servlets, JSP, JavaScript, Struts, and Spring.
- Experienced in using IDEs and tools such as Eclipse, NetBeans, GitHub, Jenkins, Maven, and IntelliJ.
- Implemented a POC to migrate MapReduce programs to Spark transformations using Spark and Scala (see the Spark sketch after this list).
- Ability to spin up different AWS instances, including EC2-Classic and EC2-VPC, using CloudFormation templates.
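
A minimal sketch of the kind of MapReduce aggregation job described above, using the standard Hadoop MapReduce API. The class names, tokenizing logic, and argument handling are illustrative only, not taken from any actual project.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/** Illustrative job: counts occurrences of each word across the input files. */
public class WordCountJob {

  public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      // Emit (word, 1) for every whitespace-separated token in the input line.
      for (String token : value.toString().split("\\s+")) {
        if (!token.isEmpty()) {
          word.set(token);
          context.write(word, ONE);
        }
      }
    }
  }

  public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      // Sum the partial counts emitted for each word.
      int sum = 0;
      for (IntWritable v : values) {
        sum += v.get();
      }
      context.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCountJob.class);
    job.setMapperClass(TokenMapper.class);
    job.setCombinerClass(SumReducer.class);  // local pre-aggregation, a common tuning step
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // input path from the command line
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // output path from the command line
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```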
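A minimal sketch of publishing events to Kafka, one of the ingestion tools listed above, using the standard kafka-clients producer API. The broker address, topic name, and payload are placeholders.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class EventPublisher {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");  // placeholder broker address
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

    // try-with-resources closes the producer and flushes any buffered records.
    try (Producer<String, String> producer = new KafkaProducer<>(props)) {
      // The topic name "events" and the record contents are illustrative only.
      producer.send(new ProducerRecord<>("events", "event-key", "{\"status\":\"ok\"}"));
    }
  }
}
```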
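A minimal sketch of a custom Hive UDF of the kind described above, built on the classic org.apache.hadoop.hive.ql.exec.UDF base class. The class name and its behavior (lower-casing a string) are illustrative only.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

/** Illustrative UDF: normalizes a string column to lower case, passing NULL through. */
public class LowerCaseUDF extends UDF {
  public Text evaluate(Text input) {
    if (input == null) {
      return null;  // Hive passes NULL columns as null references
    }
    return new Text(input.toString().toLowerCase());
  }
}
```

Once packaged into a jar, a UDF like this would be registered in a Hive session with ADD JAR and CREATE TEMPORARY FUNCTION before being called from HQL.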
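A minimal sketch of a custom Flume interceptor of the kind mentioned above, implementing the org.apache.flume.interceptor.Interceptor contract. The class name and the header it writes are placeholders.

```java
import java.util.List;
import java.util.Map;

import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.interceptor.Interceptor;

/** Illustrative interceptor: stamps each event with an ingest-time header. */
public class TimestampInterceptor implements Interceptor {

  @Override
  public void initialize() {
    // No setup needed for this sketch.
  }

  @Override
  public Event intercept(Event event) {
    // Record when the event passed through this hop; downstream sinks can read it.
    Map<String, String> headers = event.getHeaders();
    headers.put("ingest_ts", Long.toString(System.currentTimeMillis()));
    return event;
  }

  @Override
  public List<Event> intercept(List<Event> events) {
    for (Event event : events) {
      intercept(event);
    }
    return events;
  }

  @Override
  public void close() {
    // Nothing to release.
  }

  /** Flume instantiates interceptors through a Builder named in the agent config. */
  public static class Builder implements Interceptor.Builder {
    @Override
    public Interceptor build() {
      return new TimestampInterceptor();
    }

    @Override
    public void configure(Context context) {
      // No configurable properties in this sketch.
    }
  }
}
```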
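A minimal sketch of the MapReduce-to-Spark migration mentioned above: the same word-count aggregation as the MapReduce job, expressed as Spark transformations. The POC named Scala, but to keep these examples in a single language this sketch uses Spark's Java API; the input and output paths are supplied at submit time.

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class WordCountSpark {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("word count");
    try (JavaSparkContext sc = new JavaSparkContext(conf)) {
      JavaRDD<String> lines = sc.textFile(args[0]);  // input path from spark-submit args
      JavaPairRDD<String, Integer> counts = lines
          .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator())
          .filter(token -> !token.isEmpty())
          .mapToPair(token -> new Tuple2<>(token, 1))  // map phase: emit (word, 1)
          .reduceByKey(Integer::sum);                  // reduce phase: sum partial counts
      counts.saveAsTextFile(args[1]);                  // output path from spark-submit args
    }
  }
}
```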