Job Description:
Responsibilities
Develop sustainable, data-driven solutions with current and new-generation data technologies to meet the needs of our organization and business customers
Help develop solutions for streaming, real-time, and search-driven analytics
Maintain a firm understanding of delivering large-scale data solutions and SDLC best practices
Transform complex analytical models into scalable, production-ready solutions
Use programming languages such as Java, Scala, and Python
Manage the development pipeline of distributed computing Big Data applications using open source frameworks like Apache Spark, Scala, and Kafka on AWS, and cloud-based data warehousing services such as Snowflake
Leverage DevOps techniques and practices like Continuous Integration, Continuous Deployment, Test Automation, Build Automation, and Test-Driven Development to enable the rapid delivery of working code, using tools like Jenkins, Maven, Nexus, Terraform, Git, and Docker
Basic qualifications
Bachelor's degree
At least 5 years of experience with the Software Development Life Cycle (SDLC)
At least 3 years of experience working on a big data platform
At least 2 years of experience working with unstructured datasets
At least 2 years of experience developing microservices in Python, Java, or Scala
At least 1 year of experience building data pipelines, CI/CD pipelines, and fit-for-purpose data stores
At least 1 year of experience in cloud technologies: AWS, Docker, Ansible, or Terraform
At least 1 year of Agile experience
At least 1 year of experience with a streaming data platform such as Apache Kafka or Spark
Preferred qualifications
1+ years of experience with Identity & Access Management, including familiarity with principles like least privilege & role-based access control
Understanding of microservices architecture & RESTful web service frameworks
1+ years of experience with JSON, Parquet, or Avro formats
1+ years of experience with RDS, NoSQL, or graph databases
1+ years of experience working with AWS platforms, services, and component technologies, including S3, RDS and Amazon EMR