Job Description:
Job Title: Data Engineer III
When you're part of the team at Thermo Fisher Scientific, you'll do important work, like helping customers find cures for cancer, protecting the environment, or making sure our food is safe. Your work will have real-world impact, and you'll be supported in achieving your career goals.
Location/Division Specific Information
The Data Engineer III, based out of our Bangalore location, plays a key role in the operations of the Data Science program, providing business continuity for critical business processes, IT systems, and IT solutions through project implementations, enhancements, documentation, and operational support.
How will you make an impact?
As part of an organization that provides analytics-driven data solutions for all businesses across Thermo Fisher Scientific, you will be instrumental in helping our business partners and customers with their data and analytics needs.
What will you do?

  • 6+ years of working experience in data integration and pipeline development.
  • BS degree in CS, CE, or EE.
  • 2+ years of experience with AWS Cloud data integration using Apache Spark, EMR, Glue, Kafka, Kinesis, and Lambda across the S3, Redshift, RDS, and MongoDB/DynamoDB ecosystems.
  • Strong hands-on experience in Python development, especially PySpark in an AWS Cloud environment.
  • Design, develop, test, deploy, maintain, and improve data integration pipelines.
  • Experience in Python and common Python libraries.
  • Strong analytical experience with databases, including writing complex queries, query optimization, debugging, user-defined functions, views, indexes, etc.
  • Strong experience with source control systems such as Git and Bitbucket, and with build and continuous integration tools such as Jenkins.
  • Databricks and Apache Spark experience.
  • Data lake and Delta Lake experience with AWS Glue and Athena.

How will you get here?

  • Bachelor’s degree in Computer Science with at least 6 years of data engineering experience.
  • Certifications such as AWS Certified Data Analytics, CCA Spark and Hadoop Developer, or CCP Data Engineer are a plus.
  • Full life-cycle project implementation experience in AWS using PySpark/EMR, Athena, S3, Redshift, AWS API Gateway, Lambda, Glue, and other managed services.
  • Strong experience building ETL data pipelines using PySpark on the EMR framework.
  • Hands-on experience using S3, AWS Glue jobs, S3 Copy, Lambda, and API Gateway.
  • Working SQL experience sufficient to troubleshoot SQL code; Redshift knowledge is an added advantage.
  • Strong experience in DevOps and CI/CD using Git and Jenkins, and experience in cloud-native scripting such as CloudFormation and ARM templates.
  • Experience working with Python and Python ML libraries for data analysis, wrangling, and insight generation.
  • Experience using Jira for task prioritization, and Confluence and other tools for documentation.
  • Strong understanding of AWS data lakes and Databricks.
  • Exposure to Kafka, Redshift, and SageMaker would be an added advantage.
  • Exposure to data visualization tools such as Power BI and Tableau.
  • Functional knowledge in the areas of Sales & Distribution, Materials Management, Finance, and Production Planning is preferred.

Knowledge, Skills, Abilities

  • Experience with agile development methodologies, following DevOps, DataOps, and DevSecOps practices.
  • Manage the life cycle of ETL pipelines and other cloud platform tools, including GitHub, Jenkins, Terraform, Jira, and Confluence.
  • Excellent written, verbal, interpersonal, and stakeholder communication skills.
  • Ability to analyze trends across very large datasets.
  • Ability to work with cross-functional teams across multiple regions and time zones, effectively leveraging multiple forms of communication (email, MS Teams voice and chat, meetings).
  • Excellent prioritization and problem-solving skills.
  • Action Oriented: Brings a sense of urgency, high energy, and enthusiasm to managing systems and platforms.
  • Drives Results: Consistently achieves results, even under tough circumstances.
  • Global Perspective: Takes a broad view when approaching issues, using a global lens.
  • Willingness to learn and to train other team members.
  • Communicates Effectively: Provides timely and consistent updates and recommendations to stakeholders on BI operational issues and improvements.
  • Drive to meet and exceed BI operational SLAs for ServiceNow incidents, major incidents, xMatters alerts, employee experience metrics, and BI application/process availability metrics.

At Thermo Fisher Scientific, each one of our 80,000 extraordinary minds has a unique story to tell. Join us and contribute to our singular mission—enabling our customers to make the world healthier, cleaner and safer.
Apply today!
Thermo Fisher Scientific is an EEO/Affirmative Action Employer and does not discriminate on the basis of race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, disability or any other legally protected status.

