Develops, enhances and tests data and analytics pipelines that support critical business functions. Should have experience across diverse development platforms, software, hardware, and ETL/ELT technologies and tools. Participates in the design, development and implementation of complex applications, often using newer technologies such as Airflow and dbt. Should be proficient in the work activities specific to operationalizing data pipelines, such as Release/Deployment, Operational Readiness, Capacity/Availability Management, Application Monitoring, Service Analytics, Reporting, Production Governance and Change/Configuration Management. May provide technical direction and system architecture for individual initiatives. Serves as a fully seasoned, proficient technical resource. Will not have management responsibility for direct reports, but will lead projects and direct the activities of a team on special initiatives or operations. Possesses strong SDLC experience, maintains appropriate levels of systems documentation as required, and should be able to accurately estimate effort from requirements.
Job Responsibilities:
• Create and maintain optimal data pipeline architecture
• Collaborate with business analysts, data modelers and subject matter experts to design and implement data pipeline solutions
• Deliver data pipelines and related components as specified in the design, functional and non-functional requirements, within established budget, time and quality standards
• Perform unit testing of data pipeline jobs and related components and document test results
• Tune the performance of data pipeline jobs and related components
• Resolve a wide range of complex data pipeline problems, both proactively and as issues surface
• Perform code version control activities
Required Skills:
• Advanced working knowledge of SQL and experience with relational databases, including query authoring, as well as working familiarity with a variety of databases
• Experience building and optimizing ‘big data’ pipelines, architectures and data sets
• Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
• Strong analytic skills related to working with unstructured datasets
• Experience building processes that support data transformation, data structures, metadata, dependency and workload management
• A successful history of manipulating, processing and extracting value from large, disconnected datasets
• Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ stores such as Snowflake
• Proficiency in RDBMS, complex SQL, PL/SQL, UNIX shell scripting, performance tuning and troubleshooting
• Problem-solving skills and the ability to work with minimal supervision
• Ability to work independently or in a team environment, including with geographically dispersed teams
• Has played a customer-facing ETL lead/architect role in at least one implementation
• 3+ years of hands-on experience using and extending Airflow; expert Python experience is preferable (a minimal sketch of a typical DAG follows this list)
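As a rough illustration of the Airflow work described above, here is a minimal sketch of a two-task DAG. It is an assumption-laden example, not part of this role's codebase: the DAG name, task names, schedule and task bodies are all hypothetical placeholders.

    # Minimal, hypothetical Airflow DAG: one extract task feeding one
    # transform task. Every identifier here is an illustrative placeholder.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract_orders():
        # Placeholder: pull raw data from a source system.
        print("extracting orders")

    def transform_orders():
        # Placeholder: apply business transformations to the extract.
        print("transforming orders")

    with DAG(
        dag_id="orders_pipeline",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",  # newer Airflow versions also accept `schedule`
        catchup=False,
    ) as dag:
        extract = PythonOperator(task_id="extract_orders",
                                 python_callable=extract_orders)
        transform = PythonOperator(task_id="transform_orders",
                                   python_callable=transform_orders)
        extract >> transform  # transform runs only after extract succeeds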
Desirable Skills:
• 5 years of current, hands-on experience with complex SQL, high-volume relational databases and SQL query performance tuning
• 1+ years of experience developing large-scale data applications on Snowflake (see the sketch after this list)
• 1+ years of experience in DataOps, specifically with DataKitchen or dbt
• 3+ years of current, hands-on UNIX shell scripting experience
• 3+ years of current, hands-on experience with an enterprise job scheduler, preferably AutoSys
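For the Snowflake item in the list above, the sketch below shows the most basic way to connect and run a query from Python using the official snowflake-connector-python package. The account, user, warehouse, database and schema values are placeholders, not details from this posting.

    # Hypothetical Snowflake connection and smoke-test query; all
    # credential and object names below are placeholders.
    import snowflake.connector

    conn = snowflake.connector.connect(
        account="my_account",      # placeholder account identifier
        user="my_user",            # placeholder user name
        password="...",            # load real secrets from a vault, not code
        warehouse="ANALYTICS_WH",  # placeholder virtual warehouse
        database="ANALYTICS",      # placeholder database
        schema="PUBLIC",
    )
    try:
        cur = conn.cursor()
        cur.execute("SELECT CURRENT_VERSION()")  # trivial smoke-test query
        print(cur.fetchone())
    finally:
        conn.close()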