- Conceptualize and own the design, implementation, and operations of the next-generation data lake containing complex Genentech data from any Genentech business area, operating within a global environment
- Conceptualize and own the data architecture for multiple large-scale projects across analytics, automation, data science, and artificial intelligence, while evaluating design and operational cost-benefit tradeoffs across systems in different partnered business areas
- Solve challenging data integration problems for complicated and diverse data sets, applying and coding optimal ETL patterns, frameworks, and query techniques to source and merge structured and unstructured data
- Implement and optimize data pipelines, data quality metrics, and systems that make data artifacts easier to develop and that support machine learning systems at scale (an illustrative PySpark sketch follows this list)
- Design and develop MLOps infrastructure to build, train, and deploy ML models using cloud-native services
- Design and develop secure, scalable, high-performance, and reliable data and analytics solutions on a cloud platform (preferably AWS)
- Define and manage SLAs for all data sets in designated areas of ownership
- Collaborate with engineers, business managers, and data scientists from different business areas globally to understand complex data needs and provide key data insights in meaningful ways
- Identify and evaluate new technologies to improve performance, maintainability, and reliability of data systems
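For illustration only: a minimal sketch, assuming PySpark and an S3-backed data lake, of the kind of ETL, merging, and data quality work described in the responsibilities above. All bucket paths, column names, and thresholds are hypothetical placeholders, not actual Genentech systems.

```python
# Illustrative PySpark ETL sketch: merge a structured extract with a
# semi-structured source, apply a simple data quality check, and land the
# result in an S3 data lake. All paths, keys, and columns are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("example_etl").getOrCreate()

# Structured source, e.g. an Oracle extract already staged as Parquet
orders = spark.read.parquet("s3://example-bucket/staging/orders/")

# Semi-structured source, e.g. Salesforce records exported as JSON
accounts = spark.read.json("s3://example-bucket/staging/accounts/")

# Merge on a shared (hypothetical) business key
merged = orders.join(accounts, on="account_id", how="left")

# Simple data quality metric: fail the batch if too many rows lack the key
total = merged.count()
missing = merged.filter(F.col("account_id").isNull()).count()
if total == 0 or missing / total > 0.01:
    raise ValueError(f"Data quality check failed: {missing}/{total} rows missing account_id")

# Write partitioned Parquet into the curated layer of the data lake
merged.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-bucket/curated/orders_enriched/"
)
```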
- 5-8 years of experience as a Data Engineer, or in a similar role
- Experience designing, evolving and running data lakes with complex data and multiple customer groups
- Experience developing data pipelines using data from heterogeneous sources (e.g., AWS, Oracle, SAP, Salesforce)
- Experience with data modeling, data warehousing, and building data pipelines
- Experience implementing and monitoring data quality metrics in a data governance environment
- Experience with MLOps
- Experience with AWS data engineering and MLOps tools and services (e.g., S3, Redshift, Athena, Glue, Lambda, SageMaker; an illustrative sketch follows this list)
- Experience with software development and version control systems
- Experience designing and building large-scale technology products
- Experience leading engineering discussions around technology decisions
- Ability to understand complex problems and collaboratively develop ad hoc solutions in a rapidly changing environment
- Strong technical communication skills
- Experience with secure data warehousing and robust ETL techniques
- Knowledge and experience in analytics, data science, machine learning and/or NLP
- Strong proficiency in SQL, Python and PySpark
- Experience understanding the perspectives of team members from other disciplines or backgrounds
- Experience analyzing business needs and opportunities
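For illustration only: a minimal sketch, assuming the SageMaker Python SDK, of the kind of managed training and deployment workflow implied by the MLOps qualifications above. The IAM role ARN, training script name, and S3 paths are hypothetical placeholders.

```python
# Illustrative SageMaker sketch: launch a managed training job and deploy the
# resulting model behind a real-time endpoint. All identifiers are hypothetical.
import sagemaker
from sagemaker.sklearn.estimator import SKLearn

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/ExampleSageMakerRole"  # hypothetical role

estimator = SKLearn(
    entry_point="train.py",        # hypothetical training script
    framework_version="1.2-1",
    instance_type="ml.m5.large",
    instance_count=1,
    role=role,
    sagemaker_session=session,
)

# Train against curated features in the (hypothetical) data lake
estimator.fit({"train": "s3://example-bucket/curated/features/train/"})

# Deploy the trained model to a real-time inference endpoint
predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.m5.large")
```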
- ID: #49460468
- State: California, San Francisco 94101, USA
- City: San Francisco
- Salary: $70 - $75
- Job type: Contract
- Showed: 2023-03-13
- Deadline: 2023-05-05
- Category: Et cetera