Data Engineer

21 Feb 2025


Requisition ID # 145115

Job Category: Information Technology

Job Level: Individual Contributor

Business Unit: Information Technology

Work Type: Hybrid

Job Location: Oakland

Team Overview

The Decision Products team strives to use best-in-class modeling techniques and industry-leading data science to drive PG&E's transition to the sustainable energy network of the future through data-driven decision making. This work moves beyond descriptive reporting and focuses on pushing the business forward through applied statistics, predictive and prescriptive analytics, and insightful tool design. The cornerstone of these high-value analytics is one of the largest smart meter usage databases in the industry, which, when combined with billing, program engagement, customer demographic, grid, and other data sources, has unprecedented potential.

Current and past projects include:
  • Deployment of computer vision algorithms in tools that accelerate and automate asset inspection processes
  • Predicting electric distribution equipment failure before it occurs, allowing for proactive maintenance
  • Optimizing renewable resource portfolios, including location and resource adequacy considerations
  • Supporting asset strategy decision-making, including where PG&E should underground electrical assets
  • Supervised and unsupervised machine learning models using Python and Spark, trained on AWS, deployed on Palantir Foundry

Position Summary

We are looking for a savvy and driven Data Engineer to join our growing team of analytics experts. In this role you will work as part of cross-functional teams, including data scientists, other data engineers, technology experts, and subject matter experts, to develop data-driven solutions. Successful candidates will be responsible for building, expanding, and optimizing our data, data storage, and data pipelines. This individual will support team members (data scientists, software developers, etc.) and decision products to ensure that data delivery is reliable and optimized, and will support the data needs of multiple teams, systems, and products. This role will help the team continue its history of success. Qualified candidates will have a unique opportunity to be at the forefront of the utility industry and gain a comprehensive view of the nation's most advanced smart grid. It is the perfect role for someone who would like to continue building on their professional experience and help advance PG&E's sustainability goals.

PG&E is providing the salary range that the company in good faith believes it might pay for this position at the time of the job posting. This compensation range is specific to the locality of the job. The actual salary paid to an individual will be based on multiple factors, including, but not limited to, specific skills, education, licenses or certifications, experience, market value, geographic location, and internal equity. We would not anticipate that the individual hired into this role would land at or near the top half of the range described below, but the decision will be dependent on the facts and circumstances of each case.

A reasonable salary range is:

Bay Area Minimum: $98,000.00

Bay Area Mid-point: $122,000.00

Bay Area Maximum: $146,000.00

and/or

California Minimum: $93,000.00

California Mid-point: $116,000.00

California Maximum: $139,000.00

This position is hybrid, working from your remote office and your assigned work location based on business need. The assigned work location will be within the PG&E Service Territory.

Responsibilities
  • Enhance and maintain our current data pipelines and associated infrastructure
  • Assemble large, moderately complex data sets that meet functional / non-functional business requirements.
  • Engage with different stakeholder teams to troubleshoot various database systems
  • Build and maintain tools that monitor data and system health
  • Identify, design, and implement internal process improvements to optimize production of results and enable cost savings.
  • Performance tune and optimize data pipeline on Spark
  • Create and maintain documentation describing data catalog and data objects
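The "monitor data and system health" responsibility above can start as simply as a batch-level quality check. A minimal, hypothetical sketch in plain Python — the field names and record structure are illustrative, not PG&E's actual schema:

```python
# Minimal data-health check: validates a batch of records against a simple
# schema and reports missing or wrongly-typed values per field.
# Field names here are illustrative assumptions.

REQUIRED_FIELDS = {"meter_id": str, "timestamp": str, "kwh": float}

def health_report(records):
    """Return per-field counts of missing or wrongly-typed values."""
    issues = {field: 0 for field in REQUIRED_FIELDS}
    for rec in records:
        for field, expected_type in REQUIRED_FIELDS.items():
            value = rec.get(field)
            if value is None or not isinstance(value, expected_type):
                issues[field] += 1
    return issues

batch = [
    {"meter_id": "M1", "timestamp": "2023-02-21T00:00", "kwh": 1.2},
    {"meter_id": "M2", "timestamp": None, "kwh": 0.9},
    {"meter_id": "M3", "timestamp": "2023-02-21T01:00", "kwh": "bad"},
]
print(health_report(batch))  # {'meter_id': 0, 'timestamp': 1, 'kwh': 1}
```

In production such checks would typically run inside the pipeline (e.g., as a Spark or Foundry health check) and feed alerting rather than a print statement.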

Minimum Requirements
  • Bachelor's degree in computer science, an engineering field, or equivalent work experience in an engineering field
  • 3 years of experience with the data engineering/ETL ecosystem, such as Palantir Foundry, Spark, Informatica, SAP BODS, or OBIEE

Required Skills
  • Experience with the data engineering/ETL ecosystem, such as Palantir Foundry, Spark, Informatica, SAP BODS, or OBIEE
  • Database design fundamentals
  • Experience with Python, Pandas and APIs
  • Knowledge of time-series data set development
  • Demonstrated commitment to teamwork and enabling others
  • Proven ability to translate business desires into technical requirements
  • Ability to communicate with various stakeholders and leadership
  • Ability to break down ambiguous problems
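For context on the time-series skill above, time-series data set development often starts with rolling fine-grained interval readings up to coarser grains. A minimal sketch using only the Python standard library — the readings are made-up values, and in practice this would be done with Pandas or Spark:

```python
# Minimal time-series rollup: aggregate interval meter readings
# (ISO timestamp, kWh) into daily totals. Data values are illustrative.
from collections import defaultdict
from datetime import datetime

def daily_totals(readings):
    """readings: iterable of (iso_timestamp, kwh) -> {date: total_kwh}."""
    totals = defaultdict(float)
    for ts, kwh in readings:
        day = datetime.fromisoformat(ts).date().isoformat()
        totals[day] += kwh
    return dict(totals)

readings = [
    ("2023-02-21T00:15", 0.5),
    ("2023-02-21T12:30", 1.0),
    ("2023-02-22T08:00", 0.7),
]
print(daily_totals(readings))  # {'2023-02-21': 1.5, '2023-02-22': 0.7}
```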

Desired Skills
  • Experience with scikit-learn, PySpark or an equivalent big data processing framework, and CI/CD tools
  • Experience with infrastructure-as-code tools; writing production-level code, health checks, unit tests, integration tests, and schema validations
  • Familiarity with cloud computing security fundamentals
  • Experience with the Palantir Foundry platform
  • Experience working with data scientists and machine learning engineers
  • Familiarity with model deployment
  • Front-end tools: Power BI, Tableau

  • ID: #49329798
  • Location: Oakland, California 94617, USA
  • Job type: Permanent
  • Posted: 2023-02-21
  • Deadline: 2023-04-22