Vacancy expired!
- Work with the Data Platform Enablement team
- Responsible for Walmart’s data platform, data processing, data integrations, and data solutions, working with internal and external partners. The broader team is currently on a transformation path, and this role will be instrumental in enabling that vision.
- Handle system administration, security compliance, and internal tech audits
- Responsible for operational excellence initiatives, including efficient use of data platform resources, identification of optimization opportunities, and capacity forecasting.
- Design and implement alternative architectures to deliver better system performance and resiliency.
- Identify opportunities to build automated processes and tools to improve efficiency.
- Develop capability requirements and a transition plan for the next generation of data enablement technology, tools, and processes to enable Walmart to efficiently improve performance at scale.
- Drive best practices and standards around the usage of data platforms and tools
- Implement data governance practices. Handle business and technology issues related to managing enterprise information assets and approaches to data protection.
- Administer Dataproc and Airflow; as a Dataproc administrator, create, maintain, scale, and debug production ephemeral and long-running Dataproc clusters
- Deep understanding of data center architectures, networking, storage solutions, and scale system performance
- Technical knowledge of big data analytics, optimization techniques, and data pipeline acceleration. Experience deploying and maintaining large-scale data pipelines in production. Experience deploying data science models and reporting solutions at scale, preferably including building data tools from the ground up
- Understanding of cloud platforms such as Google Cloud Platform (preferred) and Azure, and the differences between IaaS, CaaS, PaaS, etc.
- Strong experience with the Apache ecosystem, especially Spark, Hadoop, Hive, Kafka, Tez, and Airflow, and with data formats such as Parquet, ORC, and Avro
- Familiarity with DevOps best practices and cloud-native technologies
- Programming experience in SQL, Python (preferred), R, Scala, Java, or Bash
- Experience with BigQuery, Presto, Cloud SQL, MSSQL, Cassandra, and MongoDB is a plus
- Experience with PySpark, Spark SQL, MLlib, and the RAPIDS Accelerator for Apache Spark on GPUs is a plus
- Experience setting up logging and monitoring tools, and helping to debug complex data pipelines
- 5+ years of relevant experience in roles with responsibility over data platforms and data operations, dealing with large volumes of data in cloud-based distributed computing environments.
- Graduate degree preferred in a quantitative discipline (e.g., engineering, economics, math, operations research).
- Proven ability to solve enterprise level data operations problems at scale which require cross-functional collaboration for solution development, implementation, and adoption.
- ID: #49136808
- State: California, USA
- City: San Bruno, 94066
- Salary: Depends on Experience
- Job type: Permanent
- Showed: 2023-02-11
- Deadline: 2023-03-24
- Category: Et cetera