As a Specialty Engineer - Data, you have the unique opportunity to challenge the status quo, aiding the growth and defining the future of cloud data consumption for Enterprise Organizations.

While cloud technology has accelerated organizational change, the process of operationalizing, governing, and deriving value from data pipelines has been slow to evolve. This is your opportunity to be at the forefront of defining the next chapter of modern Data Engineering, spanning traditional batch workloads to cutting-edge real-time data streaming patterns on Azure and GCP.

The successful candidate will evangelize the benefits of a modern, ELT-first approach, driving continuous, iterative improvement to our data platform. You won't just build pipelines; you will architect data-driven solutions that directly inform executive decision-making and operational excellence.

Job Requirements:

Design and Development: Design, build, and optimize robust and scalable ETL/ELT data pipelines to ingest and process large volumes of data using Azure and GCP services (e.g., Azure Data Factory, Azure Synapse/Databricks, GCP Dataflow, Cloud Data Fusion, or Cloud Functions).
Data Modeling and Warehousing: Develop and maintain optimized data models (e.g., dimensional, vault) within multi-cloud data warehouse solutions (e.g., Google BigQuery or Azure Synapse Analytics) or data lakes to support BI, reporting, and analytical workloads. This includes ensuring data structures are optimized for consumption by tools like Power BI and Looker.
Performance Tuning: Monitor, troubleshoot, and optimize the performance of data warehouse queries and compute resources (e.g., BigQuery slots, Azure Synapse SQL pools, or Databricks/Dataproc clusters) to ensure cost-efficiency and fast data retrieval.

AI Data Foundation:
Feature Engineering: Collaborate with Data Scientists to design and implement feature stores and pipelines to prepare and serve data for ML model training and inference.
Vector Database Integration: Develop and maintain pipelines for transforming unstructured data (text, documents) into embeddings and loading them into vector databases (e.g., Azure Cosmos DB, GCP Vertex AI Vector Search, or dedicated vector stores) to support RAG solutions.
Data Orchestration: Implement workflows (e.g., using Apache Airflow, Google Cloud Composer, or Azure Data Factory/Logic Apps) to automate the end-to-end data lifecycle for AI/ML processes, including data refresh and model retraining.

MLOps and Productionization:
Model Deployment: Work with Data Science teams to containerize, deploy, and manage machine learning models in production environments (e.g., using GCP Vertex AI, Azure Machine Learning, or AKS/GKE).
Monitoring and Logging: Implement robust monitoring and logging solutions for production ML pipelines and models to track performance, data drift, and model decay using Azure Monitor or GCP Cloud Monitoring.
CI/CD for ML: Integrate model training, testing, and deployment into CI/CD pipelines to ensure rapid, reliable, and automated updates to production ML services.

Security and Governance: Implement and manage security best practices across Azure, GCP, and Snowflake, including access controls, role-based security (RBAC), IAM policies, and data encryption.
Coding and Automation: Write complex, efficient SQL queries and develop scripts in Python (or other relevant languages like Scala/Java) for data manipulation, process automation, and pipeline orchestration.
Collaboration: Work closely with data analysts, data scientists, and business stakeholders to understand data requirements and deliver high-quality, actionable data solutions, including setting up data sources and datasets for BI tools like Power BI and Looker.
Documentation: Create and maintain technical documentation for data models, data flows, and ETL/ELT processes.