- Should be able to communicate data insights to all organizational levels: drawing conclusions, defining recommended actions, and reporting results to stakeholders.
- Should work on integrating data from different data sources.
- Should pre-process large datasets to build machine learning models, and automate, deploy, and maintain those models in production.
- Should understand how deployed models run and be able to confirm that they run correctly.
- Should develop, test, and deploy data structures using Entity-Relationship Diagramming and data modeling tools.
- 1+ years of hands-on experience with Flask, REST APIs, and model deployment.
- 3+ years of hands-on experience with Python, MySQL, SAS (SAS Enterprise), R, Tableau, SPSS, and Stata.
- 5+ years of experience in data science specialization, including statistical data analysis and/or machine learning in an enterprise-scale environment.
- Deep understanding of common database technologies, such as SQL Database/Server, SQL Data Warehouse, Oracle, DB2, Netezza, MySQL, and other data sources, such as Azure Data Lake Storage and Azure Blob Storage.
- Experience working with distributed computing tools (Hadoop, Hive, Spark, etc.).
- Expert in Docker, CI/CD deployment, and writing YAML files to implement code and functions as a service.
- Experience with cloud platforms such as Google Cloud Platform, Azure, or AWS.
- Hands-on experience with real-time stream processing as well as high-volume batch processing, and skilled in advanced SQL, Amazon S3, Apache Kafka, data lakes, etc.
- Experience with Tableau is a plus.
- Experience with large-scale data mining tools such as Spark.
- Advanced understanding of best practices for structuring and organizing Data Lake file systems for large volumes of data.
- Experience with ML model automation and deployment to production.
- Experience building advanced data pipelines and performing data structuring and modeling, processing, extraction, joining, manipulation, cleaning, analysis, and presentation for medium to large datasets.
- Experience developing models for forecasting, classification, clustering, regression analysis, recommendations, variable selection, and natural language processing.
- Experience with scientific computing and analysis packages such as NumPy, Pandas, Scikit-Learn, SciPy, and ggplot2.
- Experience with Deep Learning frameworks like PyTorch, TensorFlow, and Keras.
- Experience with automated feature engineering/feature extraction and reduction.
- Experience with data visualization libraries such as Matplotlib, Seaborn, Pyplot, and ggplot2.
- Strong grasp of experimental design, A/B testing, and advanced statistical analysis.
- Experience with Git, GitHub, and Linux administration.
- Experience leading end-to-end data science project implementation, including training, testing, and deploying machine learning models in production environments.