Sr. Big Data Platform Developer

22 May 2024

Vacancy expired!

Duration: 12 Months (long term)

Location: Tampa, FL or Irving, TX

We are looking for a Sr. Big Data Platform Developer who will work on collecting, storing, processing, and analyzing huge sets of data. The primary focus will be on choosing optimal solutions for these purposes, then implementing, maintaining, and monitoring them. You will also be responsible for collaborating with different stakeholders and teams on required development.

Responsibilities
  • Implement ETL processes using the defined framework
  • Monitor performance and advise on any necessary infrastructure changes
  • Create/modify tables and views in Hive
  • Write shell scripts to execute Hive on Spark jobs
  • Automate the shell scripts in the job scheduling tool Autosys
  • Improve job performance through Hive parameters, Spark configuration-level changes, and Spark optimization techniques
  • Create/modify HQL scripts to retrieve data from Hive tables or to process data
  • Work with the team to define data retention logic as per business requirements
  • Perform and oversee tasks such as writing scripts, writing T-SQL queries, and calling APIs
  • Customize and oversee integration tools, warehouses, databases, and analytical systems
  • Design the data flow, create data flow diagrams, and implement design-level changes
  • Design and implement data stores that support the scalable processing and storage of our high-frequency data
  • With the help of the Admin/support team, solve any ongoing issues with operating the cluster

Qualifications and Skills:
  • Bachelor’s or master’s degree in computer/data science, or related technical experience
  • 7+ years of hands-on, relevant data engineering experience with data warehouse, data lake, and enterprise big data platforms required
  • Experience working in an agile/iterative methodology required
  • Working experience with Big Data (Hadoop ecosystem), NoSQL (Hive), Impala, Spark, Scala, shell scripting, and RDBMS (MS SQL Server) required
  • Experience with integration of data from multiple data sources using full load, incremental load, and real-time load
  • Working experience with development/deployment tools: Jira, Bitbucket, Jenkins, RLM
  • Experience with Spark, Hadoop v2, MapReduce, and HDFS required
  • Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala, required
  • At least 2 years of relevant experience with real-time data stream platforms such as Flume, Kafka, and Spark Streaming
  • Experience with various ETL techniques and frameworks required
  • Excellent analytical, problem-solving, and communication skills
  • Ability to solve any ongoing issues with operating the cluster

“Mindlance is an Equal Opportunity Employer and does not discriminate in employment on the basis of – Minority/Gender/Disability/Religion/LGBTQI/Age/Veterans.”

  • ID: #41363660
  • State: Texas Irving 33601 Irving USA
  • City: Irving
  • Salary: Depends on Experience
  • Job type: Permanent
  • Showed: 2022-05-22
  • Deadline: 2022-07-19
  • Category: Et cetera