Data Developer

Job Description

The Data Developer position will focus on designing, developing, and supporting our Hadoop data solutions in Spark and Python (PySpark) while working with other components of the Hadoop ecosystem such as HDFS, Hive, Hue, Impala, Zeppelin, and Jupyter. A successful candidate will work closely with business and portfolio leads to understand requirements, then design and build innovative data solutions.

Job Duties & Responsibilities

  • Design and develop solutions centered on PySpark, Python, and the Hadoop framework.
  • Work with gigabytes to terabytes of data and understand the challenges of transforming and enriching such large datasets.
  • Provide effective solutions to both strategic and tactical business problems.
  • Collaborate with team members, project managers, business analysts, and QA teams in conceptualizing, estimating, and developing new solutions and enhancements.
  • Work closely with stakeholders to define and refine the big data platform to achieve sales, product, and strategic objectives.
  • Collaborate with other technology teams and architects to define and develop cross-function technology stack interactions.
  • Read, extract, transform, stage and load (ETL) data to multiple targets, including Hadoop and Oracle.
  • Ingest and streamline incoming files of various layouts and formats as part of the Source Prep process.
  • Develop scripts around the Hadoop framework to automate processes and existing flows.
  • Modify existing code for new requirements.
  • Estimate work and track progress through the SDLC with JIRA/Confluence.
  • Unit test and debug code, and perform root cause analysis (RCA) for any failed processes.
  • Convert business requirements into technical design specifications and execute on them.
  • Participate in code reviews and keep the application code base in sync with version control (Git/Bitbucket).
  • Communicate effectively, stay self-motivated, and work independently while remaining fully aligned within a distributed team environment.

Required Skills

  • Bachelor's or Master's degree in Computer Science (or an Engineering equivalent).
  • 3+ years of experience with big data ingestion, transformation, and staging.
  • Analysis, design, and implementation experience with Hadoop distributed frameworks, including Python and Spark (SparkSQL, PySpark), HDFS, Hive, Impala, Hue, Cloudera Hadoop, Zeppelin, Jupyter, etc.
  • Extensive experience handling large volumes of data (measured in terabytes or billions of transactions).
  • Proficient knowledge of SQL with any RDBMS.
  • Familiarity with RDDs and DataFrames within Spark.
  • Working knowledge of data analytics.
  • Troubleshooting and complex problem-solving skills.
  • Knowledge of Oracle databases and PL/SQL.
  • Working knowledge of Linux/Unix environments and comfort with Unix shell scripts (ksh, bash).
  • Basic Hadoop administration knowledge.
  • DevOps knowledge is an advantage.
  • Ability to work within deadlines and effectively prioritize and execute tasks.
  • Strong communication skills (verbal and written), with the ability to communicate across internal and external teams at all levels.

Preferred Skills

  • Working knowledge of Oracle databases and PL/SQL.
  • Hadoop administration and DevOps.
  • ETL skills (familiarity with Talend or other ETL tools is a plus).
  • Good analytical thinking and problem-solving skills.
  • Ability to diagnose and troubleshoot problems quickly.
  • Motivated to learn new technologies, applications, and domains.
  • Possess an appetite for learning through exploration and reverse engineering.
  • Strong time management skills.
  • Ability to take full ownership of tasks and projects.
  • Team player with excellent interpersonal skills.
  • Good verbal and written communication.
  • Possess a can-do attitude to overcome any kind of challenge.

Preferred Certifications (Any of These)

  • CCA Spark and Hadoop Developer.
  • MapR Certified Spark Developer (MCSD).
  • MapR Certified Hadoop Developer (MCHD).
  • HDP Certified Apache Spark Developer.
  • HDP Certified Developer.

Additional Information

    About Epsilon

    Epsilon is a global advertising and marketing technology company positioned at the center of Publicis Groupe. Epsilon accelerates clients' ability to harness the power of their first-party data to activate campaigns across channels and devices, with an unparalleled ability to prove outcomes. The company's industry-leading technology connects advertisers with consumers to drive performance while respecting and protecting consumer privacy. Epsilon's people-based identity graph allows brands, agencies and publishers to reach real people, not cookies or devices, across the open web. For more information, visit epsilon.com.

    When you're one of us, you get to run with the best. For decades, we've been helping marketers from the world's top brands personalize experiences for millions of people with our cutting-edge technology, solutions and services. Epsilon's best-in-class identity gives brands a clear, privacy-safe view of their customers, which they can use across our suite of digital media, messaging and loyalty solutions. We process 400+ billion consumer actions each day and hold many patents of proprietary technology, including real-time modeling languages and consumer privacy advancements. Thanks to the work of every employee, Epsilon has been consistently recognized as industry-leading by Forrester, Adweek and the MRC. Positioned at the core of Publicis Groupe, Epsilon is a global company with more than 8,000 employees around the world. Check out a few of these resources to learn more about what makes Epsilon so EPIC:

  • Our Culture: https://www.epsilon.com/us/about-us/our-culture-epsilon
  • Life at Epsilon: https://www.epsilon.com/us/about-us/epic-blog
  • DE&I: https://www.epsilon.com/us/about-us/diversity-equity-inclusion
  • CSR: https://www.epsilon.com/us/about-us/corporate-social-responsibility

    Great People Deserve Great Benefits

    We know that we have some of the brightest and most talented associates in the world, and we believe in rewarding them accordingly. If you work here, expect competitive pay, comprehensive health coverage, and endless opportunities to advance your career.

    Epsilon is an Equal Opportunity Employer. Epsilon's policy is not to discriminate against any applicant or employee based on actual or perceived race, age, sex or gender (including pregnancy), marital status, national origin, ancestry, citizenship status, mental or physical disability, religion, creed, color, sexual orientation, gender identity or expression (including transgender status), veteran status, genetic information, or any other characteristic protected by applicable federal, state or local law. Epsilon also prohibits harassment of applicants and employees based on any of these protected categories. Epsilon will provide accommodations to applicants needing accommodations to complete the application process.

    Job details

    Data Developer
    United States
    June 11, 2024
    Application deadline: January 08, 2024
    Job type
    Marketing & Sale