Job Description

Experience Level: Experienced Hire

Categories:

  • Engineering & Technology

Location(s):

  • Quay Building 8th Floor, Bagmane Tech Park, Bengaluru, IN

We are seeking a highly experienced and motivated Senior Data Engineer to join our dynamic team. The ideal candidate possesses a strong software engineering background and deep expertise in designing, building, optimizing, and maintaining scalable data pipelines and infrastructure. You will leverage your extensive experience with Apache Spark, Apache Kafka, and various big data technologies to process and manage large datasets effectively. Working within an Agile/Scrum environment, you will take ownership of complex tasks, delivering high-quality, well-tested solutions independently.

Responsibilities:
  • Design, develop, implement, and maintain robust, scalable, and efficient batch and real-time data pipelines using Apache Spark (Python/PySpark and Scala) and Apache Kafka.
  • Work extensively with large, complex datasets residing in various storage systems (e.g., data lakes, data warehouses, distributed file systems).
  • Build and manage real-time data streaming solutions to ingest, process, and serve data with low latency using Apache Kafka.
  • Optimize data processing jobs and data storage solutions for performance, scalability, and cost-effectiveness within big data ecosystems.
  • Implement comprehensive automated testing (unit, integration, end-to-end) to ensure data quality, pipeline reliability, and code robustness.
  • Collaborate closely with data scientists, analysts, software engineers, and product managers to understand data requirements and deliver effective solutions.
  • Actively participate in Agile/Scrum ceremonies, including sprint planning, daily stand-ups, sprint reviews, and retrospectives.
  • Take ownership of assigned tasks and projects, driving them to completion independently while adhering to deadlines and quality standards.
  • Troubleshoot and resolve complex issues related to data pipelines, platforms, and performance.
  • Contribute to the evolution of our data architecture, standards, and best practices.
  • Mentor junior engineers and share knowledge within the team.
  • Document technical designs, processes, and implementation details.

Required Qualifications:
  • Bachelor's or Master's degree in Computer Science, Engineering, Information Technology, or a related field (or equivalent practical experience).
  • 10+ years of professional software engineering experience with a proven track record of building complex, scalable systems.
  • Significant hands-on experience (typically 5+ years) specifically in data engineering roles.
  • Expert-level proficiency in designing and implementing data processing solutions using Apache Spark, with strong skills in both Python (PySpark) and Scala.
  • Demonstrable experience building, deploying, and managing data streaming pipelines using Apache Kafka and its ecosystem (e.g., Kafka Connect, Kafka Streams).
  • Solid understanding and practical experience working with big data technologies and concepts (e.g., Hadoop ecosystem - HDFS, Hive, distributed computing, partitioning, file formats like Parquet/Avro).
  • Proven experience working effectively in an Agile/Scrum development environment, participating in sprints and related ceremonies.
  • Demonstrated ability to work independently, manage priorities, and deliver end-to-end solutions with a strong focus on automated testing and quality assurance.
  • Excellent problem-solving, debugging, and analytical skills.
  • Strong communication and interpersonal skills.

Preferred Qualifications:
  • Experience with cloud-based data platforms and services (e.g., AWS EMR, S3, Glue, Kinesis, MSK; Azure Databricks, ADLS).
  • Experience with workflow orchestration tools (e.g., Airflow, Dagster, Prefect).
  • Experience with containerization technologies (Docker) and orchestration (Kubernetes).
  • Familiarity with data warehousing solutions (e.g., Snowflake, Redshift, BigQuery).
  • Experience with Infrastructure as Code (IaC) tools like Terraform or CloudFormation.
  • Knowledge of CI/CD practices and tools (e.g., Jenkins, GitLab CI, GitHub Actions) applied to data pipelines.
  • Experience with data modeling and database design (SQL and NoSQL).

Application Instructions

Please click the link below to apply for this position. A new window will open and take you to our corporate careers page, where you can complete your application. We look forward to hearing from you!

Apply online