Position title
Principal - Site Reliability Engineer
Description

About Digit88

Digit88 is a niche product engineering consulting company based out of Bangalore with experience in building offshore development centers for US startups and MNCs over the last 7+ years. The founding team has 50+ years of product engineering and services experience out of India, China, and the US.

The Opportunity:

Digit88 manages, and is expanding, the dedicated offshore product development team for its US-based (Bay Area, NYC) NLP/chatbot platform development partner, which is building a next-generation AI/NLP/chatbot-based customer engagement platform. The candidate will join an existing team of 30+ engineers and will help expand and lead the Platform Engineering, Production Support, and Monitoring services for our client; assume technical leadership of the highly scalable and available messaging platform that connects the conversational AI platform to human agents; and own and drive technical excellence in the offshore development team.

Our Hunt:

Digit88 is seeking an enthusiastic Principal Site Reliability Engineer (DevOps) with 10-12 years of hands-on experience to join the platform engineering and DevOps team for our Bay Area-based partner. Experience in a DevOps engineer role at a fast-paced India/US product or engineering services company, setting up and maintaining a high-availability, high-performance real-time system, is mandatory. Applicants must have a passion for engineering with accuracy and efficiency, be highly motivated and organized, and be able to work both as part of a team and independently with minimal supervision.

Responsibilities:
In this role, you'll get to:

  • Build and maintain cross-team platform components: infrastructure based on Infrastructure-as-Code, CI/CD pipelines, application/infrastructure monitoring, and automation of other development-related processes
  • Design and automate the deployment of containerized applications using Kubernetes and Docker
  • Set up application/system monitoring
  • Work with Developers/QA to build and validate containerized applications
  • Manage geographically distributed server farms
  • Document deployment processes, services, and environments

Requirements:

  • BE/BTech with CS or related discipline
  • 7+ years of experience as a DevOps Engineer, Site Reliability Engineer (SRE), or Systems Engineer
  • Advanced understanding of AWS services and components - VPC, IAM, EC2, ALB, ECS/EKS
  • Strong background in Linux shell programming
  • Strong experience with SQL and NoSQL (Cassandra or DynamoDB)
  • Strong experience with distributed streaming platforms (Kafka/Spark)
  • Strong experience in Docker Containers
  • Hands-on experience with one or more of Java, Python, Go, or Node.js
  • Implementation experience with automation tools and frameworks (CI/CD pipelines) such as Git (source repository), Maven/Gradle (build tools), Jenkins/TeamCity, and Docker
  • Hands-on experience in Kubernetes
  • Experience with package management tools such as npm
  • Experience with automation/configuration management tools such as Salt/Puppet/Chef/Ansible
  • Ability to use a wide variety of open source technologies and cloud services (experience with AWS is required) for application deployment
  • Knowledge of best practices and IT operations in an always-up, always-available service
  • Experience in troubleshooting production issues and coordinating with the development team to streamline code deployment
  • Proven experience in optimizing the company’s computing architecture.

Good to have:

  • AWS certification (Architect, Operations)
  • Experience with monitoring tools such as Grafana/Prometheus and AppDynamics

Additional Project/Soft Skills:

  • Strong verbal and written communication, with the ability to articulate problems and solutions over phone and email
  • Strong sense of urgency, with a passion for accuracy and timeliness.
  • Ability to work calmly in high-pressure situations and manage multiple projects/tasks
  • Ability to work independently, with superior issue-resolution skills

Benefits & working @ Digit88

  • Comprehensive health and accident insurance
  • Attractive pay package
  • Creative, flexible and rewarding work environment
  • Opportunity to work with a founding team of serial entrepreneurs with multiple successful exits to their credit. The learning will be immense, as will the challenges.

Job Location: Pune, India
Employment Type: Full-time
Experience: 10-12 Years