SME DevOps, MLOps and Platform

I get things running, I keep them running

About Me

  • Engineer by profession, writer at heart.
  • I have been building data streaming platforms with Apache Kafka, internal developer platforms, and standard SRE and DevOps tooling (with Infrastructure as Code).
  • Along the way I ended up working with Elasticsearch, Cassandra, Kubernetes and a lot more around data pipelines.
  • Part-time (technical) product management and full-time engineering led to a role I never imagined I’d play in my career, all while trying to expand developer adoption of home-baked systems.
  • Currently working on in-house MLOps setups and Kubernetes native dev platforms.

Summary

  • Designing REST and event-driven solutions on Kubernetes
  • In-org customer interactions and product ownership
  • Designing automations using serverless and ChatOps patterns
  • Scalable applications in Cloud (Hybrid with AWS primary)
  • Kafka/ES/Cassandra at scale (1M+ tps), along with the infrastructure and operations for all of it
  • Setting up, automating and standardizing SRE and DevOps deployment platforms

Professional Skills

Apache Kafka AWS Microservices Event Driven Architecture Elasticsearch Cassandra Kafka Connect Chef Packer Terraform Kubernetes Haystack OpenTracing Java Python Golang ArgoCD Argo Workflows CircleCI Sonarcloud SLI, SLO, SLA Scorecard MLOps Databricks

Organisations I Have Worked With

Org | Title (Role) | Duration
Teikametrics | Staff Site Reliability Engineer (Technology Leader and Mentor) | Jan 2022 - Present
Expedia Group | Software Development Engineer (Technology DevOps Lead) | July 2013 - Jan 2022
IBM | Systems Engineer (Application Developer) | Feb 2012 - July 2013

Projects

DevOps - Teikametrics

Core DevOps/Platform team for Teikametrics, overhauling and standardizing DevOps methodologies as the Series B startup grows at a rapid pace

Aiven Kafka Postgres Terraform CircleCI Argo Workflows ArgoCD Kubernetes Python AWS MLOps Hybrid Cloud Datadog Opensearch Elasticache

  • Working closely with the tech leadership on adoption of industry-standard DevOps and SRE practices
  • Application and system design with Dev teams
  • Automated application bootstrap services
  • Scorecard services for fully automated ops compliance
  • Custom Kubernetes deployment platform wrapped around open-source tools like Argo and Helm
  • ChatOps implementation using Slack and Argo Workflows
  • Implemented SLI/SLO/SLA across org
  • Chaos Engineering and Game Day Exercises
  • Data sync initiatives for robust pre-release environments
  • Managing Aiven Kafka and Connect using Terraform
  • DataOps and MLOps around Databricks using Terraform

Conversations Platform - Expedia Group

A next-gen virtual agent chatbot platform that can be adopted across business domains, with an initial focus on the travel industry

Terraform Python Confluent Kafka Jenkins SLI SLO SLA Scorecard Opensearch AWS Datadog Faust

  • Developed app health scorecard to measure prod-readiness based on application health, SLOs, cost and other dimensions
  • Monitoring, operations and automations around Confluent Cloud
  • Conversations Platform migration from REST to Event Driven Architecture
  • Generating real-time operational insights over chat data using Faust processors
  • Migrating Infrastructure from EC2 and ECS to Kubernetes
  • Chaos Engineering and Game Day Exercises

Streaming Platform - Expedia Group

Built a data streaming platform with custom tooling around Confluent-OSS, Apache Kafka and Kafka Connect which acts as a central nervous system/data lake across EG brands viz. BEX, HCom, etc.

Kafka Java Chef Ruby Terraform Python AWS Datadog

  • Achieved ~99.99% availability of core components in accordance with expected SLOs.
  • Global Streaming data platform with an in-house stream registry
  • REST-based data ingestion into EG’s Kafka clusters. Facilitated data conversion from raw JSON to Avro and Protobuf, easing integration into already running microservices.
  • Built custom Sink Connectors using Kafka Connect framework for S3 Sink, HTTP Sink, Elasticsearch Sink, Cassandra Sink.
  • Built in-house automations for provisioning and maintaining Kafka and Kafka Connect clusters.
  • Deployment and e2e management and operations of high-volume Apache Kafka/Connect (100+ nodes).
  • Automated Kafka Connector monitoring and remediation

Doppler - Expedia Group

Developed and supported a high-performance real-time analytics platform to monitor all business and system events in Expedia. Consuming over 10 billion messages a day and generating 50 million+ trends, the system monitored booking, traffic and other events, including clickstream, over 100s of dimensions.

Java Kafka Elasticsearch Cassandra KStreams EFS AWS EC2 Grafana Seyren Splunk

  • Ingestion via REST service into the Kafka ecosystem
  • Time Series aggregation, Anomaly Detection & notification services.
  • Scalable microservices with Kafka, ES, Cassandra as core components, spanning across 500+ EC2 nodes.
  • Deployment and e2e management and operations of high-volume Apache Kafka (10B+/day), Cassandra (100TB+) and Elasticsearch clusters.
  • Two in-house time-series data storage methodologies for real-time and batch data.
  • Interactions with global teams for migration and adoption
  • Support for seasonal and straight trends in anomaly detection using vanilla statistical models and custom algorithms.
  • Custom Datastore over EFS to support hourly calculations on millions of data points using historic data.
  • Alert notifications are sent to Email and Slack, with integrations for Trello and Jira via Amazon SNS.

IOTA/Stratus - Expedia Group

Fully automated creation of deploy and release pipelines (CI/CD) using microservices, Jenkins and underlying AWS services, i.e. CloudFormation, ECS and EKS.

Packer Chef Python Java Ruby Splunk Grafana Jenkins AWS

  • Automations over AWS infrastructure with Chef, Ruby and AWS SDK.
  • Platform for automated creation and management of bootstrapping applications along with their CI/CD pipelines over Git and Jenkins.
  • Automations using Python, Ruby.
  • Advanced user of Splunk, Graphite, Grafana and similar monitoring systems.
  • Interactions with global teams for migration to and adoption of the new AWS-based deployment platforms

Common Commerce Engine - IBM

SHOP@IBM.COM, powered by WCS, automating store creation for the sale of IBM’s servers.

Java WebSphere Linux

  • App support and dev with IBM proprietary technologies like WCS
  • Deployment over WebSphere on data center servers
  • Consistent delivery of monthly release patches.

My Academia

Institution | Qualification | Score
Swami Keshvanand Institute of Technology affiliated to Rajasthan Technical University | B.Tech (Hons.) - Information Technology | 70.5%
Seedling Public School | Higher Secondary (CBSE) - XII | 77.6%
Seedling Public School | Senior Secondary (CBSE) - X | 82.4%

Languages

Language | Proficiency
English | Native or Bilingual Proficiency
Hindi | Native or Bilingual Proficiency
Sindhi | Limited Working Proficiency

Personal Interests

Reading Writing Poetry Anime Comic Books History Mythology