Srihari Kookal

Senior Software Engineer · Frankfurt am Main, Germany

GitHub: srih4ri · LinkedIn: srih4ri

📈 As a leader, I actively contribute to shaping company objectives, plan roadmaps, launch task forces, and guide initiatives to achieve the organisation's goals.

🧑‍💻 I am an experienced manager who plans and executes projects while leading a team of engineers.

🥷 I can code, and I move fluidly between infrastructure, data, backend, and frontend functions.

🌻 Regarded as an industry expert by my colleagues, I mentor fellow engineers to grow in their profession and keep them motivated in their work.

Tools and Skills: Ruby, Rails, PostgreSQL, Redis, Kubernetes, Docker, Terraform, Terragrunt, Prometheus, Grafana, Datadog, AWS, Kibana, Elasticsearch, Sidekiq, Git, CI/CD, JavaScript, EmberJS, React, OKR Framework, Agile methodologies, Test Driven Development.

Experience

I have over 15 years of experience in software development, with a strong focus on backend development, system architecture, and team leadership. I have worked in various roles, from individual contributor to technical leadership positions, across multiple industries including fintech, insurance, and community engagement.

Senior Software Engineer, Omnia Retail · Technical Leadership

Sep, 2025 - Present

Member of Team Daedalus, the Market Data collection team behind Omnia Retail’s pricing-intelligence product for retailers. The team owns a large Scala/Python data pipeline running on Kubernetes.

  • Contributed to the migration of the team’s production workloads onto a new EKS cluster managed via Flux GitOps — pairing with teammates to fix ingress/healthcheck mismatches, IPv6 dual-stack networking, Karpenter node-pool selectors, and KEDA-based autoscaling during cutover.
  • Drove the migration of the cluster’s largest workload class to ARM64/Graviton nodes, including parallel multi-arch CI/CD pipelines and right-sizing of Kubernetes resources across worker types — recovering significant cluster capacity and reducing compute spend.
  • Operated and hardened a high-throughput data-collection subsystem — designing horizontal sharding, autoscaling, observability, PodDisruptionBudgets, and Flux image-update automation around a Redis-backed supporting component.
  • Added OpenTelemetry tracing across the Scala services — OTel Java-agent integration, span helper traits, and a Jaeger / OTel-collector docker-compose for local testing.
  • Built a shared AI-agent workspace for the team — wiring Claude Code into Jira, Confluence, log search, dashboards, the data warehouse, and internal databases via MCP servers, with reusable skills for incident investigation, dashboards, alerts, and PR review.
  • Built an alert-investigation toolkit that combines log search, Victoria Metrics, kubectl, and an LLM-driven agent — with trend awareness, alert fingerprinting, contract-impact tracking mapped to product KPIs, and threaded Slack notifications.
  • First responder for production incidents on the pipeline — owning RCA, follow-up fixes, and customer-facing communication.
  • Authored Python diagnostic tooling for data-freshness and routing analyses, drove the team’s Grafana migration from AWS to Azure Managed Grafana with Azure AD, and contributed to hiring, coding principles, and onboarding documentation.

Technologies Used

Scala (sbt, Java 8), Python (uv), Kubernetes (EKS), Karpenter, KEDA, Flux GitOps, Kustomize, Helm, Docker (multi-arch ARM64/x86), Terraform, AWS, Redis, PostgreSQL/TimescaleDB, MongoDB, Elasticsearch, Kibana, OpenTelemetry, Jaeger, Victoria Metrics, Prometheus, Grafana (Azure Managed), Databricks, Slack/Jira/Confluence APIs, Claude Code, MCP servers.


Staff Engineer, Clark · Technical Leadership

Feb, 2023 - Aug, 2025

  • Maintained the 10-year-old Rails monolith running 90% of the company’s backend services.
    • Actively planned and executed Ruby and Rails upgrades to ensure compliance with software supply chain standards.
    • Maintained an architecture based on Domain-Driven Design, ensuring efficient code ownership and agility for all product teams.
    • Reviewed solution designs for all major changes to backend architecture.
    • Mentored backend engineers as the code owner of critical components, reviewing pull requests from all product streams of the company.
  • Acted as the first responder for major incidents in Clark’s applications and dependent services, resolving multiple incidents within SLA and minimizing impact on business operations.
  • Collaborated with engineering managers and the director to set and achieve annual engineering goals for Clark’s German business division.
  • Contributed to Clark’s Terraform, Terragrunt, Helm, and Docker codebases, managing infrastructure across six businesses and four regions.
  • Proposed and built a resilient, highly scalable signup system that elastically scaled to handle high-traffic events, such as Clark-sponsored prime-time TV events.
  • Conducted performance tests using Grafana Labs' k6 and made performance improvements to Ruby on Rails deployments in Kubernetes clusters. Fine-tuned Puma, Nginx, and Kubernetes variables to achieve a distributed system that scaled elastically to handle peak traffic.
  • Ensured 100% uptime during peak traffic events, including a prime-time national TV event viewed by millions across Germany. Planned and executed multiple initiatives over a quarter to ensure system resilience and high availability.
  • Led the migration of the background job processor from Delayed Job to Sidekiq, improving background job performance by 20% and reducing overall costs. Coordinated a cross-team effort to migrate close to 500 background jobs across all environments.
  • Implemented vertical pod autoscaling and horizontal pod autoscaling, reducing monthly cloud costs by 10%. Created dashboards to monitor resource utilization, CPU CFS throttling, and memory pressure, and made key decisions to reallocate compute and memory provisioned across tens of Kubernetes resources.
  • Set up read-replica databases and wrote a custom middleware to route all GET requests to them, reducing the load on primary databases by 30%. This was a key initiative to ensure high availability and performance during peak traffic events.
  • Set up SLOs and monitoring using Prometheus and Grafana, and application monitoring using Datadog.
  • Implemented IRSA for various backend applications deployed on the Kubernetes cluster.
  • Established DORA-compliant processes for deployment, rollback, and incident response.
  • Reduced error rate by 60%. Identified and eliminated noise in error reporting channels, reducing alert fatigue.
  • Created a process to distribute errors across multiple teams. Actively triaged new errors and assigned them to relevant teams. Mentored and enabled multiple engineers to monitor and resolve errors.
  • Reduced overall monitoring alarms by 70%, cleaned up alerting channels and alerts, thereby reducing alert fatigue and increasing incident response readiness.
  • Fixed bottlenecks in continuous delivery pipelines, reducing the end-to-end deployment pipeline by one hour.
  • Worked closely with the database administrator to optimize database storage, reducing storage by 20%.
  • Carried out zero-downtime database upgrades for Clark Germany’s database across major PostgreSQL versions. Identified AWS Blue/Green deployments as a suitable approach, then planned and executed database migrations across environments within a week.
  • Saved cloud costs and improved codebase maintainability by retiring unused cloud services like SQS.
  • Maintained and developed internal tools written in Go, created to interact with various APIs such as GitHub, Sentry, and Jira.
  • Launched a RAG-based command-line code assistant using Google Gemini, LangChain, and a vector database.

Technologies Used

Ruby on Rails, Kubernetes, JavaScript, Go, Python, Jenkins, Groovy, Nginx, AWS DMS, AWS Lambda, AWS SQS, RDS, CloudWatch monitoring, Aurora, EKS, ECS, EmberJS, React, Shell Scripting, Swift, Datadog, Prometheus, Thanos, Grafana, Grafana Labs k6, Apache Bench, OpenAI, Ollama, Gemini API, LangChain

Tech Lead, Clark · Technical Leadership

Feb, 2021 - Feb, 2023

  • De facto Engineering Manager for the Marketing Tech team; collaborated with the Marketing team liaison to ideate, plan, and execute quarterly initiatives and goals.
  • Conducted Agile sprint ceremonies and achieved the company’s lead-time and cycle-time goals.
  • Proposed and implemented the Automated Voucher Payouts feature, increasing customer satisfaction and NPS.
  • Cut the go-live time of marketing campaigns from two weeks to almost instant by building an in-house marketing campaign management tool that automated all marketing reward programs at Clark.

Senior Software Engineer, Clark · Individual Contributor

Nov, 2020 - Feb, 2021

  • Collaborated with marketers to improve conversion of the primary signup funnel.
  • Enabled rapid experimentation by building an in-house A/B testing framework reusable across backend and frontend applications. It used a highly scalable weighted hash algorithm to assign variants to users.
  • Decommissioned legacy codebases, reducing the overall complexity of the system.

Associate Vice President - Technology, Scripbox · Leadership

Feb, 2020 - Oct, 2020

  • Worked closely with the founders, CTO, and CPO to lead the fifteen-member product team.
  • Managed a multi-talented team of software engineers, quality analysts, and business analysts, conducting one-on-ones and setting up a process for career and growth trajectories.
  • Won the company’s annual Best Manager Award.
  • Collaborated closely with product to plan and execute the annual vision and quarterly roadmaps.
  • Collaborated with company leadership to set up a skill-based performance and appraisal system for the engineering department.
  • Worked closely with the VP Operations to achieve KPIs for order processing.
  • Launched projects to achieve the operational KPIs for the in-house call center. This enabled the Customer Success team to achieve a very high NPS.
  • Successfully completed the customer data migration project after the company’s first major merger and acquisition, paving the way for smoother acquisitions in the future.
  • Built portfolio growth calculation software that computed ~2 million data points in under 30 minutes, using Elixir, Postgres, and matrix computation with CBLAS.

Product Engineer, Scripbox · Technical Leadership + Individual Contributor

Feb, 2017 - Feb, 2020

Mentoring
  • Led the backend team for Scripbox’s core services.
  • Onboarded a team of four Rails developers to React, switching from Handlebars to React with zero impact on delivery times.
  • Nurtured a culture of ownership and pride.
Ops and Site Reliability
  • First responder for incidents in Scripbox’s core services.
  • Provided on-call support and ensured uptime of Scripbox’s call center.
  • Rolled out the Rails upgrade from 4.3 to 5.1 with near-zero downtime.
Product
  • Built business process management software from scratch. Over the past year, this software has made Scripbox’s processes measurable, efficient, and fast.
  • Built a highly concurrent portfolio reporting system in the span of a month, using Elixir and Postgres.
  • Built key components of a paperless onboarding process, reducing the time to invest in a mutual fund from days to hours.

Technologies used

AWS, Kubernetes, Docker, Ruby, Elixir, Elasticsearch, PostgreSQL, MySQL, LaTeX, Redash/Metabase


Software Consultant, Big Binary · Individual Contributor

July, 2015 - Dec, 2016

  • Worked on a Rails 2.3 codebase and took over the process that deployed the app to 200+ clients.
  • Improved multiple steps in the deployment process. Quickly resolved customer support issues that needed engineering involvement, increasing customer satisfaction.
  • Optimised slow-running reports, revamping them with efficient caching and pre-calculation stages.
  • Practiced Test Driven Development.
  • Worked on the rewrite of the company’s flagship product into a multi-tenant SaaS application, which launched on time.
  • Collaborated with a frontend engineer to bootstrap a new product for the company.

Technologies Used

Linux, Bash, Ruby on Rails, JavaScript, CoffeeScript, WebKit engine, MySQL, Redis


Software Architect, Bang The Table · Individual Contributor

Jan, 2013 - July, 2015

Worked as a Rails and JavaScript developer, building two products in the community engagement space.


Software Engineer, Foradian Technologies · Individual Contributor

July, 2010 - May, 2012

  • Worked as a Rails and JavaScript developer on Fedena, the open-source school management system.
  • Built ‘Sampoorna’, the school ERP used by the Kerala Government in all public schools in Kerala. This software processes all school-related data for 400k new students every year.
  • Implemented CI/CD pipelines using SVN and Git. Foradian was one of the earliest companies to self-host GitLab at its launch.
  • Developed a hybrid data synchronization architecture to sync data between the centralized SaaS installation and on-premise installations of the Ruby on Rails ERP software.
  • Added an “Update Now” feature for the on-premise Ruby on Rails application.
  • Packaged the Ruby on Rails application as a Debian package to release a custom Ubuntu installer for on-premise installations.
  • Hosted and configured a Git server to deploy code on git pushes.
  • Developed an integration with FET (timetabling software) through a custom XML interface that communicated with an application running on a headless X server.
  • Rewrote PDF generation using the Prawn library.
  • Handled multiple high-traffic events.

Technologies Used

Ruby on Rails, Python, JavaScript, Backbone.js, Git, SVN, GitLab, BigTuna, Shell scripting, Prawn, FET


Projects

Product: XPNS

XPNS is a personal finance app for tracking expenses, income, and investments. Built using Ruby on Rails, EmberJS, and PostgreSQL, and deployed on Cloudflare and Hetzner. Blog post about building XPNS

Consulting: Foaps

Foaps is a startup in the food delivery space where I worked as a consultant to build their backend systems. The project involved building a highly concurrent system to handle food orders, payments, and customer support. The system was built using Ruby on Rails, PostgreSQL, and Redis. Notable achievements include:

  • Reduced load times of daily business reports from 10 minutes to under 5 seconds.
  • Built a WhatsApp food delivery system, the first of its kind in the industry.

Open Source: Debian Ruby

Worked with the debian-ruby team to package Ruby gems for Debian as part of the Diaspora project.

Education

July, 2010

B.Tech - Electronics and Communication, Lal Bahadur Shastri College of Engineering.