At ShareChat's scale, a seemingly simple incident left clients seeing a 100% error rate while servers reported only 5% failures, exposing dangerous gaps in observability and failure handling. This talk walks through the debugging journey that uncovered misaligned timeouts, inconsistent circuit breaking, and service mesh interactions, then covers the fixes applied and practical techniques for improving resilience in distributed systems.