Resolved -
We have monitored the system closely since implementing the fix and can confirm that performance has remained stable. The issue was caused by a malfunctioning node in our AKS cluster, and affected workloads were successfully moved to healthy nodes.
The incident is now resolved. Thank you for your patience and understanding throughout.
Apr 19, 14:57 CEST
Monitoring -
We've identified the issue as a malfunctioning node in our AKS cluster. The affected workloads have been successfully moved away from the faulty node, and performance has returned to normal levels.
We're now moving from investigating to monitoring and will remain in this state for a while to ensure stability before marking the incident as resolved.
Thank you for your patience and understanding.
Apr 19, 12:59 CEST
Update -
Performance remains slightly degraded as our team continues to investigate the issue.
We sincerely apologize for any inconvenience this may be causing and appreciate your patience and understanding as we work to resolve it.
Apr 19, 12:23 CEST
Update -
We are observing slightly better performance since 11:00 but are continuing to investigate the issue.
Apr 19, 11:15 CEST
Investigating -
Since 09:50 we are observing elevated response times in Storm API in our multi-tenant production environment. We're currently investigating the issue.
We sincerely apologize for any inconvenience this are causing and appreciate your understanding as we continue to investigate.
Apr 19, 10:12 CEST