Issues with Queue Login and Incoming Calls

Incident Report for Zisson.no

Postmortem

Please see below for the postmortem incident report: “Message System Outage November 12, 2025 (Evening)”

Summary
On November 12, 2025, at 21:20, the Zisson Interact platforms experienced a disruption in the internal message system. During this incident, agents were unable to log in or out of queues, receive or make calls using the softphone, change their queue status, or process ongoing queue traffic. The issue was mitigated through manual intervention, and services were gradually restored; full functionality was confirmed around 23:00 on one platform and 23:30 on the other.

Description
At 21:20, alarms were triggered indicating instability in the internal message system on the Interact platforms. Operations initiated troubleshooting immediately. Agents were unable to interact with queues or use the softphone for inbound or outbound calls. The issue was not caused by abnormal traffic or external factors, but by a synchronization failure within the platform’s message cluster. To restore stability, traffic was redirected to a single functioning message server. However, other servers on the platform experienced connection issues, requiring manual reconnection and verification to ensure proper communication between components. The platforms gradually stabilized, and all services were fully operational by around 23:00 on one platform and 23:30 on the other.

Timeline of Events

  • 21:20: Alarms triggered – message system instability detected on one Interact platform.

  • 21:22: Immediate troubleshooting initiated by Operations team.

  • 21:30: Situation Room established; engineers began isolating the affected servers.

  • 21:40: Traffic redirected to a single functioning message server to stabilize the system.

  • 21:50–22:45: Manual reconnection and synchronization of servers to restore communication between components.

  • 23:30: All services confirmed operational; normal traffic resumed.

Root Cause
A synchronization failure occurred between servers within the message system cluster, preventing normal communication between components.

Further investigation revealed that a new network configuration was implemented by our hosting partner before the incident.

This change caused the message servers within the message system cluster to lose communication with each other, leading each node to operate independently (a “split-brain” condition).

As a result, multiple servers assumed the master role simultaneously, which generated message loops and excessive load on the system. The issue was not caused by an external network event, but by an unintended consequence of the configuration change, which disrupted internal synchronization between the cluster nodes.
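
As an illustration only, and not a description of how the Zisson message cluster is actually configured, a common guard against the split-brain condition described above is a majority (quorum) rule: a node may only assume the master role while it can reach more than half of the cluster. The function names and the three-node cluster size below are assumptions made for this sketch.

```python
def has_quorum(reachable_peers: int, cluster_size: int) -> bool:
    """Return True if this node plus the peers it can reach form a strict majority."""
    visible = reachable_peers + 1  # count this node itself
    return visible > cluster_size // 2


def may_act_as_master(reachable_peers: int, cluster_size: int) -> bool:
    # A node on the minority side of a partition must not promote itself,
    # so at most one partition can hold a master at any time.
    return has_quorum(reachable_peers, cluster_size)


# Hypothetical 3-node cluster where every node has lost contact with the others:
# each node sees 0 peers, so none of them is allowed to become master.
isolated = [may_act_as_master(reachable_peers=0, cluster_size=3) for _ in range(3)]

# Hypothetical 2/1 partition: only nodes on the two-node side keep quorum.
majority_side = may_act_as_master(reachable_peers=1, cluster_size=3)
minority_side = may_act_as_master(reachable_peers=0, cluster_size=3)
```

Under such a rule, a complete loss of communication leaves the cluster without a master rather than with several, trading availability for the absence of message loops between competing masters.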

Actions Taken

  • Immediate investigation initiated by Operations and Support teams.

  • Traffic redirected to a single operational message server to restore stability.

  • Manual synchronization and reconnection of affected servers.

  • Verified full system functionality by around 23:00 on one platform and 23:30 on the other.

  • Hosting partner rolled back the change.

Next Steps

  • Our hosting partner has updated procedures for implementing and validating infrastructure changes to ensure full synchronization across servers before returning services to production.

We recognize the impact such disruptions have on our customers and sincerely apologize for the inconvenience caused.

Zisson Operations Team, 2025-11-12

Posted Nov 13, 2025 - 16:27 CET

Resolved

The issue has now been resolved, and all systems are functioning normally again.
Posted Nov 13, 2025 - 08:07 CET

Monitoring

The issue appears to be resolved, and all systems are functioning normally again.
We are continuing to monitor the situation.
Posted Nov 12, 2025 - 23:30 CET

Update

We are still experiencing issues with queue logins, and incoming calls are still not being delivered to agents.
Our team is actively working on resolving the problem, and we will share more updates as we make progress.
Posted Nov 12, 2025 - 23:12 CET

Identified

We are currently experiencing issues with logging into queues, and incoming calls are not being delivered to agents.
We are working to resolve the issue and will provide updates as we progress.
Posted Nov 12, 2025 - 22:24 CET
This incident affected: Zisson Interact.