It is no laughing matter: Microsoft's multifactor authentication (MFA) system, used for Azure, Office 365 and Dynamics, is down for the second time this month, just hours after the company published its conclusions about a 14-hour outage on November 19.
During that earlier outage, Azure Active Directory multifactor authentication services went offline just before 05:00 UTC and remained down until shortly before 19:00 UTC. The servers initially affected were those serving the Europe and Middle East region and the Asia-Pacific region; as these regions woke up and tried to authenticate, the servers became overloaded and failed. Microsoft tried to redirect some authentication attempts to US servers, but this simply overloaded them as well.
The company's subsequent analysis showed that three separate bugs combined to cause the problems. On November 19, a code change that had been rolled out gradually over the previous six days triggered a cascade of failures. Above a certain level of traffic, the new code caused a significant increase in latency between front-end servers and cache servers. This in turn exposed a race condition in the back-end servers, which forced the front-end servers to reset repeatedly. That then exposed a third problem: back-end servers kept creating more and more processes, eventually starving themselves of resources and becoming unresponsive.
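The third failure, a back-end that keeps creating processes until it runs out of resources, is the kind of problem a hard cap on concurrent work is meant to prevent. The following is a minimal Python sketch of that general idea, not Microsoft's code or fix: the class `BoundedBackend`, its `try_handle` method and the `max_workers` parameter are all hypothetical names chosen for illustration.

```python
import threading


class BoundedBackend:
    """Hypothetical back-end that caps concurrent work instead of growing without limit."""

    def __init__(self, max_workers: int):
        # One semaphore slot per unit of work allowed to run at the same time.
        self._slots = threading.BoundedSemaphore(max_workers)
        self._lock = threading.Lock()
        self.accepted = 0
        self.rejected = 0

    def try_handle(self, work) -> bool:
        """Run `work` if a slot is free; otherwise reject it immediately."""
        # Non-blocking acquire: when every slot is busy, shed the request
        # rather than queueing it or starting yet another worker.
        if not self._slots.acquire(blocking=False):
            with self._lock:
                self.rejected += 1
            return False
        try:
            work()
            with self._lock:
                self.accepted += 1
            return True
        finally:
            # Always free the slot, even if `work` raised an exception.
            self._slots.release()
```

With `max_workers=1`, a second request that arrives while the first is still running is rejected rather than given a new worker, so resource use stays bounded. The trade-off is that some requests fail fast under load, which is usually easier to recover from than a server that has exhausted its resources.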
Today's problems are still under investigation. The MFA servers have been timing out since 14:25 UTC, causing login attempts that use MFA to fail. For now, the company believes that the resolution of an earlier DNS error generated a surge of authentication attempts, essentially flooding the MFA system with more requests than it can handle.
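A surge like this, where many clients retry at once as soon as a dependency recovers, is commonly softened on the client side with exponential backoff plus random jitter. The sketch below illustrates that general technique only; nothing in Microsoft's statement says its clients do or do not work this way, and the function name `backoff_delays` and its parameters `base` and `cap` are assumptions made for the example.

```python
import random


def backoff_delays(attempts, base=0.5, cap=30.0, rng=None):
    """Return one randomized wait time (in seconds) per retry attempt.

    Uses "full jitter": each delay is drawn uniformly between 0 and an
    exponentially growing ceiling, so clients that failed at the same
    moment do not all retry at the same moment.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        # The ceiling doubles with every attempt but never exceeds `cap`.
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(rng.uniform(0.0, ceiling))
    return delays
```

A client would sleep for `delays[i]` before its retry number `i`. Passing a seeded `random.Random` as `rng` makes the sequence reproducible for testing.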