Security and Business Continuity for SWIFT Connections

And the Alexander Hamilton Gold Award in Operational Risk Management goes to ... Microsoft. Congratulations!

For more than a decade, Microsoft has used SWIFT to connect internal systems with external financial institutions. As other processes became increasingly automated over the years, the SWIFT connectivity became vital to the company’s operations.

Today, the Microsoft treasury group manages 1,300 accounts in 85 banks around the world. “We send about $300 billion a year in payments through SWIFT, and we receive status messages in return,” explains Lisa Wagner, group manager for cash operations within Microsoft treasury. “Our accounts payable [A/P] and supply-chain payments move an additional $60 billion to $100 billion through SWIFT every year. On top of that, treasury also gets electronic bank statements on a daily basis, and the statements are large—an individual statement may include thousands of transactions.”

When a bank statement comes in, it auto-posts to Microsoft’s general ledger. The balances on customer accounts then reflect the payments on the statement, which frees up customer credit and enables Microsoft to make additional sales to those customers. If this functionality were unavailable for a period of time, the outage might hamstring the sales team’s ability to sell to certain customers. Moreover, having real-time access to bank statement information enables Microsoft treasury to optimally fund all 1,300 accounts and so maximize the company’s global investment returns.

“We’re using SWIFT messages for liquidity analysis, daily positioning, FX [foreign exchange] exposures, and messages we’re moving for trade finance,” Wagner says. “SWIFT is the conduit to move information from our systems to our banks and from the banks’ systems to ours. If a statement failed to come through, or if the system went down, the potential impact to our business would be really significant.”

Concerned about the possible ramifications of a failure in SWIFT-related communications, Microsoft’s treasury engineering group set out to develop a solution that would ensure process continuity in the event of a system failure, natural disaster, or other workflow disruption. Treasury engineers worked with different groups that used the SWIFT connection to map out data flows, as well as auditability requirements in the event of a crisis. These included treasury and A/P payments; the supply-chain management team’s invoices; and treasury compliance, audit, and general ledger (G/L) posting processes.

The treasury engineering team determined that they needed to implement additional circuits between SWIFT and the company’s data centers. “When we started the project, we had a single path between our internal systems and SWIFT,” explains Aaron Brooks, senior service engineer. “There was no redundancy, so if that one path went down—even when we were doing updates to the system—the whole service was offline. To solve this problem, we spun up two new paths, in two different data centers. All the services that rely on SWIFT connectivity can seamlessly fail over among the three paths.”

Microsoft intentionally avoided setting up the SWIFT infrastructure as one primary connection supported by disaster recovery links. “The challenge with a configuration like that is that the disaster recovery path doesn’t often get exercised,” Brooks says. “It’s treated as a second-class citizen, so it’s not surprising that when the organization finally needs to use it, they usually run into problems. We took the approach that we want all three paths to be equal, and we want to exercise all of them all the time.” The treasury engineering team built a fully redundant end-to-end process for sending treasury payments, A/P payments, and supply-chain invoices, as well as for receiving files such as MT messages and electronic statements from Microsoft’s banks.
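The "three equal paths" idea can be illustrated with a small sketch. This is not Microsoft's implementation, which the article does not describe in detail. It is a minimal illustration, assuming hypothetical path names (`dc_a`, `dc_b`, `dc_c`) and a simple round-robin policy, of how traffic might rotate across every path in normal operation so that none of them sits idle as an untested standby.

```python
class PathPool:
    """Rotate traffic across equal paths; skip any path marked down.

    Hypothetical sketch: path names and the round-robin policy are
    assumptions made for illustration only.
    """

    def __init__(self, names):
        self.names = list(names)
        self.healthy = set(self.names)
        self._next = 0  # index of the next path to try

    def mark_down(self, name):
        # Take a path out of rotation (failure or planned maintenance).
        self.healthy.discard(name)

    def mark_up(self, name):
        # Return a known path to rotation.
        if name in self.names:
            self.healthy.add(name)

    def pick(self):
        # Try each path at most once, starting where the last pick ended.
        for _ in range(len(self.names)):
            name = self.names[self._next]
            self._next = (self._next + 1) % len(self.names)
            if name in self.healthy:
                return name
        raise RuntimeError("no healthy path available")


pool = PathPool(["dc_a", "dc_b", "dc_c"])
normal = [pool.pick() for _ in range(3)]    # every path carries traffic
pool.mark_down("dc_b")                      # e.g. patching one data center
degraded = [pool.pick() for _ in range(4)]  # traffic continues on the rest
```

Because every path is used routinely, a fault in any one of them tends to surface during ordinary operation rather than during a crisis, which is the point Brooks makes about disaster recovery links that are rarely exercised.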

“Our strategy is to have multiple paths capable of running in production at every point in time,” says Wumi Fagbami, senior service engineer. “That way, we can move between paths depending on what our needs are at the time. That makes it easy to adjust to any changes in our environment. As an engineer, I think of my role as making sure the service is always running, and for any outages within our environment to be invisible to the business.”

Microsoft also transitioned to an environment in which its SWIFT platform runs in a virtual private network (VPN) built in the treasury data center. This setup enables treasury to take daily snapshots of their servers as a backup, which means that they can easily recover their SWIFT environment in the event of an unexpected failure. “Previously, our SWIFT application was isolated from the rest of our systems, requiring system operations to be performed directly in front of a console,” says Fagbami. “Since we have built in much more automation and security checks around the application, along with the redundant paths for SWIFT communications, we have saved more time and caused less worry.”

The treasury engineering group performs security checks on wire transfers, looking for tampering or duplication. At the same time, the team performs penetration testing, through which internal security staff simulate cyberattacks to test whether security controls are working properly. “The average wire transaction for treasury payments is significantly large,” Fagbami says, “and recovering funds from third parties can be difficult.”
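The article does not say how these checks are built. As a rough illustration only, the sketch below shows one common way to screen for both problems: a keyed hash (HMAC) over the payment detects tampering, and a fingerprint of key fields detects duplicates. The field names, the shared key, and the `WireScreen` class are all hypothetical and are not drawn from SWIFT's or Microsoft's actual controls.

```python
import hashlib
import hmac
import json


def sign(payment: dict, key: bytes) -> str:
    """Return an HMAC-SHA256 tag over a canonical JSON form of the payment."""
    canonical = json.dumps(payment, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, canonical, hashlib.sha256).hexdigest()


class WireScreen:
    """Hypothetical screen that flags tampered or duplicated wire instructions."""

    def __init__(self, key: bytes):
        self.key = key
        self.seen = set()  # fingerprints of payments already accepted

    def check(self, payment: dict, signature: str) -> str:
        # Tampering: the tag must match the payment as received.
        if not hmac.compare_digest(sign(payment, self.key), signature):
            return "tampered"
        # Duplication: same reference, beneficiary, amount, and currency.
        fingerprint = (
            payment["reference"],
            payment["beneficiary"],
            payment["amount"],
            payment["currency"],
        )
        if fingerprint in self.seen:
            return "duplicate"
        self.seen.add(fingerprint)
        return "ok"


key = b"demo-key-for-illustration-only"
wire = {"reference": "INV-1001", "beneficiary": "ACME GmbH",
        "amount": "250000.00", "currency": "EUR"}
tag = sign(wire, key)

screen = WireScreen(key)
first = screen.check(wire, tag)                              # accepted
second = screen.check(wire, tag)                             # same wire resubmitted
altered = screen.check(dict(wire, amount="950000.00"), tag)  # amount changed in transit
```

Amounts are kept as strings here so that the canonical form does not depend on floating-point formatting. A production system would also need key management and persistent storage of the fingerprints, neither of which this sketch attempts.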

Now, Microsoft treasury operates with the confidence that automated bank communications are both secure and prepared for disaster. Even if a data center were to go entirely offline, payments and supply-chain invoices would continue to flow to Microsoft’s banks, while statements and SWIFT messages would continue to flow back. This means that whatever is happening in the external environment, customers’ accounts will always reflect their latest payment status, and decisions around hedging and account balances will be based on accurate and timely information.

Getting to this point required a substantial commitment from teams across Microsoft, as well as external software and networking vendors, and the company’s supply-chain partners. “So many people worked together to get this done,” Fagbami says.

Wagner agrees. “Clear communication was key to the success of this project,” she says. “Every team has its own priorities. Moving this project forward required us to understand what was going on with our partners at every stage of the project, and to utilize tools that fostered open communication with them. We leveraged those tools to make sure everyone involved in the project understood the vision for the initiative and the potential business impact if the SWIFT system went down. That was crucial to making sure the project was properly prioritized across all the groups that needed to be involved.”

