Can AI fix Wall Street’s ‘spaghetti code’ crisis? Microsoft and IBM are betting that it can

There aren’t a lot of businesses built around tape cassettes or floppy disks, so a lack of experts who can repair those decades-old technologies is rarely a problem. That’s not the case with Cobol, a 64-year-old programming language that Wall Street and the federal government rely on to process tens of trillions of dollars’ worth of transactions annually. As Cobol gets older, those massive organizations have been hard-pressed to find people who can update their ancient systems.

When something does go wrong, many firms turn to 82-year-old Bill Hinshaw, the “Cobol Cowboy.” Hinshaw works out of his home office in northern Texas, where he oversees a remote team of some 600 aging Cobol engineers—some of whom cut their teeth as programmers in the ’60s and ’70s. Every week, the cowboys respond to emergency calls—including one in 2021 from Superior Welding Supply, a 93-year-old Iowa firm whose only in-house Cobol expert died just before the company’s software crashed.

Firms of all sorts are wrestling with how to maintain old code, typically Cobol, that still runs but that is often poorly documented and hard to modify—part of a sprawling problem that programmers call “spaghetti code.” For years, government agencies, the media, and large banks have sounded the alarm over their aging technical infrastructure. Now tech giants like IBM and Microsoft think they may have found a powerful tool to wean us off Eisenhower-era tech: generative AI.

“Spaghetti code”

Cobol is out of sight and out of mind when we transfer money from savings to checking, but it is critical to banks’ day-to-day operations, which also include adding customers to a database, enabling ATM transactions, and processing payroll. It’s even pervasive throughout the federal government, which since 1973 has used the same system, built with approximately 1 million lines of Cobol code, to process student aid applications.

There are somewhere between 220 billion and 800 billion lines of Cobol still in use today, according to OpenText, an IT company. “The vast majority of major banks, when you get down to the actual core banking systems, you get down to Cobol,” said Michael Abbott, the global banking lead at Accenture.

After programmers designed it in 1959, Cobol quickly became the de facto language for data processing, which meant that banks, insurers, government agencies, and any other company dealing with terabytes of information wrote and maintained millions of lines of Cobol code.

Legacy Cobol, however, lacks the efficiency and versatility of upstarts like C, Java, and Python that emerged in the ’70s and later. These languages were built differently and allowed engineers to better structure and reuse previously written code. Eventually, startups and Silicon Valley ventures began building with Java or Python rather than Cobol.

As Cobol has fallen out of favor, maintenance costs have soared. Moreover, since legacy Cobol code is often poorly documented, repairs and upgrades take longer. A major overhaul may not be viable in a fast-paced economic environment, where interest rate changes, for example, force banks to rapidly update the products they offer customers.

“It might take them nine to 12 months to launch by the time they untangled the spaghetti and updated the product inside of it,” Abbott said. “With a modern architecture, that can be done in weeks.”

The costs of an upgrade, meanwhile, can run into the hundreds of millions. When Commonwealth Bank of Australia finished replacing its core banking system in 2017, the process had taken five years and cost almost $750 million. And, if done poorly, upgrades can result in catastrophe. In 2022, a U.K. financial regulator fined TSB, a British bank, more than $60 million for a failed migration to a new IT platform after thousands of customers were unable to make online payments for weeks.

Code assistants

Ruchir Puri, chief scientist at IBM Research, jumped up to a whiteboard on a recent afternoon at the company’s leafy campus north of New York City. With two markers, the mustachioed executive, wearing a ball cap and white dress shirt, sketched out how financial institutions can use watsonx Code Assistant to translate millions of lines of code from Cobol to Java.

Yes, he said, code transpilers, or translators, have existed for decades, but these older systems translate spaghetti Cobol into spaghetti Java. In other words, a poorly documented and hard-to-understand legacy language becomes a poorly documented and hard-to-understand modern language.
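To make the contrast concrete, here is a hypothetical illustration of the two styles in Java. It is a minimal sketch, not the output of IBM’s tool or of any real transpiler; the class name, the field names (WS_BAL, WS_RATE, WS_INT), the paragraph numbers, and the monthly-interest rule are all invented for the example. The first method mimics a line-by-line translation, in which Cobol’s numbered paragraphs and GO TO jumps survive as a loop over step numbers and global variables. The second computes the same figure the way a Java developer would likely write it from scratch.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical example: both styles compute one month's interest on a balance.
class InterestSketch {

    // Style 1: "spaghetti Java". Global working-storage fields keep their
    // Cobol-like names, and control flow jumps between numbered paragraphs.
    static BigDecimal WS_BAL;
    static BigDecimal WS_RATE;
    static BigDecimal WS_INT;

    static void P1000() {
        int para = 1000;
        while (para != 0) {
            switch (para) {
                case 1000:
                    WS_INT = BigDecimal.ZERO;
                    // Emulates: IF WS-BAL NOT > 0 GO TO 9000.
                    para = (WS_BAL.signum() <= 0) ? 9000 : 2000;
                    break;
                case 2000:
                    WS_INT = WS_BAL.multiply(WS_RATE)
                            .divide(new BigDecimal("12"), 2, RoundingMode.HALF_UP);
                    para = 9000;
                    break;
                case 9000:
                    para = 0; // exit paragraph
                    break;
                default:
                    throw new IllegalStateException("unknown paragraph " + para);
            }
        }
    }

    // Style 2: structured Java. Inputs and output are explicit, names say
    // what they mean, and there is no shared state or jump table.
    static final BigDecimal MONTHS_PER_YEAR = new BigDecimal("12");

    static BigDecimal monthlyInterest(BigDecimal balance, BigDecimal annualRate) {
        if (balance.signum() <= 0) {
            return BigDecimal.ZERO; // no interest on empty or overdrawn accounts
        }
        return balance.multiply(annualRate)
                .divide(MONTHS_PER_YEAR, 2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        WS_BAL = new BigDecimal("1200.00");
        WS_RATE = new BigDecimal("0.05");
        P1000();
        System.out.println("literal translation: " + WS_INT);
        System.out.println("structured version:  "
                + monthlyInterest(new BigDecimal("1200.00"), new BigDecimal("0.05")));
    }
}
```

Both methods return the same number. The difference is in what a maintainer has to hold in their head to change the rule: the first requires tracing jumps and shared variables, while the second can be read, tested, and reused on its own.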

IBM’s solution, he said, goes beyond existing transpilers. Through generative AI, the same technology that powers OpenAI’s ChatGPT, IBM can turn poorly documented Cobol into structured, easy-to-parse Java. Puri said IBM’s tools can increase the speed at which companies modernize their code bases by a factor of up to 10.

And this could save Wall Street, and the government, billions. In February 2018, for example, the Air Force finished modernizing computing systems responsible for the management of supplies and equipment for war-fighting missions. The three-year process, which included translating millions of lines of Cobol into Java, saved the government $25 million in annual computing costs, according to a 2019 report. Imagine, then, how much taxpayers could have saved if generative AI had cut the three years needed to modernize the Air Force’s systems to months.

Unsurprisingly, this has translated into a major business opportunity for IBM. Puri said big companies are clamoring to use its Cobol-to-Java translation software. “The majority of Fortune 100 companies are our clients, and all of them, every one of them, is engaged at this point,” he noted.

For Thomas Dohmke, the CEO of GitHub, the Microsoft-owned software platform for developers, Cobol is also top of mind. “Cobol still running on mainframes is a much bigger societal problem than we think,” he recently posted on X. He said he’s heard more about Cobol in the past year than he has in the past 30 years.

And just like with IBM, customers are reaching out to GitHub to see how its generative AI–powered assistant, GitHub Copilot, can help modernize their legacy infrastructure.

Powered by the same algorithms that drive ChatGPT, Copilot functions like auto-complete for email. The model is trained on all programming languages that appear in public GitHub projects, and as programmers type, Copilot suggests line edits. Dohmke added that programmers can also use the tool to highlight swaths of Cobol code and ask Copilot to explain what that code actually does. And just like IBM’s own coding assistant, Copilot can translate swaths of Cobol into Java or any other programming language.

“Generative AI and Copilot will make our lives easier in maintaining those old code bases,” he said, “and ultimately modernizing them.”

Riding into the sunset

The hype around AI has reached such a crescendo that even Hinshaw, the Cobol cowboy, is advising a startup on its own AI and Cobol product. Hinshaw, whose leather hat hangs on his office door in Texas, is not afraid, though, that AI will force him to ride off into the proverbial sunset. “At three o’clock in the morning you may get a call saying this program fails,” he said. “How do you get AI involved in that, so you can have it up and running in that hour?”

The executives at IBM and GitHub also took pains to say humans should be involved in the process of code translation and modernization every step of the way. And both concede that AI-generated code isn’t perfect, and should therefore, just like human-generated code, go through a battery of tests. Stanford researchers found, for example, that developers who used OpenAI’s coding assistant often wrote less secure code than those who didn’t.

But despite the risks, those enmeshed in the financial industry, like Abbott from Accenture, hope that products like IBM’s or GitHub’s can untangle the spaghetti code that so often bogs down big banks. “I will tell you,” he said of generative AI, “I think it holds enormous promise.”

Tech’s aging building blocks

Many businesses and the federal government still run quite well on these 20th-century coding languages.

1957 | Fortran

John Backus leads a team at IBM to create Formula Translation, or Fortran, one of the first programming languages to incorporate natural language (e.g., “if,” “read,” “write”) into its syntax.

1959 | Cobol

A group of computer scientists, including famed programmer Grace Hopper, prompts the development of Common Business-Oriented Language, or Cobol, which becomes the de facto language for data processing.

1972–73 | C

At Bell Labs, Dennis Ritchie develops C, which still consistently ranks among the five most popular programming languages.

1983 | Ada

French programmer Jean David Ichbiah and his team devise Ada, which briefly becomes the Department of Defense’s programming language of choice in the ’90s.

1987 | Perl

American programmer and linguist Larry Wall releases Perl, which sees a mini-boom in the early 2000s before Python supplants it.

This article appears in the October/November 2023 issue of Fortune with the headline, “Can AI fix Wall Street’s ‘spaghetti code’ crisis?”

This story was originally featured on Fortune.com
