William Webster

AI in Finance: Tackling Interest Rate Risk with GPT-4

Updated: Jan 11

In this article, I explore the capabilities of GPT-4 in assessing a firm's interest rate risk. My objective is to provide a practical perspective on how advanced models like this can be applied in real-world scenarios. The insights gained along the way shed light on where these tools are heading and highlight the pitfalls to watch for.

The Starting Point

I've put together a straightforward spreadsheet that illustrates the typical interest rate exposure a small bank or building society might encounter. This often takes the form of long-dated fixed-rate assets funded by shorter-dated floating-rate liabilities. In such a setup, profits materialize if interest rates fall, while losses ensue if they rise.

Such exposure is typically managed using various techniques. One of the most basic is the "gap report," which relies on a basis point value or delta calculation. For this analysis, we'll keep things uncomplicated. I believe a simpler approach allows us to better comprehend how the model can assist us.
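As a rough illustration of what a gap report computes, the sketch below nets assets against liabilities in each repricing bucket and converts each gap into an approximate basis point value. The bucket boundaries, balances, midpoints and the flat 4% rate are all hypothetical and are not taken from the spreadsheet used in this article. Treating each gap as a single cash flow at the bucket midpoint is a simplifying assumption.

```python
# Hypothetical repricing buckets in GBP (NOT the article's spreadsheet figures):
# (bucket label, assumed midpoint in years, assets repricing, liabilities repricing)
BUCKETS = [
    ("0-3m", 0.125, 5_000_000, 60_000_000),
    ("3-12m", 0.625, 10_000_000, 25_000_000),
    ("1-5y", 3.0, 40_000_000, 5_000_000),
    ("5y+", 7.0, 45_000_000, 0),
]
FLAT_RATE = 0.04  # assumed flat, annually compounded discount rate


def gap_report(buckets, rate):
    """Return a list of (label, gap, dv01) tuples, one per bucket.

    Each bucket's gap (assets minus liabilities) is treated as one cash
    flow at the bucket midpoint. A positive dv01 means value is lost
    when rates rise by one basis point.
    """
    rows = []
    for label, midpoint, assets, liabilities in buckets:
        gap = assets - liabilities
        # Sensitivity of gap / (1 + r)^t to a 1bp rise in r, sign flipped
        dv01 = gap * midpoint / (1 + rate) ** (midpoint + 1) * 0.0001
        rows.append((label, gap, dv01))
    return rows


report = gap_report(BUCKETS, FLAT_RATE)
total_dv01 = sum(dv01 for _, _, dv01 in report)

for label, gap, dv01 in report:
    print(f"{label:>6}  gap {gap:>14,.0f}  DV01 {dv01:>10,.0f}")
print(f"Total DV01: {total_dv01:,.0f}")
```

With these made-up balances, the short buckets show negative gaps (more liabilities than assets repricing) and the long buckets show positive gaps, so the total DV01 is positive: the position loses value when rates rise, which is the exposure described above.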

The initial step involves asking GPT-4 to reformat the information, ensuring it's presented in a manner the model can easily interpret and engage with.

GPT 4:

Next, I prompt GPT-4 to perform a DV01 analysis. It's important to note that I provided no additional cues, purely because I wanted to gauge its intuitive understanding of my request. Impressively, GPT-4 not only grasped the essence of what I intended but also laid out a set of assumptions it would be operating under.

GPT 4:

I ask it to continue.

GPT 4:

GPT-4 not only presents the DV01 calculation but also articulates its results in plain English, bringing to light some potential limitations of such an analysis.

However, an unexpected discrepancy arises. Based on my calculations, I anticipated a risk figure around £106,000, but GPT-4 returns an estimate of £138,000. This raises the question: is there an error in our methods?

A closer examination reveals methodological differences. GPT-4 employs a textbook duration approach, whereas I utilize a more hands-on discount function method. This latter technique involves adjusting the yield curve by a basis point. To determine if this variance in methods is the root of our discrepancy, I prompt GPT-4 to emulate my approach and provide its results.

GPT 4:

Our results are now closely aligned, though slight differences persist. Without diving too deep, these variances likely stem from GPT-4's preferred approach versus mine. Admittedly, this can sound rather technical, but there's an essential lesson to be drawn from it.

GPT-4 has adopted a methodology that, while not incorrect, might diverge from prevalent industry practices. It has behaved akin to a diligent academic. This highlights a need to watch out for such outputs: they're not wrong, just potentially misaligned with expectations. Once such disparities are identified, GPT-4 can be directed to adjust its response in the context of specific industry practices, which it does commendably. In a novice's hands, GPT-4's output might be accepted without scrutiny, but an experienced practitioner can ensure the proper alignment.
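To make the difference between the two methods concrete, here is a minimal sketch that computes DV01 both ways for a single fixed-rate asset. The instrument (10 million notional, 4% annual coupon, five years), the flat 4% yield and the upward-sloping zero curve are hypothetical, and the code does not reproduce the figures quoted earlier or the exact calculation GPT-4 performed.

```python
def cash_flows(notional, coupon, years):
    """Annual cash flows of a fixed-rate bullet asset as (time, amount) pairs."""
    flows = [(t, notional * coupon) for t in range(1, years + 1)]
    flows[-1] = (years, notional * (1 + coupon))  # final coupon plus principal
    return flows


def pv(flows, zero_rate, bump=0.0):
    """Present value from annually compounded zero rates, optionally shifted."""
    return sum(cf / (1 + zero_rate(t) + bump) ** t for t, cf in flows)


def dv01_duration(flows, y):
    """Textbook route: modified duration x price x 0.0001 at a flat yield y."""
    price = sum(cf / (1 + y) ** t for t, cf in flows)
    macaulay = sum(t * cf / (1 + y) ** t for t, cf in flows) / price
    modified = macaulay / (1 + y)
    return modified * price * 0.0001


def dv01_bump(flows, zero_rate, bp=0.0001):
    """Discount-function route: reprice with the whole curve shifted up 1bp."""
    return pv(flows, zero_rate) - pv(flows, zero_rate, bump=bp)


def curve(t):
    """Hypothetical zero curve: 3.2% at one year, rising 0.2% per year."""
    return 0.03 + 0.002 * t


flows = cash_flows(10_000_000, 0.04, 5)
d_duration = dv01_duration(flows, 0.04)  # assumes a flat 4% yield
d_bump = dv01_bump(flows, curve)         # uses the sloped zero curve

print(f"Duration-based DV01: {d_duration:,.0f}")
print(f"Bump-and-reprice DV01: {d_bump:,.0f}")
```

Even on this toy instrument the two numbers are close but not identical, because each method makes different assumptions about the curve and about how the one basis point shift is applied. Across a whole balance sheet, differences in such assumptions can add up to a visible gap between two otherwise valid risk figures.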

This raises a question: if GPT-4 can handle such a straightforward example, could LLMs eventually replace the conventional software tools used for risk management? I aim to explore this in a future article, but the winds of change seem to be stirring.

Having conducted some technical analysis, let's pivot to an area where GPT-4 truly shines: writing. Could it draft a memo to my supervisor elucidating the risk we've discussed? Let's find out.

GPT 4:

As anticipated, the memo is succinct and zeroes in on the primary exposures, setting a clear agenda for subsequent actions. Impressively, this was achieved in a single attempt by GPT-4. The memo effectively highlights the main concerns, suggests potential resolutions, and points to further analyses that might be beneficial. To provide some perspective, the quality and depth of the memo are such that I would expect it from someone with several years of experience in the field.

What are the implications?

The ability to rapidly produce high-quality documents undoubtedly boosts productivity. This efficiency implies we can either accomplish more with our current workforce or, in the future, require fewer personnel for the tasks we handle today. Many treasurers and risk managers—no offence intended—aren't always the most adept at drafting reports. Many would welcome a tool that both expedites and elevates the writing process. In fact, with GPT-4's assistance, one might be tempted to entrust the role of a risk analyst to an eager new graduate. Yet, I'd advise caution, especially recalling our earlier hiccup with the calculation method. While GPT-4 can certainly elevate the capabilities of less experienced individuals, it doesn't replace the nuanced understanding and seasoned judgment that veterans bring. It's a balance that organizations must thoughtfully consider as they move forward.

Considering the complexities, it might be beneficial for management to grasp the nuances between the two risk calculation methodologies. To this end, I decided to test GPT-4's adeptness in elucidating the distinctions between the two approaches.

GPT 4:

While the output from GPT-4 provides a solid foundation, it may not be immediately ready for distribution. Some fine-tuning would be necessary, but the overarching essence and direction are promising. With more precise prompts, the quality of the output could be enhanced swiftly.

To further illustrate GPT-4's capabilities, I tasked it with crafting a high-level paper explaining the rationale for shifting from one risk methodology to another, a task treasurers and risk managers often confront. It becomes evident that GPT-4 offers an efficient and streamlined way of drafting content tailored to diverse audiences.


The paper is short and to the point, but it is not entirely what I expected: it introduces issues, such as non-parallel shifts in the yield curve, that I had not asked about. Still, this was a one-shot request that could easily be refined with further prompting.

Key Takeaways from the Exercise:

  1. Capability and Application: GPT-4 can craft risk reports reminiscent of real-world standards. While this exercise isn't exhaustive proof of handling a real-life balance sheet, there's a strong indication that, given the right input and guidance, GPT-4 can produce results comparable to traditional systems.

  2. Understanding Computer-Generated Output: Accepting computer-generated information unquestioningly can lead to pitfalls. GPT-4's edge lies not just in generating numerical data but in framing it within coherent, natural language narratives. This offers remarkable benefits but requires users to approach with experience and discernment, ensuring they don't fall into potential traps.

  3. GPT-4's Writing Prowess: As touched upon in prior sections, GPT-4 excels in articulation. It can swiftly render intricate explanations in chosen formats. This proficiency, especially in saving time and elevating report quality, presents a compelling case for adopting the enterprise version of the model. Considering the time saved and the enhanced clarity in reports, the cost of GPT-4 can be quickly justified.

  4. Future Outlook: GPT-4 and LLMs more broadly are still in their nascent stages, with GPT-4 itself accessible for under a year. Yet the transformative potential of these models is evident. They herald a shift in risk evaluation and reporting. Compared to the rigid algorithms of traditional systems, LLMs' ability to process and query data in natural language, especially given banks' vast stores of unstructured data, signals an imminent paradigm shift. This evolution might unfold more rapidly than anticipated, perhaps even outpacing the transformative journey of the internet.
