Large Language Models – Revolutionising Compliance – [FULL INTERVIEW]

As Joseph Twigg from Aveni explains, the accuracy and accessibility of speech analytics has significantly improved with the introduction of large language models. This has caused adoption of speech analytics to skyrocket, particularly in high-value and complex interactions such as financial services.

Automating the process of evidencing and understanding customer circumstances can lead to a scalable and efficient operating model, while providing additional insights.
Large language models can perform compliance monitoring, quality assurance and Consumer Duty assessment for every single interaction, which helps identify high-risk interactions and flag them for review. The future may see all conversations recorded, making it easy to remember what was said.

While large language models trained on broad data sets require additional domain expertise and controls, they can generate documentation in the voice of the customer, saving time and effort.

Overall, large language models will fundamentally change the way businesses operate, offering opportunities for scalability and efficiency.

Find out more about Aveni -> Here.

Interview Transcript

Hi, everyone, I’m here with Joseph Twigg. He’s the CEO of Aveni, in the Natural Language Processing space, with expertise in regtech and compliance. So Joseph, thanks very much for joining me today. Really appreciate it.

Thanks for having me. So we were just chatting earlier a little bit about the journey we’ve been on around analysing conversations, analysing messaging and analysing interactions. Yeah, I think it’s been a really interesting evolution. Like you said, it started off with recording the majority of calls; the different regulations coming in over the years were the catalyst behind that. And then automatic speech recognition really started the process of utilising the customer voice for analytics. If you go back seven or eight years, you probably had the first wave of speech analytics capability, and I think that largely disappointed. Speech recognition was probably not good enough: too many mistakes in transcripts, too many issues end to end. So the promise and the hype largely underwhelmed in that first phase of adoption. I think over the last couple of years in particular, that’s fundamentally changed. The underlying speech recognition capability has dramatically shifted; it’s been commoditised by big tech and is available to anyone with cloud infrastructure very easily. And the underlying scientific progress around different discrete parts of language processing has been really amazing; the field has moved so quickly. So from a product perspective, the challenge has been taking bleeding-edge research and converting it into a useful tool in situ, as part of a workflow or process. And I think the technology has really come of age. Speech analytics as a sector has been growing pretty rapidly, there’s widespread adoption, and things are just about to get very crazy thanks to large language models.

With the introduction of GPT, Bard and others, where do you see the state of evolution, in terms of where the majority of people are, given that some people are at both ends?

Well, as you said, it’s a sort of spectrum, isn’t it? What we typically see now: all large banks have outsourced contact centres, where there are typically high-volume, potentially low-value interactions, and speech analytics is used to drive efficiencies into the process. Number one, there’s been a huge push, obviously, to try and move customers down different channels, where the cost to serve via web or chatbot is materially less than the cost to serve via a contact centre. But when a contact centre is used, you’re seeing the vast majority of those larger, high-volume environments using speech analytics. Where you see very little adoption of speech analytics is in the high-value, maybe lower-frequency interactions: more complex conversations, whether that’s advice, financial advice, things of that nature. There’s relatively low adoption in that space, and that’s quite a big opportunity, I think.

How much do you think this is now becoming expected? People are sort of expecting to be able to listen back to the call. With the conversation about Consumer Duty going on in the background, do you think this is going to be expected from an evidencing point of view: to be able to recall the conversations and search the conversations? And not just conversations, but also messaging and any of the interactions that happen. It feels like it’s going that kind of way.

100%. I think the question becomes: why wouldn’t you do it? You know the requirements: to evidence your interactions, to identify circumstances where harm could occur, to really evidence and understand the change in customer circumstances. To do that manually is really challenging. Assessing the outcome for a customer across their whole customer journey, from when they first took out a product or service to their current circumstances, is a really, really challenging process that probably requires interaction with multiple systems. If you don’t introduce degrees of automation, the impact is actually going to be a reduced-assurance problem, and potentially an unintended consequence for assurance and governance around Consumer Duty.

People have to spend ten times more time assessing one case, so unless you’re going to get ten times more people, the total coverage of assurance is going to be less. So yeah, the simple answer is it will act as a material catalyst for the adoption of this technology, and companies will reap the reward for that. This is not just about ticking a box for compliance. This is about automating process, really driving a scalable operating model, understanding your customers and driving value from the additional insight you can get.

At one level, it’s almost like the power of search, isn’t it? The fact that you can search every single interaction and find out what’s going on, just like we can with the internet, but being able to do that with all our interactions. It’s just incredibly powerful, isn’t it?

There’s an element of that. I mean, that capability, from a customer contact perspective, is already here. You can search any interaction you’ve had since you started using our platform through one search bar, and that can be quite an intelligent search as well. I think

the recent evolution in this technology has moved that one step further on. We call it machine assessment, a machine line of defence if you will, where essentially natural language processing models are performing your compliance monitoring for you, performing your quality assurance for you, performing your Consumer Duty assessment for you. And that is done for every single interaction across the board. Instead of having to search through your backlog of customer interactions to find something, the models are triaging the things that you want to search for: the high-risk interactions, the indications of harm, the changes in circumstances, things of that nature. They’re flagging those to your teams. Vulnerable customers: you need to take a look at this customer, their circumstances have just changed, and that goes to the top of your pile from an assurance perspective.
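To make the triage idea concrete, here is a minimal Python sketch of a “machine line of defence”. The keyword lists are toy stand-ins for the NLP risk models Joseph describes, and every name here is illustrative rather than Aveni’s actual system:

```python
from dataclasses import dataclass, field

@dataclass
class Interaction:
    customer_id: str
    transcript: str
    risk_flags: list = field(default_factory=list)

# Hypothetical keyword checks standing in for real NLP risk models.
CHECKS = {
    "vulnerability": ["bereaved", "carer", "struggling"],
    "circumstance_change": ["lost my job", "divorced"],
    "potential_harm": ["didn't understand", "wasn't told"],
}

def triage(interactions):
    """Flag every interaction against every check, then sort the
    most-flagged (highest-risk) interactions to the top of the pile."""
    for item in interactions:
        text = item.transcript.lower()
        for risk, phrases in CHECKS.items():
            if any(p in text for p in phrases):
                item.risk_flags.append(risk)
    return sorted(interactions, key=lambda i: len(i.risk_flags), reverse=True)

calls = [
    Interaction("c1", "All fine, thanks for the update."),
    Interaction("c2", "I lost my job last month and I didn't understand the fees."),
]
queue = triage(calls)
print(queue[0].customer_id, queue[0].risk_flags)
# c2 ['circumstance_change', 'potential_harm']
```

The point of the sketch is the shape of the workflow: every interaction gets assessed, and reviewers start from a risk-ordered queue rather than a random sample.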

I asked you about 100% call recording, and it’s like: why would you not do it? Why would you not do 100% analysis, interaction analysis? And that’s from a business point of view. I suppose the interesting thing is what it would be like from a personal point of view. We already see things like cameras going on people’s chests recording much more video; everything can now be recorded. Do you think we’ll ever get to the point where, personally, we end up recording all our conversations? Because it’d be very helpful for me in terms of actually remembering what I said.


Who knows. I think, yeah, you can definitely see an environment where, in the sort of workplace of the future, as part of your onboarding you have to read a script so your voice can be recognised and associated with you. And then every meeting room, every interaction you have, is recorded as standard, with meeting notes and actions automatically generated, and follow-on meetings automatically put into everyone’s diary. And that’s all done by essentially recording meetings and interactions. Again, nothing needs inventing to make that happen; all the capability is here. But you can see scenarios like that definitely emerging. And that also equalises the hybrid work pattern a little bit, being able to capture both sides of your work life, both in the office and via VC.

ChatGPT has, of course, come up. I think it’s the fastest-growing launch that’s ever taken place; I was reading today it was over a million users in five days, the fastest ever. There’s been a lot of chatter about that and the other GPT, or generative text, programmes and tools that have been out there, including from Microsoft and others. What’s your view in terms of where that technology goes?

Yeah, it fundamentally changes everything. I think the last time we’ve seen something like this, in terms of impact, was probably the internet. But then, the internet seemed to take 10 to 15 years to hit a common standard of adoption for businesses. It took quite a while.

This has gone from zero to 100 for most people in the space of a few months. And you fast forward 10 years and look at the way businesses operate: there won’t be many things that people can do, in the service industry in particular but across lots and lots of industries, that these large language models won’t really outperform. I think the CEO of OpenAI, the company behind ChatGPT, clearly stated that within that timeframe he would expect large language models to be able to outperform humans at almost every economically valuable task. And then you look at companies of the stature of Google and Microsoft getting so flustered. Google wiped $170 billion off its share price in a week due to the botched release of Bard, with a mistake in the marketing materials; it’s not often you see companies of that size and scale get so flustered. Microsoft completely ripped up its approach to releasing products; it’s absolutely out of character for Microsoft to rush the release of Bing GPT and just let it go out in the wild with a million users. They did that basically to get user feedback and research on a scale they couldn’t possibly achieve themselves, but they were also really risking their brand doing it. When you see the biggest tech companies in the world behaving like that, it should give you an indication of what they think the potential is here.

Do you think it’s a little bit like the internet, in terms of that initial growth? The internet was almost like magic at first, right? It’s a bit like ChatGPT: at first it felt like we could find everything. Remember when we used to “surf the internet”? No one says that any more. We went through that wave, then you had the crash, and then you got to the real use cases. Do you think we’re still in that change curve, where it’s almost like heavy adoption and we’re awed by the magic of it, getting it to compose poems in the style of Shakespeare and those kinds of things, as much as the actual use cases? Has it got a bit of maturity to go, do you think?


I think the demonstration of the underlying capability is fundamental to so many processes: the ability to understand context and return that understanding for additional questions, the question-answering approach, text generation. These capabilities can be used in so many things, from PR generation and chatbots through to writing code; the applications are so broad. The demonstration of that capability has been so good, and has really surprised on the upside, with ChatGPT. I don’t think there’s much more to go in that sense: the core capability has been demonstrated in a way that is very clear, and it’s not ambiguous what the potential is. In terms of adopting the technology for specific use cases, though, there is a very long way to go. A live example: I got access to Bing GPT, and as you do, the first thing you ask is “Who is Joseph Twigg?” It wrote a summary of me and my experience, and it said Joseph Twigg is the CEO of Aveni; formerly, Joseph was a senior consultant at Accenture.

I’ve never worked at Accenture. Everything else it got in that statement was absolutely bang on, perfectly right. You’ve just got this one bit that said Accenture; I’ve never had any relationship with Accenture, never done anything with Accenture. It hallucinated that. That level of reliability required for specific use cases is quite a challenge. These things are trained on very broad data sets. You’ve got to add additional domain expertise, domain systems, additional information retrieval, additional controls and guardrails to resolve the adoption gap for specific use cases. And this sort of middle layer, the gap between a large language model (not necessarily GPT, there are multiple in the pipeline) and the specific use cases, is going to be where lots of time and effort is spent. Right now you can see loads of use cases: generating marketing material, generating PR, any sort of copywriting, giving humans a new starting point from which to start their job. But an AI adviser giving financial advice without any human intervention? I don’t think anything new needs inventing to do that, but we’re a long way off in terms of providing the assurance that it is correct.

One of the concepts I’ve been chatting to a few people about is whether, if you’re talking to a human and getting advice from a human, you have different expectations around its accuracy, on both sides, whether you’re giving the advice or not, versus if you’re talking with a computer. We’ve pretty much had it ingrained in us that the computer is always right, even when it’s not. The expectation is that the information is going to be locked down and should be almost always right. Whereas if you’re talking with a human and I made a mistake, you might say, “Well, no, I didn’t do that,” and I’ll say, “OK, well, I made a mistake.” But the computer probably doesn’t say that, right? So our expectation is different, and probably rightly so. Is that something we’re going to have to deal with, do you think, particularly when talking about probabilistic models generating text and those kinds of things?

Yeah, this is a very well understood field in human-computer interaction, and there are some really good examples of it from the last few years. If you take Autopilot at Tesla: if there’s a crash on Autopilot, it’s headline news. The same day, there were probably a thousand crashes in exactly the same scenario that were human error, and they don’t make the news at all. It’s all because of: who do you blame? Whose fault is it? Where does the liability lie? So in human-computer interaction, the instinct not to trust a computer-derived output if there’s any ambiguity or error is really strong. We see it all the time. If there’s an error in a transcript, people go, “What’s that? It made a mistake, that’s not correct.” Then they listen to the audio and go, “Oh, now I can see why it said that,” because the output of the speech recognition is what the audio sounds like. But the instinct is to say it’s wrong, whereas you don’t really get that with humans. So yeah, that burden of proof is higher. And look, when you’ve got something that in principle is infinitely scalable: when a human makes a mistake, it’s that one person; if you could magnify that mistake across the total population, it’d be a much bigger problem. And that’s the potential of systems like these. So you need the guardrails, you need the controls, you need the assurance to make sure the systems are performing.

Back to your example around meetings, or conversations that might be happening: of course there’s the transcript of what actually happened, but the summarisation, I think, is particularly interesting. You’re in a meeting.
And you use something like a GPT model to basically say: what are the summary bullet points, what are the actions, those kinds of things. It feels like that’s tantalisingly close and could really save us a lot of time.

So right now, you can summarise a meeting with GPT, but it will give you a generic summary that might be useful or might not, that might be accurate or might not, and that’s all it will give you. We’ve developed a solution already, using large language models, one of them powered by GPT, that is designed to take the output of a financial advice meeting and generate all the documentation required: summarise the output in context with the voice of the customer; capture the risk assessment, the income, the goals; summarise that in the way a financial adviser would write it; populate a suitability report. So when you’ve finished a meeting, you can press “generate email” and it will generate an email summarising the conversation and send it. You can press “generate notes” and it’ll generate the notes for a paraplanner. You can press “generate suitability” and it will populate a suitability report, saving probably five or six hours of low-value administration that you could resolve immediately. The journey to get there has been super, super challenging. We had one really good example. These are all, obviously, demo scenarios, but literally based on very real meetings: the client mentioned in their assets that they own the house, but they didn’t talk about the value of the house.


GPT by itself hallucinated a value for the house. In the summarisation, it just gave it a value. Like, wow, where’s that come from? So the process of prompt engineering: what questions do you need to ask these large language models, what restrictions do you need to set? You literally need to tell them, “if you’re not sure, don’t put it in.” It’s that level of interaction. And you have to go through this process of refining the way you interact with these models, adding additional restrictions, additional systems that it pulls gold-standard information from, before you can rely on it. But when you get there, it’s fascinating. We could finish this meeting and have notes, actions and emails generated. If you want it to write me a letter to thank me, that can be generated, and it’s instant, it’s done.

That really feels like an initial use case, certainly one that I would be close to, because it just saves so much time. And you can get around any kind of bias to a certain extent, because it’s a summarisation of what actually happened. So it’s almost like: here’s the transcript, or here’s the audio and the transcript if you really want to go back to it, but here’s a summary to basically save you time. It feels like that’s the first really valuable use case, and one I’m champing at the bit to get, to be honest.

Well, the real IP and magic sauce in what we’ve done in this particular use case is that you can highlight any section of the summary and it will give you a list of data points showing where that data is coming from. So you can validate any material data point immediately by just highlighting it, and it will take you back to the source of that data. That traceability component is absolutely essential to meet the reliability requirements, and even the perception requirements. Think about the human-computer interaction chat we just had: you’ve got to give the user confidence that this is getting it right. We can highlight, we can give confidence levels, we can do all of that. That gives it trust and traceability, and that enables the use. We’ll still put our hands up and say this has to be “human plus” right now. I wouldn’t attach this to a robo-adviser and take humans completely out of the loop. You still want a human to read the email and validate it. We can extract from the conversation to populate a CRM system: read that first and make sure you’re happy with it before you populate it. Read the suitability letter. But your starting point for this admin has just moved from zero to 95. And if you multiply this across

companies with large teams, the savings here can be material. And this is a relatively simple use case in the context of the power of these models. It’s logically the first thing to tackle, but you can throw documents into it; all of a sudden you can ask GPT a question about multiple documents and do a case assessment. These things, like I said, are on the cusp, but they’re pretty much ready now.

One of the things I did find quite interesting is the actual maths behind it, because there’s probability and maths involved. As I understand it, if you throw summaries of summaries of summaries into ChatGPT, it comes out with an average of all of them, right? So you go from quite specific at the first level, but when you get to a second and third level, it becomes the averages of the averages and you get very generic content. And also, if you ask the same question, every time it comes back with the same answer, unless you’ve changed one of the variables slightly. So there’s that element of creativity and making mistakes that humans have, which generates different kinds of outcomes. Is that something you’ve got to be careful of with language models: that you’ve got to stay quite close to the data, otherwise it becomes almost bland, or just generic?
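On the traceability point from a moment ago (highlighting a summary data point and jumping back to its source), the idea can be sketched in Python. This is a toy keyword-matching version; a real system would presumably align generated text with transcript spans far more robustly:

```python
def trace(summary_points, transcript_turns):
    """Map each summary data point to the indices of the transcript
    turns that mention its key terms, so a reviewer can validate the
    point at source rather than trusting the summary blindly."""
    links = {}
    for point, terms in summary_points.items():
        links[point] = [
            i for i, turn in enumerate(transcript_turns)
            if any(term in turn.lower() for term in terms)
        ]
    return links

turns = [
    "adviser: tell me about your goals",
    "client: i want to retire at 60",
    "client: my pension pot is about 200,000",
]
# Each summary point carries the terms that should anchor it.
points = {
    "Retirement goal: age 60": ["retire", "60"],
    "Pension value: ~200,000": ["pension", "200,000"],
}
links = trace(points, turns)
print(links)
# {'Retirement goal: age 60': [1], 'Pension value: ~200,000': [2]}
```

A point that maps to an empty list of turns is exactly the hallucinated-house-value case: a claim with no source, flagged for a human to check.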

Yeah, I’d say the difference between engineering a product with this sort of technology versus using the retail, off-the-shelf version is highlighted very much by what you’ve just said. I mean, there are multiple parameters you can set within the model, in terms of the level of precision and things of that nature, the length of the summary. But really, you can set a whole range of prompt rules that define the output in the way that you want it. The honest truth is that at the moment the whole world is in the same position, determining what the gold standard is. We refer to that as prompt engineering.
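A sketch of what those prompt rules and parameters might look like in practice. The rule wording and the parameter names (`temperature`, `max_tokens`, in the style of common LLM APIs) are illustrative assumptions, not Aveni’s actual prompt engineering:

```python
# Illustrative generation settings: zero temperature for deterministic,
# compliance-grade output, and a token cap to bound the summary length.
PARAMS = {"temperature": 0.0, "max_tokens": 400}

# Prompt rules that restrict what the model is allowed to say.
RULES = [
    "Summarise only facts stated in the transcript.",
    "If a figure (value, income, date) is not stated, write 'not discussed'.",
    "If you are not sure, do not put it in.",
]

def build_prompt(transcript: str) -> str:
    """Assemble a guarded summarisation prompt from the rules above."""
    rule_text = "\n".join(f"- {r}" for r in RULES)
    return (
        "You are drafting notes for a financial adviser.\n"
        f"Follow these rules strictly:\n{rule_text}\n\n"
        f"Transcript:\n{transcript}\n\nSummary:"
    )

prompt = build_prompt("Client: I own my house outright.")
```

Raising `temperature` is the “licence to hallucinate” trade-off discussed later: more varied output, less repeatability.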


Determining what the gold standard of prompt engineering is, for each output within each use case, is a matter of trial and error, experimentation and adding different tooling. That’s what every company trying to adopt GPT is pretty much doing at the moment: figuring out the best, most reliable way to use the underlying capability for their use cases. But for the majority of the use cases we’re currently working on, you don’t want the actual knowledge in the output coming from GPT. The text generation, the summarisation, the contextual understanding, all that stuff is GPT. But GPT is pulling in knowledge from separate systems, to give us that guarantee that the output is accurate.

I suppose it’s just like humans: you can have a copywriter, but if you want a copywriter writing specifically about a particular industry, it has to be tweaked; essentially you have a specialist doing it, someone who knows the industry. So you have to blend the two together.

Exactly. That domain expertise needs to be reflected as a knowledge base in the system. I think

From a business perspective, especially working in a regulated industry where the hurdles for use are going to be quite high from a regulatory perspective, reliability and control are essential. So turn off any creativity and any sort of licence to hallucinate; that is super important. The other half, which is just general usage and creativity: it is mind-blowing how you can utilise what is essentially a mathematical model, trained on lots of data, to be so creative. You see amazing examples of people getting it to write a song by an artist that no longer writes songs. Write an Oasis song: say they’ve just got back together, the brothers now love each other, they’re releasing an album and they’ve appointed you, GPT, as the chief songwriter; write the big hit of the album. And it will write that song from a combination of the language that’s in all of their other songs, and it will do it within seconds. It’s mind-boggling. And the scary thing, and this is where you hear the Elon Musks of the world continually highlighting the dangers of this technology: for all intents and purposes, we’re at the start of this journey. The power of these models is, in fact, going to materially improve next week with the release of GPT-4, and over the coming years. I don’t think anyone working in this has yet found a ceiling.

There’s no sort of Moore’s law, nothing limiting the growth at the moment; they’re scaling their models and continually improving them. I’m sure there will be a cap at some point, but we’re at the start of the journey with these models. And that’s a genuinely scary thing when they’re already so good, notwithstanding the challenges of adoption.

So, my last question. We’ve come a huge way, and you can see the excitement and the amount of work going on behind the scenes; the immediate future is incredibly interesting. But where do you think we go from here? What do you think the trends are going to be over the next five years, in terms of implementation and what we’re going to see, particularly from a business point of view, in terms of implementations on the floor?

If your timeframe for that question is five years, I would expect every single role, pretty much every single role, in a company within financial services to have a new starting point powered by large language models. Things that have been historically difficult: think about compliance assurance, with every single customer interaction having a comprehensive assessment of vulnerability, Consumer Duty, suitability, changing circumstances, all driven by large language models.

So I think adoption of these models, powering functional areas and roles and jobs within the business, will be pervasive within five years; it’ll be everywhere. Will that result in a material change in the structure of the workforce within that timeframe? I don’t think so. I think it will be “human plus” quite a lot of the time. There will be new experiences for customers, new levels of automation, new levels of insight, new products, new areas of competition, but I don’t think it will materially change the structure of the workforce. If you widen that timeframe by another five or ten years, though, at least the potential for that is there. Essentially, a marketing department that used to have ten people in it might only need two, just because the majority of the underlying work can be done by a large language model that is tuned into your internal knowledge bases and trained on the tenor and tone of your corporate voice.

When you’re generating materials, whether text, video or images, your starting point is going to be 99% of the way there. So you might find that traditional functions can be supported by far fewer people, but the output is broadly the same, a similar process to what’s happening now. But yeah, I would expect the impact to be huge. I think regulation is going to be key here. Is regulation going to speed things up or slow things down? When is it going to catch up? And I think national governments are quite important in this space as well. At what point does all of UK business being powered by models that are hosted and created in a different country become something that a government should really think about?

Yeah. So it’s going to be a very interesting space. You see that with evidencing: the fact that it can be done means you’re now just going to be asked for it to be done. The trend is for more involvement and more use of it; it feels like it’s going that way.

I think so. I think there needs to be a sober assessment beyond the hype, a very steady, controlled adoption road, and things have to be done in the right way. We can see the propensity for headlines in this space, and I’ve got no doubt there will be some attention-grabbing headlines over the next year or two with things going wrong. So it’s all about doing things in the right way: taking appropriate control, not trying to go from stage one to stage five, and putting the appropriate guardrails in place.

Well, Joseph, thanks very much for the time, and for explaining it all to me, and where we’re currently sitting, because it is fascinating. Some of the things you’re at the forefront of, in terms of compliance and how you interpret it, sound very exciting: being able to do that search function, really understand it, work out what’s going on and how to improve things for customers. So it’s a very exciting space, and I appreciate you taking the time. Thanks very much.

Pleasure. Thanks, Chris.
