We Speak Games

As the games market continues to grow and yearly revenues are predicted to pass $200 billion in 2022, the UK’s first conference dedicated to voice in gaming – We Speak Games – launches in Bristol this month. Ben Ackland, the brains behind the operation, tells us more…

Over three billion people worldwide play video games, and the number of gamers continues to rise. Fuelled by the increase in smartphones and accessibility of online streaming platforms, video games are by far the most lucrative sector in the entertainment industry.

This month, creators, developers and strategic thinkers will be taking a step into the future of the games industry at the first We Speak Games summit. This eagerly anticipated event, which is expected to be hosted annually, will take place on 27 September in Berkeley Square and will be streamed live online to delegates all over the world.

“A few years ago Siri and Alexa didn’t exist – they weren’t even part of our vocabulary,” explains Ben Ackland, the brains behind We Speak Games. “Today hundreds of millions of people worldwide talk to these personal assistants on a daily basis.” Ben believes voice in games is set to follow a similar trajectory.

A creative technical officer at Bristol-based studio Web App Services, which is hosting the not-for-profit event, Ben works with an experienced team delivering solutions for web and mobile. Web App Services is already an established innovator in voice interfaces and conversational artificial intelligence (AI) for gaming and interactive entertainment.

“Voice control and conversation have huge potential to create more interesting and deeper ways of interacting with games,” he explains. “But it’s in its infancy in the UK. There have only been a handful of talks on it, and no significant events that would give innovators and early adopters the chance to come together. So I decided to run an event myself!”

Here, we caught up with Ben to find out more…

TBM: Can you tell us about your background and experience in the industry and how you became involved in We Speak Games?

BA: I’ve always been creatively driven – I’m a maker, a builder. I grew up playing adventure and strategy games and, more recently, have been creating my own. My background is in web and mobile technology, which has a strong overlap with mainstream gaming – known as ‘gametech’. Over the years, I’ve also worked on an early Oculus VR project, business-class event management software, and a live zombie survival experience.

As voice and conversation technology has become more prominent on smart speakers, mobile, and the web, I decided to explore what was possible by applying it to games. That started a couple of years ago, and since then I have given a number of talks on the subject, met some fantastic people in the process, and am now working on making this technology work for game developers. Throughout this journey, I’ve not found any real place or time when the voice-in-gaming community comes together. No focussed event existed in the UK, so I decided to start my own – and We Speak Games was born with the support of my friends and colleagues, and our inaugural speakers.

Why is Bristol the ideal home for We Speak Games? How are the tech and creative industries booming in Bristol?
Bristol is among the most successful European cities for tech investment in 2022 and ranks third among UK cities. Tech is Bristol’s fastest-growing industry, with over $1bn invested since 2014. Bristol is well known for its creative industry and outputs: music (particularly electronic – Xample, Massive Attack, Eats Everything), film and TV production (Wallace and Gromit, Casualty, Deal or No Deal), and a thriving art scene – just check out the diversity of street art around town.

The University of the West of England (UWE) offers degrees in games development accredited by TIGA (The Independent Game Developers’ Association), and students and local developers are also members of the Bristol Games Hub (one of 18 industry hubs in the UK).

In addition to this, Bristol continues to strive to be a green, sustainable, ethical city. These values and ambitions align with my own and those of my team: making technology work for people, and respecting the fabric of society in the process.

The city is big enough for this event to grow, and it’s also well-connected to the rest of the UK and Europe – by train and a large airport. With Bristol being named one of the best places to live in the UK by The Sunday Times, it’s also a fantastic place to visit. Bristol welcomes one and all – and so do we.

Can you explain what exactly people will be able to do in games with the use of voice input, conversational AI and computer-generated speech and how this technology will change games forever?

Voice input allows a player to speak to the game itself (e.g. navigate a menu or interface) and/or talk to the characters in it. One may be familiar with ‘controlling’ Alexa or Siri to perform tasks. This is stage one for games – voice control; stage two is having conversational interactions with characters – e.g. asking a character if they have a key for a castle, and them pointing you in the right direction (or not); stage three is having more freeform conversations with characters – e.g. formulating a plan together, discussing the strategy of an ongoing battle.

Conversational AI allows a system to determine appropriate responses to a player’s words or speech (voice or text input) in a way that appears intelligent – i.e. not entirely pre-programmed. This is making it possible to have a wider variety of responses, more relevant responses, and more dynamic responses (e.g. specific to a player) when interacting with characters and their worlds.

Computer-generated speech is the process of trying to create human-like (or in some cases, deliberately robotic or cartoon-like) speech in a computer, often from text and other contextual information. So when characters (or e.g. a storyteller) ‘talk’ to the player, their speech can be generated on-the-fly depending on the situation. Some responses will be pre-defined, some will be dynamic. Regardless, the dialogue is delivered consistently using each character’s voice. All three of these come together to deliver what we’d call a natural language interface between the player and game/characters.

How will this technology make games more accessible to people with disabilities?
I think it’s useful to consider this in two parts – understanding the game and interacting with the game. Computer-generated speech – also known as ‘AI voices’ – offers the potential to help a player better understand a game if they otherwise get a less-than-ideal experience, for example if they have limited vision or are blind.

Voice input – speaking to the game or characters in it – has the potential to provide a more engaging way to interact with the game. If standard controllers are a challenge to a player (which might include people who just aren’t familiar with them), using speech can offer a viable, intuitive alternative. I very much see these going beyond instructing the game to do something – e.g. ‘turn left’, ‘unlock the door’ – although that’s a core principle. We’ll go into this in more detail at the summit.

Can you tell us about the speakers at the conference and what audiences can expect from the event?
From the outset, we have aimed to cover a broad set of topics that fall under ‘using voice technology in games’; and in the process, consider what we can learn from the past, what’s possible now, and how we unlock the potential for these technologies going forwards. We expect to cover around 10 key topics in at least six talks, including audience Q&A. There will also be a handful of related demos along the way.

We’ve tried to encourage speakers with a variety of profiles to join us – from both inside and outside the industry, of any gender and ethnicity, and from research or application-led backgrounds. Our line-up of speakers includes Thomas Keane – developer of voice-controlled adventure game, Unknown Number, who has worked with some of the world’s leading tech brands including Xbox and Microsoft. Nomi Gallagher, from gaming charity SpecialEffect, will share a case study on the effectiveness of adding customised voice control to a mainstream game for improved accessibility. Chris Woolcott will also share his lessons from developing two popular role-playing games for Alexa, and will join us remotely from Nashville, Tennessee.

After the talks there will be a chance to network at The Square Club where we have a private area for a drink or two. Attendees can join the complete event in-person, or the talks remotely via a live stream. We’ve kept the prices as low as possible and this is a not-for-profit event.

As the UK’s first conference on voice in gaming for creators, developers and strategic thinkers, what do you hope this conference will do for the industry as a whole?
For creators, developers and strategic thinkers, I hope this summit will give them the knowledge, inspiration, tools, and a community in which to start exploring this technology – a head start if you like. I also believe it’ll give them some insight into privacy and accessibility, so that they can make decisions and strategies that take ethics and people into account when they make use of voice, cloud, and AI systems. For the industry as a whole, I hope the event raises awareness of the potential of voice, and gives those of us who are leading the way a chance to share our experiences and innovations. The event is also an opportunity for tech investors to explore the gaming space, and investors from the creative sector to explore gametech.

By 2030, what do you predict games will look like? What is the future of gaming in your opinion – and where does Bristol fit into it?
The metaverse is going to creep into our lives more and more, whether we like it or not, largely driven by big tech’s global mission. We can already see this in the games world, with the continued growth of immersive 3D platforms like Roblox and Fortnite. Hundreds of millions of people are already hanging out in them, and the time we spend in these kinds of worlds will likely increase as our mainstream experiences become more virtual and all-encompassing.

As we interact with these virtual environments across a variety of devices, I expect voice technology to continue to grow as a way to interface with these worlds and their characters. While we might tap a message to a virtual assistant today – e.g. a customer services helpdesk – typing messages won’t be the interface of choice on consoles and TVs. Voice will provide the most familiar way to communicate, through natural language.

Conversation with characters will offer deeper immersion in games – e.g. voice allows for a broader expression of emotion; it allows for negotiation; and it’s almost infinitely descriptive.

Alongside this, as developers and studios leverage more cloud services, they will become aware of the value of their data and how to manage and protect it in an increasingly connected environment.

I hope that we’ll see a continued push to increase the accessibility of games, and a commitment from project leaders to consider players’ privacy as devices become more and more integrated into our lives.

Bristol can play a substantial role in this – as a top UK and EU city of creativity and innovation, with a genuine desire to be ethical and welcoming, and an exciting location for in-person events. Perhaps we’ll even see the first voice-in-gaming teaching module here in the UK?

We Speak Games summit will be held at Origin Workspace in Berkeley Square from 2–6pm on 27 September; wespeak.games