
#136 – Matthias Pupillo on Enhancing WordPress With AI Translations

Transcript
[00:00:00] Nathan Wrigley: Welcome to the Jukebox podcast from WP Tavern. My name is Nathan Wrigley.

Jukebox is a podcast which is dedicated to all things WordPress. The people, the events, the plugins, the blocks, the themes, and in this case, enhanced WordPress with AI translations.

If you’d like to subscribe to the podcast, you can do that by searching for WP Tavern in your podcast player of choice. Or by going to wptavern.com/feed/podcast. And you can copy that URL into most podcast players.

If you have a topic that you’d like us to feature on the podcast, I’m keen to hear from you and hopefully get you, or your idea, featured on the show. Head to wptavern.com/contact/jukebox and use the form there.

So on the podcast today, we have Matthias Pupillo.

Matthias has extensive experience in the technology and creative sectors, and is currently working as the co-founder of FluentC AI, an AI powered language technology company.

With a background in technology, he’s focusing on developing solutions to enhance communication across different languages and platforms. He’s been involved with WordPress since its early days, around version 1.2, and has a rich history of web design and consulting, having worked on hundreds of WordPress websites. But it’s only recently that he’s become more engaged in the WordPress community through events like WordCamp Buffalo.

In the podcast today, we talk about AI driven language translations, particularly focusing on Matthias’s work with FluentC, which is his translation plugin for WordPress. It supports multithreaded simultaneous translations of up to 140 languages, enabling your pages and posts to be offered in other languages in just a few moments.

We cover the differences between AI models designed for translation and general-purpose models, such as ChatGPT and Llama, which aren’t specialized for this task, and how his platform builds a contextual layer above the dedicated translation engines.

He emphasizes the importance of context and diverse multi-lingual data in producing high quality translations. FluentC’s functionality involves local storage of translated content in an effort to maintain website speed. This is done using native WordPress hooks, and URL modifications.

Matthias also offers his thoughts on the ongoing multi-lingual support phase of the Gutenberg project. And his hopes for FluentC to evolve from a standalone plugin to an API, which could be used by WordPress Core.

We get into the broader implications of AI in translation, the need for open source models to compete in this rapidly evolving space, and the parallels between AI evolution and past trends like blockchain, and web 2.0.

If you’re interested in the intersection of AI and WordPress, or looking to enhance your website’s multi-lingual capabilities, this episode is for you.

If you’d like to find out more, you can find all of the links in the show notes by heading to wptavern.com/podcast, where you’ll find all the other episodes as well.

And so without further delay, I bring you Matthias Pupillo.

I am joined on the podcast today by Matthias Pupillo. How you doing Matthias?

[00:03:54] Matthias Pupillo: I’m doing fantastic, Nathan.

[00:03:55] Nathan Wrigley: Very, very nice to have you with us. We had a little bit of a chat before we pressed record, and in that chat, Matthias revealed to me that he’s got a long history with WordPress, but not necessarily the WordPress community.

Matthias, we’re going to be talking about AI, transcribing, transliteration, multilingual, all that kind of stuff today. Before we do, would you just give us a quick potted bio of your history with tech, WordPress, however far you want to go back.

[00:04:19] Matthias Pupillo: Oh yeah, absolutely. So I’ve been a software, I have to say commercially, building software for 25 years. I’ve been recreationally building software for 35 years. So I started pretty much when I was eight building code.

And I started in WordPress with 1.2. I was writing hand coded HTML in Microsoft Notepad, and so it was a dramatic shift back then in 2002, 2003.

And I was running my own consulting firm, doing web design professionally, and found WordPress by, it was a divine intervention one day. Someone wanted to pay me for editing, and I didn’t know how to write software, besides HTML, CSS and Java. And Java back then was not building a website. It was a complicated journey and it was fun.

The day WordPress 2.5.5, when we had tabs, that was great. And then we got 2.6 and it went horizontal menu, that was a fun day. It’s been a long road with WordPress. I think I’ve built two or three hundred websites with it, maybe more. Not to mention coaching, staffing, and like guidance from an architecture standpoint.

[00:05:22] Nathan Wrigley: That’s a really long and storied, well, a really long story basically, so that’s lovely. But however, one of the things that you said a moment ago was that, although you’ve been using WordPress for a long time, the community side of it is more recent I think. Only in the fairly recent past that you’ve got yourself out to events, and started to interact with the community more. Is that right?

[00:05:41] Matthias Pupillo: Yeah, that’s right. Yeah, so I built the translation company FluentC, we built for apps, and GraphQL, and other integrations. And I forgot WordPress, I really did. Our website was built in WordPress, our marketing flow, our CRM, everything was in WordPress, and I forgot to build the engine.

So, out of my shame of forgetting that, I rapidly built the plugin. Then spent four months trying to get it approved, and then joined the community in person. And my first WordCamp was in Buffalo this last May.

[00:06:09] Nathan Wrigley: You alluded to it earlier, but I might as well get the URL out there. So FluentC is the URL, but it’s not what you are thinking, I suspect. I imagine you are thinking it ends in a Y, but my records here show that it’s fluent, and then the letter C, so F-L-U-E-N-T-C, dot io. So, fluent, the letter C, dot io. What is this service? And we’ll come to the WordPress component in a moment. But obviously you built the SaaS version, if you like, first. What is its MVP, if you like?

[00:06:39] Matthias Pupillo: Yeah, so it came out of a problem that, I’ve built a lot of apps for healthcare, and it always offended me that they were only in one language. And the cost, time and effort to build a multi-language app was just always put to the back end of the priority list. And in healthcare it’s incredibly important that the patient knows what the directions are, knows what the medications are, knows what their appointments are.

And so I built FluentC to handle that, and to build that multi-language, make it app developer friendly, our SaaS component is a no code solution, no code translation. If you want it native, or built into the app itself, use our GraphQL, and our i18next integrations. And then it was just, it just had to exist because patients need that native touch, and it needs to be super easy.

Developer tools are always built by people who aren’t developers, it has always been a sore point on my side, that people just cannot, developers are not the first thought. Like you have to write code for a living, we have to make this easy for you. So we built that first.

[00:07:41] Nathan Wrigley: Can I just ask, in the US, given that you’ve mentioned the healthcare industry, in the US is there an obligation for that particular industry to have things translated into multiple languages? Here in the UK where I’m based, if you are going to produce something and it’s going to go into a hospital, for example, I think there is legislation around that. I don’t know what the cornucopia of languages are, but I know that there is some responsibility there. Is the same true in the US? Does it cover a particular bunch of languages? Is there a minimum requirement that you can have before you can say that’s done?

[00:08:13] Matthias Pupillo: Yeah, there is a minimum number of, relative minimum number of, languages. It varies state by state, and it’s for the in-person experience. If you show up to the hospital and you only speak Spanish, they have to have someone in the building that speaks Spanish, or a translator available by phone. Or if you speak Russian, or Czech, or Slovakian. And they have to take care of you, and they have to treat you, and they have to give you care, and then they hand you the directions to leave in English.

They send you your email notifications in English. After just saving your life in Croatian, they then hand you an English sheet of paper with the directions on how to get your next step care.

The obligation, you know, Canada has that hard list obligation of two languages, French Canadian and English. But the US is sort of the in-person experience, but not the digital experience. Not your emails, not your text messages, your notifications. None of that is compelled to be in multilanguage.

[00:09:05] Nathan Wrigley: Let’s turn our attention away from healthcare and just think about, I don’t know, like an e-commerce store or something like that. I guess this whole idea of getting things translated just makes perfect sense over there as well. Because for the last 10 years we’ve been all getting more and more into buying things online from our mobile phones, in the comfort of our own homes, sitting in an armchair and what have you.

And increasingly, a lot of the properties are crossing international borders, so it’s not difficult for me to buy from a US company, or a European company that isn’t based where I am, knowing that the shipping and everything will be taken care of. But I guess the language component, for me, I am only an English speaker. So if I, for example, was to come across a great deal on a French website, I wouldn’t know what it was because it would simply be in French. So I guess there is a real good economic argument if you are hoping to participate in international trade, there’s a really good compelling argument to get behind this.

[00:10:01] Matthias Pupillo: There really is because e-commerce is one of the ones that is, once you fulfill the product, you’re done. You’ve shipped it. Some amount of people will call support, some will email, some will use chat. For the majority, overwhelming majority, of e-commerce sales, it’s ship it, and you’re done. You really don’t care if the person spoke only German. You ship them the great product, maybe you have multi-language directions if they need directions, we should be designing products that don’t need directions, like a T-shirt.

But yes, the ability for a website and e-commerce to have those additional keywords in those native languages adds billions of potential customers. We did the analysis, and if you just cover Hindi, Chinese, English, French, and German, it’s about five languages, you add about 4.2 billion potential customers. You’re adding millions of new keywords. You’ve spent so much time, most of our WordPress, and the Woo people, spend so much time optimising their titles, their descriptions, their framework for communication on their products, and then that’s it, it’s in one language. And it’s perfect, it’s beautiful, but it really doesn’t serve them for 4 billion customers.

We support about 140 languages. There’s about 40 languages that cover 8 billion people. Those searches are the key. You can expand at no cost, your SEO presence, just by having the translations built in in a search engine optimised way. And e-commerce fulfillment, all your tools, that’s an email and chat. The phone one, you’re going to struggle with, you may have to build some extra capabilities. That’s where the 4 billion customers need to make enough revenue for you to have some support for them. But if you have 4 billion new customers available to you and you don’t have the revenue, I’d like to question your product.

But from an e-commerce standpoint, selling globally is super easy now. Woo handles the transactions, handles the currency, you can get some plugins for that. You can do the fulfillment. It’s super easy to cover all the taxes, but you forgot the descriptions. You forgot the titles, the tags, the meta tags, all of that stuff. And the AI translation these days, because you need a special language model, you’re using special descriptions, and they’re really good at this. It really is a known thing to translate, and it’s super easy to adjust and bring in that new traffic. And I think that’s the important thing about e-commerce, yeah.

[00:12:22] Nathan Wrigley: If we rewind the clock, I don’t know, 25 years or something, before the internet had taken off, apart from a bunch of really nerdy academics at CERN or something like that, the idea of owning a shop which would be communicating globally was really more or less pie in the sky. You know, there are a few giant companies that we all know of that were crossing international boundaries, car manufacturers and things like that, these giant entities.

But people who had a regular shop, which you might describe as a bricks and mortar shop, they’re not going to be doing that because they’re locally based, there’s no prospect of doing that, all of it is completely out of bounds.

And then along comes the internet. Suddenly the boundaries are collapsing. And although it’s still probably out of bounds for many people, it’s becoming increasingly obvious to any store owner that you can ship things. All of that shipping capability has been taken care of, the logistics can be taken care of by somebody else.

But there’s this missing piece. And I guess, again, if we rewind the clock about 25 years. The idea of translating things must have been fabulously expensive, because for every piece of text that you wished to translate, presumably you had to get a human being to read it, spend time wrangling it, and then giving you the translation, which you would then need to print in some way.

But now, the advent of AI, and I know that AI is all the rage at the moment, but it does seem like a lot of the AI stuff is kind of hype. And some of it might have utility, but much of it doesn’t seem to have utility. But it really feels like the translation piece is really credible. I don’t know how perfectly accurate it is, whether it’s 95% from English to French, say, or 99.9, I don’t really know, but because that’s such a logical thing to do, it does seem like an area in which a computer could excel at. Is that the case? Is translation from one language to another, do computers do this increasingly well, accurately?

[00:14:18] Matthias Pupillo: Yeah, computers are doing it very well. And it is a different AI than the ChatGPT. ChatGPT, Llama, some of those things are terrible at translations, you need a special one for it. And the special ones are getting very good at standard, by the book translations. They’re still very bad at a couple of things, colloquialisms. We have a great phrase, six of one, half a dozen of the other, that doesn’t make any sense in Spanish.

We lose such great things, like there’s a wonderful Spanish word, an event called the quinceañera, which is your 15th birthday party. There’s cultural significance in language that we lose. And we are trying to fix that. Everyone’s trying to fix that. It’s as bad as it’s going to be. But as far as formal communication goes, if you were setting a date and a time for an event, or you were describing what is best described as a pair of shoes, or fixed product file folders, or something like that, it does very good at those things.

It doesn’t do good at tone and colloquial mannerisms. But other than that, it’s pretty good. The AI translations are getting so good. And the cost to review them. So I was once on a project years ago, and this is before this. They built an app, then they sent it to a translation company with no context, so it didn’t say what was on the page. So there’s this wonderful place in the world called Turkey. There’s also a wonderful piece of food from a bird called turkey. And so here you are with the word Turkey, the country, or turkey, the meat, being all across your website. So even humans without context get it wrong. And so you really need that context engine.

We also have this wonderful thing in most websites called the back button, we do that. We also as humans have a part of our body called the back. So in other languages, those aren’t the same word. And those type of things, humans get wrong so often because they just see the word back, and they just see the, most translation tools that send them over to the company to even manually do it, they do it one word at a time. One line at a time. And there’s no context, so they’re just like, I guess it means the country, it’s capital, I guess we’re not talking about food, but it was a recipe guide and it’s your turkey dinner. And so those are the type of things that even humans get wrong still.
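Editor's note: the "back button versus your back" problem can be sketched as a lookup keyed by context as well as by text, similar in spirit to gettext's message contexts. This is a minimal sketch, not FluentC's implementation; the catalog contents, the context labels, and the `translate` helper are all hypothetical.

```python
from __future__ import annotations

# Hypothetical Spanish catalog. Each key is (context, source text), so the
# same English word can map to different translations.
CATALOG_ES: dict[tuple[str, str], str] = {
    ("navigation", "Back"): "Atrás",
    ("anatomy", "Back"): "Espalda",
    ("country", "Turkey"): "Turquía",
    ("food", "turkey"): "pavo",
}


def translate(text: str, context: str, catalog: dict[tuple[str, str], str]) -> str:
    """Return the translation for text in this context, or text unchanged."""
    return catalog.get((context, text), text)


print(translate("Back", "navigation", CATALOG_ES))
print(translate("Back", "anatomy", CATALOG_ES))
```

Without the context half of the key, a translator (human or machine) has to guess which "Back" is meant, which is the failure described above.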

And then the management of it, back to developer friendly. Could you imagine getting emailed a spreadsheet every week as you’re trying to go to prod? You’re building a website, here you are, new post, new blog, new product. Okay, I got to email, and I got to check my email, and then I got to save it on that one, and I gotta create a new one, and then I got to save it again because, oh no, I changed the description, now I have to go back to the translator.

Because in your native language, every keystroke, comma, everything matters. So the human translate, I got to go back, and now I have an entire workflow added to my post and pages. And I love my post and pages, I just want to go in and type and hit save. I barely want somebody else to review it. This is WordPress, we’re cowboys, we go right to prod. We hit save, we hit publish, we hit draft for a little bit, but we’re going to prod. And that’s the difference with these older systems and how to do this is that. With humans involved, you either have to have a huge team or you have to do it yourself, and it still impacts your workflow.

The real problem with translations is it’s so time consuming. It’s so much effort. We thankfully have a connected enough world that we can find a chain between every language on Earth. We can find a person who speaks English to French, we can find a French person who speaks German, we can find a German person who speaks Polish, and we can connect the whole world, great.

That’s got to get much faster, and we have to connect the world. If we could talk to each other like this, this is where we want to go. This is what I want to do. I want this multi-language, Star Trek had it right. The universal translator built into our ears is the correct technology. Every interface you look at is in your technology, localised to your context. That’s the vision. That’s what I’m working on.

[00:18:06] Nathan Wrigley: You were talking about a workflow there. The AI presumably is significantly quicker, because if you were to be employing humans to do this, presumably you’d send an email with the text, wait a little while, maybe a day, or a week, or whatever it may be, and then it comes back, and then you’ve got to then copy and paste that into the blog post. And I realise that there are WordPress plugins out there that will handle that more or less seamlessly on the back end of your website as well.

But I’m guessing that AI can handle, let’s say you’ve got a, I don’t know, 1500 word article or something, I’m going to guess, I actually don’t know, but I’m guessing that it could translate that in moments. Be it with all the caveats of how accurate it is and what have you. It’s a fairly straightforward amount of time, so that you could see the results in seconds. Is that accurate?

[00:18:49] Matthias Pupillo: Yeah, as a software architect, I can no longer abide by anything longer than three seconds. So our average response time is 1.2 seconds, over at FluentC, and then we are trying to see if we can get that down. That seems too slow to me. And so 1.2 seconds for a whole article.

[00:19:05] Nathan Wrigley: Can you translate multiple languages simultaneously? Maybe on the FluentC backend, it’s actually queuing them and doing them one at a time. But would I be able to, for example, say, okay, my target audience is the Chinese market, the Japanese market, the Philippines market, throw in, I don’t know, Vietnamese and a bunch of other things, you know, languages that I’m really not that familiar with. Can I get the same result? 1.2 seconds later or thereabouts, they will all be taken care of and ready to go.

[00:19:30] Matthias Pupillo: Yeah.

[00:19:31] Nathan Wrigley: It’s phenomenal.

[00:19:32] Matthias Pupillo: Yeah, we built it multithreaded. We have different channels for each language, and basically you can hit 140 languages, don’t do it, you don’t need some of these languages. Some of these are like, we’re thinking about adding Klingon and Cardassian, we’re talking about mythical languages. If you’d like to speak Elvish from Lord of the Rings, we’re working on that too. We do parallel processing. So we handle all of them in about 1.2 seconds. So in about one second, two seconds, you’ll be able to have all 40 languages, five languages, downloaded to your copy of WordPress.
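Editor's note: the "one channel per language" idea can be illustrated with a thread pool that sends every target language at once, so total time is close to the slowest single request rather than the sum of all of them. This is only a sketch under stated assumptions: `translate_one` is a hypothetical stand-in that sleeps instead of calling a real translation service, and nothing here reflects FluentC's actual backend.

```python
from __future__ import annotations

import time
from concurrent.futures import ThreadPoolExecutor


def translate_one(text: str, lang: str) -> str:
    """Hypothetical stand-in for one network call to a translation service."""
    time.sleep(0.05)  # simulated network latency
    return f"[{lang}] {text}"


def translate_all(text: str, langs: list[str]) -> dict[str, str]:
    """Translate text into every language in langs concurrently."""
    with ThreadPoolExecutor(max_workers=max(1, len(langs))) as pool:
        # pool.map yields results in the same order as langs.
        results = pool.map(lambda lang: translate_one(text, lang), langs)
        return dict(zip(langs, results))


print(translate_all("Hello", ["de", "fr", "ja"]))
```

Threads suit this case because each request spends its time waiting on the network, not on the CPU.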

[00:19:59] Nathan Wrigley: You mentioned that ChatGPT, which is the one I think most people are familiar with, you mentioned that that is not quite as robust for the language translation. What is the difference then between the one that you are using? What is that one called, and how is that different? Is it just purely built to do the job of translating one language to another?

[00:20:17] Matthias Pupillo: Yeah, so the regular ChatGPT, the Llama, they’re only fed documents in certain languages. And they can read, they’re fed French documents, it’s all about access to data. This is why we have an accidental bias in AI. We’re only feeding it English content and, you know, we’ll say Western languages because that’s all we have access to.

We have very limited documentation written in Swahili. There are very few books. There are very few books in Hebrew. The Arabic books do not have the wealth of digital knowledge. There are so many books in Arabic that are not digital. They’re handwritten books, they’re handwritten things from historical precedents. But every book written by Charles Dickens is online. Almost everything written in the King’s English, every bit of Shakespeare.

But some of these other cultures do not have the digital free access for these AI companies to ingest. And their models, their tokenisation, their transformers, all of that stuff on their side is not designed to go from token to language, it’s designed to go from token to token. So you need a specific large language model, and a neural net just tied to multilanguage. So Google Translate, DeepL, AWS Translate, those are the big players in the game. And then what we’ve done is we’ve built a layer on top of them. And we’ve built the FluentC LLM to be context driven.

So instead of translating one word, we’re sending in FluentC context. So because we’re connected to your WordPress site, we know all of your pages. We know all of your about. We know your tagline. We know your title. We know all of that stuff about your site, and we can include that, and we process the context to make sure we have the context right.

And then we do a scoring algorithm across the big cloud translation engines to really drive a good output. So we’ll know if it’s a bad translation. And then, you know, we’ve just recently launched an edit capability, so if you do notice, you can just go in and hit edit, and change. But yeah, the ChatGPTs do terrible translations, and it’s just not designed for it.
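Editor's note: the transcript does not describe how the scoring works, so the following is only a toy illustration of the general shape: send the same text plus site context to several engines, score each candidate, and keep the best one. The engine functions, the `expected_terms` context field, and `toy_score` are all hypothetical.

```python
from __future__ import annotations

from typing import Callable

Engine = Callable[[str, dict], str]


def pick_best(
    text: str,
    context: dict,
    engines: dict[str, Engine],
    score: Callable[[str, dict], float],
) -> tuple[str, str]:
    """Return (engine name, translation) for the highest-scoring candidate."""
    candidates = {name: engine(text, context) for name, engine in engines.items()}
    best = max(candidates, key=lambda name: score(candidates[name], context))
    return best, candidates[best]


# Hypothetical engines: one reads "turkey" as the country, one as the food.
def engine_a(text: str, context: dict) -> str:
    return "Turquía asada"


def engine_b(text: str, context: dict) -> str:
    return "pavo asado"


def toy_score(candidate: str, context: dict) -> float:
    """Count how many context-derived expected terms appear in the candidate."""
    return sum(term in candidate.lower() for term in context.get("expected_terms", []))


site_context = {"site_title": "Holiday Recipes", "expected_terms": ["pavo"]}
print(pick_best("roast turkey", site_context, {"a": engine_a, "b": engine_b}, toy_score))
```

A real scorer would be far more involved; the point is only that context travels with the text and that candidates are compared rather than trusted blindly.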

[00:22:12] Nathan Wrigley: So when this podcast episode airs, I will use an AI, and we’re both speaking in English, and it’s pretty good with the UK accent that I have, and I have no doubt that it will be excellent on the accent that you have. So I will feed the audio into that, and within a minute, less than a minute, it will have transcribed that audio. And it will have done, to my eye, about 95, somewhere between 95 and 97% accurate. There’s always little bits, for example, it’ll just mishear the slur between one word and another, and so it’ll misunderstand that word, but it’ll do a pretty good job.

The thing about that is, it’s just trying to do one word at a time, you know, discretely. What’s that word? What’s that word? What’s that word? But then if I was to get that translated, let’s say, into another language, French, German, whatever we pick. When I’ve had the opportunity live to use Google Translate, it’s kind of interesting watching that happen because on the screen as I’m saying English, I’ll see the Italian words being printed out, and then there’ll be like a little pause, and things sometimes get deleted in real time. And so somehow it’s thinking, okay, that word wasn’t what I thought it was.

So what I’m trying to say is, when it’s translating, it’s not taking one word strictly at a time, because that would just be junk. We know that the order of French sentences, for example, is entirely different. They put words before other words and so on. The order in English is completely different to the order in French. Presumably you have to take that into account. So it’s not just, if we were to watch this happening, it’s not linear. It must take the whole sentence as a whole, and then have a guess, and then rewrite its guess. How does that all work?

[00:23:49] Matthias Pupillo: Yeah, and it works by, the more you give it, the better it is. So a one word translation, turkey, meat, country, in English. But if you give it a paragraph, and you start talking about a city in Turkey, and you start talking about a neighborhood, and you give it context. You have to have context of everything else you’re talking about.

A word, a phrase, you’re at a bar versus you’re at church. You’re at your doctor’s office versus just walking down the street. Those are contextual elements you have to give to those models so they know who you’re talking to. If you put relationships between two people, a producer and a customer, a transaction. Or a mother and a daughter. Those are different contexts, and they’re going to speak differently to each other. And you have to feed all of that in.

That’s where pre-processing translations is a really important thing. The Google Translate on Chrome does a great job, it just doesn’t have all the context. But the context of translations is key. Where we’re talking, why we’re talking, when we’re talking, those are all different things in every language.

[00:24:51] Nathan Wrigley: So, do you have a plugin which is linking up to your SaaS backend to do this? And how does that interface with, I don’t know, let’s say the block editor. How do I, for example, if I created just a post, or a page, or something like that, how does it all look? Do you have a version on the repo, or is this just a premium thing? How does it work?

[00:25:10] Matthias Pupillo: Oh yeah, the FluentC plugin is on wordpress.org, and that was a big challenge for me. Coming back to the community, I was unaware that there were standards, and that took a few months. I didn’t know comments had to end in a period. That was a long-term feedback in my first publication, I’d go a week and then I’d have comments wrong. It’s like, okay then, I don’t think that’s critical to the app.

But the FluentC plugin works in the background. Every time you post or publish anything, we pick it up, and then on that traffic we then store the translation locally to your copy of WordPress without object duplication.

The other thing that, being in WordPress so long, WPML and Polylang have been running the game for, I don’t know, since the beginning. And the object duplication was always a problem to me. I can’t grow WordPress to enterprise scale if I have a German version of the page and an English version of the page, I can’t integrate that content. If I have an e-commerce flow, we talked to some people, and I have a backend not in WordPress, and I’m feeding WordPress and Woo in my front end, I can’t have five copies of the same product. I can’t have object duplication. I need product number seven, I need all my backend systems to go to number seven, I can’t have a different order ID and things like that.
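Editor's note: the contrast with object duplication can be sketched as a data model in which there is exactly one product record, and translations live in a side table keyed by object ID, language, and field. This is an illustrative sketch only; the `Product` and `TranslationStore` classes are hypothetical and do not describe the plugin's real storage.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Product:
    id: int
    title: str
    description: str


class TranslationStore:
    """Hypothetical side table: (object ID, language, field) -> translated text."""

    def __init__(self) -> None:
        self._rows: dict[tuple[int, str, str], str] = {}

    def put(self, object_id: int, lang: str, field_name: str, text: str) -> None:
        self._rows[(object_id, lang, field_name)] = text

    def get(self, object_id: int, lang: str, field_name: str) -> str | None:
        return self._rows.get((object_id, lang, field_name))


def render_title(product: Product, lang: str, store: TranslationStore) -> str:
    """Use the stored translation if present, else fall back to the original."""
    return store.get(product.id, lang, "title") or product.title


product = Product(id=7, title="Running shoes", description="Lightweight trainers.")
store = TranslationStore()
store.put(7, "de", "title", "Laufschuhe")
print(render_title(product, "de", store))
```

Because every language resolves to the same ID, backend systems that reference product number seven keep working regardless of which language the visitor sees.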

And so we really built the plugin to solve that problem. And the FluentC plugin, it’s on wordpress.org, and you just download it, install it, sign up, pick your languages. We’re giving away a free language still, so there’s one free language out of the gate, and then you can add and subscribe to more of those, though one language is free.

[00:26:39] Nathan Wrigley: So let’s say, for example, I’ve got a product with an ID of seven, and I want that product to have five languages. We don’t need to get into which ones, but five languages. Is the product ID for all of those seven? You are saying, does that differ from other solutions that may be out there? And we’re not going to get into the competitive differences and what have you but, is there, a difference there?

Some of the options that are available, they might create different IDs because they’ll have the German one, and the Spanish one, and what have you. Whereas you are saying, it’s all handled in ID of seven. And if that is the case, how does it do the translation? How does it pull out the English text and swap it for the German on the fly?

[00:27:16] Matthias Pupillo: Yes, thankfully the fantastic developers in the community at WordPress have given us hooks. And so as the hooks render the front end, and as they’re done, we intercept the hooks, we run some processing on it, and then we store up right before it’s viewed. So we have tricked WordPress into thinking there’s a German page. We have modified the URL, so it’ll be dash de, and then it’ll be the product name. And so we have tricked WordPress using hooks and standards.
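Editor's note: WordPress itself is PHP, so the following Python is only an imitation of the pattern described: read a language code from the URL, then let a filter callback swap the text just before it is rendered. The `add_filter` and `apply_filters` functions here are small stand-ins modelled on the WordPress functions of the same name, the `/de/product-name` URL shape is an assumption (the transcript only says the URL gains a "de"), and the translation table is hypothetical.

```python
from __future__ import annotations

from typing import Callable

SUPPORTED = {"de", "fr", "es"}  # hypothetical enabled languages
TRANSLATIONS = {("de", "Blue shoes"): "Blaue Schuhe"}  # hypothetical local store

_filters: dict[str, list[Callable]] = {}


def add_filter(hook: str, callback: Callable) -> None:
    """Register a callback on a named hook (imitates WordPress add_filter)."""
    _filters.setdefault(hook, []).append(callback)


def apply_filters(hook: str, value, *args):
    """Pass value through every callback registered on the hook, in order."""
    for callback in _filters.get(hook, []):
        value = callback(value, *args)
    return value


def split_language(path: str) -> tuple[str | None, str]:
    """Split '/de/blue-shoes' into ('de', '/blue-shoes'); no prefix gives None."""
    parts = path.strip("/").split("/", 1)
    if parts[0] in SUPPORTED:
        rest = parts[1] if len(parts) > 1 else ""
        return parts[0], "/" + rest
    return None, path


def translate_title(title: str, lang: str | None) -> str:
    """Filter callback: return the stored translation, or the original title."""
    return TRANSLATIONS.get((lang, title), title)


add_filter("the_title", translate_title)

lang, path = split_language("/de/blue-shoes")
print(path, apply_filters("the_title", "Blue shoes", lang))
```

The original post is looked up by its normal path; only the rendered output changes, which is what lets a single object serve every language.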

The block editor is super fun. That was super fun to get that all integrated. Because for most of the cases, we found that most people actually don’t speak Japanese. Most English producers of content do not speak Japanese, so they don’t want to be bothered by it. And then as you go through that workflow, what always bothers me is anything that gets in the way of publishing. We’re independent free thinkers, we’re self publishers, right? The whole goal of WordPress is independent publishing, right? It was originally a blogging platform, and it was meant for us to get our word out there and communicate in an internet that didn’t support it. But FluentC plugin is designed to be seamless, and if you want to control it, you can. But other than that, you just ignore it. And I hope you just go to my app and hit subscribe, and then you never touch it again.

[00:28:25] Nathan Wrigley: How do you, on the backend, how do you see, are you able to see the German translation? Does it sort of store the German in a meta field somewhere, or?

[00:28:33] Matthias Pupillo: Yeah, actually we just published it yesterday, that you’re actually able to, as of yesterday, go in and see all of the translations. We use transients, and so we’re using transients in the WordPress database, and it’s super fast because you never hit my cloud. Once it comes all the way back to you during that initial load, that 1.3 seconds per page thing like that, then it’s right from your page, and right from your server. You don’t call back. You’re not dependent on me after you get the translation, so it’s super fast.
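Editor's note: WordPress transients are key-value entries with an expiry time, stored in the site's own database. The caching behaviour described above (pay for the remote call once, then serve locally) can be sketched in Python as follows. The `TransientCache` class, the key format, and the one-hour lifetime are assumptions made for illustration, not details from the plugin.

```python
from __future__ import annotations

import time
from typing import Callable


class TransientCache:
    """Tiny in-memory imitation of a key-value store with expiry."""

    def __init__(self) -> None:
        self._data: dict[str, tuple[str, float]] = {}

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._data[key] = (value, time.monotonic() + ttl_seconds)

    def get(self, key: str) -> str | None:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired entries behave like misses
            return None
        return value


def get_translation(
    cache: TransientCache,
    post_id: int,
    lang: str,
    fetch: Callable[[int, str], str],
) -> str:
    """Return the cached translation, calling fetch only on a cache miss."""
    key = f"translation_{post_id}_{lang}"  # hypothetical key format
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = fetch(post_id, lang)
    cache.set(key, value, ttl_seconds=3600)
    return value


remote_calls: list[tuple[int, str]] = []


def fake_fetch(post_id: int, lang: str) -> str:
    """Hypothetical stand-in for the one-time call to the translation cloud."""
    remote_calls.append((post_id, lang))
    return f"translated post {post_id} ({lang})"


cache = TransientCache()
get_translation(cache, 7, "de", fake_fetch)
get_translation(cache, 7, "de", fake_fetch)
print(len(remote_calls))
```

The second request never reaches `fake_fetch`, which mirrors the claim that visitors are served from the site's own server once the translation has arrived.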

As a software architect, I’ve been doing that for 20 years, performance is key. If we slow down all of WordPress because we want to support more people, that’s counterproductive. We can’t add billions of new people to our sites and then, oh yeah, everyone’s now one and a half, two seconds slower.

[00:29:15] Nathan Wrigley: I don’t know how closely you keep your eye on the roadmap for WordPress, but we’re in phase three of the Gutenberg project. And at the moment that, broadly speaking, could be categorised as collaboration. There’s a whole bunch of other things thrown in there as well, but the idea of having some interface which we can communicate with in real time with other people, and that’s yet to happen.

But the fourth phase is all about this topic, is all about multilingual and what have you. And I imagine that there is an opportunity there, but also maybe there’s a little bit of trepidation. Because, although the scope for this phase four hasn’t been exactly ironed out really, and there’s a little bit up in the air because we haven’t got there yet. I don’t know what your thoughts are, whether or not you are building a business which may end up being completely upended by WordPress Core. Or whether you think actually there’s an opportunity for me here because I’ll be able to bolt into whatever is built. So that question is, it’s not very targeted, but hopefully you’ve got an intuition. Phase four is coming, it’s going to be exactly in the ballpark of what you are doing. So does it offer you an opportunity, or is it something to be worried about?

[00:30:24] Matthias Pupillo: We are looking forward to participating in phase four. Now that I’m actively in the community, I’d love to help with phase four. We are dreaming of a world where you just use FluentC’s backend, you don’t need to install our plugin, which would be fantastic. If we could get standardisation among the hooks, the translations, the power to do it and edit it, that would be fantastic. We view all of the phase four talk as a super impactful, important upgrade to WordPress, to make it really global publishing software.

[00:30:55] Nathan Wrigley: So in that scenario, perhaps the intention for you is to pivot away from being a plugin, more to being an API basically. You know, you go to your service in the same way that you do for ChatGPT, for example, and you get a key. And you paste that into some core component of WordPress, which ships with vanilla WordPress, and then your translation has just happened on the fly. That’s interesting. So that’s a potential direction you are hoping it might go in, right.

[00:31:22] Matthias Pupillo: Yeah, that would be great. If we end up that WordPress does it so well that all of the translation plugins are irrelevant, and now we’re competing on API qualities, I’m perfectly fine with that. I’m perfectly fine with building a better mousetrap and helping WordPress. Honestly, I just want the world to get smaller and make this easier.

So I think we’re going to do fine in that world. I think we’re going to be proactive, I think the developer tools we’re coming out with in the next few months, to actually automatically get all of your language files ready to go, and at least one version of translation ready to submit with your plugin, will make the world smaller.

I think having multi-language themes with 1.2 second response time is pretty good, so that theme developers can start to plug in. We think there’s still an area around WordPress that’ll make it better, and then if WordPress can standardise the way it thinks about multi-language, standardise the control mechanisms. We’ve had to invent a lot of stuff that, once it’s standardised, we think it’ll be better for everyone. And yes, it will allow more up market entrants, and yes, it will be more open, but that’s WordPress.

[00:32:29] Nathan Wrigley: What’s the nature of the LLMs? I know that you drew a distinction between the LLMs that you’ve decided to use and the layer that you’ve added on top, and the ones that we’re all familiar with, ChatGPT and what have you. What I’m meaning by this is that it feels like a lot of money, in many cases, tens of billions of dollars is pointing towards things like ChatGPT. And we know that Google has deep products.

Are we worried about a future where only a few of these AI companies are able to offer services? Simply because they’ve been the ones that have invested so much, and they’ve become the defaults. Does anything like that concern you at all? That we’re sort of building something where there’s going to be a few incumbents, and their AI is going to be so superior to anything else, that we won’t have an opportunity to use rivals, because they just won’t be worth it.

[00:33:14] Matthias Pupillo: Yeah, that’s all around tech though. We have our big six tech firms, they have more money than anyone thought they should have. They have so much power, they have so much control. They could buy all of the competitors in the entire space for less than they lost buying coffee. And that consolidation, that’s going to be a real problem.

There almost has to be a WordPress for AI. We have to get an open source AI model that actually is contributed by other people. We have to do, maybe Matt wants to start another thing. And we have to go at this because WordPress and this open source, it serves a real valid purpose. It does keep everyone in check, it does keep everyone in line. The reason it’s not $25,000 to run out a basic website is because WordPress exists. Self-publishing, open source tools exist.

If the AI goes private, we’ll call it 1980s versions of software. The eighties existed, we’ve already seen this. We’ve seen languages, coding languages that were proprietary. Software that’s proprietary. We have Java. We know how this goes. We have Objective-C. We have whole languages locked down by a company. If they lock down the AI, we’re all not going to be able to use it unless we pay them.

There is not a competitive landscape. We need that openness, we need the WordPress people to really get involved in this because it’s going to be out of our control, and it’s going to get self consolidated. It’s how capitalism works. The market drives to a single winner. It drives unless there are other values than money, and people who see other value, community value, network value, people that actually just want to win and communicate.

Like all of this stuff that we get out of WordPress, all of the joy and happiness. We were so happy, I was just at WordCamp Ottawa. It was such a happy place. We’re all just there to learn, and talk, and teach, and do, and be together, and it’s so fantastic. And I really love that about WordPress, and we do need that for AI because it is going to get controlled by these mega companies. I mean, Microsoft spent $10 billion, yeah, billion.

[00:35:15] Nathan Wrigley: My understanding is that they’re potentially also building a $100 billion data center in the very near future as well. You know, just eye watering amounts of money that the likes of you and I, really it’s difficult to understand the levels that these companies are on. And you imagine that, the more that they can pour into it, the more that they pull ahead in the race, and make themselves so indispensable.

And actually that leads me to another, and probably final question. If you are in the technology space, for the last 50 years, it’s been a tough thing to keep up because technology changes so fast, but some things don’t change too quickly. Like the CSS spec doesn’t change all that quickly. So you’re building, I don’t know, a page builder, you’ve at least got a bedrock there and you can work on it.

How is this for you, in the AI space, trying to keep up? Because it feels like, you know, you blink and things have changed. From one day to the next, something that was working is no longer working. Something which was cool is no longer cool. Some company that you’d never heard of is now worth a hundred billion dollars, you know. How do you keep up with all of this?

[00:36:18] Matthias Pupillo: Yeah, so I personally, and our company, focus on value. We do not respect this slideware, vaporware version of AI. We have lived through blockchain. We have lived through Web 2.0. I have been doing this for 25 years. I had an AOL disc, I used to use dial up internet.

So we are driving to things that actually create value. And those things we’re chasing, duplicating, replicating, incorporating. And we’ve gotten good at identifying the stuff that’s just fluff. And that’s going to be hard for new people to come in in tech, that they can’t identify it.

But we were here through all of this. Like we have been here since containers, when containerisation was going to change the world. And it sort of did, but it sort of didn’t. And it’s now just on the tool belt, and it’s another button in the servers. But it’s not really, it didn’t change the world, it didn’t solve everything.

[00:37:07] Nathan Wrigley: Yeah, it kind of feels to me that, I really don’t know, I think you could almost flip a coin and see whether AI will be used everywhere, or the opposite. You know, people will just get fed up with it and think, actually, do you know what, the human touch is much better. But it does feel like you are on fairly steady ground because the capacity for AI to do the translation job is so profoundly obvious. It’s really straightforward. There’s a direct line between, I don’t know, the amount of time that it takes and the cost, and what have you. It really seems like a really sensible place to be.

We’re kind of going to have to knock it on the head because of the amount of time that we’ve got available for this podcast. But just before we go, would you mind just dropping a few details about where people can find you? That could be, I don’t know, a social handle, it could be a website, or an email. What’s the best place to find you if people want to talk about translating in WordPress with AI?

[00:37:56] Matthias Pupillo: Yeah, I’m on the Make WordPress Slack. Our website, FluentC.ai, F-L-U-E-N-T-C dot A-I, is our WordPress-focused website. That’s a great place. I’m on LinkedIn. But Slack and wordpress.org are great places to go.

[00:38:13] Nathan Wrigley: Okay. Thank you so much for chatting to me today. I really appreciate it.

[00:38:16] Matthias Pupillo: Thank you.

On the podcast today we have Matthias Pupillo.

Matthias has extensive experience in the technology and creative sectors, and is currently working as the co-founder of FluentC.ai, an AI-powered language technology company. With a background in technology, he’s focusing on developing solutions to enhance communication across different languages and platforms. He’s been involved with WordPress since its early days around version 1.2, and has a rich history of web design and consulting, having worked on hundreds of WordPress websites. But it’s only recently that he’s become more engaged in the WordPress community through events like WordCamp Buffalo.

In the podcast today we talk about AI-driven language translations, particularly focusing on Matthias’s work with FluentC, which is his translation plugin for WordPress. It supports multithreaded, simultaneous translations of up to 140 languages, enabling your pages and posts to be offered in other languages in just a few moments.

We cover the differences between AI models designed for translation and general-purpose models such as ChatGPT and Llama, which aren’t specialised for this task, and how his platform builds a contextual layer above those. He emphasises the importance of context and diverse multilingual data in producing high-quality translations. FluentC stores translated content locally, in WordPress transients, in an effort to maintain website speed, and it serves the translated pages using native WordPress hooks and URL modifications.

Matthias also offers his thoughts on the upcoming multilingual support phase of the Gutenberg project (phase four), and his hopes for FluentC to evolve from a standalone plugin to an API which could be used by WordPress Core.

We get into the broader implications of AI in translation, the need for open-source models to compete in this rapidly evolving space, and the parallels between AI evolution and past trends like blockchain and Web 2.0.

If you’re interested in the intersection of AI and WordPress, or are looking to enhance your website’s multilingual capabilities, this episode is for you.

Useful links

FluentC website

FluentC plugin

ChatGPT

Llama

Google Translate

DeepL

AWS Translate

WPML plugin

Polylang plugin

Matthias on WordPress.org