Can you really create AI voice agents just by talking? (Lovable tutorial)
Description
📌 Try Lovable: https://lovable.dev/?via=lena
🔔 Follow me on LinkedIn for more tips on GenAI & Conversational AI: https://www.linkedin.com/in/lena-shakurova/
🤝 Need help with your AI Assistant? Schedule a consultation: https://calendly.com/lena-shakurova/consultation
☕ Support the work I do: https://buymeacoffee.com/lenashakurova
In this video, we are going to test the limits of Lovable and try to build an advanced voice agent. Vibe-coding chatbots and voice agents: is it already possible? Watch the full video to find out.
Timestamps
00:00 Intro
00:37 Initial setup
01:36 Connect to GitHub
01:56 Test chatbot
03:11 Connect to ElevenLabs
05:14 Update the prompt
06:18 Connect Lovable to Supabase
06:43 Store conversation logs
08:00 Change voice
09:13 Set up RAG pipeline
16:06 Outro
📌 Subscribe to weekly newsletter with tips from working on 89+ Conversational AI projects since 2018 https://lenashakurova.substack.com/
Summary
Lovable AI Voice Agent Tutorial: Building Voice Assistants Through Conversation
In this hands-on tutorial, Lena explores whether it's possible to create sophisticated AI voice agents simply by talking to Lovable, a conversational AI development platform. The video demonstrates the process of building a voice-enabled chatbot without writing code, testing the limits of what's possible with current no-code AI tools.
Lena begins by creating a basic text chatbot using simple natural language instructions to Lovable, which automatically generates both frontend and backend code. She then transforms this into a voice agent by integrating speech recognition and text-to-speech capabilities through ElevenLabs. The tutorial shows how to customize the user interface, making it more streamlined with just a microphone button for interaction.
The video covers several advanced features, including connecting to GitHub for code storage, implementing conversation history logging with Supabase database integration, and customizing the voice responses. Lena also attempts to build a RAG (Retrieval-Augmented Generation) pipeline to enable the voice agent to answer questions about her company, Parslabs, by crawling website content and storing it in a vector database.
Throughout the demonstration, viewers can see both the successes and limitations of using conversational AI tools for development. While basic functionalities like voice input/output and conversation logging work well, more complex features like implementing RAG prove challenging without coding knowledge. Lena provides honest feedback about when the tool excels and where it falls short, noting that programming knowledge is still valuable for debugging and implementing more sophisticated features.
The video serves as a practical exploration of AI-assisted development tools, offering insights into the current state of no-code voice agent creation. It's particularly valuable for developers, conversational AI enthusiasts, and those interested in the evolving landscape of AI development tools. The tutorial demonstrates that while platforms like Lovable can create functional prototypes quickly, building truly advanced voice agents still requires technical expertise and understanding of conversation design principles.
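The turn-taking loop the video builds can be sketched as three pluggable stages. This is a minimal sketch, not Lovable's generated code: the three callables are assumed stand-ins for Whisper speech-to-text, an OpenAI chat model, and ElevenLabs text-to-speech.

```python
# Sketch of one voice-agent turn as described in the video:
# speech in -> transcription -> LLM reply -> speech out.
from typing import Callable

def voice_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],   # e.g. Whisper speech-to-text
    respond: Callable[[str], str],        # e.g. an LLM with the system prompt
    speak: Callable[[str], bytes],        # e.g. ElevenLabs text-to-speech
) -> tuple[str, str, bytes]:
    """Run one conversational turn and return (user text, reply text, reply audio)."""
    user_text = transcribe(audio)
    reply_text = respond(user_text)
    reply_audio = speak(reply_text)
    return user_text, reply_text, reply_audio
```

In a real deployment each stage would be an API call; keeping them as parameters makes the pipeline easy to test with stubs before wiring up the services.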
Transcript
0:00
Today we are going to try something
0:01
different. We're going to be using
0:03
Lovable, which is an AI powered platform
0:05
that allows you to build any kind of web
0:08
apps and software by simply talking in
0:11
plain English and describing what you
0:13
want. And I want to see how far we can
0:15
bring it and whether we can use Lovable to
0:17
build voice agents. And to be precise, I
0:20
wonder if we can create a voice agent
0:23
that tells a story about my company called
0:25
Parslabs in a natural and engaging way.
0:28
Let's find out. If you've never tried
0:30
Lovable before, this is the Lovable main
0:34
page. And once you land on it, you're
0:36
supposed to in a very short way explain
0:38
what is it that you want to build. Let's
0:40
start with a chatbot first, a text
0:43
chatbot, and then try to make it into a
0:45
voice bot. Let's say something like,
0:47
"Please create both backend and front
0:49
end for a chatbot that is powered by
0:53
LLMs and can answer user questions using
0:56
OpenAI LLMs.
1:01
Let's see what it can
1:04
build. Mhm. Okay, it is making a plan of
1:09
what it needs to build. While it's
1:12
making a plan, let's make sure it's
1:14
connected to
1:15
GitHub. Yes, that's correct. So now all
1:19
the code that lovable writes will be
1:22
stored in our GitHub so that later if we
1:25
want to work with it ourselves we can
1:27
open VS code or cursor and go and edit
1:30
it ourselves after the MVP is done. All
1:33
right, so we have supposedly gotten
1:35
something and GitHub is connected as
1:38
well. Let's try to refresh the page and
1:42
build unsuccessful. Okay, we click try
1:45
to fix that and see if that helps. All
1:47
right. And now the code should be fixed.
1:50
Let's try to refresh
1:53
it. This is where it wants me to paste
1:56
my
1:57
code. Okay, it saved it. Let's try
2:02
again. Hey, how are
2:05
you? Okay, so the chatbot is working
2:09
now. Let's see if we can turn this
2:11
chatbot into a voice bot. Now turn this
2:15
chatbot into a voice bot. There's going
2:17
to be a microphone. I'm going to click
2:18
on it and start talking. You're going to
2:21
transcribe it using whisper and respond
2:24
back to me using
2:27
llms. Okay. So, it should be done now.
2:30
Now, we have this microphone button.
2:32
Let's try. Hey, how are you doing?
2:37
Hello. I'm just a program, so I don't
2:40
have feelings, but I'm here and ready to
2:42
help you. How can I assist you today?
2:45
Okay, we even got the voice even though
2:47
I was not asking for the voice. Can you
2:50
connect to Eleven
2:53
Labs? Okay, let's see if it can connect
2:56
to ElevenLabs. Probably it's going to ask
2:58
me for my ElevenLabs API key. So, I'm
3:02
going to go and search for that
3:05
one.
3:08
ElevenLabs
3:10
login. Okay. Where would I have the API
3:14
keys? Here. Create an API key for
3:20
Lovable.
3:22
Okay. Uh, it said that I can specify
3:27
my API key here from Lovable and now it
3:31
should be able to work. Hey, what's the
3:33
weather like today?
3:36
I'm sorry, but I can't provide real-time
3:39
weather updates. You might want to check
3:41
a weather app or website for the latest
3:43
information. Okay, that is better. Now,
3:46
I don't like the interface, so I want to
3:49
simplify it and
3:51
say, "Don't show me the logs. Make a
3:55
very simple interface instead where
3:58
you have just a mic in the middle of the
4:00
screen and I can click on it, talk to
4:02
it, and once I finish speaking, then I
4:05
get a response generated using LLMs.
4:10
Okay, now it got very simple. That's
4:13
exactly what I asked for. Let's see.
4:16
Hey, what are you up to today? Hello,
4:19
I'm here to help answer your questions
4:21
and provide information. How can I
4:23
assist you today? What can you
4:26
do? It's going to generate it. It's just
4:29
slow. I can assist
4:30
with a wide range of tasks such as one
4:34
providing information and answering
4:35
questions on various topics. Please stop
4:38
offering explanations and
4:41
you're welcome. If there's anything else
4:43
you need, feel free to ask. Okay, we can
4:45
see that it's not very conversational.
4:47
So, let's try and see if we can update
4:49
the prompt. Can you show me the current
4:52
prompt that you use to generate uh the
4:56
answer? Okay, so here's the prompt.
4:59
You're helpful, friendly assistant.
5:01
Please provide clear and concise
5:02
response. Please update the prompt and
5:05
say that you are helpful and friendly
5:07
assistant and you are leading a
5:10
conversation. It's a voice conversation.
5:12
So your text needs to be simple and
5:15
conversational. The sentence structure
5:17
needs to be simple and your responses
5:20
need to be rather short, maximum three
5:23
sentences. Okay, let's try to see what
5:26
happens if it updates the prompt. And
5:28
then I ask the exact same question,
5:30
which was, "How can you help?"
5:32
Okay, so now it updated. Let's try
5:35
again. Hey, how can you help?
5:39
Hi, I can help answer questions, provide
5:42
information, or assist with tasks like
5:44
reminders or finding things online. Just
5:47
let me know what you need. Okay, that
5:49
was way more conversational. That's
5:51
good. The next step would be to store
5:54
the conversation history in Supabase.
5:57
Let's try that. Can you create a new
5:59
Supabase table called conversation
6:01
log and store all our conversation
6:05
history
6:06
there?
6:08
Okay, so it said that I need to first
6:11
connect my project to Supabase. Let's
6:14
see how to do that. I'll click connect
6:16
because I already created my Supabase
6:18
project and I click connect and now it
6:23
is connected. Okay, perfect. Please
6:25
create a new Supabase table. There we
6:30
will store all the conversation logs
6:32
that happen between our voice assistant
6:33
and the user. This is going to be
6:35
important for us because later we can
6:38
analyze those logs at scale and we can
6:40
even create smart dashboards using
6:45
Lovable. So let's see it
6:48
generated code to create a new table. I
6:51
just need to click apply changes and it
6:55
will now create the Supabase database
6:58
and connect our front end and back end
7:00
to it so that we can store the logs.
7:02
Let's see. Mhm. Okay, let's let's try
7:05
again. Hi, what are you up to today?
7:10
Hi, I'm here to help you with any
7:12
questions or tasks you have. What about
7:14
you? Okay, that's perfect. Let's now try
7:17
to go to
7:19
Supabase and try to see which tables do
7:21
we have. So we have one called
7:23
conversation logs. I think that's the
7:25
recent one. And this is the structure of
7:29
that table. And if we click here then we
7:33
can see the logs. Hi, what are you up to
7:35
today? Hi, I'm here to help you with any
7:37
questions. So that was exactly what we
7:39
now asked. So now we are storing
7:42
conversation logs and then later we can
7:44
analyze them. That is perfect.
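A row in a conversation-log table like the one just created could be built as below. This is a sketch under assumptions: the column names (session_id, role, content, created_at) are illustrative, not necessarily the schema Lovable generated.

```python
# Hypothetical shape of one conversation_logs row; column names are assumed.
from datetime import datetime, timezone

def make_log_row(session_id: str, role: str, content: str) -> dict:
    """Build one row for a conversation_logs table."""
    if role not in ("user", "assistant"):
        raise ValueError("role must be 'user' or 'assistant'")
    return {
        "session_id": session_id,
        "role": role,
        "content": content,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }

# A Supabase client would then insert it along these lines:
# supabase.table("conversation_logs").insert(row).execute()
```

Logging both roles per session is what makes the later "analyze logs at scale" and dashboard ideas possible.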
7:47
Now, let's go back to our project and
7:49
try another thing. Please switch my
7:53
current voice to
7:56
Liam. Okay. Did the voice change? Hey,
8:00
tell me more about Edinburgh.
8:05
Edinburgh is the capital of Scotland.
8:07
Okay. Known for its historic and
8:09
cultural attractions. It features the
8:11
famous Edinburgh Castle. Okay. I got to
8:14
thank you. Edinburgh festival. You're
8:15
welcome. If you have more questions,
8:17
feel free to ask. Okay. To be honest, I
8:19
liked Sara more, but I just wanted to
8:22
see if we can also easily change the
8:24
voices. The next thing we need to do is
8:26
to create rag pipeline because if you
8:30
remember, we initially wanted to create
8:32
a storytelling voice assistant that will
8:34
tell about my company. So, what I'm
8:37
going to do, I'm going to say create a
8:40
new Supabase table where you're going
8:42
to store information about my company.
8:44
My company website is
8:48
https://parslabs.org. You need to crawl
8:51
information from my website and store it
8:53
there and then create a RAG pipeline so
8:57
that when I am asking questions that you
9:02
answer using LLM based on my data and
9:04
based on information I have about my
9:06
company called
9:09
Parslabs. Okay, this sounds a little bit
9:12
complex. So let's see if it can actually
9:14
crawl information from the website,
9:16
store it in Supabase and then set up a
9:19
vector database as well. Okay, so for
9:23
Supabase, it now suggests to
9:26
create a new
9:29
table. Okay, let's do that.
9:33
Mhm. And now we got an issue. So it's
9:37
trying to fix that.
9:40
Okay, so the SQL migration was
9:42
successful. That's good. Now it's trying
9:44
to implement the rag
9:47
pipeline. I do wonder meanwhile which
9:50
information it is storing in
9:52
Supabase. So let's go
9:53
back here and it has something called
9:57
content embeddings which for now is
10:00
empty. So it only created the structure.
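For reference, the retrieval half of the RAG pipeline being requested here boils down to embedding the question and ranking stored chunks by similarity. The sketch below uses an in-memory store as a stand-in for the pgvector query a Supabase content_embeddings table would answer; the embedding vectors themselves would come from a real embedding model.

```python
# Minimal retrieval step of a RAG pipeline: rank stored text chunks by
# cosine similarity to the query embedding. In-memory stand-in for a
# vector-database (e.g. pgvector) similarity query.
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 for zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec: list[float],
             store: list[tuple[str, list[float]]],
             top_k: int = 3) -> list[str]:
    """store holds (chunk_text, embedding) pairs; return the top_k chunks."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]
```

The retrieved chunks are then spliced into the LLM prompt, which is the step that never quite worked in the video.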
10:02
Okay. Okay. So it said that it
10:05
implemented everything.
10:08
Let's try if it works. Okay, let's um
10:11
give it some
10:13
feedback. Okay, so the interface changed
10:18
a
10:19
lot. Maybe I was not very clear about
10:23
what is it that I wanted. But let's ask
10:25
about Parslabs. What is
10:28
Parslabs? I guess it's no longer using
10:31
voice.
10:33
I don't have enough information about
10:34
Parslabs from the provided context.
10:36
Could you please provide more details or
10:39
clarify your question? This did not work
10:42
as I would have wished. Okay, let's uh
10:46
give it a simpler task. Let's not ask it
10:49
to crawl the website. Let's just copy
10:52
the website information and then say
10:55
store this information in a new
10:59
Supabase table called data.
11:05
Yeah, I'm really not liking what it's
11:07
doing with the UI. It's been going so
11:09
well. And now we got something really
11:12
complicated. And I think that's it with
11:14
those kind of tools. Yes, they are
11:17
already quite powerful if you know how
11:18
to use them. And simple things they can
11:21
also do rather well. However, it can
11:25
also really quickly go wrong, especially
11:28
if you are not very clear in what you
11:31
want and if you don't formulate things
11:33
very well, then it will just make things
11:36
up that you didn't ask for. Okay, now it
11:39
is storing information in Supabase.
11:43
That's good. And what we're going to try
11:44
to do afterwards is ask to train a
11:48
vector database based on this data.
11:52
Let's see. Let me know in the comments
11:54
below. Do you think it would have been
11:56
faster to just code it instead of
11:59
talking in English with Lovable to make
12:01
this voice agent? Then I'm really
12:03
curious if you think it's uh worth the
12:05
time. Let me know. Now, it's updating
12:08
the UI, which I didn't necessarily ask
12:11
for. I don't need to update the UI, but
12:13
let's see if it stored the information.
12:16
And let's and let's again check the
12:19
Supabase to
12:21
see where it is storing
12:24
things. Okay,
12:26
so it has the section type, section
12:29
content, section title. Okay, good. So
12:32
it did create the database. That's good.
12:35
And here's all the Parslabs
12:39
information. And you can even filter it.
12:43
Why choose? Why choose Parslabs? Okay,
12:46
that's that looks kind of fancy. I
12:48
didn't ask for it, but but all
12:51
right. Do you think that being a
12:53
programmer would help with talking to
12:57
Lovable? Can you imagine that someone
12:59
who doesn't know how to program at all
13:01
and who has never written SQL queries
13:04
that they would be just as well able to
13:07
build apps in Supabase without any
13:10
coding knowledge, already now in the
13:12
state that Lovable is working in now?
13:15
Let me know. I am not sure. For now, it
13:19
seems like it still is very helpful to
13:22
be able to understand the code and I'm
13:25
not doing it now just for
13:28
the purity of the experiment.
13:32
But if I would have been reading the
13:34
code, I think I would have been able to
13:37
give better feedback to Lovable and to
13:41
control it and guide it in a better
13:43
way. Okay, one last try. Let's try to
13:47
see if we can simplify it. Hi, I only
13:50
need one view where I'm talking with the
13:53
voice. My voice is being
13:55
transcribed and then the
13:58
transcription is being passed through
14:00
a RAG pipeline to find the information
14:03
that is relevant to my question. This
14:06
information is then being pasted into
14:08
the LLM prompt. LLM generates the
14:11
response given the relevant information
14:14
and then you use text to speech to
14:16
respond back to me. Please fix
14:18
everything and simplify the front end. I
14:21
only need the microphone and everything
14:23
else, training the embeddings, the RAG
14:26
pipeline, working with the data, all
14:28
needs to happen on the back
14:30
end. Okay, I gave it just
14:34
um another
14:38
try. When I clicked generate embeddings,
14:42
did it actually generate any embeddings?
14:46
And if so, where would it be storing
14:51
them? It doesn't look like it's in
14:55
parlabs info. It's also not in rag
14:59
queries
15:02
and it's not on the
15:06
website content. So that's still a
15:10
mystery to
15:13
me. Okay. Okay. Okay, we got a simpler
15:16
way, a simpler interface. Let's try it.
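The prompt-assembly step in the pipeline just described (splicing retrieved context into the LLM prompt) could look like this. The instruction wording is an illustrative assumption, echoing the voice-prompt constraints set earlier in the video.

```python
# Assembling retrieved chunks into the LLM prompt, as in the pipeline
# described above. The instruction text is an illustrative assumption.
def build_rag_prompt(question: str, chunks: list[str]) -> str:
    """Combine retrieved context chunks and the user question into one prompt."""
    context = "\n".join(f"- {c}" for c in chunks)
    return (
        "You are a helpful, friendly voice assistant. Answer in at most "
        "three short, conversational sentences, using only the context "
        "below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

The LLM's answer to this prompt is what the text-to-speech stage would then read out.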
15:20
Tell me about
15:26
Parslabs. Oh god.
15:30
No, I was referring to the company that
15:32
you know about. I'm here to provide
15:35
information about Parslabs. What do they do?
15:37
How can I assist you? Parslabs
15:40
specializes in providing services and
15:42
solutions related to data processing and
15:44
management. No, this is wrong. If you
15:46
have specific questions about their
15:47
services or product, that's it. Feel
15:49
free to ask. I think that RAG was not
15:53
implemented and if you don't know how to
15:56
code, there is also not a very simple
15:58
way to debug it and test if that works
16:01
or doesn't. So my conclusion for now is
16:06
that you cannot create great chatbots or
16:09
voice assistants using Lovable. You can
16:12
create simple prototypes that just work,
16:14
but if you need something a little bit
16:16
more complex, then you definitely need
16:18
to know how to code and how to put that
16:20
together. And I'm not even talking about
16:23
making your AI assistants and voice
16:25
agents sound more human. That requires a
16:28
completely different set of expertise
16:30
and you won't be able to fix that just
16:33
by using tools like Lovable. That being
16:35
said, it was a fun experiment for
16:37
me. I hope you also enjoyed it and you
16:41
learned something new about what is
16:43
possible now with the current
16:45
technology. If you want to see more
16:47
videos like this, and more serious
16:49
videos about voice technology
16:52
and chatbot development, conversation
16:54
design, then follow me on this channel,
16:58
click subscribe and like this video if
17:00
you liked it. And I see you in the next