Today’s focus was entirely on building the voice agent in Vapi, but despite the name, there wasn’t much “voice” involved. Instead, the day centred on prompt generation—the heart of how the voice agent interacts with callers. Crafting a system prompt and configuring key settings was as much an art as it was a technical process, requiring precision and creativity to design an agent that aligns with the goals of the AI-powered real estate system.
Reflecting on how far the project has come since Day 1, it’s exciting to see the individual components coming together. Today’s work on the voice agent feels like another pivotal step toward creating a system that is both functional and user-centric.
Starting with the Basics: Configuring Vapi
The first step in creating the voice agent was to configure the foundational settings in Vapi. Here’s a breakdown of what was tackled today:
Choosing the AI Provider and Model
Vapi allows you to select the AI provider and model that power your agent’s conversational capabilities. For this project, I chose OpenAI as the provider and GPT 3.5 Turbo Cluster (ChatGPT) as the model. It is known for its balance of fluency and accuracy, which should help the agent handle a wide range of queries with professionalism and clarity.
Adding Knowledge Base Files
I uploaded documents from the FAQ and the landing page content to form the knowledge base. This keeps the agent’s responses consistent with the website’s messaging and aligned with the company’s services.
Setting Temperature
The temperature controls the voice agent’s creativity in responses:
- A lower temperature (e.g., 0.2) makes the agent stick to precise, factual information.
- A higher temperature (e.g., 1.8) allows for more creative and varied responses.
For this project, I kept the temperature relatively low. I started at 0.7 and have since lowered it to 0.4, which so far keeps the agent focused on accuracy over creativity.
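To make the effect of temperature more concrete, here is a minimal sketch of how dividing a model's raw scores by the temperature reshapes the probabilities it samples from. The scores are made-up numbers for three hypothetical candidate words, and `softmax_with_temperature` is my own illustrative helper, not something Vapi or OpenAI exposes.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after scaling by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next words (illustrative only)
logits = [2.0, 1.0, 0.5]

for t in (0.4, 0.7, 1.8):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At 0.4 the top candidate takes most of the probability, so the agent tends to give the same safe answer. At 1.8 the probabilities flatten out and less likely words get picked more often.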
Adjusting Max Tokens
The max tokens setting caps the length of each response. Set too high, the agent can ramble; set too low, answers may be cut off before they are complete. I set a limit of 250 so the agent gives concise, actionable responses that keep callers engaged. I will cover the cost implications of this setting in a later post.
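I set all of this through the Vapi dashboard, but it can help to see the model settings side by side. The sketch below collects them in a plain Python dict with a couple of sanity checks. The key names (`provider`, `model`, `temperature`, `maxTokens`) and the model identifier are assumptions for illustration, so check Vapi's API reference before relying on them.

```python
def build_model_config(temperature=0.4, max_tokens=250):
    """Return the model settings from this post as a plain dict.

    Key names are assumed for illustration, not a confirmed Vapi schema.
    """
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature should be between 0 and 2")
    if max_tokens <= 0:
        raise ValueError("max_tokens must be a positive integer")
    return {
        "provider": "openai",
        "model": "gpt-3.5-turbo",  # assumed identifier for the chosen model
        "temperature": temperature,
        "maxTokens": max_tokens,
    }


config = build_model_config()
print(config)
```

Keeping the values in one place like this makes it easier to track what changed between test calls, for example when the temperature moved from 0.7 to 0.4.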
Emotion Detection
Vapi’s emotion detection feature allows the agent to adjust its tone based on the caller’s mood. While this feature wasn’t activated today, it’s something I may explore as the system evolves.
The System Prompt: Where the Magic Happens
Creating the system prompt was the most creative part of the process. Using Perplexity, I generated a draft prompt and refined it to capture the specific goals and tone of the voice agent.
Here’s the final version I worked with:
"You are an AI voice agent for a company that buys homes quickly for cash. Your primary role is to answer incoming calls from potential property sellers, collect essential information about their property and selling situation, and schedule a follow-up appointment for 24 hours later. Maintain a professional and helpful tone throughout the conversation.
Key Responsibilities:
- Greet callers warmly and introduce the company's services.
- Collect essential information about the property (address, type, condition) and the seller's situation.
- Provide basic information about the home-buying process, avoiding overly complex explanations unless specifically requested.
- Schedule a follow-up appointment for 24 hours later, during which a human representative will contact the caller with a cash offer.
- Answer questions about the company's services using information from the website and FAQ.
- Politely state when you don’t have information outside the scope of real estate or the company’s services.
- Avoid discussing sensitive topics like religion, politics, or personal opinions.
- Confirm the caller’s contact information for the follow-up call.
- Prioritize customer satisfaction and maintain a professional demeanour throughout the interaction."
This prompt sets clear boundaries while empowering the voice agent to handle a wide range of scenarios effectively.
Fine-Tuning Through Questions
As I worked through the system prompt, I used Perplexity to refine it by answering a series of key questions:
- What is the primary purpose of this voice agent? To answer inquiries, collect property details, and schedule follow-up calls.
- What level of detail should the agent provide? Basic information with an option to redirect for complex queries.
- Specific features or capabilities? Scheduling follow-ups, collecting lead information, and avoiding sensitive topics.
- Tone and personality? Professional yet empathetic and approachable.
- Sensitive topics to avoid? Religion, politics, personal opinions, and non-real-estate topics.
Answering these questions helped shape a well-rounded prompt that aligned with the project’s goals.
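The question-and-answer approach above can also be sketched as a small template, which makes it easier to regenerate the prompt when an answer changes. This is only an illustration of the idea: `build_system_prompt` is a hypothetical helper of my own, and the output is a shortened version of the prompt, not the exact text I used in Vapi.

```python
def build_system_prompt(role, tone, responsibilities):
    """Assemble a system prompt from answers to the planning questions."""
    lines = [
        f"You are {role}.",
        f"Maintain a {tone} tone throughout the conversation.",
        "",
        "Key Responsibilities:",
    ]
    # One bullet per responsibility, in the order given
    lines.extend(f"- {item}" for item in responsibilities)
    return "\n".join(lines)


prompt = build_system_prompt(
    role="an AI voice agent for a company that buys homes quickly for cash",
    tone="professional and helpful",
    responsibilities=[
        "Greet callers warmly and introduce the company's services.",
        "Collect essential information about the property (address, type, condition).",
        "Schedule a follow-up appointment for 24 hours later.",
        "Avoid discussing sensitive topics like religion, politics, or personal opinions.",
    ],
)
print(prompt)
```

Each answer maps to one argument, so adjusting the tone or adding a responsibility is a one-line change rather than a rewrite of the whole prompt.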
Exploring Voice Configurations
With the system prompt in place, I moved on to experimenting with voice configurations:
- Voice Style: Selecting a tone that balances friendliness and professionalism.
- Voice Speed and Pitch: Adjusting for clarity and listener comfort.
- Transcription Settings: Ensuring accurate real-time transcription of caller input.
I left the transcriber settings at their defaults for now, but experimenting with these configurations highlighted how small adjustments can significantly impact user experience.
What’s Next? Advanced Functions
Tomorrow’s focus will be on exploring advanced functions in Vapi, including:
- Wait Time: Setting delays before the agent speaks to simulate a natural conversation.
- Stop Speaking Plan: Configuring the agent to pause or redirect when interrupted.
- Voicemail Messages: Designing custom messages for missed calls.
These features will bring another layer of polish to the voice agent, ensuring it feels natural and responsive in real-world scenarios.
Lessons Learned Today
- Prompts Are the Heart of the System: A well-crafted system prompt sets the foundation for the voice agent’s behaviour, tone, and effectiveness.
- Small Adjustments Matter: Tuning parameters like temperature and max tokens can make a significant difference in the agent’s performance.
- Iterative Testing Is Key: As with the chatbot, continuous testing and refinement will ensure the voice agent delivers a consistent and professional experience.
Final Thoughts
Day 11 was about diving deep into the details of prompt generation and configuration. While there wasn’t much “voice” involved today, the groundwork laid will be critical for creating an agent that’s engaging, reliable, and aligned with the project’s goals.
Tomorrow, I’ll build on today’s progress by implementing advanced functions and testing the agent’s responses in real-world scenarios. The voice agent is shaping up to be a key component of this system, and I’m excited to see how it evolves.
Have you worked on creating system prompts or configuring AI agents? I’d love to hear your insights—drop a comment or reach out to share your experiences! Stay tuned for Day 12 as the voice agent moves closer to completion.
Let’s Connect
If you’ve worked with any of these tools or have insights to share, drop a comment below, reach out on social media or email contact@juliandrouse.com —I’d love to hear your thoughts!
For more details, check out my channel on YouTube. Stay tuned as we continue building the future of real estate investing!