Training Your Chatbot

Welcome to the ultimate training guide for your chatbot! 🎓 This document is your roadmap to creating a smart, accurate, and helpful AI assistant. We'll explore key strategies, including structured data usage, FAQs, and best practices for formatting your training data.

🧠 What is a Knowledge Base?

A knowledge base is the brain of your AI chatbot, containing all the information it needs to understand and respond to queries based on your specific business context. Let's break down how it works:

Tip: A well-maintained knowledge base is crucial for your chatbot's performance and accuracy.

Create Knowledge Bases
On the training page, you can create multiple knowledge bases. Each one serves as a distinct repository of information for your AI to draw upon.

Choose the Right Knowledge Base
Once your knowledge bases are set up, you can select which one your AI chatbot should use. This flexibility is invaluable for:
- Testing New Information: Easily update and evaluate how the bot performs with fresh data.
- Preparing for Business Changes: Anticipate company updates by preparing new knowledge bases in advance. When changes go live, simply switch the bot to the updated knowledge base.

Customize AI Behavior
Create different knowledge bases for various AI behavior models. This allows your chatbot to adapt its personality and responses based on different situations or customer needs.

Info: Regularly update your knowledge bases to keep your AI chatbot's information current and relevant.

By leveraging these features, you ensure your AI chatbot remains:

- Well-informed
- Up-to-date
- Aligned with your evolving business goals

Info: Always review and validate the information in your knowledge bases to maintain the quality and accuracy of your chatbot's responses.

Training Options

Tip: Diversify your training sources for a well-rounded chatbot!

Website/Link: Provide a web page link so the chatbot can learn from the content extracted from that page.
File (PDF, Excel, TXT, DOCX): Upload existing documents in any of these formats to use their content as training data.
Q&A: Boost your chatbot's expertise with a curated list of questions and answers.
Text: Input clear, concise text for quick comprehension and effective implementation.

Best Practices for Training

Note: Following these practices is crucial for developing a high-performing chatbot!

Practice	Description
Consistency	Maintain a uniform format across all training data.
Clarity	Ensure both questions and answers are crystal clear and concise.
Comprehensiveness	Expand your training data to cover a wide range of potential user queries.
Accuracy	Guarantee that all training data is precise and up-to-date.
Periodic Updates	Regularly refresh the training data to incorporate new information.
Test Frequently	Continuously evaluate the chatbot to verify accurate response delivery.

Note: Remember, a well-trained chatbot is an invaluable asset to your customer service strategy!

Caution: Neglecting to update your chatbot's knowledge base regularly may lead to outdated or inaccurate responses.

Taking Your Chatbot to the Next Level

Personalization: Train your chatbot to recognize user preferences and tailor responses accordingly.
Multi-language Support: Expand your chatbot's capabilities by training it in multiple languages.
Sentiment Analysis: Teach your chatbot to recognize and respond appropriately to user emotions.
Integration with Business Systems: Connect your chatbot to CRM, inventory, or other relevant systems for real-time data access.

Danger: Always respect user privacy and data protection laws when training and deploying your chatbot.

Split Rule

When saving a website link or uploading documents (PDF, DOCX, or TXT), the Split Rule feature allows you to choose how the content will be divided into smaller, manageable chunks for processing. You can select between:

🔢 Max Token
- Define the maximum number of tokens per chunk.
- Example: Setting a limit of 500 tokens will divide the content into segments, each containing up to 500 tokens.
- Best for unstructured text like paragraphs or continuous content.

🔖 Separator
- Choose a separator to split the content. Examples of common separators:
  - #####
  - New line
  - ======
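
The Max Token rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `split_by_max_tokens` is hypothetical, and it assumes that a "token" is a whitespace-separated word. The tokenizer the platform actually uses may count tokens differently.

```python
def split_by_max_tokens(text: str, max_tokens: int) -> list[str]:
    """Split text into chunks of at most max_tokens tokens.

    Assumption: a token is a whitespace-separated word. A real
    tokenizer may split the same text into more or fewer tokens.
    """
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    tokens = text.split()
    # Walk through the token list in steps of max_tokens.
    return [
        " ".join(tokens[i:i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]


chunks = split_by_max_tokens("one two three four five six seven", max_tokens=3)
print(chunks)  # ['one two three', 'four five six', 'seven']
```

Note that a fixed token limit can cut a sentence or list in the middle, which is why this rule suits continuous text better than structured content.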

Example: Using Separator

The document will be split at a specific separator, such as #####.

Input:

Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. ##### It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.

Output:

"Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book."
"It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged."

By choosing the appropriate Split Rule when saving links or uploading documents, you can optimize how the system processes content, ensuring accurate and efficient results.
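
The separator example above can be sketched as follows. This is a minimal illustration, not the platform's implementation: the function name `split_by_separator` and the sample text are made up, and trimming whitespace around each chunk is an assumption.

```python
def split_by_separator(text: str, separator: str) -> list[str]:
    """Split text at each occurrence of separator, dropping empty chunks."""
    if not separator:
        raise ValueError("separator must not be empty")
    # Trim surrounding whitespace and skip chunks that end up empty.
    return [part.strip() for part in text.split(separator) if part.strip()]


document = "First section about pricing. ##### Second section about shipping."
for chunk in split_by_separator(document, "#####"):
    print(repr(chunk))
```

Because you place the separators yourself, each chunk can be kept as one complete, self-contained piece of information.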

By following this guide, you're on your way to creating a chatbot that will revolutionize your customer interactions. Happy training! 🎉

Best Practices for Creating Your Own Knowledge Base

Creating a knowledge base for your AI model is crucial for ensuring that it provides relevant and accurate information when responding to user queries. The goal is to develop a system that not only stores knowledge effectively but also enhances the model's ability to retrieve and deliver the best possible answers.

In this article, we'll discuss the best practices for populating a knowledge base, with a focus on vector embeddings, optimal vector length, and how to structure your data for efficient querying.

1. Understanding the Importance of Vectors

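Vectors play a crucial role in how AI models understand and retrieve information. A vector is a mathematical representation of a piece of information, which can be used to compare and find similarities between different pieces of data. In this guide, "vector" also refers to the piece of text stored in the knowledge base, because that text is what gets converted into such a representation.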

Why Should a Vector Contain Clear Information?

For a knowledge base to be effective, each vector should represent clear and concise information that directly answers a specific question. It’s important that the vector is unambiguous and contains a well-defined answer, to avoid confusion when the AI model retrieves it.

Precision: The vector should be detailed enough to fully answer the user's question without needing additional context.
Clarity: Avoid overly complex or technical language unless necessary, as it can lead to misinterpretation by the AI model.

Example:
If you are creating a vector for the question, "What is machine learning?" the vector should contain a straightforward, precise definition, such as:

"Machine learning is a branch of artificial intelligence that involves the development of algorithms that allow computers to learn from and make predictions or decisions based on data."

Optimal Length of a Vector

When creating vectors for your knowledge base, it’s important to consider the optimal length. A vector that is too short might not contain enough information, while one that is too long might become difficult for the AI model to process effectively.

Best Practice:
Aim for vectors that are concise (a few sentences to a short paragraph) but still comprehensive. This length typically provides enough context for the AI model to understand the information and apply it to related queries.

Example:

Short Vector (Less effective): "Machine learning is a field of AI."
Longer Vector (Better): "Machine learning is a field of artificial intelligence that enables computers to improve performance on tasks through data and experience, without being explicitly programmed."

2. Adding Questions to Vectors

Each vector should be associated with one or more questions that it answers. This not only helps the model understand the context but also assists in determining which vectors are most relevant when a user asks a question.

Primary Question: Every vector should be linked to a primary question that it is designed to answer.
Secondary Questions: You can also link a vector to additional related questions. This helps the AI model retrieve a broader set of answers when queries are ambiguous or can be interpreted in multiple ways.

Example:

Vector: "Machine learning algorithms can be categorized into supervised, unsupervised, and reinforcement learning."
Primary Question: "What are the types of machine learning?"
Secondary Question: "What is supervised learning?"
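
One way to picture the link between a vector and its questions is as a small record. The sketch below is only an illustration: the class name `KnowledgeEntry` and its fields are hypothetical and do not describe how the platform stores data.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeEntry:
    """One knowledge base entry (a "vector" in this guide's terms)."""
    text: str
    primary_question: str
    secondary_questions: list[str] = field(default_factory=list)

    def all_questions(self) -> list[str]:
        """Return the primary question followed by any secondary ones."""
        return [self.primary_question, *self.secondary_questions]


entry = KnowledgeEntry(
    text=("Machine learning algorithms can be categorized into supervised, "
          "unsupervised, and reinforcement learning."),
    primary_question="What are the types of machine learning?",
    secondary_questions=["What is supervised learning?"],
)
print(entry.all_questions())
```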

3. Retrieving the Best Matching Vectors

Once your knowledge base is populated, it’s essential to design a retrieval mechanism that will find the most relevant vectors based on a user’s query.

Two Vectors Approach

In the two-vector approach, the two vectors that most closely match the user's question are passed to the model, which compares them and selects the best response.

Steps:

Query Vectorization: Convert the user’s query into a vector.
Vector Matching: Find the two vectors that best match the query by calculating their similarity (using metrics like cosine similarity).
Answer Selection: Provide the model with these two vectors, and let it choose the one that best answers the query.
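
The first two steps above can be sketched in Python. This is a toy illustration under stated assumptions: `embed` simply counts words as a stand-in for a real embedding model, and the names `embed`, `cosine_similarity`, and `top_matches` are hypothetical. The third step, answer selection, is left to the AI model and is not shown.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real model)."""
    words = (w.strip(".,?!\"'").lower() for w in text.split())
    return Counter(w for w in words if w)


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(query: str, entries: list[str], k: int = 2) -> list[str]:
    """Return the k entries most similar to the query, best match first."""
    query_vec = embed(query)
    ranked = sorted(
        entries,
        key=lambda e: cosine_similarity(query_vec, embed(e)),
        reverse=True,
    )
    return ranked[:k]


entries = [
    "Supervised learning involves training a model on labeled data.",
    "Our office is open Monday to Friday.",
    "Supervised learning is used when the output variable is known.",
]
for match in top_matches("How does supervised learning work?", entries):
    print(match)
```

With this sample data, the two entries about supervised learning are returned and the unrelated entry about office hours is left out.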

Example of Fragmented Context (Incorrect Approach):
Consider a list of events for 2025, which is split across two vectors. In this case, the second part loses its meaning without the first, potentially leading to incomplete or incorrect responses.

Vector 1:
"Events in 2025:
- AI Future Summit – March 2025
- Future Technologies Conference – May 2025"
Vector 2:
"- Business Automation Seminar – July 2025
- Innovation Expo – October 2025"

Problem:
If the model retrieves only the second vector, it lacks context indicating that these events are part of the 2025 schedule. The information appears fragmented, potentially resulting in an incomplete answer.

Solution:
Combine the events into a single vector or ensure that each vector retains full context.

Corrected Vector:
"Events in 2025:
- AI Future Summit – March 2025
- Future Technologies Conference – May 2025
- Business Automation Seminar – July 2025
- Innovation Expo – October 2025"

Conclusion: Avoid fragmenting data when creating vectors. Each vector should be self-contained and include all necessary information to provide a comprehensive answer to the user's query.
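
When a list is too long to keep in one vector, the second option from the solution above (each vector retains full context) can be sketched like this. The function name `chunk_list_with_heading` and the `max_items` limit are hypothetical choices made for illustration.

```python
def chunk_list_with_heading(heading: str, items: list[str], max_items: int) -> list[str]:
    """Split a long list into chunks, repeating the heading in every chunk."""
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    chunks = []
    for start in range(0, len(items), max_items):
        # Every chunk begins with the heading so it stays self-contained.
        lines = [heading] + [f"- {item}" for item in items[start:start + max_items]]
        chunks.append("\n".join(lines))
    return chunks


events = [
    "AI Future Summit – March 2025",
    "Future Technologies Conference – May 2025",
    "Business Automation Seminar – July 2025",
    "Innovation Expo – October 2025",
]
for chunk in chunk_list_with_heading("Events in 2025:", events, max_items=2):
    print(chunk)
    print("---")
```

Each printed chunk starts with "Events in 2025:", so a chunk retrieved on its own still says which schedule its events belong to.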

Example of two-vector retrieval:
For the question, "How does supervised learning work?" the model may retrieve the following two vectors:

"Supervised learning involves training a model on labeled data to make predictions based on new data."
"Supervised learning is used when the output variable is known, and the goal is to predict the output for new inputs."

4. Best Practices for Vector Addition

When adding new vectors to your knowledge base, consider the following best practices to ensure smooth integration and high-quality results.

Structure of a New Vector:

Clear Answer: Ensure the vector clearly and accurately answers the question.
Question Association: Link each vector to one or more questions.
Contextual Relevance: Make sure the vector is relevant to the domain or topic the knowledge base covers.

Example:
Adding a vector for the topic "Types of Neural Networks":

Vector: "Types of neural networks include feedforward networks, convolutional networks, and recurrent networks."
Questions:
- "What are the different types of neural networks?"
- "Can you explain the different types of neural networks?"

Versioning Your Vectors

As your knowledge base evolves, the answers may need to be updated. Record what changed and when, and remove or replace superseded vectors so that outdated information is not retrieved.
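
One possible way to keep track of changes is to store a version number with each entry and use only the newest one. The sketch below is an assumption-based illustration: the `VersionedEntry` record, its fields, and the sample entries are all made up and do not describe a built-in feature.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class VersionedEntry:
    entry_id: str      # stable identifier shared by all versions of an entry
    version: int       # increases with every update
    text: str
    updated_on: date


def latest_versions(entries: list[VersionedEntry]) -> list[VersionedEntry]:
    """Keep only the highest version of each entry_id."""
    latest: dict[str, VersionedEntry] = {}
    for entry in entries:
        current = latest.get(entry.entry_id)
        if current is None or entry.version > current.version:
            latest[entry.entry_id] = entry
    return list(latest.values())


# Made-up sample history for illustration only.
history = [
    VersionedEntry("opening-hours", 1, "We are open 9:00 to 17:00.", date(2024, 1, 10)),
    VersionedEntry("opening-hours", 2, "We are open 8:00 to 18:00.", date(2025, 3, 1)),
    VersionedEntry("returns", 1, "Returns are accepted within 30 days.", date(2024, 6, 5)),
]
for entry in latest_versions(history):
    print(entry.entry_id, entry.version, entry.text)
```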

5. Conclusion

Creating a knowledge base is a dynamic process that requires attention to detail and ongoing refinement. By following these best practices, you can ensure that your AI model has access to a high-quality, efficient knowledge base that helps it deliver the best possible answers to users. Remember that the clarity, length, and structure of your vectors are key to success, as is attaching the right questions to each piece of information.

With a well-constructed knowledge base, your AI will be more effective, responsive, and accurate in addressing user needs.
