Harnessing AI for Intelligent Q&A Generation: A Deep Dive into CrewAI

In today’s fast-paced world, where information overload is a constant challenge, the ability to quickly extract key insights from lengthy documents is invaluable. Imagine a tool that could read through complex case studies, dense reports, or extensive articles, and not only generate relevant questions but also provide accurate answers and even evaluate the quality of those Q&A pairs. This is exactly what our innovative AI-powered Q&A generation system, built using the CrewAI framework, accomplishes.

The Power of AI-Driven Q&A Generation: A Short Story

Meet Purvi, a busy management consultant tasked with analyzing a 100-page report on emerging market trends. Traditionally, this would require hours of careful reading and note-taking. But with our AI-powered Q&A generation tool, Purvi’s workflow is transformed.

She feeds the report into the system, which quickly generates a set of insightful questions covering various aspects like key facts, SWOT analysis, critical decisions, and ethical considerations. Not only does it provide these questions, but it also produces comprehensive answers and even grades the quality of each Q&A pair.

In a matter of minutes, Purvi has a well-organized summary of the report’s key points, complete with thought-provoking questions and detailed answers. This allows her to quickly grasp the essence of the report and focus her energy on strategic analysis rather than information gathering.

Q&A Generation Tool

Here is a simple summarization tool that can help quickly summarize documents, given the topics to cover. It automatically creates the sections, summarizes the information in each section, and also provides extended information for each summary.

[Image: summary-tool]

Now, let’s dive into how this tool is implemented using the CrewAI agent framework.

Understanding LLM Agents

Large Language Model (LLM) agents are AI-powered entities that leverage the capabilities of advanced language models to perform specific tasks. These agents are designed to understand and generate human-like text, making them ideal for a wide range of applications, from content creation to complex problem-solving.

LLM agents work by processing input text, understanding the context and requirements, and generating appropriate responses or actions. They can be fine-tuned or prompted to specialize in particular domains or tasks, allowing them to leverage their broad knowledge base for specific applications.

Key features of LLM agents include:

Natural language understanding and generation
Contextual awareness
Task-specific specialization through fine-tuning or prompting
Ability to process and synthesize large amounts of information
Adaptability to various domains and applications

In our Q&A generation system, LLM agents play crucial roles as question generators, answer providers, and quality evaluators, each leveraging the power of language models to perform their specialized tasks.

Introduction to the CrewAI Framework

CrewAI is an innovative framework designed to orchestrate multiple AI agents to work together on complex tasks. It provides a structured approach to creating, managing, and executing workflows that involve multiple specialized AI agents.

Key features of CrewAI include:

Agent Definition: Easily create specialized agents with defined roles, goals, and backstories.
Task Management: Define and manage tasks for each agent, including input requirements and expected outputs.
Workflow Orchestration: Coordinate the execution of tasks across multiple agents in a specified order.
Flexible Integration: Seamlessly integrate with various LLM providers and tools.
Output Handling: Manage and process the outputs from each agent and task.

In our Q&A generation system, CrewAI allows us to create a team of specialized agents (QuestionGenerator, AnswerGenerator, and Grader) and orchestrate their tasks to work together seamlessly. This enables us to break down the complex process of Q&A generation and evaluation into manageable, specialized steps, each handled by an expert agent.

By leveraging CrewAI, we can create more sophisticated and effective AI systems that can tackle complex, multi-step problems in a structured and efficient manner.

Implementing Q&A Agents with CrewAI: A Step-by-Step Guide

Our Q&A generation system is built using the CrewAI framework, which allows us to create a team of specialized AI agents working together to accomplish complex tasks. Let’s break down the implementation into steps and examine the key code snippets in detail.

[Image: summary-tool]

Step 1: Setting Up the Environment

First, we import the necessary libraries and set up logging:

import os
import sys
import json
import PyPDF2
import pandas as pd
import logging
import configparser
from typing import List
from pydantic import BaseModel, Field, ValidationError
from crewai import Agent, Task, Crew, Process
from langchain_openai import ChatOpenAI

# Send INFO-level (and above) log records to stdout
logging.basicConfig(level=logging.INFO, stream=sys.stdout)
logger = logging.getLogger(__name__)

This setup is crucial for our project:

os and sys are used for system operations and command-line argument handling.
json is used for JSON data processing.
PyPDF2 allows us to read PDF files.
pandas is used for data manipulation and CSV file creation.
logging helps us track the execution of our script.
configparser is used to read configuration files.
pydantic is used for data validation and settings management.
crewai is the core framework we’re using for our AI agents.
langchain_openai provides an interface to OpenAI’s language models.
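
The main block in Step 6 calls a `read_pdf` helper that this article does not show. The following is only a sketch of what such a helper might look like, assuming a PyPDF2 release that provides `PdfReader`, `.pages`, and `extract_text()`. The `join_page_texts` name is hypothetical and exists only to keep the text-joining logic separate from file handling.

```python
from typing import Iterable, Optional


def join_page_texts(texts: Iterable[Optional[str]]) -> str:
    """Join per-page text, skipping pages that yielded nothing."""
    # Text extraction can return None or "" for image-only pages.
    cleaned = [(text or "").strip() for text in texts]
    return "\n".join(text for text in cleaned if text)


def read_pdf(filename: str) -> str:
    """Hypothetical helper: return all extractable text from a PDF file."""
    import PyPDF2  # imported lazily; assumes the PdfReader API is available

    with open(filename, "rb") as fh:
        reader = PyPDF2.PdfReader(fh)
        # Extract while the file is still open, then join the page texts.
        return join_page_texts(page.extract_text() for page in reader.pages)
```

Scanned PDFs without a text layer would come back empty with this approach, so a real script may want to check for an empty result before starting the crew.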

Step 2: Defining Data Models

We use Pydantic to define our data models, ensuring type safety and easy serialization:

class Question(BaseModel):
    context: str
    text: str
    question_type: str = Field(description="One of the question types mentioned in the instruction.")

class Answer(BaseModel):
    text: str

class QAPair(BaseModel):
    question: Question
    answer: Answer

class GradedQAPair(BaseModel):
    question: Question
    answer: Answer
    rating: int = Field(..., ge=1, le=5)
    explanation: str 

class QuestionList(BaseModel):
    questions: List[Question]

class QAPairList(BaseModel):
    qa_pairs: List[QAPair]

class GradedQAPairList(BaseModel):
    graded_qa_pairs: List[GradedQAPair]

These Pydantic models are crucial for our data structure:

Question includes the context, the question text, and the question type.
Answer simply contains the answer text.
QAPair combines a question and its corresponding answer.
GradedQAPair carries the same question and answer fields as QAPair (it does not inherit from it) and adds a rating (1-5) and an explanation for the grade.
QuestionList, QAPairList, and GradedQAPairList are used to manage lists of these objects.

Using Pydantic models allows us to easily validate our data structure and catch any inconsistencies early in the process.
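
To make the nesting concrete, here is a sketch of the JSON shape that corresponds to GradedQAPairList. The sample values are invented for illustration, and `check_rating` is a hand-written stand-in for what `Field(..., ge=1, le=5)` enforces; in the actual script Pydantic performs this check itself.

```python
import json

# Illustrative payload only; the text values are made up for this sketch.
payload = """
{
  "graded_qa_pairs": [
    {
      "question": {
        "context": "Excerpt from the case study",
        "text": "What is the main risk discussed?",
        "question_type": "Key facts"
      },
      "answer": {"text": "Dependence on a single supplier."},
      "rating": 4,
      "explanation": "Relevant and complete."
    }
  ]
}
"""


def check_rating(rating: int) -> int:
    """Stand-in for the ge=1, le=5 constraint on GradedQAPair.rating."""
    if not isinstance(rating, int) or not 1 <= rating <= 5:
        raise ValueError(f"rating must be an integer from 1 to 5, got {rating!r}")
    return rating


data = json.loads(payload)
ratings = [check_rating(pair["rating"]) for pair in data["graded_qa_pairs"]]
print(ratings)  # [4]
```

A rating of 6, or a missing `explanation` key, is the kind of inconsistency the Pydantic models would reject before the results reach the CSV step.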

Step 3: Creating Specialized Agents

We define three specialized agents using the CrewAI framework:

class QuestionGenerator(Agent):
    def __init__(self, llm):
        super().__init__(
            role="Question Generator",
            goal="Generate relevant questions based on the case study text",
            backstory="You are an expert in creating insightful questions from management case studies. You specialize in creating diverse types of questions.",
            llm=llm,
            verbose=True
        )

class AnswerGenerator(Agent):
    def __init__(self, llm):
        super().__init__(
            role="Answer Generator",
            goal="Provide accurate answers to questions using the given context",
            backstory="You are a knowledgeable expert in management studies, capable of answering complex questions based on case study information. You ensure your answers are comprehensive and directly relevant to the questions asked.",
            llm=llm,
            verbose=True
        )

class Grader(Agent):
    def __init__(self, llm):
        super().__init__(
            role="Grader",
            goal="Validate and rate question-answer pairs for quality, completeness, and relevance",
            backstory="You are an experienced evaluator of management case study analyses. You have a keen eye for detail and can effectively assess the quality, completeness, and relevance of questions and answers in the context of management studies.",
            llm=llm,
            verbose=True
        )

Each agent is a specialized AI entity with a specific role:

The QuestionGenerator creates insightful questions based on the input text.
The AnswerGenerator provides comprehensive answers to these questions.
The Grader evaluates the quality of the question-answer pairs.

By giving each agent a specific role, goal, and backstory, we use role prompting to steer the language model toward more focused and relevant outputs.

Step 4: Defining Tasks for Each Agent

For each agent, we define a corresponding task:

class QuestionGenerationTask(Task):
    def __init__(self, agent: QuestionGenerator):
        super().__init__(
            description="""
            Generate a set of {num_questions} relevant questions from the case study text given in the context.
            Ensure each question is clear, focused, and promotes deep understanding of the case.
            The question can be based on any of the following topics: 

            {instructions}

            Case study text: 
            <context>
            {text_chunk}
            </context>
            """,
            agent=agent,
            output_pydantic=QuestionList,
            expected_output="A list of Question objects"
        )

class AnswerGenerationTask(Task):
    def __init__(self, agent: AnswerGenerator, task: QuestionGenerationTask):
        super().__init__(
            description="""
            Produce accurate and comprehensive answers for the generated questions using the given context. 
            Use both the original text and the specific questions provided.
            Ensure each answer is:
            - Directly relevant to the question asked
            - Based on information from the case study
            - Clear and well-structured
            - Elaborate, with detailed context and descriptions
            - Not more than 500 words
            """,
            context=[task],
            agent=agent,
            output_pydantic=QAPairList,
            expected_output="A list of Question Answer Pair objects"
        )

class GradingTask(Task):
    def __init__(self, agent: Grader, task: AnswerGenerationTask):
        super().__init__(
            description="""
            Evaluate and rate each question-answer pair on a scale of 1 to 5 based on overall quality, completeness, and relevance to the given context.
            Provide a brief explanation for each rating.
            """,
            context=[task],
            agent=agent,
            output_pydantic=GradedQAPairList,
            expected_output="A list of GradedQAPair"
        )

Each task is carefully defined with specific instructions:

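QuestionGenerationTask tells the agent how many questions to produce and which topics to draw on; both are supplied at run time through the {num_questions} and {instructions} placeholders, alongside the source text in {text_chunk}.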
AnswerGenerationTask outlines the criteria for good answers, ensuring they are relevant, comprehensive, and well-structured.
GradingTask provides instructions on how to evaluate and rate the Q&A pairs.

The context parameter in AnswerGenerationTask and GradingTask allows these tasks to access the output of previous tasks, creating a chain of dependencies.
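
The `{num_questions}`, `{instructions}`, and `{text_chunk}` placeholders in the question task are filled from the `inputs` dictionary passed to `kickoff` in Step 6. The sketch below uses plain `str.format` as a stand-in for the substitution CrewAI performs, with a shortened copy of the description. The `instructions` value is hypothetical: the article's script does not show it, so it is built here from the topics mentioned in the introduction.

```python
# Hypothetical topic list; the article's script does not show its `instructions`.
instructions = "\n".join(
    f"- {topic}"
    for topic in (
        "Key facts",
        "SWOT analysis",
        "Critical decisions",
        "Ethical considerations",
    )
)

# Shortened copy of the task description, with the same three placeholders.
template = (
    "Generate a set of {num_questions} relevant questions.\n"
    "Topics:\n{instructions}\n"
    "<context>\n{text_chunk}\n</context>"
)

inputs = {
    "num_questions": 3,
    "instructions": instructions,
    "text_chunk": "Sample case study text.",
}

# str.format stands in for the substitution applied at kickoff time.
prompt = template.format(**inputs)
print(prompt)
```

Every placeholder in the description needs a matching key in `inputs`; with `str.format`, a missing key raises a `KeyError` immediately, which is a convenient way to check a template before spending tokens.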

Step 5: Creating the CrewManager

The CrewManager orchestrates the entire process:

class CrewManager:
    def __init__(self, llm):
        self.question_generator = QuestionGenerator(llm)
        self.answer_generator = AnswerGenerator(llm)
        self.grader = Grader(llm)

        self.question_task = QuestionGenerationTask(self.question_generator)
        self.answer_task = AnswerGenerationTask(self.answer_generator, self.question_task)
        self.grading_task = GradingTask(self.grader, self.answer_task)

        self.crew = Crew(
            agents=[self.question_generator, self.answer_generator, self.grader],
            tasks=[self.question_task, self.answer_task, self.grading_task],
            process=Process.sequential
        )

The CrewManager is the central orchestrator of our system:

It initializes all the agents and tasks.
It creates a Crew object, which manages the execution of tasks by the agents.
The Process.sequential parameter ensures that tasks are executed in order, with each task using the output of the previous task as input.
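
The idea behind sequential execution can be sketched in plain Python. This is not CrewAI's implementation; the `run_sequential` function and the three stub steps are hypothetical names used only to show how each step receives the output of the one before it.

```python
from typing import Callable, Dict, List

# Each step receives the original inputs plus the previous step's output.
Step = Callable[[Dict[str, str], str], str]


def run_sequential(steps: List[Step], inputs: Dict[str, str]) -> List[str]:
    """Run steps in order, handing each one the previous step's output."""
    outputs: List[str] = []
    previous = ""
    for step in steps:
        previous = step(inputs, previous)
        outputs.append(previous)
    return outputs


# Stub steps standing in for the three agents; no LLM is called here.
def generate_questions(inputs, _previous):
    return f"questions about: {inputs['text_chunk']}"


def generate_answers(_inputs, previous):
    return f"answers to [{previous}]"


def grade(_inputs, previous):
    return f"grades for [{previous}]"


outputs = run_sequential(
    [generate_questions, generate_answers, grade],
    {"text_chunk": "market report"},
)
print(outputs[-1])
# grades for [answers to [questions about: market report]]
```

The last element of `outputs` plays the role of the crew's final result, which is why the main script only inspects the grading output.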

Step 6: Executing the Q&A Generation Process

In the main execution block, we set up the CrewManager and run the process:

if __name__ == "__main__":
    # Check command-line arguments
    if len(sys.argv) < 3:
        print("Please provide a PDF filename and number of questions as arguments.")
        sys.exit(1)

    # Get input parameters
    pdf_filename = sys.argv[1]
    # read_pdf() is a helper (not shown in this article) that returns the PDF's text
    sample_text = read_pdf(pdf_filename)
    num_questions = int(sys.argv[2])

    # Read configuration and set up OpenAI API
    config = configparser.ConfigParser()
    config.read('agent.ini')
    os.environ["OPENAI_API_KEY"] = config['OpenAI']['API_KEY']

    # Initialize language model
    llm = ChatOpenAI(temperature=0, model="gpt-4o", max_tokens=1000)

    # Create and run CrewManager
    crew_manager = CrewManager(llm)
    result = crew_manager.crew.kickoff(
        inputs={
            "num_questions": num_questions,
            "text_chunk": sample_text,
            # `instructions` (the list of question topics) is not defined in this article
            "instructions": instructions,
        }
    )

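    # Process and save results.
    # Assumption: newer CrewAI releases return a CrewOutput wrapper whose
    # `.pydantic` attribute holds the parsed model; older releases returned
    # the model directly. This handles both cases.
    graded = getattr(result, "pydantic", None) or result
    if isinstance(graded, GradedQAPairList):
        data = [
            {
                'Question': graded_qa.question.text,
                'Type': graded_qa.question.question_type,
                'Answer': graded_qa.answer.text,
                'Ratings': graded_qa.rating,
                'Explanation': graded_qa.explanation
            } for graded_qa in graded.graded_qa_pairs
        ]

        df = pd.DataFrame(data)
        # splitext keeps dots elsewhere in the path intact (e.g. "./reports/q1.v2.pdf")
        csv_filename = f"{os.path.splitext(pdf_filename)[0]}.csv"
        df.to_csv(csv_filename, index=True)
        print(f"Results saved to {csv_filename}")
    else:
        print(f"Unexpected result type: {type(result)}")
        print(result)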

This main execution block ties everything together:

It checks for correct command-line arguments (PDF filename and number of questions).
It reads the PDF file and extracts the text.
It sets up the OpenAI API key from a configuration file.
It initializes the language model (in this case, GPT-4o with a temperature of 0).
It creates a CrewManager and runs the Q&A generation process.
Finally, it processes the results, converting them into a pandas DataFrame and saving them as a CSV file.
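
The script expects an `agent.ini` file with an `[OpenAI]` section containing `API_KEY`. The article does not show the file, so the layout below is inferred from the `config['OpenAI']['API_KEY']` lookup, and the key value is a placeholder. The sketch parses the same content from a string so it can run without a file on disk.

```python
import configparser

# Assumed layout of agent.ini; the key below is a placeholder, not a real key.
SAMPLE_INI = """
[OpenAI]
API_KEY = sk-your-key-here
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE_INI)  # the script itself uses config.read('agent.ini')

# Fail early with a clear message if the section or option is missing.
if not config.has_option("OpenAI", "API_KEY"):
    raise KeyError("agent.ini must define API_KEY under [OpenAI]")

api_key = config["OpenAI"]["API_KEY"]
print(api_key.startswith("sk-"))  # True
```

Note that `config.read` silently skips files it cannot find, so an explicit check like the one above gives a clearer error than the `KeyError` raised by the bare dictionary lookup.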

The Q&A Generation Workflow

Let’s break down how these components work together:

Question Generation: The QuestionGenerator agent reads the input text and generates a list of relevant questions based on the provided instructions.
Answer Generation: The AnswerGenerator agent takes these questions and the original text, producing comprehensive answers for each question.
Grading: The Grader agent evaluates each question-answer pair, assigning a quality rating and providing an explanation for the rating.
Result Processing: The final output is a list of graded question-answer pairs, which is then formatted and saved as a CSV file for easy analysis.

This workflow, as illustrated in the provided image, allows for a sophisticated, multi-step analysis of complex documents, leveraging the power of AI to extract and evaluate key information.

Conclusion

By harnessing the power of AI through the CrewAI framework, we’ve created a powerful tool for automated Q&A generation and evaluation. This system can significantly streamline the process of analyzing lengthy documents, allowing professionals like Purvi to focus on high-level analysis and decision-making rather than getting bogged down in information extraction.

As AI continues to evolve, tools like this will become increasingly crucial in helping us navigate the vast seas of information we encounter daily. By automating the process of question generation, answering, and evaluation, we’re not just saving time – we’re enhancing our ability to derive meaningful insights from complex data sources.

The modular nature of this system, with its specialized agents and tasks, allows for easy expansion and customization. For instance, you could add new types of questions, implement more sophisticated grading criteria, or even integrate additional AI models for specific domain expertise.

As we continue to push the boundaries of AI-assisted information processing, tools like this Q&A generation system will play a pivotal role in transforming how we interact with and derive value from complex information sources.