Tree of Thoughts (ToT) Explained: Improving Reasoning in RAG & Agentic AI

Published March 2026 · RAG · 5 min read · Agentic AI · Multi-Path Reasoning · LLM Systems

While building production-grade RAG and agentic AI systems in the BFSI (banking, financial services, and insurance) domain, I kept running into the same problem: traditional Chain-of-Thought (CoT) prompting breaks down in multi-path reasoning scenarios such as financial decisioning and risk exploration.

To solve these kinds of problems, I implemented Tree of Thoughts (ToT), a structured reasoning approach that improves exploration and decision quality by evaluating several candidate reasoning paths instead of one.

In this article, I’ll break down:

Why Chain-of-Thought (CoT) fails in real-world production systems
How Tree of Thoughts (ToT) works internally with multi-path reasoning
How you can implement this approach using the Gemini API

If you're building similar AI systems and facing reasoning limitations, this article will be directly useful.

Why CoT fails in real-world systems?

Before jumping into Tree of Thoughts, let’s first understand why Chain-of-Thought (CoT) fails in multi-path reasoning scenarios.

The major issue with CoT prompting is that it assumes there is only one correct reasoning path. However, in real-world problems, we often have multiple possible interpretations, multiple valid solutions, or alternative decision-making processes.

Example

Let’s assume we are building a risk evaluation pipeline for loan applications:

Path 1: Based on income stability → Approve
Path 2: Based on credit history → Reject
Path 3: Based on recent transactions → Conditional

In this scenario, Chain-of-Thought (CoT) typically selects one path and commits early. It does not explore alternative reasoning paths that could potentially lead to a better decision.

The problem with early commitment is that the final decision becomes biased toward the initial reasoning path. If an early step in the reasoning is incorrect, the entire process becomes flawed, often resulting in confident but incorrect outputs.

Simply put, CoT is a linear reasoning strategy trying to solve inherently non-linear problems.
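To make the early-commitment problem concrete, here is a toy sketch. The tree, its step names, and every score are invented for illustration and do not come from a real loan pipeline. A greedy walk stands in for CoT-style behaviour (commit to whichever next step looks best) and is compared with scoring every complete path.

```python
# Hypothetical reasoning tree: each node maps to (local_score, children).
# Scores are invented to illustrate early commitment, not real risk data.
TREE = {
    "start": (0, ["income_stability", "credit_history"]),
    "income_stability": (8, ["approve"]),
    "credit_history": (6, ["check_recent_transactions"]),
    "approve": (1, []),
    "check_recent_transactions": (9, []),
}


def greedy_path(node="start"):
    """Commit to the best-looking child at every step (CoT-like behaviour)."""
    path = [node]
    while TREE[node][1]:
        node = max(TREE[node][1], key=lambda child: TREE[child][0])
        path.append(node)
    return path


def all_paths(node="start"):
    """Enumerate every root-to-leaf path in the tree."""
    children = TREE[node][1]
    if not children:
        return [[node]]
    return [[node] + rest for child in children for rest in all_paths(child)]


def path_score(path):
    """Total score of a path: the sum of its nodes' local scores."""
    return sum(TREE[n][0] for n in path)


greedy = greedy_path()
best = max(all_paths(), key=path_score)
print(greedy, path_score(greedy))
print(best, path_score(best))
```

The greedy walk picks income_stability because it looks best locally and finishes with a total of 9, while the path it never explored scores 15.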

What is Tree of Thoughts?

Tree of Thoughts (ToT) is a prompting technique that explores multiple reasoning paths instead of following a single linear chain.

In this approach, each node in the tree represents a thought, and each edge represents a transition from one thought to the next. A path from the root to a leaf is one complete line of reasoning.

Major Components of Tree of Thoughts

To understand how Tree of Thoughts (ToT) works, let’s break it down into its key components:

State

The state represents the current memory of the system, containing the most relevant and important thoughts generated so far. Execution starts from an initial state, which can be an instruction, a user query, or any problem definition.

As the process progresses, this state is continuously updated with better and more refined thoughts.
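The implementation section of this article constructs State objects and reads their description, steps, and score attributes, but the class itself is not shown. A minimal sketch of what such a class could look like, assuming only those three attributes and a string format that matches the output logs:

```python
from dataclasses import dataclass, field


@dataclass(repr=False)
class State:
    """One node in the tree: the reasoning so far plus its latest score."""
    description: str                                 # full path as readable text
    steps: list[str] = field(default_factory=list)   # thoughts chosen so far
    score: int = 0                                   # set by the evaluation step

    def __str__(self):
        return f"{self.description} | Score: {self.score}"

    # Lists print their items with repr(), so reuse the same format there
    __repr__ = __str__


# The initial state holds only the problem context
initial_state = State("Invest in Pune real estate")
print(initial_state)
```

Keeping both a readable description and a list of steps is a design choice: the description is convenient for prompts and logs, while the list makes it easy to inspect or count the individual steps.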

Thoughts

Simply put, thoughts are different reasoning outputs generated by the LLM. Using a ToT prompt, we generate multiple (K) candidate thoughts based on the current state.

New State

A new state is created by updating the existing state with selected thoughts. This is where state transition happens.

We take the generated thoughts, expand them, and form a new state. This updated state is then used for the next round of thought generation.

In summary: new state = current state + selected thought.

Evaluation

One of the most important aspects of ToT is evaluating multiple thoughts and assigning a score to each of them. Based on these scores, we can apply selection strategies such as beam search to retain only the top-k thoughts.

So the process becomes: generate K thoughts → score each one → keep the top-k → build the new states.

Repeat for N iterations

This process is repeated multiple times to explore better and more diverse reasoning paths. The number of iterations (N) is a hyperparameter, typically determined through experimentation based on the use case.
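The loop described above can be sketched without any LLM. In this sketch, propose and score are hypothetical stand-ins for the thought generator and the evaluator, and the toy task (pick numbers whose sum gets close to a target) is invented purely so the code runs offline.

```python
import heapq
from typing import Callable

Path = tuple[int, ...]


def tree_of_thoughts(
    initial: Path,
    propose: Callable[[Path], list[int]],
    score: Callable[[Path], float],
    beam_width: int = 2,
    n_iterations: int = 3,
) -> list[Path]:
    """Generate -> evaluate -> select, repeated n_iterations times."""
    frontier = [initial]
    for _ in range(n_iterations):
        # Generate: extend every kept state with each proposed thought
        candidates = [state + (t,) for state in frontier for t in propose(state)]
        # Evaluate and select: score all candidates on one scale, keep the best
        frontier = heapq.nlargest(beam_width, candidates, key=score)
    return frontier


# Toy stand-ins (assumptions for illustration, not an LLM)
TARGET = 15


def propose(state: Path) -> list[int]:
    return [1, 3, 5]  # K = 3 candidate "thoughts" per state


def score(state: Path) -> float:
    return -abs(TARGET - sum(state))  # closer to the target is better


result = tree_of_thoughts((), propose, score)
print(result)
```

Here every candidate is scored by the same function, so scores are comparable across parents. With an LLM evaluator that property has to be designed in, for example by ranking all candidates in one call.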

Implementation

Objective: Build a pipeline to assist in property purchase decision-making in Pune.

1. Loading the Gemini Model


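        from google import genai
        from google.genai import types

        # Assumes the google-genai SDK is installed and GEMINI_API_KEY is set in the environment
        client = genai.Client()

        def get_response(prompt, max_output_tokens=800):
            # Send a single prompt to Gemini and return the text of the reply
            response = client.models.generate_content(
                model="gemini-3-flash-preview",
                contents=prompt,
                config=types.GenerateContentConfig(
                    max_output_tokens=max_output_tokens,
                )
            )

            # Fall back to an empty string if the reply carries no text
            return response.text or ""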

We will use this model to implement the Tree of Thoughts (ToT) approach. You can use any LLM of your choice depending on your requirements.

2. Helper Function to Generate K Thoughts


    # helper function to generate thoughts for a given state
    def generate_thoughts(state, num_thoughts=3):
        prompt = f"""
        You are an investment strategist.

        Current situation:
        {state.description}

        Steps so far:
        {state.steps}

        Generate exactly {num_thoughts} next strategies.

        STRICT RULES:
        - Return ONLY the strategies
        - No explanations
        - No intro text like "Here are..."
        - No numbering or bullets
        - Each strategy on a new line
        - Be concise (1 line per strategy)

        Output format:
        <strategy 1>
        <strategy 2>
        ...
        <strategy {num_thoughts}>
        """

        response = get_response(prompt)

        # drop blank lines, one thought per remaining line
        thoughts = [t.strip() for t in response.strip().split("\n") if t.strip()]

        # cap the list in case the model returns more lines than requested
        return thoughts[:num_thoughts]

This function generates K candidate thoughts based on the current state. These thoughts represent different possible reasoning paths.

3. Helper Function to Expand Thoughts


    # helper function to turn each thought into a new state by appending it
    # to the parent state's description and list of steps
    def expand_state(state, thoughts):
        new_states = []

        for t in thoughts:
            new_description = f"{state.description} -> {t}"
            new_steps = state.steps + [t]

            new_states.append(State(new_description, new_steps))

        return new_states

This function expands the generated thoughts, allowing deeper exploration of each reasoning path. It is used during the state transition step.

4. Helper Function to Evaluate State


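    import re

    # Evaluation function to rank the states from best to worst
    def rank_states(states):
        descriptions = "\n".join(
            [f"{i+1}. {s.description}" for i, s in enumerate(states)]
        )

        prompt = f"""
            Rank the following investment strategies from BEST to WORST.

            Be decisive. Avoid ties.

            {descriptions}

            Return ONLY ranking like:
            2 > 1 > 3
            """

        response = get_response(prompt)

        # pull out the numbers; skip duplicates and anything out of range
        # so a malformed reply cannot raise ValueError or IndexError
        order = []
        for token in re.findall(r"\d+", response):
            idx = int(token) - 1
            if 0 <= idx < len(states) and idx not in order:
                order.append(idx)

        # states the model left out of the ranking keep a score of 0
        for s in states:
            s.score = 0

        # assign scores based on rank: best gets len(states), then descending
        for rank, idx in enumerate(order):
            states[idx].score = len(states) - rank

        return states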

This function asks the LLM to rank the candidate states and converts the ranking into scores: the best of n candidates gets n and the worst gets 1. This step is critical for selecting the most promising paths.

In production, evaluation is often hybrid (LLM + rules + domain heuristics).
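One possible shape for such a hybrid evaluator is sketched below. The rules, keywords, and weights are hypothetical placeholders; in a real system they would come from domain experts, and the LLM score would come from a ranking step like rank_states.

```python
from typing import Callable

# A rule takes the candidate text and returns a score adjustment.
Rule = Callable[[str], float]


def mentions_rera(text: str) -> float:
    """Hypothetical heuristic: reward strategies that address regulatory checks."""
    return 1.0 if "rera" in text.lower() else 0.0


def under_construction_risk(text: str) -> float:
    """Hypothetical heuristic: penalise exposure to delivery risk."""
    return -1.5 if "under-construction" in text.lower() else 0.0


def hybrid_score(text: str, llm_score: float, rules: list[Rule],
                 llm_weight: float = 1.0, rule_weight: float = 1.0) -> float:
    """Blend the LLM's score with the sum of rule-based adjustments."""
    rule_total = sum(rule(text) for rule in rules)
    return llm_weight * llm_score + rule_weight * rule_total


rules = [mentions_rera, under_construction_risk]
a = hybrid_score("Target under-construction units near IT hubs", 3, rules)
b = hybrid_score("Select developers with strong RERA track records", 2, rules)
print(a, b)
```

In this example the rules overturn the LLM's preference: the candidate the LLM scored 3 ends at 1.5, and the one it scored 2 ends at 3.0. Whether rules should be able to do that is a weighting decision for each use case.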

5. Beam Search (Orchestration)


    # orchestrator, helper function to perform ToT
    def beam_search(initial_state, beam_width=3, max_steps=3):
        states = [initial_state]

        for step in range(max_steps):
            all_candidates = []

            for state in states:
                thoughts = generate_thoughts(state)
                new_states = expand_state(state, thoughts)

                ranked = rank_states(new_states)
                all_candidates.extend(ranked)

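            # Top-K selection across all candidates. Scores come from separate
            # per-parent rankings, so ties are common; the sort is stable and
            # keeps tied candidates in generation order.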
            all_candidates.sort(key=lambda x: x.score, reverse=True)
            states = all_candidates[:beam_width]

            print(f"\nStep {step+1}:")
            for s in states:
                print(s)

        return states

Here, based on our objective, we define:

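Initial state (problem context): initial_state
Depth (number of iterations): max_steps
Breadth (number of thoughts generated at each step): num_thoughts in generate_thoughts
Beam width (number of states kept after each step): beam_width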

Output Logs

Step 1:
Invest in Pune real estate -> Acquire commercial retail assets or land parcels along the emerging Pune Metro and Ring Road corridors. | Score: 3
Invest in Pune real estate -> Invest in premium gated communities in the Baner-Balewadi belt to capture upscale housing demand. | Score: 2
Invest in Pune real estate -> Target under-construction residential units near Hinjewadi and Kharadi IT hubs for high rental yields. | Score: 1

Step 2:
Invest in Pune real estate -> Target under-construction residential units near Hinjewadi and Kharadi IT hubs for high rental yields. -> Select Grade-A developers with strong RERA track records and proven delivery timelines to minimize execution risk. | Score: 3
Invest in Pune real estate -> Acquire commercial retail assets or land parcels along the emerging Pune Metro and Ring Road corridors. -> Partner with Grade-A developers to create high-density mixed-use projects leveraging Transit-Oriented Development (TOD) incentives near Metro stations. | Score: 2
Invest in Pune real estate -> Invest in premium gated communities in the Baner-Balewadi belt to capture upscale housing demand. -> Diversify into Grade-A commercial office spaces in Baner to benefit from the expansion of IT and Global Capability Centers. | Score: 2

Step 3:
Invest in Pune real estate -> Target under-construction residential units near Hinjewadi and Kharadi IT hubs for high rental yields. -> Select Grade-A developers with strong RERA track records and proven delivery timelines to minimize execution risk. -> Prioritize compact 1BHK and 2BHK configurations to align with the high-demand rental profile of IT professionals. | Score: 3
Invest in Pune real estate -> Acquire commercial retail assets or land parcels along the emerging Pune Metro and Ring Road corridors. -> Partner with Grade-A developers to create high-density mixed-use projects leveraging Transit-Oriented Development (TOD) incentives near Metro stations. -> Execute a capital recycling strategy via REIT inclusion or a strategic sale to institutional investors once occupancy stabilizes. | Score: 3
Invest in Pune real estate -> Invest in premium gated communities in the Baner-Balewadi belt to capture upscale housing demand. -> Diversify into Grade-A commercial office spaces in Baner to benefit from the expansion of IT and Global Capability Centers. -> Target high-street retail developments to service the surge in affluent residential occupancy. | Score: 3

Final State

        [Invest in Pune real estate -> Target under-construction residential units near Hinjewadi and Kharadi IT hubs for high rental yields. -> Select Grade-A developers with strong RERA track records and proven delivery timelines to minimize execution risk. -> Prioritize compact 1BHK and 2BHK configurations to align with the high-demand rental profile of IT professionals. | Score: 3,
        Invest in Pune real estate -> Acquire commercial retail assets or land parcels along the emerging Pune Metro and Ring Road corridors. -> Partner with Grade-A developers to create high-density mixed-use projects leveraging Transit-Oriented Development (TOD) incentives near Metro stations. -> Execute a capital recycling strategy via REIT inclusion or a strategic sale to institutional investors once occupancy stabilizes. | Score: 3,
        Invest in Pune real estate -> Invest in premium gated communities in the Baner-Balewadi belt to capture upscale housing demand. -> Diversify into Grade-A commercial office spaces in Baner to benefit from the expansion of IT and Global Capability Centers. -> Target high-street retail developments to service the surge in affluent residential occupancy. | Score: 3]

These output logs show how Tree of Thoughts explores multiple investment strategies step by step, evaluates them at each stage, and progressively refines the most promising options.

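The final state holds the reasoning paths that survived selection at every step. Note that each score is a rank among the children of one parent, so three paths with a score of 3 are each the best child of their parent rather than equally strong overall. Comparing them directly needs one more ranking pass over the final states.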

When NOT to use Tree of Thoughts

Simple Q&A → use CoT
Latency-sensitive APIs → avoid ToT
Deterministic workflows → use rules instead
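To see why latency matters, it helps to count LLM calls. The sketch below assumes the beam search shown in the implementation section (one generation call and one ranking call per kept state per step) and that the model always returns exactly num_thoughts thoughts. The function name is hypothetical.

```python
def estimate_llm_calls(beam_width: int = 3, max_steps: int = 3,
                       num_thoughts: int = 3) -> int:
    """Count LLM calls for a beam search that makes two calls per kept state."""
    calls = 0
    kept_states = 1  # the search starts from a single initial state
    for _ in range(max_steps):
        calls += 2 * kept_states  # one generate call + one rank call per state
        candidates = kept_states * num_thoughts
        kept_states = min(beam_width, candidates)
    return calls


print(estimate_llm_calls())         # defaults used in this article
print(estimate_llm_calls(5, 4, 3))  # a wider, deeper search
```

With the defaults this comes to 14 calls, and 28 for the wider search, against a single call for one CoT prompt. Calls within a step can be parallelised, but the steps themselves are sequential.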

Final Thoughts:

Tree of Thoughts (ToT) is a powerful prompting technique that enables structured exploration of multiple reasoning paths. It is particularly useful in RAG systems and agentic AI workflows where decision-making involves uncertainty and multiple possible outcomes.

However, its effectiveness depends on how well you design the thought generation, evaluation, and selection strategy, especially when balancing accuracy, latency, and cost in production systems.

If you are building RAG pipelines or agentic AI solutions and want to incorporate advanced reasoning techniques like ToT, feel free to reach out. I’d be happy to discuss how this can be applied to your specific use case.

Until next time, Happy learning!
