feat: Add token counting utility + Add support for it in Compression #5593
Conversation
manuhortet
left a comment
Nice! Would be great to see tests for the model-specific counting functions too.
libs/agno/agno/models/aws/bedrock.py
    return response.get("inputTokens", 0)
except Exception as e:
    log_warning(f"Failed to count tokens via Bedrock API: {e}")
    return super().count_tokens(messages, tools)
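The hunk above is an API-first count with a local fallback. A minimal, self-contained sketch of that pattern (the helper name and the ~4-characters-per-token heuristic are illustrative assumptions, not the PR's actual code):

```python
from typing import Callable, List, Optional


def count_tokens_with_fallback(
    messages: List[str],
    api_counter: Optional[Callable[[List[str]], int]] = None,
) -> int:
    """Prefer a provider-side token count; fall back to a rough local estimate."""
    if api_counter is not None:
        try:
            return api_counter(messages)
        except Exception:
            # e.g. network error or unsupported model: fall through to the estimate
            pass
    # Crude fallback: ~4 characters per token is a common rule of thumb
    return sum(len(m) for m in messages) // 4
```

The fallback keeps compression working even when the provider call fails, at the cost of accuracy.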
Can't we just use this? It's probably the same counting mechanism, since it should just depend on the model encoding?
The token counting logic won't work for our Claude models
Why can't we use the count_tokens fn of the base Claude class: libs/agno/agno/models/anthropic/claude.py?
Bedrock supports non-Anthropic models as well, so the count_tokens fn of the base Claude class won't work. Also, I'm not sure the token counting is the same here, because Claude has intelligent caching.
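Since Bedrock hosts several model families, the counting path has to be chosen per model id. A hedged sketch of such a dispatch (the function and strategy names are hypothetical, not the PR's actual code):

```python
def pick_counter(model_id: str) -> str:
    """Route to a family-specific counting strategy based on the Bedrock model id."""
    mid = model_id.lower()
    if "anthropic" in mid or "claude" in mid:
        # Could delegate to the Claude count_tokens path
        return "anthropic_counter"
    if "amazon" in mid or "titan" in mid:
        return "titan_counter"
    # Heuristic fallback for every other hosted family
    return "generic_estimate"
```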
cookbook/agents/context_compression/token_based_tool_call_compression.py
self,
messages: List[Message],
tools: Optional[List] = None,
main_model: Optional[Model] = None,
target_model?
Better to just call both model.
# Add a function call for each successful execution
function_call_count += len(function_call_results)

all_messages = messages + function_call_results
Why are we changing this? I think you're probably right, but there was a reason we did it here.
Before, we were limited by tool call count, but now we can estimate the token count for messages before API calls, so I moved this up (before the first API call).
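The reply describes estimating cost before the request instead of reacting to call counts. A minimal sketch of that idea, with the counting and compression strategies injected (all names here are illustrative assumptions):

```python
from typing import Callable, List


def maybe_compress(
    messages: List[str],
    tool_results: List[str],
    token_limit: int,
    count_tokens: Callable[[List[str]], int],
    compress: Callable[[List[str]], List[str]],
) -> List[str]:
    """Estimate the token cost of messages + tool results before the API call,
    and compress the tool results only when the estimate exceeds the budget."""
    all_messages = messages + tool_results
    if count_tokens(all_messages) > token_limit:
        all_messages = messages + compress(tool_results)
    return all_messages
```

For example, with a word-count tokenizer and a compressor that keeps only the first word of each tool result, a history that fits the budget passes through untouched while one over budget is shrunk before the request is sent.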
messages: List[Message],
tools: Optional[List[Union[Function, Dict[str, Any]]]] = None,
) -> int:
    if not self.vertexai:
Not true? Their client supports it in general? And it works with multimodal input, which is nice.
For Google AI Studio, system_instruction and tools are ignored by count_tokens.
For Vertex AI it works correctly.
tool_names.append(result.tool_name)
message_metrics += result.metrics

tool_name = ", ".join(tool_names) if tool_names else None
What does this do? It looks like you append all the tool names?
Gemini combines multiple tool results into a single message (unlike OpenAI/Claude), but we can still record the tool names as a comma-separated list like "search, calculator". Useful for logging (message.log()) and debugging.
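The joining logic itself is small. A self-contained sketch (the dict shape is assumed for illustration; the PR works on result objects):

```python
from typing import List, Optional


def combined_tool_name(results: List[dict]) -> Optional[str]:
    """Gemini returns several tool results in one message; keep a readable,
    comma-separated list of the tool names for logging and debugging."""
    names = [r["tool_name"] for r in results if r.get("tool_name")]
    return ", ".join(names) if names else None
```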
# Tool token counting

def _format_function_definitions(tools: List[Dict[str, Any]]) -> str:
What format does this create? Is this how it is done for OpenAI?
Added comments in the code.
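For context on the question: community token counters for OpenAI function calling typically render the tool schemas as a TypeScript-like namespace block and count the tokens of that text. A hedged sketch of that approach (this is the community convention, not necessarily the exact format this PR emits):

```python
from typing import Any, Dict, List


def format_function_definitions(tools: List[Dict[str, Any]]) -> str:
    """Render tool schemas roughly the way community counters do: a
    TypeScript-like namespace, whose text is then tokenized and counted."""
    lines = ["namespace functions {", ""]
    for tool in tools:
        fn = tool.get("function", tool)  # accept both wrapped and bare schemas
        if fn.get("description"):
            lines.append(f"// {fn['description']}")
        params = fn.get("parameters", {}).get("properties", {})
        args = ", ".join(
            f"{name}: {spec.get('type', 'any')}" for name, spec in params.items()
        )
        lines.append(f"type {fn['name']} = ({args}) => any;")
        lines.append("")
    lines.append("} // namespace functions")
    return "\n".join(lines)
```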
libs/agno/agno/utils/tokens.py
for msg in messages:
    total += _count_message_tokens(msg, model_id, tokens_per_message, tokens_per_name)

# Add 3 tokens for reply priming
What's the rationale here?
Added comments in the code.
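The "+3 for reply priming" comes from the OpenAI cookbook recipe: each message carries a fixed framing overhead, and every reply is primed with `<|start|>assistant<|message|>`, which costs a flat 3 tokens. A minimal sketch of that accounting, with the tokenizer injected as a stand-in (so this is the shape of the recipe, not a tiktoken-accurate count):

```python
from typing import Callable, List, Sequence, Tuple


def count_chat_tokens(
    messages: List[Tuple[str, str]],      # (role, content) pairs
    encode: Callable[[str], Sequence],    # stand-in tokenizer
    tokens_per_message: int = 3,
) -> int:
    """Cookbook-style accounting: fixed per-message framing overhead, plus a
    flat 3 tokens at the end for the assistant reply priming."""
    total = 0
    for _role, content in messages:
        total += tokens_per_message + len(encode(content))
    total += 3  # every reply is primed with <|start|>assistant<|message|>
    return total
```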
# Count tool tokens
if tools:
    includes_system = any(msg.role == "system" for msg in messages)
Can you please share the rationale here as well? Also, some comments here would be good.
Added comments in the code.
What is meant by more efficiently here?
Updated the comment
) -> int:
    tokens = tokens_per_message

    if message.role:
Can you please look into whether a model counts the "role" as part of the input tokens? More often than not, the role takes up a separate param.
Yup! Updated the algo.
libs/agno/agno/utils/tokens.py
Total token count for the text.
# gpt-4o models use the newer o200k_base encoding with 200k vocabulary
if "gpt-4o" in model_id.lower():
    return tiktoken.get_encoding("o200k_base")
What about GPT-5? Does that not use the newer o200k_base encoding?
We can use the tiktoken method that handles all models.
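tiktoken's `encoding_for_model` resolves the encoding from the model name and raises `KeyError` for unknown models, so code like this usually pairs it with a fallback. To keep the sketch runnable without tiktoken, here is the same resolution pattern in pure Python (the mapping below is an illustrative subset, not tiktoken's real table):

```python
# Illustrative subset of model-to-encoding mappings; newest entries first so
# that prefix matching prefers the more specific family (gpt-4o before gpt-4).
KNOWN_ENCODINGS = {
    "gpt-4o": "o200k_base",
    "gpt-4": "cl100k_base",
    "gpt-3.5-turbo": "cl100k_base",
}


def resolve_encoding(model_id: str) -> str:
    """Exact lookup first, then prefix match, then a safe modern default."""
    if model_id in KNOWN_ENCODINGS:
        return KNOWN_ENCODINGS[model_id]
    for prefix, enc in KNOWN_ENCODINGS.items():
        if model_id.startswith(prefix):
            return enc
    return "o200k_base"  # default guess for unrecognized / future models
```

With real tiktoken, the equivalent structure is `try: tiktoken.encoding_for_model(model_id) except KeyError: tiktoken.get_encoding("o200k_base")`.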
Co-authored-by: Yash Pratap Solanky <[email protected]>
Summary
Adds a comprehensive token counting utility that works across multiple model providers (OpenAI, Anthropic, AWS Bedrock, Google Gemini, LiteLLM) wherever supported.
Also integrates token-based compression into the existing CompressionManager.
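Conceptually, the integration means the manager decides to compress based on a token estimate rather than a call count. A minimal sketch of that trigger (the class and method names here are assumptions for illustration, not the PR's actual API):

```python
from typing import Callable, List


class TokenBasedCompression:
    """Trigger compression when the running token estimate crosses a budget."""

    def __init__(self, token_limit: int, count: Callable[[List[str]], int]):
        self.token_limit = token_limit
        self.count = count  # pluggable counter, e.g. the new utility

    def should_compress(self, messages: List[str]) -> bool:
        return self.count(messages) > self.token_limit
```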
Type of change
Checklist
- Ran ./scripts/format.sh and ./scripts/validate.sh