Claude Code: How to Use It Properly and Avoid Wasting Tokens

April 29, 2026 jarvis


Claude Code from Anthropic is one of the most capable AI tools for programmers, but its power has a dark side: every unnecessary message, imprecise instruction, or overlong session can cost you tens of extra dollars per month. We looked at how to use the tool effectively and where tokens are most often wasted.

What is Claude Code and why it deserves attention

Claude Code is an agentic coding assistant from Anthropic that works directly in your terminal, IDE, or browser. Unlike a classic chatbot, it doesn't just answer questions — it actively browses code, edits multiple files at once, runs tests, creates pull requests, and integrates with GitHub and GitLab. According to Anthropic, teams using Claude Code deploy code 7.6 times more often than before.

The tool is available in multiple forms: as a desktop application, as extensions for VS Code and JetBrains, as a web interface, as a mobile app, and as a Slack integration. For Czech developers, it's key that the service is fully available in the Czech Republic and the entire EU, including paid subscriptions by card. Anthropic explicitly states that user content in paid plans is not used for model training, which is an important advantage when it comes to protecting companies' intellectual property.

How much does Claude Code actually cost

Before we dive into saving tips, you need to know the pricing. Claude Code is part of the Claude Pro subscription and is not available standalone or in the free version.

Plans for individuals:

Pro: $17 per month with annual billing ($200 per year), or $20 per month. Includes Claude Code, Claude Cowork, uninterrupted access to Sonnet 4.6 and Opus 4.7 models, projects, and extensions for Excel, PowerPoint, and Word.
Max 5x: $100 per month — five times higher usage limit than Pro.
Max 20x: $200 per month — twenty times higher limit, priority during server load.

API prices (relevant for companies and advanced users):

Sonnet 4.6: $3 per million input tokens, $15 per million output tokens.
Opus 4.7: $5 per million input tokens, $25 per million output tokens.
Haiku 4.5: $1 per million input tokens, $5 per million output tokens.

For perspective: one token corresponds to roughly three-quarters of an English word, or about four characters. When you send a long source file and Claude responds with an extensive explanation, you can easily reach tens of thousands of tokens in a single session. In practice, this means that habits you are not even aware of can double your monthly costs, even if you never upgrade your plan.
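To make the API prices above concrete, here is a minimal sketch that turns token counts into dollars. The prices are the ones listed in this article; the dictionary keys are just labels chosen for this example (not official API model identifiers), and the 40,000 / 8,000 token session is a made-up illustration.

```python
# Prices in USD per million tokens, copied from the list above.
# The keys are illustrative labels, not official API model identifiers.
PRICES = {
    "sonnet-4.6": {"input": 3.00, "output": 15.00},
    "opus-4.7": {"input": 5.00, "output": 25.00},
    "haiku-4.5": {"input": 1.00, "output": 5.00},
}


def api_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated API cost in USD for the given token counts."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical session: 40,000 tokens read, 8,000 tokens written.
    for model in PRICES:
        print(f"{model}: ${api_cost(model, 40_000, 8_000):.2f}")
```

Under these assumptions the same session costs about $0.24 on Sonnet, $0.40 on Opus, and $0.08 on Haiku. Small per session, but it adds up quickly when the same history is re-read on every turn.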

One session, one task

The most common and most expensive mistake Claude Code users make is mixing topics in one conversation. Claude carries the entire conversation history as context: with every new message, the model re-reads all previous messages. If you switch to a new feature after sixty messages of debugging, Claude still reads through the entire debugging history to answer the first question about the new task.

The solution is simple: one session = one task. Once you complete a feature, fix a bug, or generate documentation, use the /clear command (in the terminal) or start a new conversation. This habit alone can reduce token consumption by 30–50 percent.
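The effect of splitting sessions can be sketched with a deliberately simplified model. It assumes that every exchange adds the same number of tokens to the history and that each turn re-reads the whole history; it ignores prompt caching, system prompts, and tool output, so the numbers are illustrative only. The function name and the 500-token figure are assumptions made for this example.

```python
def session_input_tokens(exchanges: int, tokens_per_exchange: int) -> int:
    """Total input tokens read over one session.

    Simplified model: turn k re-reads k exchanges' worth of tokens,
    so the total is tokens_per_exchange * (1 + 2 + ... + n).
    """
    return tokens_per_exchange * exchanges * (exchanges + 1) // 2


if __name__ == "__main__":
    # One long session of 60 exchanges vs. two sessions of 30 each.
    one_long = session_input_tokens(60, 500)
    two_short = 2 * session_input_tokens(30, 500)
    saving = 1 - two_short / one_long
    print(f"one session:  {one_long:,} input tokens")
    print(f"two sessions: {two_short:,} input tokens")
    print(f"saving:       {saving:.0%}")
```

In this toy model, one 60-exchange session reads 915,000 input tokens, while two 30-exchange sessions read 465,000 in total, a saving of about 49 percent. Real sessions will differ, but the growth pattern is the point: the cost of a session rises roughly with the square of its length.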

Combine queries, don't feed them to Claude one by one

The second widespread habit: sending queries individually because it feels more "natural." In reality, it's financially disadvantageous. Three separate messages mean three complete context loads. One message with three questions means one load — and often a better answer, because Claude sees the whole picture at once.

Instead of: "Check this function for errors." → "And is the exception handling sufficient?"
Write: "Check this function for errors and assess whether the exception handling is sufficient."

This principle also applies in regular chat with Claude: the more specific and comprehensive your query, the fewer tokens you waste on guesses and additional explanations.
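The difference between separate and batched questions can be worked through with a small sketch. All sizes here (a 2,000-token function, 20-token questions, 300-token answers) are hypothetical, and the model again ignores caching; it only counts how many input tokens the model has to read.

```python
def separate_input_tokens(context: int, question: int, answer: int, n: int) -> int:
    """Input tokens read when n questions are sent one message at a time."""
    total = 0
    history = context
    for _ in range(n):
        history += question  # the new question joins the history
        total += history     # the model reads everything so far
        history += answer    # its reply is re-read on the next turn
    return total


def batched_input_tokens(context: int, question: int, n: int) -> int:
    """Input tokens read when all n questions go into a single message."""
    return context + n * question


if __name__ == "__main__":
    print(separate_input_tokens(context=2_000, question=20, answer=300, n=3))
    print(batched_input_tokens(context=2_000, question=20, n=3))
```

With these made-up sizes, three separate messages read 7,020 input tokens, while one combined message reads 2,060. The output tokens are roughly the same in both cases, so the saving comes entirely from not re-reading the context.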

Edit the original message, don't add corrections

When Claude answers inaccurately, most users send a follow-up correction: "That's not what I meant, try it differently…" Each such correction is added on top of the history. Claude reads it together with everything previous — and tokens pile up.

It is much more effective to click the Edit button on the original message, modify it, and let Claude regenerate the answer. The original bad exchange drops out of the history, and token consumption stays at the original level. This trick is especially important in long sessions, where the cumulative effect of every unnecessary message quickly escalates.

Turn off Extended Thinking when you don't need it

Claude offers an Extended Thinking mode, in which the model thinks through the answer step by step before writing it. This process naturally consumes extra tokens, often a significant number of them.

For architecture design, complex logical tasks, or difficult decisions, it makes sense to turn on Extended Thinking. For writing an email, refactoring a function, generating tests, or summarizing a document, it is simply wasteful. The default setting should be off: turn it on only for tasks where you see a real benefit in answer quality.

Precision in prompts saves more than shortening answers

A vague instruction "Help me fix the error in the dashboard" forces Claude to search more files, guess context, and offer more solution variants. A specific instruction "Fix the null reference in the loadUserMetrics method in the dashboard/analytics.js file" gets straight to the point.

This principle sounds trivial, but its impact on tokens is enormous. A vague prompt often generates an answer two or three times longer than necessary — and you pay for every output token. Specificity is free; vagueness is expensive.

Use off-peak hours for longer work

On March 26, 2026, Anthropic changed how limits work: during peak times, the 5-hour usage window depletes faster. Peak time is 5:00–11:00 AM Pacific Time, which corresponds to 2:00–8:00 PM Central European Summer Time. Evening and night work (from the Czech perspective, after 8 PM) therefore lets you get significantly more out of your plan.

For Czech developers and companies, this means a practical tip: schedule extensive refactorings, documentation generation, or automated tasks for evening hours. The difference between peak and off-peak can be up to double the number of available tokens within the same subscription.
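The conversion of the peak window to Czech local time can be checked with Python's standard zoneinfo module. This is only a sketch: the function name is made up for this example, and the peak hours are the ones quoted in this article.

```python
from datetime import date, datetime, time
from zoneinfo import ZoneInfo

PACIFIC = ZoneInfo("America/Los_Angeles")
PRAGUE = ZoneInfo("Europe/Prague")


def peak_window_in_prague(day: date) -> tuple[str, str]:
    """Convert the 5:00-11:00 Pacific peak window to Prague local time."""
    start = datetime.combine(day, time(5, 0), tzinfo=PACIFIC)
    end = datetime.combine(day, time(11, 0), tzinfo=PACIFIC)
    return (
        start.astimezone(PRAGUE).strftime("%H:%M"),
        end.astimezone(PRAGUE).strftime("%H:%M"),
    )


if __name__ == "__main__":
    print(peak_window_in_prague(date(2026, 4, 29)))
```

For April 29, 2026, this gives 14:00 to 20:00 Prague time. Passing the date in explicitly matters because the US and the EU switch to and from daylight saving time on different dates; in the weeks in between, the window shifts by an hour (on March 26, 2026, for example, it falls on 13:00 to 19:00).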

When to consider an upgrade and when to change habits

If you regularly hit the Pro plan limit, it doesn't automatically mean you need Max. First, check whether you are wasting tokens through the habits described above. Users who switched from vague prompts to specific ones, introduced /clear between tasks, and started batching queries often report that their consumption dropped by 40–60 percent without any loss of productivity.

The Max plan is only worth it when you use Claude Code intensively for several hours a day on large codebases, need priority access during server load, or work in a team where every member needs a high volume of tokens. For individual developers and smaller teams, the Pro plan with disciplined usage is usually sufficient.
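To put the upgrade decision into numbers, the yearly cost of each plan can be derived from the prices quoted earlier in this article. The sketch assumes the Max plans are billed monthly with no annual discount, since none is listed here; the helper name is made up for this example.

```python
# Yearly cost in USD, derived from the plan prices quoted in this article.
# Assumption: Max plans are billed monthly, with no annual discount.
ANNUAL_COST = {
    "Pro (annual billing)": 200,
    "Pro (monthly billing)": 20 * 12,
    "Max 5x": 100 * 12,
    "Max 20x": 200 * 12,
}


def yearly_saving(cheaper: str, pricier: str) -> int:
    """Return how many USD per year the cheaper plan saves over the pricier one."""
    return ANNUAL_COST[pricier] - ANNUAL_COST[cheaper]


if __name__ == "__main__":
    for plan, cost in ANNUAL_COST.items():
        print(f"{plan}: ${cost}/year")
    print("Pro (monthly) vs. Max 5x:", yearly_saving("Pro (monthly billing)", "Max 5x"))
```

Under these assumptions, staying on monthly Pro instead of Max 5x saves $960 per year, and $1,000 with annual billing.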

Claude Code versus competition: Gemini, GitHub Copilot, Cursor

Claude Code is not the only player in the field. GitHub Copilot costs $10 per month and offers direct integration into Visual Studio, but its agentic capabilities are more limited: it completes code rather than independently making changes across the project. Cursor ($20 per month) combines an editor with AI and allows more advanced edits, but is tied to its own ecosystem. Gemini Code Assist from Google is free for individuals and integrates into VS Code and JetBrains, but in tests it lags behind in understanding more complex codebases.

The advantage of Claude Code is precisely its agentic approach: the ability to analyze the entire project, make changes across multiple files, run tests, and interact with GitHub — all without leaving the terminal. For complex refactorings and work with large projects, it is therefore often more effective than the competition, even though its price is somewhat higher.

Summary: six rules for effective Claude Code

One task, one session — use /clear or start fresh.
Batch queries: one message with multiple questions means one context load instead of several.
Edit the original prompt — instead of follow-up corrections, modify the source message.
Be specific — vagueness is paid for with every token.
Turn off Extended Thinking — turn it on only for complex tasks.
Work in the evening — off-peak (after 8 PM CEST) gets you more from your subscription.

Claude Code is an exceptionally capable tool, but its power fully shows only with disciplined use. With the right habits, an individual developer can stay on the Pro plan even with intensive daily work — and save hundreds of dollars per year.

Is Claude Code available for free?

No. Claude Code is included only in the paid plans, Pro ($20/month) and higher. It cannot be used in the free version of Claude. For students and academic institutions, Anthropic offers discounted academic plans.

How large a project can Claude Code process at once?

Claude Code uses agentic search, so it doesn't load the entire project at once, but browses relevant files based on context. Practically, this means it can work even with monorepos of tens of thousands of lines of code, as long as tasks are well specified. However, the more context it has to pull in, the higher the token consumption.

Can a company in the EU safely use Claude Code for internal code?

Yes. Anthropic guarantees in all paid plans that user content is not used for model training. For companies with higher security requirements, it offers an Enterprise plan with audit logs, SCIM provisioning, HIPAA compliance, and custom data retention controls. The service is fully available in the EU including the Czech Republic.