Top benchmarks and agent performance
According to independent evaluation by Artificial Analysis, Grok 4.3 achieved a score of 53 points on the Intelligence Index, placing it in seventh place in the overall ranking. It thus surpassed models such as Muse Spark or Claude Sonnet 4.6 and beat its predecessor Grok 4.20 by four points. However, the most significant improvement was achieved in the area of agent tasks — that is, in scenarios where the AI itself calls tools, browses data, and fulfills complex instructions.
On the GDPval-AA benchmark, which measures performance on real economically valuable tasks with access to the web and terminal, Grok 4.3 scored ELO 1500. This is 321 points more than the previous version Grok 4.20 and means the model surpassed Gemini 3.1 Pro Preview, Muse Spark, GPT-5.4 mini, and Kimi K2.5. It is still several hundred points behind the leading GPT-5.5 (xhigh), but its performance leap in this category is one of the most significant in recent months.
In the area of instruction following, Grok 4.3 also performed excellently: on the 𝜏²-Bench Telecom benchmark it achieved 98% and on IFBench it maintained 81%. These tests verify how precisely the model follows given instructions — a key skill for automating enterprise processes.
Dominance in enterprise domains
In addition to general rankings, xAI also boasts results in tests by ValsAI, which focuses on specific industries. Here Grok 4.3 took first place in the categories of case law with an accuracy of 79.3% and corporate finance with a result of 68.5%. In the overall Vals Index, the model placed 13th out of 46 evaluated models with an overall score of 62.6%.
This specialization may be particularly interesting for Czech and European companies. Legal and financial domains are among the areas where artificial intelligence is making the fastest inroads, and the model's ability to understand complex legal documents or financial statements opens up possibilities for automating compliance, due diligence, or legal research.
Speed and context window
xAI describes Grok 4.3 as its fastest model to date. According to Artificial Analysis measurements, it generates 107 tokens per second, placing it fourth in the speed ranking behind models such as gpt-oss-120B or Gemini 3.1 Pro Preview. For developers, this means significantly smoother interaction even when working with long documents.
A key advantage is also the million-token context — that is, the model's ability to process the equivalent of approximately 750,000 words in Czech at once. This enables analysis of entire books, extensive legal files, source code of large projects, or complete financial statements without the need to split them into parts. In practice, this greatly simplifies workflows where data from multiple sources needs to be compared at once.
A price that pressures the competition
One of the strongest selling points of Grok 4.3 is its price. xAI has set rates at $1.25 per million input tokens and $2.50 per million output tokens. Compared to the previous version Grok 4.20, this represents a reduction in input price of 37.5% and output price of 58.3%. According to Artificial Analysis calculations, running the complete benchmark suite costs $395, which is roughly 20% less than its predecessor.
In the market context, this means that Grok 4.3 is among the most efficient models in its class in terms of price-to-performance ratio. In blended pricing (3:1 input to output ratio) it comes out to $1.6 per million tokens, which is less than Gemini 3.1 Pro Preview ($4.5), GPT-5.4 ($5.6), or Claude Opus 4.7 ($10.9). For European startups and developers, this can be an interesting alternative when building applications where not only performance but also operating costs play a role.
Availability in the Czech Republic and Europe
The model is available through the xAI API, which is globally accessible including the European Union. For Czech developers and companies, this means the ability to integrate Grok 4.3 into their own applications without geographical restrictions. End users can also use the model through subscriptions on the social network X (formerly Twitter), where Grok is integrated into higher subscription tiers.
Czech language support is not explicitly declared by xAI on the official level, but models of this class typically master dozens of languages including Czech. However, for demanding legal or financial tasks in the Czech environment, it is still advisable to verify the quality of outputs on specific datasets. Given that this is a closed proprietary model, companies should also consider the requirements of the EU AI Act, especially if they plan deployment in regulated areas such as law or financial advisory.
Comparison with competition at a glance
| Model | Intelligence Index | Speed (tok/s) | Price 1M tokens (blend) |
|---|---|---|---|
| GPT-5.5 (xhigh) | 60 | 76 | 11.3 $ |
| Claude Opus 4.7 | 57 | 49 | 10.9 $ |
| Gemini 3.1 Pro Preview | 57 | 137 | 4.5 $ |
| Grok 4.3 | 53 | 107 | 1.6 $ |
| DeepSeek V4 Pro | 52 | 34 | 2.2 $ |
Conclusion
Grok 4.3 is a significant step forward for xAI. While it still lags behind top models like GPT-5.5 or Claude Opus 4.7 in absolute intellectual performance, in the area of agent tasks and instruction following it has managed to climb among the absolute top. Combined with aggressive pricing and a million-token context, it makes for an interesting tool for developers and companies looking for a powerful yet financially accessible model for automating complex workflows.
For the Czech scene, the news of more competition in the field of large language models is encouraging — pressure on prices and performance increases ultimately benefit all users, from individual developers to large corporations.
How does Grok 4.3 differ from version Grok 4.20?
Grok 4.3 brings particularly significant improvement in agent tasks (an increase of 321 ELO points on GDPval-AA), faster text generation, and significantly lower prices — by 37.5% on input and 58.3% on output. It is also the first xAI model with a context window of 1 million tokens.
Can I use Grok 4.3 for Czech legal or financial texts?
The API is available from the Czech Republic and the entire EU, and the model theoretically supports Czech as one of many languages. However, for demanding legal or financial tasks in the Czech environment, it is advisable to perform your own verification of output quality, as ValsAI benchmarks were measured primarily on English data.
Is Grok 4.3 open-source?
No, Grok 4.3 is a closed proprietary model available only through xAI's API or as part of a subscription on the X platform. Currently, xAI does not disclose the model weights or technical details about its architecture.