Anthropic’s Prompt Engineer Interactive Guide
Plus: Anthropic’s Prompt Engineering Interactive Tutorial
Read time: 5 minutes
AI models that can actually “see” are performing worse than the average human. Why can’t the sharpest AI minds quite match humans when it comes to visual reasoning, and why might text-only models still be the smarter choice?
What’s on FIRE
IN PARTNERSHIP WITH HUEL
You’re doing breakfast wrong
Let’s face it: most breakfast options just don’t cut it.
Toast? Too light. Cereal? Mostly sugar. Skipping it altogether? Not ideal.
If you want real fuel to power your day, it’s time to upgrade to Huel Black Edition. This ready-in-seconds shake is packed with 40g of plant-based protein, 27 essential vitamins & minerals, and 0 artificial sweeteners: just science-backed nutrition to support your muscles, digestion, and more.
Oh, and did we mention? It’s delicious.
Right now, first-time customers get 15% off, plus a free t-shirt and shaker with code HUELSPRING, for orders over $75.
AI INSIGHTS
Genius Without a Face: Why the Smartest AI Can’t See (Yet)
OpenAI’s o3 model scored a staggering 135 on a real Mensa IQ test, placing it in the “genius” category, higher than most humans will ever reach. But before you assume AI is about to outthink us in every way, here’s something that might surprise you: AI models that can “see” (like GPT-4o Vision) performed worse than average humans. Why?
These models rely solely on text (no image input), and they’re currently the sharpest tools in the AI shed:
- OpenAI o3: 135 IQ (genius-level, #1 on the leaderboard)
- Claude-4 Sonnet: 127
- Gemini 2.0 Flash Thinking: 126
- Gemini 2.5 Pro: 124
All of these outperformed the average human IQ range (90–110) and even topped what’s considered gifted or high-IQ levels.
AI that can “see” seems to struggle when it comes to structured reasoning. These models, despite being multimodal and newer, scored below the human average:
- GPT-4o (Vision): 63
- Grok-3 Think (Vision): 60
For a human, scores this low would fall below 70, the usual cutoff for intellectual disability. That is a massive gap in performance.
All top 10 models were text-only, revealing that visual reasoning is still an AI weakness. OpenAI dominates the rankings, with multiple entries in the top 10. Meta’s Llama 4 Maverick scored 105, above average but below the top-tier models.
Why It Matters: Multimodal models aren’t quite there yet. Despite the hype around GPT-4o and other vision-capable AIs, they’re not matching humans in structured problem-solving. This widens the gap between marketing and reality. Companies may push multimodal AIs for their flashiness, but if you’re solving logic-heavy tasks, a simple text-only model may actually be smarter.
PRESENTED BY GENEVA TOURISM
The Science City You Didn’t Know You Needed to Visit
Did you know that the World Wide Web was born in Geneva, Switzerland? Indeed, the first version of the Web was proposed at CERN in 1989. Today the world-renowned center is home to the world’s largest particle accelerator and to the CERN Science Gateway, a must-see hub for science enthusiasts that features hands-on exhibits, immersive virtual reality experiences, and live demonstrations.
TODAY IN AI
AI HIGHLIGHTS
An Apple study reveals that major Large Reasoning Models (LRMs), like o3, Claude 3.7 Sonnet, DeepSeek-R1 and Gemini, collapse entirely when faced with complex problems. See the full report here.
After tough coding tests on 14 major LLMs, there are 5 clear winners and several AIs you should avoid. Pro models don’t guarantee better output. Here are the final results.
Google Gemini is testing a new feature called Temporary Chats, according to an APK teardown. It works like ChatGPT’s Temporary Chat mode but lets users opt out of data sharing.
Zapier’s CEO just shared the chart the startup uses to measure AI fluency, ranging from “unacceptable” to “transformative.” Where do you fall on the chart?
A new YC-backed startup just launched a frontier research agent that doesn’t stop until it finds what you need. It scores 94.9% on OpenAI’s SimpleQA. Try it here.
A 16-year-old boy killed himself after online criminals used fake AI nude photos to blackmail him for $3,000. The FBI warns these cruel scams are hitting more teens.
AI Daily Fundraising: Anysphere has secured $900 million in funding, achieving a $9.9 billion valuation. Key investors include Thrive Capital, Accel, and DST Global. Their AI tool generates $200 million annually.
AI SOURCES FROM AI FIRE
ToolDrop Episode 10 is LIVE (Hard to believe we’ve made it to 10 eps)
Here’s what’s in store for you this week (FREE download below):
- Anthropic’s Prompt Engineering Interactive Tutorial
- Collection of Awesome LLM Apps with AI Agents
- List of Free GPTs without needing a Plus subscription
- AI-Powered Task Management System
- Memory for AI Agents in 5 Lines of Code
Note: These exclusive resources & reviews are available only in our AI Fire community. That’s because you can freely ask for support or share your experience during testing there. Get your full breakdown here (no hidden fee)!
NEW EMPOWERED AI TOOLS
- Glims turns any photos or frames into catchy videos, right in your browser.
- Kling AI 2.1 offers faster rendering, lower costs & superior video quality.
- Moonlit builds scalable content workflows for SEOs and content teams.
- FuseBase AI agents unify internal & external teamwork with Notion-style.
- Agora is an AI search engine for millions of e-commerce stores and products.
AI QUICK HITS
- Apple launched live translation across Messages, FaceTime, and iPhone.
- Gemini lets you schedule recurring tasks, just like ChatGPT. Here’s how.
- Here’s how to get the most out of all the features in Google’s free AI Studio.
- Microsoft just dropped a free AI video creator, and it’s wildly easy to use.
- Anthropic quietly killed its AI blog, “Claude Explains,” just a month after launch.
AI CHART
The Lightning AI Drug R&D Has Been Waiting For: 1,000x Faster Than Physics
A new biomolecular AI from MIT’s Jameel Clinic and Recursion just achieved a major milestone: Boltz-2 predicts binding affinity with physics-grade accuracy, but does it 1,000 times faster. It’s the first deep learning model to rival traditional free energy perturbation (FEP) simulations.
What Is Boltz-2? It’s a next-gen biomolecular foundation model for predicting:
- 3D molecular structures
- Protein-ligand binding affinity
It is the successor to Boltz-1, already widely used as an open-source alternative to Google’s AlphaFold3.
What Makes It Different
- Jointly models structure and binding affinity in a single model.
- First AI model to match FEP-level affinity accuracy.
- Over 1,000x faster than physics-based simulation pipelines like FEP+ or OpenFE.
- Outperforms docking and ML methods on real-world drug screening benchmarks (e.g., MF-PCBA).
Benchmark Performance
- OpenFE Benchmark: Pearson correlation of 0.62, matching FEP performance.
- CASP16 Affinity Challenge: Outperformed all other submitted methods.
- Prospective Screening (TYK2): Top-10 Boltz-2 compounds validated by ABFE as strong binders.
- Crystal Structure Prediction: Matches or exceeds Boltz-1, especially on DNA, RNA, and antibody-antigen complexes.
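For readers wondering what a "Pearson correlation of 0.62" measures: it scores how closely predicted binding affinities track experimental ones, from -1 (perfectly inverse) to 1 (perfectly aligned). Below is a minimal sketch of the calculation. The function name `pearson_r` and the toy numbers are hypothetical and purely for illustration; they are not Boltz-2 or OpenFE data, and this is not the benchmark's actual evaluation code.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values (think: binding free energies in kcal/mol).
# These are made up for illustration only, not real benchmark results.
predicted = [-9.1, -7.4, -8.2, -6.0, -10.3]
experimental = [-8.7, -7.9, -7.5, -6.4, -9.8]

r = pearson_r(predicted, experimental)
print(f"Pearson r = {r:.2f}")
```

A value near 1 means the model ranks and spaces compounds much like the lab measurements do, which is why the metric is a common way to compare an affinity predictor against a physics-based method such as FEP.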
Boltz-2 supports molecular dynamics (MD) conditioning at inference for improved local accuracy. It was also optimized for GPU inference and large-scale use cases (e.g., SynFlowNet screening).
Boltz-2 aims to own the open ecosystem just like AlphaFold did, but with a broader scope (affinity + structure).
We read your emails, comments, and poll replies daily
Hit reply and say hello. We’d love to hear from you!
Like what you’re reading? Forward it to friends, and they can sign up here.
Cheers,
The AI Fire Team