
Peering Inside The Black Box: How AI Thinks

Plus xAI Buys X, An AI IPO Bellyflop, The Natural Gas Race, And More

Welcome to another edition of the Neural Net!

In this edition: Anthropic tries to crack open the black box (no crowbar required), CoreWeave flirts with Wall Street, Musk expands his “X” empire, and Meta drops $10B on a rice field in Louisiana. Buckle up—AI never sleeps, and neither do the headlines.

Anthropic Attempts To Break Open The Black Box

One of the biggest mysteries in AI today isn’t what these models output—it’s how they arrive at the output. Large language models (LLMs) like Claude, ChatGPT, and Gemini spit out impressive responses, but under the hood? It’s mostly a black box.

Anthropic’s recent work on a framework called circuit tracing aims to reverse engineer that process by tracing the model’s internal logic. This gets at what data scientists call the black box problem in machine learning: we can see what goes in and what comes out, but not much about what happens in between.
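To make the black box problem concrete, here's a minimal sketch in Python, assuming scikit-learn is available (this is our illustration, not Anthropic's tooling, which is far more sophisticated). The input and the prediction are trivial to inspect; the "in between" is just matrices of learned weights:

```python
# A tiny neural network: we can watch what goes in and what comes out,
# but the learned parameters don't map to human-readable concepts.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0)
model.fit(X, y)

print(model.predict(X[:1]))   # the output: easy to observe
print(model.coefs_[0].shape)  # the inside: (10, 64) -> 640 weights in the
                              # first layer alone, none of which individually
                              # explains *why* the model predicted what it did
```

Circuit tracing aims squarely at that gap: mapping which internal features light up on the way from input to output.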

Research into explainability is growing fast—but AI tools are being adopted even faster, leaving us in a “cart-before-the-horse” situation. The issue isn’t unique to language models—it’s present across many ML applications, from forecasting to clustering. In general, the more complex the model, the more opaque its inner workings—especially in deep learning.

In a recent study, Anthropic used circuit tracing to peer inside the black box of their Claude model—and found some fascinating behavior. As they put it, “These findings aren’t just scientifically interesting—they represent significant progress towards our goal of understanding AI systems and making sure they’re reliable.”

So, What’s Going On Inside Claude’s “Mind”?

  1. Universal Concepts Across Languages
    Claude appears to think in a shared conceptual space that’s independent of language. When translating simple sentences, the model processes the same ideas similarly, regardless of the language used—hinting at a kind of “language of thought.”

  2. Long-Term Planning
    Even though Claude generates one word at a time, it often plans several words ahead. In poetry, for example, it thinks of rhyming words in advance and writes lines to land on them.

  3. Plausible But Wrong Reasoning
    Claude can sometimes prioritize agreeing with the user over being logically correct. This can lead to Claude providing “fake” reasoning when nudged.

Why Is Explainable AI Important?

🛠️ Debugging and Performance Enhancement
Explainability helps developers tune a model by revealing which features or inputs are driving its predictions (see the sketch after this list).

⚖️ Bias Detection
It can reveal hidden biases baked into the model’s training data or behavior—biases that might otherwise go unnoticed.

📈 Interpret Results with Confidence
Knowing how a model reached its conclusion gives meaning to the result. (Because when your boss asks, “How did you get that?”—“the model said so” isn’t going to cut it.)
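One common way to get at the first two points is feature attribution. Here's a minimal sketch using permutation importance from scikit-learn (the loan-style feature names are hypothetical, chosen to echo the examples in this edition): shuffle one feature at a time and watch how much the model's accuracy suffers.

```python
# Permutation importance: shuffle each feature and measure the accuracy drop.
# A large drop means the model leans heavily on that feature.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=1000, n_features=4, random_state=0)
feature_names = ["credit_score", "income", "existing_debt", "zip_code"]  # hypothetical

model = RandomForestClassifier(random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)

for name, importance in zip(feature_names, result.importances_mean):
    print(f"{name}: {importance:.3f}")
```

If something like zip_code turns out to dominate, that's both a debugging lead and a potential bias red flag, since location can proxy for protected attributes.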

The Reality: We’ve built powerful machine learning models faster than we’ve learned how to truly understand them.

The EU is working on an interim solution: putting guardrails around unexplainable AI by sorting models into risk tiers. In high-risk domains like healthcare, black-box models may be restricted, while low-risk use cases like spam filtering face far fewer constraints.

Ideally, every model would be as transparent as a decision tree, where you can follow a simple yes/no flowchart: Does the applicant have a credit score over 700? Yes → Do they have existing debt? No → Approve the loan. You can see every step. There’s no mystery.
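That flowchart translates directly into code anyone can audit (a toy sketch; the threshold comes straight from the example above):

```python
# The loan flowchart, written out: every branch is visible, so the
# "reasoning" behind any decision is fully traceable.
def approve_loan(credit_score: int, has_existing_debt: bool) -> bool:
    if credit_score > 700:          # Does the applicant have a score over 700?
        if not has_existing_debt:   # Do they have existing debt?
            return True             # No -> approve the loan
    return False                    # otherwise -> decline

print(approve_loan(credit_score=720, has_existing_debt=False))  # True
```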

That’s why work like Anthropic’s matters. The future of AI isn’t just about power and performance. It’s about trust, transparency, and understanding what’s going on inside the box before we hand it the keys to everything else.


Heard in the Server Room 

Elon Musk just pulled another power move—his AI startup, xAI, is absorbing X (formerly Twitter) in an all-stock deal that values them at $80 billion and $33 billion, respectively. The plan? Use X’s firehose of real-time data to supercharge xAI’s models (like its chatbot, Grok) while making it easier to secure funding under the newly formed xAI Holdings Corp. Musk has played this game before—remember when he merged Tesla and SolarCity? Now, he’s betting big that AI and social media together will be a winning formula.

CoreWeave’s much-anticipated IPO turned into a flop, pricing below expectations and closing its first trading day flat—a stark contrast to last year’s AI frenzy. Investors pushed back on the company’s high costs and heavy reliance on Microsoft for revenue, highlighting broader concerns about the sustainability of the AI boom. The weak debut could chill other IPO hopefuls like StubHub and Klarna, as bankers had been eyeing 2025 as a comeback year for public listings. With stock-market volatility adding to the uncertainty, companies may think twice before making the leap.

Apple's taking a swing at the healthcare game with "Project Mulberry," an AI-powered triple threat featuring a digital doctor, a souped-up Health app, and a personalized coaching service. The AI physician will dish out custom medical insights, while the health coach, likely also powered by AI, will assist users with fitness, nutrition, and lifestyle improvements. Rumor has it these features could be available as soon as Spring 2026, but given how often Siri’s AI revamp has been delayed, we aren’t holding our breath.

AI’s Not-So-Secret Crush On Natural Gas

Holly Ridge, Louisiana: population low, AI potential sky-high. For nearly two decades, a massive swath of farmland in Louisiana has seen one big economic hope after another fall through—from vanishing farm jobs to a failed auto plant deal. But now, Meta is setting up shop, building a $10 billion, 4-million-square-foot data center to train future versions of its open-source Llama AI models. The project promises 500 permanent jobs and a wave of construction crews, power demand, and—hopefully—momentum.

But this isn’t just about jobs and servers—it’s about power, literally. Meta’s facility alone could consume 15% of Louisiana’s current electricity generation, prompting an energy company to propose $3.2 billion in new natural gas plants. That’s raising eyebrows: Meta’s contract lasts just 15 years, while those gas plants could stick around for decades. (Translation: if you live in Louisiana, you may get stuck paying for Zuckerberg’s leftover power plants.)

Here’s the irony: the same tech companies that love to tout their clean energy goals are suddenly getting cozy with natural gas. Why? Because building massive AI systems takes enormous amounts of electricity. We recently explored the energy battle behind AI in another edition—read more here.

Still, locals and lawmakers are mostly on board. The jobs pay well, the land is selling, and rural Louisiana just might get a second act as a tech hub!

That’s all folks! Here’s to a strong start to your week—keep exploring, and we’ll see you in the next edition.


  • ❓Have a question or topic you’d like us to discuss? Submit it to our AMA!

  • ✉️ Want more Neural Net? Check out past editions here.

  • 💪 Click here to learn more about us and why we started this newsletter

  • 🔥 Like what you read? Help us grow the community by sharing the Neural Net!