Overview
Just like an artificial flower isnβt a real flower, artificial intelligence isnβt real intelligence. Then what is AI? To develop a deeper understanding of AI, especially large language models (LLMs), Iβve been reading extensively, including news articles, research papers, thought leadership blogs, and overhyped videos, as well as a few books. I’ve also completed several online courses.Β
Unfortunately, much of the content online is not helpful. Too many make big claims, overhype the potential, and recycle the same talking points to get views or attention.
Ironically, it was the generic AI tools that helped me understand AI better than most other resources. Unlike books or videos, they are not a one-way communication. I could ask questions and get explanations in simple language. I usually ask them to explain complex ideas as if I were a 10-year-old. I follow up with as many cross questions as I need. The best part is, they never get tired or impatient. They just keep explaining, like some of the best teachers I have had in my life β patient, clear, and genuinely helpful.
Another source of knowledge in my learning journey is Andrej Karpathy. He was the Director of AI at Tesla and a founding member of OpenAI. His content stands out because itβs grounded. No hype, no ego. Just clear thinking, shared from first principles. Thatβs the approach we believe in at Contify too.
On June 17, 2025, Andrej gave a talk at AI Startup School in San Francisco titled βSoftware Is Changing (Again).β
In this article, Iβve tried to adapt some of the key ideas from that talk to a domain I work in every day: market and competitive intelligence platforms. Iβll explore what this technology actually is, how itβs reshaping the way we work, and what we should realistically expect from it.
Mental Models to Understand LLMs
Before we start using this new technology for serious business, we need to pause and ask: What exactly are we actually working with? Without a clear understanding, we risk misusing it, underutilizing it, or expecting too much from it.
To make sense of something as complex as large language models, Iβve found that simple analogies work best. So letβs break it down using analogies and develop a few clear mental models.
LLMs Are Like Electricity
In 2017, Andrew Ng said, AI is the new electricity. He predicted that just as electricity transformed nearly every industry a century ago, AI would do the same in the years to come.
At the time, it felt like a bold metaphor. Today, itβs starting to feel literal, especially when you see how LLMs are being used and consumed across industries.
Companies like OpenAI, Anthropic, and Google are investing heavily in training these models, similar to building a power plant. They then offer access to language tokens (basically, words in a language) through APIs. And we pay for those tokens (per million tokens) just like we pay for electricity (per kilowatt-hour).
Just like electricity, API users expect predictable performance, that is, reliability, low latency, and high uptime, so that they can build useful applications. These arenβt just βnice to haveβ expectations; theyβre baseline requirements for applications built over these new technologies.
When ChatGPT went down recently, it wasnβt just a technical outage. It was an intelligence blackout. Work stalled. Dashboards went blank. Teams got stuck. It was an important reminder that weβre building applications on a new kind of infrastructure, and we need to treat it with the same seriousness as electricity.
There is one key difference. Unlike hardware for electricity, LLMs donβt compete for physical space. So, we can have multiple options and swap them instantly. For example, at Contify, we use Llama, Gemini, OpenAI, and various embedding models across various components of the platform, selecting the most suitable models based on requirements, performance, and cost.
LLMs Are Like Time-Shared Mainframes
Another mental model to think about todayβs LLMs is to imagine weβve gone back to the 1960s. LLMs are like centralized mainframes, not personal computers.
These are not just new tools, theyβre an entirely new class of computing. And just like the mainframes of the past, theyβre powerful, centralized, and also expensive to run.
Most of us donβt load LLMs on our servers. We interact with them over the internet. We submit our queries, wait for our turn, and get back a response, just as users did 60 years ago with βtime-sharedβ mainframe computers.Β
The LLM runs in a data center. Weβre thin clients on the edge. Running LLMs is expensive; therefore, it makes sense to centralize them, similar to mainframe computers, and distribute access via APIs.Β
Until LLMs become far more efficient, using them on personal devices, such as laptops, will remain out of reach.
That said, Iβm very bullish about the potential of small language models (SLMs), especially when theyβre fine-tuned and optimized for your specific use cases. Several recent studies have shown that smaller models, when trained on focused tasks, can outperform larger models on accuracy, efficiency, and responsiveness. Theyβre also easier to deploy on your secured servers, making them a strong candidate for enterprise AI applications where control and customization matter most.
LLMs Are Like People Spirits
LLMs can talk like humans. They remember more facts than any of us. They can hold a conversation across domains and write with clarity and structure. But donβt confuse that for real intelligence.Β
They are people-pleasers by design. Trained to be helpful and agreeable. Thatβs why they say βyesβ more than they should, and when they donβt know something, theyβre more likely to make up an answer instead of admitting βI donβt know.β
Weβve all seen examples where they fabricate citations or URLs that look perfectly real, inventing non-existent papers, court cases, or product specs when asked for βsourcesβ. Theyβre optimized to produce plausible text, not to validate truth.
What we see is a simulation of how humans use language. LLMs are trained on massive amounts of internet text. They have identify and memorize patterns (weights) in how humans speak and write. They generate text, but they donβt understand it.Β
They donβt understand the physical world the way we do. Their core strength lies in predicting the next word based on the patterns they have memorized. They donβt build mental models like what we are doing right now. No physical intuition. Just the ability to predict the next word based on patterns they’ve seen. That prediction engine is powerfulβit makes their responses sound intelligent.
They have near-infinite memory but no real understanding. No awareness of whatβs right or wrong. Thatβs why they can hallucinate, which is just a fancy word for βthey make up thingsβ. They might solve a complex math problem and still claim that 9.11 is greater than 9.9. And that makes us question even their correct answers.Β
Another reason LLMs feel more like spirits than real people is their limited working memory, called a context window. Even advanced models with large context windows can only hold a conversation for a limited time, and only within a single session. Once the session ends, everything resets.
Itβs not like talking to a colleague who remembers your company, your past conversations, and gets better over time. Itβs more like the main character in Memento, who forgets everything that happened yesterday.
LLMs donβt have real long-term memory. What they βknowβ is frozen in their model weights. What they βrememberβ lives only in the current session. They donβt learn from experience the way humans do.
LLMs Are Like Operating Systems
If electricity explains the infrastructure, and people’s spirits explain the behavior, then this mental model helps us understand how the entire ecosystem is evolving.
Think of LLMs as a new kind of operating system (OS).
Theyβre not just utilities like electricity that flow through wires. They are becoming complex software environments. In this model, the LLM is the CPU. It is the core compute layer with memory (the context window), processing (token generation), and orchestration.
Your input prompt lives in the working memory (context window), where the LLM reads it and generates output, token by token, just like CPU cycles.Β
Closed models such as GPT and Claude are the Windows and macOS equivalents. Open models like LLaMA and Mistral are like Linux.
And to actually get work done on those OS, you use business applications like CRMs or ERPs, purpose-built tools with user interfaces, workflows, and connectors that integrate with the rest of your systems.
Can you interact directly with the OS? Sure, you can open a terminal and type commands to interact directly with the OS. But thatβs not how most people work. You wouldnβt run your CRM from a command line.
Generic AI tools such as ChatGPT or Claude are like those terminal windows, a text-based interface to powerful systems. But without a user interface, with no workflows or shared memory across sessions, and no knowledge of your organization.
Trying to run intelligence business functions directly through raw prompts is like managing enterprise software from a command line. Itβs technically possible, but practically exhausting.
Enterprises Need Purpose-Built Apps, Not Generic LLM Tools
LLMs are powerful, but using ChatGPT or Claude directly through text-based windows is ineffective. To use them well, you need applications with user interfaces, structured workflows, and working memory.
This is even more important in functions like market and competitive intelligence, which has complex workflows. You need an application to design structured workflows, persistent working memory, and integrations with other internal systems.Β
To succeed in the fast-moving, competitive landscape, you need more than prompts. You need systems that remember your context, connect with your existing tools, and understand the specifics of your businessβyour strategic priorities, your competitors, and what matters most to your teams.
Letβs understand how these ideas come together in a real business application with the example of Contifyβs AI, Athena.
Athena is an application built on top of LLMs, designed for market and competitive intelligence workflows. It doesnβt replace the LLMs. It adds what generic tools miss: orchestration, organizational memory, user interfaces, and human-in-the-loop workflows.
Hereβs how Athena transforms raw LLM capabilities into a business-ready application.
1. Built with Organizational Context
Generic LLM tools donβt know whatβs important to your business. They donβt know your business priorities, your competitors, what relevance means for your stakeholders, or what innovation means for your organization.
In competitive intelligence, relevance is always measured from the context of your business objectives, not by generic definitions.
Take innovation, for example. It could mean new product features. Or a pricing shift. Or a new channel strategy. Or a technology change. The definition isnβt universal. Itβs always contextual.
To use LLMs effectively, you need to provide this context. Expecting users to provide this context in every prompt, every time they interact with generic AI tools, is not realistic.
Athena solves this by storing and applying your organizationβs context automatically by applying Knowledge Graphs. It utilizes your custom taxonomies, strategic focus areas, industry segments, competitor list, and internal knowledge base when using LLMs to generate insights.Β

This means every insight Athena delivers is not just grounded in facts, it also has the context of your organization.
2. A User Interface Built for M&CI Workflows
Using ChatGPT like a terminal windowβtext in, text out, blank slate every timeβmight work for a quick prototype. But itβs not how people use business software where the goal is efficient workflows and informed decisions.
In real-world intelligence workflows, no one wants to scroll through long paragraphs trying to spot what has changed in their competitive landscape. They need to see timelines of strategic activities, side-by-side comparisons, benchmarks, competitor profiles, and live dashboards that surface trends, early signals, innovations, and disruptions, all without having to manually prompt each piece of intelligence separately.
A proper user interface isnβt just about convenience. Itβs critical for speed, comprehension, and verification. Thatβs what is missing in most generic AI tools. The burden is on you to enter what matters and connect the dots across sessions.Β
Reading text is a slow and cognitively demanding process. Visualizing intelligence in a structured manner unlocks different parts of our brain and taps into real human intelligence. You see patterns more clearly, move faster, and catch early warning signals before your competitors. Unless they are also using Contify, which makes it a level playing field. π
Contify provides that interface for market and competitive intelligence automation. Itβs an orchestration of various components optimized for intelligence creators and users, including role-based dashboards, fact cards tied to sources, company timelines, event streams, and insight views that stay grounded in reliable facts and your organizational context.Β

You are not entering the same question each morning. The system already tracks what you care about and shows you what changed. You stay focused on analyzing new developments and their implications for your organization.
3. Orchestration, Not Just Prompts
Orchestration simply means youβre not starting from zero every single time. The system carries context forward, links each step to the next, and runs the entire process as a single unified flow instead of a series of disconnected prompts.
Athena doesnβt just send one prompt to a model and hand you a paragraph to read. It coordinates the entire intelligence pipeline.Β
When a new update comes in, it first checks whether itβs relevant to your business. If it is, it figures out what kind of update it isβpartnership, product launch, funding, hiring, regulatory, etc. If itβs a material business event, it extracts the structured data and creates facts. Those facts then roll up into insights using pre-configured insight templates that already have your organizationβs context, taxonomy, competitors, and priorities. From there, related dashboards, competitor profiles, and timelines refresh automatically.

Under the hood, there is a chain of fine-tuned prompts, handpicked LLM models, and agentic workflows, which are optimized for translation, tagging, deduplication, disambiguation, fact extraction, and insight generation.Β
To achieve this, weβve tested a lot of LLMs to see which ones are best for accuracy and speed, and deployed different models to handle various steps of the pipeline. Thereβs also a monitoring layer that continuously monitors the performance of the live system and watches for any drifts in the answers, hallucinations, or odd spikes so that issues can be flagged early.
The result: instead of typing and retyping prompts, saving fragments, copying outputs, and stitching them together yourself, you get a system that automates less-intelligent manual work. You stay focused on the analysis partβwhy it matters and your recommendations. The machine handles the repetitive steps. You support strategic decisions.
4. Democratizing AI Across the Organization
An intelligent organization doesnβt come from just a few people writing clever prompts. If AI is going to make a real difference, it has to work consistently for everyone, not just power users who know how to get the most out of ChatGPT. It should be part of everyday workflows, across teams, without requiring anyone to learn how to be a prompt engineer or how to evaluate LLMs.
Thatβs what business applications, such as Athena, do for organizations.
Instead of making each user figure out what to ask and how to ask it, Athena gives you ready-made templates for common intelligence tasks. These templates arenβt generic; theyβre designed to be customized to your business, so the insights stay relevant, accurate, and aligned with your organizationβs priorities.
Sales, product, strategy, and marketing teams just get the intelligence they need, in their dashboards, in their preferred format, and within the tools they already use. And those dashboards keep updating automatically. Nobody needs to remember to ask, βWhat has changed today?β The system tells them.

And, if someone does want to go deeper or needs to know something specific, they can. The chat interface of Ask Athena lets anyone type a question in plain English and get a fact-based, contextual response.

5. Built for Human-in-the-Loop AI
LLMs can sound confident, even when they donβt have all the facts to answer the question. The answer might read well, but thereβs no easy way to know where it came from. Did it use facts from a reliable source? Or did it just make something up? That uncertainty is unacceptable when the stakes are high, such as when analyzing the competitive landscape to inform strategic decisions.
When it comes to intelligence, you donβt need answers fast. You need answers that are verified so that you can trust them.
Thatβs why Athena is designed with the human-in-the-loop approach at its core.
Athena is built on the principles of traceability. This is important for verifying the responses provided by LLMs. You can see the exact sources LLMs used, whether they are news articles, regulatory filings, or internal CRM notes, and what pieces of information (facts) from those sources were used. Those facts are right there, clearly mapped, so you know exactly what informed the insight.
And if something doesnβt look right, youβre in control. You can edit a fact, override the insight, or add your own context. Athena learns from that input and improves over time.
To implement human-in-the-loop, especially for strategic functions like competitive intelligence, youβll need to ask:
- What should the LLM see, and what should it generate?
- What decisions still need human judgment?
- Where can the system operate independently?
- And where should it stay under your supervision?
Athena doesnβt replace the analyst; it takes the repetitive work off your plate, so you can focus on the work that really needs your judgment.
Final Thoughts
Back in 2013, Andrej Karpathy rode in a self-driving car through the streets of Palo Alto. The car drove perfectly. No interventions. Just a smooth, 30-minute ride. It felt like the future had arrived.
But that was more than a decade ago. Weβre still working on fully autonomous cars. Not because the demo was fake, but because the final stretchβwhere reliability mattersβis always more complicated than it looks.
Generative AI is following a similar path.
LLMs are powerful tools, but they are not intelligent in the way humans are. They donβt understand context the way we do. Human intelligence involves continuous learning, adaptation, and integration of new information through experience.Β
LLMs donβt learn from experience. They generate responses through statistical patterns, rather than genuine comprehension of the situation or reasoning based on an understanding of language. Thatβs why you need someone in the loop to verify, adjust, and apply judgment when it counts.
But do not dismiss it as just another technology hype. It is sometimes confusing, especially with the media noise and the investment community trying to justify their investments. But make no mistake, this technology is real.
That said, the most realistic and practical expectation from AI right now is partial autonomy.Β We shouldnβt aim to take humans out of the loop. Instead, we should aim to make humans far more effective. AI should lead to less “rote work,” allowing humans to use real human intelligence and produce more valuable work.
Successful B2B software will evolve into these partially autonomous systems. Systems where you can dial the autonomy up or down based on how much you trust the output and how high the stakes are.Β
When designed right, AI becomes a force multiplier. It allows teams to move faster by making informed decisions. And thatβs the direction weβre building toward with Athena. If youβd like to see how Contifyβs Athena can help your market and competitive intelligence program, click here to request a demo.