Hitesh Sahu
Hitesh SahuHitesh Sahu
  1. Home
  2. ›
  3. posts
  4. ›
  5. …

  6. ›
  7. 3 1 LLM in Development

Loading ⏳
Fetching content, this won’t take long…


💡 Did you know?

🦥 Sloths can hold their breath longer than dolphins 🐬.

🍪 This website uses cookies

No personal data is stored on our servers however third party tools Google Analytics cookies to measure traffic and improve your website experience. Learn more

Loading ⏳
Fetching content, this won’t take long…


💡 Did you know?

🦥 Sloths can hold their breath longer than dolphins 🐬.
AI-GenAI

  • AI-GenAI Index

  • NVIDIA AI-LLM Developers Certification Path

  • Understanding Generative AI

  • What is AI Models and How to pick the right one?

  • How to Choose the Right AI Model for Your Use Case

  • What are Transformer Models?

  • Retrieval-Augmented Generation (RAG) for AI Applications

  • LLMs & Foundation Models Explained

  • Using LLMs in Development

  • Using LLMs in Production

  • Ethical AI vs Responsible AI vs Trustworthy AI

  • Generative Adversarial Networks (GANs) Explained

  • U-Net Explained

  • Understanding CLIP: Connecting Images and Text in Generative AI

  • Diffusion Models Explained

  • The Economic Impact of Generative AI

  • NVIDIA Certified Associate Generative AI (NCA-GENL) Practice Questions

Cover Image for Using LLMs in Development

Using LLMs in Development

Practical examples of how large language models are integrated into real production systems, from support automation and knowledge retrieval to developer tooling, code generation, and intelligent assistants.

Hitesh Sahu
Written by Hitesh Sahu, a passionate developer and blogger.

Sat Mar 07 2026

Share This on

← Previous

Using LLMs in Production

Next →

Deep Learning Path 🤖

Using LLMs in Software Applications

Prompt-Based Development

Instead of training a classifier, we can simply write a prompt.

Example:

prompt = """
Classify the following review
as either positive or negative:

The banana pudding was really tasty!
"""

response = llm_response(prompt)
print(response)

Expected output: Positive

This works because large language models already have general knowledge learned during pretraining.

Classic ML vs Generative AI Workflow

The workflow difference is significant:

Supervised learning

  • Get labeled data
  • Train AI model
  • Deploy model
  • Can take months

Prompt-based AI

  • Specify prompt
  • Deploy model
  • Can take minutes, hours, or days
flowchart TD
    A[Supervised Learning] --> A1[Get labeled data]
    A1 --> A2[Train AI model on data]
    A2 --> A3[Deploy run model]

    B[Prompt-Based AI] --> B1[Specify prompt]
    B1 --> B2[Deploy run model]

This is one of the biggest reasons LLMs are attractive in product development: they dramatically reduce time to first prototype.

Before LLMs, a common way to build a text application was supervised learning.

For example, if a restaurant wanted to monitor online reviews, the team would:

Input→Labeled Data→Model Training→DeploymentInput \rightarrow Labeled\ Data \rightarrow Model\ Training \rightarrow DeploymentInput→Labeled Data→Model Training→Deployment

Example:

Input: Restaurant reviews Output: Sentiment (Positive / Negative)

This process could take months.

Generative AI changes this dramatically.

  1. Collect labeled examples
  2. Train an AI model
  3. Deploy the model

The system learns a mapping from input text AAA to output label BBB. For sentiment classification:

f(A)=Bf(A) = Bf(A)=B

where:

  • AAA is the review text
  • B∈{Positive,Negative}B \in \{\text{Positive}, \text{Negative}\}B∈{Positive,Negative}

For example:

A="Best soup dumplings I’ve ever eaten."⇒B=PositiveA = \text{"Best soup dumplings I've ever eaten."} \Rightarrow B = \text{Positive}A="Best soup dumplings I’ve ever eaten."⇒B=Positive A="Not worth the 3 month wait for a reservation."⇒B=NegativeA = \text{"Not worth the 3 month wait for a reservation."} \Rightarrow B = \text{Negative}A="Not worth the 3 month wait for a reservation."⇒B=Negative

This approach works, but it is often slow because it depends on dataset creation and model training.

Instead of training a model, we can use prompting.

prompt = """
Classify the following review
as having either a positive or
negative sentiment:
The banana pudding was really
tasty!
"""
response = llm_response(prompt)
print(response)

Development time often drops from months to hours or days.


Lifecycle of a GenAI Project

Building an AI system is an iterative engineering process.

Typical lifecycle:

  • 📝 Scope project: what you want to build, what problem you want to solve, and what success looks like.
  • 🏗️ Build or improve system
  • 📋 Internal evaluation
  • 🚀 Deploy and monitor

Lifecycle Diagram

flowchart TD
    S[Scope project 📝] --> B[🏗️ Build or improve system]
    B --> E[Internal evaluation 📋]
    E --> D[Deploy and monitor 🚀]
    D --> B

A prototype may look good on a simple example, but fail on a slightly different one.

A working demo is not the same thing as a reliable product.

This loop is central to real LLM engineering. You ship a prototype, observe failure cases, improve prompts or architecture, and repeat.

This loop repeats continuously.

Engineers must analyze failures and improve the system.

Improving LLM Performance

Building AI systems is highly empirical.

We improve performance through experimentation.

Common techniques include:

1. Prompting

Prompting is usually the first and cheapest lever.

You change the instructions, add examples, clarify format, or provide constraints.

2. Retrieval Augmented Generation (RAG)

RAG gives the LLM access to external data sources so it can answer questions using organization-specific information rather than relying only on its built-in knowledge.

3. Fine-tuning

Fine-tuning adapts a model to your task, style, or domain.

4. Pretraining

Pretraining means training an LLM from scratch.

This is the most expensive and hardest option, and usually the last resort.

Improvement Loop Diagram

flowchart LR
    I[Idea] --> P[Prompt]
    P --> R[LLM response]
    R --> I

Cost Intuition

Estimate LLM cost using tokens. Roughly:

1 token≈34 word1 \text{ token} \approx \frac{3}{4} \text{ word}1 token≈43​ word

If a person reads about 250250250 words per minute, then in one hour they consume about:

60×250=15000 words60 \times 250 = 15000 \text{ words}60×250=15000 words

If the system also processes a similar amount of prompt text, total words might be around:

15000+15000=30000 words15000 + 15000 = 30000 \text{ words}15000+15000=30000 words

Converting words to tokens:

30000 words≈40000 tokens30000 \text{ words} \approx 40000 \text{ tokens}30000 words≈40000 tokens

If cost is about:

$0.002 per 1K tokens\$0.002 \text{ per 1K tokens}$0.002 per 1K tokens

then the total estimated cost is:

40×0.002=$0.0840 \times 0.002 = \$0.0840×0.002=$0.08

So 8 cents can keep 1 user busy for 1 hour.


Tool Use with LLMs

LLMs call external tools for doing a task

LLMs are powerful, but they are not reliable at everything.

LLM often struggle with precise arithmetic or actions that require external systems.

Example:

LLMs are not always good at exact math.

Question:

How much would I have after 8 years if I deposit $100 at 5% interest?

A model may produce the wrong number if it tries to reason directly in text. The more reliable method is tool use:

100×1.058=147.74100 \times 1.05^8 = 147.74100×1.058=147.74

So the LLM should call an external calculator:

CALCULATOR(100×1.058)\text{CALCULATOR}(100 \times 1.05^8)CALCULATOR(100×1.058)

Math Tool Flow

flowchart TD
    Q[User asks math question] --> LLM[LLM recognizes need for precise calculation]
    LLM --> Calc[External calculator]
    Calc --> Result[147.74]
    Result --> Answer[LLM returns grounded answer]

This is an important engineering lesson: do not force the LLM to do tasks that a specialized tool can do more reliably.


Real Software Applications of LLMs

LLMs can power many types of applications.

Writing Applications

Examples:

  • drafting emails
  • generating reports
  • marketing copy
  • summarizing documents

Architecture:

User → Prompt → LLM → Generated Text

Reading Applications

LLMs can understand and extract information from text.

Example tasks:

  • summarization
  • information extraction
  • sentiment analysis
  • document classification

Example prompt:

Classify the sentiment of the following review:

Output: "The mochi is excellent!"

Chat Applications

LLMs also power conversational systems.

Example interaction:

User: I'd like a cheeseburger for delivery
Bot: Sure. Anything else?
User: That's all
Bot: It will arrive in 20 minutes

These systems combine:

  • prompts
  • conversation memory
  • business logic
← Previous

Using LLMs in Production

Next →

Deep Learning Path 🤖

AI-GenAI/3-1-LLM-in-Development
Let's work together
+49 176-2019-2523
hiteshkrsahu@gmail.com
WhatsApp
Skype
Munich 🥨, Germany 🇩🇪, EU
Playstore
Hitesh Sahu's apps on Google Play Store
Need Help?
Let's Connect
Navigation
  Home/About
  Skills
  Work/Projects
  Lab/Experiments
  Contribution
  Awards
  Art/Sketches
  Thoughts
  Contact
Links
  Sitemap
  Legal Notice
  Privacy Policy

Made with

NextJS logo

NextJS by

hitesh Sahu

| © 2026 All rights reserved.