Why Use the ChatGPT API Instead of the Web Interface?
The ChatGPT web interface is great for casual use, but the API unlocks a completely different tier of capability. With the API you can embed AI directly into your own applications, automate workflows, process thousands of documents, and build products that your users can interact with — all without them ever leaving your platform.
Compared to the web interface, the API gives you: programmatic control over every request, the ability to set system-level instructions, fine-grained access to different model versions, streaming output for better UX, and full visibility into token usage for cost management.
Prerequisites
- Node.js 18+ or Python 3.8+ installed on your machine
- An OpenAI account at platform.openai.com
- A credit card added to your OpenAI account (required for API access)
- Basic familiarity with running terminal commands
Getting Your API Key
Head to platform.openai.com, sign in, and click your profile icon in the top right. Select API Keys from the menu. Click Create new secret key, give it a name (e.g. "my-first-project"), and copy the key immediately — you won't be able to see it again.
Store your API key as an environment variable, never hardcode it in source files:
```bash
# Linux / macOS
export OPENAI_API_KEY="sk-..."

# Windows (PowerShell)
$env:OPENAI_API_KEY = "sk-..."

# Or create a .env file (and add it to .gitignore!)
OPENAI_API_KEY=sk-...
```
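If you go the .env route, a library such as python-dotenv is the usual choice. For illustration, here is a minimal stdlib-only sketch of what such a loader does (the `load_env` name and its simple KEY=VALUE parsing are assumptions for this example, not a library API):

```python
import os

def load_env(path=".env"):
    """Load simple KEY=VALUE pairs from a .env file into os.environ."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines and comments
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("=")
            # Don't overwrite variables already set in the real environment
            os.environ.setdefault(key.strip(), value.strip().strip('"'))

if os.path.exists(".env"):
    load_env()
```

Real .env files support quoting, interpolation, and multi-line values, which is why a maintained library is the better choice in practice.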
Your First API Call
Install the official library and make your first call:
```bash
# Python
pip install openai

# Node.js
npm install openai
```
Python Example
```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain what a REST API is in two sentences."},
    ],
)

print(response.choices[0].message.content)
```
JavaScript / Node.js Example
```javascript
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

const response = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Explain what a REST API is in two sentences." },
  ],
});

console.log(response.choices[0].message.content);
```
Understanding the Response Object
The API returns a rich response object. Here are the key fields you'll use:
- choices[0].message.content — the text of the AI's reply
- choices[0].finish_reason — why the model stopped: stop (normal completion), length (hit the max_tokens limit), or content_filter (the response was filtered)
- usage.prompt_tokens — tokens used in your input (system + user messages)
- usage.completion_tokens — tokens generated in the response
- usage.total_tokens — the sum of both; note that input and output tokens are billed at different rates
- model — the exact model version that processed the request
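A small helper that pulls these fields into a plain dict can make logging and monitoring easier. It's shown here against a hand-built stand-in object so it runs without an API key; the `summarize_response` helper and the mock are illustrative, not part of the SDK:

```python
from types import SimpleNamespace

def summarize_response(response):
    """Extract the key fields of a chat completion into a plain dict."""
    return {
        "text": response.choices[0].message.content,
        "finish_reason": response.choices[0].finish_reason,
        "prompt_tokens": response.usage.prompt_tokens,
        "completion_tokens": response.usage.completion_tokens,
        "total_tokens": response.usage.total_tokens,
        "model": response.model,
    }

# Stand-in mirroring the real response shape, so this runs offline:
mock = SimpleNamespace(
    model="gpt-4o-mini-2024-07-18",
    usage=SimpleNamespace(prompt_tokens=25, completion_tokens=40, total_tokens=65),
    choices=[SimpleNamespace(
        finish_reason="stop",
        message=SimpleNamespace(content="A REST API is ..."),
    )],
)

print(summarize_response(mock)["total_tokens"])  # 65
```

With a real call you would pass the `response` object from the examples above instead of the mock.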
Chat History and Multi-Turn Conversations
Unlike the web interface, which keeps your conversation for you, the API is stateless: it doesn't remember previous requests. To create a conversation, you pass the entire chat history with every request. This is simple, but it means you manage the history yourself:
```python
from openai import OpenAI

client = OpenAI()

conversation_history = [
    {"role": "system", "content": "You are a friendly coding tutor."}
]

def chat(user_message):
    conversation_history.append({"role": "user", "content": user_message})
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=conversation_history,
    )
    assistant_message = response.choices[0].message.content
    conversation_history.append({"role": "assistant", "content": assistant_message})
    return assistant_message

print(chat("What is a for loop?"))
print(chat("Can you show me an example in Python?"))
print(chat("What about JavaScript?"))
```
Notice that each call sends the full history, so the model knows that "What about JavaScript?" still refers to for loops rather than a new topic.
Understanding Tokens and Pricing
Tokens are chunks of text — roughly 4 characters or 0.75 words in English. "Hello, world!" is about 4 tokens. "The quick brown fox jumps over the lazy dog" is about 10 tokens.
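For exact counts you'd use a tokenizer such as OpenAI's tiktoken library, but the 4-characters-per-token rule of thumb is easy to sketch for quick estimates (`estimate_tokens` is an illustrative helper, and only an approximation):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token rule of thumb."""
    return max(1, round(len(text) / 4))

print(estimate_tokens("Hello, world!"))                                # 3
print(estimate_tokens("The quick brown fox jumps over the lazy dog"))  # 11
```

The estimate drifts for code, non-English text, and unusual punctuation, which tokenize less predictably.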
OpenAI charges separately for input tokens (your messages) and output tokens (the model's reply). Output tokens are more expensive because generating text is more compute-intensive than reading it.
As of 2025, approximate pricing (always check OpenAI's pricing page for current rates):
- GPT-4o: $5 per 1M input tokens, $15 per 1M output tokens
- GPT-4o mini: $0.15 per 1M input tokens, $0.60 per 1M output tokens
- GPT-3.5 Turbo: $0.50 per 1M input tokens, $1.50 per 1M output tokens
For most tasks, GPT-4o mini is the sweet spot: at the rates above it's roughly 25-33x cheaper than GPT-4o, and surprisingly capable for structured tasks, summarization, and classification.
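The input/output split becomes concrete with a tiny cost calculator. The price table below hardcodes the approximate rates listed above, which is an assumption that will drift as OpenAI updates pricing:

```python
# Approximate $ per 1M tokens as (input, output) -- update as pricing changes
PRICES = {
    "gpt-4o": (5.00, 15.00),
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-3.5-turbo": (0.50, 1.50),
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimated cost in dollars for a single request."""
    input_rate, output_rate = PRICES[model]
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000

# 1,000 tokens in, 500 tokens out on GPT-4o mini:
print(f"${estimate_cost('gpt-4o-mini', 1000, 500):.6f}")  # $0.000450
```

Pairing this with the `usage` fields from each response gives you per-request cost tracking.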
Streaming Responses
For any user-facing application, streaming dramatically improves perceived performance — text appears token by token rather than the user waiting for the full response:
```python
from openai import OpenAI

client = OpenAI()

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a haiku about Python programming."}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()  # newline at end
```
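If you also need the complete reply (for example, to append it to your conversation history), collect the deltas as they arrive. The `collect_stream` helper below is illustrative, and is demonstrated against fake chunk objects so it runs offline; with a real call you would pass the `stream` object from above instead:

```python
from types import SimpleNamespace

def collect_stream(stream) -> str:
    """Print deltas as they arrive and return the assembled reply."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta is not None:
            print(delta, end="", flush=True)
            parts.append(delta)
    print()
    return "".join(parts)

# Fake chunks mimicking the streaming response shape:
fake = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=c))])
    for c in ["Code ", "flows ", "like ", "water", None]
]

reply = collect_stream(fake)  # prints "Code flows like water"
```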
Building a Simple Chatbot
Here's a complete command-line chatbot you can run right now:
```python
from openai import OpenAI

client = OpenAI()

def run_chatbot():
    print("ChatBot ready! Type 'quit' to exit.")
    print("-" * 40)

    messages = [
        {"role": "system", "content": (
            "You are a helpful assistant. Be concise and friendly. "
            "If you don't know something, say so."
        )}
    ]

    while True:
        user_input = input("You: ").strip()
        if not user_input:
            continue
        if user_input.lower() in ("quit", "exit", "bye"):
            print("Goodbye!")
            break

        messages.append({"role": "user", "content": user_input})
        try:
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=messages,
                max_tokens=500,
                temperature=0.7,
            )
            reply = response.choices[0].message.content
            messages.append({"role": "assistant", "content": reply})
            print(f"Assistant: {reply}")
            print(f"(Tokens used: {response.usage.total_tokens})")
        except Exception as e:
            print(f"Error: {e}")
            messages.pop()  # remove the failed user message

if __name__ == "__main__":
    run_chatbot()
```
Best Practices
Error Handling
Always wrap API calls in try/except blocks. Common errors include RateLimitError (too many requests), AuthenticationError (bad API key), and APIConnectionError (network issues). Use exponential backoff for retries.
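The retry advice can be sketched as a small generic wrapper. This is a minimal illustration, not a production-ready implementation; the `with_backoff` name and its behavior (retry on any exception, doubling delay plus jitter) are choices made for this example:

```python
import random
import time

def with_backoff(fn, max_retries=5, base_delay=1.0):
    """Retry fn() with exponential backoff plus jitter on any exception."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: let the caller handle it
            # base_delay, 2x, 4x, ... plus proportional random jitter
            time.sleep(base_delay * (2 ** attempt) + random.random() * base_delay)

# Usage sketch (requires an API key and a configured client):
# reply = with_backoff(lambda: client.chat.completions.create(
#     model="gpt-4o-mini",
#     messages=[{"role": "user", "content": "Hi"}],
# ))
```

In production you would typically retry only transient errors (rate limits, timeouts) rather than every exception, since retrying an AuthenticationError will never succeed.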
Rate Limits
Rate limits depend on your account's usage tier: new accounts start with low limits, and limits rise as your usage and spending grow. Check your current limits in the OpenAI dashboard under Limits. If you hit limits in production, implement a request queue.
Cost Management
- Set max_tokens to cap response length and prevent runaway costs
- Use GPT-4o mini for tasks that don't require GPT-4o quality
- Trim conversation history when it gets long (keep the last N turns)
- Set spending limits in the OpenAI dashboard
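The "trim conversation history" tip can be sketched as a small helper that keeps the system message plus the last N user/assistant turns. The `trim_history` function is an illustrative helper written for this guide, not an SDK feature:

```python
def trim_history(messages, keep_turns=5):
    """Keep the system message (if any) plus the last keep_turns exchanges."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    # One turn = a user message plus the assistant reply (2 messages)
    return system + rest[-keep_turns * 2:]

# Build a long fake history to demonstrate:
history = [{"role": "system", "content": "Be concise."}]
for i in range(20):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = trim_history(history, keep_turns=3)
print(len(trimmed))  # 7: the system message plus 3 user/assistant pairs
```

A fancier version might summarize the dropped turns into a single message instead of discarding them, trading a little extra cost for retained context.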
Next Steps
Now that you have the basics down, explore: function calling (structured JSON outputs), the Assistants API (built-in thread management), fine-tuning for domain-specific tasks, and the vision API for image understanding. The OpenAI cookbook on GitHub has hundreds of working examples for real-world use cases.