Why Use the ChatGPT API Instead of the Web Interface?
The ChatGPT web interface is great for casual use, but the API unlocks a completely different tier of capability. With the API you can embed AI directly into your own applications, automate workflows, process thousands of documents, and build products that your users can interact with — all without them ever leaving your platform.
Compared to the web interface, the API gives you: programmatic control over every request, the ability to set system-level instructions, fine-grained access to different model versions, streaming output for better UX, and full visibility into token usage for cost management.
Prerequisites
- Node.js 18+ or Python 3.8+ installed on your machine
- An OpenAI account at platform.openai.com
- A credit card added to your OpenAI account (required for API access)
- Basic familiarity with running terminal commands
Getting Your API Key
Head to platform.openai.com, sign in, and click your profile icon in the top right. Select API Keys from the menu. Click Create new secret key, give it a name (e.g. "my-first-project"), and copy the key immediately — you won't be able to see it again.
Store your API key as an environment variable, never hardcode it in source files:
```bash
# Linux / macOS
export OPENAI_API_KEY="sk-..."

# Windows (PowerShell)
$env:OPENAI_API_KEY = "sk-..."

# Or create a .env file (and add it to .gitignore!)
OPENAI_API_KEY=sk-...
```
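If you go the .env route, a library such as python-dotenv is the usual choice. For illustration, here is a minimal stdlib-only sketch of what such a loader does (the `load_env` name and its simple KEY=VALUE parsing are assumptions for this example, not a library API):

```python
import os

def load_env(path=".env"):
    """Load simple KEY=VALUE pairs from a .env file into os.environ."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines and comments
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("=")
            # Don't overwrite variables already set in the real environment
            os.environ.setdefault(key.strip(), value.strip().strip('"'))

if os.path.exists(".env"):
    load_env()
```

Real .env files support quoting, interpolation, and multi-line values, which is why a maintained library is the better choice in practice.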
Your First API Call
Install the official library and make your first call:
```bash
# Python
pip install openai

# Node.js
npm install openai
```
Python Example
```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain what a REST API is in two sentences."},
    ],
)

print(response.choices[0].message.content)
```
JavaScript / Node.js Example
```javascript
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

const response = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Explain what a REST API is in two sentences." },
  ],
});

console.log(response.choices[0].message.content);
```
Understanding the Response Object
The API returns a rich response object. Here are the key fields you'll use:
- choices[0].message.content — the text of the AI's reply
- choices[0].finish_reason — why the model stopped: stop (normal completion), length (hit the max_tokens limit), or content_filter (the response was filtered)
- usage.prompt_tokens — tokens used in your input (system + user messages)
- usage.completion_tokens — tokens generated in the response
- usage.total_tokens — the sum of both; note that input and output tokens are billed at different rates
- model — the exact model version that processed the request
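A small helper that pulls these fields into a plain dict can make logging and monitoring easier. It's shown here against a hand-built stand-in object so it runs without an API key; the `summarize_response` helper and the mock are illustrative, not part of the SDK:

```python
from types import SimpleNamespace

def summarize_response(response):
    """Extract the key fields of a chat completion into a plain dict."""
    return {
        "text": response.choices[0].message.content,
        "finish_reason": response.choices[0].finish_reason,
        "prompt_tokens": response.usage.prompt_tokens,
        "completion_tokens": response.usage.completion_tokens,
        "total_tokens": response.usage.total_tokens,
        "model": response.model,
    }

# Stand-in mirroring the real response shape, so this runs offline:
mock = SimpleNamespace(
    model="gpt-4o-mini-2024-07-18",
    usage=SimpleNamespace(prompt_tokens=25, completion_tokens=40, total_tokens=65),
    choices=[SimpleNamespace(
        finish_reason="stop",
        message=SimpleNamespace(content="A REST API is ..."),
    )],
)

print(summarize_response(mock)["total_tokens"])  # 65
```

With a real call you would pass the `response` object from the examples above instead of the mock.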
Chat History and Multi-Turn Conversations
Unlike the web interface, which keeps your conversation for you, the API is stateless: it doesn't remember previous requests. To create a conversation, you pass the entire chat history with every request. This is simple, but it means you manage the history yourself:
```python
from openai import OpenAI

client = OpenAI()

conversation_history = [
    {"role": "system", "content": "You are a friendly coding tutor."}
]

def chat(user_message):
    conversation_history.append({"role": "user", "content": user_message})
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=conversation_history,
    )
    assistant_message = response.choices[0].message.content
    conversation_history.append({"role": "assistant", "content": assistant_message})
    return assistant_message

print(chat("What is a for loop?"))
print(chat("Can you show me an example in Python?"))
print(chat("What about JavaScript?"))
```
Notice that each call sends the full history, so the model knows that "What about JavaScript?" still refers to for loops rather than a new topic.
Understanding Tokens and Pricing
Tokens are chunks of text — roughly 4 characters or 0.75 words in English. "Hello, world!" is about 4 tokens. "The quick brown fox jumps over the lazy dog" is about 10 tokens.
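For exact counts you'd use a tokenizer such as OpenAI's tiktoken library, but the 4-characters-per-token rule of thumb is easy to sketch for quick estimates (`estimate_tokens` is an illustrative helper, and only an approximation):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token rule of thumb."""
    return max(1, round(len(text) / 4))

print(estimate_tokens("Hello, world!"))                                # 3
print(estimate_tokens("The quick brown fox jumps over the lazy dog"))  # 11
```

The estimate drifts for code, non-English text, and unusual punctuation, which tokenize less predictably.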
OpenAI charges separately for input tokens (your messages) and output tokens (the model's reply). Output tokens are more expensive because generating text is more compute-intensive than reading it.
As of 2025, approximate pricing (always check OpenAI's pricing page for current rates):
- GPT-4o: $5 per 1M input tokens, $15 per 1M output tokens
- GPT-4o mini: $0.15 per 1M input tokens, $0.60 per 1M output tokens
- GPT-3.5 Turbo: $0.50 per 1M input tokens, $1.50 per 1M output tokens
For most tasks, GPT-4o mini is the sweet spot: at the rates above it's roughly 25-33x cheaper than GPT-4o, and surprisingly capable for structured tasks, summarization, and classification.
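The input/output split becomes concrete with a tiny cost calculator. The price table below hardcodes the approximate rates listed above, which is an assumption that will drift as OpenAI updates pricing:

```python
# Approximate $ per 1M tokens as (input, output) -- update as pricing changes
PRICES = {
    "gpt-4o": (5.00, 15.00),
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-3.5-turbo": (0.50, 1.50),
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimated cost in dollars for a single request."""
    input_rate, output_rate = PRICES[model]
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000

# 1,000 tokens in, 500 tokens out on GPT-4o mini:
print(f"${estimate_cost('gpt-4o-mini', 1000, 500):.6f}")  # $0.000450
```

Pairing this with the `usage` fields from each response gives you per-request cost tracking.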
Streaming Responses
For any user-facing application, streaming dramatically improves perceived performance — text appears token by token rather than the user waiting for the full response:
```python
from openai import OpenAI

client = OpenAI()

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a haiku about Python programming."}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()  # newline at end
```
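If you also need the complete reply (for example, to append it to your conversation history), collect the deltas as they arrive. The `collect_stream` helper below is illustrative, and is demonstrated against fake chunk objects so it runs offline; with a real call you would pass the `stream` object from above instead:

```python
from types import SimpleNamespace

def collect_stream(stream) -> str:
    """Print deltas as they arrive and return the assembled reply."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta is not None:
            print(delta, end="", flush=True)
            parts.append(delta)
    print()
    return "".join(parts)

# Fake chunks mimicking the streaming response shape:
fake = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=c))])
    for c in ["Code ", "flows ", "like ", "water", None]
]

reply = collect_stream(fake)  # prints "Code flows like water"
```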
Building a Simple Chatbot
Here's a complete command-line chatbot you can run right now:
```python
from openai import OpenAI

client = OpenAI()

def run_chatbot():
    print("ChatBot ready! Type 'quit' to exit.")
    print("-" * 40)

    messages = [
        {"role": "system", "content": (
            "You are a helpful assistant. Be concise and friendly. "
            "If you don't know something, say so."
        )}
    ]

    while True:
        user_input = input("You: ").strip()
        if not user_input:
            continue
        if user_input.lower() in ("quit", "exit", "bye"):
            print("Goodbye!")
            break

        messages.append({"role": "user", "content": user_input})
        try:
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=messages,
                max_tokens=500,
                temperature=0.7,
            )
            reply = response.choices[0].message.content
            messages.append({"role": "assistant", "content": reply})
            print(f"Assistant: {reply}")
            print(f"(Tokens used: {response.usage.total_tokens})")
        except Exception as e:
            print(f"Error: {e}")
            messages.pop()  # remove the failed user message

if __name__ == "__main__":
    run_chatbot()
```
Best Practices
Error Handling
Always wrap API calls in try/except blocks. Common errors include RateLimitError (too many requests), AuthenticationError (bad API key), and APIConnectionError (network issues). Use exponential backoff for retries.
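The retry advice can be sketched as a small generic wrapper. This is a minimal illustration, not a production-ready implementation; the `with_backoff` name and its behavior (retry on any exception, doubling delay plus jitter) are choices made for this example:

```python
import random
import time

def with_backoff(fn, max_retries=5, base_delay=1.0):
    """Retry fn() with exponential backoff plus jitter on any exception."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: let the caller handle it
            # base_delay, 2x, 4x, ... plus proportional random jitter
            time.sleep(base_delay * (2 ** attempt) + random.random() * base_delay)

# Usage sketch (requires an API key and a configured client):
# reply = with_backoff(lambda: client.chat.completions.create(
#     model="gpt-4o-mini",
#     messages=[{"role": "user", "content": "Hi"}],
# ))
```

In production you would typically retry only transient errors (rate limits, timeouts) rather than every exception, since retrying an AuthenticationError will never succeed.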
Rate Limits
Rate limits depend on your account's usage tier: new accounts start with low limits, and limits rise as your usage and spending grow. Check your current limits in the OpenAI dashboard under Limits. If you hit limits in production, implement a request queue.
Cost Management
- Set max_tokens to cap response length and prevent runaway costs
- Use GPT-4o mini for tasks that don't require GPT-4o quality
- Trim conversation history when it gets long (keep the last N turns)
- Set spending limits in the OpenAI dashboard
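The "trim conversation history" tip can be sketched as a small helper that keeps the system message plus the last N user/assistant turns. The `trim_history` function is an illustrative helper written for this guide, not an SDK feature:

```python
def trim_history(messages, keep_turns=5):
    """Keep the system message (if any) plus the last keep_turns exchanges."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    # One turn = a user message plus the assistant reply (2 messages)
    return system + rest[-keep_turns * 2:]

# Build a long fake history to demonstrate:
history = [{"role": "system", "content": "Be concise."}]
for i in range(20):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = trim_history(history, keep_turns=3)
print(len(trimmed))  # 7: the system message plus 3 user/assistant pairs
```

A fancier version might summarize the dropped turns into a single message instead of discarding them, trading a little extra cost for retained context.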
Next Steps
Now that you have the basics down, explore: function calling (structured JSON outputs), the Assistants API (built-in thread management), fine-tuning for domain-specific tasks, and the vision API for image understanding. The OpenAI cookbook on GitHub has hundreds of working examples for real-world use cases.