I recently implemented a feature here on my own blog that uses OpenAI's GPT models to help me correct spelling and punctuation in posted blog comments. Because I was curious, and because the scale is so small, I take the same prompt and fire it off three times, once per model. The pseudo code looks like this:


# completion() is the chat-completion call; record_response() stores
# the result for later analysis.
for model in ("gpt-5", "gpt-5-mini", "gpt-5-nano"):
    response = completion(
        model=model,
        api_key=settings.OPENAI_API_KEY,
        messages=messages,
    )
    record_response(response)
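Since the analysis depends on how long each call took, the timing can be captured per request with a small wrapper. A minimal sketch, where `timed_completion` is a hypothetical helper and the actual API call is passed in as a function:

```python
import time

def timed_completion(model, messages, completion_fn):
    # Run the chat-completion call and measure its wall-clock duration,
    # so each response can be recorded together with how long it took.
    start = time.monotonic()
    response = completion_fn(model=model, messages=messages)
    return response, time.monotonic() - start
```

Each `(response, seconds)` pair can then be stored per model, which is what the percentile numbers are computed from.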

The price difference is large. That's easy to measure; it's on their pricing page.

The quality of the responses is harder to assess. I'm still working on that, using my personal judgement to compare the various results.

But the speed difference is fairly large too. I measure how long the whole thing takes, so I can calculate the median (P50) and the 90th percentile (P90). The results currently are:


   model    |  p50  |  p90
------------+-------+-------
 gpt-5      | 27.35 | 43.85
 gpt-5-mini |  9.81 | 16.00
 gpt-5-nano | 24.38 | 32.99

That's in seconds; the smaller, the better.
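For what it's worth, the same P50/P90 numbers can be computed in plain Python from a list of recorded durations, using only the standard library:

```python
import statistics

def p50_p90(durations):
    """Median and 90th percentile of a list of call durations in seconds."""
    # quantiles() with n=10 returns the nine deciles; index 4 is the
    # median (P50) and index 8 is the 90th percentile (P90).
    deciles = statistics.quantiles(durations, n=10)
    return deciles[4], deciles[8]
```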

Caveat: I still consider myself a noob when it comes to using the OpenAI API. What I have is a relatively simple application, and the amount of money spent is pennies. There might be ways to tune this. Also, at this point I only have about 40 data points, but I'll analyze it again in the future when I have more.

Comments

Peter Bengtsson

Note-to-self; the query:

select
    model, count(*),
    percentile_cont(0.5) within group (order by took_seconds) as p50,
    percentile_cont(0.9) within group (order by took_seconds) as p90
from llmcalls_llmcall group by model;

