I was reading your blog and had a question about this:

“I noticed that my coworker was prompting for specific technical implementations, and Claude was struggling, pulling in too much context and taking an unfocused approach, whereas I would have been much more vague and general to start and refined as I saw whether Claude was on the right track.”

Would you be able to elaborate more on your thought process on how to prompt an LLM to code a task for you based on its complexity? [I …] would appreciate any examples you have from experience.

Who is finding LLMs useful and who is not? And why is this the case?

For about two years now, I’ve been floored by the fact that I can, with increasing adherence to my instructions, get a model to write software and complete tasks for me, given a good description of what I want.

This technology feels world-changing to me. I feel this way because it has changed my world: what I am capable of accomplishing in a fixed amount of time, given the skills I possess today.

Lots of language model providers implement the OpenAI API spec. Their implementations look similar in shape but often behave differently in subtle ways. Anthropic’s prefill sequences are one such example.

I wasn’t able to find a canonical definition of this spec. In practice, we can show the basic shape of the API for chatting with a few examples.

OpenAI:

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4.1",
    "messages": [
      {"role": "user", "content": "Hi"},
      {"role": "assistant", "content": ""}
    ]
  }'

{
  "id": "chatcmpl-BcibN0BPNN8ysOymTxonvePn0m3n6",
  "object": "chat.completion",
  "created": 1748567613,
  "model": "gpt-4.1-2025-04-14",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today? 😊",
        "refusal": null,
        "annotations": []
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 10,
    "total_tokens": 22,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "audio_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  },
  "service_tier": "default",
  "system_fingerprint": "fp_799e4ca3f1"
}
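
To make the shape concrete, here is a minimal Python sketch that builds the request body and pulls the assistant text out of a response like the one above. The helper names `build_chat_request` and `extract_reply` are hypothetical, chosen for illustration, and the sample response is a trimmed copy of the JSON shown above. It makes no network call.

```python
import json


def build_chat_request(model: str, user_text: str) -> dict:
    """Build the JSON body for a chat completions request (hypothetical helper)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }


def extract_reply(response: dict) -> str:
    """Return the assistant text from the first choice (hypothetical helper)."""
    return response["choices"][0]["message"]["content"]


# Trimmed copy of the response shown above; unused fields are omitted.
sample_response = json.loads("""
{
  "object": "chat.completion",
  "model": "gpt-4.1-2025-04-14",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 12, "completion_tokens": 10, "total_tokens": 22}
}
""")

if __name__ == "__main__":
    # The request body that would be sent with -d in the curl call.
    print(json.dumps(build_chat_request("gpt-4.1", "Hi")))
    # The assistant text from the first choice.
    print(extract_reply(sample_response))
```

Providers that follow this shape should return the reply at the same `choices[0].message.content` path, though the subtle behavioral differences mentioned earlier still apply.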

Anthropic:

Today, I ran into an issue where I wanted to use repomix to pack a large codebase into a single file to pass to an LLM, but I couldn’t paste the output into any of the UIs I typically use. The React apps all became sluggish as I waited for ~500,000 tokens to paste.

Enter llm.

I solved this problem with a bash one-liner:

llm -m gemini-2.5-pro-preview-05-06 "The provided context is all the code of an old codebase. Analyze this code and come up with high impact, meaningful improvements to make the codebase easier to work with." < repomix.output

The API call and subsequent inference took about 85 seconds, and I had my response.
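
If you want to script the same thing, here is a rough Python sketch that wraps the one-liner with `subprocess`, feeding the file through stdin instead of pasting it. The function names `build_llm_command` and `run_llm_on_file` are hypothetical, and it assumes the `llm` CLI is installed and configured for the model you pass in.

```python
import subprocess
from pathlib import Path


def build_llm_command(model: str, prompt: str) -> list[str]:
    """Build the argument list for `llm -m MODEL PROMPT` (hypothetical helper)."""
    return ["llm", "-m", model, prompt]


def run_llm_on_file(model: str, prompt: str, context_path: str) -> str:
    """Pipe a file to the llm CLI on stdin and return its stdout.

    Assumes the `llm` executable is on PATH and the model is configured.
    """
    with Path(context_path).open("rb") as context_file:
        result = subprocess.run(
            build_llm_command(model, prompt),
            stdin=context_file,   # same effect as `< repomix.output`
            capture_output=True,
            check=True,           # raise if llm exits with a non-zero status
        )
    return result.stdout.decode("utf-8")
```

Calling `run_llm_on_file("gemini-2.5-pro-preview-05-06", "Analyze this code ...", "repomix.output")` would mirror the command above.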

These days I often use agents that write code. When I am trying to build a new feature, I first write a Markdown spec, then point the agent at it and send it on its way.

There are a lot of tools and choices today in the agent space. I regularly use 3-4 different ones, and I expect that number to continue to vary.

When you send an agent off to write code, you need to wait. If you have more work to do, especially work that is unrelated to the current changes the agent is making, it would be nice to unblock that work as well.

RSS feeds for blogs and things you write or create are great. If you read a lot, you probably also have a lot of articles you’ve read that you share with others and occasionally revisit.

You can save and share these with very little effort! Doing so is immensely valuable. If I find your blog and like what you write, often I will also like what you like to read.

This post is an edit and repost of my rant from Bluesky.

Some problems with vibe coding

Having done a lot of vibe coding lately, I think I’ll move away from it (for now) as a primary approach to building any software that I care about, even a little bit. Current agents eventually fail to adhere to some prompt despite various attempts and approaches. Whenever this happens and I look in the codebase, I am usually mortified by what I find.