Learned more about how LLMs are trained: the model first goes through a pre-training phase, followed by a post-training phase.
In post-training, it is fine-tuned to contribute to a token stream shared with a human user, using special tokens to demarcate whether a message was written by the user or the assistant.
For example:

```
<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
```
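As a minimal sketch of what producing this format might look like (real chat templates are model-specific; this helper is hypothetical):

```python
# Render a list of {role, content} messages in the ChatML-style format above.
def to_chatml(messages: list[dict[str, str]]) -> str:
    return "\n".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages
    )

print(to_chatml([
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Nice to meet you!"},
]))
```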
Finally, labs have continued to improve model benchmark performance with further fine-tuning techniques like RLHF, where humans pick the best of a set of responses from the model and the model is then tuned on this preference data.
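As an illustration, a single record of that preference data might look like this (the field names are my assumption; real datasets vary):

```python
# Hypothetical shape of one human-preference record for RLHF-style tuning.
preference_record = {
    "prompt": "Explain what binary search does.",
    "chosen": "Binary search repeatedly halves a sorted list to find a target...",
    "rejected": "It checks every element one by one.",
}
```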
I found an interesting library for building 3D games, specifically “built for Cursor”, called viber3d.
I assume the name is a reference to “vibe coding”.
This is the first library I have seen ship a starter scaffold with Cursor rules, which is an interesting development in how frameworks are being built now.
Since language models don’t know about brand-new frameworks, and coding with models seems to be an increasingly popular way to code, frameworks are starting to ship content that helps language models use them.
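For illustration, such a scaffold could include a project rules file like .cursor/rules/viber3d.mdc (the filename follows Cursor’s project-rules convention; the contents below are hypothetical, not viber3d’s actual rules):

```
---
description: Conventions for working in this viber3d project
globs: ["src/**/*"]
---

- Prefer the framework's built-in helpers over hand-rolled scene setup.
- Keep game logic in small, composable modules.
```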
I’ve built a few prototypes with the OpenAI voice-to-text API, with code largely written using Cursor.
This has been fast and easy to incorporate into Next.js apps.
I can add an audio-recording-to-text feature to any app in a couple of minutes, ready for use in a production environment.
There are several other options for voice-to-text as well.
macOS has a built-in voice-to-text feature, and there are several other Whisper wrappers available, some of which can run locally.
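My prototypes are in Next.js, but as a rough sketch of the core API call in Python (the file name and output handling are placeholders):

```python
# Sketch: send a recorded audio clip to OpenAI's transcription endpoint.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("recording.webm", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

print(transcript.text)
```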
Talking is much faster than typing and allows me to capture raw thoughts quickly, which I can then refine later.
LLMs are also quite good at structuring these raw thoughts into a more refined form that I can then edit.
I’d like to see authors being surprised by what readers end up learning from their material.
Because the author is not just sending out something static.
They’re sending out a program which is capable of emergent behavior.
So the reader will be able to try out different things and discover things the author hadn’t intended.
I’ve been playing around with this idea I am calling “idea projection”.
The concept is that you can take raw ideas, logs or notes (like all the logs on my site, for example) and create projections of them into different forms using a model and a target structure.
The prompting approach looks something like this:

```
<files>
{files_xml}
</files>
<structure>
{structure}
</structure>
<instructions>
Given the above files, your job is to create a new document that structures the contents of the files in adherence with the <structure>, maintaining the original voice and phrasing as much as possible.
Output the complete document content directly, without any replacement blocks.
</instructions>
```
What this allows you to do is give the model a ton of files and have it transform that input content into the target structure.
This approach can work for everything from consolidating raw notes into a summary to drafting a structured document from scattered ideas.
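As a sketch, the assembly step might look like this (how each file is wrapped inside files_xml is my own choice; any clear delimiting scheme should work):

```python
from pathlib import Path

PROMPT_TEMPLATE = """<files>
{files_xml}
</files>
<structure>
{structure}
</structure>
<instructions>
Given the above files, your job is to create a new document that structures the contents of the files in adherence with the <structure>, maintaining the original voice and phrasing as much as possible.
Output the complete document content directly, without any replacement blocks.
</instructions>"""

def build_prompt(paths: list[str], structure: str) -> str:
    # Wrap each file in a tag so the model can tell the sources apart.
    files_xml = "\n".join(
        f'<file name="{p}">\n{Path(p).read_text()}\n</file>' for p in paths
    )
    return PROMPT_TEMPLATE.format(files_xml=files_xml, structure=structure)
```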
A great read by Harper about writing code with LLMs.
One passage that particularly resonated with me:
When I describe this process to people I say “you have to aggressively keep track of what’s going on because you can easily get ahead of yourself.”
For some reason I say “over my skis” a lot when talking about LLMs. I don’t know why. It resonates with me. Maybe it’s because it is beautiful smooth powder skiing, and then all of a sudden you are like “WHAT THE FUCK IS GOING ON!,” and are completely lost and suddenly fall off a cliff.
I’ve long been a skeptic, but I think goose may be one of the best LLM tools for coding.
I read Declan’s article on new technology adoption and the problems posed by models.
I haven’t had as much trouble as Declan seems to have had in getting models to write vanilla js, but I share the concern about new technology/framework adoption and how we’ll solve this problem if models end up authoring most of the code that gets written.
Maybe projects like llms.txt and similar can help by providing context to models on how to use a technology, but finding a way to do this is now a meaningful hurdle to adopting something new, and the slop-filled web seems to be making it harder to push forward the knowledge cutoff of the foundation models.
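For reference, an llms.txt file is just markdown served at a site’s root; under the llmstxt.org proposal it looks roughly like this (the project name and links are made up):

```
# ExampleFramework

> One-paragraph summary of what the framework does and when to use it.

## Docs

- [Quickstart](https://example.com/quickstart.md): install and build a first app
- [API reference](https://example.com/api.md): core functions and options
```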
Two random, interesting learnings from Julia’s survey results:
The first: ctrl + l clears the terminal.
I used to rely on cmd + k for this, but in Cursor’s embedded terminal that shortcut opens a completion panel, so I had been briefly stuck typing clear.
The second: history 0, which gets the all-time shell command history.
This is useful for searching: history 0 | grep <query>.
I typically use fzf and ctrl + r to search through the history, but this is still nice to know.
This had been an edge case I never quite resolved in my switch to zsh.
I was used to getting full history in bash with history.
I’ve been doing a lot of vibe coding lately (though I only recently learned there was a term for this).
My most recent projects include a Krea-like image editor.
I am currently working on adding inpainting using a mask and prompt.
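As one illustrative option (a sketch, not necessarily the approach the editor will end up using), OpenAI’s image edit endpoint accepts exactly this mask-plus-prompt shape:

```python
# Sketch: mask-based inpainting with OpenAI's image edit endpoint.
# Transparent pixels in mask.png mark the region to regenerate.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

result = client.images.edit(
    image=open("scene.png", "rb"),
    mask=open("mask.png", "rb"),
    prompt="replace the masked region with a sunset sky",
    n=1,
    size="1024x1024",
)

print(result.data[0].url)
```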
I am also developing a tool that captures raw thoughts and processes them through an LLM, following some structure to summarize, deduplicate, and consolidate them.
The general goal is to synthesize and organize ideas for a project, article, or just to help with understanding where my ideas are leading.
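A minimal sketch of that consolidation step (the model choice and prompt wording are placeholders):

```python
# Sketch: summarize, deduplicate, and consolidate raw captured thoughts.
from openai import OpenAI

client = OpenAI()

def consolidate(thoughts: list[str]) -> str:
    prompt = (
        "Summarize, deduplicate, and consolidate the following raw thoughts, "
        "grouping related ideas under short headings:\n\n"
        + "\n---\n".join(thoughts)
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```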