Today, I played around with Matt Rickard’s ReLLM library, another take on constraining LLM output, in this case, with regex. I tried to use it to steer a language model to generate structure (JSON) from unstructured input. This exercise is sort of like parsing or validating JSON with regex – it’s not a good idea. Complicated regular expressions to describe JSON are hard to write. I do love the demo that generates acronyms though.
I’ve been trying to find a simple way to host a website that formats and serves a nice looking version of a recipe written in markdown. There are a few open source projects available, but nothing that has fit the bill yet. I briefly had the idea to try out something with Next.js and mdx but I found when I scaffolded a new app that I didn’t even recognize the project structure.
I did a bit more work with Clojure today. My imperative programming habits are still bleeding through. The exercise introduced cond as a sort of case statement for flow control. I wrote a simple cond statement but was getting a bizarre runtime error: (defn my-fn [x] (cond (x < 0) "negative" (x = 0) "zero" (x > 0) "positive" ) ) user=> (my-fn 1) Execution error (ClassCastException) at user/my-fn (REPL:4). class java.
I’ve following the work Exercism has been doing for several years now. I used their product to learn a bit of Elixir and Go a few years back. I got an email from the today promoting a focus on functional programming languages in the month of June. I decided to learn a bit of Clojure, since I’ve been working with the JVM lately. I’ve done a few of the exercises and my takeaways so far are
The API does not just change without us telling you. The models are static there. — Logan.GPT (@OfficialLoganK) May 31, 2023 Logan says any changes to the model would have been communicated. It seems some folks have data that show the model’s degradation. As competition emerges in the space, it could be a problem for OpenAI if they lose user trust on model versioning and evolution. Tried to setup Falcon 40B.
Tons of reports on HN that GPT-4 has gotten significantly worse. Are people experiencing this? — Nabeel S. Qureshi (@nabeelqu) May 31, 2023 A number of folks are reporting gpt-4 appears to be performing less impressively as of late (additional conversation). I was using gpt-4 to write code earlier today, and anecdotally, can say it seems to be less effective at code generation. It still writes working code but the code, but the tests cases it provided aren’t always correct and it seems to be less “expert level” than I recall initially.
I’ve been following Eric’s posts about SudoLang since the first installment back in March. I’ve skimmed through the spec and the value proposition quite compelling. SudoLang seeks to allow programmers all levels to instruct LLMs and can also be transpiled into your programming language of choice. While I’m still early in my understanding of how to use this technique, it’s one I’m following closely and continuing to experiment with.
What if we set GPT-4 free in Minecraft? ⛏️ I’m excited to announce Voyager, the first lifelong learning agent that plays Minecraft purely in-context. Voyager continuously improves itself by writing, refining, committing, and retrieving *code* from a skill library. GPT-4 unlocks… — Jim Fan (@DrJimFan) May 26, 2023 NVIDIA researchers introduce an LLM-based agent with “lifelong learning” capabilities that can navigate, discover, and accomplish goals in Minecraft without human intervention.
The Alexandria Index is building embeddings for large, public data sets, to make them more searchable and accessible. That people produce HTML with string templates is telling us something. I think about this phenomena often, though I personally find most string template systems that produce HTML difficult to use. Django templates, Handlebars, Rails ERB, Hugo templates just to name a few. My experience has been these systems are difficult to debug and are practically their own full programming languages.
I’ve seen a lot of “GPT detection” products floating around lately. Sebastian discusses some of the products and their approaches in this article. Some products claim to have developed an “algorithm with an accuracy rate of text detection higher than 98%”. Unfortunately, this same algorithm determined a GPT-4 generated response from the prompt “write a paragraph in the style of Edgar Allan Poe” was 0% AI GPT. In my experience, you don’t need to try very hard to trick “AI-detection” systems.