Peeking under the hood with ChatGPT plugins
OpenAI announced plugins to extend ChatGPT last week, and did so in a surprising way. Many rightly pointed out that these look like “apps” as part of a platform play, with one big exception: there is no lock-in. The way these plugins are designed is as open as I can possibly imagine, which means we can peek under the hood at how they are made. That is, if you know where to look. It also offers us a glimpse into potential business ramifications, which we’ll cover at the end of this post.
A plugin has two parts: a manifest file, which looks a lot like an app store listing, and an OpenAPI spec (Note: this is OpenAPI, not OpenAI). The manifest file is always called ai-plugin.json and is stored in a folder called .well-known at the root of the domain of the API being accessed.
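The location rule above can be sketched in a few lines of Python. This is a minimal illustration, not anything OpenAI ships: the function name manifest_url and the domain example.com are my own placeholders, and I am assuming the manifest is served over HTTPS.

```python
from urllib.parse import urlunsplit


def manifest_url(domain: str) -> str:
    """Return the conventional manifest location for a plugin's API domain.

    Assumption: the manifest is served over HTTPS at
    /.well-known/ai-plugin.json on the given domain.
    """
    # Drop a scheme or path if the caller pasted in a full URL by accident.
    host = domain.strip().removeprefix("https://").removeprefix("http://")
    host = host.split("/")[0]
    return urlunsplit(("https", host, "/.well-known/ai-plugin.json", "", ""))


# example.com is a placeholder domain, not a real plugin host.
print(manifest_url("example.com"))
```

Given a domain you suspect hosts a plugin, this is the one URL worth trying first.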
Notice the name of the file isn’t OpenAI-plugin or ChatGPT-plugin, but AI-plugin. Combine that tidbit with the fact that it is stored in the .well-known folder, which has been a pseudo-standard place to put other site configs, and you can see that great care is being taken to be as standard and open as possible. It seems to me that OpenAI wants this to be the standardized way to describe any and all AI plugins, not just its own for ChatGPT, in the same way that robots.txt, which lives at the root of a site, is the industry-standard way to moderate how search engines “plug in” to your site. I expect we’ll see Alpaca/LLaMA support these plugins soon.
While OpenAI and its partners don’t publish exactly which domain each manifest is on, we can make an educated guess. I was able to find half of them on my first try and published a list for you. If you find others, you can add them. Manifest files are typically boring, and there isn’t much to see in them except for the prompts that are used to prime the integration.
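To make “the prompts that are used to prime the integration” concrete, here is a rough sketch of pulling the prompt out of a manifest. The sample manifest is invented for illustration, and the field names (description_for_model, api.url) are my recollection of OpenAI's plugin documentation, so treat them as assumptions and check them against a real ai-plugin.json.

```python
import json

# Hypothetical manifest, written for this example only.
# Field names are assumed from OpenAI's plugin docs; verify against a real file.
SAMPLE_MANIFEST = """
{
  "schema_version": "v1",
  "name_for_human": "Example Plugin",
  "name_for_model": "example",
  "description_for_human": "Look up example data.",
  "description_for_model": "Use this plugin when the user asks about example data. Examples: 'show me example 42'.",
  "api": {"type": "openapi", "url": "https://example.com/openapi.yaml"}
}
"""


def extract_prompt(manifest_text: str) -> tuple[str, str]:
    """Return (prompt for the model, URL of the linked OpenAPI spec)."""
    manifest = json.loads(manifest_text)
    prompt = manifest.get("description_for_model", "")
    spec_url = manifest.get("api", {}).get("url", "")
    return prompt, spec_url


prompt, spec_url = extract_prompt(SAMPLE_MANIFEST)
print(prompt)
print(spec_url)
```

Everything else in the file is listing metadata; the prompt is the part worth reading.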
Wolfram’s is by far the longest one I’ve seen. The prompt provides lots of coaching:
When solving any multi-step computational problem, do not send the whole problem at once to getWolframAlphaResults. Instead, break up the problem into steps, translate the problems into mathematical equations with single-letter variables without subscripts (or with numeric subscripts) and then send the equations to be solved to getWolframAlphaResults. Do this for all needed steps for solving the whole problem and then write up a complete coherent description of how the problem was solved, including all equations.
I say coaching, but this is programming! That is basically an if-then-else logic statement, complete with objects/variables like getWolframAlphaResults. Herein lies the strange simplicity of the LLM movement: You don’t code. Or you do, but you do it with human-like plain-text communication.
An aspect of this new programming is seen in these manifests. Many think human-style programming can’t provide sufficient specificity, and it is true that code is 100% explicit and deterministic. But consider that humans have ways to achieve some determinism in our language. Typically we’ll explain a task to a colleague with a general description, which might be misinterpreted, and then follow it with examples. The Speak plugin does this over and over: “Examples: "how do I say 'do you know what time it is?' politely in German", "say 'do you have any vegetarian dishes?' in spanish"”.
The second part of the plugin is the OpenAPI spec. These can be stored anywhere but are linked in the manifest. Specs like these are not new. They have been around for a decade and are used by developers and other APIs to know what to expect from a given API. Each spec contains a description field that was intended for humans, but OpenAI says we can just reuse it for AI plugins. In theory, that means all existing APIs are ChatGPT-ready; we just need to slap a manifest pointer on them! But in reality, that isn’t how these plugins are written. Some companies are writing new specs just for their AI plugins, like Slack’s, which is called “ai-plugin.yaml”.
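As a rough sketch of what reusing the human-facing descriptions might look like, the snippet below walks an OpenAPI document and collects the text attached to each operation. The spec here is a made-up example held in a Python dict (a real one would be loaded from YAML or JSON), and the helper name operation_descriptions is my own. The keys paths, operationId, summary and description are standard OpenAPI fields.

```python
# HTTP methods that can appear as operation keys under an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def operation_descriptions(spec: dict) -> dict[str, str]:
    """Map each operation to its human-readable description (or summary)."""
    found = {}
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            # Skip non-operation keys such as "parameters" or "summary".
            if method.lower() not in HTTP_METHODS or not isinstance(op, dict):
                continue
            # Fall back to "METHOD /path" when no operationId is given.
            name = op.get("operationId", f"{method.upper()} {path}")
            found[name] = op.get("description") or op.get("summary", "")
    return found


# Hypothetical spec for illustration only; not taken from any real plugin.
SAMPLE_SPEC = {
    "openapi": "3.0.1",
    "info": {"title": "Example API", "version": "1.0"},
    "paths": {
        "/translate": {
            "get": {
                "operationId": "translate",
                "description": "Translate a phrase into a target language.",
            }
        },
        "/ping": {"get": {"summary": "Health check."}},
    },
}

for name, text in operation_descriptions(SAMPLE_SPEC).items():
    print(f"{name}: {text}")
```

These strings are the same ones a developer would read in generated API docs, which is why an existing spec can, in principle, double as instructions for a model.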
Finally, anyone can develop these plugins today. ChatGPT has a waitlist for developers, but you don’t have to wait. You can host an ai-plugin.json and API specs today. You just can’t point ChatGPT at them yet, but if Alpaca or another model supported plugins, your definitions would work immediately. To that end, I’ve created this open source catalog of all the AI plugins out there, including third-party plugins not contained in the announcement, like Datasette from Simon Willison (whose writing about creating that plugin was a big inspiration for this post).
I promised a note on the business model, and here's what I can say: OpenAI had a chance to make a walled garden here but didn’t. I speculate the strategy must be to set a standard and make it hard for competitors (Google’s Bard?) to do anything but abandon their own approaches and follow. It also makes it easy for developers to add plugins, which means when the AI “app store” for ChatGPT launches to the public, it will be full of robust and capable plugins out of the gate.
The way OpenAI builds ChatGPT plugins suggests it wants an open standard. This also means we can inspect the new plugins for tips on building more. You can find the examples mentioned above and more at ai-plugins.xyz.
Originally published April 6, 2023.