Abstracting a Vercel OpenAI Handler for Quick AI Tools
I've recently jumped into prompt engineering with both feet, but found myself rewriting a lot of the same boilerplate code every time I engineered a new prompt. If you take a look at two of the tools I've added to this site, they're pretty close to the same UI, and the API code behind them was almost identical. So? DRY.
tl;dr
Vercel AI + OpenAI makes building new AI experiences fast. Adding a little abstraction makes it even faster. 💥
The Original Vercel AI API Route
To get started with Vercel AI, I basically just lifted the code from their docs. No biggie. I added a little training data and some code examples, so my first attempt at a streaming AI API route looked like this:
import { OpenAIStream, StreamingTextResponse } from 'ai'
import OpenAI from 'openai'
import { kv } from '@vercel/kv'

const DEV = process.env.NODE_ENV === 'development'

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
})

const modelOptions = {
  model: 'gpt-3.5-turbo',
  stream: true,
}

// Prefix for the KV cache keys (this becomes a parameter in the abstraction below)
const prefix = 'my-bot'

const training = [
  { role: 'system', content: '...' },
  { role: 'user', content: '...' },
  { role: 'assistant', content: '...' },
  { role: 'user', content: '...' },
  { role: 'assistant', content: '...' },
]

export default async function handler(req) {
  const { messages } = await req.json()
  const key = `${prefix}-${JSON.stringify(messages)}`

  // Check if we have a cached response in production
  const cached = DEV ? null : await kv.get(key)

  if (cached) {
    const chunks = cached.split(' ')

    // Emulate the "stream" interface, even if we're cached
    const stream = new ReadableStream({
      async start(controller) {
        for (const chunk of chunks) {
          const bytes = new TextEncoder().encode(chunk + ' ')
          controller.enqueue(bytes)
          await new Promise((r) =>
            setTimeout(
              r,
              // get a random number between 10ms and 50ms to simulate a random delay
              Math.floor(Math.random() * 40) + 10
            )
          )
        }
        controller.close()
      },
    })

    return new StreamingTextResponse(stream)
  }

  const response = await openai.chat.completions.create({
    ...modelOptions,
    messages: [...training, ...messages],
  })

  const stream = OpenAIStream(response, {
    async onCompletion(completion) {
      // Cache the response. Note that this will also cache function calls.
      if (DEV) {
        return
      }
      await kv.set(key, completion)
      await kv.expire(key, 60 * 60)
    },
  })

  return new StreamingTextResponse(stream)
}
Not bad, considering it even implements caching to save costs on repeated responses. But then, when I went to build the second AI API route (the one for my spec tool), I realized I was copying basically all the meat of this handler verbatim, with the exception of the training data.
There are a couple of ways you could handle this:
- Send the training data from the client with each API request to a single shared completion endpoint (sketched below)
- Abstract the creation of AI API handlers into a function that accepts additional training data
Ultimately, I opted for #2, simply because having all that training data in React components seemed like a weird place for it to live, and I like the flexibility of different API endpoints per bot 🤖.
Maybe I want to extend them, or chain them... who knows. This is all new to me.
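For contrast, here's a rough sketch of what option #1 could have looked like: the training data lives in the component and gets shipped to one shared endpoint via useChat's body option. The /api/ai/chat route here is hypothetical; the shared handler would read training out of the request body and prepend it to messages.

import { useChat } from 'ai/react'

// Option #1 sketch: the training data would live in the React component 😬
const training = [
  { role: 'system', content: 'You are a bot that can ONLY respond with made up animal names.' },
  { role: 'user', content: 'Give me an animal name' },
  { role: 'assistant', content: 'Sue, the Hilarious Yellow Antelope.' },
]

export default function Page() {
  const { messages, handleSubmit, input, handleInputChange } = useChat({
    api: '/api/ai/chat', // one shared completion endpoint for every bot
    body: { training }, // extra fields sent to the endpoint with each request
  })

  return (
    <>
      {/* Your chat UX */}
    </>
  )
}

It would work, but every bot's prompt ends up bundled into client code, which is exactly why it felt like a weird place for that data to live.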
Anyway, here's the simple abstraction, and a sample implementation, for your reusing pleasure.
import { OpenAIStream, StreamingTextResponse } from 'ai'
import OpenAI from 'openai'
import { kv } from '@vercel/kv'

const DEV = process.env.NODE_ENV === 'development'

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
})

const defaultModelOptions = {
  model: 'gpt-3.5-turbo',
  stream: true,
}

export default function createAIHandler({
  prefix,
  training = [],
  modelOptions = defaultModelOptions,
}) {
  // Return a function: the actual route handler
  return async function (req) {
    const { messages } = await req.json()
    const key = `${prefix}-${JSON.stringify(messages)}`

    // Check if we have a cached response
    const cached = DEV ? null : await kv.get(key)

    if (cached) {
      const chunks = cached.split(' ')

      const stream = new ReadableStream({
        async start(controller) {
          for (const chunk of chunks) {
            const bytes = new TextEncoder().encode(chunk + ' ')
            controller.enqueue(bytes)
            await new Promise((r) =>
              setTimeout(
                r,
                // get a random number between 10ms and 50ms to simulate a random delay
                Math.floor(Math.random() * 40) + 10
              )
            )
          }
          controller.close()
        },
      })

      return new StreamingTextResponse(stream)
    }

    const response = await openai.chat.completions.create({
      ...modelOptions,
      messages: [...training, ...messages],
    })

    const stream = OpenAIStream(response, {
      async onCompletion(completion) {
        // Cache the response. Note that this will also cache function calls.
        if (DEV) {
          return
        }
        await kv.set(key, completion)
        await kv.expire(key, 60 * 60)
      },
    })

    return new StreamingTextResponse(stream)
  }
}
Pretty simple wrapper, eh?
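And because modelOptions is a parameter too, each bot can swap models without touching the wrapper. For example, a hypothetical route that opts into GPT-4 might look like this:

import createAIHandler from '@/lib/createAIHandler'

export const runtime = 'edge'

// Hypothetical bot that overrides the default model via modelOptions
export default createAIHandler({
  prefix: 'gpt4-bot',
  training: [
    { role: 'system', content: 'You are a terse, helpful assistant.' },
  ],
  modelOptions: {
    model: 'gpt-4',
    stream: true, // keep streaming on, since the handler pipes the result through OpenAIStream
  },
})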
Now, here's how you use it in a sample AI route for generating animal names:
import createAIHandler from '@/lib/createAIHandler'

export const runtime = 'edge'

const training = [
  {
    role: 'system',
    content: `
You are a bot that can ONLY respond with made up animal names.
Include a person's name, an adjective, a color, and an animal name.
<example>
Bart, the Slippery Green Walrus.
</example>
`,
  },
  { role: 'user', content: 'Give me an animal name' },
  { role: 'assistant', content: 'Sue, the Hilarious Yellow Antelope.' },
  { role: 'user', content: 'How are you?' },
  { role: 'assistant', content: 'Jeff, the Angry Red Hare.' },
]

export default createAIHandler({
  prefix: 'animal-bot',
  training,
})
Lastly, wire it up in your component:
import { useChat } from 'ai/react'

export default function Page() {
  const { messages, handleSubmit, input, handleInputChange } = useChat({
    api: '/api/ai/animal-names',
  })

  return (
    <>
      {/* Your chat UX */}
    </>
  )
}
I'm not gonna go into building forms and chat UX here, but the Vercel AI docs I mentioned up top are a good place to start.
Hope this helps you ramp up on your AI skillz, and happy coding! 😎
If you enjoyed this article, please consider following me on Twitter