Introduction
This article explains how the two “Playing with Generative AI” sample experiences in the Marketplace were built. To access these samples, use the "Marketplace" tab of the Experiences panel in Composer. You can also download them from their online Marketplace pages: Windows version / Non-Windows version.
The first version of the experience is optimized for running in Player on Windows. The other version is optimized for running in Player on all other supported platforms.
Both experiences contain examples of how to use generative AI within an Intuiface experience. The first example concerns image generation; the second concerns answering a question. Generative AI uses artificial intelligence to create new text, images, code, and more in response to a prompt.
About the experience
When either experience is run, it verifies that the required API key and credential key are present. If both are present, the user can choose between image generation and answering a question.
For image generation, the user can interactively build a prompt by selecting items in three separate carousels. The prompt is created, and an image is generated.
For answering a question, the user has two options:
- Select a pre-written question in a carousel, OR
- Scan a QR code with a phone, type in any prompt on that phone, and submit it to the running experience.
Playing the experience
Run it in Composer
To successfully run the experience in Composer, you must select the appropriate version of Play Mode:
- "Playing with Generative AI - Windows" version
In the Composer "Project" menu, select "Simulate Player on Windows" - "Playing with Generative AI - Player Next Gen" version
In the Composer "Project" menu, select "Simulate Player on all other Platforms (Web, Android, etc.)".
How it works
This experience uses two Generative AI services provided by OpenAI: 1) DALL-E 2 for image generation, and 2) GPT-3.5 for answering questions. Both services are accessible through a Web API, each requiring an interface asset.
DALL-E 2 Interface Asset for Image Generation
The DALL-E 2 interface asset is named ‘generations’ and was created using API Explorer. This interface asset (IA) is identical in both experience versions. As with all interface assets created using API Explorer, this IA can be opened and edited anytime.
The inputs to this IA are an API key and a prompt; the response is a 512x512 image.
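Conceptually, the IA wraps a single HTTP call to OpenAI's image generation endpoint. The TypeScript sketch below shows the shape of that call; the function name and error handling are illustrative, not the IA's actual code.

```typescript
// Minimal sketch of the Web API call the 'generations' IA wraps.
// Endpoint and payload follow OpenAI's image generation API; the
// function name and error handling are illustrative assumptions.
async function generateImage(apiKey: string, prompt: string): Promise<string> {
  const response = await fetch("https://api.openai.com/v1/images/generations", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ prompt, n: 1, size: "512x512" }),
  });
  if (!response.ok) {
    throw new Error(`Image generation failed: ${response.status}`);
  }
  const result = await response.json();
  return result.data[0].url; // URL of the generated 512x512 image
}
```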
GPT-3.5 Interface Asset for Question Answering
The GPT-3.5 interface asset is named ‘OpenAIGPT’ and was hand-coded. There are two versions of this IA:
- A .NET-based version, created for the Windows version of this experience.
- A TypeScript-based version, created for the non-Windows version of this experience.
Both GPT-3.5 IAs were hand-coded because the GPT Web API includes arrays, a feature not yet supported in API Explorer. Both the .NET and TypeScript versions of the IA have also been compiled, so their code cannot be viewed or edited.
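To illustrate why arrays matter here, the following TypeScript sketch shows the GPT-3.5 call that both hand-coded IAs wrap. The "messages" field of the request is an array of role/content objects, which is the structure API Explorer could not yet model. The function name and error handling are illustrative assumptions.

```typescript
// Sketch of the GPT-3.5 call the hand-coded IAs wrap. Note the
// 'messages' array in the payload - this array structure is what
// API Explorer does not yet support.
async function askQuestion(apiKey: string, prompt: string): Promise<string> {
  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: "gpt-3.5-turbo",
      messages: [{ role: "user", content: prompt }], // array: one user turn
    }),
  });
  if (!response.ok) {
    throw new Error(`Question answering failed: ${response.status}`);
  }
  const result = await response.json();
  return result.choices[0].message.content; // the generated answer
}
```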
API/Credential Key Configuration
Both experiences require one API key and one credential key. These keys must be added to the experiences before you run them. If either is missing, the experience will display a warning message immediately after launch.
API Key
Both the DALL-E and GPT IAs require an OpenAI API key. Every new OpenAI account can create any number of keys; collectively, these keys draw from the account's allotment of free image generations and question responses. OpenAI accounts can be funded to pay for use beyond the free limit.
Using Composer, paste your OpenAI API key into the interface asset named "OpenAI API Key".
Credential Key
To facilitate communication between a mobile phone and the running Intuiface experience, the experience must contain one of your Intuiface account’s credential keys. This key uniquely identifies your experience among all the experiences running worldwide, ensuring personal devices can communicate with it directly.
To create a credential key, log into your Intuiface account and visit the Credential Keys page on My Intuiface. Click the “Create new credential key” button and create a key whose scope includes (but doesn’t have to be limited to) “Web Triggers”. Copy the resulting key.
Using Composer, paste your Credential Key into the interface asset named ‘WebTrigger Credential Key’.
Prompt Construction
Like all Generative AI systems, DALL-E and GPT respond to a prompt - a complete sentence (or question) specifying the desired information. These sample experiences generate prompts based on user selection.
For DALL-E image generation, the prompt is created by combining the style, flower, and weather selections with the help of a Text Concatenation Interface Asset, as sketched below. The “Create Image” button then triggers an action that calls the DALL-E Web API with the API key and the assembled prompt.
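As a rough illustration, the Text Concatenation step amounts to joining the three carousel selections into one sentence. The template below is an assumption for illustration only; the experience's actual wording may differ.

```typescript
// Illustrative equivalent of the Text Concatenation IA: the three
// carousel selections are joined into a single DALL-E prompt.
// The sentence template is an assumption, not the experience's exact text.
function buildImagePrompt(style: string, flower: string, weather: string): string {
  return `A ${style} painting of a ${flower} in ${weather} weather`;
}

// Example: buildImagePrompt("watercolor", "tulip", "rainy")
// => "A watercolor painting of a tulip in rainy weather"
```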
For the GPT question response, the prompt is created using one of two methods:
- Through the submission of a prewritten question found in the scene.
- Through the submission of a prompt written on a mobile device.
The resulting prompt is submitted to GPT when the "Send Question" button is pressed in the prompt scene.
Receiving questions from a mobile device
After scanning the QR code in the GPT scene, the webpage loaded on the personal mobile device can send any prompt to the running experience using web triggers. The webpage itself is a simple experience built using Composer and deployed to the web. Our article about web triggers discusses how communication between the mobile device and the main experience is accomplished.
In Composer, the "Web Triggers" Interface Asset constantly listens for a message from a mobile device. Its "Message is received" trigger is tripped whenever a prompt sent from a mobile device is detected. This prompt is displayed onscreen and then submitted to the GPT service with the OpenAI API key.
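For orientation only, the sketch below shows the shape of what the mobile webpage does when the user submits a prompt. The endpoint URL, message name, and payload fields are placeholders, not Intuiface's actual Web Triggers API; consult the web triggers article linked above for the real interface.

```typescript
// Conceptual sketch of the mobile webpage's submission step: send the
// typed prompt to the running experience via a web trigger. The URL
// and field names below are PLACEHOLDERS, not the real Web Triggers API.
async function sendPromptToExperience(
  credentialKey: string,
  prompt: string
): Promise<void> {
  await fetch("https://example.com/web-triggers/send-message", { // placeholder URL
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      credentialKey,               // identifies the target experience
      message: "promptSubmitted",  // placeholder message name
      parameters: { prompt },      // the user's typed prompt
    }),
  });
}
```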