Jakey Bot is a Gemini-based chatbot with personality, powered by Gemini 1.5 Pro and Flash
This chatbot is designed to combine the Gemini API with the best Python and Discord APIs to create a helpful chatbot.
Jakey AI is available as a Discord bot. A standalone UI is coming soon.
- It uses the latest and greatest Gemini 1.5 models with extensive multimodal capabilities. The chatbot accepts text, images, video, and text files as input, with multiple models to choose from
- Enables and exposes AI tools and features such as JSON mode, code execution, function calling, and system instructions for personality
- It can summarize messages and integrates with Discord
- Chat history per guild or user session (chat history is stored with pickle, which snapshots the Gemini API chat history objects)
- Gemini API requests are asynchronous
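As a rough illustration of the per-session history storage mentioned above, here is a minimal sketch using only the standard library. The directory name, file naming scheme, and function names are assumptions made for this example and are not taken from Jakey's code. A plain list stands in for the Gemini chat history object.

```python
import pickle
from pathlib import Path

# Hypothetical storage location; the real bot may use a different layout.
HISTORY_DIR = Path("chat_history")


def history_path(session_id: int, base_dir: Path = HISTORY_DIR) -> Path:
    """Return the pickle file used for one guild or user session."""
    return base_dir / f"{session_id}.pkl"


def save_history(session_id: int, history: list, base_dir: Path = HISTORY_DIR) -> None:
    """Snapshot the chat history object for this session to disk."""
    base_dir.mkdir(parents=True, exist_ok=True)
    with open(history_path(session_id, base_dir), "wb") as f:
        pickle.dump(history, f)


def load_history(session_id: int, base_dir: Path = HISTORY_DIR) -> list:
    """Load a saved history, or start with an empty one if none exists."""
    path = history_path(session_id, base_dir)
    if not path.exists():
        return []
    with open(path, "rb") as f:
        return pickle.load(f)
```

Keying files by guild ID or user ID keeps sessions isolated from each other. Note that pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.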
The core dependency is Python with pip. Depending on your distribution, pip may need to be installed separately along with venv. If you want to enable music chatbot mode, you'll also need to install ffmpeg and OpenJDK.
- Read message history (see #faq for privacy implications)
- Embed messages (required for rendering text longer than 4096 characters and for most commands)
- Send messages (obviously)
- Attach files
- Create webhooks
- Create slash commands
- Voice-related features such as connect and disconnect
- Python 3.10+ with pip
If you use a Linux distribution, I strongly recommend installing Python with venv support. See PEP 668 and PEP 453 for the rationale.
- OpenJDK 17 with ffmpeg
Needed for voice commands (wavelink/lavalink)
Once you have activated your environment and have pip ready, you can run
pip3 install -r requirements.txt
After you have installed the dependencies, don't run main.py just yet. You must run these commands before starting the bot, since Wavelink installs discord.py as a dependency and we use py-cord for its ease of use
pip3 uninstall py-cord discord.py
pip3 install py-cord
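To confirm that the swap worked, you could check which distributions are installed. The sketch below uses only the standard library's importlib.metadata. The function names are hypothetical and not part of Jakey's code.

```python
from importlib import metadata


def installed_version(dist_name, lookup=metadata.version):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return lookup(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_discord_packages(lookup=metadata.version):
    """Return a list of problems with the Discord library setup (empty if OK)."""
    problems = []
    # discord.py and py-cord both provide the "discord" package, so only
    # py-cord should remain installed.
    if installed_version("discord.py", lookup) is not None:
        problems.append("discord.py is still installed; uninstall it")
    if installed_version("py-cord", lookup) is None:
        problems.append("py-cord is not installed; run pip3 install py-cord")
    return problems
```

Calling check_discord_packages() inside the activated environment returns an empty list when only py-cord is present. The lookup parameter exists only so the check can be exercised without touching the real environment.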
After you install the required dependencies, head over to dev.env.template and save a copy as dev.env in the Git root directory
Required fields to configure:
- TOKEN - Your Discord bot token
- GOOGLE_AI_TOKEN - Gemini API token; please see this link to obtain API keys (it's free)
- SYSTEM_USER_ID - It's strongly advisable to use your own Discord user ID for administrative commands like eval. You probably don't want me to control your infrastructure 😉
Please see CONFIG.md for more information about configuration.
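Before starting the bot, it can help to verify that the required fields are filled in. The following is a minimal sketch that assumes dev.env uses plain KEY=VALUE lines. The parser and function names are hypothetical and not part of Jakey's code, and CONFIG.md remains the authority on the actual format.

```python
from pathlib import Path

# Required fields as listed in this README.
REQUIRED_KEYS = ("TOKEN", "GOOGLE_AI_TOKEN", "SYSTEM_USER_ID")


def parse_env_text(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict, required=REQUIRED_KEYS) -> list:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


def check_env_file(path: str = "dev.env") -> list:
    """Read the env file and report which required keys still need values."""
    text = Path(path).read_text(encoding="utf-8")
    return missing_keys(parse_env_text(text))
```

An empty list from check_env_file() means all three required fields have values.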
You can enable VC-related commands such as /voice play (which plays videos from YouTube and other supported sources) by downloading the Lavalink jar file from the Lavalink releases and placing it as wavelink/Lavalink.jar in the project's root directory.
To activate voice, rename application.yml.template to application.yml and run java -jar Lavalink.jar in a separate session before starting the bot.
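Because Lavalink has to be running before the bot starts, a small preflight check can save some confusion. The sketch below only tests whether something accepts TCP connections on a given address. The host and port defaults (127.0.0.1 and 2333) are assumptions, so use whatever your application.yml specifies. The function name is hypothetical.

```python
import socket


def lavalink_is_reachable(host="127.0.0.1", port=2333, timeout=2.0):
    """Return True if something accepts TCP connections on host:port.

    The defaults are assumptions; match them to your application.yml.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

This does not prove the listener is actually Lavalink, only that the port is open, so treat it as a quick sanity check rather than a health check.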
After everything is configured, you can run main.py
Get started by asking Jakey /ask prompt:Who are you and how can I get started
By default, it uses Gemini 1.5 Flash because it's cheap and widely used, and it has the same multimodal and contextual capabilities as Pro. It scores lower than Pro in performance and diverse domain understanding, but it is much better than 1.0 Pro and GPT-3.5, and on par with (in some cases outclassing) the first GPT-4 model snapshot from March 2023. Please see the LLM arena for a comparison.
Jakey provides commands such as:
- /ask - Ask Jakey anything!
  - Get started by asking /ask prompt: Hey Jakey, I'm new, tell me your commands, features, and capabilities
  - Accepts file attachments in image, video, audio, text files, and PDFs (with images) by passing the attachment: parameter
  - JSON mode with json_mode:True
  - Ephemeral conversation with append_hist:True
  - You can choose between Gemini 1.5 Flash or Gemini 1.5 Pro using the model: parameter
- /sweep - Clear the conversation
- /feature - Extend Jakey skills by activating chat tools! (Clears the conversation when features are set)
- /imagine - Create images using Stable Diffusion 3
- /summarize - Summarize the current text channel or thread and gather insights into a single summary. Thanks to Gemini 1.5 Flash's long context, it can understand conversations even from the past decade!
- /mimic - Mimics other users using a webhook
- /voice - Basic streaming audio functionality from YouTube, SoundCloud, and more!
Jakey also has apps, which are used to take action on a selected message, such as explaining, rephrasing, or suggesting messages.
This is the FAQ for people using this bot. Please see the FAQ for technical users to understand how data is stored or how the code works under the hood.
Personality is implemented in the chatbot to make it more human-like. It is based on a guy, and Jakey's name is based on Jake, which is mostly a masculine name (and no, don't expect Jakey to be your AI girlfriend). However, I prefer to keep it neutral.
Web Search (beta) can be used by enabling the capability named "Web Search with DuckDuckGo" under the /feature command and asking queries with keywords like "Search the web".
Web search works in two steps:
- It searches the query through the DuckDuckGo API and collects the links needed for page summarization
- The listed URLs are then scraped and their contents aggregated so the model can understand them
A maximum of 6 queries can be used, to prevent tokens from depleting too quickly due to large articles and to avoid slower responses as context builds up. It does not use embeddings at the moment.
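The two steps and the query cap described above can be sketched roughly as follows. This is not Jakey's implementation. The search and fetch callables are hypothetical stand-ins for the DuckDuckGo lookup and the page download, and the text extraction uses the standard library's html.parser rather than whatever scraper the bot actually uses.

```python
from html.parser import HTMLParser

MAX_QUERIES = 6  # upper bound stated in this README


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    """Reduce an HTML page to its visible text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def web_search_context(queries, search, fetch, max_queries=MAX_QUERIES):
    """Step 1: collect URLs per query. Step 2: scrape and aggregate text.

    search(query) -> list of URLs and fetch(url) -> HTML string are
    hypothetical callables supplied by the caller.
    """
    urls = []
    for query in list(queries)[:max_queries]:
        for url in search(query):
            if url not in urls:  # avoid scraping the same page twice
                urls.append(url)
    pages = []
    for url in urls:
        text = html_to_text(fetch(url))
        if text:  # pages without extractable text are skipped
            pages.append(f"Source: {url}\n{text}")
    return "\n\n".join(pages)
```

The aggregated string is what would be handed to the model as extra context, which is why large or numerous pages increase token usage and response time.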
It's recommended to use Gemini 1.5 Pro to make better use of tools, but Flash also works. Keep in mind that the model sometimes fails to pick up the tool schema needed to perform the web search action. If it fabricates its responses, explicitly tell the model to search the web.
Using web search can affect overall response performance, because the number of pages passed to the model depends on the query, which is quite similar to processing a single 20-page PDF attachment. It's recommended to use web search sparingly, when you want the model to be aware of certain information. You can also tell the model how many searches it can perform (the maximum is 6 queries); 2-3 searches is optimal.
Depending on the website, pages that do not have extractable textual data may not be used for responses.
You can also attach HTML files manually as an attachment if you want a single-page summarization.
Yes, both 1.5 Pro and Flash are free to use, and the latter is used by default (overridden by the model: parameter).
The only limit is the rate limit. 1.5 Pro rate limits are usually lower than Flash's.
If you have an account with higher rate limits, we suggest self-hosting this bot and using your own API keys from AI Studio with billing enabled to serve your users. Vertex AI and other non-Google AI models are not supported at this time.
You can use the /ask, /imagine, and /sweep commands in the bot's DM once you install this app by tapping "Add app" in its profile card and clicking "Try it yourself". Otherwise, you will get an "Integration error" when using these commands directly in DMs.
Keep in mind that after installing the app to yourself, the mentioned commands are exposed everywhere, even if the bot is not authorized in guilds you've joined. Using the /ask and /sweep commands is not supported outside DMs or guilds where the bot is authorized, even though they are visible everywhere when the app is installed in user scope. This is because some actions like ctx.send will prematurely end the command with a Missing Access error.