Solutions
- A framework for creating virtual assistants for specific purposes.
 - Language model integration: uses LLMs (e.g. OpenAI GPT-4, LaMDA, LLaMA) to parse inputs and generate responses.
 - Task-driven autonomous agent: given a single starting objective, an agent can generate a sequence of actions to achieve it and then execute them.
 - Visual avatar: uses Live2D Cubism for animated characters
 - Voice input: uses OpenAI's Whisper model to accept voice inputs
 - Voice output: uses the VITS model for voice output
 - Multi-agent communication: multiple agent instances can communicate with one another.
 - External actions: agents can interact with the outside world by executing actions, e.g. search on Google, send an email, etc. This allows you to build plugins to extend the agent's capabilities.
 - Time travelling, branching and logging: the agent's prompts, states and actions are logged as pure JSON objects and vectors, so it is possible to rewind to an earlier step (time travelling) and to continue from it along a different path (branching).
 - Data source insights: you can supply data sources that the agent can look into (e.g. SQL databases). Agents can create embeddings in vector databases. Hook into CrawlGPT.
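A minimal sketch of how the time travelling and branching item could work, assuming each step is stored as one JSON-serialisable entry. `AgentLog`, `record`, `state_at` and `branch` are hypothetical names made up for illustration, not an existing API, and the vector part of the log is left out.

```python
import copy
import json
from dataclasses import dataclass, field


@dataclass
class AgentLog:
    """Append-only log of agent steps; every entry is a plain JSON-style dict."""
    entries: list = field(default_factory=list)

    def record(self, prompt: str, state: dict, action: str) -> int:
        """Append one step and return its index in the log."""
        # Round-trip through JSON so the log never shares mutable objects with the agent.
        entry = json.loads(json.dumps({"prompt": prompt, "state": state, "action": action}))
        self.entries.append(entry)
        return len(self.entries) - 1

    def state_at(self, step: int) -> dict:
        """Time travel: return a copy of the state recorded at `step`."""
        return copy.deepcopy(self.entries[step]["state"])

    def branch(self, step: int) -> "AgentLog":
        """Branch: start a new log that copies history up to and including `step`."""
        return AgentLog(entries=copy.deepcopy(self.entries[: step + 1]))


# Hypothetical example run: three logged steps, then a branch from the first one.
log = AgentLog()
log.record("plan trip", {"todo": ["book flight", "book hotel"]}, "search_flights")
log.record("flight booked", {"todo": ["book hotel"]}, "search_hotels")
log.record("hotel booked", {"todo": []}, "done")

alt = log.branch(0)
alt.record("take the train instead", {"todo": ["book train", "book hotel"]}, "search_trains")
```

Because a branch is a deep copy of the history, changes on `alt` never affect `log`, so both timelines can be replayed or compared later.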
 
Tech Stack
Implementations
Notes
- Data consumption and intake
- Fine-tuning the model
 - Using vector databases by creating embeddings
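A rough sketch of the embedding idea, assuming a toy word-count "embedding" and an in-memory list in place of a real embedding model and vector database. `embed`, `cosine` and `VectorStore` are hypothetical names used only for illustration.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real setup would call an embedding model instead."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class VectorStore:
    """In-memory stand-in for a vector database (assumption, not a real client)."""

    def __init__(self) -> None:
        self.items = []  # list of (text, vector) pairs

    def add(self, text: str) -> None:
        self.items.append((text, embed(text)))

    def search(self, query: str, k: int = 1) -> list:
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = VectorStore()
store.add("rent is due on the first day of the month")
store.add("the cat sleeps on the sofa")
store.add("electricity bill was 40 dollars in march")
```

Calling `store.search("when is rent due")` ranks the rent note first, and that text is what the agent would then place in its prompt as context.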
 
 - Integrates into Poom's Personal Dashboard
 - We need to create APIs for building and registering commands and actions
 - Might just integrate LangChain, as it's easier than rolling my own
 - Challenge: runs entirely in the browser (?)
 - Creates a separate agent for each purpose (e.g. a Yuuka agent for finance management)
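One possible shape for the command and action registration API noted above, as a hedged sketch. `ActionRegistry`, `register`, `run` and the two stub actions are hypothetical; they are not taken from LangChain or any other existing library.

```python
from typing import Callable, Dict, List


class ActionRegistry:
    """Maps action names to plain Python callables that the agent may invoke."""

    def __init__(self) -> None:
        self._actions: Dict[str, Callable[..., str]] = {}

    def register(self, name: str) -> Callable:
        """Decorator that registers a function under `name`."""
        def decorator(fn: Callable[..., str]) -> Callable[..., str]:
            if name in self._actions:
                raise ValueError(f"action already registered: {name}")
            self._actions[name] = fn
            return fn
        return decorator

    def run(self, name: str, **kwargs) -> str:
        """Look up an action by name and call it with keyword arguments."""
        if name not in self._actions:
            raise KeyError(f"unknown action: {name}")
        return self._actions[name](**kwargs)

    def names(self) -> List[str]:
        """Sorted list of registered action names, e.g. to show the LLM what it can call."""
        return sorted(self._actions)


registry = ActionRegistry()


@registry.register("send_email")
def send_email(to: str, subject: str) -> str:
    # Stub: a real plugin would call a mail API here.
    return f"email to {to}: {subject}"


@registry.register("search")
def search(query: str) -> str:
    # Stub: a real plugin would call a search API here.
    return f"results for {query!r}"
```

The agent would only need to emit an action name plus arguments (for example `registry.run("search", query="weather")`), which keeps plugins decoupled from the agent loop.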
 
Inspirations