Bolna helps you create AI agents that can be instructed to perform tasks. Each agent is assembled from the following components, starting with the input:

  • An Input medium
    • For voice-based conversations, the agent input could be a microphone or a phone call
    • For text-based conversations, the agent could take input via keyboard
    • For visual conversations, the agent could take input in the form of images (Coming soon)
  • An ASR (automatic speech recognition) model
    • The ASR converts the input to an LLM-compatible format so it can be passed to the chosen LLM
  • An LLM
    • The LLM takes the input from the ASR, generates the appropriate response, and passes it to the TTS or Image Generation model, depending on the type of conversation the agent is being built for
  • A TTS / Image Generation Model
    • Takes the LLM response and generates a compatible output to pass on to the output component
  • An Output component
    • Similar to the input component, this will pass the compatible text/voice/image to the output medium
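
The flow of a single conversation turn through these components can be sketched as follows. This is a minimal illustration only, not Bolna's actual API: the `AgentPipeline` class, its field names, and the stand-in stages are all hypothetical and chosen so the example runs without any external service.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage types for illustration; these are not Bolna's API.
Transcriber = Callable[[bytes], str]   # ASR: raw input -> text for the LLM
Responder = Callable[[str], str]       # LLM: text -> response text
Synthesizer = Callable[[str], bytes]   # TTS: response text -> output bytes


@dataclass
class AgentPipeline:
    transcribe: Transcriber
    respond: Responder
    synthesize: Synthesizer

    def handle_turn(self, raw_input: bytes) -> bytes:
        """Run one turn: input -> ASR -> LLM -> TTS -> output."""
        text = self.transcribe(raw_input)
        reply = self.respond(text)
        return self.synthesize(reply)


# Stand-in stages (assumptions): each one fakes the real model with a
# trivial transformation so the data flow is easy to follow.
pipeline = AgentPipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    respond=lambda text: f"You said: {text}",
    synthesize=lambda reply: reply.encode("utf-8"),
)

output = pipeline.handle_turn(b"hello")
```

Keeping each stage behind a plain callable mirrors the idea that the ASR, LLM, and TTS are interchangeable components; a real agent would plug provider-backed implementations into the same slots.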


Bolna also lets you instruct your agent to execute follow-up tasks once the conversation has ended:

  • Summarization task
  • Extraction task
  • Webhook task
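
As a rough sketch of how the three post-conversation tasks might fit together, the example below runs them over a finished transcript. Everything here is an assumption for illustration: the function names, the toy summarization and extraction logic, and the choice to have the webhook task deliver the results of the other two are not taken from Bolna. The `send` callable stands in for an HTTP POST so the example needs no network.

```python
import json
from typing import Callable


def summarize(transcript: list[str]) -> str:
    # Stand-in: a real agent would ask an LLM to write the summary.
    return f"{len(transcript)} turns; opened with: {transcript[0]}"


def extract(transcript: list[str]) -> dict[str, str]:
    # Stand-in: pick out the first token that looks like an email address.
    for turn in transcript:
        for token in turn.split():
            if "@" in token:
                return {"email": token}
    return {}


def run_post_call_tasks(
    transcript: list[str], send: Callable[[str], None]
) -> dict:
    """Summarize, extract, then hand the JSON payload to the webhook sender."""
    result = {"summary": summarize(transcript), "extracted": extract(transcript)}
    send(json.dumps(result))  # hypothetical webhook delivery
    return result


transcript = [
    "Hi, I need a demo",
    "Sure, what is your email?",
    "It is sam@example.com",
]
sent: list[str] = []  # captures what the webhook would have received
result = run_post_call_tasks(transcript, sent.append)
```

Passing the sender in as a parameter keeps the task logic testable; in a deployment it would be replaced by a function that posts the payload to the configured webhook URL.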