AI over SMS – building your own personal assistant

Artificial Intelligence (AI) has dramatically transformed our interaction with technology. One such intriguing application is an AI-powered personal assistant over SMS, which allows users to tap into the power of AI using a basic SMS functionality. This concept is not standalone but forms a small part of a larger project we are working on: WITson.ai.

WITson is an ambitious project aiming to bridge the gap between complex AI technologies and everyday users, by integrating a range of web technologies, APIs, mobile applications, and more. The AI over SMS assistant is a small piece of the WITson ecosystem, designed to provide AI’s capabilities to users irrespective of their device or connectivity limitations.

System Overview

Developing the AI-powered SMS assistant as a component of the WITson project involves a host of interconnected components. Each one plays a vital role in ensuring seamless functionality and high-quality user interactions. The components can be broadly categorized into five groups: 

  1. SMS Gateway: This is the communication bridge between the system and the mobile networks. It allows the application to send and receive SMS messages from the users. This gateway could either be a direct connection to the telecom carrier or through third-party gateway providers.
  2. AI Model: At the heart of the SMS-based AI assistant lies the powerful language model (for example, GPT-4 developed by OpenAI). The model is capable of generating human-like text, making it the perfect tool to drive an AI assistant that can understand and respond to a wide variety of user queries.
  3. Server: The server acts as the central control unit for the SMS assistant. It performs multiple critical tasks such as:
    • Processing incoming SMS: This includes parsing the content of the message and any associated metadata (e.g., sender’s number, timestamp).
    • Interacting with the AI model: The server communicates with the AI model, sending it user queries and receiving generated responses.
    • Maintaining conversation states: The server tracks ongoing conversations, ensuring that the AI model’s responses are contextually relevant.
    • Sending responses: Once a response is generated, the server routes it back to the user via the SMS Gateway.
  4. Database: The database is a key element that ensures the continuity and relevance of the conversations with the AI assistant. It stores all conversation data, including a unique identifier for each conversation (typically the phone number), the messages exchanged, and any additional context. This allows the AI assistant to handle multiple simultaneous conversations and maintain context across different sessions with the same user.
  5. Queuing System: Given the scale at which the WITson project operates, the system will need to handle a large volume of messages simultaneously. A queuing system allows the application to manage these messages efficiently, without overwhelming the server or the AI model. It also serves as a buffer when there are response delays from the AI model. Incoming messages are stored in the queue and are processed by a worker service that can operate asynchronously, thus ensuring smooth operation even under high loads or during unexpected delays.

Each of these components is integral to the functioning of the AI-powered SMS assistant within the larger WITson project. Their effective interaction is what makes the assistant robust, efficient, and capable of providing a high-quality user experience. It’s also worth noting that while each of these components is complex in its own right, their integration adds another layer of complexity to the system, requiring thoughtful design and meticulous implementation.

Design and Implementation

SMS Gateway

The design and implementation of the SMS Gateway require careful consideration, given its role as the touchpoint between the user and the WITson system. Firstly, an appropriate SMS gateway service must be selected, one that guarantees reliable and timely delivery of messages.

After setting up the account with the selected provider, the application will need to be integrated with the gateway provider’s API. The gateway will trigger a callback to the specified endpoint whenever an SMS is received. The data structure of the callback (usually JSON or XML) will contain details like the sender’s number, the message content, and other metadata. The design of the endpoint needs to be done in such a way that it can parse this information and pass it to the next part of the system.

AI Model

The integration of the AI model is at the heart of the assistant’s ability to understand and respond to user queries. This process involves setting up an account to get API credentials, which will be used to make HTTP POST requests to the API endpoint. These requests contain the user’s query in a specific format defined by the AI model.

Given that AI model responses may be extensive and could exceed the SMS character limit, it’s crucial to design the application to handle this. The response from the model may need to be parsed and split into several SMS messages. Another consideration is the ‘temperature’ parameter in the API request that controls the randomness of the model’s responses – this will be fine-tuned based on testing to ensure the quality of responses.

Server and Database

The server is designed to be robust, scalable, and capable of multitasking – handling incoming SMS, interacting with the AI model, maintaining conversation states, and sending responses back through the SMS Gateway.

To maintain the conversation states, the server interacts with a database. The schema of the database is designed to store conversations in an organized manner, with each conversation linked to a unique identifier (typically the phone number). This allows the assistant to keep track of ongoing conversations and provide contextual responses.

Queuing System

The queuing system plays a vital role in managing incoming message traffic, especially during peak hours. Designing this system requires considering the average message volume and any possible spikes. The system will be scalable enough to handle an increase in volume without affecting performance or responsiveness.

When an SMS is received, it’s placed into the queue from where a separate worker service reads it and performs the necessary processing. This service works asynchronously, enabling it to handle multiple tasks simultaneously without blocking operations.

The queuing system also assists in situations where there are delays or unresponsiveness from the AI model. By allowing the worker service to wait for the response without halting other tasks, the queuing system ensures smooth operation.

Error Handling and Retry Mechanism

The design of the error handling and retry mechanism is critical to the system’s robustness. If there’s no response from the AI model within a specified time, the worker service can attempt to resend the request. The number of retries, delay between retries, and the conditions that trigger a retry need to be configured based on the model’s typical response times and reliability.

In the event of continued non-responsiveness from the AI model, a failover strategy will be in place. This would involve sending a generic error message to the user, notifying them of the issue and possibly suggesting they try again later.

Error Handling and Retry Mechanism

Error Handling

The first step towards robust error handling is to anticipate potential failures. Given the multiple components involved in the WITson SMS assistant – SMS Gateway, AI Model, Server, Database, and Queuing system – there are numerous points where errors might occur. These could include network errors, service downtime, server failures, database inconsistencies, or unavailability of resources.

For each type of anticipated failure, specific exception handling logic will be designed. This includes:

  1. Logging: When an error occurs, it will be logged with sufficient information to allow debugging. This will include details of the error, the state of the system when the error occurred, and any error messages.
  2. User Communication: Depending on the nature of the error, it might be necessary to communicate with the user. This will be in the form of a simple acknowledgement of their query and an assurance of a delayed response, or an error message suggesting they try again later.
  3. Service Continuity: In some cases, it may be possible to continue with a degraded service while the error is being resolved. For example, if the AI model is slow to respond, the system would still accept incoming messages and queue them for processing when the service is restored.

Retry Mechanism

When dealing with dependencies like the AI model or SMS Gateway, the unavailability of these services or network issues could result in temporary failures. A Retry Mechanism is designed to automatically repeat failed operations after a defined interval, increasing the chances of a successful operation.

The retry mechanism will be designed to consider:

  1. Exponential Backoff: This strategy involves increasing the wait time between retries exponentially. This reduces the load on the system and the dependencies while increasing the likelihood of success in subsequent attempts.
  2. Jitter: To avoid a situation where multiple instances of the service retry simultaneously (known as “retry storms”), adding ‘jitter’ (random variation) to the delay times can help spread out the load.
  3. Maximum Retry Limit: There will be a limit on the number of retries to avoid endlessly repeating failed operations. When this limit is reached, the system might switch to a failover strategy or alert the user about the issue.
  4. Selective Retrying: Not all failures are transient, and retrying could be futile or even detrimental in certain cases. The system will be able to differentiate between errors that may resolve on retry (like a network blip) and those that won’t (like a bad request to the AI model).

AI over SMS as a Text User Interface (TUI)

A Text User Interface (TUI) is a type of user interface that uses text and symbols instead of graphical elements to represent the information and actions available to the user. Unlike a Command-Line Interface (CLI) where the user must remember and type specific commands, a well-designed TUI guides the user through interactions, making the system more accessible and easier to use.

In the context of the WITson SMS assistant, TUI translates into a conversation with the AI model. The user sends a query or command via SMS, the AI model processes it, generates a response, and sends it back to the user as an SMS. The entire interaction takes place in a conversational manner, similar to chatting with a human assistant.

Benefits of AI over SMS as a TUI

  1. Ubiquity and Accessibility: SMS is one of the most widely-used communication methods worldwide, and it doesn’t require a smartphone or an internet connection. This makes the SMS assistant accessible to a vast audience, including users with low-end phones or those in regions with poor internet connectivity.
  2. Ease of Use: A well-designed TUI is intuitive and easy to use. Users don’t need to learn specific commands or navigate through complex menus. They can ask questions or give commands in natural language, making the system more user-friendly.
  3. Asynchronous Communication: Unlike voice assistants that require the user’s undivided attention during interactions, text-based communication is inherently asynchronous. Users can send a message and check the response at their convenience, making it a more flexible interaction method.

Design Considerations for AI over SMS as a TUI

Building an effective TUI requires careful consideration of the user’s experience. Here are a few aspects that will be taken into account:

  1. Language Understanding and Context Awareness: The AI model needs to be proficient at understanding natural language queries and maintaining the context of the conversation. This is key to providing relevant and accurate responses.
  2. SMS Character Limit: SMS messages have a character limit (160 characters in English). The assistant will be designed to handle this, splitting long responses into multiple SMS messages if necessary.
  3. Response Time: While SMS allows for asynchronous communication, users still expect timely responses. The system will be designed to process queries and generate responses as quickly as possible.
  4. Error Handling: As with any system, effective error handling is crucial. If the assistant doesn’t understand a query or encounters an issue, it will be able to communicate this to the user in a clear and friendly manner.

Conclusion

The AI over SMS assistant forms a vital piece of the ambitious WITson project, embodying the project’s vision of making AI accessible to all and serve the needs of people who prefer to use SMS. It involves thoughtful design and meticulous implementation, from setting up the SMS Gateway, queuing system, and error handling, to interacting with the AI model and maintaining conversation states. While this is a complex task, it promises a powerful, user-friendly AI tool that can be accessed from anywhere at any time, making it a significant step towards the broader vision of the WITson project.

Together our conversations can expand solutions and value

We look forward to helping you bring your ideas and solutions to life.
Share the Post:

2 Responses

  1. Hi! This sounds great! Nice description of what would be needed. The blog was a year ago — what’s the current status of the AI over SMS Personal Assistant project?

    1. Hey Robert, we actually never released it out to the public. Internally we call it WITSon – spin on our company name WITCo and IBM’s Watson. So far it has been working well.

Leave a Reply

Your email address will not be published. Required fields are marked *