The Gemini Android application beta now lets users attach audio files such as MP3s to chat conversations. The feature, observed by Android Authority in version 16.30.59.sa.arm64 of the Google app beta, shows a “Talk live about this” prompt when a file is attached. The attachment option is present, but audio processing in the beta is not yet fully operational.
After attaching an audio file, users can either type a question or select the “talk live” prompt. Current observations indicate that Gemini does not process the audio consistently. In some instances, the application ignores the attached file entirely. In others, Gemini generates responses unrelated to the audio content, behavior consistent with chatbot hallucination.
Despite the current limitations in the Android beta, the Gemini API already supports audio input. Developers can submit audio files to the API and request tasks such as describing the audio content, summarizing spoken information, and transcribing speech. The API also accepts requests scoped to specific timestamps, such as “2:30 to 3:29.” Supported audio formats for the API include MP3, WAV, and FLAC.
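As a rough illustration of the API usage described above, the sketch below assembles a request body that pairs a text prompt with an inline audio clip and limits the task to a time range. It is not taken from the article. The helper names (`mmss`, `segment_prompt`, `build_audio_request`) are hypothetical, and the MIME strings and the `contents` / `parts` / `inline_data` field layout are assumptions based on the publicly documented REST shape, so they should be checked against the current Gemini API reference before use. No network call is made.

```python
import base64
import json
import os

# Assumed MIME types for the three formats the article names.
# Verify these strings against the current API documentation.
AUDIO_MIME_TYPES = {
    ".mp3": "audio/mp3",
    ".wav": "audio/wav",
    ".flac": "audio/flac",
}


def mmss(seconds: int) -> str:
    """Format a second count as an MM:SS timestamp."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:02d}:{secs:02d}"


def segment_prompt(task: str, start_s: int, end_s: int) -> str:
    """Build a prompt that scopes a task to one segment of the audio."""
    return f"{task} from {mmss(start_s)} to {mmss(end_s)}."


def build_audio_request(filename: str, audio_bytes: bytes, prompt: str) -> dict:
    """Assemble a JSON-serializable body with a prompt and inline audio.

    The field names here are an assumption about the REST request shape,
    not something confirmed by the article.
    """
    ext = os.path.splitext(filename)[1].lower()
    if ext not in AUDIO_MIME_TYPES:
        raise ValueError(f"unsupported audio format: {ext or filename}")
    return {
        "contents": [
            {
                "parts": [
                    {"text": prompt},
                    {
                        "inline_data": {
                            "mime_type": AUDIO_MIME_TYPES[ext],
                            # Raw bytes are base64-encoded for JSON transport.
                            "data": base64.b64encode(audio_bytes).decode("ascii"),
                        }
                    },
                ]
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real recording.
    prompt = segment_prompt("Transcribe the speech", 150, 209)
    body = build_audio_request("interview.mp3", b"\x00\x01\x02", prompt)
    print(json.dumps(body, indent=2))
```

Sending the body would additionally require an API key, a model name, and an HTTP client. Those are left out here because the article does not specify them.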
Audio file attachment in the Gemini Android app appears to be a work in progress, and Google has not confirmed a launch date for the feature. Image upload is already widely available in the app, which suggests audio support is the next step in its capabilities.