Convert ogg byte array to wav byte array Python
I want to convert an OGG byte array (Opus codec) to a WAV byte array without saving to disk. I have downloaded audio from the Telegram API and it is in byte array format with an .ogg extension. I do not want to save it to the filesystem, in order to eliminate filesystem I/O latency. Currently, I first save the audio file in .ogg format using the code below, which uses the Telegram API (for reference: https://docs.python-telegram-bot.org/en/stable/telegram.file.html#telegram.File.download_to_drive):
    # listen for audio messages
    async def audio(update, context):
        newFile = await context.bot.get_file(update.message.voice.file_id)
        await newFile.download_to_drive(output_path)

    subprocess.call(["ffmpeg", "-i", output_path, output_path.replace(".ogg", ".wav"), '-y'], stderr=subprocess.DEVNULL, stdout=subprocess.DEVNULL)

    async def audio(update, context):
        newFile = await context.bot.get_file(update.message.voice.file_id)
        byte_array = await newFile.download_as_bytearray()
to get byte_array, and now I want this byte_array to be converted to WAV without saving to disk and without using ffmpeg. Let me know in the comments if something is unclear. Thanks! Note: I have set up a Telegram bot at the backend which listens for audio sent to a private chat, which I send manually for testing purposes.
Edit your question with a solution that writes the data to the disk. Please check the codec of the OGG container.
I am not writing the data to disk, simply fetching the byte array using the Telegram API. The codec of the OGG container is Opus, which I found using ffprobe.
1 Answer
We may write the OGG data to FFmpeg stdin pipe, and read the encoded WAV data from FFmpeg stdout pipe.
My following answer describes how to do it with video, and we may apply the same solution to audio.
The example assumes that the OGG data is already downloaded and stored in bytes array (in the RAM).
     --------------------  Encoded      ---------  Encoded      ------------
    | Input OGG encoded  | OGG data    | FFmpeg  | WAV data    | Store to   |
    | stream             | ----------> | process | ----------> | BytesIO    |
     --------------------  stdin PIPE   ---------  stdout PIPE  ------------
The implementation is equivalent to the following shell command:
Linux: cat input.ogg | ffmpeg -y -f ogg -i pipe: -f wav pipe: > test.wav
Windows: type input.ogg | ffmpeg -y -f ogg -i pipe: -f wav pipe: > test.wav
The example uses ffmpeg-python module, but it’s just a binding to FFmpeg sub-process (FFmpeg CLI must be installed, and must be in the execution path).
Execute FFmpeg sub-process with stdin pipe as input and stdout pipe as output:
    ffmpeg_process = (
        ffmpeg
        .input('pipe:', format='ogg')
        .output('pipe:', format='wav')
        .run_async(pipe_stdin=True, pipe_stdout=True)
    )
The input format is set to ogg, and the output format is set to wav (using the default encoding parameters).
Assuming the audio file is relatively large, we can’t write the entire OGG data at once, because doing so (without «draining» stdout pipe) causes the program execution to halt.
We may have to write the OGG data (in chunks) in a separate thread, and read the encoded data in the main thread.
Here is a sample for the «writer» thread:
    def writer(ffmpeg_proc, ogg_bytes_arr):
        chunk_size = 1024  # Define chunk size to 1024 bytes (the exact size is not important).
        n_chunks = len(ogg_bytes_arr) // chunk_size  # Number of full chunks (without the smaller remainder chunk at the end).
        remainder_size = len(ogg_bytes_arr) % chunk_size  # Remainder bytes (assume total size is not a multiple of chunk_size).

        for i in range(n_chunks):
            # Write a chunk of data bytes to stdin pipe of FFmpeg sub-process.
            ffmpeg_proc.stdin.write(ogg_bytes_arr[i*chunk_size:(i+1)*chunk_size])

        if remainder_size > 0:
            # Write the remainder bytes to stdin pipe of FFmpeg sub-process.
            ffmpeg_proc.stdin.write(ogg_bytes_arr[chunk_size*n_chunks:])

        # Close stdin pipe - closing stdin finishes encoding the data, and closes FFmpeg sub-process.
        ffmpeg_proc.stdin.close()
The «writer thread» writes the OGG data in small chunks.
The last chunk is smaller (assuming the length is not a multiple of the chunk size).
At the end, stdin pipe is closed.
Closing stdin finishes encoding the data, and closes FFmpeg sub-process.
In the main thread, we start the thread, and read encoded «WAV» data from stdout pipe (in chunks):
    thread = threading.Thread(target=writer, args=(ffmpeg_process, ogg_bytes_array))
    thread.start()

    while thread.is_alive():
        wav_chunk = ffmpeg_process.stdout.read(1024)  # Read chunk with arbitrary size from stdout pipe
        out_stream.write(wav_chunk)  # Write the encoded chunk to the "in-memory file".
For reading the remaining data, we may use ffmpeg_process.communicate() :
    # Read the last encoded chunk.
    wav_chunk = ffmpeg_process.communicate()[0]
    out_stream.write(wav_chunk)  # Write the encoded chunk to the "in-memory file".
Complete code sample:

    import ffmpeg
    from io import BytesIO
    import threading

    async def download_audio(update, context):
        # The method is not used - we are reading the audio from a file instead (just for testing).
        newFile = await context.bot.get_file(update.message.voice.file_id)
        bytes_array = await newFile.download_as_bytearray()
        return bytes_array

    # Equivalent Linux shell command:
    # cat input.ogg | ffmpeg -y -f ogg -i pipe: -f wav pipe: > test.wav
    # Equivalent Windows shell command:
    # type input.ogg | ffmpeg -y -f ogg -i pipe: -f wav pipe: > test.wav

    # Writer thread - write the OGG data to FFmpeg stdin pipe in small chunks of 1 KByte.
    def writer(ffmpeg_proc, ogg_bytes_arr):
        chunk_size = 1024  # Define chunk size to 1024 bytes (the exact size is not important).
        n_chunks = len(ogg_bytes_arr) // chunk_size  # Number of full chunks (without the smaller remainder chunk at the end).
        remainder_size = len(ogg_bytes_arr) % chunk_size  # Remainder bytes (assume total size is not a multiple of chunk_size).

        for i in range(n_chunks):
            # Write a chunk of data bytes to stdin pipe of FFmpeg sub-process.
            ffmpeg_proc.stdin.write(ogg_bytes_arr[i*chunk_size:(i+1)*chunk_size])

        if remainder_size > 0:
            # Write the remainder bytes to stdin pipe of FFmpeg sub-process.
            ffmpeg_proc.stdin.write(ogg_bytes_arr[chunk_size*n_chunks:])

        # Close stdin pipe - closing stdin finishes encoding the data, and closes FFmpeg sub-process.
        ffmpeg_proc.stdin.close()

    if False:
        # We may assume that ogg_bytes_array is the output of download_audio method
        # (in a real bot the call would have to be awaited inside an async handler).
        ogg_bytes_array = download_audio(update, context)
    else:
        # The example reads the OGG data from a file (for testing).
        with open('input.ogg', 'rb') as f:
            ogg_bytes_array = f.read()

    # Execute FFmpeg sub-process with stdin pipe as input and stdout pipe as output.
    ffmpeg_process = (
        ffmpeg
        .input('pipe:', format='ogg')
        .output('pipe:', format='wav')
        .run_async(pipe_stdin=True, pipe_stdout=True)
    )

    # Open in-memory file for storing the encoded WAV file
    out_stream = BytesIO()

    # Start a thread that writes the OGG data in small chunks.
    # We need the thread because writing too much data to stdin pipe at once causes a deadlock.
    thread = threading.Thread(target=writer, args=(ffmpeg_process, ogg_bytes_array))
    thread.start()

    # Read encoded WAV data from stdout pipe of FFmpeg, and write it to out_stream
    while thread.is_alive():
        wav_chunk = ffmpeg_process.stdout.read(1024)  # Read chunk with arbitrary size from stdout pipe
        out_stream.write(wav_chunk)  # Write the encoded chunk to the "in-memory file".

    # Read the last encoded chunk.
    wav_chunk = ffmpeg_process.communicate()[0]
    out_stream.write(wav_chunk)  # Write the encoded chunk to the "in-memory file".
    out_stream.seek(0)  # Seek to the beginning of out_stream
    ffmpeg_process.wait()  # Wait for FFmpeg sub-process to end

    # Write out_stream to file - just for testing:
    with open('test.wav', "wb") as f:
        f.write(out_stream.getbuffer())
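A shorter variant of the same idea may be worth trying; treat it as a sketch under assumptions rather than a drop-in replacement. `subprocess.run` with `input=` goes through `Popen.communicate()`, which feeds stdin and drains stdout concurrently, so a separate writer thread should not be needed. The helper name `run_filter` is hypothetical; the FFmpeg arguments are the ones from the equivalent shell command, and the FFmpeg CLI is assumed to be in the execution path.

```python
import subprocess

# Same arguments as the equivalent shell command: OGG from stdin, WAV to stdout.
FFMPEG_OGG_TO_WAV = ["ffmpeg", "-y", "-f", "ogg", "-i", "pipe:", "-f", "wav", "pipe:"]


def run_filter(data, cmd=FFMPEG_OGG_TO_WAV):
    """Pipe `data` through `cmd` and return whatever it writes to stdout.

    `run_filter` is a hypothetical helper name, not part of any library.
    """
    result = subprocess.run(
        cmd,
        input=bytes(data),          # accept bytes or bytearray
        stdout=subprocess.PIPE,     # collect the converted data in memory
        stderr=subprocess.DEVNULL,  # hide FFmpeg's log output
        check=True,                 # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

With the byte array from `download_as_bytearray()`, the call would be `wav_bytes = run_filter(byte_array)`. The whole input and output are held in memory at once, which should be acceptable for short voice messages.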
AudioConverter
I have some old music in a lossless format. Now that I am constantly jumping between computers, I wanted it converted to a more universal format such as mp3 so that I can play it with the simplest of players. I also wanted to avoid having to stream my music on cloud platforms. After a cursory and naive scan of the web, I found that existing scripts were either defunct (again, a cursory look) or not as simple as I would like. I did not want to download a GUI for a one-time use, or upload a directory of music to have it converted on some server and then download it again. Instead, I wrote this quick CLI to do it for me.
Setup
Install ffmpeg
Go follow the pydub tutorial on how to set up ffmpeg on the various platforms.
Install CLI
pip install --upgrade AudioConverter
Usage
audioconvert convert INPUT_DIRECTORY OUTPUT_DIRECTORY TARGET_FORMAT
This will recursively search the INPUT_DIRECTORY for files with music extensions. Each file found will then be converted to the TARGET_FORMAT and placed in the OUTPUT_DIRECTORY with the same name but updated extension.
The --verbose/-v flag must be provided before the convert command. This will enable debugging logs and allow you to monitor progress.
For example, to convert the contents of the directory input/, containing files of type .m4a and .flac, outputting to the directory output/ and converting to type .mp3, run:
audioconvert convert input/ output/ --output-format .mp3
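The search-and-rename behaviour described under Usage can be sketched roughly as follows. This is not AudioConverter's actual code: the function name `plan_conversions` and the extension list are hypothetical, and the sketch only plans the (source, destination) pairs without converting anything.

```python
from pathlib import Path

# Hypothetical extension list; the tool's real list is not reproduced here.
AUDIO_EXTENSIONS = {".flac", ".m4a", ".mp3", ".ogg", ".wav"}


def plan_conversions(input_dir, output_dir, target_format):
    """Return (source, destination) pairs for every audio file under input_dir."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    # Accept both "mp3" and ".mp3" as the target format.
    suffix = target_format if target_format.startswith(".") else "." + target_format
    pairs = []
    for source in sorted(input_dir.rglob("*")):  # recursive search
        if source.is_file() and source.suffix.lower() in AUDIO_EXTENSIONS:
            # Same file name, new extension, placed directly in output_dir.
            pairs.append((source, output_dir / (source.stem + suffix)))
    return pairs
```

Under this flat layout, two files with the same name in different subdirectories would map to the same destination, so a real tool would need some policy for that case.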
Experimental
A codec can be passed so that audio is converted to a specific codec. This is currently an experimental feature, as it does not check that the codec is compatible with your desired output audio format. Depending on ffmpeg and/or pydub, there may or may not be error logging.
To use the new experimental feature:
audioconvert convert input/ output/ --output-format .wav --codec pcm_mulaw
Accepted Formats
Due to not being super savvy with audio formats, I hard coded the extensions that are searched for in the INPUT_DIRECTORY and acceptable TARGET_FORMAT . Here is a list of formats I thought were popular:
Supported Codec
How to convert ogg to wav?
I couldn't find documentation on how to use this in Python.
Could you tell me how to convert an ogg file to wav with this library?
5 comments
There is documentation.
The utility's capabilities are simply so limited that they can be described in a couple of examples or in a single screenshot of the help output, which is what was shown.
Your question will probably be solved by a command like this.
ftransc -f wav filename.ogg
Error opening 'audio.ogg': File contains data in an unimplemented format.
s1veme, well, ftransc works through the command line. Here is how to work with it in Python.
So in the end you should get something like this:
os.system('ftransc -f wav file.ogg')
Тимур Покровский, I just tried to do it through a library. But now it throws an error saying there is no such file.
    from pydub import AudioSegment

    ogg_audio = AudioSegment.from_file("audio.ogg", format="ogg")
    ogg_audio.export("audio1.wav", format="wav")
FileNotFoundError: [WinError 2] The system cannot find the file specified
warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning
When this library is imported, it won't even let me look at the current directory)
Dr. Bacon, I tried that; it is something related to ffmpeg, with pip install ffmpeg
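The warning and the [WinError 2] above suggest that pydub could not launch an ffmpeg (or avconv) executable. As far as I know, pip install ffmpeg does not put that program on the path; this is an assumption, not something the thread confirms. A minimal check, using only the standard library and a hypothetical helper name `find_converter`:

```python
import shutil


def find_converter(candidates=("ffmpeg", "avconv")):
    """Return the path of the first candidate executable found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)  # None when the program cannot be found
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    found = find_converter()
    print(found or "No ffmpeg/avconv executable on PATH; install the FFmpeg program itself.")
```

If this prints no path, installing the FFmpeg program itself and adding its folder to PATH is the usual remedy.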
    #!/usr/bin/env python3.8
    # -*- coding: utf-8 -*-
    import os
    import soundfile as sf

    data, samplerate = sf.read('/home/delvin/music/rus_tech_remix.ogg')
    sf.write('/home/delvin/tmp/new_file.wav', data, samplerate)
    os.system('file /home/delvin/tmp/new_file.wav')
RuntimeError: Error opening 'audio.ogg': File contains data in an unimplemented format.
It seems this only works on Linux.
SoundFile can read and write sound files. File reading/writing is supported through libsndfile, a free, cross-platform, open-source (LGPL) library for reading and writing many different sampled sound file formats, which runs on many platforms including Windows, OS X, and Unix. It is accessed through CFFI, which is a foreign function interface for Python calling C code. CFFI is supported for CPython 2.6+, 3.x and PyPy 2.0+. SoundFile represents audio data as NumPy arrays.
Python is telling you: "File contains data in an unimplemented format."
Try another file with a full path WITHOUT spaces (put it in the root if you like and write c:\test.ogg)
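One way to narrow down the "unimplemented format" error is to check which codec is inside the Ogg container. The thread does not say which codec audio.ogg uses, so the following is only a diagnostic sketch with a hypothetical helper name `sniff_ogg_codec`. It relies on the Ogg page layout (a 27-byte header starting with "OggS", then a segment table) and on the magic bytes that open each codec's first packet. If it reports Opus, an older libsndfile build may be the cause, since, as far as I know, older versions cannot read Opus.

```python
def sniff_ogg_codec(data):
    """Guess the codec of an Ogg stream from its first page; None if not Ogg."""
    data = bytes(data)
    # An Ogg page starts with "OggS"; the fixed part of the header is 27 bytes.
    if len(data) < 27 or data[:4] != b"OggS":
        return None
    n_segments = data[26]             # length of the segment table that follows
    payload = data[27 + n_segments:]  # first packet: the codec's identification header
    signatures = {
        b"OpusHead": "opus",
        b"\x01vorbis": "vorbis",
        b"\x7fFLAC": "flac",
        b"Speex   ": "speex",
    }
    for magic, name in signatures.items():
        if payload.startswith(magic):
            return name
    return "unknown"
```

For a file on disk, reading the first few hundred bytes is enough, for example `sniff_ogg_codec(open("audio.ogg", "rb").read(512))`.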