Building a speech recognizer AI/ML model in Python (Part 3 of 6 — Generating audio signals)
In the previous part, we learned how audio signals work. In this part, let’s see how we can generate one.
We will use the NumPy package to generate audio signals. Since audio signals are combinations of sinusoids, we can generate one from a few predefined parameters: duration, sampling frequency, and tone frequency.
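To make the "combination of sinusoids" idea concrete, here is a minimal sketch that adds two sine waves into one signal. The sampling rate, frequencies, and amplitudes are hypothetical values chosen only for illustration; they are not used elsewhere in this part.

```python
import numpy as np

# Hypothetical sampling rate, chosen only for this illustration
sampling_freq = 8000

# One second of time stamps, in seconds
t = np.arange(sampling_freq) / sampling_freq

# Hypothetical components as (frequency in Hz, amplitude) pairs
components = [(440, 1.0), (880, 0.5)]

# The composite signal is simply the sum of the individual sinusoids
composite = sum(amp * np.sin(2 * np.pi * freq * t) for freq, amp in components)

print(composite.shape)
```

Each extra pair in `components` adds one more sinusoid to the mix, which is the basic idea behind building richer sounds from simple tones.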
Let's generate audio signals using Python:
Create a new Python file and import the packages.
# Import packages
import numpy as np
import matplotlib.pyplot as plt
from scipy.io.wavfile import write
Define the output audio file’s name.
# Output file where the generated audio will be saved
output_file = "generated_audio.wav"
Specify the audio parameters.
# Specify audio parameters
duration = 4
sampling_freq = 44100
tone_freq = 784
# Time axis limits in seconds, so that tone_freq is measured in Hz
min_val = 0
max_val = duration
Generate audio signal using the defined parameters.
# Generate audio signals
t = np.linspace(min_val, max_val, duration * sampling_freq)
signal = np.sin(2 * np.pi * tone_freq * t)
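A quick way to check the pitch of a generated signal is to look at its spectrum. The sketch below is not part of the original walkthrough. It assumes a time axis measured in seconds (sample n occurs at n / sampling_freq) and uses NumPy's FFT helpers to find the strongest frequency.

```python
import numpy as np

# Parameters mirroring the ones used in this part
duration = 4            # seconds
sampling_freq = 44100   # samples per second
tone_freq = 784         # Hz

# Assumption: time axis in seconds, one sample every 1 / sampling_freq
num_samples = duration * sampling_freq
t = np.arange(num_samples) / sampling_freq
signal = np.sin(2 * np.pi * tone_freq * t)

# Magnitude spectrum of the real-valued signal
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(num_samples, d=1 / sampling_freq)

# Frequency bin with the largest magnitude
peak_freq = freqs[np.argmax(spectrum)]
print(f"Dominant frequency: {peak_freq:.1f} Hz")
```

Under these assumptions the reported value should be very close to 784 Hz. If the time axis covers a different span than the playback duration, the peak moves accordingly, which makes this a handy sanity check.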
Normalize and scale the audio signal.
# Scale the signal to 16-bit int values
scaling_factor = np.power(2, 15) - 1
signal_normalized = signal / np.max(np.abs(signal))
signal_scaled = np.int16(signal_normalized * scaling_factor)
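To see what normalizing and scaling do to the numbers, here is a small sketch on a hypothetical four-sample signal. The values in `toy_signal` are made up for illustration. The steps mirror the ones above: divide by the largest absolute value, multiply by 32767, then convert to 16-bit integers.

```python
import numpy as np

# Hypothetical toy signal with an arbitrary amplitude
toy_signal = np.array([0.0, 0.25, -0.5, 0.5])

# Largest value a signed 16-bit integer can hold
scaling_factor = 2 ** 15 - 1

# After normalization, the largest absolute value is exactly 1.0
normalized = toy_signal / np.max(np.abs(toy_signal))

# Converting to int16 drops the fractional part of each value
scaled = (normalized * scaling_factor).astype(np.int16)

print(normalized)
print(scaled)
print(scaled.dtype)
```

Normalizing first means the loudest sample always maps to the edge of the 16-bit range, whatever the original amplitude was.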
Save the generated audio signal to a file.
# Save the audio signal in a file
write(output_file, sampling_freq, signal_scaled)
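One way to confirm that a file was written as intended is to read it back with `scipy.io.wavfile.read`. The sketch below is a self-contained round trip, not a step from the original walkthrough. The one-second tone and the temporary file name `roundtrip_check.wav` are hypothetical.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile

sampling_freq = 44100

# Hypothetical one-second 784 Hz tone, already scaled to 16-bit integers
t = np.arange(sampling_freq) / sampling_freq
tone = np.int16(32767 * np.sin(2 * np.pi * 784 * t))

# Hypothetical file name inside the system's temporary directory
path = os.path.join(tempfile.gettempdir(), "roundtrip_check.wav")
wavfile.write(path, sampling_freq, tone)

# read() returns the sampling rate and the samples as a NumPy array
rate, data = wavfile.read(path)
print(rate, data.dtype, data.shape)
print("Duration in seconds:", len(data) / rate)
```

The sampling rate and data type read back should match what was written, and the number of samples divided by the rate gives the playback duration.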
Extract the first 200 values for plotting.
# Extract first 200 values
signal = signal[:200]
Construct the time axis in ms.
# Constructing time axis in ms
time_axis = 1000 * np.arange(len(signal)) / sampling_freq
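The arithmetic behind the time axis can be checked directly: sample n is taken at 1000 * n / sampling_freq milliseconds. The sketch below works out how much time 200 samples cover and, assuming the tone really is at 784 Hz, roughly how many cycles the plot should show.

```python
import numpy as np

sampling_freq = 44100   # samples per second
tone_freq = 784         # Hz, assumed frequency of the plotted tone
num_points = 200        # number of samples kept for plotting

# Time stamp of each sample in milliseconds
time_axis = 1000 * np.arange(num_points) / sampling_freq

# Gap between neighbouring samples and length of one tone cycle, in ms
sample_spacing_ms = 1000 / sampling_freq
period_ms = 1000 / tone_freq

# Number of full or partial cycles covered by the plotted window
cycles_shown = time_axis[-1] / period_ms

print(f"Window length: {time_axis[-1]:.3f} ms")
print(f"Sample spacing: {sample_spacing_ms:.4f} ms")
print(f"Cycles shown: {cycles_shown:.2f}")
```

With these numbers the window covers about 4.5 ms, so the plot should show roughly three and a half cycles of the sine wave.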
Finally, let’s plot the signal.
# Plot
plt.plot(time_axis, signal, color="black")
plt.xlabel('Time (ms)')
plt.ylabel('Amplitude')
plt.title('Generated audio signal')
plt.show()
You should now see a file called generated_audio.wav, which you can play using your computer’s media player. It should sound like a steady four-second tone whose pitch is set by tone_freq. The code above adds no noise, so the tone should sound clean.
In the next part (Part 4), we will explore how to synthesize tones to generate music.