Building a speech recognizer AI/ML model in Python (Part 4 of 6 — Synthesizing tones to generate music)
Previously we explored how to generate a single tone, but a single frequency on its own is not very musical. Let’s use the same principle to synthesize music by stitching different tones together. As in music theory, we will use tones such as A, C, G, and F to compose a melody.
Let’s synthesize tones to generate music.
Create a new Python file and import the packages.
# Import packages
import numpy as np
import matplotlib.pyplot as plt
from scipy.io.wavfile import write
import json
Let’s define a function to generate a tone from the given parameters.
# Synthesize the tone
def tone_synthesizer(freq, duration, amplitude=1.0, sampling_freq=44100):
    # Construct the time axis
    time_axis = np.linspace(0, duration, int(duration * sampling_freq))
Construct the audio signal using the specified parameters and return it.
    # Construct the audio signal
    signal = amplitude * np.sin(2 * np.pi * freq * time_axis)
    return signal.astype(np.int16)
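As a quick sanity check, here is how the function might be exercised on its own. One caveat worth noting: because of the `int16` cast at the end, the default amplitude of 1.0 would truncate almost every sample to zero, so pass a large amplitude such as 12000 (the value we use later). This is a minimal sketch; the function is repeated so the snippet runs standalone.

```python
import numpy as np

# The synthesizer from above, repeated so this snippet runs standalone
def tone_synthesizer(freq, duration, amplitude=1.0, sampling_freq=44100):
    time_axis = np.linspace(0, duration, int(duration * sampling_freq))
    return (amplitude * np.sin(2 * np.pi * freq * time_axis)).astype(np.int16)

# One second of A (440 Hz) at a 16-bit-friendly amplitude
tone = tone_synthesizer(440, 1, amplitude=12000)
print(tone.shape)   # → (44100,)
print(tone.dtype)   # → int16
```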
Define the main function.
if __name__ == '__main__':
    # Names of the output files
    file_tone_single = 'generated_tone_single.wav'
    file_tone_sequence = 'generated_tone_sequence.wav'
We will use a tone-mapping file that maps tone names, such as A, C, and G, to their corresponding frequencies.
    # Source:
    mapping_file = 'tone_mapping.json'

    # Load the tone-to-frequency map
    with open(mapping_file, 'r') as f:
        tone_map = json.loads(f.read())
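The contents of `tone_mapping.json` are not shown in this part. If you need to create the file yourself, a minimal version can be generated with the snippet below, using standard equal-temperament frequencies for the fourth octave (A4 = 440 Hz); the exact values in the original file may differ.

```python
import json

# Standard equal-temperament frequencies (fourth octave, A4 = 440 Hz);
# the actual tone_mapping.json used in this series may differ
tone_map = {
    'A': 440.0,
    'B': 493.88,
    'C': 261.63,
    'D': 293.66,
    'E': 329.63,
    'F': 349.23,
    'G': 392.0
}

# Write the map to disk so the loading code above can find it
with open('tone_mapping.json', 'w') as f:
    json.dump(tone_map, f, indent=4)
```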
Let’s generate the F tone.
    # Tone parameters
    tone_name = 'F'
    duration = 3            # seconds
    amplitude = 12000
    sampling_freq = 44100   # Hz
Extract the corresponding frequency and generate the tone using the synthesizer function.
    # Extract the frequency of the tone
    tone_freq = tone_map[tone_name]

    # Generate the tone
    synthesized_tone = tone_synthesizer(tone_freq, duration, amplitude, sampling_freq)

    # Write the generated tone to an audio file
    write(file_tone_single, sampling_freq, synthesized_tone)
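Although matplotlib is imported at the top, we have not used it yet. A quick way to confirm the signal looks like a sine wave is to plot the first few hundred samples. This sketch regenerates the F tone standalone (349.23 Hz, assuming a standard equal-temperament mapping); the output filename is arbitrary.

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')   # non-interactive backend so this runs headless
import matplotlib.pyplot as plt

sampling_freq = 44100
freq, duration, amplitude = 349.23, 3, 12000   # F tone, assumed mapping
time_axis = np.linspace(0, duration, int(duration * sampling_freq))
tone = (amplitude * np.sin(2 * np.pi * freq * time_axis)).astype(np.int16)

# Plot the first 500 samples (about 11 ms) so individual cycles are visible
plt.plot(time_axis[:500], tone[:500])
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')
plt.title('Generated F tone')
plt.savefig('tone_waveform.png')
```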
To make it sound more like music, we will create a sequence of tones with varying durations.
    # Tone sequence along with corresponding durations in seconds
    tone_sequence = [('G', 0.4), ('D', 0.5), ('F', 0.3), ('C', 0.6), ('A', 0.4)]
Construct the audio signal based on this tone sequence.
    # Construct the audio signal
    signal = np.array([], dtype=np.int16)
    for item in tone_sequence:
        # Get the name of the tone
        tone_name = item[0]
For each tone, extract the corresponding frequency.
        # Extract the corresponding frequency of the tone
        freq = tone_map[tone_name]

        # Extract the corresponding duration
        duration = item[1]

        # Synthesize the tone
        synthesized_tone = tone_synthesizer(freq, duration, amplitude, sampling_freq)
Append the synthesized tone to the main output signal, then save the result to an audio file.
        # Append the synthesized tone to the main signal
        signal = np.append(signal, synthesized_tone, axis=0)

    # Save the audio file
    write(file_tone_sequence, sampling_freq, signal.astype(np.int16))
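To check that the sequence round-trips correctly, we can read the file back with scipy. The sketch below is self-contained: it regenerates the sequence using an assumed standard equal-temperament mapping, then verifies the sampling rate, the sample type, and that the total length is about 2.2 seconds.

```python
import numpy as np
from scipy.io.wavfile import write, read

# The synthesizer from above, repeated so this snippet runs standalone
def tone_synthesizer(freq, duration, amplitude=1.0, sampling_freq=44100):
    time_axis = np.linspace(0, duration, int(duration * sampling_freq))
    return (amplitude * np.sin(2 * np.pi * freq * time_axis)).astype(np.int16)

# Assumed standard equal-temperament frequencies (A4 = 440 Hz)
tone_map = {'A': 440.0, 'C': 261.63, 'D': 293.66, 'F': 349.23, 'G': 392.0}
tone_sequence = [('G', 0.4), ('D', 0.5), ('F', 0.3), ('C', 0.6), ('A', 0.4)]

sampling_freq = 44100
signal = np.array([], dtype=np.int16)
for tone_name, duration in tone_sequence:
    tone = tone_synthesizer(tone_map[tone_name], duration, 12000, sampling_freq)
    signal = np.append(signal, tone)

write('generated_tone_sequence.wav', sampling_freq, signal)

# Read the file back and confirm it round-trips
rate, data = read('generated_tone_sequence.wav')
print(rate)   # → 44100
```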
Once you run the final block of code, you should be able to play the generated audio files. This is a rudimentary version of the tone synthesizers used in more advanced audio sampling tools.
In the next part (Part 5), we will explore methods to extract speech features.