Midterm Project
SAVI – The Smart Audio Volume Indicator
The project Evinn Quinn, Stephanie Aaron and I took on for our midterm is a system that tells cell-phone talkers when they’re speaking too loudly for the context they’re in. We call it SAVI, the Smart Audio Volume Indicator.
Problem Realization
In our project development conversations, we came upon the problem of people speaking at contextually inappropriate volume levels in public spaces. Be it the bus, the elevator or the bank line, we could all recall instances of this happening.
In most cases, these people were on cell phones, and while some may simply not have cared who else they might be bothering, we came to believe most were simply unaware of the volume at which they were speaking.
In our anecdotal observations, we’ve seen that this can lead to embarrassing leaks of private information, and uncomfortable social situations for all involved. Potentially, these situations can escalate to confrontation or violence. In short, it seemed an issue worthy of addressing.
Solution Formulation
Homing in on how often manifestations of the problem involved cell phones, we decided to pursue a cell phone-specific solution. Equipped with a microphone and a variety of potential outputs, the phone seemed the perfect platform. As we’re not cell-phone developers, however, we decided to prototype the behavior by building our own device.
We recognized that the basic components of a solution were these: we’d need to measure the volume of the ambient noise in a space, measure the volume of the phone user’s speech, compare these values (with some sort of allowance for speaking over background noise), and provide feedback to the speaker if a certain difference was exceeded.
We considered a variety of possible outputs: a tap on the shoulder, a whisper in the ear that got progressively louder as the transgression persisted, or some form of vibration. To begin, however, we decided to tackle the input and signal comparison problems.
First Iteration
For the first iteration, we decided to table the output discussion and concentrate on dealing with inputs and programmatic comparison of levels. While we were able to find schematics for a microphone and amplifier circuit, and get ahold of the various parts needed to assemble it, we found the range of values we were able to pick up was too small to even be sure the input was working correctly. Not wanting to waste time, we decided to fake the inputs as well, and deal primarily with the programming and behavioral logic of detection and alerts.
Along those lines, we built a prototype consisting of two potentiometers (one to set the “background” volume, and another the volume of the phone user’s voice) and an LED driven with analogWrite() as our output. We knew from the beginning that we’d want the output, whatever it would be, to scale with the intensity or persistence of the loud talking.
We fairly quickly arrived at a simple bit of code to compare the two inputs, with a defined “alert delta” required to trigger an alert. Over this threshold, the LED lit in real time to a brightness determined by the difference between the speech volume and the background volume. Code follows:
// Define fixed items
#define ledPin 9       // LED connected to digital pin 9
#define Mic1 0         // Mic (pot) for background volume connected to analog pin 0
#define Mic2 1         // Mic (pot) for speech volume connected to analog pin 1
#define alertDelta 100 // Minimum difference between background volume and speech volume required to trigger alert

// Initialize variables
int backgroundVolume = 0; // latest background reading (0-1023)
int speechVolume = 0;     // latest speech reading (0-1023)

void setup()
{
  pinMode(ledPin, OUTPUT); // sets the pin as an output
  Serial.begin(9600);
}

void loop()
{
  readVolumes();
  compareVolumes();
}

void readVolumes() {
  backgroundVolume = analogRead(Mic1); // reads input from the "background" mic
  speechVolume = analogRead(Mic2);     // reads input from the "speech" mic
  Serial.println(backgroundVolume, DEC);
  Serial.println(speechVolume, DEC);
}

// compare the volumes, and determine if an alert is required
void compareVolumes() {
  if (speechVolume > (backgroundVolume + alertDelta)) {
    alertRoutine();
  }
  else {
    clearRoutine();
  }
}

void alertRoutine() {
  Serial.println("Alert!");
  // analogRead() spans 0-1023 and analogWrite() spans 0-255, hence the divide by 4
  analogWrite(ledPin, (speechVolume - backgroundVolume) / 4);
}

void clearRoutine() {
  analogWrite(ledPin, 0); // turn the LED off
}
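To make the threshold and scaling arithmetic easier to check away from the board, here is a minimal sketch of the same comparison as a plain C++ function. The name ledLevel is made up for illustration and is not part of our sketch; it only assumes the 0-1023 input range and 0-255 output range used above.

```cpp
// Illustrative helper (not in the original sketch): returns the LED level
// the first-iteration logic would write for a given pair of readings.
int ledLevel(int speechVolume, int backgroundVolume, int alertDelta) {
  // No alert unless speech exceeds background by more than alertDelta.
  if (speechVolume <= backgroundVolume + alertDelta) {
    return 0;
  }
  // Inputs span 0-1023 and the LED output spans 0-255, hence the divide by 4.
  return (speechVolume - backgroundVolume) / 4;
}
```

With an alertDelta of 100, the smallest difference that triggers an alert is 101, so the LED never fades up from zero: it jumps straight from off to a level of 25.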
Response and Feedback
Because we were faking both inputs and outputs, it was impossible to really test the device in use. We faked up a little enclosure, however, and did a man-behind-the-curtain demo for the class, which was enough to see the potential and get some feedback.
On the input side, we clearly needed a working microphone, and we determined that the speech one was the most important. The background mic could be faked a little longer. As for output, a number of suggestions came up, but Russ said something particularly intriguing about how essential it was for the output to map immediately and directly to the speech input itself. As we’d identified vibration as a candidate for that output, it was even more important that we find a way to differentiate that feedback from any of the other ways a phone might vibrate.
With both these things in mind, we moved ahead to the next revision.
Second Iteration
The second iteration began to reveal some of the things you can only tell by building and testing. We managed to wrangle microphone input into a range we were comfortable with, and added a vibrator motor as an output.
The first thing we realized we’d have to adjust was the analog output to the motor. Under a certain level, it simply wasn’t enough to start the vibrator motor moving, so we needed to account for that in the code.
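One way to account for that floor is sketched below in plain C++. This is an assumption-laden illustration, not the code we ran: motorDuty, maxExcess and minDuty are hypothetical names, and the right minDuty depends on the particular motor. (The final demo code further down takes a simpler route and adds a fixed offset of 17.)

```cpp
// Hypothetical helper: scale how far speech is over the threshold into a
// PWM duty that never falls below the level the motor needs to start turning.
int motorDuty(int excess, int maxExcess, int minDuty) {
  if (excess <= 0) {
    return 0;                       // no alert: motor off
  }
  if (excess > maxExcess) {
    excess = maxExcess;             // clamp so the duty never exceeds 255
  }
  // Linear map of (0, maxExcess] onto [minDuty, 255]; long avoids overflow
  // on boards where int is 16 bits.
  return minDuty + (int)((long)excess * (255 - minDuty) / maxExcess);
}
```

Called as motorDuty(average - backgroundVolume, 512, 17), this would keep any alert at or above the starting level while still letting the vibration grow with the loudness.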
The next big discovery was that we simply couldn’t get the microphone input to predictably trigger the alert. In the midst of yelling, it would stop, and it would trigger in the midst of normal speech. We even switched back to the LED as an output to make sure the motor wasn’t part of the problem. We soon realized the cause: speech is never at a continuous volume; the ups and downs are what enable our sounds to be shaped into words. Depending on where we “sampled” the “waveform,” we might hit a peak in regular speech, or a trough in loud speech. Instead of an individual sample, we’d need to look at, and respond to, the envelope of the speech. This meant building an average, and after a little searching we found a jumping-off point for writing code to do just that.
We were able to establish a flexible array, and respond to a rolling average of the speech volume. This smoothed out the bumps, and with some trial and error we were able to get output that wasn’t jumping all over, but was obviously and immediately related to the speech input. It was immediate enough that those who tried it knew inherently what was happening.
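The idea can be sketched off-device in plain C++ as follows. This is a minimal illustration, assuming a 10-bit mic signal that rests around 512 and a five-sample window; RollingAverage, rectify, kNumReadings and kMidpoint are names invented for this sketch. On the Arduino itself, the built-in abs() would stand in for std::abs.

```cpp
#include <cstdlib>  // std::abs

const int kNumReadings = 5;  // window length (assumed; matches our numReadings)
const int kMidpoint = 512;   // assumed resting level of the 10-bit mic signal

// Distance of a raw sample from the resting level, so that peaks and
// troughs of the waveform both count as "loud".
int rectify(int rawSample) {
  return std::abs(rawSample - kMidpoint);
}

// Fixed-size rolling average over the most recent kNumReadings values.
struct RollingAverage {
  int readings[kNumReadings] = {0};
  int next = 0;    // slot that the next value will overwrite
  long total = 0;  // running sum of the window

  int add(int value) {
    total -= readings[next];           // drop the oldest reading
    readings[next] = value;            // store the newest
    total += value;
    next = (next + 1) % kNumReadings;  // wrap around at the end
    return (int)(total / kNumReadings);
  }
};
```

Feeding in loud speech that happens to include a zero crossing (raw samples 912, 112, 512, 912, 112) gives rectified values of 400, 400, 0, 400, 400. The single sample at 512 reads as silence on its own, but the rolling average after all five is 320, which is the behavior we were after. A longer window smooths more but responds more slowly.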
Tweaks and Presentation
The day before our presentation in class, new, better microphones with integrated amplifiers arrived, and we quickly integrated them into the project. This did not resolve a major remaining issue, however: every time an alert was triggered, a feedback loop began that all but ensured the device wouldn’t come back from the alert state.
With Rob’s help, we went through our code, and couldn’t find anything that would cause such an issue. All other possibilities excluded, we came to the conclusion that the motor, which shared a power bus with the rest of the circuit, was causing a voltage drop when activated that was large enough to throw off the readings from the microphone. As a solution, we separated the power supply to the motor, which seemed to solve the problem.
Finally, to improve the artifact for presentation, we moved the components that didn’t absolutely need to be in the handset itself to an outboard breadboard connected with a ribbon “bus”. Additionally, we created a cleaner enclosure for the whole thing.
Everything held together for the in-house demo. The background volume level was still “faked” with a potentiometer, and the motor vibration wasn’t as consistent or smooth as it might have been (or as it was in earlier testing). Still, the concept showed merit and seemed like something that could succeed, if potentially difficult to find a market for, were it developed to its natural conclusion.
The code, as used in the final demo, is below:
// Define fixed items
#define motorPin 9   // Vibrator motor connected to digital pin 9
#define Mic1 0       // Mic (pot) for background volume connected to analog pin 0
#define Mic2 1       // Mic for speech volume connected to analog pin 1
#define alertDelta 5 // Minimum difference between background volume and speech volume required to trigger alert

// Initialize audio variables
int backgroundVolume = 100; // latest background reading
int rawMicInput = 0;        // raw speech mic reading (0-1023, resting near 512)
int speechVolume = 0;       // speech reading as a distance from 512

// Averaging variables
const int numReadings = 5;
int readings[numReadings]; // the readings from the analog input
int readIndex = 0;         // the index of the current reading (renamed from "index", which clashes with a library function on newer Arduino cores)
int total = 0;             // the running total
int average = 0;           // the average

void setup()
{
  pinMode(motorPin, OUTPUT); // sets the pin as an output
  Serial.begin(9600);
  // start with an empty averaging window
  for (int thisReading = 0; thisReading < numReadings; thisReading++) {
    readings[thisReading] = 0;
  }
}

void loop()
{
  readVolumes();
  processSpeechVolume();
  averageRoutine();
  compareVolumes();
}

void readVolumes() {
  backgroundVolume = analogRead(Mic1); // reads input from the "background" mic
  rawMicInput = analogRead(Mic2);      // reads input from the "speech" mic
  // print "background average" on one line; average is from the previous pass through loop()
  Serial.print(backgroundVolume, DEC);
  Serial.print(" ");
  Serial.println(average, DEC);
}

// compare the volumes, and determine if an alert is required
void compareVolumes() {
  if (average > (backgroundVolume + alertDelta)) {
    alertRoutine();
  }
  else {
    clearRoutine();
  }
}

void alertRoutine() {
  Serial.println("Alert!");
  // The +17 offset keeps the output above the level the motor needs to start moving;
  // constrain() keeps the result within analogWrite()'s 0-255 range.
  analogWrite(motorPin, constrain(((average - backgroundVolume) / 2) + 17, 0, 255));
}

void processSpeechVolume() { // Normalize output to a positive 0 - 512 range
  if (rawMicInput >= 512) {
    speechVolume = rawMicInput - 512;
  }
  else {
    speechVolume = 512 - rawMicInput;
  }
}

void clearRoutine() { // clear the signal to the motor
  analogWrite(motorPin, 0);
}

void averageRoutine() { // place speechVolume into the array and average
  // subtract the oldest reading:
  total = total - readings[readIndex];
  readings[readIndex] = speechVolume;
  // add the new reading to the total:
  total = total + readings[readIndex];
  // advance to the next position in the array:
  readIndex = readIndex + 1;
  // if we're at the end of the array...
  if (readIndex >= numReadings) {
    // ...wrap around to the beginning:
    readIndex = 0;
  }
  // calculate the average:
  average = total / numReadings;
}
A quick video of SAVI in action:
SAVI in Use from Jeff Kirsch on Vimeo.
About this entry: “Midterm Project,” an entry on Jeff Kirsch, published 11.18.09 / 12am. Category: Coursework, Fundamentals of Physical Computing.