
Find all datapoints in an audio file where a tone was recorded and snip them out.

Lutz on 17 Mar 2022
Commented: Mathieu NOE on 25 Mar 2022
Hello guys,
I record audio data, and to prepare the raw data for later processing I have to cut a long recording into smaller pieces. So I am trying to find all the "tones" in one recording. In the end I need a tool that gives me the time stamps at which to cut the individual sounds out of the data. For a better understanding, see the screenshot:
Every red dot is a sound. My idea was to find the highest peak in the data and cut it out with a little buffer time before and after.
At this point my problem is finding these peaks.
So far I have tried it with a threshold:
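th = mean(y2) + 3*std(y2);        % threshold: mean plus three standard deviations
ind = find(y2 > th);              % indices of all samples above the threshold
y2Snip = {};
indDiff = find(diff(ind) > 5);    % a gap of more than 5 samples starts a new snippet
for i = 1:length(indDiff)+1
    if i == 1
        y2Snip{i} = y2(ind(1)-1:ind(indDiff(1))+1);
    elseif i == length(indDiff)+1
        y2Snip{i} = y2(ind(indDiff(i-1)+1)-1:ind(end)+1);
    else
        y2Snip{i} = y2(ind(indDiff(i-1)+1)-1:ind(indDiff(i))+1);
    end
end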
This works, but it finds many more data points than I need, and I can't cut "one tone" out because it finds a lot of separate data points within that one tone.
Thanks for your help.
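A possible reason for the many detections inside one tone, sketched under assumptions: a tone oscillates, so its samples fall below any positive threshold once per cycle, and a per-sample test splits one tone into many short runs. The example below uses a synthetic 200 Hz burst as a stand-in for y2 (the signal, sample rate, threshold, and window length are all made up for illustration) and compares the per-sample test with a test on a moving-maximum envelope computed with movmax.

```matlab
% Synthetic stand-in for y2 (assumption): silence, one 0.2 s tone, silence
Fs = 8000;                                  % assumed sample rate in Hz
t  = (0:Fs-1)/Fs;
y2 = zeros(1, Fs);
burst = t >= 0.4 & t < 0.6;
y2(burst) = sin(2*pi*200*t(burst));         % 200 Hz sine

th = 0.5;                                   % fixed threshold for this demo

% Per-sample test: the sine drops below th in every cycle -> many short runs
raw  = y2 > th;
nRaw = sum(diff([0 raw 0]) == 1);           % number of separate runs

% Envelope test: moving maximum of |y2| over about two periods bridges the gaps
win    = round(2*Fs/200);                   % hypothetical window length in samples
env    = movmax(abs(y2), win);
smooth = env > th;
nEnv   = sum(diff([0 smooth 0]) == 1);

fprintf('raw runs: %d, envelope runs: %d\n', nRaw, nEnv);
```

With the envelope, the whole burst is reported as a single run, which is the behaviour wanted for cutting out "one tone". The window length would have to be tuned to the lowest tone frequency in the real recording.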


Mathieu NOE on 18 Mar 2022
This would be my suggestion.
The individual segments are selected based on the running rms value of the signal and on their length. You can choose to store only the segments that are above a given threshold (here 10% of the max rms value) and longer than min_contiguous_samples samples.
The results (time, signal) are stored in a cell array (data_store).
Hope it helps!
% load audio data
[data, Fs] = audioread('test_voice_mono.wav');
samples = length(data);
dt = 1/Fs;
time =(0:samples-1)*dt;
%% parameters
min_contiguous_samples = 1000; % store "red" segments only if they are at least this length
threshold = 0.1; % 1 = max (100%) of rms value
samples_befor = 100; % add some samples before the "start" point detected by code below
samples_after = 100; % add some samples after the "end" point detected by code below
%% main loop %%%%
% running rms (buffered) parameters :
rms_buffer = 500; % nb of samples in one buffer (buffer size)
shift = 1; % = rms_buffer-overlap; % nb of samples between 2 contiguous buffers
for ci=1:fix((samples-rms_buffer)/shift +1)
    start_index = 1+(ci-1)*shift;
    stop_index = min(start_index+ rms_buffer-1,samples);
    time_index(ci) = round((start_index+stop_index)/2); % time index expressed as sample unit (dt = 1 in this simulation)
    rms_data(ci) = my_rms(data(start_index:stop_index)); % rms of the current buffer
end
time_rms = time(time_index);
ind = (rms_data>threshold*max(rms_data));
% now define start and end points of "red" segments
[begin,ends] = find_start_end_group(ind);
length_ind = ends - begin;
ind2= length_ind>min_contiguous_samples; % check if their length is valid (above min_contiguous_samples value)
begin = begin(ind2); % selected points
ends = ends(ind2); % selected points
% define for plot the red rms data
time2 = time_rms(ind);
rms_data2 = rms_data(ind);
% define the begin / ending x, y values of raw data
time2_begin = time_rms(begin);
data_begin = interp1(time,data,time2_begin);
time2_ends = time_rms(ends);
data_ends = interp1(time,data,time2_ends);
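% plot the signal, the rms values above threshold and the detected begin / end points
figure(1)
plot(time,data,'k',time2,rms_data2,'r.',time2_begin,data_begin,'*c',time2_ends,data_ends,'*m');
legend('signal',['rms above ' num2str(threshold*100) ' %'] ,'begin points','end points');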
% store each "red" segment separately (in cell array)
figure(2), hold on
for ci = 1:length(begin)
    ind = (time>=time2_begin(ci)-samples_befor*dt & time<=time2_ends(ci)+samples_after*dt);
    xx = time(ind);
    yy = data(ind);
    data_store{ci} = [xx(:) yy(:)]; % 2 columns : time / data
    plot(xx,yy); % show the stored segment
end
hold off
%%%% end of main file %%%%%%%%%%%
function [begin,ends] = find_start_end_group(ind)
% This locates the beginning / ending points of data groups
% (ind must be a row vector of logical values)
D = diff([0,ind,0]);
begin = find(D == 1);
ends = find(D == -1) - 1;
end

function x_rms = my_rms(x)
% root mean square value of x
x_rms = sqrt(mean(x.^2));
end
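Since the question asks for time stamps, here is a minimal sketch of how they could be read off directly. It is not the code above but a vectorized variant: the running rms is computed with movmean instead of the buffer loop, and the wav file is replaced by a synthetic signal with two tone bursts (the signal, its sample rate, and the name stamps are assumptions made for illustration). The selection rule is the same: 10 % of the max rms value and a minimum length of 1000 samples.

```matlab
% Synthetic test signal (assumption): two 440 Hz bursts in low-level noise
Fs   = 8000;
time = (0:2*Fs-1)/Fs;
data = 0.001*randn(size(time));
for t0 = [0.3 1.2]                              % burst start times in seconds
    k = time >= t0 & time < t0 + 0.25;
    data(k) = data(k) + sin(2*pi*440*time(k));
end

% Running rms over a 500-sample window, vectorized with movmean
rms_buffer = 500;
rms_data   = sqrt(movmean(data.^2, rms_buffer));

% Keep groups above 10 % of the max rms value that are longer than 1000 samples
ind   = rms_data > 0.1*max(rms_data);
D     = diff([0, ind, 0]);
begin = find(D == 1);                           % first index of each group
ends  = find(D == -1) - 1;                      % last index of each group
keep  = (ends - begin) > 1000;
begin = begin(keep);
ends  = ends(keep);

% Start / stop time stamps in seconds, one row per detected tone
stamps = [time(begin).' time(ends).'];
disp(stamps)
```

Each row of stamps marks one segment. Because the rms window smears the edges, the detected start lies slightly before the true onset, which acts as a small built-in buffer. The segments could then be cut out with data(begin(i):ends(i)) and, if needed, saved with audiowrite.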
Lutz on 25 Mar 2022
I like your method, but it produces an error because of the missing variable "my_rms".
Is this a bug in the code, or am I doing something wrong?
Mathieu NOE on 25 Mar 2022
As you can see, this is part of the code (the functions are at the end of the code).
My code works on my release, R2020b. What are you running?
function x_rms = my_rms(x)
x_rms = sqrt(mean(x.^2));
end
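One possible cause, stated as an assumption because the release is not given in the thread: local functions at the end of a script file are only supported from R2016b on, and the function definitions must all come after the last line of script code. If either condition is not met, a workaround for my_rms is an anonymous function defined near the top of the script, as in this small sketch (the test vector x is made up):

```matlab
% Anonymous-function replacement for the local function my_rms
my_rms = @(x) sqrt(mean(x.^2));

x = [3 4];                                  % made-up test vector
fprintf('rms = %.4f\n', my_rms(x));         % sqrt((9 + 16)/2)
```

find_start_end_group returns two outputs, so it cannot be replaced the same way. In that case it would have to go into its own file, find_start_end_group.m, on the MATLAB path.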


