MATLAB Answers

Why is my RGB image different from, and less defined than, the pcolor or imagesc output?

Claudio Eutizi on 11 Jan 2021
Commented: Claudio Eutizi on 13 Jan 2021
I want to automatically extract 2-D plots of VGGish features from an audio dataset and then save them in a folder.
Here is the code I wrote to test the output image:
datafolder = "F:\UrbanSound8K\structure1";
ads = audioDatastore(datafolder, ...
'IncludeSubfolders',true, ...
'FileExtensions','.wav', ...
'LabelSource','foldernames');
fs0 = 16e3;
[audioIn, fs] = audioread(ads.Files{5});
overlapPercentage = 84;
features = vggishFeatures(audioIn,fs,'OverlapPercentage',overlapPercentage);
features = features.';
im = ind2rgb(im2uint8(rescale(features)),colormap);
im = imresize(im,[224 224]);
imgLoc = fullfile(datafolder_vggishFeatures,char(ads.Labels(i)));
imFileName = strcat(char(ads.Labels(i)),'_',num2str(i),'.jpg');
imwrite(im,fullfile(imgLoc,imFileName));
The output is flipped along the y-axis and its contours are poorly defined, unlike the output of imagesc(features) and pcolor(features).
How can I solve this problem? Thank you.
Attached are the output of the code above and the result of a pcolor command.


Accepted Answer

Image Analyst on 11 Jan 2021
Edited: Image Analyst on 11 Jan 2021
The images do not have the same resolution. One has 224 rows, while the other is smaller, with around 125 rows. The smaller one looks as if it was subsampled by taking the nearest row, so it is not as smooth. Attach the data (image #5, I guess, from your code) if you want further help.
By the way, pcolor normally does not show the last row and column, as you can easily verify by displaying a small matrix, for example pcolor(magic(3)). So I think you must have done the resampling.
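The missing last row and column can be seen in a small side-by-side sketch. It assumes flat shading; the padded matrix `Apad` is a hypothetical name, and the NaN padding is a common workaround rather than something taken from the code in this thread.

```matlab
A = magic(3);                              % 3-by-3 test matrix

figure;
subplot(1, 3, 1);
imagesc(A); axis xy; axis image;
title('imagesc: 3x3 cells');               % one cell per matrix element

subplot(1, 3, 2);
pcolor(A); shading flat; axis image;
title('pcolor: 2x2 faces');                % values are vertices, so one row/column of faces is lost

% Pad with a NaN column and a NaN row so that every original value
% becomes the lower-index corner of a drawn face.
Apad = [A, nan(size(A, 1), 1); nan(1, size(A, 2) + 1)];
subplot(1, 3, 3);
pcolor(Apad); shading flat; axis image;
title('padded pcolor: 3x3 faces');
```

With the padding, pcolor draws as many faces as the original matrix has elements, which makes it directly comparable with imagesc.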
You can switch the y-axis direction with:
axis ij % Put origin at top, like for images and matrices.
axis xy % Put origin at bottom like for x-y graphs.
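The axis commands only change how a figure is displayed; a file written with imwrite always stores row 1 at the top. The following is a minimal sketch of one way to write a 224x224 file that resembles the on-screen plot. It rests on assumptions that are not confirmed in this thread: that the softer contours come from the default bicubic interpolation of imresize and from JPEG compression, and that the desired orientation is the axis xy one. The synthetic `features` matrix stands in for the transposed vggishFeatures output, and the file name `features_224.png` is a placeholder.

```matlab
% Stand-in for the transposed VGGish feature matrix (128 rows, one column per frame).
% ASSUMPTION: real data would come from vggishFeatures(...).' instead.
features = peaks(128);
features = features(:, 1:40);                % hypothetical 128-by-40 example

cmap = parula(256);                          % explicit colormap, independent of any open figure
idx  = im2uint8(rescale(features));          % scale to [0,1], then to uint8 indices 0..255
idx  = flipud(idx);                          % put row 1 at the bottom, as axis xy does on screen
idx  = imresize(idx, [224 224], 'nearest');  % replicate cells instead of interpolating between them
im   = ind2rgb(idx, cmap);                   % 224-by-224-by-3 double image in [0,1]

% PNG is lossless, so cell edges are not smeared by compression.
outFile = fullfile(tempdir, 'features_224.png');   % placeholder output location
imwrite(im, outFile);
```

Resizing the index image before applying the colormap keeps every output pixel on an exact colormap entry. If a smoother look is preferred, replacing 'nearest' with 'bilinear' or 'bicubic' is an option, at the cost of softer cell boundaries.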


More Answers (2)

Claudio Eutizi on 11 Jan 2021
Thank you for your answer.
How can I attach an audio file? ads.Files{5} is an audio file.
However, I need 224x224 images because of the input layer of the neural network I'm using for deep learning, and I would like to obtain images with the same resolution and smoothness as the pcolor output.

3 Comments

Image Analyst on 11 Jan 2021
You can zip them up. Make sure the number of elements of your audio file is a multiple of 224 (or 227 for AlexNet). But I'm not sure what image you're displaying with imagesc and what image you're displaying with pcolor (or why you're even using pcolor at all - I never do).
Claudio Eutizi on 11 Jan 2021
Here's the audio file in the zip.
Here are both the imagesc (right) and pcolor (left) outputs. They're the same.
How can I obtain 224x224 png or jpg files with this resolution?
figure;
subplot(1,2,1); imagesc(features);
axis ij
axis xy
subplot(1,2,2); pcolor(features); shading flat;
Image Analyst on 11 Jan 2021
Try this:
clc; % Clear the command window.
close all; % Close all figures (except those of imtool.)
clear; % Erase all existing variables. Or clearvars if you want.
workspace; % Make sure the workspace panel is showing.
format long g;
format compact;
fontSize = 14;
files = dir('100*.wav')
fullFileName = fullfile(pwd, '100852-0-0-12.wav');
[y, fs] = audioread(fullFileName);
subplot(3, 1, 1);
plot(y, 'b-', 'LineWidth', 2);
% soundsc(y, fs);
grid on;
xlabel('Time', 'FontSize', fontSize);
ylabel('Signal Amplitude', 'FontSize', fontSize);
numOriginalSamples = size(y, 1)
numMultiplesOf224 = numOriginalSamples / 224 % Hopefully it's an integer.
% Make output vector
vec = zeros(1, ceil(numMultiplesOf224) * 224); % Initialize to a multiple of 224
% Insert sound signal
vec(1:numOriginalSamples) = y;
% Reshape to an image.
image224 = reshape(vec, 224, []);
subplot(3, 1, 2);
imshow(image224, 'ColorMap', jet(256));
axis('on', 'image');
caption = sprintf('Original image is %d rows by %d columns', size(image224, 1), size(image224, 2));
title(caption, 'FontSize', fontSize);
% Resize to 224 x 224
numNewSamples = 224 * 224
xq = linspace(1, numOriginalSamples, numNewSamples);
vec = interp1(1:numOriginalSamples, y, xq, 'spline');
% Reshape to an image.
image224 = reshape(vec, 224, 224);
subplot(3, 1, 3);
imshow(image224, 'ColorMap', jet(256));
axis('on', 'image');
caption = sprintf('Resized image is %d rows by %d columns', size(image224, 1), size(image224, 2));
title(caption, 'FontSize', fontSize);

Claudio Eutizi on 12 Jan 2021
It works with the audio file.
But I need this kind of reshaping for VGGish feature images like the ones I attached earlier, not for audio files.
How can I do it?
Thank you so much for the help.

2 Comments

Image Analyst on 12 Jan 2021
Tell me what folder structure and files I need in order to run your code. Also, I don't seem to have the vggishFeatures() function. Is it in a special toolbox that I don't have? What does this say:
which -all vggishFeatures
Claudio Eutizi on 13 Jan 2021
C:\Program Files\MATLAB\R2020b\toolbox\audio\audio\vggishFeatures.m
It's a function that's in the Audio Toolbox.
