Error while using readPDFFormData to extract data from online pdf files

1 回表示 (過去 30 日間)
Ana Egatz-Gomez
Ana Egatz-Gomez 2023 年 12 月 21 日
コメント済み: Ana Egatz-Gomez 2024 年 3 月 6 日
Hi, I have two related questions.
First, when I use the command readPDFFormData with an online pdf, I get an error message. How should I write the URL so it works?
Any help will be greatly appreciated.
filename = "https://www.ncdoi.com/SHIIPCurrentYear/Documents/MAL%20by%20County/2024%20MAPD%20Pender%20County.pdf";
data = readPDFFormData(filename);
Error using readPDFFormData
Unable to open file 'https://www.ncdoi.com/SHIIPCurrentYear/Documents/MAL%20by%20County/2024%20MAPD%20Pender%20County.pdf' for reading: Filename contains an invalid URL scheme..

採用された回答

Anton Kogios
Anton Kogios 2023 年 12 月 22 日
I was not able to get the URL to work directly either (not sure why since I'm pretty sure reading online images such as PNG/JPG works...), but here is a workaround (it just downloads the PDF first):
filenameOnline = "https://www.ncdoi.com/SHIIPCurrentYear/Documents/MAL%20by%20County/2024%20MAPD%20Pender%20County.pdf";
filenameLocal = "test.pdf"; % can set to custom directory
websave(filenameLocal,filenameOnline);
formData = readPDFFormData(filenameLocal)
formData = struct with no fields.
fileText = extractFileText(filenameLocal); % since readPDFFormData returns an empty struct, this is just to make sure we can read the PDF
As for your second question, I can't seem to access the PDFs on the website you mentioned (I think it is because I'm in a different country), but you should be able to just use a for loop. You can also look into using sprintf if the URLs have a repetitive naming system, which I've also demonstrated. Something like:
filenamesOnline = ["url1.pdf";
"url2.pdf";
"url3.pdf"];
for i = 1:length(filenamesOnline)
filenameLocal = sprintf('test%i.pdf',i);
websave(filenameLocal,filenameOnline(i));
formData = readPDFFormData(filenameLocal)
end
I hope this helps and you are able to get it to work for you!

その他の回答 (0 件)

カテゴリ

Help Center および File ExchangeStartup and Shutdown についてさらに検索

製品


リリース

R2023b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by