Extracting data from PDF files
joseph Frank
19 April 2014
Answered: Christopher Creutzig
27 April 2021
Hi,
I have around 300 PDF files with 19 pages each. I want to extract part of a table on page 4 of each file in order to build a research data set. Is it possible to do this using MATLAB? If so, which toolboxes and functions do I need? I have MATLAB R2013a.
Accepted Answer
Kristian Gennaci
21 April 2014
Hi Joseph,
Have you tried using this File Exchange submission?
This seems like the most promising solution. Alternatively, if you can convert the tables to an Excel spreadsheet or a CSV file, they can then be parsed easily using MATLAB's Excel/CSV functions.
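As a rough illustration of the CSV route, here is a minimal sketch that assumes each PDF table has already been exported (outside MATLAB) to its own CSV file with one header row and purely numeric cells. The function name `readCsvTables` and the folder layout are hypothetical. It uses `csvread`, which is available in R2013a but only handles numeric data.

```matlab
function data = readCsvTables(csvFolder)
% readCsvTables  Hypothetical helper: read every CSV file in a folder.
% Assumes each file has exactly one header row followed by numeric cells.
    files = dir(fullfile(csvFolder, '*.csv'));
    data = cell(numel(files), 1);          % one numeric matrix per file
    for k = 1:numel(files)
        csvFile = fullfile(csvFolder, files(k).name);
        % Row offset 1 skips the header row; column offset 0 keeps all columns.
        data{k} = csvread(csvFile, 1, 0);
    end
end
```

Calling `data = readCsvTables('tables_csv')` would return one matrix per exported file. If the tables mix text and numbers, `xlsread` (or `readtable` in newer releases) is likely a better fit.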
I'll let you know if I find any other solutions.
Best,
Kristian
More Answers (1)
Christopher Creutzig
27 April 2021
JFTR, since R2017b, extractFileText('filename.pdf','Pages',4) from Text Analytics Toolbox gives you the text on ("physical") page 4 of the PDF, from which you can then extract the parts you need with string operations (extractBetween, regexp, etc.).
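To make that concrete, here is a minimal sketch of the approach applied to a whole folder of PDFs. It assumes R2017b or later with Text Analytics Toolbox. The folder name `reports`, the helper `parseTableRows`, and the row layout (a text label followed by two numbers, such as `Revenue 1,234.5 987.0`) are all made up for illustration, so the pattern will need adapting to the real table.

```matlab
% Sketch: collect table rows from page 4 of every PDF in a folder.
pdfFolder = 'reports';                      % hypothetical folder of PDFs
files = dir(fullfile(pdfFolder, '*.pdf'));
allRows = table();
for k = 1:numel(files)
    pdfFile = fullfile(pdfFolder, files(k).name);
    pageText = extractFileText(pdfFile, 'Pages', 4);   % text of physical page 4
    rows = parseTableRows(pageText);
    rows.File = repmat(string(files(k).name), height(rows), 1);
    allRows = [allRows; rows];              %#ok<AGROW> fine for a few hundred files
end

function T = parseTableRows(pageText)
% Hypothetical helper: keep lines shaped like "<label> <number> <number>".
    lines = strtrim(cellstr(splitlines(string(pageText))));
    expr = '^(\S.*?)\s+(-?[0-9.,]+)\s+(-?[0-9.,]+)$';
    tok = regexp(lines, expr, 'tokens', 'once');
    tok = tok(~cellfun(@isempty, tok));     % drop lines that did not match
    names = {'Label', 'Value1', 'Value2'};
    if isempty(tok)
        T = table(strings(0,1), zeros(0,1), zeros(0,1), 'VariableNames', names);
        return
    end
    tok = vertcat(tok{:});                  % N-by-3 cell array of char vectors
    label = string(tok(:,1));
    % Remove thousands separators before converting to double.
    v1 = str2double(strrep(tok(:,2), ',', ''));
    v2 = str2double(strrep(tok(:,3), ',', ''));
    T = table(label, v1, v2, 'VariableNames', names);
end
```

Keeping the parsing in its own function means the pattern can be tried on a pasted sample of page text before running it over all 300 files. Note that a local function at the end of a script requires R2016b or later.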