Convert a table in a pdf to a MATLAB cell structure
I have a PDF file that contains an Nx9 table of data that I need to turn into a MATLAB cell structure or an Excel file. Some of the (row,column) entries are blank.
So far, I have tried reading the pdf using:
txt = extractFileText('filename.pdf');
This produces a 1x1 string in which runs of spaces break up the rows in a seemingly random order. The (row,column) entries do not appear in a logical position in txt. Is there another command that can read a PDF table?
4 Comments
dpb
January 30, 2021
Which is, it seems, what the scraping utilities do...get the boundaries of the table as rendered and then suck that area up.
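The boundary idea described above can be sketched in MATLAB. This is only an illustration under stated assumptions: the sample lines, the variable names, and the column boundaries `colStart`/`colEnd` are hypothetical, and the sketch assumes the extracted text keeps the table's rendered column alignment, which the output of extractFileText may not do. When that assumption holds, slicing each line by character position keeps blank entries in the correct column.

```matlab
% Hypothetical sample: text lines that keep the table's column alignment.
% In practice these would come from a layout-preserving extraction step.
lines = [ ...
    "ID   Name    Value"; ...
    "1    alpha   3.5"; ...
    "2            7.1"];

% Assumed column boundaries (first and last character of each column),
% chosen by inspecting the header line; they are not detected automatically.
colStart = [1 6 14];
colEnd   = [5 13 18];

nRows = numel(lines);
nCols = numel(colStart);
C = cell(nRows, nCols);

for r = 1:nRows
    % Pad short lines with spaces so every column range can be indexed.
    padded = pad(lines(r), colEnd(end));
    for c = 1:nCols
        field = strtrim(extractBetween(padded, colStart(c), colEnd(c)));
        C{r, c} = char(field);   % a blank table entry becomes an empty char
    end
end

disp(C)
```

If the resulting cell array looks right, it could then be written to a spreadsheet with writecell(C, 'table.xlsx'), where the file name is again only a placeholder.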
Sim
March 12, 2023
The following function is not really helpful when a PDF contains tables with blank cells:
txt = extractFileText('filename.pdf');
Has a new tool been created in the meantime, i.e. between January 2021 and today, mid-March 2023?
Answers (2)
the cyclist
March 12, 2023
2 Comments
the cyclist
March 13, 2023
I haven't used it for data that I would have privacy concerns about, but I think there are strong reasons to believe it is safe:
- It's open-source, so you can see all the code on GitHub
- It doesn't seem to send your data anywhere else. Although it might seem like it is sending your data to a web site, it looks to me like it only opens a local browser window.
- It was first built by journalists, who tend to care about privacy (at least of their own data!)