Conversion of PDF data to Spreadsheet

Shaili Bulusu on 24 May 2017
Commented: Guillaume on 25 May 2017
Hi, I have several reports in PDF format. I would like to write an m-script to capture the data into a spreadsheet. I thought the best method would be to add all the headers to an array, capture each page's data in the PDF to a different sheet in Excel, and then populate the fields with the values corresponding to the headers. Is there a better way to achieve this?

Answers (1)

Guillaume on 24 May 2017
Well, your first hurdle will be to capture the data from the PDF. There is no built-in tool for this in MATLAB, and depending on the structure of the PDF this will be either a fair amount of work (the data is actually stored as continuous text in the file) or extremely hard (the data is stored as text but scattered through the file, or the data is just an image of the text, which will require OCR).
PDF is not really designed to transfer structured data to a computer. It's mostly meant to be read by a human.
2 Comments
Guillaume on 25 May 2017
Shaili Bulusu's comment posted as an answer moved here:
I understand the difficulties, but I have a script that will read the data from the PDF for me. My question is about the approach: should I sort the headers into an array, or is there a better way to capture the data into a spreadsheet?
Guillaume on 25 May 2017
More details on what "the approach of sorting the headers as an array" means would be required to answer your question. What form do the inputs come in, and what form of output do you want?
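
Without knowing the actual input format, one possible shape of the "one sheet per page" idea from the question can be sketched as below. This is only a sketch under stated assumptions: the `pages` struct array, its `headers` and `values` fields, the sample numbers, and the file name `report.xlsx` are all hypothetical stand-ins for whatever the existing PDF-reading script returns. It builds one table per page with `cell2table` and writes each to its own worksheet with `writetable`.

```matlab
% Hypothetical stand-in for the output of the PDF-reading script:
% one struct per page, holding the header names and the matching values.
pages(1).headers = {'Name', 'Value', 'Unit'};
pages(1).values  = {'Pressure', 101.3, 'kPa'; 'Temperature', 25, 'C'};
pages(2).headers = {'Name', 'Value', 'Unit'};
pages(2).values  = {'Flow', 3.2, 'L/min'; 'Speed', 1500, 'rpm'};

outFile = 'report.xlsx';   % assumed output file name

for k = 1:numel(pages)
    % Headers become table variable names, so make them valid identifiers.
    names = matlab.lang.makeValidName(pages(k).headers);
    % One table per page: each column of the cell array becomes a variable.
    T = cell2table(pages(k).values, 'VariableNames', names);
    % Write each page to its own worksheet (Page1, Page2, ...).
    writetable(T, outFile, 'Sheet', sprintf('Page%d', k));
end
```

If every page shares the same headers, a single table with an extra page-number column may be easier to work with than many sheets, but that depends on the output form you want.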
