MATLAB Answers

How to read table from pdf

Rizwan Khan on 22 Nov 2020
Commented: dpb on 27 Nov 2020
I have a PDF that contains text within a table.
I am able to read the text into a variable, but then I get a single string with all the text in it.
I use extractFileText to read it into a string.
How can I then turn this text into a table?
I've pasted a sample of the string I read in below. It has no table column names, just the actual data.
What I want to do is ignore the first two rows below; after them you can see the records (one per line).
Each line needs to be a row in the table, and the delimiter between column values is the three arrows (which I think are newlines).
Weekly Gazettes 1 ↵↵↵
NEW SOUTH WALES WEEKLY ISSUE ↵ ↵↵↵
3 RIVERS ESTATE, 140 001 976 ↵↵↵374 KALKITE RD KALKITE NSW 2627 ↵↵↵Creditor: CONSULT SURVEY GRA PTY LTD ↵↵↵DEFAULT JUDGEMENT (NSW) 02/11/2020 ↵↵↵00262008/20/163, $113,237.00 ↵↵↵
ABCD PROJECTS, 618 354 331 ↵↵↵8 17 GARTMORE AVE BANKSTOWN NSW 2200 ↵↵↵Creditor: WORKERS COMPENSATION NOMINAL I ↵↵↵DEFAULT JUDGEMENT (NSW) 03/11/2020 ↵↵↵00063818/20/METN, $2,553.00 ↵↵↵
ABOUT CONCRETE CONSTRUCTIONS, 156 080 241 ↵↵↵46 NEW HORIZON AVE BAHRS SCRUB QLD 4207 ↵↵↵Creditor: HUSQVARNA AUSTRALIA PTY LTD ↵↵↵DEFAULT JUDGEMENT (NSW) 03/11/2020 ↵↵↵00223837/20/3, $1,298.00 ↵↵↵
AC SHOPFITTING SPECIALIST, 635 292 376 ↵↵↵12 CURTIN ST CABRAMATTA NSW 2166 ↵↵↵Creditor: WORKERS COMPENSATION NOMINAL I ↵↵↵DEFAULT JUDGEMENT (NSW) 06/11/2020 ↵↵↵00266709/20/METN, $5,191.00 ↵↵↵
ACN 607735080, 607 735 080 ↵↵↵14 BARNES ST WOOLGOOLGA NSW 2456 ↵↵↵Creditor: BIDFOOD AUSTRALIA LTD ↵↵↵DEFAULT JUDGEMENT (NSW) 02/11/2020 ↵↵↵00271889/20/METN, $9,891.00 ↵↵↵
6 Comments
Rizwan Khan on 25 Nov 2020
Dear Sir,
If we look at the text I pasted from the variable,
each of those arrows represents a new variable.
How can I loop through them using that arrow (left arrow) as a delimiter?
Each record is complete after the currency amount.
So the problem is no longer how to read the PDF, which I am already doing. The problem now is how to loop through that str, which holds all the PDF content.


Answers (1)

Mathieu NOE on 23 Nov 2020
Hello,
I don't know where the function extractFileText comes from.
So I did it my way: I converted the PDF to an Excel file (with an online converter), and then it was very easy:
T = readtable('weekly-gazettes-12-11-20-converti.xlsx');
C = table2cell(T)
C =
133×2 cell array
{'ABCD PROJECTS, 618 …'} {'DEFAULT JUDGEMENT (…'}
{'ABOUT CONCRETE CONS…'} {'DEFAULT JUDGEMENT (…'}
{'AC SHOPFITTING SPEC…'} {'DEFAULT JUDGEMENT (…'}
{'ACN 607735080, 607 …'} {'DEFAULT JUDGEMENT (…'}
{'ACP ACCOUNTANTS & C…'} {'DEFAULT JUDGEMENT (…'}
etc......
9 Comments
dpb on 27 Nov 2020
Sure. See split
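
To expand on the split suggestion, here is a minimal sketch, not a definitive solution. It assumes the arrows in the question are newline characters, that the text starts with two header lines, and that every record has exactly five fields. The sample string is rebuilt from the first two records in the question and stands in for the output of extractFileText; the column names are hypothetical, since the PDF has none.

```matlab
% Sample text standing in for str = extractFileText(...).
% Assumption: each arrow in the question is a newline character.
str = string(sprintf([ ...
    'Weekly Gazettes 1 \n\n\n' ...
    'NEW SOUTH WALES WEEKLY ISSUE \n \n\n\n' ...
    '3 RIVERS ESTATE, 140 001 976 \n\n\n' ...
    '374 KALKITE RD KALKITE NSW 2627 \n\n\n' ...
    'Creditor: CONSULT SURVEY GRA PTY LTD \n\n\n' ...
    'DEFAULT JUDGEMENT (NSW) 02/11/2020 \n\n\n' ...
    '00262008/20/163, $113,237.00 \n\n\n' ...
    'ABCD PROJECTS, 618 354 331 \n\n\n' ...
    '8 17 GARTMORE AVE BANKSTOWN NSW 2200 \n\n\n' ...
    'Creditor: WORKERS COMPENSATION NOMINAL I \n\n\n' ...
    'DEFAULT JUDGEMENT (NSW) 03/11/2020 \n\n\n' ...
    '00063818/20/METN, $2,553.00 \n\n\n']));

% Split at every newline, trim surrounding whitespace, drop empty pieces.
parts = strtrim(split(str, newline));
parts = parts(parts ~= "");

% Skip the two header lines (assumption taken from the question).
parts = parts(3:end);

% Group the remaining pieces into records of five fields each.
nFields = 5;
assert(mod(numel(parts), nFields) == 0, 'Field count is not a multiple of 5.');
recs = reshape(parts, nFields, []).';   % one row per record

% Hypothetical column names; the source PDF does not provide any.
T = array2table(recs, 'VariableNames', ...
    {'NameAndACN', 'Address', 'Creditor', 'Judgement', 'CaseAndAmount'});

% Optional: pull the dollar amount out of the last field as a number.
tok = split(T.CaseAndAmount, ", ");     % N-by-2: case number, amount
T.Amount = str2double(erase(tok(:, 2), ["$", ","]));
```

No explicit loop is needed: split returns all fields at once and reshape groups them into rows. If some records have a different number of fields, the assert fails and the grouping needs another rule, for example treating the line with the dollar amount as the end of a record.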



Release

R2020a
