How to import PDF into Excel?

**joeu2004** · June 12th, 2009, 12:22 AM posted to microsoft.public.excel.worksheet.functions

I want to import some data from PDF files into Excel. Is there a
straightforward way to do this?

For example, see http://muddybuddy.com/pdf/sanjose/results-09.pdf.

What I have done in the past is: open the PDF file, save as text, and write
a VBA macro to read the text file and parse the data line-by-line, putting
it into a worksheet in the form that I require.
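
For anyone following along, the read-and-parse step can be sketched roughly as below. This is Python rather than the VBA macro mentioned above, and it rests on an assumption that is not confirmed by the file: that each record in the saved text starts at a line containing "Bib:". The function names `split_records` and `write_csv` are made up for illustration. The result is written as CSV, which Excel can open directly.

```python
import csv
import io


def split_records(lines, marker="Bib:"):
    """Group non-blank lines into records.

    Assumption: a new record starts at every line containing `marker`.
    """
    records, current = [], []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines left over from the text export
        if marker in line and current:
            records.append(current)
            current = []
        current.append(line)
    if current:
        records.append(current)
    return records


def write_csv(records, out):
    """Write one record per CSV row to the file-like object `out`."""
    writer = csv.writer(out)
    for record in records:
        writer.writerow(record)


if __name__ == "__main__":
    # Made-up sample lines, not taken from the actual results file.
    sample = ["Bib: 349", "Team A", "", "Bib: 299", "Team B", "1:02:03"]
    buffer = io.StringIO()
    write_csv(split_records(sample), buffer)
    print(buffer.getvalue())
```

With a real export, `lines` would come from `open("results.txt", encoding="utf-8")`, where the file name is again only a placeholder.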

The issue is: the data in this particular file does not follow a consistent
pattern when it is saved to text. For example, compare the data for "Bib:"
numbers 349, 299, 479 and 1084.

(Aside: Can anyone explain why? The data appears consistently in the PDF
file.)

The issue is not insurmountable. I can recognize and deal with the
different patterns in my parser.

The problem is: I don't know (yet) how many different patterns are
possible. I have found 4 so far. But I would have to look carefully at all
1032 entries to determine if there are other forms.

(Actually, I would simply parse what I know and see what is missing, then
add a parser for the missing pattern. But that's tedious.)
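
The "parse what I know and see what is missing" approach can be made less tedious by letting the parser report every line that no known pattern matches. A minimal sketch in Python (not VBA) follows. The three regular expressions are purely hypothetical stand-ins; the real four patterns would have to be written from the actual text export. The names `PATTERNS` and `classify` are also invented for this example.

```python
import re
from collections import Counter

# Hypothetical layouts only; replace with patterns seen in the real text file.
PATTERNS = {
    "bib_then_name": re.compile(r"^Bib:\s*(?P<bib>\d+)\s+(?P<name>\D.*)$"),
    "name_then_bib": re.compile(r"^(?P<name>\D.*?)\s+Bib:\s*(?P<bib>\d+)$"),
    "bib_only": re.compile(r"^Bib:\s*(?P<bib>\d+)$"),
}


def classify(lines):
    """Try each known pattern on each line.

    Returns (parsed, unmatched, counts): parsed fields per matched line,
    the (line number, text) of every line no pattern matched, and how
    often each pattern was used.
    """
    parsed, unmatched, counts = [], [], Counter()
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue
        for label, pattern in PATTERNS.items():
            match = pattern.match(line)
            if match:
                parsed.append((label, match.groupdict()))
                counts[label] += 1
                break
        else:
            # No known pattern matched: a candidate for a new parser case.
            unmatched.append((number, line))
    return parsed, unmatched, counts
```

Running this once over the whole file and looking only at `unmatched` shows which forms are still missing, without reading all 1032 entries by eye.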

**joeu2004** · June 12th, 2009, 12:29 AM posted to microsoft.public.excel.worksheet.functions

Oops, posted to an unintended m.p.excel NG. Wasn't paying attention when I
posted (sigh). Oh well, I know the right people will see this anyway.

