
Debenu Quick PDF Library - PDF SDK Community Forum Homepage
Extracting text problem

RobertN (Beginner, Canada), posted 13 May 09 at 3:04PM:

I have created a simple form in Excel with cells that contain '@VariableName'. I print it to PDF and then open the PDF using QuickPDF and Delphi.
I want to scan the PDF for all text of the form '@somevariablename', get the font size, coordinates, etc., and then convert each match into a form field.
The purpose is to create a PDF form filler whose results I can save.
 
I tried GetPageText(3), but the result doesn't contain any readable text. If I try a PDF with form fields, the text is extracted properly.

How do I extract this text?
 
Thank you,
Robert

Ingo (Moderator Group), posted 14 May 09 at 1:48AM:
Hi Robert!

In your case I think the "@some..." content will be single strings/words,
so it should be better to use GetPageText(4).

Perhaps you could send me a sample of your files, and then I'll try to extract the "@some..." strings?

ingo  [ dot ]  schmoekel  ( at )  ewetel  [ dot ]  net

Cheers, Ingo
 
RobertN, posted 14 May 09 at 8:39AM:
Hi Ingo,
 
Here is a sample PDF file with the "@sometext" in it.
It was generated in Excel and printed to PDF via PrimoPDF.
I have tried a few other printer drivers, but the result was the same.

I tried GetPageText() with options 0, 1, 2, 3 and 4, all with the same result.
I can open the file in Acrobat Reader and extract the text without a problem.
 
 
Thank you,
Robert

Ingo, posted 14 May 09 at 8:51AM:
Hi!

Be careful about which PrimoPDF version you use. PrimoPDF is built on the Ghostscript library, and QuickPDF still has problems extracting text from files produced with older Ghostscript versions (before 8.15). Your PDF was made with PrimoPDF and Ghostscript 8.50, so that should be okay. Looking at the extracted text I can find many variables beginning with "@", so I think it is basically working.
Adobe Reader (8.1) and Foxit (3.0) can't find "@sometext" either.
Is there anything special about how you add "@sometext" to the content?
How do you do this?
Do you have any code for us to check?

Cheers, Ingo



Edited by Ingo - 14 May 09 at 8:53AM

RobertN, posted 14 May 09 at 9:47AM:
Here is essentially what I'm doing in Delphi 7.
 
 
procedure TForm1.Button2Click(Sender: TObject);
var
  oDoc: TQuickPDF0713;
  sTemp, sFilename: string;
begin
  sFilename := 'c:\Temperature_Transmitter_Template.pdf';
  oDoc := TQuickPDF0713.Create;
  try
    if oDoc.UnlockKey('...') = 1 then
    begin
      if oDoc.LoadFromFile(sFilename) = 1 then
      begin
        sTemp := oDoc.GetPageText(0);
        ShowMessage(sTemp);
        // this returns an empty string
      end
      else
        ShowMessage('invalid PDF');
    end
    else
      ShowMessage('Invalid KEY');
  finally
    FreeAndNil(oDoc);
  end;
end;
The output is blank for GetPageText() options 0 and 1.
For 2, I get the text coordinates, etc. in CSV format.
For 3 and 4, I get the same as 2, but all the text is garbled.
Do I need to convert it?
 
sample output :
"UBTAOI+Arial",#000000,6.71,60.1272,118.3487,295.7056,118.3487,295.7056,124.6588,60.1272,124.6588,"())*++,-../*, )0 +-)0)+*.("
 
Thanks,
Robert
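
For readers who want to post-process this kind of output: the sample line above looks like one comma-separated record per text block. Below is a minimal sketch of reading such a record in plain Delphi, with no QuickPDF calls. It assumes, based only on that one sample, that a record has twelve fields: font name, colour, font size, four corner points as x/y pairs starting at the bottom left, and the quoted text. TTextBlock, DotStrToFloat and ParseTextBlock are hypothetical names, and the demo record containing '@Temperature' is made up for illustration (only its font and coordinates are borrowed from the sample).

```pascal
program ParseTextBlockDemo;

{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

uses
  SysUtils, Classes;

type
  // Hypothetical record; the field meanings are inferred from the sample line.
  TTextBlock = record
    FontName: string;
    Color: string;
    FontSize: Double;
    Left, Bottom, Right, Top: Double;
    Text: string;
  end;

// Converts a number written with '.' as decimal separator,
// independent of the regional settings of the machine.
function DotStrToFloat(const S: string): Double;
var
  Value: Double;
  Code: Integer;
begin
  Val(S, Value, Code);
  if Code <> 0 then
    raise EConvertError.CreateFmt('"%s" is not a number', [S]);
  Result := Value;
end;

// Splits one CSV record into its fields. Returns False if the record
// does not have the twelve fields assumed here.
function ParseTextBlock(const Line: string; var Block: TTextBlock): Boolean;
var
  Fields: TStringList;
begin
  Result := False;
  Fields := TStringList.Create;
  try
    // CommaText honours double-quoted fields, so commas and spaces
    // inside the quoted text stay in one field.
    Fields.CommaText := Line;
    if Fields.Count <> 12 then
      Exit;
    Block.FontName := Fields[0];
    Block.Color := Fields[1];
    Block.FontSize := DotStrToFloat(Fields[2]);
    Block.Left := DotStrToFloat(Fields[3]);    // x of the first corner
    Block.Bottom := DotStrToFloat(Fields[4]);  // y of the first corner
    Block.Right := DotStrToFloat(Fields[7]);   // x of the third (opposite) corner
    Block.Top := DotStrToFloat(Fields[8]);     // y of the third corner
    Block.Text := Fields[11];
    Result := True;
  finally
    Fields.Free;
  end;
end;

const
  // Made-up record for illustration only.
  SampleLine = '"UBTAOI+Arial",#000000,6.71,60.1272,118.3487,295.7056,118.3487,' +
    '295.7056,124.6588,60.1272,124.6588,"@Temperature"';

var
  Block: TTextBlock;
begin
  // Report the record only if its text looks like an @variable placeholder.
  if ParseTextBlock(SampleLine, Block) and
     (Length(Block.Text) > 0) and (Block.Text[1] = '@') then
    WriteLn(Format('%s at (%.2f, %.2f), %.2f x %.2f, font %s %.2f pt',
      [Block.Text, Block.Left, Block.Bottom,
       Block.Right - Block.Left, Block.Top - Block.Bottom,
       Block.FontName, Block.FontSize]))
  else
    WriteLn('No @variable found in this record');
end.
```

This only helps once the extracted text is readable. With the garbled output shown above, the text field would not start with '@', so the record would simply be skipped.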

Ingo, posted 14 May 09 at 10:01AM:
I've sent an email about this case to Debenu ... ;-)

Cheers, Ingo

deabrew (Newbie), posted 14 May 09 at 5:53PM:
Hello Robert, Ingo,

I'd like to confirm that Ingo has notified me, and that we will add support for this in a future version (fairly shortly).

Regards, Karl.

RobertN, posted 14 May 09 at 8:17PM:

I just recreated the PDF sample using the DoPDF print driver instead of PrimoPDF, and QuickPDF now detects the text correctly.

Thank you again for the quick responses.

deabrew, posted 14 May 09 at 8:34PM:
Excellent -- note that we have also added support for this in the next build (7.14) of QPL.

Cheers, -Karl