Debenu Quick PDF Library - PDF SDK Community Forum Homepage

How to optimize text extraction?

Dmitry, posted 11 Mar 07 at 3:40AM
Hi to all!
I have a question. How can I reduce the execution time of the GetPageText function? It averages about one second per page, which is too long for me :-)

If I understand correctly, during text extraction the qPDF library also extracts all images from the page. Maybe it would be faster not to extract the images and save them to the hard drive?

marian_pascalau, posted 11 Mar 07 at 1:48PM
Dmitry,
there is only one way to influence the text extraction: the Option parameter.
 
As you may know, it takes 5 values:
0: contents scan
1: internally the same as 0
2: contents scan, CSV output
3: CSV text collection with rendering (may read image dictionary)
4: CSV text collection with rendering and word separation.
 
For your information, using options 0-2 may bring some improvement.
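
To see how much the Option value matters for a given document, it may help to time each value on the same page. The following is only a minimal sketch: `extract` is a hypothetical callable that wraps however you call GetPageText in your own program, and `fake_extract` is a stand-in so the sketch runs without the library. Option 1 is skipped because it is internally the same as 0.

```python
import time

def time_options(extract, options=(0, 2, 3, 4), repeats=3):
    """Return the best wall-clock time in seconds for each option value."""
    timings = {}
    for option in options:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            extract(option)  # hypothetical wrapper around your GetPageText call
            best = min(best, time.perf_counter() - start)
        timings[option] = best
    return timings

def fake_extract(option):
    # Stand-in so the sketch runs without the library; replace with a real call.
    return "x" * (1000 * (option + 1))

if __name__ == "__main__":
    for option, seconds in time_options(fake_extract).items():
        print(f"Option {option}: {seconds:.6f} s")
```

Taking the best of several runs reduces noise from disk caching and other processes; the numbers from the stand-in say nothing about the library itself.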

Dmitry, posted 12 Mar 07 at 4:38AM
marian_pascalau, yes I know. But I need exactly parameter 5.

Ingo, posted 12 Mar 07 at 5:00AM
". . .
qPDF library extract also all images from page
. . ."

Hi!

The current library version doesn't extract the images anymore.

Best regards,
Ingo
 

marian_pascalau, posted 12 Mar 07 at 5:39AM
Hi Dmitry, Hi Ingo,
I cannot follow either of you:
Dmitry, what do you mean by parameter 5?
Ingo, is it now working as expected, or is this an error?
 
Marian

Ingo, posted 12 Mar 07 at 7:11AM
Hi Marian!

It's working as expected...
I think this was fixed months ago...
Here's a thread pointing in the same direction:
http://www.quickpdf.org/forum/search_results_posts.asp?SearchID=20070312070924&KW=asachoi

Best regards,
Ingo


Dmitry, posted 13 Mar 07 at 5:22AM
marian_pascalau, sorry, I meant parameter 4

Ingo, please just give me a direct link to the thread.

marian_pascalau, posted 13 Mar 07 at 5:27AM
Dmitry, if you consider a sponsorship, I will try to optimize the text extraction (Option=4) for you. Otherwise you should use option 2 and split the text with your own program.
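
As a rough illustration of "split the text with your own program", here is a minimal sketch in Python. It assumes the CSV output has one text fragment per row with the text in the last field; that column layout, the `split_words` helper and the sample rows are assumptions made for illustration, so check them against the output you actually get from option 2.

```python
import csv
import io

def split_words(csv_text, text_column=-1):
    """Split the text field of each CSV row into whitespace-separated words."""
    words = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip empty lines
        # Assumption: the text fragment is in the last field of each row.
        words.extend(row[text_column].split())
    return words

# Made-up sample rows, not real library output.
sample = (
    '"Arial",12,72.0,700.0,"Hello world"\n'
    '"Arial",12,72.0,680.0,"second line"\n'
)

if __name__ == "__main__":
    print(split_words(sample))
```

This only splits on whitespace inside each fragment; it does not compute per-word positions the way option 4 does.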

Dmitry, posted 13 Mar 07 at 6:28AM
marian_pascalau
No, thanks