SetTextExtractionOptions

steve
Posted: 15 May 12 at 12:32PM
Hi,

Has anyone managed to use SetTextExtractionOptions?

In my scenario I'm trying to extract text from a PDF that uses a "small caps" font effect, e.g.
EXTRACT M
which is being extracted as

E XTRACT M E

By default, Quick PDF Library seems to use changes in font style or size as a cue for word boundary detection, a feature I was hoping I could disable by setting:

int setSuccess = _pdfLibrary.SetTextExtractionOptions(1, 1); // returns 1, so the option is being set
string pageText = _pdfLibrary.GetPageText(4);

Here OptionId 1 = "Use font information matching when grouping to separate text blocks" and the value 1 = "Ignore".

Any suggestions or help much appreciated!

Steve

steve
Posted: 15 Jun 12 at 10:29AM
Support kindly responded with the following. Using SetTextExtractionOptions(3,1) helped me in the majority of cases:

"I also find that using all three options works pretty well.

  SetTextExtractionOptions(1,1);
  SetTextExtractionOptions(2,1);
  SetTextExtractionOptions(3,1);

You may find that option 6 improves the results slightly as well.

  SetTextExtractionOptions(6,1);
"
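
For anyone who wants to apply those options in one go and check each return value, here is a rough C# sketch. It is only an illustration, not the library's own sample code: ITextExtractor, ExtractionSetup and FakeExtractor are hypothetical names I made up so the snippet compiles without the library, and the "1 means the option was set" check is taken from my first post rather than from the documentation. In real code you would call the same two methods on your own library instance instead of the fake.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the two library calls used in this thread.
// The real library class is not referenced, so the sketch compiles on its own.
public interface ITextExtractor
{
    int SetTextExtractionOptions(int optionId, int newValue);
    string GetPageText(int extractOptions);
}

public static class ExtractionSetup
{
    // Sets every listed option to newValue and returns the IDs whose call
    // did not return 1 (assumed here to mean "option was not applied").
    public static List<int> ApplyOptions(ITextExtractor pdf, IEnumerable<int> optionIds, int newValue)
    {
        var failed = new List<int>();
        foreach (int id in optionIds)
        {
            if (pdf.SetTextExtractionOptions(id, newValue) != 1)
            {
                failed.Add(id);
            }
        }
        return failed;
    }
}

// Fake used only to exercise the helper. The accepted range 1..6 is an
// arbitrary choice for this fake, not a statement about the real library.
public sealed class FakeExtractor : ITextExtractor
{
    public readonly Dictionary<int, int> Options = new Dictionary<int, int>();

    public int SetTextExtractionOptions(int optionId, int newValue)
    {
        if (optionId < 1 || optionId > 6) return 0;
        Options[optionId] = newValue;
        return 1;
    }

    public string GetPageText(int extractOptions)
    {
        return "(page text would be returned here)";
    }
}

public static class Program
{
    public static void Main()
    {
        var pdf = new FakeExtractor();

        // Options 1, 2, 3 and 6 are the ones suggested by support above.
        List<int> failed = ExtractionSetup.ApplyOptions(pdf, new[] { 1, 2, 3, 6 }, 1);

        Console.WriteLine("Options not applied: " + failed.Count);
        Console.WriteLine(pdf.GetPageText(4));
    }
}
```

Checking the returned list before calling GetPageText makes it obvious when an option ID was rejected, instead of silently extracting with the defaults.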

