<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="RSS_xslt_style.asp" version="1.0" ?>
<rss version="2.0" xmlns:WebWizForums="http://syndication.webwiz.co.uk/rss_namespace/">
 <channel>
  <title>Debenu Quick PDF Library - PDF SDK Community Forum : Testing QuickPDF for text extraction performance</title>
  <link>http://www.quickpdf.org/forum/</link>
  <description><![CDATA[This is an XML content feed of: Debenu Quick PDF Library - PDF SDK Community Forum : General Discussion : Testing QuickPDF for text extraction performance]]></description>
  <copyright>Copyright (c) 2006-2013 Web Wiz Forums - All Rights Reserved.</copyright>
  <pubDate>Mon, 04 May 2026 10:59:27 +0000</pubDate>
  <lastBuildDate>Thu, 22 Mar 2012 10:58:16 +0000</lastBuildDate>
  <docs>http://blogs.law.harvard.edu/tech/rss</docs>
  <generator>Web Wiz Forums 11.01</generator>
  <ttl>360</ttl>
  <WebWizForums:feedURL>www.quickpdf.org/forum/RSS_post_feed.asp?TID=2152</WebWizForums:feedURL>
  <image>
   <title><![CDATA[Debenu Quick PDF Library - PDF SDK Community Forum]]></title>
   <url>http://www.quickpdf.org/forum/forum_images/QPDF_Forum_Title.png</url>
   <link>http://www.quickpdf.org/forum/</link>
  </image>
  <item>
   <title><![CDATA[Testing QuickPDF for text extraction performance : With many modern PDF files options...]]></title>
   <link>http://www.quickpdf.org/forum/testing-quickpdf-for-text-extraction-performance_topic2152_post9372.html#9372</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=1483">AndrewC</a><br /><strong>Subject:</strong> 2152<br /><strong>Posted:</strong> 22 Mar 12 at 10:58AM<br /><br /><div><br></div>With many modern PDF files, particularly complex ones, options 0, 1 and 2 don't extract the text. &nbsp;Option 8 is an improved version of option 0. &nbsp;Options 3, 4, 5, 6 and 7 use the improved extraction logic.<div><br></div><div>Andrew.</div>]]>
   </description>
   <pubDate>Thu, 22 Mar 2012 10:58:16 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/testing-quickpdf-for-text-extraction-performance_topic2152_post9372.html#9372</guid>
  </item> 
  <item>
   <title><![CDATA[Testing QuickPDF for text extraction performance :   Ingo wrote:sTxt += QP.DAExtractPageText(FH,...]]></title>
   <link>http://www.quickpdf.org/forum/testing-quickpdf-for-text-extraction-performance_topic2152_post9146.html#9146</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=1857">pcunite</a><br /><strong>Subject:</strong> 2152<br /><strong>Posted:</strong> 15 Feb 12 at 10:23PM<br /><br /> <table width="99%"><tr><td class="BBquote"><img src="forum_images/quote_box.png" title="Originally posted by Ingo" alt="Originally posted by Ingo" style="vertical-align: text-bottom;" /> <strong>Ingo wrote:</strong><br /><br />sTxt += QP.DAExtractPageText(FH, PR, 0);<br /><br />0 should be the fastest.<br />Whether 0 is useful for you depends on what you want to do with the text.</td></tr></table> <br /><br />I only want to know whether the word "blah" appears in the PDF file. I understand that I can't do this with image-only PDF files ... that is okay. So I load all the text into a buffer and then search it myself for "blah".<br />]]>
   </description>
   <pubDate>Wed, 15 Feb 2012 22:23:40 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/testing-quickpdf-for-text-extraction-performance_topic2152_post9146.html#9146</guid>
  </item> 
  <item>
   <title><![CDATA[Testing QuickPDF for text extraction performance : sTxt += QP.DAExtractPageText(FH,...]]></title>
   <link>http://www.quickpdf.org/forum/testing-quickpdf-for-text-extraction-performance_topic2152_post9145.html#9145</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=111">Ingo</a><br /><strong>Subject:</strong> 2152<br /><strong>Posted:</strong> 15 Feb 12 at 10:16PM<br /><br />sTxt += QP.DAExtractPageText(FH, PR, 0);<br><br>0 should be the fastest.<br>Whether 0 is useful for you depends on what you want to do with the text.<br><br>]]>
   </description>
   <pubDate>Wed, 15 Feb 2012 22:16:33 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/testing-quickpdf-for-text-extraction-performance_topic2152_post9145.html#9145</guid>
  </item> 
  <item>
   <title><![CDATA[Testing QuickPDF for text extraction performance : Thank you for your help. I'm...]]></title>
   <link>http://www.quickpdf.org/forum/testing-quickpdf-for-text-extraction-performance_topic2152_post9144.html#9144</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=1857">pcunite</a><br /><strong>Subject:</strong> 2152<br /><strong>Posted:</strong> 15 Feb 12 at 10:07PM<br /><br />Thank you for your help. I'm testing the sample function below. Is this the fastest way? I just want to make sure I'm doing all I can. Also, I don't understand <a href="http://www.quickpdflibrary.com/help/quickpdf/DASetTextExtractionOptions.php" target="_blank">DASetTextExtractionOptions</a> ... should I use it to optimize anything?<br /><br /><table width="99%"><tr><td class="BBquote"><img src="forum_images/quote_box.png" title="Quote" alt="Quote" style="vertical-align: text-bottom;" /> <br />size_t GetText_PDF(std::wstring &amp; sF, std::wstring &amp; sTxt)<br />{<br />&nbsp;&nbsp;&nbsp;&nbsp;int FH, PR, iPages;<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;// Open file read-only<br />&nbsp;&nbsp;&nbsp;&nbsp;FH = QP.DAOpenFileReadOnly(sF, L"");<br />&nbsp;&nbsp;&nbsp;&nbsp;if(FH == 0){return FILE_ERROR_OPEN;}<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;// Get page count<br />&nbsp;&nbsp;&nbsp;&nbsp;iPages = QP.DAGetPageCount(FH);<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;// Loop over pages<br />&nbsp;&nbsp;&nbsp;&nbsp;for(int i = 1; i &lt;= iPages; i++)<br />&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;// Get a page reference to the current page<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;PR = QP.DAFindPage(FH, i);<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;// Extract the text from the current page<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;sTxt += QP.DAExtractPageText(FH, PR, 8);<br />&nbsp;&nbsp;&nbsp;&nbsp;}<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;// Close file<br />&nbsp;&nbsp;&nbsp;&nbsp;QP.DACloseFile(FH);<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;return FILE_SUCCESS;<br />}</td></tr></table><br /><span style="font-size:10px"><br /><br />Edited by pcunite - 15 Feb 12 at 10:21PM</span>]]>
   </description>
   <pubDate>Wed, 15 Feb 2012 22:07:49 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/testing-quickpdf-for-text-extraction-performance_topic2152_post9144.html#9144</guid>
  </item> 
  <item>
   <title><![CDATA[Testing QuickPDF for text extraction performance : Hi!This library offers over 500...]]></title>
   <link>http://www.quickpdf.org/forum/testing-quickpdf-for-text-extraction-performance_topic2152_post9143.html#9143</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=111">Ingo</a><br /><strong>Subject:</strong> 2152<br /><strong>Posted:</strong> 15 Feb 12 at 9:05PM<br /><br />Hi!<br><br>This library offers over 500 functions for a low price.<br>Text extraction was already in the first versions many years ago.<br>So it should be stable, but it won't be optimized specifically for text extraction.<br>Opinions will differ, so you'll have to try it yourself.<br><br>Cheers, Ingo<br>]]>
   </description>
   <pubDate>Wed, 15 Feb 2012 21:05:59 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/testing-quickpdf-for-text-extraction-performance_topic2152_post9143.html#9143</guid>
  </item> 
  <item>
   <title><![CDATA[Testing QuickPDF for text extraction performance : Well, yes I've read some...]]></title>
   <link>http://www.quickpdf.org/forum/testing-quickpdf-for-text-extraction-performance_topic2152_post9141.html#9141</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=1857">pcunite</a><br /><strong>Subject:</strong> 2152<br /><strong>Posted:</strong> 15 Feb 12 at 9:00PM<br /><br />Well, yes I've read some of the materials. I'm looking at about 5 different solutions and wanted my hand held a little :)<br /><br />I know how to use your library, I just wanted a fuzzy feeling that it is up to the task for my requirements. Some PDF libraries are more for creation or editing ... I just want the text as fast as I can get it. Is QuickPDF optimized for this?<br /><br />P.S.<br />I did not find it referenced anywhere, but can the .LIB version work with C++ Builder 2007, or is it only for Visual Studio? The .DLL version is fine, just wondering.<br />]]>
   </description>
   <pubDate>Wed, 15 Feb 2012 21:00:26 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/testing-quickpdf-for-text-extraction-performance_topic2152_post9141.html#9141</guid>
  </item> 
  <item>
   <title><![CDATA[Testing QuickPDF for text extraction performance : So you didn't read the documents...]]></title>
   <link>http://www.quickpdf.org/forum/testing-quickpdf-for-text-extraction-performance_topic2152_post9139.html#9139</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=111">Ingo</a><br /><strong>Subject:</strong> 2152<br /><strong>Posted:</strong> 15 Feb 12 at 8:46PM<br /><br />So you didn't read the documents on the publisher's original support pages ;-)<br><br>Hi!<br><br>The searching can be done in your programming language,<br>and the text extraction can be done with QuickPDF, which offers several kinds of options.<br>As for your performance question: it always depends ... Try it ;-)<br>http://www.quickpdflibrary.com/help/quickpdf/Extraction.php<br><br>Cheers and welcome here,<br>Ingo<br><br>]]>
   </description>
   <pubDate>Wed, 15 Feb 2012 20:46:37 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/testing-quickpdf-for-text-extraction-performance_topic2152_post9139.html#9139</guid>
  </item> 
  <item>
   <title><![CDATA[Testing QuickPDF for text extraction performance : I am evaluating the QuickPDF library...]]></title>
   <link>http://www.quickpdf.org/forum/testing-quickpdf-for-text-extraction-performance_topic2152_post9138.html#9138</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=1857">pcunite</a><br /><strong>Subject:</strong> 2152<br /><strong>Posted:</strong> 15 Feb 12 at 8:40PM<br /><br />I am evaluating the QuickPDF library (.dll version) for use in a C++ application. The only functionality I need is text extraction. The PDF's entire text will be placed in memory and then I'll search it for keyword terms.<br /><br />Is QuickPDF suitable for this type of work, and does it offer good performance?<br />]]>
   </description>
   <pubDate>Wed, 15 Feb 2012 20:40:20 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/testing-quickpdf-for-text-extraction-performance_topic2152_post9138.html#9138</guid>
  </item> 
 </channel>
</rss>