<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="RSS_xslt_style.asp" version="1.0" ?>
<rss version="2.0" xmlns:WebWizForums="http://syndication.webwiz.co.uk/rss_namespace/">
 <channel>
  <title>Debenu Quick PDF Library - PDF SDK Community Forum : DAExtractPageText problem</title>
  <link>http://www.quickpdf.org/forum/</link>
  <description><![CDATA[This is an XML content feed of: Debenu Quick PDF Library - PDF SDK Community Forum : General Discussion : DAExtractPageText problem]]></description>
  <copyright>Copyright (c) 2006-2013 Web Wiz Forums - All Rights Reserved.</copyright>
  <pubDate>Mon, 04 May 2026 07:06:54 +0000</pubDate>
  <lastBuildDate>Wed, 23 Feb 2011 19:02:00 +0000</lastBuildDate>
  <docs>http://blogs.law.harvard.edu/tech/rss</docs>
  <generator>Web Wiz Forums 11.01</generator>
  <ttl>360</ttl>
  <WebWizForums:feedURL>www.quickpdf.org/forum/RSS_post_feed.asp?TID=1667</WebWizForums:feedURL>
  <image>
   <title><![CDATA[Debenu Quick PDF Library - PDF SDK Community Forum]]></title>
   <url>http://www.quickpdf.org/forum/forum_images/QPDF_Forum_Title.png</url>
   <link>http://www.quickpdf.org/forum/</link>
  </image>
  <item>
   <title><![CDATA[DAExtractPageText problem : DAExtractPageText with Options=4...]]></title>
   <link>http://www.quickpdf.org/forum/daextractpagetext-problem_topic1667_post7630.html#7630</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=1542">billycl</a><br /><strong>Subject:</strong> 1667<br /><strong>Posted:</strong> 23 Feb 11 at 7:02PM<br /><br />DAExtractPageText with <i>Options=4</i> returns<br><i>&nbsp;TQuickPDF0723.AddArcToPath(CenterX,</i>&nbsp;&nbsp;&nbsp; as a single word.<br>It seems that only the space character is currently treated as a delimiter.<br>Would it be possible, in a future version, to define more delimiters, such as "(),.:-"?<br>Software that uses Adobe Acrobat Pro gives me this result instead (although Acrobat is slow):<br><i>&nbsp;tquickpdf0723. &nbsp; &nbsp; addarctopath(&nbsp;&nbsp;&nbsp; centerx,</i><br><br>]]>
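    <![CDATA[<br /><br />Until more delimiters can be configured, one option is to split each extracted word yourself after the call returns. The following is a minimal Python sketch and not part of Quick PDF Library: the names DELIMITERS and split_word are hypothetical, and it only handles the word text, so any position data returned with Options=4 would still have to be divided separately.

```python
import re

# Hypothetical delimiter set, taken from the request in the post above.
DELIMITERS = "(),.:-"

# A token is a run of non-delimiter characters plus the delimiters that
# directly follow it, or a run of delimiters at the start of the word.
_TOKEN = re.compile("[^{0}]+[{0}]*|[{0}]+".format(re.escape(DELIMITERS)))


def split_word(word):
    """Split one extracted word after each run of delimiter characters."""
    return _TOKEN.findall(word)


print(split_word("TQuickPDF0723.AddArcToPath(CenterX,"))
# prints ['TQuickPDF0723.', 'AddArcToPath(', 'CenterX,']
```

Each delimiter stays attached to the token before it, which mirrors the grouping in the Acrobat-based result quoted above (apart from the lower-casing).
    ]]>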
   </description>
   <pubDate>Wed, 23 Feb 2011 19:02:00 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/daextractpagetext-problem_topic1667_post7630.html#7630</guid>
  </item> 
  <item>
   <title><![CDATA[DAExtractPageText problem : Hi, the algorithm is broken,...]]></title>
   <link>http://www.quickpdf.org/forum/daextractpagetext-problem_topic1667_post7345.html#7345</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=1465">Giuseppe</a><br /><strong>Subject:</strong> 1667<br /><strong>Posted:</strong> 13 Dec 10 at 11:05AM<br /><br />Hi, the algorithm is broken, so you must use a workaround: set deltax and deltay and rebuild the words yourself...]]>
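    <![CDATA[<br /><br />One possible reading of this workaround is to treat deltax and deltay as tolerances and to join neighbouring fragments whose horizontal gap and vertical offset fall within them. The Python sketch below works under that assumption. The tuple layout (text, x_left, x_right, y), the default tolerance values, and the name rebuild_words are illustrative and do not come from the library, and the fragments are assumed to be parsed and in reading order already.

```python
def rebuild_words(fragments, deltax=1.5, deltay=2.0):
    """Merge text fragments that sit close together into whole words.

    fragments: list of (text, x_left, x_right, y) tuples (assumed layout),
    already sorted in reading order.
    """
    words = []
    for text, x_left, x_right, y in fragments:
        if words:
            prev_text, prev_left, prev_right, prev_y = words[-1]
            same_row = abs(y - prev_y) <= deltay
            gap = x_left - prev_right  # negative when boxes overlap slightly
            if same_row and abs(gap) <= deltax:
                # Close enough: extend the previous word instead of starting a new one.
                words[-1] = (prev_text + text, prev_left, x_right, prev_y)
                continue
        words.append((text, x_left, x_right, y))
    return words


fragments = [
    ("Ex", 10, 20, 700),
    ("tract", 20.5, 45, 700),  # 0.5 units after "Ex": same word
    ("text", 50, 68, 700),     # 5 units after "Extract": new word
    ("here", 10, 30, 686),     # different row: new word
]
print([word[0] for word in rebuild_words(fragments)])
# prints ['Extract', 'text', 'here']
```

Suitable values for deltax and deltay depend on the font size and the page units, so they would need tuning per document.
    ]]>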
   </description>
   <pubDate>Mon, 13 Dec 2010 11:05:39 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/daextractpagetext-problem_topic1667_post7345.html#7345</guid>
  </item> 
  <item>
   <title><![CDATA[DAExtractPageText problem : Thank you very much for your...]]></title>
   <link>http://www.quickpdf.org/forum/daextractpagetext-problem_topic1667_post7336.html#7336</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=1478">dpreznik</a><br /><strong>Subject:</strong> 1667<br /><strong>Posted:</strong> 06 Dec 10 at 8:20PM<br /><br />Thank you very much for your answer.]]>
   </description>
   <pubDate>Mon, 06 Dec 2010 20:20:07 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/daextractpagetext-problem_topic1667_post7336.html#7336</guid>
  </item> 
  <item>
   <title><![CDATA[DAExtractPageText problem : Hi Dmitriy!  You can only extract...]]></title>
   <link>http://www.quickpdf.org/forum/daextractpagetext-problem_topic1667_post7335.html#7335</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=111">Ingo</a><br /><strong>Subject:</strong> 1667<br /><strong>Posted:</strong> 06 Dec 10 at 8:18PM<br /><br />Hi Dmitriy!<DIV>&nbsp;</DIV><DIV>You can only extract images that you inserted in the same session.</DIV><DIV>It will not work on other documents.</DIV><DIV>&nbsp;</DIV><DIV>Cheers, Ingo</DIV>]]>
   </description>
   <pubDate>Mon, 06 Dec 2010 20:18:42 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/daextractpagetext-problem_topic1667_post7335.html#7335</guid>
  </item> 
  <item>
   <title><![CDATA[DAExtractPageText problem : Ingo wrote: Hi Dmitriy! Try option...]]></title>
   <link>http://www.quickpdf.org/forum/daextractpagetext-problem_topic1667_post7332.html#7332</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=1478">dpreznik</a><br /><strong>Subject:</strong> 1667<br /><strong>Posted:</strong> 06 Dec 10 at 12:33PM<br /><br /><table width="99%"><tr><td class="BBquote"><img src="forum_images/quote_box.png" title="Originally posted by Ingo" alt="Originally posted by Ingo" style="vertical-align: text-bottom;" /> <strong>Ingo wrote:</strong><br /><br />Hi Dmitriy!<BR><BR>Try option "0". Is the result the same, or is it better?</td></tr></table> <DIV>Hi Ingo,</DIV><DIV>&nbsp;</DIV><DIV>Thank you for your answer. No, it is not better.</DIV><DIV><table width="99%"><tr><td class="BBquote"><img src="forum_images/quote_box.png" title="Originally posted by Ingo" alt="Originally posted by Ingo" style="vertical-align: text-bottom;" /> <strong>Ingo wrote:</strong><br /><br /><BR>Generally speaking, text is extracted in the order in which<BR>it was inserted into the page: first in, first out.<BR>Suppose the first word on a page is "ello", and the author<BR>noticed this after finishing the page and inserted an "H" before<BR>the "ello". During extraction, the "H" will come out<BR>at the end of the page content.<BR><BR>With option "4" you can extract word by word with<BR>position data. Using this position data you can<BR>reconstruct the real text rows on your own. QuickPDF offers no<BR>built-in support for that.</td></tr></table> </DIV><DIV>That is probably what happened to me, and I don't think there is a solution I could use.<BR><table width="99%"><tr><td class="BBquote"><img src="forum_images/quote_box.png" title="Originally posted by Ingo" alt="Originally posted by Ingo" style="vertical-align: text-bottom;" /> <strong>Ingo wrote:</strong><br /><br /><BR>BTW: A small warning... Don't mix DA functions with<BR>non-DA functions. That won't work ;-)&nbsp;<BR></td></tr></table> </DIV><DIV>Thank you very much for the warning.</DIV><DIV>&nbsp;</DIV><DIV>May I ask one more question?</DIV><DIV>I found Quick PDF Lite. Does it support extracting images from a PDF document? I tried it, but I don't yet know how to use its methods, which differ from those in the professional Quick PDF Library.</DIV><DIV>I would use it with C++.</DIV><DIV>Thank you very much.</DIV><DIV>Dmitriy</DIV>]]>
   </description>
   <pubDate>Mon, 06 Dec 2010 12:33:30 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/daextractpagetext-problem_topic1667_post7332.html#7332</guid>
  </item> 
  <item>
   <title><![CDATA[DAExtractPageText problem : Paddy wrote: Are you using the...]]></title>
   <link>http://www.quickpdf.org/forum/daextractpagetext-problem_topic1667_post7331.html#7331</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=1478">dpreznik</a><br /><strong>Subject:</strong> 1667<br /><strong>Posted:</strong> 06 Dec 10 at 12:27PM<br /><br /><table width="99%"><tr><td class="BBquote"><img src="forum_images/quote_box.png" title="Originally posted by Paddy" alt="Originally posted by Paddy" style="vertical-align: text-bottom;" /> <strong>Paddy wrote:</strong><br /><br />Are you using the DLL edition or the ActiveX edition? And also, does your PDF contain any Unicode characters?</td></tr></table> Hi Paddy,<DIV>&nbsp;</DIV><DIV>I am using the DLL edition. I am not sure whether my PDF contains Unicode characters.</DIV>]]>
   </description>
   <pubDate>Mon, 06 Dec 2010 12:27:48 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/daextractpagetext-problem_topic1667_post7331.html#7331</guid>
  </item> 
  <item>
   <title><![CDATA[DAExtractPageText problem : Hi Dmitriy! Try option "0"...]]></title>
   <link>http://www.quickpdf.org/forum/daextractpagetext-problem_topic1667_post7325.html#7325</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=111">Ingo</a><br /><strong>Subject:</strong> 1667<br /><strong>Posted:</strong> 04 Dec 10 at 9:56AM<br /><br />Hi Dmitriy!<br><br>Try option "0". Is the result the same, or is it better?<br>Generally speaking, text is extracted in the order in which<br>it was inserted into the page: first in, first out.<br>Suppose the first word on a page is "ello", and the author<br>noticed this after finishing the page and inserted an "H" before<br>the "ello". During extraction, the "H" will come out<br>at the end of the page content.<br><br>With option "4" you can extract word by word with<br>position data. Using this position data you can<br>reconstruct the real text rows on your own. QuickPDF offers no<br>built-in support for that.<br><br>BTW: A small warning... Don't mix DA functions with<br>non-DA functions. That won't work ;-)<br><br>Cheers and welcome here,<br>Ingo<br>&nbsp;<br>]]>
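    <![CDATA[<br /><br />Rebuilding text rows from position data could look roughly like the Python sketch below. It is an illustration only: it assumes the option "4" output has already been parsed into (text, x, y) tuples, which is not the library's raw return format, that y grows upward as in PDF page coordinates, and that words whose y values differ by no more than a tolerance belong to the same row. The name group_into_rows and the y_tolerance parameter are hypothetical.

```python
def group_into_rows(words, y_tolerance=2.0):
    """Group (text, x, y) words into text rows, listed top to bottom.

    Assumes PDF-style coordinates where a larger y is higher on the page.
    """
    rows = []  # each row: {"y": reference y, "words": [(x, text), ...]}
    # Visit words from the top of the page downward, whatever their
    # insertion order was.
    for text, x, y in sorted(words, key=lambda w: -w[2]):
        if rows and abs(rows[-1]["y"] - y) <= y_tolerance:
            rows[-1]["words"].append((x, text))
        else:
            rows.append({"y": y, "words": [(x, text)]})
    # Within each row, order the words from left to right.
    return [" ".join(t for _, t in sorted(row["words"])) for row in rows]


# Words listed in insertion order, which is not reading order.
words = [
    ("rows", 80, 700),
    ("second", 10, 686),
    ("Text", 10, 700.4),
    ("line", 55, 686.5),
    ("in", 50, 700),
]
print(group_into_rows(words))
# prints ['Text in rows', 'second line']
```

The tolerance has to be smaller than the line spacing and larger than any baseline jitter in the document, so it would need adjusting for each kind of PDF.
    ]]>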
   </description>
   <pubDate>Sat, 04 Dec 2010 09:56:14 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/daextractpagetext-problem_topic1667_post7325.html#7325</guid>
  </item> 
  <item>
   <title><![CDATA[DAExtractPageText problem : Are you using the DLL edition...]]></title>
   <link>http://www.quickpdf.org/forum/daextractpagetext-problem_topic1667_post7324.html#7324</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=1280">Paddy</a><br /><strong>Subject:</strong> 1667<br /><strong>Posted:</strong> 03 Dec 10 at 8:16PM<br /><br />Are you using the DLL edition or the ActiveX edition? And also, does your PDF contain any Unicode characters?]]>
   </description>
   <pubDate>Fri, 03 Dec 2010 20:16:19 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/daextractpagetext-problem_topic1667_post7324.html#7324</guid>
  </item> 
  <item>
   <title><![CDATA[DAExtractPageText problem : Dear experts, I am trying...]]></title>
   <link>http://www.quickpdf.org/forum/daextractpagetext-problem_topic1667_post7323.html#7323</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=1478">dpreznik</a><br /><strong>Subject:</strong> 1667<br /><strong>Posted:</strong> 03 Dec 10 at 5:53PM<br /><br />Dear experts,<DIV>&nbsp;</DIV><DIV>I am trying to create an application in C# to extract text from a PDF. I am using the <FONT size=2 face=Consolas>DAExtractPageText()</FONT> method, but the text it returns is distorted: some characters are missing, and blank spaces are inserted here and there within words.</DIV><DIV>Could you please tell me if it is possible to fix this?</DIV><DIV>&nbsp;</DIV><DIV>Thank you very much,</DIV><DIV>&nbsp;</DIV><DIV>Dmitriy</DIV>]]>
   </description>
   <pubDate>Fri, 03 Dec 2010 17:53:11 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/daextractpagetext-problem_topic1667_post7323.html#7323</guid>
  </item> 
 </channel>
</rss>