<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="RSS_xslt_style.asp" version="1.0" ?>
<rss version="2.0" xmlns:WebWizForums="http://syndication.webwiz.co.uk/rss_namespace/">
 <channel>
  <title>Debenu Quick PDF Library - PDF SDK Community Forum : Height of the extracted text</title>
  <link>http://www.quickpdf.org/forum/</link>
  <description><![CDATA[This is an XML content feed of; Debenu Quick PDF Library - PDF SDK Community Forum : I need help - I can help : Height of the extracted text]]></description>
  <copyright>Copyright (c) 2006-2013 Web Wiz Forums - All Rights Reserved.</copyright>
  <pubDate>Sun, 05 Apr 2026 17:17:05 +0000</pubDate>
  <lastBuildDate>Wed, 29 Aug 2012 14:22:15 +0000</lastBuildDate>
  <docs>http://blogs.law.harvard.edu/tech/rss</docs>
  <generator>Web Wiz Forums 11.01</generator>
  <ttl>360</ttl>
  <WebWizForums:feedURL>www.quickpdf.org/forum/RSS_post_feed.asp?TID=2376</WebWizForums:feedURL>
  <image>
   <title><![CDATA[Debenu Quick PDF Library - PDF SDK Community Forum]]></title>
   <url>http://www.quickpdf.org/forum/forum_images/QPDF_Forum_Title.png</url>
   <link>http://www.quickpdf.org/forum/</link>
  </image>
  <item>
   <title><![CDATA[Height of the extracted text : That is exactly it!]]></title>
   <link>http://www.quickpdf.org/forum/height-of-the-extracted-text_topic2376_post10034.html#10034</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=2030">emgi</a><br /><strong>Subject:</strong> 2376<br /><strong>Posted:</strong> 29 Aug 12 at 2:22PM<br /><br />That is exactly it!]]>
   </description>
   <pubDate>Wed, 29 Aug 2012 14:22:15 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/height-of-the-extracted-text_topic2376_post10034.html#10034</guid>
  </item> 
  <item>
   <title><![CDATA[Height of the extracted text : If it is graphical then I suspect...]]></title>
   <link>http://www.quickpdf.org/forum/height-of-the-extracted-text_topic2376_post10033.html#10033</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=1483">AndrewC</a><br /><strong>Subject:</strong> 2376<br /><strong>Posted:</strong> 29 Aug 12 at 2:18PM<br /><br />If it is graphical then I suspect you are rendering the PDF to an image. &nbsp;You could use this image and the bounding box to extract the word into a smaller image and then analyse the smaller image to find the extent of the whitespace. &nbsp;You can then adjust the values from QPL by the whitespace values that you have calculated.<div><br></div><div>Andrew.</div>]]>
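    <![CDATA[<div><br></div><div>A minimal sketch of the cropping idea described above, in Python. It assumes the rendered page is already available as a grayscale grid (a list of rows of 0-255 values) and that the bounding box from QPL has already been converted to pixel coordinates. <code>trim_word_box</code> and its parameters are hypothetical names for illustration, not part of the QPL API.</div><pre class="BBcode">
```python
def trim_word_box(page, box, white_threshold=250):
    """Shrink a cell-height box to the rows that actually contain ink.

    page: grayscale image as a list of rows, each a list of 0-255 ints.
    box:  (left, top, right, bottom) in pixel coordinates, with top < bottom
          and right/bottom exclusive. Converting QPL's page coordinates to
          pixels is assumed to have happened already.
    Returns the tightened box, or None if the box holds only whitespace.
    """
    left, top, right, bottom = box

    # A row counts as ink if any pixel in it is darker than the threshold.
    ink_rows = [
        y for y in range(top, bottom)
        if any(page[y][x] < white_threshold for x in range(left, right))
    ]
    if not ink_rows:
        return None

    # Only the vertical extent is tightened; the horizontal extent is kept.
    return (left, ink_rows[0], right, ink_rows[-1] + 1)


# Tiny synthetic "render": a white page 8 pixels wide and 6 pixels high,
# with a dark blob on rows 2-3.
page = [[255] * 8 for _ in range(6)]
for y in (2, 3):
    for x in range(2, 6):
        page[y][x] = 0

tight = trim_word_box(page, (1, 0, 7, 6))
print(tight)  # (1, 2, 7, 4)
```
</pre><div>Because only the pixels inside each word box are scanned, the cost grows with the box area rather than the page area, which may help with the processing time concern.</div>]]>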
   </description>
   <pubDate>Wed, 29 Aug 2012 14:18:30 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/height-of-the-extracted-text_topic2376_post10033.html#10033</guid>
  </item> 
  <item>
   <title><![CDATA[Height of the extracted text : Thank you for your answer. It...]]></title>
   <link>http://www.quickpdf.org/forum/height-of-the-extracted-text_topic2376_post10032.html#10032</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=2030">emgi</a><br /><strong>Subject:</strong> 2376<br /><strong>Posted:</strong> 29 Aug 12 at 2:12PM<br /><br />Thank you for your answer.<br>It would be really useful for my tool.<br>It is a tool to detect and verify the content of various documents.<br>To do this, the user defines graphical areas and a list of rules for each area.<div>&nbsp;</div><div>My other option is to analyze the rendered image and deduce the character size from it. However, the processing time may be very long.</div><div>&nbsp;</div>Regards,<br>Emmanuel<span style="font-size:10px"><br /><br />Edited by emgi - 29 Aug 12 at 2:14PM</span>]]>
   </description>
   <pubDate>Wed, 29 Aug 2012 14:12:36 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/height-of-the-extracted-text_topic2376_post10032.html#10032</guid>
  </item> 
  <item>
   <title><![CDATA[Height of the extracted text : I have just realised that the...]]></title>
   <link>http://www.quickpdf.org/forum/height-of-the-extracted-text_topic2376_post10029.html#10029</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=1483">AndrewC</a><br /><strong>Subject:</strong> 2376<br /><strong>Posted:</strong> 29 Aug 12 at 11:38AM<br /><br /><div><br></div><div>I have just realised that the individual character bounding boxes are not easily available in the font files. &nbsp;We don't need to use the individual character heights when rendering fonts, as this is taken care of by the font renderer built into Windows. &nbsp;</div><div><br></div><div>Every font has a different way of storing this information and it would take considerable effort to extract and store the required values. &nbsp;</div><div><br></div><div>The character widths are freely available directly from the PDF structure itself. &nbsp;The character bounding boxes would need to be extracted from each different font type. &nbsp;This would also slow down the rendering process.</div><div><br></div><div>It would not be a quick fix to extract this information and it is very unlikely that I can get the developers to implement this feature at the moment.</div><div><br></div><div>Andrew.</div><span style="font-size:10px"><br /><br />Edited by AndrewC - 29 Aug 12 at 2:04PM</span>]]>
   </description>
   <pubDate>Wed, 29 Aug 2012 11:38:47 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/height-of-the-extracted-text_topic2376_post10029.html#10029</guid>
  </item> 
  <item>
   <title><![CDATA[Height of the extracted text : Hi Andrew, I'm writing...]]></title>
   <link>http://www.quickpdf.org/forum/height-of-the-extracted-text_topic2376_post10027.html#10027</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=2030">emgi</a><br /><strong>Subject:</strong> 2376<br /><strong>Posted:</strong> 29 Aug 12 at 6:36AM<br /><br />Hi Andrew,<div>I'm writing a tool to capture and analyse text that uses graphical areas on rendered pages.</div><div>That's why I need this data.</div><div>Regards,</div><div>Emmanuel</div>]]>
   </description>
   <pubDate>Wed, 29 Aug 2012 06:36:32 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/height-of-the-extracted-text_topic2376_post10027.html#10027</guid>
  </item> 
  <item>
   <title><![CDATA[Height of the extracted text : Quick PDF Library returns the...]]></title>
   <link>http://www.quickpdf.org/forum/height-of-the-extracted-text_topic2376_post10026.html#10026</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=1483">AndrewC</a><br /><strong>Subject:</strong> 2376<br /><strong>Posted:</strong> 29 Aug 12 at 3:11AM<br /><br />Quick PDF Library returns the full font cell height. The cell height is defined as the Font Ascent + Font Descent. &nbsp;Using these values makes it much easier to group characters into words and words into lines for the advanced text extraction options.<div><div><br></div><div>I am wondering why you need the actual character bounding boxes of each word? &nbsp;</div></div><div><br></div><div>Andrew.</div>]]>
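    <![CDATA[<div><br></div><div>As a rough illustration of why a cell-height box is taller than the glyphs inside it, the sketch below computes both heights from font metrics given in 1/1000 em units. The metric values are illustrative assumptions, roughly those of a common sans-serif font, and the function names are hypothetical; none of this is QPL API.</div><pre class="BBcode">
```python
def cell_height(ascent, descent, font_size):
    """Full font cell height in points: (ascent + |descent|) scaled to size.

    ascent/descent are in 1/1000 em units; descent is often stored as a
    negative number, so its absolute value is used.
    """
    return (ascent + abs(descent)) * font_size / 1000.0


def cap_height_pt(cap_height, font_size):
    """Height of a flat capital letter in points."""
    return cap_height * font_size / 1000.0


# Illustrative metrics (assumed, not read from any PDF).
ASCENT, DESCENT, CAP_HEIGHT = 905, -212, 716
size = 12

box = cell_height(ASCENT, DESCENT, size)
glyph = cap_height_pt(CAP_HEIGHT, size)
print(round(box, 3), round(glyph, 3))  # 13.404 8.592
print(round(glyph / box, 2))           # 0.64
```
</pre><div>With these numbers, a word set in capitals fills only about 64% of the box height, which is consistent with the extracted boxes being taller than the rendered words in this thread.</div>]]>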
   </description>
   <pubDate>Wed, 29 Aug 2012 03:11:23 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/height-of-the-extracted-text_topic2376_post10026.html#10026</guid>
  </item> 
  <item>
   <title><![CDATA[Height of the extracted text : Thank you. I agree that QuickPdfLib...]]></title>
   <link>http://www.quickpdf.org/forum/height-of-the-extracted-text_topic2376_post10000.html#10000</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=2030">emgi</a><br /><strong>Subject:</strong> 2376<br /><strong>Posted:</strong> 21 Aug 12 at 3:43PM<br /><br /><p>Thank you.<br>I agree that QuickPdfLib is a stable library; I have been using it successfully for a long time!<br>I don't think this is a bug, but I had never done this before.<br>So I will run some more tests and post my question on the official support pages.<br>Best regards,</p><p>Emmanuel</p><div>&nbsp;</div><span style="font-size:10px"><br /><br />Edited by emgi - 21 Aug 12 at 4:04PM</span>]]>
   </description>
   <pubDate>Tue, 21 Aug 2012 15:43:09 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/height-of-the-extracted-text_topic2376_post10000.html#10000</guid>
  </item> 
  <item>
   <title><![CDATA[Height of the extracted text : So you should subtract a little...]]></title>
   <link>http://www.quickpdf.org/forum/height-of-the-extracted-text_topic2376_post9999.html#9999</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=111">Ingo</a><br /><strong>Subject:</strong> 2376<br /><strong>Posted:</strong> 21 Aug 12 at 3:15PM<br /><br />So you should subtract a little bit.<br>Try a few values to find a matching percentage.<br>Where's the problem?<br>If you think it's an error, you should post it on the official support pages.<br>This is the user-to-user forum.<br>QP is a stable library with many years of development behind it - I've never seen a question like yours ;-)<br><br>Cheers, Ingo<br>]]>
   </description>
   <pubDate>Tue, 21 Aug 2012 15:15:48 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/height-of-the-extracted-text_topic2376_post9999.html#9999</guid>
  </item> 
  <item>
   <title><![CDATA[Height of the extracted text : Hi Ingo, Thank you for...]]></title>
   <link>http://www.quickpdf.org/forum/height-of-the-extracted-text_topic2376_post9998.html#9998</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=2030">emgi</a><br /><strong>Subject:</strong> 2376<br /><strong>Posted:</strong> 21 Aug 12 at 2:43PM<br /><br /><p>Hi Ingo,<br>Thank you for your response.</p><div>That's what I do (as the code below shows).</div><div>But the boxes (in blue) are taller than the rendered words (in red).</div><div>&nbsp;</div><div><table width="99%"><tr><td><pre class="BBcode">String txt = pdf.GetPageText(4);</pre></td></tr></table></div><div>&nbsp;</div><div><img src="http://freemuse.free.fr/tests/QuickPdf.png" height="126" width="395" border="1" /></div><div>&nbsp;</div><div>Regards</div><span style="font-size:10px"><br /><br />Edited by emgi - 21 Aug 12 at 2:44PM</span>]]>
   </description>
   <pubDate>Tue, 21 Aug 2012 14:43:48 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/height-of-the-extracted-text_topic2376_post9998.html#9998</guid>
  </item> 
  <item>
   <title><![CDATA[Height of the extracted text : Hi emgi! If you use the extract...]]></title>
   <link>http://www.quickpdf.org/forum/height-of-the-extracted-text_topic2376_post9997.html#9997</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=111">Ingo</a><br /><strong>Subject:</strong> 2376<br /><strong>Posted:</strong> 21 Aug 12 at 2:20PM<br /><br />Hi emgi!<br><br>If you use the "word by word" extraction option, then the font height should be correct.<br>Otherwise, have a look at the x/y values of the string boxes.<br>See the online reference here:<br>http://www.quickpdflibrary.com/help/quickpdf/ExtractFilePageText.php<br><br>Cheers and welcome here,<br>Ingo<br><br><span style="font-size:10px"><br /><br />Edited by Ingo - 21 Aug 12 at 2:21PM</span>]]>
   </description>
   <pubDate>Tue, 21 Aug 2012 14:20:16 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/height-of-the-extracted-text_topic2376_post9997.html#9997</guid>
  </item> 
 </channel>
</rss>