Debenu Quick PDF Library - PDF SDK Community Forum : DAExtractPageText problem

Author: billycl
Subject: 1667
Posted: 23 Feb 11 at 7:02PM

DAExtractPageText with Options=4 returns
TQuickPDF0723.AddArcToPath(CenterX,
as one word. It seems that only the space character is treated as a delimiter now.
Would it be possible (in a future version) to define more delimiters, such as "(),.:-"?
I see this result
tquickpdf0723. addarctopath( centerx,
in software that uses Adobe Acrobat Pro (but Acrobat is slow).
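
If the library does not offer such an option, the extra splitting can be done on the extracted words afterwards. The following is only a minimal C++ sketch of that idea. The function name `splitAfterDelimiters` is hypothetical and not part of Quick PDF Library, and the input is assumed to be one word string already taken from the extraction result.

```cpp
#include <cassert>  // assert, handy for quick checks when trying the sketch out
#include <string>
#include <vector>

// Hypothetical helper, not a Quick PDF Library function.
// Splits one extracted "word" after each extra delimiter character.
// The delimiter stays attached to the piece it ends, e.g. "AddArcToPath(".
std::vector<std::string> splitAfterDelimiters(const std::string& word,
                                              const std::string& delimiters = "(),.:-") {
    std::vector<std::string> pieces;
    std::string current;
    for (char ch : word) {
        current.push_back(ch);
        if (delimiters.find(ch) != std::string::npos) {
            // A delimiter closes the current piece.
            pieces.push_back(current);
            current.clear();
        }
    }
    // Keep any trailing text that did not end with a delimiter.
    if (!current.empty()) {
        pieces.push_back(current);
    }
    return pieces;
}
```

Called with "TQuickPDF0723.AddArcToPath(CenterX," this yields the three pieces "TQuickPDF0723.", "AddArcToPath(" and "CenterX,". Lowercasing, as in the Acrobat-based result, would be a separate step.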

Author: Giuseppe
Subject: 1667
Posted: 13 Dec 10 at 11:05AM

Hi, the algorithm is faulty. You must use a workaround: set deltax and deltay and rebuild the words yourself...

Author: dpreznik
Subject: 1667
Posted: 06 Dec 10 at 8:20PM

Thank you very much for your answer.

Author: Ingo
Subject: 1667
Posted: 06 Dec 10 at 8:18PM

Hi Dmitriy!

You can only extract images that you inserted in the same session.

There is no chance with other documents.

Cheers, Ingo

Author: dpreznik
Subject: 1667
Posted: 06 Dec 10 at 12:33PM

Ingo wrote:

Hi Dmitriy!

Try option "0" ... Is the result the same, or is it better?

Hi Ingo,

Thank you for your answer. No, it is not better.

Ingo wrote:

Generally speaking, extraction returns the text in the order
in which it was inserted: first in, first out.
If the first word on a page is "ello", and after finishing
the page the author notices this and inserts an "H" before
the "ello", then the "H" is extracted at the end of the
page content.

With option "4" you can extract the text word by word with
position data. Using this position data, you can
reconstruct the real text rows on your own. QuickPDF
offers no support for this.

Probably that is what happened to me. And I think there is no solution for it that I could use.

Ingo wrote:

BTW: A small warning... Don't mix DA-functions with
non-DA-functions - this won't work ;-)

Thank you very much for the warning.

May I ask one more question?

I found Quick PDF Lite. Would it support extracting images from a PDF document? I tried it, but I don't yet know how to apply its methods, which are different from those in the professional Quick PDF.

I would use it with C++.

Thank you very much.

Dmitriy

Author: dpreznik
Subject: 1667
Posted: 06 Dec 10 at 12:27PM

Paddy wrote:

Are you using the DLL edition or the ActiveX edition? And also, does your PDF contain any Unicode characters?

Hi Paddy,

I am using DLL edition. I am not sure if my PDF contains Unicode characters.

Author: Ingo
Subject: 1667
Posted: 04 Dec 10 at 9:56AM

Hi Dmitriy!

Try option "0" ... Is the result the same, or is it better?
Generally speaking, extraction returns the text in the order
in which it was inserted: first in, first out.
If the first word on a page is "ello", and after finishing
the page the author notices this and inserts an "H" before
the "ello", then the "H" is extracted at the end of the
page content.

With option "4" you can extract the text word by word with
position data. Using this position data, you can
reconstruct the real text rows on your own. QuickPDF
offers no support for this.
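
Rebuilding the rows from position data could look roughly like the C++ sketch below. It is only an illustration under stated assumptions. The `PositionedWord` record, the `rebuildRows` function and the tolerance value are hypothetical. Each word is assumed to be already parsed into a text plus one x/y position, with y growing upwards as in the default PDF coordinate system. Parsing the actual option "4" output into this form is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// Hypothetical record for one extracted word: its text plus one reference
// position in page units. This is not a Quick PDF Library type.
struct PositionedWord {
    double x;
    double y;
    std::string text;
};

// Rebuilds text rows from words given in arbitrary (insertion) order.
// A word joins the current row if its y lies within rowTolerance of the
// first word of that row. Rows come out top to bottom, words left to right.
std::vector<std::string> rebuildRows(std::vector<PositionedWord> words,
                                     double rowTolerance = 2.0) {
    assert(rowTolerance >= 0.0);

    // Larger y first, so the topmost row is handled first.
    std::stable_sort(words.begin(), words.end(),
        [](const PositionedWord& a, const PositionedWord& b) { return a.y > b.y; });

    // Cut the sorted sequence into rows wherever y moves too far away.
    std::vector<std::vector<PositionedWord>> rows;
    for (const PositionedWord& w : words) {
        if (rows.empty() ||
            std::fabs(rows.back().front().y - w.y) > rowTolerance) {
            rows.emplace_back();
        }
        rows.back().push_back(w);
    }

    // Inside each row, order the words by x and join them with spaces.
    std::vector<std::string> lines;
    for (auto& row : rows) {
        std::stable_sort(row.begin(), row.end(),
            [](const PositionedWord& a, const PositionedWord& b) { return a.x < b.x; });
        std::string line;
        for (const PositionedWord& w : row) {
            if (!line.empty()) {
                line += ' ';
            }
            line += w.text;
        }
        lines.push_back(line);
    }
    return lines;
}
```

With the words "world" (60, 700), "second" (10, 680), "line" (50, 680) and a late-inserted "Hello" (10, 700) passed in that order, the sketch returns the rows "Hello world" and "second line". A fixed tolerance is a simplification: pages with mixed font sizes or rotated text would need a more careful grouping rule.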

BTW: A small warning... Don't mix DA-functions with
non-DA-functions - this won't work ;-)

Cheers and welcome here,
Ingo

Author: Paddy
Subject: 1667
Posted: 03 Dec 10 at 8:16PM

Are you using the DLL edition or the ActiveX edition? And also, does your PDF contain any Unicode characters?

Author: dpreznik
Subject: 1667
Posted: 03 Dec 10 at 5:53PM

Dear experts,

I am trying to create an application in C# to extract text from a PDF. I am using the DAExtractPageText() method, but the text it returns is distorted: some characters are missing, and blank spaces are inserted here and there within words.

Could you please tell me if it is possible to fix it?

Thank you very much,

Dmitriy
