Debenu Quick PDF Library - PDF SDK Community Forum Homepage

extract text

kavaler (Beginner, Baku), posted 27 Jan 10 at 2:08PM:
Hello,
I scanned a document and saved it as doc1.pdf.
How can I extract the text from doc1.pdf in Delphi?

JanN (Senior Member, Germany), posted 27 Jan 10 at 2:31PM:
Hi,

QuickPDF cannot extract text from image-only PDF files. For that you will need a dedicated OCR tool such as OmniPage or ABBYY.

kavaler (Beginner, Baku), posted 27 Jan 10 at 4:02PM:
Can you tell me what I should do to achieve this?

JanN (Senior Member, Germany), posted 27 Jan 10 at 4:13PM:
If I were you, I would buy OmniPage Professional (a version is available for around $100 online). It can recognize the text in scanned documents and convert them to searchable PDF files or to text files.

kavaler (Beginner, Baku), posted 27 Jan 10 at 4:21PM:
How can I use it (OmniPage Professional) from Delphi?

JanN (Senior Member, Germany), posted 27 Jan 10 at 4:33PM:
Google is your friend... ;)

OmniPage Professional is a standalone product. You can configure it to pick up files from a specified folder and write the converted files to another folder. Then you can work with the resulting files in Delphi.
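
To illustrate the last step, here is a minimal sketch of how a Delphi program might read the converted text files once the OCR tool has written them. The folder name `C:\OCR\out\` and the assumption that the output consists of `.txt` files are hypothetical; adjust both to match however the OCR tool is actually configured.

```pascal
program ReadOcrOutput;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

const
  // Hypothetical folder that the OCR tool writes its converted .txt files to
  OutputDir = 'C:\OCR\out\';

var
  SR: TSearchRec;
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    // Enumerate every .txt file in the output folder
    if FindFirst(OutputDir + '*.txt', faAnyFile, SR) = 0 then
    try
      repeat
        // Load one converted document; Lines.Text then holds its recognized text
        Lines.LoadFromFile(OutputDir + SR.Name);
        Writeln(SR.Name, ': ', Lines.Count, ' line(s)');
      until FindNext(SR) <> 0;
    finally
      FindClose(SR);
    end;
  finally
    Lines.Free;
  end;
end.
```

This only reads files that already exist in the folder; it does not drive OmniPage itself, which keeps running as a separate application.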

Shotgun Tom (Senior Member, Phoenix, AZ), posted 27 Jan 10 at 5:42PM:
Be aware that standalone products like OmniPage are not redistributable. If you are doing this only for your own use, they will work fine.

If, however, you are creating a program for others, you'll need to obtain an SDK that may be distributed with your product. The SDKs are quite a bit more expensive: you would be looking at prices between $400 and $3000.
 
Tom

Ingo (Moderator Group), posted 27 Jan 10 at 9:04PM:
Hi Tom!

There's a better (less expensive) method...
The Delphi solution from kavaler plus an additional OmniPage Pro ;-)

Cheers, Ingo

dsola (Team Player, Croatia), posted 05 Feb 10 at 12:50PM:
Hi,
Try this
   http://jocr.sourceforge.net/index.html

It's free, and the results were satisfactory for me.
If the document is scanned well and uses "normal" fonts, this may be enough.

If you need a working example, just ask.
registered QuickPDF user

kavaler (Beginner, Baku), posted 06 Feb 10 at 9:48AM:
Hello,
Can you tell me how I can use it in Delphi?

dsola (Team Player, Croatia), posted 08 Feb 10 at 8:23AM:
Hi,
I'll post the Delphi code shortly, but for now try this.

This is the content of the test.cmd file:

rem begin
 djpeg -grayscale -pnm YourPictureName.jpg YourPictureName.pnm
 gocr -i YourPictureName.pnm -o YourExtractedText.txt
rem end

test.cmd, djpeg.exe, gocr.exe and YourPictureName.jpg must all be in the same directory.

With this you can test whether this method meets your needs.

dsola (Team Player, Croatia), posted 09 Feb 10 at 7:13AM:
Here is part of the Delphi code:

// ttt.pbm - image for OCR (a JPG can be converted to PBM with TFreeBitmap or with djpeg.exe)
// ttt.txt - result of OCR
procedure TOIBCaptchaKiller.Do_OCR;
var
  StartupInfo: TStartupInfo;
  ProcessInfo: TProcessInformation;
  CmdLine: string;
  ExitCode: DWORD;
begin
  FillChar(StartupInfo, SizeOf(StartupInfo), 0);
  StartupInfo.cb := SizeOf(StartupInfo);
  StartupInfo.dwFlags := STARTF_USESHOWWINDOW; // without this flag wShowWindow is ignored
  StartupInfo.wShowWindow := SW_SHOWNORMAL;    // or SW_HIDE
  // CreateProcess may modify the command line, so pass a writable string, not a literal
  CmdLine := 'gocr -i ttt.pbm -o ttt.txt';
  UniqueString(CmdLine);
  if CreateProcess(nil, PChar(CmdLine), nil, nil, False, 0, nil, nil, StartupInfo, ProcessInfo) then
  begin
    GetExitCodeProcess(ProcessInfo.hProcess, ExitCode);
    // stay in this loop until gocr has finished, keeping the UI responsive
    while ExitCode = STILL_ACTIVE do
    begin
      Sleep(100);
      Application.ProcessMessages;
      GetExitCodeProcess(ProcessInfo.hProcess, ExitCode);
    end;
    CloseHandle(ProcessInfo.hProcess);
    CloseHandle(ProcessInfo.hThread);
  end;
end;
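
Once gocr has finished, the recognized text still has to be read back from ttt.txt. A minimal sketch of that step, assuming the file is plain text in the current directory; the helper name `LoadOcrResult` is made up for this illustration and is not part of the code above or of any library:

```pascal
program ShowOcrResult;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: returns the text written by gocr,
// or an empty string if the result file does not exist.
function LoadOcrResult(const FileName: string): string;
var
  Lines: TStringList;
begin
  Result := '';
  if not FileExists(FileName) then
    Exit;
  Lines := TStringList.Create;
  try
    Lines.LoadFromFile(FileName);
    Result := Lines.Text;
  finally
    Lines.Free;
  end;
end;

begin
  // Print whatever the OCR run produced
  Writeln(LoadOcrResult('ttt.txt'));
end.
```

In a form-based application the same function could be called right after Do_OCR returns, for example to fill a memo control with the result.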

kavaler (Beginner, Baku), posted 11 Feb 10 at 10:12AM:
Hello,
I can't understand how to use this in Delphi.
Please send me example(s) of this to my e-mail address:
islam261@gmail.com

Copyright © 2017 Debenu. Debenu Quick PDF Library is a PDF SDK. All rights reserved.