
DAGetImageListCount does not work

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: General Discussion
Forum Description: Discussion board for Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=3432
Printed Date: 22 Nov 24 at 6:33PM
Software Version: Web Wiz Forums 11.01 - http://www.webwizforums.com


Topic: DAGetImageListCount does not work
Posted By: skoschke
Subject: DAGetImageListCount does not work
Date Posted: 11 Jan 17 at 12:36PM
If I open various PDF files and call QP.DAGetImageListCount to get the images in them, it reports no images.

If I open the same file in any PDF viewer, I can see a background image.

What can I do to extract the images?

Second problem:

If I use QP.ExtractPageTextBlocks(3) I get text from the PDF file, but in some cases the text is wrong, for example "PhoNe nuMber" instead of "Phone number", which is what I see in any PDF viewer.
 
Stefan



Replies:
Posted By: tfrost
Date Posted: 11 Jan 17 at 12:59PM
I suggest you show your actual code, so that people can first eliminate obvious problems.


Posted By: skoschke
Date Posted: 11 Jan 17 at 1:32PM
Code with DirectAccess:

procedure TForm1.ReadImages(fn: string; var l: TStringList);
var
  i: integer;
  imagecount: integer;
  imageid: integer;
  x: single;
  y: single;
  b: single;
  h: single;
  IL: integer;
  PR: integer;
  p: integer;
  ImageData: ansistring; // AnsiString because the data is written to a stream
  seitenhoehe: single;
  FH: integer;
begin
  // Open with a file handle for direct access
  FH := PDFLibrary.DAOpenFile(fn, '');
  // Free the old streams
  for i := 1 to High(streamarray) do
    streamarray[i].Free;
  setlength(streamarray, 1);
  // The file was opened with DAOpenFile, so take the page count from the file handle
  for p := 1 to PDFLibrary.DAGetPageCount(FH) do
  begin
    PR := PDFLibrary.DAFindPage(FH, p);
    IL := PDFLibrary.DAGetPageImageList(FH, PR);
    seitenhoehe := PDFLibrary.DAGetPageHeight(FH, PR);
    for i := 1 to PDFLibrary.DAGetImageListCount(FH, IL) do
    begin
      // Read the image data
      ImageData := PDFLibrary.DAGetImageDataToString(FH, IL, i);
      // Determine the location and size of the image on the page
      x := PDFLibrary.DAGetImageDblProperty(FH, IL, i, 501);
      y := seitenhoehe - PDFLibrary.DAGetImageDblProperty(FH, IL, i, 502);
      b := PDFLibrary.DAGetImageDblProperty(FH, IL, i, 503) -
        PDFLibrary.DAGetImageDblProperty(FH, IL, i, 501);
      h := PDFLibrary.DAGetImageDblProperty(FH, IL, i, 502) -
        PDFLibrary.DAGetImageDblProperty(FH, IL, i, 508);
      // For each image found, grow the stream array and store the data
      setlength(streamarray, length(streamarray) + 1);
      streamarray[high(streamarray)] := TMemorystream.Create;
      streamarray[high(streamarray)].Position := 0;
      streamarray[high(streamarray)].WriteBuffer(Pointer(ImageData)^,
        length(ImageData));
      // Add the entry to the list
      l.Add('P:' + p.ToString);
      l.Add('{' + 'Image' + high(streamarray).ToString + '}');
      l.Add('X:' + x.ToString);
      l.Add('Y:' + y.ToString);
      l.Add('B:' + b.ToString);
      l.Add('H:' + h.ToString);
    end; // End image loop
  end;
  PDFLibrary.DACloseFile(FH);
end;

and with normal Access:

procedure TForm1.ReadImages2();
var
  i, k: integer;
  IL: integer;
  ic: integer;
  it: integer;
  gid: integer;
  filename : string;
begin
  for i := 1 to PDFLibrary.PageCount do
  begin
    // Select current page
    PDFLibrary.SelectPage(i);
    // Get list of images on the page
    IL := PDFLibrary.GetPageImageList(0);
    // Count number of images in the list
    ic := PDFLibrary.GetImageListCount(IL);
    for k := 1 to ic do
    begin
      // Iterate through each image and get the
      // image type and image ID
      it := PDFLibrary.GetImageListItemIntProperty(IL, k, 400);
      gid := PDFLibrary.GetImageListItemIntProperty(IL, k, 405);
      // Choose the appropriate file extension based on
      // the returned image type
      case it of
        1:
          filename := 'c:\temp\image-' + gid.ToString + '-' + k.ToString + '.jpg';
        2:
          filename := 'c:\temp\image-' + gid.ToString + '-' + k.ToString + '.bmp';
        3:
          filename := 'c:\temp\image-' + gid.ToString + '-' + k.ToString + '.tif';
        4:
          filename := 'c:\temp\image-' + gid.ToString + '-' + k.ToString + '.png';
      end;
      // Save the selected image to disk
      PDFLibrary.SaveImageListItemDataToFile(IL, k, 0, filename);
    end;
  end
end;

The result of DAGetImageListCount in example 1 and GetImageListCount in example 2 is 0.
Neither version returns any images for some PDF files, while for other PDF files the images are returned correctly!

Is it possible that the images I see are watermarks or stamps?
How can I read those out?

Stefan


Posted By: Ingo
Date Posted: 11 Jan 17 at 8:39PM
Hi Stefan,

only embedded images (inserted as files) can be extracted from a PDF.
It's similar to a Word document:
you can paste a screenshot from the clipboard or you can insert an image file, and those are two different things.
Images that are not really embedded/inserted can't be extracted using QuickPDF functionality.

Your second issue probably has to do with an unusual font or something similar, so QuickPDF has some trouble recognizing the text.
Please keep in mind that QuickPDF is still a lightweight PDF library compared with, for example, an Adobe installation that comes in at around 100 MB ;-)
It would help if you uploaded the PDF file somewhere, so we could run tests with our own routines...

Cheers and welcome here,
Ingo






Posted By: skoschke
Date Posted: 12 Jan 17 at 3:02PM
Hi Ingo,

thank you for the welcome, I understand now and will go another way in this case:

I will try RenderPageToStream, so that I get the full page as the background for my new PDF file...

Stefan


Posted By: Ingo
Date Posted: 13 Jan 17 at 10:46AM
Hi Stefan,

yes... RenderPage is the only way out of your issue ;-)
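
For readers who land here later, the rendering workaround could look roughly like the sketch below. It is not code from this thread: the method name RenderPageAsPng is made up, PDFLibrary is assumed to be the same library instance used in the examples above with the document already loaded, and both the Options value 5 (assumed to mean PNG output) and the meaning of a 0 return value (assumed to mean failure) should be checked against the function reference for your library version.

```pascal
// Hypothetical helper, assumed to be declared in TForm1 like the
// methods above. Requires System.Classes and System.SysUtils.
// Renders one page of the loaded document into a memory stream.
// The caller owns the returned stream and must free it.
function TForm1.RenderPageAsPng(PageNo: Integer; DPI: Double): TMemoryStream;
begin
  Result := TMemoryStream.Create;
  try
    // Assumption: Options = 5 selects PNG, and a return value of 0
    // means the page could not be rendered.
    if PDFLibrary.RenderPageToStream(DPI, PageNo, 5, Result) = 0 then
      raise Exception.CreateFmt('Could not render page %d', [PageNo]);
    // Rewind so the caller can read the image from the start
    Result.Position := 0;
  except
    // Do not leak the stream if rendering fails
    Result.Free;
    raise;
  end;
end;
```

The rendered page is a flat bitmap, so text and vector content in the background are no longer selectable or scalable. A higher DPI gives a sharper background at the cost of a larger output file.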



-------------
Cheers,
Ingo



