
Search text and get the bound boxes

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: Sample Code
Forum Description: Share Debenu Quick PDF Library sample code with other forum members
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=3293
Printed Date: 29 Mar 24 at 11:43AM
Software Version: Web Wiz Forums 11.01 - http://www.webwizforums.com


Topic: Search text and get the bound boxes
Posted By: Ingo
Subject: Search text and get the bound boxes
Date Posted: 27 Mar 16 at 1:41PM
from our user Pinozzy:
Thanks for sharing!

I'm using the Viewer SDK (with C#) to search for some text in a document. 
I need to locate that text and get the coordinates of each individual result. 

My current approach is this: 

SearchPDFText("my string");

What I get back is the number of occurrences. 
Now I need the bounding rectangles of these occurrences. 

How can I do that?

. . .

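This is how I solved my problem. 
This method matches a regular expression against each text block on a page and returns the text, bounding rectangle and page number of every block that matches.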

Bye!

----

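using System.Collections.Generic;
using System.Drawing;
using System.Text.RegularExpressions;

public struct FindResult
{
  public string Text { get; set; }
  public RectangleF Rectangle { get; set; }
  public int Page { get; set; }
}

// Member of the viewer class that exposes Document and Pages.
// pageIndex is 1-based; pass 0 to search every page.
public override List<FindResult> SearchPattern(int pageIndex, string pattern)
{
  var retVal = new List<FindResult>();
  var dpl = Document.DPL;
  dpl.SetTextExtractionWordGap(1);
  dpl.SetTextExtractionOptions(3, 0);
  var regex = new Regex(pattern);
  for (var i = 0; i < Pages; i++)
  {
    if (pageIndex > 0 && (pageIndex - 1) != i) continue;
    // Select the page before extracting its blocks: ExtractPageTextBlocks
    // works on the selected page, and SelectPage takes a 1-based page number.
    dpl.SelectPage(i + 1);
    var id = dpl.ExtractPageTextBlocks(4);
    var count = dpl.GetTextBlockCount(id);
    for (var f = 1; f <= count; f++)
    {
      var text = dpl.GetTextBlockText(id, f);
      var match = regex.Match(text);
      if (!match.Success) continue;
      // Bounds 7 and 8 give x and y; width and height are differences of the
      // other corner values. The sign of the height depends on which
      // coordinate origin is in use.
      var x = (float)dpl.GetTextBlockBound(id, f, 7);
      var y = (float)dpl.GetTextBlockBound(id, f, 8);
      var width = (float)dpl.GetTextBlockBound(id, f, 5) - x;
      var height = (float)dpl.GetTextBlockBound(id, f, 6) - (float)dpl.GetTextBlockBound(id, f, 4);
      retVal.Add(new FindResult
      {
        Rectangle = new RectangleF(x, y, width, height),
        Page = i + 1,
        Text = text
      });
    }
    dpl.ReleaseTextBlocks(id);
  }
  return retVal;
}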
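
A possible refinement, not part of Pinozzy's post: the rectangle above is built from differences between corner values, so its height can come out negative depending on where the page origin sits. The sketch below normalises four corners into a rectangle with non-negative width and height. BoundsHelper and FromCorners are hypothetical names made up for this example; only PointF, RectangleF and Math come from .NET.

```csharp
using System;
using System.Drawing;

// Hypothetical helper, not part of Debenu Quick PDF Library.
public static class BoundsHelper
{
    // Builds an axis-aligned rectangle from corner points. Taking the
    // min/max of the coordinates keeps Width and Height non-negative
    // whether the origin is at the bottom-left or the top-left.
    public static RectangleF FromCorners(params PointF[] corners)
    {
        if (corners == null || corners.Length == 0)
            throw new ArgumentException("At least one corner is required.", nameof(corners));

        float minX = corners[0].X, maxX = corners[0].X;
        float minY = corners[0].Y, maxY = corners[0].Y;
        foreach (var c in corners)
        {
            minX = Math.Min(minX, c.X);
            maxX = Math.Max(maxX, c.X);
            minY = Math.Min(minY, c.Y);
            maxY = Math.Max(maxY, c.Y);
        }
        return RectangleF.FromLTRB(minX, minY, maxX, maxY);
    }
}
```

For example, the corners (72, 700), (200, 700), (200, 688) and (72, 688) give a rectangle with X = 72, Y = 688, Width = 128 and Height = 12. To use it with the method above you would pass the values from GetTextBlockBound as x/y pairs, assuming the eight bound indices describe four corners in that way.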



-------------
Cheers,
Ingo



