Sample Code - C# - Extract pages based on a keyword match

C# - Extract pages based on a keyword match

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: Sample Code
Forum Description: Share Debenu Quick PDF Library sample code with other forum members
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=2325
Printed Date: 15 Dec 25 at 11:14AM

Topic: C# - Extract pages based on a keyword match

Posted By: AndrewC
Subject: C# - Extract pages based on a keyword match
Date Posted: 02 Jul 12 at 7:03AM

This code iterates through every page in a PDF file. If the text extracted from a page contains the keyword, the page number is added to a list. Once all pages have been checked, the matching pages are extracted into a new document.

Of course, you can make the matching more complex to suit your needs.

Andrew.

string keyword = "garden";
string extractPages = "";
int foundCount = 0;

QP.LoadFromFile("originalfile.pdf", "");

// Iterate through each page in the document
for (int page = 1; page <= QP.PageCount(); page++)
{
    // Look for pages that match
    QP.SelectPage(page);
    string TextContent = QP.GetPageText(0); // Can also use option 8.

    if (TextContent.Contains(keyword)) // We found a page
    {
        // Build a comma-separated list of matching page numbers
        if (foundCount != 0)
            extractPages = extractPages + ",";

        extractPages = extractPages + page.ToString();
        foundCount++;
    }
}

// Once every page has been checked, extract the matching pages
if (foundCount > 0)
{
    QP.ExtractPageRanges(extractPages);
    QP.SaveToFile("out.pdf");
}
else
    MessageBox.Show("Keyword not found");

QP.RemoveDocument(QP.SelectedDocument());
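
As one example of more complex matching: string.Contains is case-sensitive and also matches inside longer words, so "garden" would miss "Garden" but hit "gardening". The helper below is only a sketch of a case-insensitive, whole-word test using the standard .NET Regex class. The class and method names (KeywordMatcher, ContainsWholeWord) are made up for illustration and are not part of Quick PDF Library. It assumes the keyword starts and ends with a letter or digit, because the \b word boundary behaves differently next to punctuation.

```csharp
using System;
using System.Text.RegularExpressions;

static class KeywordMatcher
{
    // Hypothetical helper (not a Quick PDF Library function).
    // Returns true when 'text' contains 'keyword' as a whole word, ignoring case.
    public static bool ContainsWholeWord(string text, string keyword)
    {
        // Treat missing text or an empty keyword as "no match"
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(keyword))
            return false;

        // Regex.Escape keeps characters such as '.' or '+' in the keyword literal
        string pattern = @"\b" + Regex.Escape(keyword) + @"\b";
        return Regex.IsMatch(text, pattern, RegexOptions.IgnoreCase);
    }
}
```

To try it, the test inside the loop could become: if (KeywordMatcher.ContainsWholeWord(TextContent, keyword))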