Extract Text from PowerPoint Presentation in C#, VB.NET
PowerPoint presentations are a common part of everyday work, but sometimes the text they contain is needed in another format. For example, you may want to extract the text from a presentation into a plain-text file that can be opened in WordPad or Microsoft Word and that is much smaller than the original file. Spire.Presentation for .NET provides the slide.GetAllTextFrame() method, which extracts text from tables, text boxes, shapes, shape groups, and symbols, so you can also extract the text of a whole presentation. This article shows how to extract text from a PowerPoint presentation in C# and VB.NET in the following two parts.
Extract Text from PowerPoint Presentation to WordPad
Extract Text from Whole PowerPoint Presentation
Install Spire.Presentation for .NET
To begin with, you need to add the DLL files included in the Spire.Presentation for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.
PM> Install-Package Spire.Presentation
Extract Text from PowerPoint Presentation to WordPad
The following are the steps to perform this operation:
1. Create a new instance of Presentation and load the sample PowerPoint file.
2. Initialize a new instance of the StringBuilder class and append the text extracted from the presentation to it.
3. Create a new .txt file and write the collected text to it.
Full code:
[C#]
using Spire.Presentation;
using System.Diagnostics;
using System.IO;
using System.Text;

namespace ExtractText
{
    class Program
    {
        static void Main(string[] args)
        {
            // Load the sample PowerPoint file
            Presentation presentation = new Presentation("sample.pptx", FileFormat.Pptx2010);
            StringBuilder sb = new StringBuilder();

            // Append the text of every paragraph in every auto shape
            foreach (ISlide slide in presentation.Slides)
            {
                foreach (IShape shape in slide.Shapes)
                {
                    if (shape is IAutoShape autoShape && autoShape.TextFrame != null)
                    {
                        foreach (TextParagraph tp in autoShape.TextFrame.Paragraphs)
                        {
                            sb.AppendLine(tp.Text);
                        }
                    }
                }
            }

            // Write the text to a .txt file and open it in the default application.
            // UseShellExecute = true is needed on .NET Core and later to open a document.
            File.WriteAllText("target1.txt", sb.ToString());
            Process.Start(new ProcessStartInfo("target1.txt") { UseShellExecute = true });
        }
    }
}
[VB.NET]
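Imports Spire.Presentation
Imports System.Diagnostics
Imports System.IO
Imports System.Text

Namespace ExtractText
    Class Program
        Shared Sub Main(args As String())
            ' Load the sample PowerPoint file
            Dim presentation As New Presentation("sample.pptx", FileFormat.Pptx2010)
            Dim sb As New StringBuilder()

            ' Append the text of every paragraph in every auto shape
            For Each slide As ISlide In presentation.Slides
                For Each shape As IShape In slide.Shapes
                    Dim autoShape As IAutoShape = TryCast(shape, IAutoShape)
                    If autoShape IsNot Nothing AndAlso autoShape.TextFrame IsNot Nothing Then
                        For Each tp As TextParagraph In autoShape.TextFrame.Paragraphs
                            sb.AppendLine(tp.Text)
                        Next
                    End If
                Next
            Next

            ' Write the text to a .txt file and open it in the default application.
            ' UseShellExecute = True is needed on .NET Core and later to open a document.
            File.WriteAllText("target1.txt", sb.ToString())
            Process.Start(New ProcessStartInfo("target1.txt") With {.UseShellExecute = True})
        End Sub
    End Class
End Namespace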
The input PowerPoint document:
The output text file:
Extract Text from Whole PowerPoint Presentation
The following are the steps to perform this operation:
1. Create a new instance of Presentation and load the sample PowerPoint file.
2. Instantiate a StringBuilder object.
3. Call the slide.GetAllTextFrame() method on each slide to get its text content, and append the extracted text to the StringBuilder.
4. Write the extracted text to a .txt file and save it to a local path.
Full code:
[C#]
using Spire.Presentation;
using System;
using System.Collections;
using System.IO;
using System.Text;

namespace ExtractText
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create a Presentation object
            Presentation ppt = new Presentation();
            // Load the PPT document
            ppt.LoadFromFile("Blue2.pptx", FileFormat.Pptx2010);
            // Instantiate a StringBuilder object
            StringBuilder sb = new StringBuilder();

            foreach (ISlide slide in ppt.Slides)
            {
                // Get all text on the current slide
                ArrayList arrayList = slide.GetAllTextFrame();
                foreach (string text in arrayList)
                {
                    Console.WriteLine(text);
                    sb.AppendLine(text);
                }
            }

            // Write the extracted text to a .txt file and save it to a local path
            File.WriteAllText("target.txt", sb.ToString());
        }
    }
}
[VB.NET]
Imports Spire.Presentation
Imports System
Imports System.Collections
Imports System.IO
Imports System.Text

Namespace ExtractText
    Class Program
        Shared Sub Main(ByVal args() As String)
            ' Create a Presentation object
            Dim ppt As Presentation = New Presentation()
            ' Load the PPT document
            ppt.LoadFromFile("Blue2.pptx", FileFormat.Pptx2010)
            ' Instantiate a StringBuilder object
            Dim sb As StringBuilder = New StringBuilder()

            For Each slide As ISlide In ppt.Slides
                ' Get all text on the current slide
                Dim arrayList As ArrayList = slide.GetAllTextFrame()
                For Each text As String In arrayList
                    Console.WriteLine(text)
                    sb.AppendLine(text)
                Next
            Next

            ' Write the extracted text to a .txt file and save it to a local path
            File.WriteAllText("target.txt", sb.ToString())
        End Sub
    End Class
End Namespace
The input PowerPoint document:
The output text file:
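A note on the inner loop in the listing above: it iterates a non-generic ArrayList as strings, which throws an InvalidCastException if any element is not a string. The article does not say whether GetAllTextFrame() can return such elements, so the following is only a defensive sketch under that assumption. ExtractedText and JoinLines are hypothetical names introduced for illustration and are not part of Spire.Presentation; the helper uses only the .NET base class library.

```csharp
using System;
using System.Collections;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration; not part of Spire.Presentation.
public static class ExtractedText
{
    // Joins the string items of a non-generic collection, one per line.
    // Non-string items, nulls, and blank strings are skipped instead of throwing.
    public static string JoinLines(IEnumerable items)
    {
        if (items == null) throw new ArgumentNullException(nameof(items));

        StringBuilder sb = new StringBuilder();
        // OfType<string>() filters out nulls and anything that is not a string
        foreach (string text in items.OfType<string>())
        {
            if (!string.IsNullOrWhiteSpace(text))
            {
                sb.AppendLine(text.Trim());
            }
        }
        return sb.ToString();
    }
}
```

Inside the slide loop, the helper could stand in for the inner foreach, for example sb.Append(ExtractedText.JoinLines(slide.GetAllTextFrame())). Trimming and skipping blank entries is a design choice that keeps the output file compact; drop it if the original whitespace matters.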
Conclusion:
This article showed how to extract text from a PowerPoint presentation. Other articles cover related tasks, such as Extract Text from a Specific Rectangular Area, Extract Image From PDF, and Extract Comments from Word Document and Save in TXT File. If you'd like to learn more, you can visit the Spire.Doc Program Guide Content for .NET to explore more about Spire.Doc for .NET.