Package org.apache.poi.hwpf
Class HWPFDocumentCore
- java.lang.Object
  - org.apache.poi.POIDocument
    - org.apache.poi.hwpf.HWPFDocumentCore
- All Implemented Interfaces: java.io.Closeable, java.lang.AutoCloseable
- Direct Known Subclasses: HWPFDocument, HWPFOldDocument
public abstract class HWPFDocumentCore extends POIDocument
This class holds much of the core of a Word document, but without some of the table structure information. You generally want to work with one of HWPFDocument or HWPFOldDocument.
-
-
Field Summary
Fields (modifier and type, field, description):
- protected CHPBinTable _cbt: Contains formatting properties for text
- protected FileInformationBlock _fib: The FIB
- protected FontTable _ft: Holds fonts for this document.
- protected ListTables _lt: Holds list tables
- protected byte[] _mainStream: main document stream buffer
- protected ObjectPoolImpl _objectPool: Holds OLE2 objects
- protected PAPBinTable _pbt: Contains formatting properties for paragraphs
- protected StyleSheet _ss: Holds styles for this document.
- protected SectionTable _st: Contains formatting properties for sections.
- protected static int FIB_BASE_LEN: Size of the unencrypted part of the FIB
- protected static int RC4_REKEYING_INTERVAL: [MS-DOC] 2.2.6.2/3 Office Binary Document ...
- protected static java.lang.String STREAM_OBJECT_POOL
- protected static java.lang.String STREAM_TABLE_0
- protected static java.lang.String STREAM_TABLE_1
- protected static java.lang.String STREAM_WORD_DOCUMENT
-
Constructor Summary
Constructors (modifier, constructor, description):
- protected HWPFDocumentCore()
- HWPFDocumentCore(java.io.InputStream istream): This constructor loads a Word document from an InputStream.
- HWPFDocumentCore(DirectoryNode directory): This constructor loads a Word document from a specific point in a POIFSFileSystem, probably not the default.
- HWPFDocumentCore(POIFSFileSystem pfilesystem): This constructor loads a Word document from a POIFSFileSystem
-
Method Summary
Methods (modifier and type, method, description):
- CHPBinTable getCharacterTable()
- protected byte[] getDocumentEntryBytes(java.lang.String name, int encryptionOffset, int len): Reads OLE Stream into byte array - if an EncryptionInfo is available, decrypt the bytes starting at encryptionOffset.
- java.lang.String getDocumentText(): Returns document text, i.e. text information from all text pieces, including OLE descriptions and field codes
- EncryptionInfo getEncryptionInfo()
- FileInformationBlock getFileInformationBlock()
- FontTable getFontTable()
- ListTables getListTables()
- byte[] getMainStream()
- ObjectsPool getObjectsPool()
- abstract Range getOverallRange(): Returns the range that covers all text in the file, including main text, footnotes, headers and comments
- PAPBinTable getParagraphTable()
- abstract Range getRange(): Returns the range which covers the whole of the document, but excludes any headers and footers.
- SectionTable getSectionTable()
- StyleSheet getStyleSheet()
- abstract java.lang.StringBuilder getText(): Internal method to access document text
- abstract TextPieceTable getTextTable()
- protected void updateEncryptionInfo()
- static POIFSFileSystem verifyAndBuildPOIFS(java.io.InputStream istream): Takes an InputStream, verifies that it's not RTF or PDF, builds a POIFSFileSystem from it, and returns that.
-
Methods inherited from class org.apache.poi.POIDocument
clearDirectory, close, createInformationProperties, getDirectory, getDocumentSummaryInformation, getEncryptedPropertyStreamName, getPropertySet, getPropertySet, getSummaryInformation, initDirectory, readProperties, replaceDirectory, validateInPlaceWritePossible, write, write, write, writeProperties, writeProperties, writeProperties
-
-
-
-
Field Detail
-
STREAM_OBJECT_POOL
protected static final java.lang.String STREAM_OBJECT_POOL
- See Also:
- Constant Field Values
-
STREAM_WORD_DOCUMENT
protected static final java.lang.String STREAM_WORD_DOCUMENT
- See Also:
- Constant Field Values
-
STREAM_TABLE_0
protected static final java.lang.String STREAM_TABLE_0
- See Also:
- Constant Field Values
-
STREAM_TABLE_1
protected static final java.lang.String STREAM_TABLE_1
- See Also:
- Constant Field Values
-
FIB_BASE_LEN
protected static final int FIB_BASE_LEN
Size of the unencrypted part of the FIB
- See Also:
- Constant Field Values
-
RC4_REKEYING_INTERVAL
protected static final int RC4_REKEYING_INTERVAL
[MS-DOC] 2.2.6.2/3 Office Binary Document ... Encryption: "... The block number MUST be set to zero at the beginning of the stream and MUST be incremented at each 512 byte boundary. ..."
- See Also:
- Constant Field Values
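The quoted rule implies that the block number for any stream position is the position divided by 512, using integer division. The following is a minimal sketch of that arithmetic only. It assumes the interval is 512 bytes, as the quote states; the class and method names are hypothetical and not part of POI.

```java
// Hypothetical helper illustrating the quoted rule; not part of Apache POI.
final class RekeyingSketch {
    // Assumption taken from the [MS-DOC] quote: a new block starts at every 512-byte boundary.
    static final int INTERVAL = 512;

    /** Returns the block number for a zero-based stream offset. */
    static long blockNumber(long offset) {
        if (offset < 0) {
            throw new IllegalArgumentException("offset must be >= 0");
        }
        return offset / INTERVAL;
    }

    /** Returns how many bytes can still be processed before the block number changes. */
    static int bytesUntilRekey(long offset) {
        if (offset < 0) {
            throw new IllegalArgumentException("offset must be >= 0");
        }
        return INTERVAL - (int) (offset % INTERVAL);
    }

    public static void main(String[] args) {
        // Offsets 0..511 fall in block 0; offset 512 starts block 1.
        for (long offset : new long[] {0, 511, 512, 1025}) {
            System.out.println(offset + " -> block " + blockNumber(offset)
                    + ", bytes until rekey: " + bytesUntilRekey(offset));
        }
    }
}
```

A cipher that honours this rule would derive a fresh key from the block number each time `bytesUntilRekey` reaches the boundary; that key derivation is outside the scope of this sketch.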
-
_objectPool
protected ObjectPoolImpl _objectPool
Holds OLE2 objects
-
_fib
protected FileInformationBlock _fib
The FIB
-
_ss
protected StyleSheet _ss
Holds styles for this document.
-
_cbt
protected CHPBinTable _cbt
Contains formatting properties for text
-
_pbt
protected PAPBinTable _pbt
Contains formatting properties for paragraphs
-
_st
protected SectionTable _st
Contains formatting properties for sections.
-
_ft
protected FontTable _ft
Holds fonts for this document.
-
_lt
protected ListTables _lt
Holds list tables
-
_mainStream
protected byte[] _mainStream
main document stream buffer
-
-
Constructor Detail
-
HWPFDocumentCore
protected HWPFDocumentCore()
-
HWPFDocumentCore
public HWPFDocumentCore(java.io.InputStream istream) throws java.io.IOException
This constructor loads a Word document from an InputStream.
- Parameters:
istream - The InputStream that contains the Word document.
- Throws:
java.io.IOException - If there is an unexpected IOException from the passed in InputStream.
-
HWPFDocumentCore
public HWPFDocumentCore(POIFSFileSystem pfilesystem) throws java.io.IOException
This constructor loads a Word document from a POIFSFileSystem.
- Parameters:
pfilesystem - The POIFSFileSystem that contains the Word document.
- Throws:
java.io.IOException - If there is an unexpected IOException from the passed in POIFSFileSystem.
-
HWPFDocumentCore
public HWPFDocumentCore(DirectoryNode directory) throws java.io.IOException
This constructor loads a Word document from a specific point in a POIFSFileSystem, probably not the default. Typically used to open embedded documents.
- Parameters:
directory - The DirectoryNode that contains the Word document.
- Throws:
java.io.IOException - If there is an unexpected IOException from the passed in POIFSFileSystem.
-
-
Method Detail
-
verifyAndBuildPOIFS
public static POIFSFileSystem verifyAndBuildPOIFS(java.io.InputStream istream) throws java.io.IOException
Takes an InputStream, verifies that it's not RTF or PDF, builds a POIFSFileSystem from it, and returns that.
- Throws:
java.io.IOException
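The "not RTF or PDF" check can be pictured as a look at the first few bytes of the stream: RTF files begin with `{\rtf` and PDF files begin with `%PDF`. The sketch below uses only JDK classes and is not POI's implementation. The class name, method names, and the choice of IllegalArgumentException are assumptions made for illustration.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a leading-bytes check; not part of Apache POI.
final class MagicCheckSketch {
    private static final byte[] RTF = "{\\rtf".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] PDF = "%PDF".getBytes(StandardCharsets.US_ASCII);

    /** Classifies a file header as "RTF", "PDF", or "OTHER". */
    static String classify(byte[] head) {
        if (startsWith(head, RTF)) return "RTF";
        if (startsWith(head, PDF)) return "PDF";
        return "OTHER";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    /** Peeks at the header, rejects RTF/PDF, and returns a stream rewound to the start. */
    static InputStream rejectRtfOrPdf(InputStream in) throws IOException {
        InputStream buffered = in.markSupported() ? in : new BufferedInputStream(in);
        byte[] head = new byte[8];
        buffered.mark(head.length);
        int n = 0;
        while (n < head.length) {
            int r = buffered.read(head, n, head.length - n);
            if (r < 0) break; // stream shorter than the header buffer
            n += r;
        }
        buffered.reset(); // the caller sees the stream from its first byte
        String kind = classify(java.util.Arrays.copyOf(head, n));
        if (!kind.equals("OTHER")) {
            // Exception type is an assumption for this sketch.
            throw new IllegalArgumentException("The document is really a " + kind + " file");
        }
        return buffered;
    }

    public static void main(String[] args) throws IOException {
        byte[] pdf = "%PDF-1.7".getBytes(StandardCharsets.US_ASCII);
        try {
            rejectRtfOrPdf(new ByteArrayInputStream(pdf));
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Marking and resetting the stream keeps the header bytes available to whatever parses the file afterwards, which matters because the same stream is then used to build the file system.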
-
getRange
public abstract Range getRange()
Returns the range which covers the whole of the document, but excludes any headers and footers.
-
getOverallRange
public abstract Range getOverallRange()
Returns the range that covers all text in the file, including main text, footnotes, headers and comments
-
getDocumentText
public java.lang.String getDocumentText()
Returns document text, i.e. text information from all text pieces, including OLE descriptions and field codes
-
getText
@Internal public abstract java.lang.StringBuilder getText()
Internal method to access document text
-
getCharacterTable
public CHPBinTable getCharacterTable()
-
getParagraphTable
public PAPBinTable getParagraphTable()
-
getSectionTable
public SectionTable getSectionTable()
-
getStyleSheet
public StyleSheet getStyleSheet()
-
getListTables
public ListTables getListTables()
-
getFontTable
public FontTable getFontTable()
-
getFileInformationBlock
public FileInformationBlock getFileInformationBlock()
-
getObjectsPool
public ObjectsPool getObjectsPool()
-
getTextTable
public abstract TextPieceTable getTextTable()
-
getMainStream
@Internal public byte[] getMainStream()
-
getEncryptionInfo
public EncryptionInfo getEncryptionInfo() throws java.io.IOException
- Overrides:
getEncryptionInfo in class POIDocument
- Returns:
the encryption info if the document is encrypted, otherwise null
- Throws:
java.io.IOException - If retrieving the encryption information fails
-
updateEncryptionInfo
protected void updateEncryptionInfo()
-
getDocumentEntryBytes
protected byte[] getDocumentEntryBytes(java.lang.String name, int encryptionOffset, int len) throws java.io.IOException
Reads an OLE stream into a byte array. If an EncryptionInfo is available, the bytes are decrypted starting at encryptionOffset. If encryptionOffset = -1, no decryption is attempted.
- Parameters:
name - the name of the stream
encryptionOffset - the offset from which to start decrypting, use -1 for no decryption
len - length of the bytes to be read, use Integer.MAX_VALUE for all bytes
- Returns:
the read bytes
- Throws:
java.io.IOException - if the stream can't be found
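The two sentinel values (-1 for "no decryption" and Integer.MAX_VALUE for "all bytes") can be illustrated without POI. The sketch below is an assumption-laden stand-in: it works on an in-memory byte array instead of an OLE stream, and a fixed XOR replaces the real cipher purely as a placeholder. All names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical illustration of the parameter conventions; not part of Apache POI.
final class EntryBytesSketch {
    static final int NO_DECRYPTION = -1;

    /**
     * Copies up to len bytes from the source. Bytes before encryptionOffset are
     * returned unchanged; bytes from encryptionOffset onward pass through the
     * placeholder transform.
     */
    static byte[] read(byte[] stream, int encryptionOffset, int len) {
        if (len < 0) {
            throw new IllegalArgumentException("len must be >= 0");
        }
        // Integer.MAX_VALUE (or any len past the end) means "everything available".
        int n = Math.min(len, stream.length);
        byte[] out = Arrays.copyOf(stream, n);
        if (encryptionOffset != NO_DECRYPTION) {
            for (int i = Math.max(0, encryptionOffset); i < n; i++) {
                out[i] ^= 0x5A; // placeholder only; a real document would use its actual cipher
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] src = {1, 2, 3, 4};
        System.out.println(Arrays.toString(read(src, NO_DECRYPTION, Integer.MAX_VALUE)));
        System.out.println(Arrays.toString(read(src, 2, Integer.MAX_VALUE)));
    }
}
```

The offset exists because a stream can begin with a plain-text header followed by encrypted content; FIB_BASE_LEN, described above as the size of the unencrypted part of the FIB, is an example of such a boundary.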
-
-