Class OLE2ScratchpadExtractorFactory


  • public class OLE2ScratchpadExtractorFactory
    extends java.lang.Object
    Holds the Scratchpad-specific logic for OLE2ExtractorFactory and org.apache.poi.extractor.ExtractorFactory. Keeping that logic in this class lets the other two run with no Scratchpad jar (though without the Scratchpad functionality).

    Note: this class should not be used standalone; always use it via the other two classes.

    • Method Summary

      Modifier and Type Method Description
      static POITextExtractor createExtractor(DirectoryNode poifsDir)
      Looks for certain entries in the stream to figure out what format is desired. Note: doesn't check for core-supported formats. Note: doesn't check for OOXML-supported formats.
      static void identifyEmbeddedResources(POIOLE2TextExtractor ext, java.util.List<Entry> dirs, java.util.List<java.io.InputStream> nonPOIFS)
      Identifies the documents embedded in the file (if there are any) and adds them to the supplied lists.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • OLE2ScratchpadExtractorFactory

        public OLE2ScratchpadExtractorFactory()
    • Method Detail

      • createExtractor

        public static POITextExtractor createExtractor(DirectoryNode poifsDir)
                                                throws java.io.IOException
        Looks for certain entries in the stream to figure out what format is desired. Note: doesn't check for core-supported formats. Note: doesn't check for OOXML-supported formats.
        Parameters:
        poifsDir - the directory node to be inspected
        Returns:
        the format-specific text extractor
        Throws:
        java.io.IOException - when the format-specific extraction fails because of invalid entries
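
        The entry-based detection described for createExtractor can be illustrated without POI on the classpath. What follows is a minimal sketch, not POI's implementation: the class EntrySniffer, its detect method, and the marker table are hypothetical, and the entry names ("WordDocument", "PowerPoint Document", "VisioDocument", "Quill") are assumptions about the top-level entries these formats typically carry.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper: maps marker entry names to a format label.
final class EntrySniffer {
    // Insertion-ordered so the checks run in a fixed, predictable order.
    private static final Map<String, String> MARKERS = new LinkedHashMap<>();
    static {
        // Assumed marker entries; verify against the files you handle.
        MARKERS.put("WordDocument", "Word");
        MARKERS.put("PowerPoint Document", "PowerPoint");
        MARKERS.put("VisioDocument", "Visio");
        MARKERS.put("Quill", "Publisher");
    }

    private EntrySniffer() {
    }

    // Returns the label of the first marker found among the top-level
    // entry names, or empty if none of the known markers is present.
    static Optional<String> detect(Set<String> entryNames) {
        for (Map.Entry<String, String> marker : MARKERS.entrySet()) {
            if (entryNames.contains(marker.getKey())) {
                return Optional.of(marker.getValue());
            }
        }
        return Optional.empty();
    }
}
```

        An empty result in this sketch corresponds to a directory that none of the known markers match, for example a core-supported or OOXML format, which the method above does not check for.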
      • identifyEmbeddedResources

        public static void identifyEmbeddedResources(POIOLE2TextExtractor ext,
                                                     java.util.List<Entry> dirs,
                                                     java.util.List<java.io.InputStream> nonPOIFS)
                                              throws java.io.IOException
        Identifies the documents embedded in the file (if there are any). The method returns nothing; its results are added to the two supplied lists. Directory references for embedded documents are added to dirs, and streams that aren't based on POIFS entries are added to nonPOIFS. If there are no embedded documents, nothing is added to either list.
        Parameters:
        ext - the extractor holding the directory to start parsing
        dirs - a list to be filled with directory references holding embedded documents
        nonPOIFS - a list to be filled with streams which aren't based on POIFS entries
        Throws:
        java.io.IOException - when the format-specific extraction fails because of invalid entries
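
        Because identifyEmbeddedResources is declared void, its results reach the caller through the two list arguments. The following is a minimal sketch of that calling convention using hypothetical stand-in types (EmbeddedCollector and its nested Node) rather than POI classes. The rule used here to separate the two kinds of results is a simplification and an assumption; how POI actually decides what counts as embedded is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.List;

// Hypothetical illustration of a void method that fills caller-supplied lists.
final class EmbeddedCollector {

    // Stand-in for an entry: directory-like (children set) or raw bytes (data set).
    static final class Node {
        final String name;
        final List<Node> children; // non-null for directory-like nodes
        final byte[] data;         // non-null for raw stream nodes

        Node(String name, List<Node> children, byte[] data) {
            this.name = name;
            this.children = children;
            this.data = data;
        }

        boolean isDirectory() {
            return children != null;
        }
    }

    private EmbeddedCollector() {
    }

    // Appends directory-like children of root to dirs and wraps every other
    // child in a stream appended to nonPoifs. Neither list is cleared first,
    // so anything the caller already put there is preserved.
    static void identifyEmbedded(Node root, List<Node> dirs, List<InputStream> nonPoifs) {
        if (root.children == null) {
            return; // not a directory: nothing embedded to report
        }
        for (Node child : root.children) {
            if (child.isDirectory()) {
                dirs.add(child);
            } else {
                nonPoifs.add(new ByteArrayInputStream(child.data));
            }
        }
    }
}
```

        A caller in this sketch creates two empty lists, passes them in, and inspects them afterwards; two lists that are still empty mean that no embedded resources were found.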