public class OfficeMetadataExtracter extends TikaPoweredMetadataExtracter
author: -- cm:author
title: -- cm:title
subject: -- cm:description
createDateTime: -- cm:created
lastSaveDateTime: -- cm:modified
comments:
editTime:
format:
keywords:
lastAuthor:
lastPrinted:
osVersion:
thumbnail:
pageCount:
wordCount:

Uses Apache Tika.
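The mapping listed above can be read as a lookup from raw extracted keys to content-model properties. The following is a minimal standalone sketch of that idea, not the Alfresco implementation: the class `OfficeMappingSketch` and the method `applyMapping` are hypothetical names, and the behaviour of skipping keys without a target (such as `wordCount`) is an assumption based on those keys having no mapping in the list.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of the Alfresco API.
final class OfficeMappingSketch {

    // Raw key -> content-model property, copied from the class description.
    static final Map<String, String> DEFAULT_MAPPING = new LinkedHashMap<>();
    static {
        DEFAULT_MAPPING.put("author", "cm:author");
        DEFAULT_MAPPING.put("title", "cm:title");
        DEFAULT_MAPPING.put("subject", "cm:description");
        DEFAULT_MAPPING.put("createDateTime", "cm:created");
        DEFAULT_MAPPING.put("lastSaveDateTime", "cm:modified");
    }

    // Keeps only raw values whose key has a mapping target.
    // Assumption: keys without a target are skipped.
    static Map<String, String> applyMapping(Map<String, String> raw) {
        Map<String, String> mapped = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : raw.entrySet()) {
            String target = DEFAULT_MAPPING.get(entry.getKey());
            if (target != null) {
                mapped.put(target, entry.getValue());
            }
        }
        return mapped;
    }

    public static void main(String[] args) {
        Map<String, String> raw = new LinkedHashMap<>();
        raw.put("author", "Jane Doe");
        raw.put("wordCount", "1200"); // no target in the default list
        System.out.println(applyMapping(raw)); // prints {cm:author=Jane Doe}
    }
}
```

In a real deployment the mapping would come from the extracter's mapping properties (see `setMappingProperties` and `setMapping` in the inherited methods), so keys such as `wordCount` could be given a target there.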
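Nested classes/interfaces inherited from class TikaPoweredMetadataExtracter: TikaPoweredMetadataExtracter.HeadContentHandler, TikaPoweredMetadataExtracter.MapCaptureContentHandler, TikaPoweredMetadataExtracter.NullContentHandler
Nested classes/interfaces inherited from interface MetadataExtracter: MetadataExtracter.OverwritePolicy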
Modifier and Type | Field and Description |
---|---|
static String | KEY_CREATE_DATETIME |
static String | KEY_EDIT_TIME |
static String | KEY_FORMAT |
static String | KEY_KEYWORDS |
static String | KEY_LAST_AUTHOR |
static String | KEY_LAST_PRINTED |
static String | KEY_LAST_SAVE_DATETIME |
static String | KEY_OS_VERSION |
static String | KEY_PAGE_COUNT |
static String | KEY_PARAGRAPH_COUNT |
static String | KEY_THUMBNAIL |
static String | KEY_WORD_COUNT |
static ArrayList<String> | SUPPORTED_MIMETYPES |
documentSelector, KEY_AUTHOR, KEY_COMMENTS, KEY_CREATED, KEY_DESCRIPTION, KEY_SUBJECT, KEY_TAGS, KEY_TITLE, logger
MEGABYTE_SIZE, metadataExtracterConfig, NAMESPACE_PROPERTY_PREFIX, PROPERTY_COMPONENT_EMBED, PROPERTY_COMPONENT_EXTRACT, PROPERTY_PREFIX_METADATA
Constructor and Description |
---|
OfficeMetadataExtracter() |
Modifier and Type | Method and Description |
---|---|
protected Map<String,Serializable> | extractSpecific(org.apache.tika.metadata.Metadata metadata, Map<String,Serializable> properties, Map<String,String> headers) Allows implementation specific mappings to be done. |
protected org.apache.tika.parser.Parser | getParser() Returns the correct Tika Parser to process the document. |
buildParseContext, buildSupportedMimetypes, embedInternal, extractRaw, extractSize, getDocumentSelector, getEmbedder, getExtractorContext, getInputStream, getMetadataSeparator, makeDate, needHeaderContents, setDocumentSelector, setMetadataSeparator
checkIsEmbedSupported, checkIsSupported, embed, extract, extract, extract, filterSystemProperties, getBeanName, getDefaultEmbedMapping, getDefaultMapping, getEmbedMapping, getExecutorService, getLimits, getMapping, getMimetypeService, init, isEmbeddingSupported, isSupported, newRawMap, putRawValue, readEmbedMappingProperties, readEmbedMappingProperties, readGlobalEmbedMappingProperties, readGlobalExtractMappingProperties, readMappingProperties, readMappingProperties, register, setApplicationContext, setBeanName, setDictionaryService, setEmbedMapping, setEmbedMappingProperties, setEnableStringTagging, setExecutorService, setFailOnTypeConversion, setInheritDefaultEmbedMapping, setInheritDefaultMapping, setMapping, setMappingProperties, setMetadataExtracterConfig, setMimetypeLimits, setMimetypeService, setOverwritePolicy, setProperties, setRegistry, setSupportedDateFormats, setSupportedEmbedMimetypes, setSupportedMimetypes
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
embed, isEmbeddingSupported
public static final String KEY_CREATE_DATETIME
public static final String KEY_LAST_SAVE_DATETIME
public static final String KEY_EDIT_TIME
public static final String KEY_FORMAT
public static final String KEY_KEYWORDS
public static final String KEY_LAST_AUTHOR
public static final String KEY_LAST_PRINTED
public static final String KEY_OS_VERSION
public static final String KEY_THUMBNAIL
public static final String KEY_PAGE_COUNT
public static final String KEY_PARAGRAPH_COUNT
public static final String KEY_WORD_COUNT
protected org.apache.tika.parser.Parser getParser()
Description copied from class: TikaPoweredMetadataExtracter
Returns the correct Tika Parser to process the document. See also TikaAutoMetadataExtracter, which makes use of the Tika auto-detection.
getParser in class TikaPoweredMetadataExtracter
protected Map<String,Serializable> extractSpecific(org.apache.tika.metadata.Metadata metadata, Map<String,Serializable> properties, Map<String,String> headers)
Description copied from class: TikaPoweredMetadataExtracter
Allows implementation specific mappings to be done.
extractSpecific in class TikaPoweredMetadataExtracter
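The role of extractSpecific as a subclass hook can be sketched without the Alfresco or Tika libraries. This is a simplified stand-in, not the real class hierarchy: `SketchExtracter`, `OfficeSketchExtracter` and `putIfPresent` are hypothetical, a plain `Map<String,String>` stands in for `org.apache.tika.metadata.Metadata`, and the source names `"Page-Count"` and `"Word-Count"` as well as the string values of the two key constants are assumptions made for illustration.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the base extracter: collects common
// properties, then hands over to the subclass hook.
abstract class SketchExtracter {

    final Map<String, Serializable> extractRaw(Map<String, String> metadata) {
        Map<String, Serializable> properties = new HashMap<>();
        putIfPresent(properties, "title", metadata.get("title"));
        // Subclass-specific mappings are applied last.
        return extractSpecific(metadata, properties, new HashMap<>());
    }

    // Same parameter roles as the documented method; types are simplified.
    protected abstract Map<String, Serializable> extractSpecific(
            Map<String, String> metadata,
            Map<String, Serializable> properties,
            Map<String, String> headers);

    // Adds a value only when the source actually supplied one.
    static void putIfPresent(Map<String, Serializable> target, String key, String value) {
        if (value != null) {
            target.put(key, value);
        }
    }
}

// Hypothetical Office-style subclass adding its own keys.
final class OfficeSketchExtracter extends SketchExtracter {
    static final String KEY_PAGE_COUNT = "pageCount"; // assumed value
    static final String KEY_WORD_COUNT = "wordCount"; // assumed value

    @Override
    protected Map<String, Serializable> extractSpecific(
            Map<String, String> metadata,
            Map<String, Serializable> properties,
            Map<String, String> headers) {
        // "Page-Count" and "Word-Count" are assumed source names.
        putIfPresent(properties, KEY_PAGE_COUNT, metadata.get("Page-Count"));
        putIfPresent(properties, KEY_WORD_COUNT, metadata.get("Word-Count"));
        return properties;
    }
}
```

Returning the same `properties` map keeps the hook simple: the base class builds the common entries and the subclass only adds the keys it knows about.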
Copyright © 2005–2017 Alfresco Software. All rights reserved.