AbstractMappingMetadataExtracter (Alfresco 5.1.4.1 Public Java API)

org.alfresco.repo.content.metadata

Class AbstractMappingMetadataExtracter

java.lang.Object

org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter

All Implemented Interfaces:

MetadataExtracter, ContentWorker, MetadataEmbedder, org.springframework.beans.factory.BeanNameAware, org.springframework.beans.factory.Aware, org.springframework.context.ApplicationContextAware

Direct Known Subclasses:

TikaPoweredMetadataExtracter

@org.alfresco.api.AlfrescoPublicApi
public abstract class AbstractMappingMetadataExtracter

extends Object

implements MetadataExtracter, MetadataEmbedder, org.springframework.beans.factory.BeanNameAware, org.springframework.context.ApplicationContextAware

Support class for metadata extracters that support dynamic and config-driven mapping between extracted values and model properties. Extraction is broken up into two phases:

Extract ALL available metadata from the document.
Translate the metadata into system properties.
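
The two phases above can be illustrated with plain JDK collections. This is only a sketch under stated assumptions: `TwoPhaseSketch`, its methods and the sample values are hypothetical stand-ins rather than Alfresco classes, and plain strings such as `cm:author` take the place of `QName` instances.

```java
import java.io.Serializable;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the two-phase flow; not part of the Alfresco API. */
final class TwoPhaseSketch {

    /** Phase 1: collect every value the document offers, keyed by document-specific names. */
    static Map<String, Serializable> extractRaw() {
        Map<String, Serializable> raw = new LinkedHashMap<>();
        raw.put("editor", "Jane Doe");
        raw.put("title", "Quarterly report");
        raw.put("user3", "extracted, but not mapped");
        return raw;
    }

    /** Phase 2: copy each raw value to every system property named in the mapping. */
    static Map<String, Serializable> translate(Map<String, Serializable> raw,
                                               Map<String, Set<String>> mapping) {
        Map<String, Serializable> system = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> entry : mapping.entrySet()) {
            Serializable value = raw.get(entry.getKey());
            if (value == null) {
                continue; // nothing was extracted for this document property
            }
            for (String target : entry.getValue()) {
                system.put(target, value);
            }
        }
        return system;
    }

    /** Runs both phases with a small sample mapping. */
    static Map<String, Serializable> demo() {
        Map<String, Set<String>> mapping = new LinkedHashMap<>();
        mapping.put("editor", new LinkedHashSet<>(Arrays.asList("cm:author", "my:editor")));
        mapping.put("title", Collections.singleton("cm:title"));
        return translate(extractRaw(), mapping);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Keeping the raw map independent of the mapping is what lets configuration alone decide which extracted values reach the model.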

Migrating an existing extracter to use this class is straightforward:

Construct the extracter providing a default set of supported mimetypes to this implementation. This can be overwritten with configurations.
Implement the extractRaw(org.alfresco.service.cmr.repository.ContentReader) method. This now returns a raw map of extracted values keyed by document-specific property names. The trimPut method has been replaced with an equivalent putRawValue(String, Serializable, Map).
Provide the default mapping of the document-specific properties to system-specific properties as described by the getDefaultMapping() method. The simplest is to provide the default mapping in a correlated .properties file.
Document, in the class-level javadoc, all the available properties that are extracted along with their approximate meanings. Add to this, the default mappings.
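
The correlated .properties file mentioned in the steps above uses the format documented under setMappingProperties(Properties): namespace prefix declarations followed by `documentProperty=prefix:localName, ...` entries. The following is a minimal sketch of how such text could be turned into the map form, assuming plain strings in `{uri}localName` notation instead of `QName` objects. `MappingPropertiesSketch` and the `http://example.com/model/my/1.0` namespace are hypothetical, and this is not the implementation behind readMappingProperties.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical parser for the documented mapping format; not Alfresco code. */
final class MappingPropertiesSketch {

    // Key prefix used by the namespace declarations in the documented format.
    static final String NAMESPACE_KEY_PREFIX = "namespace.prefix.";

    static Map<String, Set<String>> parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        // First pass: collect prefix -> namespace URI declarations.
        Map<String, String> namespaces = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith(NAMESPACE_KEY_PREFIX)) {
                namespaces.put(key.substring(NAMESPACE_KEY_PREFIX.length()),
                               props.getProperty(key).trim());
            }
        }

        // Second pass: resolve each comma-separated target to {uri}localName.
        Map<String, Set<String>> mapping = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith(NAMESPACE_KEY_PREFIX)) {
                continue;
            }
            Set<String> targets = new LinkedHashSet<>();
            for (String token : props.getProperty(key).split(",")) {
                String name = token.trim();
                if (name.isEmpty()) {
                    continue;
                }
                int colon = name.indexOf(':');
                if (colon < 0) {
                    throw new IllegalArgumentException("Expected prefix:localName, got '" + name + "'");
                }
                String uri = namespaces.get(name.substring(0, colon));
                if (uri == null) {
                    throw new IllegalArgumentException("Undeclared namespace prefix in '" + name + "'");
                }
                targets.add("{" + uri + "}" + name.substring(colon + 1));
            }
            mapping.put(key, targets);
        }
        return mapping;
    }

    /** Parses a small sample; the "my" namespace URI is a made-up placeholder. */
    static Map<String, Set<String>> demo() {
        return parse(String.join("\n",
                "# Namespace prefixes",
                "namespace.prefix.cm=http://www.alfresco.org/model/content/1.0",
                "namespace.prefix.my=http://example.com/model/my/1.0",
                "# Mapping",
                "editor=cm:author, my:editor",
                "title=cm:title"));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```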

Since:

2.1

Author:

Jesper Steen Møller, Derek Hulley

Nested classes/interfaces inherited from interface org.alfresco.repo.content.metadata.MetadataExtracter

MetadataExtracter.OverwritePolicy

Field Summary
protected static org.apache.commons.logging.Log	logger
static int	MEGABYTE_SIZE
protected org.alfresco.repo.content.metadata.MetadataExtracterConfig	metadataExtracterConfig
static String	NAMESPACE_PROPERTY_PREFIX
static String	PROPERTY_COMPONENT_EMBED
static String	PROPERTY_COMPONENT_EXTRACT
static String	PROPERTY_PREFIX_METADATA

Constructor Summary
protected	AbstractMappingMetadataExtracter() Default constructor.
protected	AbstractMappingMetadataExtracter(Set<String> supportedMimetypes) Constructor that can be used when the list of supported mimetypes is known up front.
protected	AbstractMappingMetadataExtracter(Set<String> supportedMimetypes, Set<String> supportedEmbedMimetypes) Constructor that can be used when the list of supported extract and embed mimetypes is known up front.

Method Summary
protected void	checkIsEmbedSupported(ContentWriter writer) Checks if embedding for the mimetype is supported.
protected void	checkIsSupported(ContentReader reader) Checks if the mimetype is supported.
void	embed(Map<QName,Serializable> properties, ContentReader reader, ContentWriter writer) Embeds the given properties into the file specified by the given content writer.
protected void	embedInternal(Map<String,Serializable> metadata, ContentReader reader, ContentWriter writer) Override to embed metadata values.
Map<QName,Serializable>	extract(ContentReader reader, Map<QName,Serializable> destination) Extracts the metadata values from the content provided by the reader and source mimetype to the supplied map.
Map<QName,Serializable>	extract(ContentReader reader, MetadataExtracter.OverwritePolicy overwritePolicy, Map<QName,Serializable> destination) Extracts the metadata values from the content provided by the reader and source mimetype to the supplied map.
Map<QName,Serializable>	extract(ContentReader reader, MetadataExtracter.OverwritePolicy overwritePolicy, Map<QName,Serializable> destination, Map<String,Set<QName>> mapping) Extracts the metadata from the content provided by the reader and source mimetype to the supplied map.
protected abstract Map<String,Serializable>	extractRaw(ContentReader reader) Override to provide the raw extracted metadata values.
protected void	filterSystemProperties(Map<QName,Serializable> systemProperties, Map<QName,Serializable> targetProperties) Filters the system properties that are going to be applied.
String	getBeanName()
protected Map<QName,Set<String>>	getDefaultEmbedMapping() This method provides a best guess of what model properties should be embedded in content.
protected Map<String,Set<QName>>	getDefaultMapping() This method provides a best guess of where to store the values extracted from the documents.
protected Map<QName,Set<String>>	getEmbedMapping() Helper method for derived classes to obtain the embed mappings.
protected ExecutorService	getExecutorService() Gets the `ExecutorService` to be used for timeout-aware extraction.
long	getExtractionTime() Provides an estimate, usually a worst case guess, of how long an extraction will take.
protected MetadataExtracterLimits	getLimits(String mimetype) Gets the metadata extracter limits for the given mimetype.
protected Map<String,Set<QName>>	getMapping() Helper method for derived classes to obtain the mappings that will be applied to raw values.
protected MimetypeService	getMimetypeService()
double	getReliability(String mimetype) TODO - This doesn't appear to be used, so should be removed / deprecated / replaced
protected void	init() Provides a hook point for implementations to perform initialization.
boolean	isEmbeddingSupported(String sourceMimetype) Determines if the extracter works against the given mimetype.
boolean	isSupported(String sourceMimetype) Determines if the extracter works against the given mimetype.
protected Date	makeDate(String dateStr) Convert a date `String` to a `Date` object
protected Map<String,Serializable>	newRawMap() Helper method to fetch a clean map into which raw values can be dumped.
protected boolean	putRawValue(String key, Serializable value, Map<String,Serializable> destination) Adds a value to the map, conserving null values.
protected Map<QName,Set<String>>	readEmbedMappingProperties(Properties mappingProperties) A utility method to convert mapping properties to the Map form.
protected Map<QName,Set<String>>	readEmbedMappingProperties(String propertiesUrl) A utility method to read embed mapping properties from a resource file and convert to the map form.
protected Map<QName,Set<String>>	readGlobalEmbedMappingProperties() A utility method to convert global mapping properties to the Map form.
protected Map<String,Set<QName>>	readGlobalExtractMappingProperties() A utility method to convert global properties to the Map form for the given propertyComponent.
protected Map<String,Set<QName>>	readMappingProperties(Properties mappingProperties) A utility method to convert mapping properties to the Map form.
protected Map<String,Set<QName>>	readMappingProperties(String propertiesUrl) A utility method to read mapping properties from a resource file and convert to the map form.
void	register() Registers this instance of the extracter with the registry.
void	setApplicationContext(org.springframework.context.ApplicationContext applicationContext)
void	setBeanName(String beanName)
void	setDictionaryService(DictionaryService dictionaryService)
void	setEmbedMapping(Map<QName,Set<String>> embedMapping) Set the embed mapping from model properties to content file metadata keys.
void	setEmbedMappingProperties(Properties embedMappingProperties) Set the properties that contain the embed mapping from model properties to content file metadata.
void	setEnableStringTagging(boolean enableStringTagging) Whether or not to enable the pass through of simple strings to cm:taggable tags
void	setExecutorService(ExecutorService executorService) Sets the `ExecutorService` to be used for timeout-aware extraction.
void	setFailOnTypeConversion(boolean failOnTypeConversion) Set whether the extractor should discard metadata that fails to convert to the target type defined in the data dictionary model.
void	setInheritDefaultEmbedMapping(boolean inheritDefaultEmbedMapping) Set if the embed property mappings augment or override the mapping generically provided by the extracter implementation.
void	setInheritDefaultMapping(boolean inheritDefaultMapping) Set if the property mappings augment or override the mapping generically provided by the extracter implementation.
void	setMapping(Map<String,Set<QName>> mapping) Set the mapping from document metadata to system metadata.
void	setMappingProperties(Properties mappingProperties) Set the properties that contain the mapping from document metadata to system metadata.
void	setMetadataExtracterConfig(org.alfresco.repo.content.metadata.MetadataExtracterConfig metadataExtracterConfig) The metadata extracter config.
void	setMimetypeLimits(Map<String,MetadataExtracterLimits> mimetypeLimits) Sets the map of source mimetypes to metadata extracter limits.
void	setMimetypeService(MimetypeService mimetypeService)
void	setOverwritePolicy(MetadataExtracter.OverwritePolicy overwritePolicy) Set the policy to use when existing values are encountered.
void	setOverwritePolicy(String overwritePolicyStr) Set the policy to use when existing values are encountered.
void	setProperties(Properties properties) The Alfresco global properties.
void	setRegistry(MetadataExtracterRegistry registry) Set the registry to register with.
void	setSupportedDateFormats(List<String> supportedDateFormats) Set the date formats, over and above the ISO8601 format, that will be supported for string to date conversions.
void	setSupportedEmbedMimetypes(Collection<String> supportedEmbedMimetypes) Set the mimetypes that are supported for embedding.
void	setSupportedMimetypes(Collection<String> supportedMimetypes) Set the mimetypes that are supported by the extracter.

Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail

logger

protected static org.apache.commons.logging.Log logger

MEGABYTE_SIZE

public static final int MEGABYTE_SIZE

See Also:

Constant Field Values

metadataExtracterConfig

protected org.alfresco.repo.content.metadata.MetadataExtracterConfig metadataExtracterConfig

NAMESPACE_PROPERTY_PREFIX

public static final String NAMESPACE_PROPERTY_PREFIX

See Also:

Constant Field Values

PROPERTY_COMPONENT_EMBED

public static final String PROPERTY_COMPONENT_EMBED

See Also:

Constant Field Values

PROPERTY_COMPONENT_EXTRACT

public static final String PROPERTY_COMPONENT_EXTRACT

See Also:

Constant Field Values

PROPERTY_PREFIX_METADATA

public static final String PROPERTY_PREFIX_METADATA

See Also:

Constant Field Values

Constructor Detail

AbstractMappingMetadataExtracter

protected AbstractMappingMetadataExtracter()

Default constructor. If this is called, then isSupported(String) should be implemented. This is useful when the list of supported mimetypes is not known when the instance is constructed. Alternatively, once the set becomes known, call setSupportedMimetypes(Collection).

AbstractMappingMetadataExtracter

protected AbstractMappingMetadataExtracter(Set<String> supportedMimetypes)

Constructor that can be used when the list of supported mimetypes is known up front.

Parameters:

supportedMimetypes - the set of mimetypes supported by default

AbstractMappingMetadataExtracter

protected AbstractMappingMetadataExtracter(Set<String> supportedMimetypes,
Set<String> supportedEmbedMimetypes)

Constructor that can be used when the list of supported extract and embed mimetypes is known up front.

Parameters:

supportedMimetypes - the set of mimetypes supported for extraction by default

supportedEmbedMimetypes - the set of mimetypes supported for embedding by default

Method Detail

setRegistry

public void setRegistry(MetadataExtracterRegistry registry)

Set the registry to register with. If this is not set, then the default initialization will not auto-register the extracter for general use. It can still be used directly.

Parameters:

registry - a metadata extracter registry

setMimetypeService

public void setMimetypeService(MimetypeService mimetypeService)

Parameters:

mimetypeService - the mimetype service. Set this if required.

getMimetypeService

protected MimetypeService getMimetypeService()

Returns:

Returns the mimetype helper

setDictionaryService

public void setDictionaryService(DictionaryService dictionaryService)

Parameters:

dictionaryService - the dictionary service to determine which data conversions are necessary

setSupportedMimetypes

public void setSupportedMimetypes(Collection<String> supportedMimetypes)

Set the mimetypes that are supported by the extracter.

Parameters:

supportedMimetypes - Collection

setSupportedEmbedMimetypes

public void setSupportedEmbedMimetypes(Collection<String> supportedEmbedMimetypes)

Set the mimetypes that are supported for embedding.

Parameters:

supportedEmbedMimetypes - Collection

isSupported

public boolean isSupported(String sourceMimetype)

Determines if the extracter works against the given mimetype.

Specified by:

isSupported in interface MetadataExtracter

Returns:

Returns true if the mimetype is supported, otherwise false.

See Also:

setSupportedMimetypes(Collection)

isEmbeddingSupported

public boolean isEmbeddingSupported(String sourceMimetype)

Determines if the extracter works against the given mimetype.

Specified by:

isEmbeddingSupported in interface MetadataEmbedder

Returns:

Returns true if the mimetype is supported, otherwise false.

See Also:

setSupportedEmbedMimetypes(Collection)

getReliability

public double getReliability(String mimetype)

TODO - This doesn't appear to be used, so should be removed / deprecated / replaced

Specified by:

getReliability in interface MetadataExtracter

Parameters:

mimetype - the mimetype to check

Returns:

Returns 1.0 if the mimetype is supported, otherwise 0.0

See Also:

isSupported(String)

setOverwritePolicy

public void setOverwritePolicy(MetadataExtracter.OverwritePolicy overwritePolicy)

Set the policy to use when existing values are encountered. Depending on how the extractor is called, this may not be relevant, i.e. an empty map of existing properties may be passed in by the client code, which may follow its own overwrite strategy.

Parameters:

overwritePolicy - the policy to apply when there are existing system properties

setOverwritePolicy

public void setOverwritePolicy(String overwritePolicyStr)

Parameters:

overwritePolicyStr - the policy to apply when there are existing system properties

setFailOnTypeConversion

public void setFailOnTypeConversion(boolean failOnTypeConversion)

Set whether the extractor should discard metadata that fails to convert to the target type defined in the data dictionary model. This is true by default i.e. if the data extracted is not compatible with the target model then the extraction will fail. If this is false then any extracted data that fails to convert will be discarded.

Parameters:

failOnTypeConversion - false to discard properties that can't get converted to the dictionary-defined type, or true (default) to fail the extraction if the type doesn't convert
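
The difference between the two settings can be sketched as follows. This is a hypothetical illustration, not the class's conversion code: `TypeConversionSketch` pretends every target property is an integer, whereas the real extracter consults the data dictionary to find the target type.

```java
import java.io.Serializable;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of fail-versus-discard on conversion errors. */
final class TypeConversionSketch {

    /**
     * Converts every raw value to an Integer (assumed target type).
     * With failOnTypeConversion == true the first bad value aborts the call;
     * with false the bad value is dropped and the rest are kept.
     */
    static Map<String, Serializable> convert(Map<String, String> raw, boolean failOnTypeConversion) {
        Map<String, Serializable> converted = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : raw.entrySet()) {
            try {
                converted.put(entry.getKey(), Integer.valueOf(entry.getValue().trim()));
            } catch (NumberFormatException e) {
                if (failOnTypeConversion) {
                    throw new IllegalStateException(
                            "Cannot convert value of '" + entry.getKey() + "'", e);
                }
                // Lenient mode: discard this value and carry on.
            }
        }
        return converted;
    }

    /** Sample raw values: one convertible, one not. */
    static Map<String, String> sample() {
        Map<String, String> raw = new LinkedHashMap<>();
        raw.put("pageCount", "12");
        raw.put("wordCount", "n/a");
        return raw;
    }

    public static void main(String[] args) {
        System.out.println(convert(sample(), false));
    }
}
```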

setSupportedDateFormats

public void setSupportedDateFormats(List<String> supportedDateFormats)

Set the date formats, over and above the ISO8601 format, that will be supported for string to date conversions. The supported syntax is described by the SimpleDateFormat Javadocs.

Parameters:

supportedDateFormats - a list of supported date formats.

setInheritDefaultMapping

public void setInheritDefaultMapping(boolean inheritDefaultMapping)

Set if the property mappings augment or override the mapping generically provided by the extracter implementation. The default is false, i.e. any mapping set completely replaces the default mappings. Note that even when set to true an individual property mapping entry replaces the entry provided by the extracter implementation.

Parameters:

inheritDefaultMapping - true to add the configured mapping to the list of default mappings.

See Also:

getDefaultMapping(), setMapping(Map), setMappingProperties(Properties)

setBeanName

public void setBeanName(String beanName)

Specified by:

setBeanName in interface org.springframework.beans.factory.BeanNameAware

getBeanName

public String getBeanName()

setApplicationContext

public void setApplicationContext(org.springframework.context.ApplicationContext applicationContext)

Specified by:

setApplicationContext in interface org.springframework.context.ApplicationContextAware

setProperties

public void setProperties(Properties properties)

The Alfresco global properties.

setMetadataExtracterConfig

public void setMetadataExtracterConfig(org.alfresco.repo.content.metadata.MetadataExtracterConfig metadataExtracterConfig)

The metadata extracter config.

setEnableStringTagging

public void setEnableStringTagging(boolean enableStringTagging)

Whether or not to enable the pass through of simple strings to cm:taggable tags

Parameters:

enableStringTagging - true find or create tags for each string mapped to cm:taggable. false (default) ignore mapping strings to tags.
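
The inheritance rule described for setInheritDefaultMapping(boolean) can be sketched with plain maps. This is an assumption-laden illustration rather than the class's code: `InheritMappingSketch` is hypothetical, strings stand in for `QName` values, and a configured mapping is assumed to be present.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical illustration of "augment or override" for mappings. */
final class InheritMappingSketch {

    static Map<String, Set<String>> effectiveMapping(Map<String, Set<String>> defaults,
                                                     Map<String, Set<String>> configured,
                                                     boolean inheritDefaultMapping) {
        Map<String, Set<String>> result = new TreeMap<>();
        if (inheritDefaultMapping) {
            result.putAll(defaults); // start from the extracter's own mapping
        }
        // A configured entry replaces the default entry with the same key;
        // the two target sets are not merged.
        result.putAll(configured);
        return result;
    }

    /** Defaults map title and editor; the configuration only redefines editor. */
    static Map<String, Set<String>> demo(boolean inheritDefaultMapping) {
        Map<String, Set<String>> defaults = new TreeMap<>();
        defaults.put("title", Collections.singleton("cm:title"));
        defaults.put("editor", Collections.singleton("cm:author"));

        Map<String, Set<String>> configured = new TreeMap<>();
        configured.put("editor", Collections.singleton("my:editor"));

        return effectiveMapping(defaults, configured, inheritDefaultMapping);
    }

    public static void main(String[] args) {
        System.out.println("inherit=true  -> " + demo(true));
        System.out.println("inherit=false -> " + demo(false));
    }
}
```

With inheritance on, `title` survives from the defaults while `editor` points only at `my:editor`. With inheritance off, only the configured `editor` entry remains.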

setInheritDefaultEmbedMapping

public void setInheritDefaultEmbedMapping(boolean inheritDefaultEmbedMapping)

Set if the embed property mappings augment or override the mapping generically provided by the extracter implementation. The default is false, i.e. any mapping set completely replaces the default mappings. Note that even when set to true an individual property mapping entry replaces the entry provided by the extracter implementation.

Parameters:

inheritDefaultEmbedMapping - true to add the configured embed mapping to the list of default embed mappings.

See Also:

getDefaultEmbedMapping(), setEmbedMapping(Map), setEmbedMappingProperties(Properties)

setMimetypeLimits

public void setMimetypeLimits(Map<String,MetadataExtracterLimits> mimetypeLimits)

Sets the map of source mimetypes to metadata extracter limits.

Parameters:

mimetypeLimits - Map

getExecutorService

protected ExecutorService getExecutorService()

Gets the ExecutorService to be used for timeout-aware extraction.

If no ExecutorService has been defined a default of Executors.newCachedThreadPool() is used during init().

Returns:

the defined or default ExecutorService

setExecutorService

public void setExecutorService(ExecutorService executorService)

Sets the ExecutorService to be used for timeout-aware extraction.

Parameters:

executorService - the ExecutorService for timeouts

setMapping

public void setMapping(Map<String,Set<QName>> mapping)

Set the mapping from document metadata to system metadata. It is possible to direct an extracted document property to several system properties. The conversion between the document property types and the system property types will be done by the default converter.

Parameters:

mapping - a mapping from document metadata to system metadata

setEmbedMapping

public void setEmbedMapping(Map<QName,Set<String>> embedMapping)

Set the embed mapping from model properties to content file metadata keys. It is possible to direct a model property to several content file metadata keys.

The conversion between the model property types and the content file metadata key types will be done by the default converter.

Parameters:

embedMapping - an embed mapping from model properties to content file metadata keys

setMappingProperties

public void setMappingProperties(Properties mappingProperties)

Set the properties that contain the mapping from document metadata to system metadata. This is an alternative to the setMapping(Map) method. Any mappings already present will be cleared out. The property mapping is of the form:

    # Namespaces prefixes
    namespace.prefix.cm=http://www.alfresco.org/model/content/1.0
    namespace.prefix.my=http://www....com/alfresco/1.0

    # Mapping
    editor=cm:author, my:editor
    title=cm:title
    user1=cm:summary
    user2=cm:description

The mapping can therefore be from a single document property onto several system properties.

Parameters:

mappingProperties - the properties that map document properties to system properties

setEmbedMappingProperties

public void setEmbedMappingProperties(Properties embedMappingProperties)

Set the properties that contain the embed mapping from model properties to content file metadata. This is an alternative to the setEmbedMapping(Map) method. Any mappings already present will be cleared out. The property mapping is of the form:

    # Namespaces prefixes
    namespace.prefix.cm=http://www.alfresco.org/model/content/1.0
    namespace.prefix.my=http://www....com/alfresco/1.0

    # Mapping
    cm\:author=editor
    cm\:title=title
    cm\:summary=user1
    cm\:description=description,user2

The embed mapping can therefore be from a model property onto several content file metadata properties.

Parameters:

embedMappingProperties - the properties that map model properties to content file metadata properties

getMapping

protected final Map<String,Set<QName>> getMapping()

Helper method for derived classes to obtain the mappings that will be applied to raw values. This should be called after initialization in order to guarantee the complete map is given.

Normally, the list of properties that can be extracted from a document is fixed and well-known; in that case, just extract everything. But some implementations may have an extra, indeterminate set of values available for extraction. If the extraction of these runtime parameters is expensive, then the keys provided by the return value can be used to extract values from the documents. The metadata extraction becomes fully configuration-driven, i.e. declaring further mappings will result in more values being extracted from the documents.

Most extractors will not be using this method. For an example of its use, see the OpenDocument extractor, which uses the mapping to select specific user properties from a document.

getEmbedMapping

protected final Map<QName,Set<String>> getEmbedMapping()

Helper method for derived classes to obtain the embed mappings. This should be called after initialization in order to guarantee the complete map is given.

Normally, the list of properties that can be embedded in a document is fixed and well-known. But some implementations may have an extra, indeterminate set of values available for embedding. If the embedding of these runtime parameters is expensive, then the keys provided by the return value can be used to embed values in the documents. The metadata embedding becomes fully configuration-driven, i.e. declaring further mappings will result in more values being embedded in the documents.

readMappingProperties

protected Map<String,Set<QName>> readMappingProperties(String propertiesUrl)

A utility method to read mapping properties from a resource file and convert to the map form.

Parameters:

propertiesUrl - A standard Properties file URL location

See Also:

setMappingProperties(Properties)

readGlobalExtractMappingProperties

protected Map<String,Set<QName>> readGlobalExtractMappingProperties()

A utility method to convert global properties to the Map form for the given propertyComponent.

Mappings can be specified using the same method defined for normal mapping properties files but with a prefix of metadata.extracter, the extracter bean name, and the extract component. For example:

    metadata.extracter.TikaAuto.extract.namespace.prefix.my=http://DummyMappingMetadataExtracter
    metadata.extracter.TikaAuto.extract.namespace.prefix.cm=http://www.alfresco.org/model/content/1.0
    metadata.extracter.TikaAuto.extract.dc\:description=cm:description, my:customDescription

readMappingProperties

protected Map<String,Set<QName>> readMappingProperties(Properties mappingProperties)

A utility method to convert mapping properties to the Map form.

See Also:

setMappingProperties(Properties)

readEmbedMappingProperties

protected Map<QName,Set<String>> readEmbedMappingProperties(String propertiesUrl)

A utility method to read embed mapping properties from a resource file and convert to the map form.

Parameters:

propertiesUrl - A standard Properties file URL location

See Also:

setEmbedMappingProperties(Properties)

readGlobalEmbedMappingProperties

protected Map<QName,Set<String>> readGlobalEmbedMappingProperties()

A utility method to convert global mapping properties to the Map form.

Different from readGlobalExtractMappingProperties in that keys are the Alfresco QNames and values are file metadata properties.

Mappings can be specified using the same method defined for normal embed mapping properties files but with a prefix of metadata.extracter, the extracter bean name, and the embed component. For example:

    metadata.extracter.TikaAuto.embed.namespace.prefix.cm=http://www.alfresco.org/model/content/1.0
    metadata.extracter.TikaAuto.embed.cm\:description=description

See Also:

setMappingProperties(Properties)

readEmbedMappingProperties

protected Map<QName,Set<String>> readEmbedMappingProperties(Properties mappingProperties)

A utility method to convert mapping properties to the Map form.

Different from readMappingProperties in that keys are the Alfresco QNames and values are file metadata properties.

See Also:

setMappingProperties(Properties)

register

public final void register()

Registers this instance of the extracter with the registry. This will call the init() method and then register if the registry is available.

See Also:

setRegistry(MetadataExtracterRegistry), init()

init

protected void init()

Provides a hook point for implementations to perform initialization. The base implementation must be invoked or the extracter will fail during extraction. The default mappings will be requested during initialization.

getExtractionTime

public long getExtractionTime()

Provides an estimate, usually a worst case guess, of how long an extraction will take. This method is used to determine, up front, which of a set of equally reliable transformers will be used for a specific extraction.

Specified by:

getExtractionTime in interface MetadataExtracter

Returns:

Returns the approximate number of milliseconds per transformation

checkIsSupported

protected void checkIsSupported(ContentReader reader)

Checks if the mimetype is supported.

Parameters:

reader - the reader to check

Throws:

AlfrescoRuntimeException - if the mimetype is not supported

checkIsEmbedSupported

protected void checkIsEmbedSupported(ContentWriter writer)

Checks if embedding for the mimetype is supported.

Parameters:

writer - the writer to check

Throws:

AlfrescoRuntimeException - if embedding for the mimetype is not supported

extract

public final Map<QName,Serializable> extract(ContentReader reader, Map<QName,Serializable> destination)

Extracts the metadata values from the content provided by the reader and source mimetype to the supplied map. The internal mapping and overwrite policy between document metadata and system metadata will be used.

The extraction viability can be determined by an up front call to MetadataExtracter.isSupported(String).

The source mimetype must be available on the ContentAccessor.getMimetype() method of the reader.

Specified by:

extract in interface MetadataExtracter

Parameters:

reader - the source of the content

destination - the map of properties to populate (essentially a return value)

Returns:

Returns a map of all properties on the destination map that were added or modified. If the return map is empty, then no properties were modified.

extract

public final Map<QName,Serializable> extract(ContentReader reader, MetadataExtracter.OverwritePolicy overwritePolicy, Map<QName,Serializable> destination)

Extracts the metadata values from the content provided by the reader and source mimetype to the supplied map.

The extraction viability can be determined by an up front call to MetadataExtracter.isSupported(String).

The source mimetype must be available on the ContentAccessor.getMimetype() method of the reader.

Specified by:

extract in interface MetadataExtracter

Parameters:

reader - the source of the content

overwritePolicy - the policy stipulating how the system properties must be overwritten if present

destination - the map of properties to populate (essentially a return value)

Returns:

Returns a map of all properties on the destination map that were added or modified. If the return map is empty, then no properties were modified.

extract

public Map<QName,Serializable> extract(ContentReader reader, MetadataExtracter.OverwritePolicy overwritePolicy, Map<QName,Serializable> destination, Map<String,Set<QName>> mapping)

Extracts the metadata from the content provided by the reader and source mimetype to the supplied map. The mapping from document metadata to system metadata is explicitly provided. The overwrite policy is also explicitly set.

The extraction viability can be determined by an up front call to MetadataExtracter.isSupported(String).

The source mimetype must be available on the ContentAccessor.getMimetype() method of the reader.

Specified by:

extract in interface MetadataExtracter

Parameters:

reader - the source of the content

overwritePolicy - the policy stipulating how the system properties must be overwritten if present

destination - the map of properties to populate (essentially a return value)

mapping - a mapping of document-specific properties to system properties.

Returns:

Returns a map of all properties on the destination map that were added or modified. If the return map is empty, then no properties were modified.

embed

public final void embed(Map<QName,Serializable> properties, ContentReader reader, ContentWriter writer)

Embeds the given properties into the file specified by the given content writer.

The embedding viability can be determined by an up front call to MetadataEmbedder.isEmbeddingSupported(String).

The source mimetype must be available on the ContentAccessor.getMimetype() method of the writer.

Specified by:

embed in interface MetadataEmbedder

Parameters:

properties - the model properties to embed

reader - the reader for the original source content file

writer - the writer for the content after metadata has been embedded

filterSystemProperties

protected void filterSystemProperties(Map<QName,Serializable> systemProperties, Map<QName,Serializable> targetProperties)

Filters the system properties that are going to be applied. Gives the metadata extracter an opportunity to remove properties that may not be appropriate in a given context.

Parameters:

systemProperties - map of system properties to be applied

targetProperties - map of target properties, may be used to provide the required context

makeDate

protected Date makeDate(String dateStr)

Convert a date String to a Date object

putRawValue

protected boolean putRawValue(String key, Serializable value, Map<String,Serializable> destination)

Adds a value to the map, conserving null values.

Values are converted to null if:

it is an empty string value after trimming
it is an empty collection
it is an empty array

String values are trimmed before being put into the map. Otherwise, it is up to the extracter to ensure that the value is a Serializable. It is not appropriate to implicitly convert values in order to make them Serializable - the best conversion method will depend on the value's specific meaning.

Parameters:

key - the destination key

value - the serializable value

destination - the map to put values into

Returns:

Returns true if set, otherwise false

newRawMap

protected final Map<String,Serializable> newRawMap()

Helper method to fetch a clean map into which raw values can be dumped.

Returns:

Returns an empty map

getDefaultMapping

protected Map<String,Set<QName>> getDefaultMapping()

This method provides a best guess of where to store the values extracted from the documents. The list of properties mapped by default need not include all properties extracted from the document; just the obvious set of mappings need be supplied.

Implementations must either provide the default mapping properties in the expected location or override the method to provide the default mapping. The default implementation looks for the default mapping file in the location given by the class name and .properties. If the extracter's class is x.y.z.MyExtracter then the default properties will be picked up at classpath:/alfresco/metadata/MyExtracter.properties. The previous location of classpath:/x/y/z/MyExtracter.properties is still supported but may be removed in a future release. Inner classes are supported, but the '$' in the class name is replaced with '-', so default properties for x.y.z.MyStuff$MyExtracter will be located using classpath:/alfresco/metadata/MyStuff-MyExtracter.properties.

The default mapping implementation should include thorough Javadocs so that the system administrators can accurately determine how to best enhance or override the default mapping.
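
The naming rule for the default mapping file can be written out as string manipulation. This is a sketch derived only from the examples in the text above; `DefaultMappingLocationSketch` is hypothetical, and how the legacy location treats inner classes is not stated, so the legacy helper is only meaningful for top-level classes.

```java
/** Hypothetical helper showing how the documented locations are derived from a class name. */
final class DefaultMappingLocationSketch {

    /** Preferred location: simple class name under /alfresco/metadata, with '$' replaced by '-'. */
    static String preferredLocation(String binaryClassName) {
        String simpleName = binaryClassName.substring(binaryClassName.lastIndexOf('.') + 1);
        return "classpath:/alfresco/metadata/" + simpleName.replace('$', '-') + ".properties";
    }

    /**
     * Legacy location: the package path of the class.
     * Assumption: only top-level classes are passed in, because the text
     * does not describe the legacy path for inner classes.
     */
    static String legacyLocation(String binaryClassName) {
        return "classpath:/" + binaryClassName.replace('.', '/') + ".properties";
    }

    public static void main(String[] args) {
        System.out.println(preferredLocation("x.y.z.MyExtracter"));
        System.out.println(preferredLocation("x.y.z.MyStuff$MyExtracter"));
        System.out.println(legacyLocation("x.y.z.MyExtracter"));
    }
}
```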
If the default mapping is declared in a properties file other than the one named after the class, then the readMappingProperties(String) method can be used to quickly generate the return value:

    protected Map<String, Set<QName>> getDefaultMapping()
    {
        return readMappingProperties(DEFAULT_MAPPING);
    }

The map can also be created in code either statically or during the call.

Returns:

Returns the default, static mapping. It may not be null.

See Also:

setInheritDefaultMapping(boolean inherit)

getDefaultEmbedMapping

protected Map<QName,Set<String>> getDefaultEmbedMapping()

This method provides a best guess of what model properties should be embedded in content. The list of properties mapped by default need not include all properties to be embedded in the document; just the obvious set of mappings need be supplied. Implementations must either provide the default mapping properties in the expected location or override the method to provide the default mapping.

The default implementation looks for the default mapping file in the location given by the class name and .embed.properties. If the extracter's class is x.y.z.MyExtracter then the default properties will be picked up at classpath:/x/y/z/MyExtracter.embed.properties. Inner classes are supported, but the '$' in the class name is replaced with '-', so default properties for x.y.z.MyStuff$MyExtracter will be located using x.y.z.MyStuff-MyExtracter.embed.properties.

The default mapping implementation should include thorough Javadocs so that system administrators can accurately determine how best to enhance or override the default mapping.

If the default mapping is declared in a properties file other than the one named after the class, then the readEmbedMappingProperties(String) method can be used to quickly generate the return value:

    protected Map<QName, Set<String>> getDefaultEmbedMapping()
    {
        return readEmbedMappingProperties(DEFAULT_MAPPING);
    }

The map can also be created in code either statically or during the call.
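When no embed mapping file exists, the embed mapping is derived by reversing the extract mapping. A minimal sketch of such a reversal follows, assuming plain String values stand in for QName so that the example runs without Alfresco on the classpath. The class name EmbedMappingFallback and the method reverse are hypothetical, and the real implementation may differ in detail.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; String is used in place of QName.
final class EmbedMappingFallback {

    private EmbedMappingFallback() {
    }

    /**
     * Reverses an extract mapping (document key -> model properties) into an
     * embed mapping (model property -> document keys).
     */
    static Map<String, Set<String>> reverse(Map<String, Set<String>> extractMapping) {
        Map<String, Set<String>> embedMapping = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> entry : extractMapping.entrySet()) {
            Set<String> modelProperties = entry.getValue();
            if (modelProperties == null || modelProperties.isEmpty()) {
                continue; // nothing to reverse for this document key
            }
            // Only the first model property of each entry is used as the key.
            String firstModelProperty = modelProperties.iterator().next();
            // put() overwrites any earlier entry for the same key: last one wins.
            embedMapping.put(firstModelProperty, Collections.singleton(entry.getKey()));
        }
        return embedMapping;
    }
}
```

Which property counts as "first", and which duplicate counts as "last", depends on the iteration order of the maps and sets passed in, so ordered collections such as LinkedHashMap and LinkedHashSet give predictable results.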
If no embed mapping properties file is found, the reverse of the extract mapping in getDefaultMapping() is assumed: the first QName in each value is used as the key for this mapping, and the last entry wins where keys are duplicated.

Returns:

Returns the default, static embed mapping. It may not be null.

See Also:

setInheritDefaultMapping(boolean inherit)

getLimits

protected MetadataExtracterLimits getLimits(String mimetype)

Gets the metadata extracter limits for the given mimetype.

A specific match for the given mimetype is tried first. If none is found, the wildcard "*" is tried. If that is also not found, default values are used.

Parameters:

mimetype - String

Returns:

the found limits or default values

extractRaw

protected abstract Map<String,Serializable> extractRaw(ContentReader reader) throws Throwable

Override to provide the raw extracted metadata values. An extracter should extract as many of the available properties as is realistically possible. Even if the default mapping doesn't handle all properties, it is possible for each instance of the extracter to be configured differently and more or less of the properties may be used in different installations.

Raw values must not be trimmed or removed for any reason. Null values, empty strings and non-Serializable values are handled as follows:

Null: Removed
Empty String: Passed to the OverwritePolicy
Non Serializable: Converted to String or fails if that is not possible

Properties extracted and their meanings and types should be thoroughly described in the class-level javadocs of the extracter implementation, for example:

    editor: - the document editor      --> cm:author
    title:  - the document title       --> cm:title
    user1:  - the document summary
    user2:  - the document description --> cm:description
    user3:  -
    user4:  -

Parameters:

reader - the document to extract the values from. The stream provided by the reader must be closed if accessed directly.

Returns:

Returns a map of document property values keyed by property name.
Throws:

Throwable - All exception conditions can be handled.

See Also:

getDefaultMapping()

embedInternal

protected void embedInternal(Map<String,Serializable> metadata, ContentReader reader, ContentWriter writer) throws Throwable

Override to embed metadata values. An extracter should embed as many of the available properties as is realistically possible. Even if the default mapping doesn't handle all properties, it is possible for each instance of the extracter to be configured differently and more or less of the properties may be used in different installations.

Parameters:

metadata - the metadata keys and values to embed in the content file
reader - the reader for the original document. This stream provided by the reader must be closed if accessed directly.
writer - the writer for the document to embed the values in. This stream provided by the writer must be closed if accessed directly.

Throws:

Throwable - All exception conditions can be handled.

See Also:

getDefaultEmbedMapping()

Copyright © 2005–2018 Alfresco Software. All rights reserved.

Java API documentation generated with DocFlex/Javadoc 1.6.1 using JavadocPro template set.