public class HtmlParserContentTransformer extends AbstractContentTransformer2
Since HTML Parser was updated from v1.6 to v2.1, META tags defining an encoding for the content via http-equiv=Content-Type will ONLY be respected if the encoding of the content item itself is set to ISO-8859-1.
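The encoding rule above can be illustrated with a small sketch. This is not the HTML Parser or Alfresco code: the class `MetaCharsetSketch`, its method names and the regular expression are hypothetical, and the pattern assumes the common attribute order (`http-equiv` before `content`). It only shows the decision described here, namely that a META-declared charset is considered solely when the content item's own encoding is ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not part of the Alfresco or HTML Parser APIs.
final class MetaCharsetSketch {

    // Matches e.g. <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    private static final Pattern META_CHARSET = Pattern.compile(
            "<meta\\s+[^>]*http-equiv\\s*=\\s*[\"']?Content-Type[\"']?[^>]*"
                    + "charset\\s*=\\s*([A-Za-z0-9][\\w.:-]*)",
            Pattern.CASE_INSENSITIVE);

    // Returns the charset name declared in a Content-Type META tag, if any.
    static Optional<String> declaredCharset(String html) {
        Matcher m = META_CHARSET.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // The META declaration is honoured only when the item encoding is ISO-8859-1;
    // any other item encoding is used as-is.
    static Charset effectiveCharset(Charset itemEncoding, String html) {
        if (!StandardCharsets.ISO_8859_1.equals(itemEncoding)) {
            return itemEncoding;
        }
        return declaredCharset(html)
                .filter(Charset::isSupported)
                .map(Charset::forName)
                .orElse(itemEncoding);
    }

    public static void main(String[] args) {
        String html = "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">";
        System.out.println(effectiveCharset(StandardCharsets.ISO_8859_1, html));
        System.out.println(effectiveCharset(StandardCharsets.UTF_16, html));
    }
}
```

In practice this means that content stored with, say, UTF-16 as its item encoding is read as UTF-16 regardless of what its META tag claims.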
Tika note: this transformer could be converted to use the Tika HTML parser, but that would potentially need a custom text handler to replicate the current settings around links and non-breaking spaces.
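To make the "settings around links and non-breaking spaces" more concrete, here is a rough standard-library sketch of the kind of decisions a text handler has to make. It is a stand-in, not the StringBean or Tika behaviour: the class `HtmlToTextSketch`, the `replaceNbsp` flag and the regex-based tag stripping are all assumptions, and the actual values this transformer configures are not stated on this page.

```java
import java.util.regex.Pattern;

// Hypothetical illustration; a regex-based stand-in for a real HTML parser.
final class HtmlToTextSketch {

    private static final Pattern SCRIPT_OR_STYLE =
            Pattern.compile("(?is)<(script|style)\\b.*?</\\1\\s*>");
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]*>");

    // Extracts visible text. Link targets (href values) are never emitted because
    // whole tags are dropped; replaceNbsp controls whether non-breaking spaces
    // become ordinary spaces (an assumed setting, for illustration only).
    static String toText(String html, boolean replaceNbsp) {
        String text = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        text = TAG.matcher(text).replaceAll(" ");
        text = text.replace("&nbsp;", "\u00a0").replace("&#160;", "\u00a0");
        if (replaceNbsp) {
            text = text.replace('\u00a0', ' ');
        }
        // Decode a few common entities; &amp; goes last to avoid double decoding.
        text = text.replace("&lt;", "<").replace("&gt;", ">").replace("&amp;", "&");
        // Collapse runs of ordinary whitespace; non-breaking spaces are left alone.
        return text.replaceAll("[ \\t\\r\\n]+", " ").trim();
    }

    public static void main(String[] args) {
        String html = "<p>Hello&nbsp;<a href=\"http://example.com\">world</a></p>";
        System.out.println(toText(html, true));
    }
}
```

A real parser handles malformed markup, nested structures and the full entity set, which is why a custom handler rather than a few regular expressions would be needed to reproduce the existing output exactly.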
See Also: StringBean, HTML Parser
Inherited fields: transformerDebug, transformerConfig
Constructor and Description |
---|
HtmlParserContentTransformer() |
Modifier and Type | Method and Description |
---|---|
String | getComments(boolean available) Overridden to supply a comment or String of commented out transformation properties that specify any (hard coded or implied) supported transformations. |
boolean | isTransformableMimetype(String sourceMimetype, String targetMimetype, TransformationOptions options) Only support HTML to TEXT. |
void | transformInternal(org.alfresco.service.cmr.repository.ContentReader reader, org.alfresco.service.cmr.repository.ContentWriter writer, TransformationOptions options) Method to be implemented by subclasses wishing to make use of the common infrastructural code provided by this class. |
Methods inherited from class AbstractContentTransformer2:
checkTransformable, getExecutorService, getRetryTransformOnDifferentMimeType, getStrictMimeTypeCheck, getTransformationTime, getTransformationTime, isTransformationLimitedInternally, recordError, recordTime, recordTime, register, setAdditionalThreadTimout, setExecutorService, setMetadataExtracterConfig, setRegisterTransformer, setRegistry, setRetryTransformOnDifferentMimeType, setStrictMimeTypeCheck, setUseTimeoutThread, toString, transform, transform, transform
Methods inherited from class AbstractContentTransformerLimits:
getLimits, getLimits, getLimits, getMaxPages, getMaxSourceSizeKBytes, getMaxSourceSizeKBytes, getPageLimit, getReadLimitKBytes, getReadLimitTimeMs, getTimeoutMs, isPageLimitSupported, isTransformable, isTransformable, isTransformableSize, setLimits, setMaxPages, setMaxSourceSizeKBytes, setMimetypeLimits, setPageLimit, setPageLimitsSupported, setReaderLimits, setReadLimitKBytes, setReadLimitTimeMs, setTimeoutMs, setTransformerDebug
Methods inherited from class ContentTransformerHelper:
deprecatedSetter, equals, getBeanName, getCommentsOnlySupports, getExtensionOrAny, getMimetype, getMimetypeService, getName, getSimpleName, hashCode, isExplicitTransformation, isSupportedTransformation, onlySupports, setBeanName, setExplicitTransformations, setMimetypeService, setSupportedTransformations, setTransformerConfig, setUnsupportedTransformations
Methods inherited from class java.lang.Object:
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface ContentTransformer:
getName, isExplicitTransformation
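The "Only support HTML to TEXT" restriction listed for isTransformableMimetype in the method summary amounts to a simple pair check. The sketch below is an illustration under assumptions rather than the class's source: `HtmlToTextSupportSketch` and `isHtmlToText` are hypothetical names, the mimetype strings are the standard `text/html` and `text/plain` values written out as local constants, and the `options` argument is left out because the summary gives no indication that it affects the result.

```java
// Hypothetical illustration; not the Alfresco implementation.
final class HtmlToTextSupportSketch {

    static final String MIMETYPE_HTML = "text/html";
    static final String MIMETYPE_TEXT_PLAIN = "text/plain";

    // True only for the single supported direction: HTML source, plain-text target.
    // Calling equals on the constants keeps the check null-safe.
    static boolean isHtmlToText(String sourceMimetype, String targetMimetype) {
        return MIMETYPE_HTML.equals(sourceMimetype)
                && MIMETYPE_TEXT_PLAIN.equals(targetMimetype);
    }

    public static void main(String[] args) {
        System.out.println(isHtmlToText("text/html", "text/plain"));
        System.out.println(isHtmlToText("text/plain", "text/html"));
    }
}
```

The reverse direction and any other pair are rejected, so a transformer of this kind is never selected for, for example, HTML to PDF.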
public boolean isTransformableMimetype(String sourceMimetype, String targetMimetype, TransformationOptions options)
Specified by:
isTransformableMimetype in interface ContentTransformer
Overrides:
isTransformableMimetype in class AbstractContentTransformerLimits
Parameters:
sourceMimetype - the source mimetype
targetMimetype - the target mimetype
options - the transformation options
public String getComments(boolean available)
Description copied from class: ContentTransformerHelper
Overridden to supply a comment or String of commented out transformation properties that specify any (hard coded or implied) supported transformations. Relevant when AbstractContentTransformerLimits.isTransformableMimetype(String, String, TransformationOptions) or ContentTransformerWorker.isTransformable(String, String, TransformationOptions) have been overridden. See ContentTransformerHelper.getCommentsOnlySupports(List, List, boolean) which may be used to help construct a comment.
Specified by:
getComments in interface ContentTransformer
Overrides:
getComments in class ContentTransformerHelper
Parameters:
available - indicates if the transformer has been registered and is available to be selected. false indicates that the transformer is only available as a component of a complex transformer.
public void transformInternal(org.alfresco.service.cmr.repository.ContentReader reader, org.alfresco.service.cmr.repository.ContentWriter writer, TransformationOptions options) throws Exception
Description copied from class: AbstractContentTransformer2
Method to be implemented by subclasses wishing to make use of the common infrastructural code provided by this class.
Specified by:
transformInternal in class AbstractContentTransformer2
Parameters:
reader - the source of the content to transform
writer - the target to which to write the transformed content
options - a map of options to use when performing the transformation. The map will never be null.
Throws:
Exception - exceptions will be handled by this class; subclasses can throw anything
Copyright © 2005–2017 Alfresco Software. All rights reserved.