[DOCS-7307] Add Document Transformation Engine 2.4 docs - initial commit
anxumalo committed Aug 3, 2023
1 parent d9b20d4 commit 5fe3994
Showing 10 changed files with 487 additions and 2 deletions.
7 changes: 6 additions & 1 deletion _config.yml
@@ -694,13 +694,18 @@ defaults:
toc: "transformation-engine"
support: true
versions:
- 2.4
- 2.3
- 2.2
- scope:
path: "transformation-engine/latest"
values:
version: 2.3
version: 2.4
latest: true
- scope:
path: "transformation-engine/2.3"
values:
version: 2.3
- scope:
path: "transformation-engine/2.2"
values:
20 changes: 19 additions & 1 deletion _data/toc/transformation-engine.yaml
@@ -1,5 +1,5 @@
# Document Transformation Engine
- version: 2.3
- version: 2.4
pages:
- title: 'Introduction'
path: '/transformation-engine/latest/'
@@ -17,6 +17,24 @@
path: '/transformation-engine/latest/admin/'
- title: 'Using'
path: '/transformation-engine/latest/using/'
- version: 2.3
pages:
- title: 'Introduction'
path: '/transformation-engine/2.3/'
- title: 'Install'
pages:
- title: 'Overview'
path: '/transformation-engine/2.3/install/'
- title: 'Install with MSI'
path: '/transformation-engine/2.3/install/msi/'
- title: 'Install the SDK'
path: '/transformation-engine/2.3/install/sdk/'
- title: 'Configure'
path: '/transformation-engine/2.3/config/'
- title: 'Administer'
path: '/transformation-engine/2.3/admin/'
- title: 'Using'
path: '/transformation-engine/2.3/using/'
- version: 2.2
pages:
- title: 'Introduction'
29 changes: 29 additions & 0 deletions transformation-engine/2.3/admin/index.md
@@ -0,0 +1,29 @@
---
title: Administer the Document Transformation Engine
---

The Document Transformation Engine can be integrated with monitoring tools, such as Nagios or Hyperic, by using HTTP REST calls.

The tool should call the Document Transformation Engine URL with a set of parameters and then monitor the response.

Two calls are available:

* Connection tester call

This call is also used by the Alfresco Transformation client to test availability. It checks that the transformation service is up and responding.

1. URL: `http://<transformation-host>:<port>/transformation-backend/service/transform/v1/version`

2. HTTP Method: `GET`

3. Make sure that you include basic authentication credentials in your call.

* Transformation execution call

This call gets an Office file from the Transformation Service to check whether the transformation engine is still functioning (the Transformation Service makes an internal post, but the HTTP method is still a GET call). This can be used for more in-depth monitoring.

1. URL: `http://<transformation-host>:<port>/transformation-backend/service/transform/v1/available`

2. HTTP Method: `GET`

3. Make sure that you include basic authentication credentials in your call.
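
The two monitoring calls above can be sketched in Python as follows. This is a minimal illustration, not part of the product: the helper names `build_monitor_request` and `is_healthy` are hypothetical, and treating an HTTP 200 response as "healthy" is an assumption, because the page only says that the tool should monitor the response.

```python
import base64
import urllib.request

# Path prefix taken from the two URLs documented above.
BASE_PATH = "/transformation-backend/service/transform/v1"


def build_monitor_request(host, port, check, username, password):
    """Build a GET request with basic authentication for a monitoring call.

    `check` is "version" (connection tester call) or "available"
    (transformation execution call).
    """
    if check not in ("version", "available"):
        raise ValueError(f"unknown check: {check!r}")
    url = f"http://{host}:{port}{BASE_PATH}/{check}"
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


def is_healthy(request, timeout_seconds=10):
    """Return True if the engine answers with HTTP 200 (assumed success criterion)."""
    try:
        with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False
```

A monitoring job could call `is_healthy(build_monitor_request(host, port, "version", user, password))` on a schedule and use the `"available"` check less frequently, since it triggers an actual transformation.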
134 changes: 134 additions & 0 deletions transformation-engine/2.3/config/index.md
@@ -0,0 +1,134 @@
---
title: Configure the Document Transformation Engine
---

The standalone Document Transformation Engine can be configured using the Web Console. You only need to change the password of the transformation service.

1. Open your browser and navigate to `http://<transformation-host>:<port>/transformation-server/#/settings` or `https://` if you are using SSL.

2. Enter your login name and password.

By default, the login name is set to `alfresco`, and the password is set to `alfresco`. The login name `alfresco` cannot be changed.

3. Enter a new password, and then click **Change** to save the password.

<!-- WILL NEED ADDING BACK IN FOR 3.2.1
4. To set up SSL with the Document Transformation Engine, update or replace the keystore in the default location: `C:\\Program Files (x86)\\TransformationServer\\tomcat\\conf\\.keystore` using the method described in [Configuring SSL for a test environment]({% link content-services/latest/admin/security.md %}#managealfkeystores).
See [Managing Alfresco keystores]({% link content-services/latest/config/repository.md %}#configure-ssl-for-a-test-environment) for more information about keystores.
## Configure the Alfresco Transformation client
There are three ways to configure the Alfresco Transformation client:
* Using the `alfresco-global.properties` file
* Using a JMX client, if you have installed the Oracle Java SE Development Kit (JDK)
* Using the `default-configuration.properties` file
### Transformation timeout considerations
There are a number of timeout settings in Alfresco Content Services that affect the Document Transformation Engine. These are the defaults:
```bash
content.transformer.default.timeoutMs=120000
transformserver.transformationTimeout=300
transformer.timeout.default=300
```
`content.transformer.default.timeoutMs` is the system transformation timeout (set to 120000 milliseconds by default), but the Document Transformation Engine is controlled by `transformserver.transformationTimeout` and `transformer.timeout.default`. This means that with the default settings, Alfresco Content Services stops processing after 120 seconds, whereas the Document Transformation Engine attempts to transform a document for up to 300 seconds and any results returned after 120 seconds are ignored.
Set the following to configure the Document Transformation Engine to stop processing at the same time as the default system transformation timeout:
```bash
transformserver.transformationTimeout=120
transformer.timeout.default=120
```
### Configuration using the `alfresco-global.properties` file
You configure the Alfresco Transformation client by adding the relevant properties to the global properties file.
1. Open the `alfresco-global.properties` file.
2. Add the required properties for configuration settings on the Alfresco Transformation client.
3. Save the `alfresco-global.properties` file, and then restart your server.
The following table shows an overview of the available properties:
| Property | Description |
| -------- | ----------- |
| transformserver.aliveCheckTimeout | Sets the timeout for the connection tester in seconds. If the Document Transformation Engine does not answer in this time interval, it is considered to be offline. The default value is `2`. |
| transformserver.test.cronExpression | Sets the cron expression that defines how often the connection tester will check. The default is every 10 seconds: `0/10 * * * * ?` |
| transformserver.disableSSLCertificateValidation | Set this property to `true` to allow self-signed certificates (that is, certificates not issued by an official certificate authority). The default is `false`. |
| transformserver.username | The user name used to connect to the Document Transformation Engine. **Note:** **Do not change** from the default `alfresco`. |
| transformserver.password | The password used to connect to the Document Transformation Engine. **Note:** **Always change** the password from the default `alfresco`. |
| transformserver.qualityPreference | There are two values for this property. The default is `QUALITY`. {::nomarkdown}<ul><li>QUALITY: optimizes the preview for quality.</li><li>SIZE: optimizes the preview for size. This is useful if you have many large Office documents, for example, PPT files over 100 MB.</li></ul>{:/} |
| transformserver.transformationTimeout | Sets the time in seconds to wait for the transformation to complete before assuming that it has hung and therefore stop the transformation. If you are transforming very large or complex files, this time can be increased. The default is `300`. |
| transformserver.url | The URL of your Document Transformation Engine (or the network load balancer if you are using more than one transformation engine). Use `https://` if you want to use encrypted communication between the Alfresco Content Services server and the Document Transformation Engine. |
| transformserver.usePDF_A | Use this setting to transform PDF to PDF/A or to keep PDF/A in PDF/A format. The default is `false`. |
In a normal setup, you will always overwrite the `transformserver.password` and `transformserver.url` properties. If you want to use SSL encryption with the default certificate of the transformation engine, make sure that you set `transformserver.disableSSLCertificateValidation=true`.
### Configuration using JMX
The Alfresco Transformation client configuration parameters are exposed as JMX MBeans, which means that you can view and set the parameters using a JMX client.
See [Using a JMX client to change settings dynamically]({% link content-services/latest/config/index.md %}#using-jmx-client-to-change-settings-dynamically) for instructions on how to connect a JMX client to your server.
### Configuration using the default configuration properties file
You can configure timeout values in the Alfresco Transformation client by adding the relevant properties to the transformation engine configuration file in `C:\\Program Files (x86)\\TransformationServer\\tomcat\\webapps\\transformation-server\\WEB-INF\\classes\\default-configuration.properties`.
Use the code sample to set these timeouts:
```bash
# transformer timeout in seconds
transformer.timeout.default=300
transformer.timeout.word = ${transformer.timeout.default}
transformer.timeout.excel = ${transformer.timeout.default}
transformer.timeout.powerpoint = ${transformer.timeout.default}
```
-->

## Configure DTE with SSL

Below is a very basic example of how to configure Secure Sockets Layer (SSL) for DTE. It forms a good starting point for customers with experience and competencies in DevOps.

1. Edit `C:\Program Files (x86)\TransformationServer\tomcat\conf\server.xml`:

For example:

1. Comment out this connector:

```xml
<Connector executor="tomcatThreadPool"
           port="${https.port}" protocol="org.apache.coyote.http11.Http11NioProtocol"
           SSLEnabled="true">
    <SSLHostConfig>
        <Certificate certificateKeystoreFile="conf/.keystore" certificateKeystorePassword="tomcat" type="RSA" />
    </SSLHostConfig>
</Connector>
```

2. Uncomment this connector:

```xml
<Connector executor="tomcatThreadPool"
           port="${https.port}" protocol="org.apache.coyote.http11.Http11NioProtocol"
           SSLEnabled="true" scheme="https" secure="true"
           clientAuth="false" sslProtocol="TLS"
           keystoreFile="PATH_TO_KEYSTORE" keystorePass="KEYSTORE_PASSWORD" />
```

2. Check the REST configuration URL in the Web Console at `https://<dte-hostname>:8443/transformation-server/#/settings`.

This should be set to: `https://<dte-hostname>:8443`.

3. Edit `alfresco-global.properties`:

Change `localTransform.transform-dte.url=http://<dte-hostname>:8080/transform-dte`

to `localTransform.transform-dte.url=https://<dte-hostname>:8443/transform-dte`

For more information on configuring SSL on Tomcat, see the Tomcat documentation [SSL/TLS Configuration How-To](https://tomcat.apache.org/tomcat-9.0-doc/ssl-howto.html){:target="_blank"}.
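
The property change in step 3 amounts to switching the URL scheme and port. The following Python sketch shows that rewrite for anyone scripting the change. The function name `to_https_url` is hypothetical, and the default port `8443` is taken from the example above; adjust it if your `https.port` differs.

```python
from urllib.parse import urlsplit, urlunsplit


def to_https_url(url, https_port=8443):
    """Return `url` with the scheme set to https and the port replaced.

    Any user information embedded in the URL is dropped; the path, query,
    and fragment are kept unchanged.
    """
    parts = urlsplit(url)
    if parts.hostname is None:
        # Happens when the "//" after the scheme is missing.
        raise ValueError(f"URL has no host: {url!r}")
    netloc = f"{parts.hostname}:{https_port}"
    return urlunsplit(("https", netloc, parts.path, parts.query, parts.fragment))
```

For example, `to_https_url("http://dte.example.com:8080/transform-dte")` yields the value to use for `localTransform.transform-dte.url` once SSL is enabled, where `dte.example.com` is a placeholder host name.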
19 changes: 19 additions & 0 deletions transformation-engine/2.3/index.md
@@ -0,0 +1,19 @@
---
title: Alfresco Document Transformation Engine
---

The Document Transformation Engine is a stable, fast, and scalable solution for high-quality transformations of Microsoft Office documents (Word, Excel, and PowerPoint only) to PDF. It is an enterprise alternative to LibreOffice. It is an Alfresco Content Services module that is enabled with a license key.

The engine features an open architecture and offers the following features:

* **High quality**: The Document Transformation Engine uses genuine Microsoft Office software to transform Word, Excel, and PowerPoint documents to PDF. This guarantees the handling of the supported file types and pixel-perfect transformations, and it corrects previous layout issues in the Share preview feature.

The Document Transformation Engine can also be used to convert emails to PDFs. This is a useful feature in conjunction with the Outlook Plugin.

* **Scalable**: The Document Transformation Engine communicates with Alfresco Content Services using an HTTP REST API, which means that you can scale up by adding multiple instances of the engine and connecting them through a standard HTTP Network Load Balancer.

* **Stable**: If Microsoft Office can open and transform your document, then so can the Document Transformation Engine. Robust error handling will take care of corrupt and encrypted documents. A Web Console shows you a detailed report if there is a problem during transformation, allowing you to correct documents.

* **Fast**: The Document Transformation Engine transforms multi-megabyte Office documents two to three times faster than LibreOffice on the same hardware.

* **Extensible format support**: The Document Transformation Engine supports the transformation of MS Office formats.
60 changes: 60 additions & 0 deletions transformation-engine/2.3/install/index.md
@@ -0,0 +1,60 @@
---
title: Installation overview
---

The standalone Document Transformation Engine runs on Microsoft Windows and provides file transformations.

## Prerequisites

There are a number of important notes to consider when installing the Document Transformation Engine in addition to the [supported platforms]({% link transformation-engine/2.3/support/index.md %}).

* The Document Transformation Engine requires an installation of [Alfresco Transform Service]({% link transform-service/latest/install/index.md %}).

* The standalone Document Transformation Engine requires the software components to be installed and available on the same machine.

* Only install the English versions of Microsoft Windows Server 2012, Microsoft Windows Server 2016 or Microsoft Windows Server 2019, and Microsoft Office because other languages cause encoding issues resulting in unpredictable behavior.

> **Note:** Although the engine must be configured in English, this has no impact on the transformation language used for documents.

* Microsoft Office 2016 or 2019 (32-bit or 64-bit).

* To enable the Document Transformation Engine to work with non-English documents, you must install the Microsoft Office language pack for each language you want to work with.

* The Document Transformation Engine does not work with Windows non-English regional settings.

* Make sure that the Windows print spooler service is running.

### Sizing

There are a number of recommendations for calculating sizing. You will need:

* Four cores with a high clock speed per engine, with between 4 GB and 6 GB RAM. If you find that you need more power, it is better to add another engine instance with a similar specification than to upgrade the hardware, because Microsoft Office does not scale well on a single machine.

* Between 10 GB and 15 GB of free space. Storage is not that important, but if you have lots of large files, you should make sure that creating temporary copies of those files will not slow the system down.

* Gigabit Ethernet.

* At least one CPU for each concurrent transformation that is expected to be processed by the engine.
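
As a rough illustration of how these recommendations combine, the sketch below estimates how many engine instances a target level of concurrency implies. It assumes one core per concurrent transformation and four cores per engine, as listed above. The function `engines_needed` is hypothetical and gives a starting point only, not an official sizing formula.

```python
import math

# From the sizing list: four cores per engine instance.
CORES_PER_ENGINE = 4


def engines_needed(concurrent_transformations, cores_per_engine=CORES_PER_ENGINE):
    """Estimate engine instances, assuming one core per concurrent transformation."""
    if concurrent_transformations < 0:
        raise ValueError("concurrent_transformations must not be negative")
    if cores_per_engine < 1:
        raise ValueError("cores_per_engine must be at least 1")
    # Always plan for at least one engine, and round up partial engines.
    return max(1, math.ceil(concurrent_transformations / cores_per_engine))
```

Under these assumptions, up to four concurrent transformations fit on one engine, and a fifth implies adding a second instance behind the load balancer rather than adding cores to the first.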

### Disk I/O bandwidth

Microsoft Office transformations are I/O-heavy, so on some solutions I/O contention can become a performance bottleneck. When multiple Word conversions occur in parallel, performance can suffer heavily from poor random read and write speeds.

## Installation

The Document Transformation Engine is installed using an `msi` file, which also gives you the option to install a T-Engine at the same time. Alternatively, you can install the Document Transformation Engine using the `msi` and use Docker Compose to install the T-Engine. See [Install with MSI]({% link transformation-engine/2.3/install/msi.md %}) for more details. There is also an [SDK that can be installed]({% link transformation-engine/2.3/install/sdk.md %}).

### Set `JAVA_HOME`

If you're using a JDK that does not set a registry key, you need to manually set the `JAVA_HOME` system variable. This mostly happens when the JDK is installed from a `zip` package.

1. Locate your JDK installation (it's most likely in a directory such as `C:\Program Files\jdk-11.x.x`).
2. Search for **Advanced system settings**.
3. Select **View advanced system settings > Environment Variables**.
4. In the **System variables** section (or the **User variables** section for a single-user setting), click **New**.
5. Add the following settings:

* Variable name = `JAVA_HOME`
* Variable value = path to the JDK installation (from step 1).

6. Click **OK** (twice) and finally click **Apply** to save the changes.
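
To confirm that the variable took effect, you can run a quick check from a new terminal session. The Python sketch below is one way to do this. The function `check_java_home` is hypothetical, and it assumes the usual JDK layout with the `java` executable in a `bin` subdirectory.

```python
import os
from pathlib import Path


def check_java_home(environ=None):
    """Return the path to the java executable under JAVA_HOME, or None.

    None means JAVA_HOME is unset, empty, or does not contain bin/java(.exe).
    """
    environ = os.environ if environ is None else environ
    java_home = environ.get("JAVA_HOME")
    if not java_home:
        return None
    bin_dir = Path(java_home) / "bin"
    # java.exe on Windows; plain "java" is checked as a fallback.
    for name in ("java.exe", "java"):
        candidate = bin_dir / name
        if candidate.is_file():
            return candidate
    return None
```

Calling `check_java_home()` with no arguments reads the current process environment, so open a new terminal after applying the change before running it.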