Transformation of PDF created with ilovepdf continues indefinitely


The attached PDF continues to be transformed until the hard disk runs out of space.

Steps to reproduce:

1 Set up T-Engine 1.3 with the AIO Transformer at http://localhost:8090/
2 Go to http://localhost:8090/
3 In the Tika test section, set the source to application/pdf
4 Upload the attached PDF
5 Click Transform

Expected behaviour:

The transformation completes successfully.

Observed behaviour:

The transformation does not complete and the hard disk fills up.


I tested on Windows with the new T-Engine. The customer used Linux with Legacy.

Note (astrachan) - attachments (thread dumps and test PDF) are located in and removed from this ticket.


Windows, Linux


Kristian Dimitrov
April 16, 2021, 1:34 PM

Confirmed - Fixed.

Tested with docker image built from most recent transform-core master.

Command used: docker run -p 8090:8090 -e PDFBOX_NOTEXTRACTBOOKMARKS_DEFAULT='true' <AIO docker container id>

Note: The bug still reproduces if the above flag is not set when the app/Docker container is deployed.

David Edwards
April 7, 2021, 1:20 PM

Exposes a new variable to the Tika and AIO T-engines to control the default behaviour of the notExtractBookmarksText request parameter, similar to the previous repo workaround.
This variable can be set in one of two ways:

  1. Through the application-default.yaml file of the T-engine. Update/add the following variable:

  2. Through an environment variable (this can be passed through to Helm/docker-compose):
    docker-compose example snippet ("" quote marks are required here):

The default value for this variable is false so that previous functionality is maintained, i.e. if notExtractBookmarksText is not passed, the transformation will, as it always has, attempt to extract the bookmarks text.
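
For illustration, a minimal docker-compose sketch of the environment-variable route. It assumes the variable name PDFBOX_NOTEXTRACTBOOKMARKS_DEFAULT used in the docker run command from the verification comment on this ticket. The service name and the image placeholder are hypothetical and need to be replaced with the real values for the deployment.

```yaml
services:
  # Hypothetical service name, chosen for this sketch only.
  transform-core-aio:
    # Placeholder: substitute the AIO T-engine image actually deployed.
    image: "<AIO docker image>"
    ports:
      - "8090:8090"
    environment:
      # The value is quoted because Compose expects a string here;
      # an unquoted true would be parsed by YAML as a boolean.
      PDFBOX_NOTEXTRACTBOOKMARKS_DEFAULT: "true"
```

With the value set to "true", requests that do not pass notExtractBookmarksText should skip bookmark text extraction by default. Leaving the variable unset keeps the previous behaviour.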

Marina Oliveira
March 30, 2021, 8:11 AM

Can you suggest potential options and, if my help is needed, specify how I can help?

Scott Ashcraft
March 29, 2021, 7:28 PM

seems to have been broken for the last couple months. Issue is now with but I don't know current status.

David Edwards
March 26, 2021, 3:15 PM

It looks like the content.transformer.PdfBox.extractBookmarksText=false property was removed in ACS 7.0.0, and I can confirm that there is currently no way to set extractBookmarksText to false by default. I’m currently looking into updating the Tika T-engine to accept such a parameter, which will also update the AIO engine.





Marco Tonelli


Delivery Team

Team 6

Bug Priority

Category 2