OpenAI Generative Image (Dall-E) Nuxeo Integration
Generates an image from a user-supplied text prompt using OpenAI Dall-E.
Configuration
Add the following properties to nuxeo.conf, replacing ${ORGANIZATION} and ${API_KEY} with your own values:
generative.ai.openai.organization=${ORGANIZATION}
generative.ai.openai.apikey=${API_KEY}
OpenAI Summary (ChatGPT) Nuxeo Integration
Generates a text summary of a file using OpenAI ChatGPT.
Technical documentation
Architecture
This integration uses a Producer/Consumer pattern for asynchronous processing of the input file.
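As a rough illustration of the pattern only, the sketch below uses a plain in-memory queue from the JDK. It is not the Nuxeo Stream API; the class name `ProducerConsumerSketch`, the `process` method, and the sentinel value are hypothetical and exist only for this example.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for the producer/consumer flow; NOT the Nuxeo Stream API.
class ProducerConsumerSketch {

    // Sentinel telling the consumer that no more chunks will arrive (assumption).
    private static final String END = "__END__";

    static List<String> process(List<String> chunks) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> results = new CopyOnWriteArrayList<>();

        // Consumer: takes items off the queue until it sees the sentinel.
        Thread consumer = new Thread(() -> {
            try {
                for (String item = queue.take(); !item.equals(END); item = queue.take()) {
                    results.add("processed:" + item);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        // Producer: publishes one item per chunk, then the sentinel.
        chunks.forEach(queue::add);
        queue.add(END);

        try {
            consumer.join(); // wait until every item has been consumed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(process(List.of("page-1", "page-2")));
    }
}
```

The producer never waits for the consumer to finish an item, which is the property that makes the processing asynchronous.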
Input
The input for this integration is a file uploaded in Nuxeo. If the file is not in PDF format, it is converted to a PDF using the registered converters, a process called Blob Conversion.
Processes
The architecture consists of the following core concepts:
- Blob Conversion: This operation converts non-PDF files to the PDF format using registered converters.
- Nuxeo Stream (Bulk Actions): A Nuxeo Bulk Action processes text chunks and assembles the summary asynchronously. It involves:
  - SummaryServiceImpl.summaryProducer(): This method converts the document to PDF (if it is not one already), splits the text into pages, and produces records.
  - PageSummaryComputation: This computation calls the OpenAI summary endpoint for each text chunk and saves the result to a Key-Value Store (KVS).
  - SummaryDoneComputation: Once all consumers are done, this computation reads the page summaries from the KVS, merges them in page order, and saves the merged summary.
- Event Listeners:
  - SummaryListener: This listener is notified as soon as a document is created or modified. If the main blob is dirty, it fires an “extractSummaryEvent” event.
  - BulkSummarizeListener: This listener catches the “extractSummaryEvent” event, converts the blob to a PDF, and splits the document into pages (producer role). It is configured to run in a dedicated queue named “bulkSummarizeListener”.
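The split, summarize, and merge steps described above can be sketched in plain Java as follows. This is a minimal illustration under stated assumptions, not the plugin's code: a `ConcurrentHashMap` stands in for the KVS, pages are assumed to be separated by form-feed characters, the summarizer is passed in as a function instead of calling OpenAI, and all class and method names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical sketch of the split / summarize / merge flow; names are illustrative only.
class SummaryPipelineSketch {

    // Producer role: split the text into pages (assumption: pages are form-feed separated).
    static List<String> splitIntoPages(String text) {
        return List.of(text.split("\f"));
    }

    // Per-page role: summarize each page, possibly out of order, and store the
    // result under its page index. The map stands in for the Key-Value Store.
    static Map<Integer, String> summarizePages(List<String> pages, Function<String, String> summarizer) {
        Map<Integer, String> kvs = new ConcurrentHashMap<>();
        IntStream.range(0, pages.size())
                 .parallel()
                 .forEach(i -> kvs.put(i, summarizer.apply(pages.get(i))));
        return kvs;
    }

    // Final role: read the page summaries back in page order and merge them.
    static String merge(Map<Integer, String> kvs, int pageCount) {
        return IntStream.range(0, pageCount)
                        .mapToObj(i -> kvs.get(i))
                        .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        List<String> pages = splitIntoPages("first page\fsecond page");
        Map<Integer, String> kvs = summarizePages(pages, String::toUpperCase);
        System.out.println(merge(kvs, pages.size()));
    }
}
```

Keying each result by page index is what lets the final step restore the correct order even when pages finish out of order.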
Output
The output is a Nuxeo document with the merged summary of all PDF pages. A facet, “SummaryFacet”, is dynamically added to all documents that can be summarized. The summary is saved on the document in a new property: “summary:summary”.
Additionally, the following properties are set:
- “summary:lastComputed”: The last date the summary was computed.
- “summary:status”: Indicates the status of the summary generation – DONE if successful, ERROR if not.
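To make the relationship between these properties concrete, here is a small hedged sketch that fills a plain map instead of a Nuxeo document. The class and method names are hypothetical, and updating "summary:lastComputed" on failure as well as on success is an assumption of this example, not something the description above confirms.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: a plain map stands in for the document's summary properties.
class SummaryStatusSketch {

    static Map<String, Object> computeProperties(Supplier<String> summaryTask) {
        Map<String, Object> props = new LinkedHashMap<>();
        try {
            props.put("summary:summary", summaryTask.get());
            props.put("summary:status", "DONE");
        } catch (RuntimeException e) {
            // The summary property is left unset when generation fails.
            props.put("summary:status", "ERROR");
        }
        // Assumption: the timestamp is recorded for both outcomes.
        props.put("summary:lastComputed", Instant.now());
        return props;
    }

    public static void main(String[] args) {
        System.out.println(computeProperties(() -> "A short summary."));
    }
}
```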
Configuration
Follow the steps below to configure the integration:
- Add the OpenAI API token in nuxeo.conf as openai.token:<api-token>.
- The default configurations include:
summary.extraction.openai.url=https://api.openai.com/v1/completions
summary.extraction.openai.model=text-davinci-003
summary.extraction.openai.temperature=0.3
summary.extraction.openai.max-tokens=200
summary.extraction.openai.top-p=1
summary.extraction.openai.presence-penalty=0
maretha.summary.maxRetries=10
feature.summary.auto.generation.enabled=true
summary.extraction.enable.mime-types=text/plain,application/pdf,application/msword,application/vnd.openxmlformats-officedocument.wordprocessingml.document
- Enable or disable automatic summary generation at document creation by setting feature.summary.auto.generation.enabled to true or false.
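For orientation, the sketch below shows how the default values could map onto a JSON body for the completions endpoint. The JSON field names (model, prompt, temperature, max_tokens, top_p, presence_penalty) are parameters of the OpenAI completions API; the mapping from the property names to those fields, the class name, and the hand-rolled JSON building are assumptions of this example, and the plugin may build its request differently.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch: maps the default properties onto a completions request body.
class CompletionRequestSketch {

    static final String DEFAULTS = String.join("\n",
        "summary.extraction.openai.model=text-davinci-003",
        "summary.extraction.openai.temperature=0.3",
        "summary.extraction.openai.max-tokens=200",
        "summary.extraction.openai.top-p=1",
        "summary.extraction.openai.presence-penalty=0");

    // Parses key=value lines the same way a Java properties file is parsed.
    static Properties load(String conf) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(conf));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Builds the JSON body by hand to keep the example dependency-free.
    static String buildBody(Properties props, String prompt) {
        String prefix = "summary.extraction.openai.";
        return "{"
            + "\"model\":\"" + props.getProperty(prefix + "model") + "\","
            + "\"prompt\":\"" + escape(prompt) + "\","
            + "\"temperature\":" + props.getProperty(prefix + "temperature") + ","
            + "\"max_tokens\":" + props.getProperty(prefix + "max-tokens") + ","
            + "\"top_p\":" + props.getProperty(prefix + "top-p") + ","
            + "\"presence_penalty\":" + props.getProperty(prefix + "presence-penalty")
            + "}";
    }

    // Minimal escaping for the example; a real client should use a JSON library.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        System.out.println(buildBody(load(DEFAULTS), "Summarize: some page text"));
    }
}
```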
The system supports the following mime-types: text/plain, application/pdf, application/msword, application/vnd.openxmlformats-officedocument.wordprocessingml.document.
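A check against this list could look like the sketch below, which parses a comma-separated value such as the one in summary.extraction.enable.mime-types. The class and method names are hypothetical, and the trimming and lower-casing rules are assumptions of this example.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch of a MIME-type allow-list check; not the plugin's actual code.
class MimeTypeFilterSketch {

    // Splits a comma-separated property value into a normalized set.
    static Set<String> parse(String value) {
        return Arrays.stream(value.split(","))
                     .map(s -> s.trim().toLowerCase(Locale.ROOT))
                     .filter(s -> !s.isEmpty())
                     .collect(Collectors.toSet());
    }

    // Returns true when the blob's MIME type is in the enabled set.
    static boolean isSupported(Set<String> enabled, String mimeType) {
        return mimeType != null && enabled.contains(mimeType.trim().toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Set<String> enabled = parse("text/plain,application/pdf,application/msword,"
            + "application/vnd.openxmlformats-officedocument.wordprocessingml.document");
        System.out.println(isSupported(enabled, "application/pdf"));
        System.out.println(isSupported(enabled, "image/png"));
    }
}
```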
Concurrency and retry policies are configured in summary-stream-contrib.xml. By default, maxRetries is set to 10, with Nuxeo attempting a retry every 10 seconds. However, this value can be modified as required.
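The behavior of a fixed-delay retry policy can be illustrated with the following plain-Java sketch. It is not how Nuxeo implements its stream retry policy; the class and method names are hypothetical, and the example only shows what "up to 10 retries, 10 seconds apart" means for a failing call.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Hypothetical illustration of a fixed-delay retry loop; NOT the Nuxeo implementation.
class FixedDelayRetrySketch {

    // Runs the call once, then retries up to maxRetries times with a fixed pause.
    static <T> T callWithRetries(Supplier<T> call, int maxRetries, Duration delay) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        RuntimeException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxRetries) {
                    sleep(delay); // wait before the next attempt
                }
            }
        }
        throw last; // every attempt failed
    }

    private static void sleep(Duration delay) {
        try {
            Thread.sleep(delay.toMillis());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to retry", e);
        }
    }

    public static void main(String[] args) {
        // With the defaults described above: 10 retries, 10 seconds apart.
        String result = callWithRetries(() -> "summary", 10, Duration.ofSeconds(10));
        System.out.println(result);
    }
}
```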