Angular WebUI for the Nuxeo Platform

Maretha Solutions Angular WebUI for the Nuxeo Platform

Contact us for licensing opportunities

React WebUI for the Nuxeo Platform

Maretha Solutions React WebUI for the Nuxeo Platform

Contact us for licensing opportunities

OpenAI Generative Image (DALL-E) Nuxeo Integration

Generates an image from a user-supplied text prompt using OpenAI DALL-E.

Configuration

Add the following properties to nuxeo.conf, replacing ${ORGANIZATION} and ${API_KEY} with your OpenAI organization ID and API key:

generative.ai.openai.organization=${ORGANIZATION}

generative.ai.openai.apikey=${API_KEY}
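
As a rough illustration of how these two properties might be consumed, the sketch below parses nuxeo.conf-style lines and builds the headers and JSON body for OpenAI's image generation endpoint. The helper names `parse_conf` and `build_image_request`, the sample values, and the default image size are assumptions made for this example; the integration's own code is not shown in this document. Nothing is sent over the network.

```python
import json

IMAGES_URL = "https://api.openai.com/v1/images/generations"

def parse_conf(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props

def build_image_request(props, prompt, size="1024x1024"):
    """Return (url, headers, body) for an image generation call (hypothetical helper)."""
    headers = {
        "Authorization": "Bearer " + props["generative.ai.openai.apikey"],
        "OpenAI-Organization": props["generative.ai.openai.organization"],
        "Content-Type": "application/json",
    }
    body = json.dumps({"prompt": prompt, "n": 1, "size": size})
    return IMAGES_URL, headers, body

# Placeholder values, not real credentials.
conf = """
generative.ai.openai.organization=org-example
generative.ai.openai.apikey=sk-example
"""
url, headers, body = build_image_request(parse_conf(conf), "a red bicycle")
print(url)
print(headers["OpenAI-Organization"])
```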



OpenAI Summary (ChatGPT) Nuxeo Integration

Generates a text summary from a file using OpenAI ChatGPT

Technical documentation

Architecture

This integration uses a Producer/Consumer pattern for asynchronous processing of the input file.

Input

The input for this integration is a file uploaded in Nuxeo. If the file is not in PDF format, it is converted to a PDF using the registered converters, a process called Blob Conversion.

Processes

The architecture consists of the following core concepts:

  1. Blob Conversion: This operation converts non-PDF files to the PDF format using registered converters.
  2. Nuxeo Stream (Bulk Actions): A Nuxeo Bulk Action processes text chunks and assembles the summary asynchronously. It involves:
    • SummaryServiceImpl.summaryProducer(): This method converts the document to PDF (if it is not one already), splits the text into pages, and produces records.
    • PageSummaryComputation: This component calls the OpenAI summary endpoint for each text chunk and saves the result to a Key-Value Store (KVS).
    • SummaryDoneComputation: Once all consumers are done, this component reads the per-page summaries from the KVS, merges them in page order, and saves the merged summary.
  3. Event Listeners:
    • SummaryListener: This listener gets notified as soon as a document is created or modified. If the main blob is dirty, it fires an “extractSummaryEvent” event.
    • BulkSummarizeListener: This listener catches the “extractSummaryEvent”, converts the blob to a PDF, and splits the document into pages (producer role). It is configured to run in a dedicated queue named “bulkSummarizeListener”.
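
The producer/consumer flow described above can be approximated in plain Python. This is only a sketch of the pattern, not the Nuxeo Stream implementation: a `queue.Queue` stands in for the stream, a dict stands in for the KVS, and `fake_summarize` is a hypothetical placeholder for the OpenAI call.

```python
import queue
import threading

def fake_summarize(text):
    # Hypothetical stand-in for the OpenAI call: keep the first sentence.
    return text.split(".")[0].strip() + "."

def produce(pages, records):
    """Producer: emit one (page_index, text) record per page."""
    for index, text in enumerate(pages):
        records.put((index, text))

def consume(records, kvs, lock):
    """Consumer: summarize each record and store it under its page index."""
    while True:
        try:
            index, text = records.get_nowait()
        except queue.Empty:
            return
        summary = fake_summarize(text)
        with lock:
            kvs[index] = summary

def summarize_document(pages, workers=3):
    records, kvs, lock = queue.Queue(), {}, threading.Lock()
    produce(pages, records)
    threads = [threading.Thread(target=consume, args=(records, kvs, lock))
               for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # the "done" step runs only after every consumer has finished
    # Merge per-page summaries in page order, whatever order they completed in.
    return " ".join(kvs[i] for i in sorted(kvs))

pages = ["First page. More text.", "Second page. Details.", "Third page."]
print(summarize_document(pages))
```

Keying the store by page index is what lets the final step restore the original order even though consumers finish in an arbitrary order.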

Output

The output is a Nuxeo document with the merged summary of all PDF pages. A facet, “SummaryFacet”, is dynamically added to all documents that can be summarized. The summary is saved on the document in a new property: “summary:summary”.

Additionally, the following properties are set:

  • “summary:lastComputed”: The last date the summary was computed.
  • “summary:status”: Indicates the status of the summary generation – DONE if successful, ERROR if not.
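
A minimal sketch of how these properties might be populated, using a plain dict in place of a Nuxeo document. The `record_summary_result` helper is hypothetical; only the property names and the DONE/ERROR values come from the description above. Two details are assumptions: the sketch leaves “summary:summary” untouched on failure, and it updates “summary:lastComputed” in both cases.

```python
from datetime import datetime, timezone

def record_summary_result(doc, summarize):
    """Run summarize() and record the outcome in the summary:* properties."""
    try:
        doc["summary:summary"] = summarize()
        doc["summary:status"] = "DONE"
    except Exception:
        # Assumption: a failed run only flips the status.
        doc["summary:status"] = "ERROR"
    # Assumption: the timestamp is refreshed on success and on failure.
    doc["summary:lastComputed"] = datetime.now(timezone.utc).isoformat()
    return doc

def failing():
    raise RuntimeError("API unavailable")

ok = record_summary_result({}, lambda: "A short summary.")
failed = record_summary_result({}, failing)
print(ok["summary:status"], failed["summary:status"])
```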

Configuration

Follow the steps below to configure the integration:

  1. Add the OpenAI API token in nuxeo.conf as openai.token:<api-token>.
  2. The default configurations include:
summary.extraction.openai.url=https://api.openai.com/v1/completions
summary.extraction.openai.model=text-davinci-003
summary.extraction.openai.temperature=0.3
summary.extraction.openai.max-tokens=200
summary.extraction.openai.top-p=1
summary.extraction.openai.presence-penalty=0
maretha.summary.maxRetries=10
feature.summary.auto.generation.enabled=true
summary.extraction.enable.mime-types=text/plain,application/pdf,application/msword,application/vnd.openxmlformats-officedocument.wordprocessingml.document
  3. Enable or disable generating the summary automatically at document creation by setting feature.summary.auto.generation.enabled to true or false.

By default, summaries are generated for the following MIME types, as listed in summary.extraction.enable.mime-types: text/plain, application/pdf, application/msword, application/vnd.openxmlformats-officedocument.wordprocessingml.document.
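
The default properties map fairly directly onto a request to the OpenAI completions endpoint. The sketch below shows one plausible mapping plus a MIME type check; it is an illustration under assumptions, not the integration's code. The function names and the prompt wording are invented for the example, while the parameter names `model`, `prompt`, `temperature`, `max_tokens`, `top_p`, and `presence_penalty` are those of the OpenAI completions API.

```python
DEFAULTS = {
    "summary.extraction.openai.url": "https://api.openai.com/v1/completions",
    "summary.extraction.openai.model": "text-davinci-003",
    "summary.extraction.openai.temperature": "0.3",
    "summary.extraction.openai.max-tokens": "200",
    "summary.extraction.openai.top-p": "1",
    "summary.extraction.openai.presence-penalty": "0",
    "summary.extraction.enable.mime-types": (
        "text/plain,application/pdf,application/msword,"
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
    ),
}

def is_summarizable(mime_type, props=DEFAULTS):
    """Return True if the MIME type is in the enabled list."""
    enabled = props["summary.extraction.enable.mime-types"].split(",")
    return mime_type in {m.strip() for m in enabled}

def build_completion_payload(page_text, props=DEFAULTS):
    """Build the JSON-ready body for one page (hypothetical helper)."""
    prefix = "summary.extraction.openai."
    return {
        "model": props[prefix + "model"],
        # Assumption: the real prompt wording is not documented here.
        "prompt": "Summarize the following text:\n\n" + page_text,
        "temperature": float(props[prefix + "temperature"]),
        "max_tokens": int(props[prefix + "max-tokens"]),
        "top_p": float(props[prefix + "top-p"]),
        "presence_penalty": float(props[prefix + "presence-penalty"]),
    }

print(is_summarizable("application/pdf"), is_summarizable("image/png"))
print(build_completion_payload("Some page text.")["max_tokens"])
```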

Concurrency and retry policies are configured in summary-stream-contrib.xml. By default, maxRetries is set to 10, and Nuxeo attempts a retry every 10 seconds. The maxRetries value can be changed as required.
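
The effect of a fixed-delay retry policy can be sketched as follows. This illustrates the general technique only; the actual behaviour is defined by the Nuxeo Stream configuration. The `call_with_retries` helper is hypothetical, and the sketch assumes that maxRetries counts retries after the first attempt.

```python
import time

def call_with_retries(action, max_retries=10, delay_seconds=10, sleep=time.sleep):
    """Call action(); on failure wait delay_seconds and retry, up to max_retries retries."""
    attempt = 0
    while True:
        try:
            return action()
        except Exception:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            attempt += 1
            sleep(delay_seconds)

# Usage: a call that fails twice, then succeeds. The sleep function is
# replaced with a recorder so the example does not actually wait.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("temporary failure")
    return "summary"

waits = []
result = call_with_retries(flaky, sleep=waits.append)
print(result, waits)
```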