
WordPress - AI Chatbot to enhance user experience - with Supabase and OpenAI

Published 2 months ago

Created by

dataki
Dataki

Template description

This is the first version of a template for a RAG/GenAI App using WordPress content.

As creating, sharing, and improving templates brings me joy 😄, feel free to reach out on LinkedIn if you have any ideas to enhance this template!

How It Works

This template includes three workflows:

  • Workflow 1: Generate embeddings for your WordPress posts and pages, then store them in the Supabase vector store.
  • Workflow 2: Handle upserts for WordPress content when edits are made.
  • Workflow 3: Enable chat functionality by performing Retrieval-Augmented Generation (RAG) on the embedded documents.

Why use this template?

This template can be applied to various use cases:

  • Build a GenAI application that requires embedded documents from your website's content.
  • Embed or create a chatbot page on your website to enhance user experience as visitors search for information.
  • Gain insights into the types of questions visitors are asking on your website.
  • Simplify content management by asking the AI for related content ideas or checking if similar content already exists. Useful for internal linking.

Prerequisites

  • Access to Supabase for storing embeddings.
  • Basic knowledge of Postgres and pgvector.
  • A WordPress website with content to be embedded.
  • An OpenAI API key
  • Ensure that your n8n workflow, Supabase instance, and WordPress website are set to the same timezone (or use GMT) for consistency.

Workflow 1 : Initial Embedding

This workflow retrieves your WordPress pages and posts, generates embeddings from the content, and stores them in Supabase using pgvector.

Step 0 : Create Supabase tables

Nodes :

  • Postgres - Create Documents Table: This table is structured to support OpenAI embedding models with 1536 dimensions
  • Postgres - Create Workflow Execution History Table

These two nodes create tables in Supabase:

  • The documents table, which stores embeddings of your website content.
  • The n8n_website_embedding_histories table, which logs workflow executions for efficient management of upserts. This table tracks the workflow execution ID and execution timestamp.
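As a rough sketch of what these two nodes might run, the statements below follow the common pgvector pattern for a documents table. Only the two table names and the 1536-dimension embedding column come from the description above; every column name (content, metadata, workflow_execution_id, executed_at) is an assumption for illustration, so check the template's Postgres nodes for the actual definitions.

```python
# Hypothetical SQL for the two "Create table" Postgres nodes.
# Column names are assumptions; only the table names and the vector size
# are taken from the template description.
EMBEDDING_DIMENSIONS = 1536  # OpenAI embedding models with 1536 dimensions

CREATE_DOCUMENTS_TABLE = f"""
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS documents (
    id        bigserial PRIMARY KEY,
    content   text,                           -- chunk of page/post text
    metadata  jsonb,                          -- title, URL, dates, WordPress ID
    embedding vector({EMBEDDING_DIMENSIONS})  -- pgvector column
);
"""

CREATE_HISTORY_TABLE = """
CREATE TABLE IF NOT EXISTS n8n_website_embedding_histories (
    id                    bigserial PRIMARY KEY,
    workflow_execution_id text,                      -- assumed column name
    executed_at           timestamptz DEFAULT now()  -- assumed column name
);
"""

print(CREATE_DOCUMENTS_TABLE)
print(CREATE_HISTORY_TABLE)
```

Storing the execution timestamp as timestamptz is one way to sidestep the timezone mismatch mentioned in the prerequisites.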

Step 1 : Retrieve and Merge WordPress Pages and Posts

Nodes :

  • WordPress - Get All Posts
  • WordPress - Get All Pages
  • Merge WordPress Posts and Pages

These three nodes retrieve all content and metadata from your posts and pages and merge them.
Important: Apply filters to avoid generating embeddings for all site content.

Step 2 : Set Fields, Apply Filter, and Transform HTML to Markdown

Nodes :

  • Set Fields
  • Filter - Only Published & Unprotected Content
  • HTML to Markdown

These three nodes prepare the content for embedding by:

  1. Setting up the necessary fields for content embeddings and document metadata.
  2. Filtering to include only published and unprotected content (protected=false), ensuring private or unpublished content is excluded from your GenAI application.
  3. Converting HTML to Markdown, which enhances performance and relevance in Retrieval-Augmented Generation (RAG) by optimizing document embeddings.
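The three preparation steps above can be sketched in plain Python. This is an illustration, not the nodes' actual implementation: the input shape follows the WordPress REST API post object (status, content.rendered, content.protected), the output field names are hypothetical, and the converter only handles headings, paragraphs, and list items, far less than the real HTML to Markdown node.

```python
from html.parser import HTMLParser


def is_embeddable(item: dict) -> bool:
    """Keep only published content that is not password protected."""
    protected = item.get("content", {}).get("protected", False)
    return item.get("status") == "publish" and not protected


def set_fields(item: dict) -> dict:
    """Flatten a WordPress REST API item into fields used for embedding.

    The output keys are hypothetical names chosen for this sketch.
    """
    return {
        "id": item["id"],
        "title": item["title"]["rendered"],
        "url": item["link"],
        "publication_date": item["date"],
        "modification_date": item["modified"],
        "content_html": item["content"]["rendered"],
    }


class _MarkdownConverter(HTMLParser):
    """Tiny HTML-to-Markdown converter: headings, paragraphs, list items."""

    def __init__(self) -> None:
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
            self.parts.append("\n\n" + "#" * int(tag[1]) + " ")
        elif tag == "p":
            self.parts.append("\n\n")
        elif tag == "li":
            self.parts.append("\n- ")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_markdown(html: str) -> str:
    """Convert a small subset of HTML to Markdown, dropping other tags."""
    converter = _MarkdownConverter()
    converter.feed(html)
    converter.close()
    return "".join(converter.parts).strip()
```

Dropping tags and attributes before embedding means fewer tokens are spent on markup, which is the motivation given above for the Markdown conversion.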

Step 3: Generate Embeddings, Store Documents in Supabase, and Log Workflow Execution

Nodes:

  • Supabase Vector Store
    • Sub-nodes:
      • Embeddings OpenAI
      • Default Data Loader
      • Token Splitter
      • Aggregate
  • Supabase - Store Workflow Execution

This step involves generating embeddings for the content and storing it in Supabase, followed by logging the workflow execution details.

  1. Generate Embeddings: The Embeddings OpenAI node generates vector embeddings for the content.
  2. Load Data: The Default Data Loader prepares the content for embedding storage. The metadata stored includes the content title, publication date, modification date, URL, and ID, which is essential for managing upserts.

โš ๏ธ Important Note : Be cautious not to store any sensitive information in metadata fields, as this information will be accessible to the AI and may appear in user-facing answers.

  3. Token Management: The Token Splitter ensures that content is segmented into manageable sizes to comply with token limits.
  4. Aggregate: Collapses the stored chunks into a single item so that the last node runs only once.
  5. Store Execution Details: The Supabase - Store Workflow Execution node saves the workflow execution ID and timestamp, enabling tracking of when each content update was processed.
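To make the splitting and metadata steps concrete, here is a minimal sketch of overlapping chunking. It approximates tokens with whitespace-separated words, which is not what the Token Splitter node does internally, and the default chunk size and overlap are assumed values rather than the template's settings.

```python
def split_into_chunks(text: str, chunk_size: int = 300, chunk_overlap: int = 30) -> list:
    """Split text into overlapping chunks of at most chunk_size words.

    Words stand in for tokens here; a real token splitter would use the
    model's tokenizer instead.
    """
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    tokens = text.split()
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end of the text
    return chunks


def build_documents(markdown: str, metadata: dict, **splitter_options) -> list:
    """Attach a copy of the same metadata to every chunk of one page or post."""
    return [
        {"content": chunk, "metadata": dict(metadata)}
        for chunk in split_into_chunks(markdown, **splitter_options)
    ]
```

Because every chunk carries the WordPress ID in its metadata, all chunks belonging to one page or post can later be found and replaced together, which is what the upsert workflow relies on.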

This setup ensures that content embeddings are stored in Supabase for use in downstream applications, while workflow execution details are logged for consistency and version tracking.

This workflow should be executed only once for the initial embedding.
Workflow 2, described below, will handle all future upserts, ensuring that new or updated content is embedded as needed.

Workflow 2: Handle document upserts

Content on a website follows a lifecycle: it may be updated, new content may be added, or, at times, content may be deleted.

In this first version of the template, the upsert workflow manages:

  • Newly added content
  • Updated content

Step 1: Retrieve WordPress Content with Regular CRON

Nodes:

  • CRON - Every 30 Seconds
  • Postgres - Get Last Workflow Execution
  • WordPress - Get Posts Modified After Last Workflow Execution
  • WordPress - Get Pages Modified After Last Workflow Execution
  • Merge Retrieved WordPress Posts and Pages

A CRON job (set to run every 30 seconds in this template, but you can adjust it as needed) initiates the workflow. A Postgres SQL query on the n8n_website_embedding_histories table retrieves the timestamp of the latest workflow execution.

Next, the HTTP nodes use the WordPress API (update the example URL in the template with your own website's URL and add your WordPress credentials) to request all posts and pages modified after the last workflow execution date. This process captures both newly added and recently updated content. The retrieved content is then merged for further processing.
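As an illustration of the request the HTTP nodes send, the sketch below builds a WordPress REST API URL using the modified_after query parameter. The helper name, the per_page value, and the choice to send the timestamp in UTC are assumptions; the template's own nodes may format the date differently.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def build_modified_after_url(base_url: str, content_type: str, last_execution: datetime) -> str:
    """Build a URL listing posts or pages modified after last_execution."""
    if content_type not in ("posts", "pages"):
        raise ValueError("content_type must be 'posts' or 'pages'")
    # Assumption: a naive timestamp from the history table is already in UTC.
    if last_execution.tzinfo is None:
        last_execution = last_execution.replace(tzinfo=timezone.utc)
    query = urlencode({
        "modified_after": last_execution.astimezone(timezone.utc).isoformat(timespec="seconds"),
        "per_page": 100,  # assumed page size
    })
    return f"{base_url.rstrip('/')}/wp-json/wp/v2/{content_type}?{query}"


print(build_modified_after_url(
    "https://example.com",  # placeholder site URL
    "posts",
    datetime(2024, 5, 1, 12, 0, 0, tzinfo=timezone.utc),
))
```

Sending an explicit UTC offset is one way to reduce the risk behind the timezone prerequisite, since the comparison no longer depends on each system's local time.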

Step 2 : Set fields, use filter

Nodes :

  • Set fields2
  • Filter - Only published and unprotected content

Same as Step 2 in Workflow 1, except that the HTML to Markdown conversion happens in a later step (Step 4).

Step 3: Loop Over Items to Identify and Route Updated vs. Newly Added Content

Here, I initially aimed to use 'update documents' instead of the delete + insert approach, but encountered challenges, especially with updating both content and metadata columns together. Any help or suggestions are welcome! :)

Nodes:

  • Loop Over Items

  • Postgres - Filter on Existing Documents

  • Switch

    • Route existing_documents (if documents with matching IDs are found in metadata):

      • Supabase - Delete Row if Document Exists: Removes any existing entry for the document, preparing for an update.
      • Aggregate2: Aggregates the Supabase documents that share the same ID so that Set Fields3 runs only once per WordPress item, avoiding duplicate execution.
      • Set Fields3: Sets fields required for embedding updates.
    • Route new_documents (if no matching documents are found with IDs in metadata):

      • Set Fields4: Configures fields for embedding newly added content.

In this step, a loop processes each item, directing it based on whether the document already exists. The Aggregate2 node acts as a control to ensure Set Fields3 runs only once per WordPress content, effectively avoiding duplicate execution and optimizing the update process.
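The routing logic can be sketched as follows. The function and the SQL string are hypothetical: they assume the WordPress ID is stored under an id key in the metadata column, and the placeholder syntax is that of a psycopg-style driver rather than the template's Supabase node.

```python
# Assumed delete statement for the "delete + insert" approach: removes every
# stored chunk of one WordPress item before it is embedded again.
DELETE_EXISTING_SQL = (
    "DELETE FROM documents WHERE (metadata->>'id')::bigint = %(wordpress_id)s;"
)


def route_items(items: list, existing_ids: set) -> dict:
    """Split WordPress items into updated vs. newly added content.

    existing_ids holds the WordPress IDs already present in the metadata of
    stored documents (the result of the "Filter on Existing Documents" query).
    """
    routes = {"existing_documents": [], "new_documents": []}
    for item in items:
        key = "existing_documents" if item["id"] in existing_ids else "new_documents"
        routes[key].append(item)
    return routes


routes = route_items([{"id": 1}, {"id": 2}, {"id": 3}], existing_ids={2})
print(routes)
```

Since one page or post usually produces several chunks, the delete step touches several rows per item; collapsing them back to one item per WordPress ID is the role the Aggregate2 node plays above.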

Step 4 : HTML to Markdown, Supabase Vector Store, Update Workflow Execution Table

The HTML to Markdown node mirrors Workflow 1 - Step 2. Refer to that section for a detailed explanation on how HTML content is converted to Markdown for improved embedding performance and relevance.

Following this, the content is stored in the Supabase vector store to manage embeddings efficiently. Lastly, the workflow execution table is updated. These nodes mirror the Workflow 1 - Step 3 nodes.

Workflow 3 : An example of a GenAI App with WordPress Content : Chatbot to embed on your website

Step 1: Retrieve Supabase Documents, Aggregate, and Set Fields After a Chat Input

Nodes:

  • When Chat Message Received
  • Supabase - Retrieve Documents from Chat Input
  • Embeddings OpenAI1
  • Aggregate Documents
  • Set Fields

When a user sends a message to the chat, the prompt (user question) is sent to the Supabase vector store retriever. The RPC function match_documents (created in Workflow 1 - Step 0) retrieves documents relevant to the user's question, enabling a more accurate and relevant response.

In this step:

  1. The Supabase vector store retriever fetches documents that match the user's question, including metadata.
  2. The Aggregate Documents node consolidates the retrieved data.
  3. Finally, Set Fields organizes the data to create a more readable input for the AI agent.

Directly using the AI agent without these nodes would prevent metadata from being sent to the language model (LLM), but metadata is essential for enhancing the context and accuracy of the AI's response. By including metadata, the AI's answers can reference relevant document details, making the interaction more informative.
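A minimal sketch of what the Aggregate Documents and Set Fields nodes accomplish: the retrieved chunks and their metadata are merged into one readable text block for the agent. The metadata keys (title, url, modification_date) and the layout are assumptions for illustration, not the template's exact field names.

```python
def format_context(documents: list) -> str:
    """Merge retrieved chunks and their metadata into one readable block."""
    blocks = []
    for index, doc in enumerate(documents, start=1):
        meta = doc.get("metadata", {})
        blocks.append(
            f"Document {index}\n"
            f"Title: {meta.get('title', 'unknown')}\n"
            f"URL: {meta.get('url', 'unknown')}\n"
            f"Last modified: {meta.get('modification_date', 'unknown')}\n"
            f"Content: {doc.get('content', '')}"
        )
    return "\n\n".join(blocks)


retrieved = [
    {
        "content": "Our store ships within 48 hours.",  # made-up example chunk
        "metadata": {"title": "Shipping", "url": "https://example.com/shipping"},
    },
]
print(format_context(retrieved))
```

With the URL and title placed next to each chunk, the model can cite the page an answer came from instead of returning unattributed text.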

Step 2: Call AI Agent, Respond to User, and Store Chat Conversation History

Nodes:

  • AI Agent
    • Sub-nodes:
      • OpenAI Chat Model
      • Postgres Chat Memories
  • Respond to Webhook

This step involves calling the AI agent to generate an answer, responding to the user, and storing the conversation history. The model used is gpt-4o-mini, chosen for its cost-efficiency.
