Platform Evolution with LLM Integration
As large language models (LLMs), particularly open-source models, become increasingly capable and accessible, our workflow platform is evolving to incorporate LLM-driven intelligence. The goal is to enable workflows that can reason, automate tasks, and interact with systems more dynamically.
The first focus area is evaluating and integrating open-source LLMs. This includes researching different model families, testing their capabilities, and identifying which models best support reasoning, orchestration, and automation within our platform.
The second focus area is standardizing system access through the MCP protocol. Existing connectors that interact with external systems and data stores are being adapted to MCP and exposed as Tools. By limiting LLM interactions to these MCP-based tools, we create a controlled interface between the model and enterprise systems.
This architecture acts as a guardrail: the LLM cannot directly access systems or perform arbitrary actions. Instead, it operates only through approved tools with defined capabilities, ensuring predictable behavior, security, and governance while still enabling powerful AI-driven workflows.
Workflows with Open-Source LLMs (AI Agents)
By embedding an open-source LLM directly into the platform, workflows will be able to leverage the model’s reasoning capabilities to interpret tasks, make decisions, and orchestrate workflow steps dynamically.
Let's look at an example of building a workflow with and without an LLM on the platform.
The example workflow is a Refund Request. In simple terms, it includes the following steps and stages:
→ Verify Order
→ Check Refund Policy
→ Approve or Reject
→ Send Notification (Email, Instant Message, SMS...)
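To make the comparison concrete, here is a minimal sketch (in Python, with hypothetical function names, field names, and policy rules) of what this workflow looks like when hard-coded without an LLM. Every step and its ordering is fixed in code:

```python
# Hypothetical static refund workflow: every step and its order is fixed in code.

def verify_order(order):
    # Assume an order dict with "id" and "status" fields.
    return order.get("status") == "delivered"

def check_refund_policy(order):
    # Example policy (illustrative): refunds allowed within 30 days of delivery.
    return order.get("days_since_delivery", 999) <= 30

def send_notification(order, message):
    # Placeholder for the Email/Instant Message/SMS connector.
    print(f"Notify order {order['id']}: {message}")

def process_refund_request(order):
    # Approve-or-reject logic wired together by engineers, not by a model.
    if not verify_order(order):
        return "rejected: order not verified"
    if not check_refund_policy(order):
        return "rejected: outside refund window"
    send_notification(order, "Your refund was approved.")
    return "approved"
```

Calling `process_refund_request({"id": "A1", "status": "delivered", "days_since_delivery": 10})` returns `"approved"`. Inserting a new step between Verify Order and Check Refund Policy means editing and re-testing this code.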
Workflow without LLMs (Agent)
1. Business defines requirements and provides domain knowledge
2. Engineers analyze the process and design the workflow logic
3. AutomateFlow Framework executes the workflow
In #2, engineers write code and configure the workflow per the requirements from #1.
Even if the generic Refund Request workflow covers 80% of the use cases, 20% still remain.
E.g., adding another step between Verify Order and Check Refund Policy requires development and testing effort.
Workflow with LLMs (Agent)
1. Business defines requirements, provides domain knowledge, and supplies tools (Connectors to be converted to MCP Tools)
2. Feed the above requirements to the LLM
3. AutomateFlow Framework executes the workflow
In #2, the LLM may dynamically decide to:
→ Retrieve Order Info
→ Check Refund Policy
→ Determine Eligibility
→ Call Refund API
→ Send Notification (Email, Instant Message, SMS...)
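The steps above can be sketched as an agent loop in which the model may only invoke pre-approved tools. This is a minimal illustration, not a real LLM integration: the tool names and stubbed implementations are hypothetical, and a real planner would be the LLM itself.

```python
# Guardrail sketch: the LLM can only act through an allowlist of tools.
# Tool implementations are stubs standing in for real Connectors.
APPROVED_TOOLS = {
    "retrieve_order_info": lambda args: {"id": args["order_id"], "status": "delivered"},
    "check_refund_policy": lambda args: {"eligible": True},
    "determine_eligibility": lambda args: {"approved": True},
    "call_refund_api": lambda args: {"refunded": True},
    "send_notification": lambda args: {"sent": True},
}

def execute_tool_call(name, args):
    # Reject anything the model requests outside the allowlist,
    # no matter what the prompt or model output says.
    if name not in APPROVED_TOOLS:
        raise PermissionError(f"Tool {name!r} is not approved")
    return APPROVED_TOOLS[name](args)
```

The LLM proposes tool calls; the framework executes them only if they pass this check, which is what keeps a dynamic workflow inside predictable boundaries.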
Constrain LLM Autonomy: workflows with LLMs are no longer static
Engineers need to put guardrails around these LLMs. In this case, we enforce that the LLM can only act through the pre-defined Tools (Retrieve Order Info, Check Refund Policy, Determine Eligibility, and so on) and nothing else.
These Tools are implemented as Connectors (converted to the MCP standard protocol, which we will talk about later) and fed to the LLM along with the requirements and domain knowledge.
Take the existing Send Notification Connector, which takes a message and sends an SMS using the Twilio API. This Connector now needs a wrapper conforming to the MCP standard protocol so LLMs can interact with it as a tool to send notifications.
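A sketch of such a wrapper, assuming a hypothetical `send_sms` connector function. The descriptor shape (name, description, JSON Schema input) follows how MCP describes tools, but this is a hand-rolled illustration, not the official MCP SDK:

```python
# Existing connector (hypothetical): takes a message, sends an SMS.
def send_sms(phone: str, message: str) -> dict:
    # A real implementation would call the Twilio REST API here.
    return {"to": phone, "body": message, "status": "queued"}

# MCP-style tool descriptor wrapping the connector. MCP tools declare a
# name, a description, and a JSON Schema for their input so an LLM can
# discover and invoke them through a uniform interface.
SEND_NOTIFICATION_TOOL = {
    "name": "send_notification",
    "description": "Send an SMS notification to a customer.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "phone": {"type": "string"},
            "message": {"type": "string"},
        },
        "required": ["phone", "message"],
    },
}

def handle_tool_call(name, arguments):
    # Dispatch a tool invocation to the underlying connector.
    if name == SEND_NOTIFICATION_TOOL["name"]:
        return send_sms(arguments["phone"], arguments["message"])
    raise ValueError(f"Unknown tool: {name}")
```

The LLM only ever sees the descriptor and the tool name; the Twilio details stay behind the wrapper.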
Put guardrails on the LLM via a set of tools → put guardrails on the workflow → deterministic behavior and no surprises!
In #2, engineers write LLM prompts based on the business requirements and domain knowledge, re-using existing tools and developing new tools and guardrails.
MCP
MCP (Model Context Protocol) – an open standard introduced by Anthropic that lets LLMs interact with external tools and data through a unified interface. It standardizes Tool discovery, invocation, and data access for LLMs (AI agents).
Streamable HTTP – The recommended MCP transport protocol that allows LLMs (AI Agents) to communicate with MCP servers using standard HTTP requests with optional streaming responses.
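To illustrate, here is what a tool invocation looks like on the wire. MCP messages use the JSON-RPC 2.0 envelope; with Streamable HTTP, the client POSTs them to the server's MCP endpoint and accepts either a plain JSON response or a Server-Sent Events stream. The endpoint path, tool name, and arguments below are illustrative:

```python
import json

# Example MCP "tools/call" request body as it would be POSTed to an MCP
# server over Streamable HTTP. The JSON-RPC 2.0 envelope and the
# "tools/call" method come from the MCP specification; the tool name and
# arguments are hypothetical.
request_body = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "send_notification",
        "arguments": {"phone": "+15550100", "message": "Refund approved"},
    },
}

# The client accepts both content types: the server may reply with a single
# JSON response or stream incremental results as Server-Sent Events.
headers = {
    "Content-Type": "application/json",
    "Accept": "application/json, text/event-stream",
}
payload = json.dumps(request_body)
```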
All Connectors need to be (and will be) built as Tools supporting Streamable HTTP so LLMs (AI Agents) can interact with them seamlessly!
Open Source LLMs
The following LLMs will be evaluated:
Llama (1B, 3B, 8B)
Qwen (0.6B, 1.8B, 4B, 7B)
DeepSeek (1.3B, 7B)
GPT-OSS (20B)
April 2025:
Tested DeepSeek: the 1.3B is really fast but not very stable - it does a great job on some tasks but poorly on others. The 7B is much better, but its processing speed is much, much slower ;-(
July 2025:
Qwen3 came out in late April 2025 and does better than DeepSeek on general-purpose tasks. But it does not support structured output.
August 2025:
GPT-OSS-20B was released to the public, and it is overall the best model. It is much larger than the other models, so it requires a significant amount of memory (32 GB vs. 4-12 GB).
Although Llama 3.2 came out in late 2024, it seems to be a good trade-off between GPT-OSS on one side and Qwen3 and DeepSeek on the other.
Llama, Qwen, and DeepSeek models were tested locally on a MacBook using Ollama.