Files
LangBot/src/langbot/pkg/platform/sources/websocket_adapter.py
fdc310 564d829e25 Feat/webpage adapter (#2135)
* feat: add web_page_bot adapter and embed widget

- Implemented a new `web_page_bot` adapter for embedding chat widgets on websites.
- Created a new YAML configuration file for `web_page_bot` with necessary metadata and execution details.
- Developed the `WebPageBotAdapter` class to handle message events and manage listeners.
- Added a JavaScript widget for embedding the chat interface, including styles and functionality for user interaction.
- Updated WebSocket handling to support the new bot adapter and manage connections.
- Enhanced the bot form to include pipeline UUID and adapter configuration in the system context.
- Introduced a new dynamic form item type for embed code in the form entity.

* feat(embed): add feedback submission and image upload functionality to embed widget

* feat(embed): add reset session endpoint for embed widget and improve WebSocket image handling

* feat(widget): remove typing indicator display logic from message handling

* fix(embed): security hardening for embed widget

- Add UUID format validation for pipeline_uuid parameters
- Add Cloudflare Turnstile integration for bot protection (optional)
- Add HMAC-signed session tokens for /messages, /reset, /feedback endpoints
- Sanitize error responses (remove internal exception details)
- Sanitize base_url before JS injection
- Fix XSS in markdown link rendering (only allow http/https protocols)
- Fix XSS in image URL extraction (only allow http/https/data protocols)
- Escape widget title in embed code snippet (HTML entity encoding)
- Remove class-level mutable default in WebPageBotAdapter
- Remove duplicate config line and console.log in widget.js
- Add turnstile_site_key and turnstile_secret_key config fields

* style: fix prettier formatting for chained replace calls

* fix(embed): declare listeners as Pydantic field in WebPageBotAdapter

The base class is a Pydantic BaseModel, so listeners must be declared
as a field (with default_factory) rather than assigned in __init__.
Also keep the __init__ to convert positional args to keyword args for
Pydantic compatibility with botmgr's calling convention.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(embed): use bot_uuid instead of pipeline_uuid in all embed URLs

Replace pipeline_uuid with bot_uuid in all user-facing embed widget
URLs so internal pipeline identifiers are never exposed. The server
resolves bot_uuid to the owning web_page_bot, validates it is enabled
and has a pipeline bound, then routes internally using pipeline_uuid.

Add a dedicated WebSocket endpoint at /api/v1/embed/<bot_uuid>/ws/connect
instead of reusing the pipeline debug path. Wire WebPageBotAdapter to
proxy reply_message calls through the WebSocket adapter so dashboard
shows the correct adapter name while replies are still delivered.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs(embed): improve Turnstile config field descriptions

Add guidance on where to obtain the keys (Cloudflare dashboard) and
clarify that leaving them empty disables the feature.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(embed): add multi-language support for embed widget

Add a language selector to the web_page_bot config with 8 locales
(en, zh-Hans, zh-Hant, ja, es, ru, th, vi). The backend injects the
locale into widget.js which uses a built-in i18n dictionary for all
user-facing strings (welcome message, placeholder, aria labels, error
messages, powered-by footer).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(embed): use correct select option format for language selector

Options must use name/label (i18n object) format, not value/label
(plain string), to match the dynamic form renderer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style(embed): adjust footer padding and link to langbot.app

Increase footer padding for more breathing room from the bottom edge.
Change powered-by link from GitHub repo to langbot.app.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(embed): ignore Enter key during IME composition

Check e.isComposing before treating Enter as send, so confirming
an IME candidate (e.g. Chinese/Japanese input) does not also fire
the message.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(embed): center bubble icon and fill entire circle

Make .lb-chat-icon span fill the full bubble area so the logo image
covers the circle completely without exposing the blue background.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(embed): add bubble icon presets selector

Add 6 bubble icon options (LangBot logo, chat bubble, robot, headset,
sparkle, message) configurable in the bot settings. Icons are inline
SVGs in widget.js, selected via a config field injected by the backend.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: RockChinQ <rockchinq@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-24 15:36:14 +08:00

482 lines
19 KiB
Python
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
"""WebSocket适配器 - 支持双向通信的IM系统"""
import asyncio
import logging
import typing
from datetime import datetime
import pydantic
import langbot_plugin.api.definition.abstract.platform.adapter as abstract_platform_adapter
import langbot_plugin.api.entities.builtin.platform.message as platform_message
import langbot_plugin.api.entities.builtin.platform.events as platform_events
import langbot_plugin.api.entities.builtin.platform.entities as platform_entities
import langbot_plugin.api.definition.abstract.platform.event_logger as abstract_platform_logger
from ...core import app
from .websocket_manager import ws_connection_manager, WebSocketConnection
logger = logging.getLogger(__name__)
class WebSocketMessage(pydantic.BaseModel):
"""WebSocket消息格式"""
id: int
role: str # 'user' or 'assistant'
content: str
message_chain: list[dict]
timestamp: str
is_final: bool = False
connection_id: str = ''
"""发送者连接ID"""
class WebSocketSession:
"""WebSocket会话 - 管理单个会话的消息历史"""
id: str
message_lists: dict[str, list[WebSocketMessage]] = {}
"""消息列表 {pipeline_uuid: [messages]}"""
stream_message_indexes: dict[str, dict[str, int]] = {}
"""流式消息索引 {pipeline_uuid: {resp_message_id: message_index}}"""
def __init__(self, id: str):
self.id = id
self.message_lists = {}
self.stream_message_indexes = {}
def get_message_list(self, pipeline_uuid: str) -> list[WebSocketMessage]:
if pipeline_uuid not in self.message_lists:
self.message_lists[pipeline_uuid] = []
return self.message_lists[pipeline_uuid]
def get_stream_message_indexes(self, pipeline_uuid: str) -> dict[str, int]:
if pipeline_uuid not in self.stream_message_indexes:
self.stream_message_indexes[pipeline_uuid] = {}
return self.stream_message_indexes[pipeline_uuid]
class WebSocketAdapter(abstract_platform_adapter.AbstractMessagePlatformAdapter):
"""WebSocket适配器 - 支持双向实时通信"""
websocket_person_session: WebSocketSession = pydantic.Field(exclude=True, default_factory=WebSocketSession)
websocket_group_session: WebSocketSession = pydantic.Field(exclude=True, default_factory=WebSocketSession)
listeners: dict[
typing.Type[platform_events.Event],
typing.Callable[[platform_events.Event, abstract_platform_adapter.AbstractMessagePlatformAdapter], None],
] = pydantic.Field(default_factory=dict, exclude=True)
ap: app.Application = pydantic.Field(exclude=True)
# 主动推送消息的队列
outbound_message_queue: asyncio.Queue = pydantic.Field(default_factory=asyncio.Queue, exclude=True)
"""后端主动推送消息的队列"""
# 流式输出开关
stream_enabled: bool = pydantic.Field(default=True, exclude=True)
"""是否启用流式输出"""
def __init__(self, config: dict, logger: abstract_platform_logger.AbstractEventLogger, **kwargs):
super().__init__(
config=config,
logger=logger,
**kwargs,
)
self.websocket_person_session = WebSocketSession(id='websocketperson')
self.websocket_group_session = WebSocketSession(id='websocketgroup')
self.bot_account_id = 'websocketbot'
self.outbound_message_queue = asyncio.Queue()
self.stream_enabled = True
async def send_message(
self,
target_type: str,
target_id: str,
message: platform_message.MessageChain,
) -> dict:
"""发送消息 - 这里用于主动推送消息到前端
对于 WebSocket 适配器,我们需要将消息广播到正确的 pipeline 连接。
target_id 可能是 launcher_id如 websocket_xxx或 pipeline_uuid。
我们需要尝试两种方式来确保消息能够送达。
"""
# 获取当前的 pipeline_uuid
pipeline_uuid = self.ap.platform_mgr.websocket_proxy_bot.bot_entity.use_pipeline_uuid
session_type = 'group' if target_type == 'group' else 'person'
# 选择会话
session = self.websocket_group_session if session_type == 'group' else self.websocket_person_session
# 生成唯一消息ID
msg_id = len(session.get_message_list(pipeline_uuid)) + 1
message_data = WebSocketMessage(
id=msg_id,
role='assistant',
content=str(message),
message_chain=[component.__dict__ for component in message],
timestamp=datetime.now().isoformat(),
is_final=True,
)
# 保存到历史记录
session.get_message_list(pipeline_uuid).append(message_data)
# 直接广播到当前pipeline的连接
await ws_connection_manager.broadcast_to_pipeline(
pipeline_uuid,
{
'type': 'response',
'session_type': session_type,
'data': message_data.model_dump(),
},
session_type=session_type,
)
return message_data.model_dump()
async def reply_message(
self,
message_source: platform_events.MessageEvent,
message: platform_message.MessageChain,
quote_origin: bool = False,
) -> dict:
"""回复消息 - 非流式"""
# 获取会话和pipeline信息
session = (
self.websocket_group_session
if isinstance(message_source, platform_events.GroupMessage)
else self.websocket_person_session
)
# 从message_source获取pipeline_uuid和connection_id
pipeline_uuid = self.ap.platform_mgr.websocket_proxy_bot.bot_entity.use_pipeline_uuid
session_type = 'group' if isinstance(message_source, platform_events.GroupMessage) else 'person'
# 生成新的消息ID
msg_id = len(session.get_message_list(pipeline_uuid)) + 1
message_data = WebSocketMessage(
id=msg_id,
role='assistant',
content=str(message),
message_chain=[component.__dict__ for component in message],
timestamp=datetime.now().isoformat(),
is_final=True,
)
# 保存到历史记录
session.get_message_list(pipeline_uuid).append(message_data)
# 直接广播到所有该pipeline的连接包含session_type信息
await ws_connection_manager.broadcast_to_pipeline(
pipeline_uuid,
{
'type': 'response',
'session_type': session_type,
'data': message_data.model_dump(),
},
session_type=session_type,
)
return message_data.model_dump()
async def reply_message_chunk(
self,
message_source: platform_events.MessageEvent,
bot_message,
message: platform_message.MessageChain,
quote_origin: bool = False,
is_final: bool = False,
) -> dict:
"""回复消息块 - 流式"""
# 获取会话和pipeline信息
session = (
self.websocket_group_session
if isinstance(message_source, platform_events.GroupMessage)
else self.websocket_person_session
)
pipeline_uuid = self.ap.platform_mgr.websocket_proxy_bot.bot_entity.use_pipeline_uuid
session_type = 'group' if isinstance(message_source, platform_events.GroupMessage) else 'person'
message_list = session.get_message_list(pipeline_uuid)
stream_message_indexes = session.get_stream_message_indexes(pipeline_uuid)
# Streaming messages in LangBot have a stable resp_message_id during the same assistant reply.
# Use it as the primary key to avoid overwriting an old card from a previous reply.
resp_message_id = str(getattr(bot_message, 'resp_message_id', '') or '')
existing_index = stream_message_indexes.get(resp_message_id) if resp_message_id else None
message_is_final = is_final and bot_message.tool_calls is None
if existing_index is None or existing_index >= len(message_list):
# 创建新消息
msg_id = len(message_list) + 1
message_data = WebSocketMessage(
id=msg_id,
role='assistant',
content=str(message),
message_chain=[component.__dict__ for component in message],
timestamp=datetime.now().isoformat(),
is_final=message_is_final,
)
# 立即添加到历史记录即使is_final=False以便后续块可以更新它
message_list.append(message_data)
if resp_message_id:
stream_message_indexes[resp_message_id] = len(message_list) - 1
else:
# 更新同一条流式消息
old_message = message_list[existing_index]
msg_id = old_message.id
message_data = WebSocketMessage(
id=msg_id,
role='assistant',
content=str(message),
message_chain=[component.__dict__ for component in message],
timestamp=old_message.timestamp, # 保持原始时间戳
is_final=message_is_final,
)
# 更新历史记录中的对应消息
message_list[existing_index] = message_data
if message_is_final and resp_message_id:
stream_message_indexes.pop(resp_message_id, None)
# 直接广播到所有该pipeline的连接包含session_type信息
await ws_connection_manager.broadcast_to_pipeline(
pipeline_uuid,
{
'type': 'response',
'session_type': session_type,
'data': message_data.model_dump(),
},
session_type=session_type,
)
return message_data.model_dump()
async def is_stream_output_supported(self) -> bool:
"""根据stream_enabled标志返回是否支持流式输出"""
return self.stream_enabled
def register_listener(
self,
event_type: typing.Type[platform_events.Event],
func: typing.Callable[
[platform_events.Event, abstract_platform_adapter.AbstractMessagePlatformAdapter], typing.Awaitable[None]
],
):
"""注册事件监听器"""
self.listeners[event_type] = func
def unregister_listener(
self,
event_type: typing.Type[platform_events.Event],
func: typing.Callable[
[platform_events.Event, abstract_platform_adapter.AbstractMessagePlatformAdapter], typing.Awaitable[None]
],
):
"""取消注册事件监听器"""
del self.listeners[event_type]
async def is_muted(self, group_id: int) -> bool:
return False
async def run_async(self):
"""运行适配器"""
try:
while True:
# 处理主动推送消息
if not self.outbound_message_queue.empty():
try:
message = await asyncio.wait_for(self.outbound_message_queue.get(), timeout=0.1)
# 广播到所有相关连接
target_id = message.get('target_id', '')
await ws_connection_manager.broadcast_to_pipeline(target_id, message)
except asyncio.TimeoutError:
pass
await asyncio.sleep(0.1)
except asyncio.CancelledError:
raise
async def kill(self):
"""停止适配器"""
pass
async def _process_image_components(self, message_chain_obj: list):
"""
处理消息链中的图片和文件组件将path转换为base64
Args:
message_chain_obj: 消息链对象列表
"""
import base64
storage_mgr = self.ap.storage_mgr
for component in message_chain_obj:
comp_type = component.get('type', '')
comp_path = component.get('path', '')
if not comp_path:
continue
if comp_type == 'Image':
try:
file_content = await storage_mgr.storage_provider.load(comp_path)
base64_str = base64.b64encode(file_content).decode('utf-8')
file_key = comp_path
if file_key.lower().endswith(('.jpg', '.jpeg')):
mime_type = 'image/jpeg'
elif file_key.lower().endswith('.png'):
mime_type = 'image/png'
elif file_key.lower().endswith('.gif'):
mime_type = 'image/gif'
elif file_key.lower().endswith('.webp'):
mime_type = 'image/webp'
else:
mime_type = 'image/png'
component['base64'] = f'data:{mime_type};base64,{base64_str}'
await storage_mgr.storage_provider.delete(comp_path)
component['path'] = ''
except Exception as e:
await self.logger.error(f'Failed to load image file {comp_path}: {e}')
async def handle_websocket_message(
self,
connection: WebSocketConnection,
message_data: dict,
owner_bot=None,
):
"""
处理从WebSocket接收的消息
这个方法只负责接收消息、保存到历史记录、并触发事件处理
不等待任何响应响应消息会通过reply_message/reply_message_chunk直接发送
Args:
connection: WebSocket连接对象
message_data: 消息数据,包含:
- message: 消息链
- stream: 是否启用流式输出 (可选默认True)
owner_bot: Optional RuntimeBot that owns this pipeline (e.g. a web_page_bot).
When provided, its identity is used for logging and session tracking.
"""
pipeline_uuid = connection.pipeline_uuid
session_type = connection.session_type
# 获取stream参数默认为True
self.stream_enabled = message_data.get('stream', True)
# 选择会话
use_session = self.websocket_group_session if session_type == 'group' else self.websocket_person_session
# 解析消息链
message_chain_obj = message_data.get('message', [])
# 处理图片组件将path转换为base64
await self._process_image_components(message_chain_obj)
message_chain = platform_message.MessageChain.model_validate(message_chain_obj)
# 生成消息ID
message_id = len(use_session.get_message_list(pipeline_uuid)) + 1
# 保存用户消息
user_message = WebSocketMessage(
id=message_id,
role='user',
content=str(message_chain),
message_chain=message_chain_obj,
timestamp=datetime.now().isoformat(),
connection_id=connection.connection_id,
is_final=True, # 用户消息始终是完整的,非流式
)
use_session.get_message_list(pipeline_uuid).append(user_message)
# 广播用户消息到所有连接包括发送者包含session_type信息
await ws_connection_manager.broadcast_to_pipeline(
pipeline_uuid,
{
'type': 'user_message',
'session_type': session_type,
'data': user_message.model_dump(),
},
session_type=session_type,
)
# 添加消息源
message_chain.insert(0, platform_message.Source(id=message_id, time=datetime.now().timestamp()))
# 创建事件
if session_type == 'person':
sender = platform_entities.Friend(
id=f'websocket_{connection.connection_id}', nickname='User', remark='User'
)
event = platform_events.FriendMessage(
sender=sender, message_chain=message_chain, time=datetime.now().timestamp()
)
else:
group = platform_entities.Group(
id='websocketgroup', name='Group', permission=platform_entities.Permission.Member
)
sender = platform_entities.GroupMember(
id=f'websocket_{connection.connection_id}',
member_name='User',
group=group,
permission=platform_entities.Permission.Member,
)
event = platform_events.GroupMessage(
sender=sender, message_chain=message_chain, time=datetime.now().timestamp()
)
# 设置流水线UUID (proxy bot always needs it for reply_message routing)
self.ap.platform_mgr.websocket_proxy_bot.bot_entity.use_pipeline_uuid = pipeline_uuid
if owner_bot is not None:
owner_bot.bot_entity.use_pipeline_uuid = pipeline_uuid
# 异步触发事件处理
# Use owner_bot's listeners if available, otherwise fall back to proxy bot
listeners = (
owner_bot.adapter.listeners
if (owner_bot and hasattr(owner_bot.adapter, 'listeners') and owner_bot.adapter.listeners)
else self.listeners
)
# Pass owner_bot's adapter so that downstream logging / dashboard
# attributes the message to the correct bot adapter name.
# Wire the ws adapter into the owner so replies are actually delivered.
if owner_bot and hasattr(owner_bot.adapter, 'set_ws_adapter'):
owner_bot.adapter.set_ws_adapter(self)
callback_adapter = owner_bot.adapter if (owner_bot and hasattr(owner_bot, 'adapter')) else self
if event.__class__ in listeners:
asyncio.create_task(listeners[event.__class__](event, callback_adapter))
def get_websocket_messages(self, pipeline_uuid: str, session_type: str) -> list[dict]:
"""获取消息历史"""
if session_type == 'person':
return [message.model_dump() for message in self.websocket_person_session.get_message_list(pipeline_uuid)]
else:
return [message.model_dump() for message in self.websocket_group_session.get_message_list(pipeline_uuid)]
def reset_session(self, pipeline_uuid: str, session_type: str):
"""重置会话"""
if session_type == 'person':
if pipeline_uuid in self.websocket_person_session.message_lists:
self.websocket_person_session.message_lists[pipeline_uuid] = []
if pipeline_uuid in self.websocket_person_session.stream_message_indexes:
self.websocket_person_session.stream_message_indexes[pipeline_uuid] = {}
else:
if pipeline_uuid in self.websocket_group_session.message_lists:
self.websocket_group_session.message_lists[pipeline_uuid] = []
if pipeline_uuid in self.websocket_group_session.stream_message_indexes:
self.websocket_group_session.stream_message_indexes[pipeline_uuid] = {}