⚡ Fetcher: Quick Scraping

Common entry points

- `Fetcher.get(url)`: single request
- `FetcherSession(...)`: reuse the session, cookies, and impersonation settings across requests
- `impersonate='chrome'`: mimic a browser's request headers
- `stealthy_headers=True`: add headers that look more like a real browser

```python
from scrapling.fetchers import Fetcher, FetcherSession

page = Fetcher.get("https://quotes.toscrape.com/")
quotes = page.css(".quote .text::text").getall()

with FetcherSession(impersonate="chrome") as session:
    page = session.get(
        "https://quotes.toscrape.com/",
        stealthy_headers=True,
    )
```

🥷 StealthyFetcher: Anti-Bot Bypass

Bring in `StealthyFetcher` only when the target site has Cloudflare, browser fingerprinting, or strict validation; don't reach for the heaviest option first.

When to use it

- The page requires a real browser environment
- The site relies on JS challenges or human verification
- You want to visit several pages within one session

Common parameters

- `headless=True`: headless browser mode
- `network_idle=True`: wait for the network to go idle before reading the DOM
- `solve_cloudflare=True`: enable Cloudflare handling
- `adaptive=True`: try to re-locate elements after their structure changes (see the sketch after the snippet below)

```python
from scrapling.fetchers import StealthyFetcher, StealthySession

page = StealthyFetcher.fetch(
    "https://example.com",
    headless=True,
    network_idle=True,
)

with StealthySession(headless=True, solve_cloudflare=True) as session:
    page = session.fetch("https://nopecha.com/demo/cloudflare")
```

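The adaptive flag pairs fetching with element re-location. A minimal sketch of that flow; the `auto_save`/`adaptive` selection flags follow Scrapling's adaptive-matching docs, but treat their exact names and placement as assumptions that may differ by version:

```python
from scrapling.fetchers import StealthyFetcher

# First run: remember what the element looks like
# (assumed auto_save flag on the selection call)
page = StealthyFetcher.fetch("https://example.com", headless=True)
item = page.css("#price", auto_save=True)

# Later run, after the site's markup has changed:
# try to re-locate the saved element
page = StealthyFetcher.fetch("https://example.com", headless=True)
item = page.css("#price", adaptive=True)
```
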
🌐 DynamicFetcher: Dynamic Rendering

For frontend-rendered pages that don't necessarily need full anti-bot capabilities, `DynamicFetcher` is usually lighter than `StealthyFetcher`.

- `DynamicFetcher.fetch(url)`: single dynamic load
- `DynamicSession(...)`: fetch several dynamic pages in a row
- `load_dom=False`: skip the full DOM wait in some scenarios to save time

```python
from scrapling.fetchers import DynamicFetcher, DynamicSession

page = DynamicFetcher.fetch("https://quotes.toscrape.com/")

with DynamicSession(headless=True, network_idle=True) as session:
    detail = session.fetch("https://quotes.toscrape.com/", load_dom=False)
```

🔎 Selectors and Node Traversal

Scrapling supports CSS, XPath, and a BeautifulSoup-style API side by side; for quick lookups, focus on how to reach the target node.

Selection methods

- `page.css(".quote")`: CSS selectors
- `page.xpath('//div[@class="quote"]')`: XPath
- `page.find_all("div", {"class": "quote"})`: BeautifulSoup style (see the sketch below)
- `page.find_by_text("quote", tag="div")`: locate by text content (see the sketch below)

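The CSS calls appear in the snippet at the end of this section; here is the BeautifulSoup-style pair from the list, assembled from the calls shown above:

```python
from scrapling.fetchers import Fetcher

page = Fetcher.get("https://quotes.toscrape.com/")

# BeautifulSoup-style lookup by tag name and attributes
quotes = page.find_all("div", {"class": "quote"})

# Locate <div> elements by their text content
tagged = page.find_by_text("quote", tag="div")
```
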
Node relationships

- `node.parent`: parent node
- `node.next_sibling`: adjacent node
- `node.find_similar()`: find nodes with a similar structure
- `node.below_elements()`: get the elements below this one

```python
from scrapling.fetchers import Fetcher

page = Fetcher.get("https://quotes.toscrape.com/")

first_quote = page.css(".quote")[0]
text = first_quote.css(".text::text").get()
author = first_quote.css(".author::text").get()
siblings = first_quote.find_similar()
```

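Continuing from `first_quote` above, the relationship attributes from the list can be chained like this (what each returns depends on the page's structure):

```python
# Walk the tree around the first quote block
author_node = first_quote.css(".author")[0]
container = author_node.parent         # parent node
neighbor = author_node.next_sibling    # adjacent node
below = author_node.below_elements()   # elements lower on the page
```
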
🕷️ Spider: Crawl Workflow

Upgrade from a Fetcher to `Spider` only when you need pagination, concurrency, and result aggregation.

Core objects

- `Spider`: crawler entry point and lifecycle management
- `Request`: issue custom follow-up requests (see the pagination sketch below)
- `Response`: read the page and its context inside callbacks
- `start_urls`: initial crawl list
- `concurrent_requests`: number of concurrent requests

```python
from scrapling.spiders import Spider, Response

class QuotesSpider(Spider):
    name = "quotes"
    start_urls = ["https://quotes.toscrape.com/"]
    concurrent_requests = 10

    async def parse(self, response: Response):
        for quote in response.css(".quote"):
            yield {
                "text": quote.css(".text::text").get(),
                "author": quote.css(".author::text").get(),
            }
```

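`Request` from the list above is the usual way to follow pagination. A minimal sketch under two assumptions: that `Request` takes the target URL as its first argument and re-enters `parse()` by default, and that `::attr(href)` follows the `::text` pseudo-element style used above:

```python
from scrapling.spiders import Request

# Inside QuotesSpider.parse(), after yielding the items:
next_href = response.css("li.next a::attr(href)").get()
if next_href:
    # Follow the next page with a custom follow-up request
    yield Request(f"https://quotes.toscrape.com{next_href}")
```
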
🔀 Multi-Session and Async Concurrency

A single spider can assign different sessions to different requests, keeping performance for ordinary pages while switching high-risk pages to a stronger fetching mode.

Multiple sessions

- `manager.add("fast", FetcherSession(...))`: lightweight session for ordinary pages
- `manager.add("stealth", AsyncStealthySession(...), lazy=True)`: initialize on demand for high-risk pages
- `Request(..., sid="stealth")`: pin a single request to a session (see the sketch below)

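A sketch of how these pieces might fit together inside a spider. Only `manager.add(...)` and the `sid=` request argument come from the list above; the `sessions()` registration hook and its shape are assumptions for illustration:

```python
from scrapling.fetchers import FetcherSession, AsyncStealthySession
from scrapling.spiders import Spider, Request, Response

class MixedSpider(Spider):
    name = "mixed"
    start_urls = ["https://quotes.toscrape.com/"]

    def sessions(self, manager):  # hypothetical registration hook
        manager.add("fast", FetcherSession(impersonate="chrome"))
        manager.add("stealth", AsyncStealthySession(headless=True), lazy=True)

    async def parse(self, response: Response):
        # Route a high-risk follow-up through the stealth session
        yield Request("https://nopecha.com/demo/cloudflare", sid="stealth")
```
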
Async fetching

- `async with AsyncStealthySession(...)`: async browser session
- `asyncio.gather(*tasks)`: run several fetch tasks concurrently

```python
import asyncio

from scrapling.fetchers import AsyncStealthySession

async def main():
    async with AsyncStealthySession(max_pages=2) as session:
        urls = ["https://example.com/1", "https://example.com/2"]
        # Fetch both pages concurrently within the same session
        results = await asyncio.gather(*(session.fetch(url) for url in urls))
        return results

results = asyncio.run(main())
```

💾 Pause/Resume and Proxy Rotation

Interruptions are the bane of long-running jobs. `crawldir` checkpoints and proxy rotation are the stability features in Scrapling most worth setting up first.

Pause and resume

- `Spider(crawldir="./crawl_data").start()`: enable a checkpoint directory (see the sketch below)
- `Ctrl+C`: pause gracefully
- `crawldir`: resumption is attempted automatically on restart

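Putting the list together, reusing `QuotesSpider` from the Spider section above:

```python
# First run crawls and checkpoints into ./crawl_data; Ctrl+C pauses
# gracefully, and re-running the same line resumes from the checkpoint.
QuotesSpider(crawldir="./crawl_data").start()
```
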
Proxy rotation

- `ProxyRotator([...])`: define a proxy pool
- `proxy_rotator=rotator`: rotate proxies globally
- `proxy="http://..."`: override for a single request (see the one-liner below)

```python
from scrapling import ProxyRotator
from scrapling.fetchers import Fetcher

rotator = ProxyRotator([
    "http://proxy1:8080",
    "http://proxy2:8080",
])

page = Fetcher.get("https://example.com", proxy_rotator=rotator)
```

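And the per-request override from the list, continuing the snippet above (the proxy URL is a placeholder):

```python
# Bypass the rotator for this one request
page = Fetcher.get("https://example.com", proxy="http://proxy3:8080")
```
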
🧪 Streaming Results and the CLI

During development, it's usually most efficient to validate pages quickly with the CLI and the streaming interface, then solidify the logic into a Spider.

Streaming output

- `async for item in QuotesSpider().stream()`: consume results while the crawl is still running (see the sketch below)

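A minimal consumption sketch, reusing `QuotesSpider` from the Spider section:

```python
import asyncio

async def consume():
    # Items arrive while the crawl is still running,
    # instead of after it finishes
    async for item in QuotesSpider().stream():
        print(item["author"], item["text"])

asyncio.run(consume())
```
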
Common CLI commands

- `scrapling shell`: open an interactive scraping environment
- `scrapling extract get <url> output.md`: extract page content directly
- `--css-selector '#product'`: limit extraction to a region
- `scrapling extract stealthy-fetch ... --solve-cloudflare`: fetch in stealth mode
- `scrapling install`: install browser dependencies

```bash
pip install "scrapling[fetchers]"
scrapling install
scrapling shell
scrapling extract get "https://example.com" content.md --css-selector "#main"
```