How do you quickly build your first GeoPipeAgent pipeline?


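Chapter 3: Getting Started with Your First Pipeline

This chapter walks through a complete hands-on example that takes you from nothing to a running GeoPipeAgent pipeline, covering the full loop of AI generation → framework execution.

3.1 Prepare the test data

First create the working directory and the test data:

```bash
mkdir -p my-gis-project/data
mkdir -p my-gis-project/output
cd my-gis-project
```

Create a simple test GeoJSON file, `data/roads.geojson`:

```json
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"name": "Main road", "type": "primary"},
      "geometry": {
        "type": "LineString",
        "coordinates": [[116.3, 39.9], [116.4, 39.9], [116.5, 39.95]]
      }
    },
    {
      "type": "Feature",
      "properties": {"name": "Secondary road", "type": "secondary"},
      "geometry": {
        "type": "LineString",
        "coordinates": [[116.35, 39.85], [116.35, 39.95], [116.45, 39.95]]
      }
    }
  ]
}
```

3.2 Write your first YAML pipeline

In the `my-gis-project/` directory, create the pipeline file `buffer-pipeline.yaml`:

```yaml
pipeline:
  name: "Road buffer analysis"
  description: "Buffer the road data and save the result as GeoJSON"
  variables:
    input_path: "data/roads.geojson"
    buffer_dist: 0.01
    output_path: "output/road_buffer.geojson"
  steps:
    - id: load-roads
      use: io.read_vector
      params:
        path: "${input_path}"
    - id: buffer-roads
      use: vector.buffer
      params:
        input: "$load-roads"
        distance: "${buffer_dist}"
        cap_style: "round"
    - id: save-result
      use: io.write_vector
      params:
        input: "$buffer-roads"
        path: "${output_path}"
        format: "GeoJSON"
  outputs:
    result: "$save-result"
    feature_count: "$buffer-roads.feature_count"
```

Reading the pipeline

| Field | Description |
| --- | --- |
| `pipeline.name` | Pipeline name; appears in the report |
| `pipeline.variables` | Reusable variables, referenced as `${variable_name}` |
| `pipeline.steps` | List of steps, executed in order |
| `id: load-roads` | Unique step ID; later steps reference this step's output as `$load-roads` |
| `use: io.read_vector` | Step type, in the form `category.action` |
| `params` | Step parameters; variable substitution and step references are allowed |
| `$load-roads` | References the output (the `output` field) of the `load-roads` step |
| `$buffer-roads.feature_count` | References the `feature_count` value in the stats of the `buffer-roads` step |
| `outputs` | Declares the pipeline's final outputs; appears in the `outputs` section of the JSON report |

A note on coordinate systems: `roads.geojson` uses WGS84 (EPSG:4326, units in degrees), so a buffer distance of 0.01 means 0.01 degrees, which is roughly 1 km in the north-south direction and somewhat less east-west at this latitude. To work with real distances, first reproject with `vector.reproject` to a projected coordinate system (such as EPSG:3857, whose units are metres) and buffer afterwards.

3.3 Run the pipeline

```bash
geopipe-agent run buffer-pipeline.yaml
```

Output of a successful run (JSON):

```json
{
  "pipeline": "Road buffer analysis",
  "status": "success",
  "duration": 0.312,
  "steps": [
    {
      "id": "load-roads",
      "step": "io.read_vector",
      "status": "success",
      "duration": 0.089,
      "output_summary": {
        "feature_count": 2,
        "crs": "EPSG:4326",
        "geometry_types": ["LineString"],
        "columns": ["name", "type", "geometry"]
      }
    },
    {
      "id": "buffer-roads",
      "step": "vector.buffer",
      "status": "success",
      "duration": 0.124,
      "output_summary": {
        "feature_count": 2,
        "crs": "EPSG:4326",
        "geometry_types": ["Polygon"],
        "total_area": 0.00024578
      }
    },
    {
      "id": "save-result",
      "step": "io.write_vector",
      "status": "success",
      "duration": 0.098,
      "output_summary": {
        "feature_count": 2,
        "output_path": "output/road_buffer.geojson",
        "format": "GeoJSON"
      }
    }
  ],
  "outputs": {
    "result": "output/road_buffer.geojson",
    "feature_count": 2
  }
}
```

3.4 Override variables with --var

The `--var` option overrides pipeline variables from the command line, so the YAML file does not need to change:

```bash
# Use a different buffer distance
geopipe-agent run buffer-pipeline.yaml --var buffer_dist=0.02

# Override several variables at once
geopipe-agent run buffer-pipeline.yaml \
  --var input_path=data/highway.geojson \
  --var buffer_dist=0.05 \
  --var output_path=output/highway_buffer.geojson
```

This is convenient for batch-processing many data files: keep a single YAML file and run it several times with different `--var` values.

3.5 Validate a pipeline (without running it)

Before a real run, use the `validate` command to check the YAML syntax and the step references:

```bash
geopipe-agent validate buffer-pipeline.yaml
```

Example output:

```json
{
  "status": "valid",
  "pipeline": "Road buffer analysis",
  "steps_count": 3,
  "steps": [
    {"id": "load-roads", "use": "io.read_vector"},
    {"id": "buffer-roads", "use": "vector.buffer"},
    {"id": "save-result", "use": "io.write_vector"}
  ]
}
```

If the file contains a syntax error, you will see an error message similar to this:

```json
{
  "status": "invalid",
  "error": "PipelineParseError",
  "message": "Missing 'pipeline' key at the top level. Expected: pipeline:\n name: ...\n steps: ..."
}
```

3.6 A complete example with reprojection

The following, more realistic example reprojects first and buffers afterwards, so that the buffer distance is expressed in metres rather than degrees:

```yaml
pipeline:
  name: "Road 500 m buffer analysis (precise distance)"
  description: "Convert WGS84 to Web Mercator (metres), then apply a 500 m buffer"
  variables:
    input_path: "data/roads.geojson"
    buffer_dist_m: 500
    output_path: "output/road_buffer_500m.geojson"
  steps:
    - id: load-roads
      use: io.read_vector
      params:
        path: "${input_path}"
    - id: reproject-to-mercator
      use: vector.reproject
      params:
        input: "$load-roads"
        target_crs: "EPSG:3857"
    - id: buffer-500m
      use: vector.buffer
      params:
        input: "$reproject-to-mercator"
        distance: "${buffer_dist_m}"
        cap_style: "round"
    - id: reproject-back
      use: vector.reproject
      params:
        input: "$buffer-500m"
        target_crs: "EPSG:4326"
    - id: save-result
      use: io.write_vector
      params:
        input: "$reproject-back"
        path: "${output_path}"
        format: "GeoJSON"
  outputs:
    result: "$save-result"
```

Keep in mind that EPSG:3857 stretches lengths by roughly 1/cos(latitude), so near 40°N a distance of 500 projected units corresponds to noticeably less than 500 m on the ground. Where true ground distance matters, a local projected CRS (for example the matching UTM zone) is a better `target_crs`.

3.7 Debug mode

If a pipeline fails or the result is not what you expected, use debug mode to see detailed logs:

```bash
# Show DEBUG-level logs
geopipe-agent run buffer-pipeline.yaml --log-level DEBUG

# Emit logs as JSON (easier for machines to parse)
geopipe-agent run buffer-pipeline.yaml --json-log
```

3.8 Inspect a GIS file

Before writing a pipeline, the `info` command gives a quick overview of a data file:

```bash
geopipe-agent info data/roads.geojson
```

Output:

```json
{
  "path": "data/roads.geojson",
  "format": "vector",
  "feature_count": 2,
  "crs": "EPSG:4326",
  "geometry_types": ["LineString"],
  "columns": ["name", "type", "geometry"],
  "bounds": [116.3, 39.85, 116.5, 39.95]
}
```

This helps you determine the coordinate system (which decides the unit of the buffer distance) and see which attribute fields are available.

3.9 A quick tour of the Cookbook

GeoPipeAgent ships with 7 ready-to-use pipeline examples in the `cookbook/` directory that can be run directly:

```bash
# After cloning the repository
geopipe-agent run cookbook/buffer-analysis.yaml
geopipe-agent run cookbook/vector-qc.yaml
geopipe-agent run cookbook/overlay-analysis.yaml
```

These examples cover the most common GIS workflows and are a good starting point for learning the framework.

3.10 Chapter summary

This chapter demonstrated the basic GeoPipeAgent workflow end to end:

1. Prepare the data: GeoJSON, Shapefile, and similar formats can be used directly.
2. Write the YAML: define `variables`, `steps`, and `outputs` under `pipeline:`.
3. Run the pipeline: `geopipe-agent run <file>` executes it in one command and prints a JSON report.
4. Override variables: `--var key=value` changes parameters at run time.
5. Validate the pipeline: `geopipe-agent validate <file>` checks the syntax before execution.
6. Inspect a file: `geopipe-agent info <file>` shows the basic facts about a dataset.

The next chapter, Chapter 4: YAML Pipeline Format, examines every field and rule of the YAML pipeline format in detail.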
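The coordinate-system note in section 3.2 and the reprojection example in section 3.6 both come down to simple arithmetic, which the following Python sketch works through. It is an illustration only and not part of GeoPipeAgent: it assumes a spherical Earth with a mean radius of 6,371,008.8 m, and the helper names `metres_per_degree` and `mercator_ground_distance` are made up for this example.

```python
import math

# Mean Earth radius in metres (spherical approximation; an assumption of this sketch).
EARTH_RADIUS_M = 6_371_008.8


def metres_per_degree(lat_deg: float) -> tuple[float, float]:
    """Return (north-south, east-west) metres spanned by one degree at lat_deg."""
    per_deg_lat = math.pi * EARTH_RADIUS_M / 180.0
    # Meridians converge towards the poles, so a degree of longitude shrinks by cos(lat).
    per_deg_lon = per_deg_lat * math.cos(math.radians(lat_deg))
    return per_deg_lat, per_deg_lon


def mercator_ground_distance(projected_m: float, lat_deg: float) -> float:
    """Approximate ground distance for a length measured in EPSG:3857 units."""
    # Web Mercator stretches lengths by about 1 / cos(latitude).
    return projected_m * math.cos(math.radians(lat_deg))


lat = 39.9  # latitude of the sample roads
ns, ew = metres_per_degree(lat)
print(f"0.01 deg north-south: {0.01 * ns:.0f} m")
print(f"0.01 deg east-west:   {0.01 * ew:.0f} m")
print(f"500 EPSG:3857 units:  {mercator_ground_distance(500, lat):.0f} m on the ground")
```

Under these assumptions, a 0.01 degree buffer at this latitude is a little over 1.1 km north-south but only about 850 m east-west, so a buffer computed in degrees is not circular on the ground. The same cosine factor explains why a 500-unit buffer in EPSG:3857 covers less than 500 m here.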
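The batch-processing idea from section 3.4 (one YAML file, many `--var` combinations) can be scripted. The sketch below only builds the command lines and checks a report shaped like the one in section 3.3; it does not invoke the CLI. The helpers `build_run_command` and `report_succeeded` are hypothetical names for this example, and it assumes the JSON report is available as text, for instance captured from the command's standard output.

```python
import json
import shlex


def build_run_command(pipeline_file: str, variables: dict[str, object]) -> list[str]:
    """Build the argv list for `geopipe-agent run <file> --var key=value ...`."""
    cmd = ["geopipe-agent", "run", pipeline_file]
    for key, value in variables.items():
        cmd += ["--var", f"{key}={value}"]
    return cmd


def report_succeeded(report_text: str) -> bool:
    """Return True when the top-level status and every step status is 'success'."""
    report = json.loads(report_text)
    steps_ok = all(step.get("status") == "success" for step in report.get("steps", []))
    return report.get("status") == "success" and steps_ok


# One job per input file; each dict overrides the variables declared in the YAML.
jobs = [
    {"input_path": "data/roads.geojson", "buffer_dist": 0.01,
     "output_path": "output/road_buffer.geojson"},
    {"input_path": "data/highway.geojson", "buffer_dist": 0.05,
     "output_path": "output/highway_buffer.geojson"},
]

commands = [build_run_command("buffer-pipeline.yaml", job) for job in jobs]
for cmd in commands:
    # shlex.join quotes arguments safely for display or for pasting into a shell.
    print(shlex.join(cmd))
```

To actually run a job, each argv list could be passed to `subprocess.run(cmd, capture_output=True, text=True)` and the captured output handed to `report_succeeded`, provided the report is written to standard output in your installation.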
