How to Deploy the Pandawiki Intelligent Knowledge Base on Ubuntu

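I. System Environment Preparation

```bash
# Update the system and install base packages
sudo apt update && sudo apt upgrade -y
sudo apt install -y git python3-pip python3-venv nginx ufw certbot

# Install PostgreSQL, then create the application role and a database it owns.
# The quoted heredoc keeps the shell from interpreting "!" in the password.
sudo apt install -y postgresql postgresql-contrib
sudo -u postgres psql <<'SQL'
CREATE USER wikiadmin WITH PASSWORD 'YourStrongPassword!';
CREATE DATABASE pandawiki OWNER wikiadmin;
GRANT ALL PRIVILEGES ON DATABASE pandawiki TO wikiadmin;
SQL

# Create the application directory (/opt is root-owned, so hand it to the current user)
sudo mkdir -p /opt/pandawiki
sudo chown "$USER":"$USER" /opt/pandawiki
cd /opt/pandawiki
```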
II. Pandawiki Deployment

1. Get the source code

```bash
git clone https://github.com/pandawiki/pandawiki.git
cd pandawiki

# Create and activate a virtual environment
python3 -m venv venv
source venv/bin/activate
```

2. Install dependencies

```bash
pip install --upgrade pip
pip install -r requirements.txt
pip install gunicorn gevent psycopg2-binary
```

3. Configure the application

```bash
# Create the configuration file from the example
cp config.example.py config.py
nano config.py
```

```python
# Key settings to change
SQLALCHEMY_DATABASE_URI = 'postgresql+psycopg2://wikiadmin:YourStrongPassword!@localhost/pandawiki'
SECRET_KEY = 'paste the output of: openssl rand -hex 32'  # 32 random bytes, i.e. 64 hex characters
ALLOW_REGISTRATION = False  # disable open registration in production
UPLOAD_FOLDER = '/opt/pandawiki/uploads'  # upload directory
```

4. Initialize the database

```bash
flask db upgrade
flask init-data

# Create the upload directory
sudo mkdir -p /opt/pandawiki/uploads
sudo chown -R "$USER":"$USER" /opt/pandawiki
```

5. Test run

```bash
gunicorn --bind 0.0.0.0:8000 app:app
# Visit http://<server-IP>:8000 to verify
```
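The two values edited by hand in `config.py` can also be produced programmatically. The following is a minimal sketch, not part of Pandawiki itself: the helper names `make_secret_key` and `make_database_uri` are hypothetical, and the sample password is invented to show why percent-encoding matters when a password contains characters such as `@` or `/`.

```python
import secrets
from urllib.parse import quote


def make_secret_key(num_bytes: int = 32) -> str:
    """Return num_bytes of randomness as hex text (64 characters for 32 bytes)."""
    return secrets.token_hex(num_bytes)


def make_database_uri(user: str, password: str, host: str, database: str) -> str:
    """Build a PostgreSQL URI for SQLAlchemy, percent-encoding the credentials."""
    # safe="" also encodes "/" so that it cannot be mistaken for a path separator.
    return (
        f"postgresql+psycopg2://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}/{database}"
    )


if __name__ == "__main__":
    # Hypothetical credentials, used only to illustrate the encoding.
    print("SECRET_KEY =", repr(make_secret_key()))
    print(
        "SQLALCHEMY_DATABASE_URI =",
        repr(make_database_uri("wikiadmin", "p@ss/word!", "localhost", "pandawiki")),
    )
```

Without the encoding step, an `@` inside the password would be read as the separator between credentials and host, and the connection would fail with a misleading host error.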
III. Production Deployment

1. Create the system service

```bash
sudo nano /etc/systemd/system/pandawiki.service
```

```ini
[Unit]
Description=Pandawiki Gunicorn Service
After=network.target postgresql.service

[Service]
User=ubuntu
Group=www-data
WorkingDirectory=/opt/pandawiki/pandawiki
Environment="PATH=/opt/pandawiki/pandawiki/venv/bin"
# /run is writable only by root; systemd creates /run/pandawiki for this user
RuntimeDirectory=pandawiki
ExecStart=/opt/pandawiki/pandawiki/venv/bin/gunicorn \
    --worker-class gevent \
    --workers 5 \
    --bind unix:/run/pandawiki/pandawiki.sock \
    --timeout 300 \
    --log-level warning \
    app:app
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target
```

```bash
sudo systemctl daemon-reload
sudo systemctl start pandawiki
sudo systemctl enable pandawiki
```

2. Nginx configuration

```bash
sudo nano /etc/nginx/sites-available/pandawiki
```

```nginx
server {
    listen 80;
    server_name wiki.yourdomain.com;
    client_max_body_size 100M;  # allow large file uploads

    location / {
        include proxy_params;  # already sets Host, X-Real-IP, X-Forwarded-For, X-Forwarded-Proto
        proxy_pass http://unix:/run/pandawiki/pandawiki.sock;
        proxy_connect_timeout 300s;
        proxy_read_timeout 300s;
    }

    location /static {
        alias /opt/pandawiki/pandawiki/static;
        expires 30d;
    }

    location /uploads {
        alias /opt/pandawiki/uploads;
        expires 30d;
        add_header Cache-Control "public";
    }
}
```

```bash
sudo ln -s /etc/nginx/sites-available/pandawiki /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx
```

3. Enable HTTPS

```bash
sudo apt install -y python3-certbot-nginx  # provides the plugin behind certbot --nginx
sudo ufw allow 'Nginx Full'
sudo certbot --nginx -d wiki.yourdomain.com
```
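The Gunicorn flags in the service unit can alternatively live in a Python configuration file, which Gunicorn reads when started with `-c`. The sketch below is an assumption about how one might lay that out, not something the Pandawiki project ships: the file name `gunicorn.conf.py` and the helper `suggested_workers` are illustrative. It derives the worker count from Gunicorn's general rule of thumb of (2 x cores) + 1, under which the fixed value of 5 used above corresponds to a two-core machine.

```python
# gunicorn.conf.py (hypothetical file, kept next to the application code)
import multiprocessing


def suggested_workers(cpu_cores: int) -> int:
    """Gunicorn's rule of thumb: (2 x cores) + 1 worker processes."""
    return 2 * cpu_cores + 1


# Standard Gunicorn setting names, mirroring the command-line flags in the unit.
workers = suggested_workers(multiprocessing.cpu_count())
worker_class = "gevent"
timeout = 300
loglevel = "warning"
```

With this file in place, `ExecStart` would shrink to the Gunicorn binary, `-c gunicorn.conf.py`, the `--bind` option, and `app:app`. Keeping `--bind` on the command line leaves the socket path defined in one place, the unit file.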
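IV. Monitoring Integration

1. Install Prometheus and Grafana

```bash
# Add the Grafana repository (key stored in a keyring; apt-key is deprecated)
sudo apt-get install -y apt-transport-https software-properties-common wget
wget -q -O - https://packages.grafana.com/gpg.key | gpg --dearmor | sudo tee /usr/share/keyrings/grafana.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/grafana.gpg] https://packages.grafana.com/oss/deb stable main" | sudo tee /etc/apt/sources.list.d/grafana.list
sudo apt update

# Install the components
sudo apt install -y prometheus prometheus-node-exporter grafana
sudo systemctl enable --now grafana-server
```

2. Configure Prometheus to scrape Pandawiki

```bash
sudo nano /etc/prometheus/prometheus.yml
```

```yaml
scrape_configs:
  # Add the Pandawiki job
  - job_name: 'pandawiki'
    metrics_path: '/metrics'
    static_configs:
      - targets: ['localhost:8000']  # metrics port exposed by Gunicorn
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
        replacement: 'pandawiki-primary'
```

The target `localhost:8000` assumes Gunicorn is reachable over TCP on port 8000, as in the test run. A service bound only to a Unix socket needs an additional `--bind 127.0.0.1:8000` for the scrape to succeed.

3. Add a Pandawiki dashboard

- Open Grafana at http://<server-IP>:3000 (default login admin/admin) and change the password at first login.
- Add a Prometheus data source: http://localhost:9090
- Import a dashboard from the JSON file pandawiki-grafana-dashboard.json, or build a custom dashboard that tracks:
  - request rate
  - error rate (4xx/5xx)
  - response latency (P95)
  - database query performance
  - memory and CPU usage

4. Configure alerts on key metrics

```bash
sudo nano /etc/prometheus/alert_rules.yml
```

```yaml
groups:
  - name: pandawiki
    rules:
      - alert: HighErrorRate
        expr: sum by (instance) (rate(http_requests_total{status=~"5.."}[5m])) / sum by (instance) (rate(http_requests_total[5m])) * 100 > 5
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "High error rate on {{ $labels.instance }}"
          description: "5xx error rate is above 5% (current value: {{ $value }}%)"
      - alert: SlowResponse
        expr: histogram_quantile(0.95, sum by (le, instance) (rate(http_request_duration_seconds_bucket[5m]))) > 3
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "High response latency on {{ $labels.instance }}"
          description: "P95 response latency is above 3 seconds (current value: {{ $value }}s)"
```

Prometheus loads this file only if it is listed under `rule_files` in `/etc/prometheus/prometheus.yml`; restart Prometheus after adding it. Both expressions aggregate `by (instance)` so that the `{{ $labels.instance }}` placeholder in the annotations has a value.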
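To make the two alert expressions concrete, the arithmetic behind them can be reproduced offline. The sketch below is a simplified illustration: `error_rate_percent` mirrors the ratio in HighErrorRate, and `estimate_quantile` follows the linear interpolation between bucket bounds that Prometheus documents for `histogram_quantile`. The function names and the sample bucket data are assumptions made for the example, and edge cases Prometheus handles (such as negative bounds) are left out.

```python
import math
from typing import List, Tuple


def error_rate_percent(errors_per_sec: float, total_per_sec: float) -> float:
    """Share of 5xx responses in percent; 0.0 when there is no traffic."""
    if total_per_sec == 0:
        return 0.0
    return errors_per_sec / total_per_sec * 100


def estimate_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate quantile q from cumulative (upper_bound, count) buckets.

    Buckets must be sorted by upper bound and end with (math.inf, total).
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # No upper limit to interpolate towards: report the last finite bound.
                return prev_bound
            in_bucket = count - prev_count
            if in_bucket == 0:
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, count
    return prev_bound


if __name__ == "__main__":
    # Hypothetical latency histogram: cumulative counts per upper bound in seconds.
    sample = [(0.5, 50), (1.0, 80), (3.0, 95), (math.inf, 100)]
    print("P95 estimate:", estimate_quantile(0.95, sample))
    print("Error rate %:", error_rate_percent(6, 100))
```

Because the estimate can never be finer than the bucket boundaries, a 3-second threshold is only meaningful if the application's histogram actually has a bucket boundary at or near 3 seconds.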
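V. Data Backup and Recovery

1. Automated backup script

```bash
sudo nano /usr/local/bin/pandawiki-backup
```

```bash
#!/bin/bash
# Back up the database, uploaded files, and configuration
BACKUP_DIR="/backup/pandawiki/$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"

# Database backup
sudo -u postgres pg_dump pandawiki > "$BACKUP_DIR/pandawiki.sql"
gzip "$BACKUP_DIR/pandawiki.sql"

# File backups
tar -czf "$BACKUP_DIR/uploads.tar.gz" /opt/pandawiki/uploads
tar -czf "$BACKUP_DIR/config.tar.gz" /opt/pandawiki/pandawiki/config.py

# Monitoring data backup
tar -czf "$BACKUP_DIR/prometheus.tar.gz" /var/lib/prometheus

# Encrypt each archive (gpg --symmetric handles one file per call),
# then remove the unencrypted copy
for f in "$BACKUP_DIR"/*.gz; do
    gpg --batch --yes --pinentry-mode loopback --passphrase "YourEncryptionKey" --symmetric "$f" && rm -f "$f"
done

# Keep the most recent 30 days of backups
find /backup/pandawiki -mindepth 1 -maxdepth 1 -type d -mtime +30 -exec rm -rf {} +
```

```bash
sudo chmod +x /usr/local/bin/pandawiki-backup
sudo crontab -e
```

```cron
# Back up every day at 02:00
0 2 * * * /usr/local/bin/pandawiki-backup
```

2. Cloud storage backup (AWS S3 example)

```bash
# Install the AWS CLI
sudo apt install awscli -y
aws configure

# Append to the end of the backup script
aws s3 sync --delete /backup/pandawiki s3://your-bucket/pandawiki-backups
```

Because of `--delete`, the bucket mirrors the local 30-day retention: backups pruned locally are removed from S3 on the next sync.

3. Disaster recovery procedure

```bash
# 0. Decrypt the archives (run inside the dated backup directory; gpg asks for the passphrase)
for f in *.gz.gpg; do gpg --output "${f%.gpg}" --decrypt "$f"; done

# 1. Restore the database
gunzip -c pandawiki.sql.gz | sudo -u postgres psql pandawiki

# 2. Restore the files
sudo tar -xzf uploads.tar.gz -C /
sudo tar -xzf config.tar.gz -C /

# 3. Restart the services
sudo systemctl restart pandawiki postgresql nginx
```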
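The 30-day retention rule can be previewed before anything is deleted. The sketch below is one way to do that, under the assumption that backup directories are named `YYYYMMDD` as the script's `date +%Y%m%d` produces. The function name `expired_backups` is hypothetical. Note the difference from the `find -mtime` line in the script: this version decides by the date in the directory name, not by the filesystem modification time.

```python
from datetime import date, datetime, timedelta
from typing import Iterable, List


def expired_backups(dir_names: Iterable[str], today: date, keep_days: int = 30) -> List[str]:
    """Return the dated directory names that fall outside the retention window."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in dir_names:
        try:
            stamp = datetime.strptime(name, "%Y%m%d").date()
        except ValueError:
            continue  # not a dated backup directory, leave it alone
        if stamp < cutoff:
            expired.append(name)
    return sorted(expired)


if __name__ == "__main__":
    # Hypothetical directory listing of /backup/pandawiki
    names = ["20240101", "20240229", "20240301", "20240331", "notes"]
    print(expired_backups(names, today=date(2024, 3, 31)))
```

Listing the candidates first and deleting in a second step makes it harder for a misnamed or unrelated directory under `/backup/pandawiki` to be removed by accident.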
VI. Day-to-Day Operations

1. Daily checklist

```bash
# Service status
sudo systemctl status pandawiki nginx postgresql prometheus

# Logs
journalctl -u pandawiki --since "today" | grep -E 'ERROR|CRITICAL'
tail -100 /var/log/nginx/error.log

# Storage
df -h / /opt /backup
du -sh /opt/pandawiki/uploads

# Performance (pg_top and htop are not installed by default)
pg_top -d pandawiki   # database activity
htop                  # system resources
```

2. Automated maintenance script

```bash
sudo nano /usr/local/bin/pandawiki-maintenance
```

```bash
#!/bin/bash
# Weekly maintenance tasks
source /opt/pandawiki/pandawiki/venv/bin/activate
cd /opt/pandawiki/pandawiki

# 1. Update the code (discards any local changes to tracked files)
git fetch origin
git checkout main
git reset --hard origin/main

# 2. Update dependencies
pip install -r requirements.txt --upgrade

# 3. Run database migrations
flask db upgrade

# 4. Clean up cache files older than 7 days
find /tmp -name "pandawiki_cache_*" -mtime +7 -delete

# 5. Restart the service
sudo systemctl restart pandawiki
```

3. Intelligent Q&A enhancement

```bash
# Install the AI plugin dependencies
pip install openai faiss-cpu

# Configure intelligent Q&A
nano config.py
```

```python
# Add the AI settings
AI_ENABLED = True
OPENAI_API_KEY = 'sk-xxx'
AI_INDEX_PATH = '/opt/pandawiki/faiss_index'
```

```bash
# Build the knowledge index
flask ai build-index
```
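The `grep -E 'ERROR|CRITICAL'` step in the daily checklist shows matching lines but not how many there are of each kind. A small tally can make day-to-day comparison easier. This is an illustrative sketch only: the function name `count_severe` and the sample log lines are invented, and the real log format depends on how the application configures logging.

```python
import re
from collections import Counter
from typing import Iterable

# Same alternation as the grep pattern in the checklist.
LEVEL_PATTERN = re.compile(r"ERROR|CRITICAL")


def count_severe(lines: Iterable[str]) -> Counter:
    """Count log lines per severity keyword (first match on each line)."""
    counts: Counter = Counter()
    for line in lines:
        match = LEVEL_PATTERN.search(line)
        if match:
            counts[match.group(0)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical journal output
    sample = [
        "Mar 01 10:00:01 host gunicorn[812]: ERROR db timeout",
        "Mar 01 10:00:05 host gunicorn[812]: INFO request served",
        "Mar 01 10:02:13 host gunicorn[812]: CRITICAL worker crashed",
        "Mar 01 10:03:40 host gunicorn[812]: ERROR db timeout",
    ]
    for level, count in sorted(count_severe(sample).items()):
        print(level, count)
```

In practice the lines would come from `journalctl -u pandawiki --since today`, for example by iterating over the output of `subprocess.run(..., capture_output=True, text=True).stdout.splitlines()`.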
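VII. Security Hardening

1. Network security

```bash
# Firewall
sudo ufw default deny incoming
sudo ufw allow ssh
sudo ufw allow https
sudo ufw enable

# Install Fail2Ban
sudo apt install -y fail2ban
sudo cp /etc/fail2ban/jail.conf /etc/fail2ban/jail.local
sudo nano /etc/fail2ban/jail.local
```

```ini
[pandawiki]
enabled = true
port = http,https
filter = pandawiki
logpath = /var/log/nginx/access.log
maxretry = 5
findtime = 600
bantime = 86400
```

This jail bans an address for 24 hours after 5 matching requests within 10 minutes. It expects a filter definition at `/etc/fail2ban/filter.d/pandawiki.conf`; Fail2Ban cannot start the jail without one.

2. File security

```bash
# Set ownership and permissions
sudo chown -R ubuntu:www-data /opt/pandawiki
sudo find /opt/pandawiki/uploads -type f -exec chmod 640 {} \;
sudo find /opt/pandawiki/pandawiki -type d -exec chmod 750 {} \;

# Set up file integrity monitoring
sudo apt install -y aide
sudo aideinit
sudo cp /var/lib/aide/aide.db.new /var/lib/aide/aide.db
sudo crontab -e
```

```cron
# Check file integrity every day at 03:00
0 3 * * * /usr/bin/aide --check --config /etc/aide/aide.conf
```

3. Database security

```sql
-- Limit the number of connections (takes effect after PostgreSQL is restarted)
ALTER SYSTEM SET max_connections = 100;
```

```bash
# Disable remote access
sudo nano /etc/postgresql/14/main/pg_hba.conf
```

```
# Change the host rule to:
host    all    all    127.0.0.1/32    md5
```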
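A connection limit only helps if the application's pools fit inside it. The check below is a rough sketch that combines three numbers from this guide: the 5 Gunicorn workers in the service unit, the `pool_size` of 20 and `max_overflow` of 10 from the `SQLALCHEMY_ENGINE_OPTIONS` performance settings, and `max_connections = 100`. It assumes each worker process keeps its own SQLAlchemy pool, which is the usual behavior with Gunicorn's process model, and that PostgreSQL holds back its default of 3 connections for superusers. The function names are hypothetical.

```python
def peak_connections(workers: int, pool_size: int, max_overflow: int) -> int:
    """Worst case: every worker process fills its pool and its overflow."""
    return workers * (pool_size + max_overflow)


def fits_limit(workers: int, pool_size: int, max_overflow: int,
               max_connections: int, reserved: int = 3) -> bool:
    """True if the worst case stays within what ordinary roles may use."""
    return peak_connections(workers, pool_size, max_overflow) <= max_connections - reserved


if __name__ == "__main__":
    # Values taken from the service unit, the pool settings, and max_connections.
    print(peak_connections(5, 20, 10))        # 150
    print(fits_limit(5, 20, 10, 100))         # False
    # One possible adjustment: a smaller pool per worker.
    print(fits_limit(5, 10, 5, 100))          # True
```

With the values as given, the application could ask for up to 150 connections against a limit of 100, so under heavy load some requests would fail with "too many connections". Either the pool per worker or the limit needs adjusting, and backup jobs and monitoring tools also consume connections from the same budget.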
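VIII. Additional Notes

1. Upgrade strategy

- Test first: validate every upgrade in a test environment.
- Back up first: take a full backup before upgrading.
- Upgrade in stages:

```bash
# 1. Take the service out of rotation
sudo systemctl stop pandawiki

# 2. Update the code
git pull origin main

# 3. Run database migrations
flask db upgrade

# 4. Start the service again
sudo systemctl start pandawiki
```

2. Performance tuning

```python
# config.py performance settings (the cache settings require a running Redis server)
CACHE_TYPE = "RedisCache"
CACHE_REDIS_URL = "redis://localhost:6379/0"
SQLALCHEMY_ENGINE_OPTIONS = {
    "pool_pre_ping": True,
    "pool_recycle": 300,
    "pool_size": 20,
    "max_overflow": 10
}
```

3. Disaster recovery plan

| Failure type | Recovery time objective (RTO) | Recovery point objective (RPO) | Recovery approach |
| --- | --- | --- | --- |
| Application failure | 15 minutes | 5 minutes | Restore the service configuration from backup |
| Database corruption | 1 hour | 1 hour | Restore from the SQL backup |
| Server outage | 2 hours | 24 hours | Restore from a cloud image plus the latest data backup |
| Data center failure | 4 hours | 24 hours | Restore from cross-region backups |

With the daily backup schedule described earlier, any RPO shorter than 24 hours requires more frequent database backups than the 02:00 cron job provides.

4. Knowledge base optimization

Knowledge graph integration:

```bash
pip install py2neo
flask knowledge-graph build
```

Q&A quality monitoring:

- Record user satisfaction with answers.
- Tune the AI model periodically.
- Alert on questions the system cannot answer.

5. Compliance requirements

GDPR compliance:

- Store user data encrypted.
- Provide a data export interface.
- Automatically delete users who have been inactive for 90 days.

Operation auditing:

```sql
CREATE TABLE audit_log (
    id SERIAL PRIMARY KEY,
    user_id INT,
    action VARCHAR(50),
    target VARCHAR(100),
    timestamp TIMESTAMP DEFAULT NOW()
);
```

Following the complete procedure above yields a secure, efficient, and observable intelligent knowledge base. Run a security audit every month and a recovery drill every quarter to keep the system running reliably.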