Fully Automated Static Site Publishing with GitHub Actions

Preface
GitHub Pages is rate-limited, so I decided to deploy the static pages to an ECS instance instead; without the rate limiting and the network overhead, it is noticeably faster. This article documents the full automation flow built with GitHub Actions: every Monday at 00:00 the pages are crawled, packaged into a Docker image, pushed to Docker Hub, and the K8s service is updated. GitHub Actions is great stuff: free and easy to use.
Setting up the repositories
GitHub
1. Upload the crawler code. The crawler uses Python's Scrapy framework; the scrapy startproject tutorial command from the official Scrapy Tutorial scaffolds a project quickly. Push the generated project to the GitHub repository, then commit the spider script. The script itself is covered in the earlier article "Python爬取网站内容做静态页面转发" (Crawling Site Content with Python for Static Page Forwarding); a minimal sketch of the spider also follows the reference link below.
PS: You could also create the GitHub repository first, upload only the script, and initialize the Scrapy project inside the Action, but that requires quite a few extra file-moving steps, e.g.: 1. move the script into the project's spiders directory; 2. cd into the Scrapy project directory and run scrapy crawl blog; 3. cp the crawled files back up one level (so the URLs do not gain an extra path segment).
2. Create a Dockerfile with the following content:
# Use the official nginx image as the base image
FROM nginx:stable-alpine
# Copy everything in the current directory into nginx's web root
COPY . /usr/share/nginx/html
# To customize the nginx configuration, put the config file in the current directory and copy it in, e.g.:
# COPY nginx.conf /etc/nginx/nginx.conf
# Expose port 80
EXPOSE 80
# Run nginx when the container starts
CMD ["nginx", "-g", "daemon off;"]
Reference GitHub project: https://github.com/brook-david/auto-crawl-blog-rebulid-for-static-web-action.git
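The actual spider lives in the repository above; as a rough illustration only, here is a minimal sketch of what such a spider could look like. The spider name blog matches the scrapy crawl blog command used later in the workflow, while the start URL, allowed domain, and the save-to-local-path logic are assumptions for this sketch, not the original script.

# spiders/blog.py -- minimal sketch; start_urls and the file layout are assumptions
import os
from urllib.parse import urlparse

import scrapy


class BlogSpider(scrapy.Spider):
    name = "blog"  # matches the `scrapy crawl blog` command used in the workflow
    allowed_domains = ["example.github.io"]  # hypothetical source site
    start_urls = ["https://example.github.io/"]

    def parse(self, response):
        # Save each page under a path mirroring its URL so nginx can serve it as-is
        path = urlparse(response.url).path
        if path == "" or path.endswith("/"):
            path = path + "index.html"
        local_path = path.lstrip("/")
        os.makedirs(os.path.dirname(local_path) or ".", exist_ok=True)
        with open(local_path, "wb") as f:
            f.write(response.body)

        # Follow in-site links and crawl them as well
        for href in response.css("a::attr(href)").getall():
            yield response.follow(href, callback=self.parse)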
Docker Hub
Create a repository named my-blog on Docker Hub.
K8s configuration
Deploying the Deployment
1. Set the Deployment's image to the my-blog repository created above, and set imagePullPolicy to Always so a locally cached image is never reused.
2. Pin the Pod to the control-plane node with nodeSelector and add the matching taint toleration.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: static-blog
  namespace: blog-web
spec:
  replicas: 1
  selector:
    matchLabels:
      app: static-blog
  template:
    metadata:
      labels:
        app: static-blog
    spec:
      containers:
        - name: my-container
          image: dbopen/my-blog:main # replace with your own image repository and tag
          imagePullPolicy: Always # always pull the latest image
          ports:
            - containerPort: 80 # specify the port here if your application exposes one
      tolerations:
        - key: "node-role.kubernetes.io/control-plane"
          operator: "Equal"
          value: ""
          effect: "NoSchedule"
      nodeSelector:
        kubernetes.io/hostname: izbp1605iwejf5qgem2c7hz
Deploying the Service
apiVersion: v1
kind: Service
metadata:
  name: my-static-blog
  namespace: blog-web
  labels:
    app: static-blog
spec:
  ports:
    - port: 80 # the Service port
      targetPort: 80 # must match the containerPort in the Pod
  selector:
    app: static-blog
  type: ClusterIP
Deploying the Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-static-blog-ingress
  namespace: blog-web
  annotations:
    acme.cert-manager.io/http01-edit-in-place: "true"
    cert-manager.io/cluster-issuer: letsencrypt-prod # the cert-manager cluster-issuer to use
    ingress.kubernetes.io/ssl-redirect: "false"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - static.wanderto.top
      secretName: blog-tls-secret-blog-static
  # Ingress rules
  rules:
    - host: static.wanderto.top # set the domain for host-based routing
      http:
        paths:
          - path: / # adjust the path as needed
            pathType: Prefix # one of Prefix, Exact, or ImplementationSpecific
            backend:
              service:
                name: my-static-blog # replace with your Service name
                port:
                  number: 80 # replace with your Service port
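Not part of the original walkthrough, but assuming the three manifests above are saved as deployment.yaml, service.yaml, and ingress.yaml (hypothetical file names), they can be applied and checked roughly like this:

# Apply the manifests and wait for the rollout to finish
kubectl apply -f deployment.yaml -f service.yaml -f ingress.yaml
kubectl -n blog-web rollout status deployment/static-blog
# Quick check that the Pod, Service and Ingress are all up
kubectl -n blog-web get pods,svc,ingress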
Obtaining the K8s API token
The Action notifies K8s to update the service by calling the K8s API, which requires a token. There are several ways to generate one; here a ServiceAccount is used, bound to a Role that grants view and update permissions on the Deployments and Pods in the blog-web namespace. Keep the generated token for later.
kubectl create serviceaccount update-pod-for-api -n blog-web
kubectl create role pod-update --verb=get --verb=list --verb=watch --verb=update --verb=patch --resource=pods --resource=deployment -n blog-web
kubectl create rolebinding update-pod-for-api-binding --role=pod-update --serviceaccount=blog-web:update-pod-for-api -n blog-web
# The token is valid for one month
kubectl create token update-pod-for-api --duration=2592000s -n blog-web
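Optionally, the new binding can be sanity-checked by impersonating the ServiceAccount with kubectl auth can-i; this is just a quick verification and not part of the original steps:

# Should print "yes" if the RoleBinding grants patch on Deployments in blog-web
kubectl auth can-i patch deployments -n blog-web \
  --as=system:serviceaccount:blog-web:update-pod-for-api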
GitHub Actions configuration
Preparing the Python environment and crawling the pages
Install Python and Scrapy on the runner, then run scrapy crawl blog to crawl the pages into the current directory.
- name: Setup Python # Set Python version
uses: actions/setup-python@v5
with:
python-version: 3.8 # ${{ matrix.python-version }}
# Install pip and scrapy
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install scrapy
- name: Crawl
run: scrapy crawl blog
Building the image and pushing it to Docker Hub
Set up Docker Buildx and log in to Docker Hub, then build an image from the Dockerfile in the current directory and push it. The Docker Hub username and password must be stored as repository secrets (Settings -> Security -> Secrets and variables); see: Using secrets in GitHub Actions.
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
# Docker Image upload
- name: Log in to Docker Hub
uses: docker/login-action@f4ef78c080cd8ba55a85445d5b36e214a81df20a
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
- name: Extract metadata (tags, labels) for Docker
id: meta
uses: docker/metadata-action@9ec57ed1fcdbf14dcef7dfbe97b2010124a938b7
with:
images: dbopen/my-blog
- name: Build and push Docker image
id: push
uses: docker/build-push-action@3b5e8027fcad23fda98b2e3ac259d8d67585f671
with:
context: .
file: ./Dockerfile
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
- name: Generate artifact attestation
uses: actions/attest-build-provenance@v1
with:
subject-name: ${{ env.DOCKER_REGISTRY }}/${{ env.DOCKER_IMAGE_NAME_BLOG}}
subject-digest: ${{ steps.push.outputs.digest }}
push-to-registry: false
Notifying K8s to update the service
The workflow calls the K8s API to trigger a rollout: it patches the Deployment's Pod template annotations with the current timestamp, which causes the Pods to be recreated with the new image. The token generated earlier must be stored in the repository secrets.
- name: Notify
run: |
TIMESTAMP=$(date +%s)
curl -k --location --request PATCH 'https://${{ vars.K8S_HOST }}:6443/apis/apps/v1/namespaces/blog-web/deployments/static-blog' \
--header 'Authorization: Bearer ${{ secrets.K8S_UPDATE_TOKEN }}' \
--header 'Content-Type: application/merge-patch+json' \
--data '{
"spec": {
"template": {
"metadata": {
"annotations": {
"issued-timestamp": "'$TIMESTAMP'"
}
}
}
}
}'
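The same rolling-restart patch can also be issued locally with kubectl instead of the raw API call; this is only an optional sanity check before relying on the workflow, not part of the original pipeline:

# Equivalent merge patch issued via kubectl instead of curl against the API server
kubectl -n blog-web patch deployment static-blog --type merge \
  -p "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"issued-timestamp\":\"$(date +%s)\"}}}}}"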
The complete Action configuration file
# Workflow: crawl the blog pages, build and push the Docker image, then notify K8s
name: Crawl Blog Pages
on:
schedule:
# * is a special character in YAML so you have to quote this string
# Runs at 00:00, only on Monday.
- cron: '0 0 * * 1'
# Allows you to run this workflow manually from the Actions tab
workflow_dispatch:
# Allow only one concurrent deployment, skipping runs queued between the run in-progress and latest queued.
# However, do NOT cancel in-progress runs as we want to allow these production deployments to complete.
concurrency:
group: "pages"
cancel-in-progress: false
# env
env:
DOCKER_REGISTRY: docker.io
DOCKER_IMAGE_NAME_BLOG: dbopen/my-blog
jobs:
crawl_page_and_build:
name: Crawl Blog Pages And Build Docker Image
runs-on: ubuntu-latest
# strategy:
# matrix:
# python-version: ["3.7", "3.8", "3.9", "3.10", "3.11"]
permissions:
packages: write
contents: read
attestations: write
id-token: write
steps:
- uses: actions/checkout@v4
- name: Setup Python # Set Python version
uses: actions/setup-python@v5
with:
python-version: 3.8 # ${{ matrix.python-version }}
# Install pip and scrapy
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install scrapy
- name: Crawl
run: scrapy crawl blog && ls -l
# Build
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
# Docker Image upload
- name: Log in to Docker Hub
uses: docker/login-action@f4ef78c080cd8ba55a85445d5b36e214a81df20a
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
- name: Extract metadata (tags, labels) for Docker
id: meta
uses: docker/metadata-action@9ec57ed1fcdbf14dcef7dfbe97b2010124a938b7
with:
images: dbopen/my-blog
- name: Build and push Docker image
id: push
uses: docker/build-push-action@3b5e8027fcad23fda98b2e3ac259d8d67585f671
with:
context: .
file: ./Dockerfile
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
- name: Generate artifact attestation
uses: actions/attest-build-provenance@v1
with:
subject-name: ${{ env.DOCKER_REGISTRY }}/${{ env.DOCKER_IMAGE_NAME_BLOG}}
subject-digest: ${{ steps.push.outputs.digest }}
push-to-registry: false
notify_k8s_deployment_update:
name: Notify K8s Deployment Update
runs-on: ubuntu-latest
needs: crawl_page_and_build
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Notify
run: |
TIMESTAMP=$(date +%s)
curl -k --location --request PATCH 'https://${{ vars.K8S_HOST }}:6443/apis/apps/v1/namespaces/blog-web/deployments/static-blog' \
--header 'Authorization: Bearer ${{ secrets.K8S_UPDATE_TOKEN }}' \
--header 'Content-Type: application/merge-patch+json' \
--data '{
"spec": {
"template": {
"metadata": {
"annotations": {
"issued-timestamp": "'$TIMESTAMP'"
}
}
}
}
}'
Reference: https://github.com/brook-david/auto-crawl-blog-rebulid-for-static-web-action/blob/main/.github/workflows/crawl-page.yml
Reconfiguring the Ingress forwarding address
Finally, update the Ingress configuration of the main site in K8s so that a 502 response falls back to the static.wanderto.top pages.
# Ingress server-snippets configuration:
nginx.org/server-snippets: |
  error_page 502 = @fallback;
  location @fallback {
    proxy_pass https://xxxxxx.github.io;
  }
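For context, here is a sketch of how that snippet might sit inside the main site's Ingress. The Ingress name, Service name, and host below are placeholders; only the annotation itself is taken from above.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: main-blog-ingress        # hypothetical name of the main site's Ingress
  namespace: blog-web
  annotations:
    nginx.org/server-snippets: |
      error_page 502 = @fallback;
      location @fallback {
        proxy_pass https://xxxxxx.github.io;
      }
spec:
  ingressClassName: nginx
  rules:
    - host: www.example.com       # placeholder for the main site's domain
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: main-blog   # placeholder Service name
                port:
                  number: 80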
This article is a follow-up to "Python爬取网站内容做静态页面转发" (Crawling Site Content with Python for Static Page Forwarding).