# cnblogs archiver

## How can I help?

### Binary

Go to the [release](https://git.saveweb.org/saveweb/cnblogs/releases) page, download `cnblogs_posts_list`, and run it.

WARNING: DO NOT run multiple instances of `cnblogs_posts_list` concurrently from the same IP address, or cnblogs may ban you.

### With Docker

```bash
export ARCHIVIST= # a string that can uniquely identify your node (for example: bob-gcloud-514). (Legal characters: letters, numbers, -, _)
```

```bash
if [[ -z "$ARCHIVIST" ]]; then
    echo "WARN: ARCHIVIST must be set"
    exit 1
fi

_image="icecodexi/saveweb:cnblogs"

# Pull the latest image, then stop the running container (if any).
docker pull "${_image}" \
    && docker stop cnblogs

# Remove the old container and start a fresh one.
docker rm -f cnblogs \
    && docker run --env ARCHIVIST="$ARCHIVIST" --restart always \
        --volume /etc/localtime:/etc/localtime:ro \
        --cpu-shares 512 --memory 512M --memory-swap 512M \
        --label=com.centurylinklabs.watchtower.enable=true \
        --detach --name cnblogs \
        "${_image}"
```

## Archiving stages

### Stage 1: detect all blog IDs (~~finished~~)

Run `cnblogs_rss_detect`.

### Stage 2: iterate over all blog IDs and collect the URLs of all posts (~~finished~~)

Run `cnblogs_posts_list`.

### Stage 3: export the URLs of all posts and send them to ArchiveTeam (~~finished~~)

### Stage 4: also download the HTML of all posts ourselves (TODO)
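
## Checking `ARCHIVIST` before starting (sketch)

The Docker instructions state that `ARCHIVIST` may contain only letters, numbers, `-`, and `_`, but the launch script only checks that the variable is non-empty. The snippet below is a minimal sketch of a stricter check. The function name `is_valid_archivist` is hypothetical and not part of the cnblogs tools, and whether the server itself rejects other characters is an assumption based only on the comment in the `export` line.

```shell
# Hypothetical helper (not shipped with cnblogs): succeeds only when the
# argument is non-empty and made up solely of letters, digits, '-' and '_'.
is_valid_archivist() {
    case "$1" in
        '' | *[!A-Za-z0-9_-]*) return 1 ;;  # empty, or contains an illegal character
        *) return 0 ;;
    esac
}

# Example use before `docker run` (left commented out so that sourcing
# this snippet has no side effects):
# is_valid_archivist "$ARCHIVIST" || { echo "ERROR: invalid ARCHIVIST" >&2; exit 1; }
```

A `case` pattern is used instead of a regular expression so the check also runs under plain `sh`, not just `bash`.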