
cnblogs archiver

How can I help?

Go to the release page, download cnblogs_posts_list, and run it.

WARNING: DO NOT run more than one cnblogs_posts_list at a time on the same IP, or you may be banned by cnblogs.
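One way to avoid starting a second copy by accident is a lock file. The sketch below is not part of cnblogs_posts_list; `acquireLock` and the lock file name are hypothetical, and it only guards a single machine (it cannot detect another host behind the same IP).

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// acquireLock is a hypothetical helper: it creates lockPath exclusively and
// fails if the file already exists, which suggests another instance is running.
// The returned release function removes the lock file again.
func acquireLock(lockPath string) (release func(), err error) {
	f, err := os.OpenFile(lockPath, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o644)
	if err != nil {
		if errors.Is(err, os.ErrExist) {
			return nil, fmt.Errorf("lock %s exists: another instance may be running", lockPath)
		}
		return nil, err
	}
	// Record our PID so a stale lock can be inspected by hand.
	fmt.Fprintf(f, "%d\n", os.Getpid())
	f.Close()
	return func() { os.Remove(lockPath) }, nil
}

func main() {
	// The lock file name is an assumption made for this example.
	release, err := acquireLock("cnblogs_posts_list.lock")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer release()
	fmt.Println("lock acquired; safe to start listing posts")
}
```

If the process crashes, the lock file stays behind and has to be deleted manually before the next run.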

NOTE: We will publish a Docker image soon™ (<30 minutes).


NOTE: cnblogs_rss_detect is finished; you don't need to run it.

Archiving stages

stage1: detect all blogids (finished)

run cnblogs_rss_detect

stage2: iterate over all blogids and collect all posts' URLs (running)

run cnblogs_posts_list
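As a rough illustration of what stage2 involves, here is a minimal sketch that walks blog IDs one at a time with a pause between requests. It is not the actual cnblogs_posts_list code: `collectPostURLs`, `listFunc`, `errSimulated`, the integer blog IDs, and the placeholder URLs are all assumptions made for the example, and the real fetching logic is replaced by a fake function.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errSimulated is a placeholder error used by the fake lister below.
var errSimulated = errors.New("simulated failure")

// listFunc returns the post URLs of one blog. How the real tool fetches them
// is not shown here; this type is a hypothetical stand-in.
type listFunc func(blogID int) ([]string, error)

// collectPostURLs visits blogIDs strictly one after another, sleeping for
// delay between requests, and returns the URLs per blog plus the IDs that failed.
func collectPostURLs(blogIDs []int, list listFunc, delay time.Duration) (map[int][]string, []int) {
	urls := make(map[int][]string, len(blogIDs))
	var failed []int
	for i, id := range blogIDs {
		if i > 0 {
			time.Sleep(delay) // stay sequential and slow to reduce the risk of a ban
		}
		posts, err := list(id)
		if err != nil {
			failed = append(failed, id) // remember the ID so it can be retried later
			continue
		}
		urls[id] = posts
	}
	return urls, failed
}

func main() {
	// fake stands in for a real HTTP request; the URL is a placeholder.
	fake := func(blogID int) ([]string, error) {
		if blogID == 2 {
			return nil, errSimulated
		}
		return []string{fmt.Sprintf("https://example.invalid/blog/%d/post/1", blogID)}, nil
	}
	urls, failed := collectPostURLs([]int{1, 2, 3}, fake, 10*time.Millisecond)
	fmt.Println(len(urls), "blogs collected; failed:", failed)
}
```

Keeping the loop sequential, rather than using goroutines, matches the warning above about concurrent requests from one IP.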

stage3: export all posts' URLs and send them to ArchiveTeam (TODO)

stage4: also download the HTML of all posts ourselves (TODO)