
cnblogs archiver

How can I help?

Go to the release page, download cnblogs_posts_list, and run it.

WARNING: DO NOT run more than one instance of cnblogs_posts_list at a time from the same IP, or cnblogs may ban you.

NOTE: We will publish a Docker image soon™ (<30 minutes).


NOTE: cnblogs_rss_detect has already finished, so you don't need to run it.

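Archiving stages

Stage 1: Detect all existing blogids (done)

Run cnblogs_rss_detect

Stage 2: Iterate over every blog and collect the URLs of all posts (in progress)

Run cnblogs_posts_list

Stage 3: Export the post URLs to urls.txt and send it to ArchiveTeam

Stage 4: Download the post HTML

Keep a plain-text archive of every post on the site (STWP)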