2024-07-17 11:05:33 -07:00
# cnblogs archiver
## How can I help?
Go to the [releases](https://git.saveweb.org/saveweb/cnblogs/releases) page, download `cnblogs_posts_list`, and run it.

WARNING: DO NOT run multiple instances of `cnblogs_posts_list` concurrently from the same IP address, or cnblogs may ban you.

NOTE: We will publish a Docker image soon™ (<30 minutes).

---
NOTE: `cnblogs_rss_detect` has already finished, so you don't need to run it.
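## Archiving stages
### Stage 1: Detect all existing blogids (done)
Run `cnblogs_rss_detect`
### Stage 2: Traverse all blogs and collect the URLs of all posts (in progress)
Run `cnblogs_posts_list`
### Stage 3: Export the post URLs to urls.txt and send it to ArchiveTeam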
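
A minimal sketch of what the export step could look like, assuming `urls.txt` is a UTF-8 file with one URL per line and that duplicates should be dropped. This README does not specify the format, and `export_urls` is a hypothetical helper, not part of the released tools:

```python
from pathlib import Path


def export_urls(urls, out_path="urls.txt"):
    """Write unique, non-empty URLs to out_path, one per line.

    Order of first appearance is preserved. Returns the number of URLs written.
    """
    # dict.fromkeys drops duplicates while keeping insertion order.
    unique = list(dict.fromkeys(u.strip() for u in urls if u.strip()))
    Path(out_path).write_text("".join(u + "\n" for u in unique), encoding="utf-8")
    return len(unique)


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    count = export_urls([
        "https://www.cnblogs.com/example/p/1.html",
        "https://www.cnblogs.com/example/p/2.html",
        "https://www.cnblogs.com/example/p/1.html",
    ])
    print(f"wrote {count} URLs")
```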
### Stage 4: Download post HTML
Keep a plain-text archive of all posts on the site (STWP)