diff --git a/README.md b/README.md
index f922738..e1c0532 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,29 @@
-GOOOOOOO
+# cnblogs archiver
+
+## How can I help?
+
+Go to the [release](https://git.saveweb.org/saveweb/cnblogs/releases) page, download `cnblogs_posts_list`, and run it.
+
+WARNING: DO NOT run `cnblogs_posts_list` concurrently (on the same IP); you may be banned by cnblogs.
+
+NOTE: We will publish a docker image soon™ (<30 minutes).
+
+---
+
+NOTE: `cnblogs_rss_detect` is finished; you don't need to run it.
+
+## Archiving stages
+
+### Stage 1: Detect all existing blogids (done)
+
+Run `cnblogs_rss_detect`
+
+### Stage 2: Iterate over all blogs and collect the URLs of all posts (in progress)
+
+Run `cnblogs_posts_list`
+
+### Stage 3: Export the post URLs to urls.txt and send it to ArchiveTeam
+
+### Stage 4: Download post HTML
+
+Keep a plain-text archive of all posts on the site (STWP)