# NZZ Downloader
The [NZZ](https://en.wikipedia.org/wiki/Neue_Z%C3%BCrcher_Zeitung) is the Swiss
newspaper of record. Its first issue was published all the way back in 1780.
Even better, you can download every single issue ever released (if you have a
subscription, of course).

This little tool helps you download all issues released within a specified
time span.

It was written because the archive website is not very friendly, in the author's
opinion, and because the website does not let you download everything in a time
span.

Because the archive website makes heavy use of JavaScript, the tool uses
[Selenium](https://www.selenium.dev/) to remote-control a browser (Firefox in
this case). This is also why it is not all that fast, but that is OK.

Please only use this with your own credentials; the journalists deserve to be
paid for their work.

![screenshot](screenshot.jpg)
## Installation
You need to be comfortable with the command line to use the NZZ downloader. It
has only been tested on Linux, though it should work fine on Windows or macOS.

The following are required:

- [Node.js](https://nodejs.org/en/download/) (the LTS version is fine)
- [Firefox](https://www.mozilla.org/en-US/firefox/download/thanks/)
- [geckodriver](https://github.com/mozilla/geckodriver/releases)
- [nzz.js](https://code.vanwa.ch/sebastian/nzz-downloader/-/releases)
## Usage
```
Usage: nzz.js -f [date] -t [date] -o [path] -u [username] -p [password]

Options:
      --version   Show version number                                  [boolean]
  -h, --help      Show help                                            [boolean]
  -f, --from      Earliest issue to download.            [default: "2020-12-23"]
  -t, --to        Latest issue to download.              [default: "2020-12-23"]
  -o, --out       Download directory.                         [default: "./nzz"]
  -u, --user      Username for the nzz archive.                       [required]
  -p, --password  Password for the user.                              [required]
```
### Examples
Download all existing issues from 1780-01-01 until 1780-02-29 to the default
directory `./nzz`:

```
./nzz.js -u 'myuser@example.com' -p 'mypassword' -f 1780-01-01 -t 1780-02-29
```
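To make it clearer which days a `-f`/`-t` pair covers, here is a minimal sketch
that expands two `YYYY-MM-DD` dates into every calendar day in between and
rejects dates that do not exist. The function names `parseIsoDate` and
`candidateDates` are made up for this illustration and are not taken from
nzz.js; how the tool itself walks the time span may well differ.

```javascript
// Illustrative sketch only: parseIsoDate and candidateDates are hypothetical
// helpers, not part of nzz.js.

// Parse a strict YYYY-MM-DD string into a UTC Date, rejecting impossible days.
function parseIsoDate(text) {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(text);
  if (!match) {
    throw new RangeError(`Expected YYYY-MM-DD, got "${text}"`);
  }
  const [year, month, day] = match.slice(1).map(Number);
  const date = new Date(Date.UTC(year, month - 1, day));
  // Date.UTC silently rolls an overflowing day into the next month,
  // so compare the parts back against the input.
  if (
    date.getUTCFullYear() !== year ||
    date.getUTCMonth() !== month - 1 ||
    date.getUTCDate() !== day
  ) {
    throw new RangeError(`"${text}" is not a real calendar date`);
  }
  return date;
}

// List every calendar day from `from` to `to`, both inclusive.
function candidateDates(from, to) {
  const start = parseIsoDate(from).getTime();
  const end = parseIsoDate(to).getTime();
  if (start > end) {
    throw new RangeError('"from" must not be later than "to"');
  }
  const msPerDay = 24 * 60 * 60 * 1000; // safe in UTC, which has no DST
  const days = [];
  for (let t = start; t <= end; t += msPerDay) {
    days.push(new Date(t).toISOString().slice(0, 10));
  }
  return days;
}
```

Not every calendar day has an issue, so a list like this only gives candidate
dates; days without an issue would have to be skipped.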
## Caveats
You need a good internet connection, as the program only waits 5 seconds until
the download of an issue can start. Unfortunately, this is hard to solve.

If you get strange errors about elements not being visible, wait a bit and try
again; it is usually a network problem.
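The "wait a bit and try again" advice could in principle be automated. The
following is a rough sketch of that idea, not something nzz.js is known to do:
`withRetries` and `sleep` are hypothetical helpers, and the default pause of
5000 ms is only an assumption that mirrors the 5 seconds mentioned above.

```javascript
// Illustrative sketch only: withRetries and sleep are hypothetical helpers,
// not part of nzz.js.

// Resolve after the given number of milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Call `action` until it succeeds or `attempts` is used up.
// The pause grows with each failed attempt (delayMs, 2 * delayMs, ...).
async function withRetries(action, { attempts = 3, delayMs = 5000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt += 1) {
    try {
      return await action(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        await sleep(delayMs * attempt);
      }
    }
  }
  // Every attempt failed: surface the most recent error to the caller.
  throw lastError;
}
```

A caller would wrap whatever step tends to fail on a slow connection, for
example `await withRetries(() => downloadIssue(date))`, where `downloadIssue`
stands in for that step and is likewise a made-up name.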
## Licence
Licensed as [MPL 2.0](https://www.mozilla.org/en-US/MPL/2.0/).