NZZ Downloader

The NZZ is the Swiss newspaper of record. Its first issue appeared all the way back in 1780. Even better, you can download every single issue ever released (if you have a subscription, of course).

This little tool helps you with downloading all released issues in a specified time span.

It was written because the archive website is not very friendly, in the author's opinion, and because it offers no way to download everything in a time span.

Because the archive website makes heavy use of JavaScript, the tool uses Selenium to remote control a browser (Firefox in this case). This is also why it is not all that fast, but that is OK.

Please only use this with your own credentials; the journalists deserve to be paid for their work.

screenshot

Installation

You need to be comfortable with the command line to use the NZZ downloader. It has only been tested on Linux systems, though it should work fine on Windows or macOS.

Usage

Usage: nzz.js -f [date] -t [date] -o [path] -u [username] -p [password]

Options:
      --version   Show version number                                  [boolean]
  -h, --help      Show help                                            [boolean]
  -f, --from      Earliest issue to download.            [default: "2020-12-23"]
  -t, --to        Latest issue to download.              [default: "2020-12-23"]
  -o, --out       Download directory.                         [default: "./nzz"]
  -u, --user      Username for the nzz archive.                       [required]
  -p, --password  Password for the user.                              [required]
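The options above could be parsed roughly as follows. This is only a minimal sketch, not the code in nzz.js: it uses Node's built-in `util.parseArgs` rather than whatever argument parser the tool actually relies on, and `parseCliOptions` is a hypothetical name. The default values are simply copied from the help text.

```javascript
const { parseArgs } = require('node:util');

// Hypothetical helper: parse the flags listed in the help text.
function parseCliOptions(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      from: { type: 'string', short: 'f' },
      to: { type: 'string', short: 't' },
      out: { type: 'string', short: 'o' },
      user: { type: 'string', short: 'u' },
      password: { type: 'string', short: 'p' },
    },
  });

  // parseArgs has no notion of required options, so check them by hand.
  for (const name of ['user', 'password']) {
    if (!values[name]) {
      throw new Error(`Missing required option: --${name}`);
    }
  }

  // Fall back to the defaults printed in the help text.
  return {
    from: values.from ?? '2020-12-23',
    to: values.to ?? '2020-12-23',
    out: values.out ?? './nzz',
    user: values.user,
    password: values.password,
  };
}
```

In a script this would typically be called as `parseCliOptions(process.argv.slice(2))`.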

Examples

Download all existing issues from 1780-01-01 until 1780-02-29 to the default directory "./nzz"

./nzz.js -u 'myuser@example.com' -p 'mypassword' -f 1780-01-01 -t 1780-02-29
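To illustrate what a time span means here, the following sketch lists every calendar day between two dates, inclusive. It is an assumption about how such a range could be walked, not the logic in nzz.js, and `datesBetween` is a hypothetical helper. Not every day necessarily has an issue, so a real download loop would still have to skip days without one.

```javascript
// Hypothetical helper: list every calendar day from `from` to `to`,
// inclusive, as YYYY-MM-DD strings. All arithmetic is done in UTC so
// daylight saving time cannot skip or repeat a day.
function datesBetween(from, to) {
  const parse = (text) => {
    const [year, month, day] = text.split('-').map(Number);
    const date = new Date(Date.UTC(year, month - 1, day));
    // Reject dates that do not exist (Date would silently roll them over).
    if (
      date.getUTCFullYear() !== year ||
      date.getUTCMonth() !== month - 1 ||
      date.getUTCDate() !== day
    ) {
      throw new RangeError(`Invalid date: ${text}`);
    }
    return date;
  };

  const end = parse(to);
  const days = [];
  for (let day = parse(from); day <= end; day.setUTCDate(day.getUTCDate() + 1)) {
    days.push(day.toISOString().slice(0, 10));
  }
  return days;
}
```

For example, `datesBetween('1780-01-01', '1780-01-03')` yields the three days from the 1st to the 3rd of January 1780.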

Caveats

You need a good internet connection, as the program only waits a couple of seconds for the download of an issue to start. Unfortunately, this is hard to solve.

If you get strange errors about elements not being visible, wait a bit and try again; it is usually a network problem.
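The "wait a bit and try again" advice can also be automated. The sketch below is a generic retry wrapper with a growing pause between attempts. It is not part of nzz.js; `withRetry`, `attempts` and `delayMs` are hypothetical names, and the action passed in stands for whatever step fails on a slow connection.

```javascript
// Hypothetical helper: run an async action and retry it when it throws,
// pausing a little longer after each failed attempt.
async function withRetry(action, { attempts = 3, delayMs = 2000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await action();
    } catch (err) {
      lastError = err;
      if (attempt < attempts) {
        // Pause for delayMs, then 2 * delayMs, and so on.
        await new Promise((resolve) => setTimeout(resolve, delayMs * attempt));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

A flaky step would then be wrapped as `await withRetry(() => someStep())`, where `someStep` is a placeholder for the failing operation.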

The proper way of doing this would be to figure out how the calls to the backend work and make those calls directly, instead of using the heavy-handed approach of instrumenting a browser.

Licence

Licensed as MPL 2.0.