Last updated: 2020-04-04
You'll need the following:
Install them using your favorite method (Homebrew, etc.).
First, fork the repository so you're ready to contribute back.
Replace yourusername below with your GitHub username:
git clone --recursive git@github.com:yourusername/coronadatascraper.git
cd coronadatascraper
git remote add upstream git@github.com:lazd/coronadatascraper.git
If you've already cloned without --recursive, run:
git submodule init
git submodule update
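If you're not sure whether the submodules were initialized, a small check like the one below may help. It is only a sketch: count_uninitialized is a hypothetical helper name, and it relies on git submodule status marking uninitialized submodules with a leading "-".

```shell
# Count submodules that have not been initialized yet.
# Reads `git submodule status` output on stdin; git prefixes each
# uninitialized submodule line with "-".
# (count_uninitialized is a hypothetical helper, not part of this repository.)
count_uninitialized() {
  # grep -c exits non-zero when nothing matches, so mask that with `|| true`
  grep -c '^-' || true
}

# Usage, from the repository root:
#   git submodule status | count_uninitialized
```

A result other than 0 suggests that git submodule init and git submodule update still need to be run.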
yarn install
If you get an error message saying you have an incompatible version of node, you may need to change versions. You can use n to change node versions: install it and run n lts.
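To see which major version of node you are currently on before switching, something like the following could be used. This is a minimal sketch: node_major is a hypothetical helper name, and this guide does not state which version the project requires, so compare the result against whatever the error message reports.

```shell
# Print the major version from a `node --version` style string such as "v12.16.1".
# (node_major is a hypothetical helper, not part of this repository.)
node_major() {
  version="${1#v}"                # drop the leading "v", if any
  printf '%s\n' "${version%%.*}"  # keep everything before the first dot
}

# Usage:
#   node_major "$(node --version)"
```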
yarn start
Pulling from upstream gets you the latest scrapers, as well as the cache, so we're not hammering servers:
git pull upstream master --recurse-submodules
Note: If you encounter issues updating a submodule, such as Could not access submodule, you may need to update your fork using:
git submodule update --init --recursive
To run the scrapers for today:
yarn start
To use a subset of scrapers, use --location/-l:
yarn start --location "US/PA"
The location value should match a path under src/shared/scrapers/.
Examples:
yarn start --location "US": run all scrapers in src/shared/scrapers/US and child folders
yarn start --location "US/CA": run all scrapers in src/shared/scrapers/US/CA and child folders
yarn start --location "US/CA/alameda-county.js": run this single scraper
yarn start --location "AU": run all scrapers in src/shared/scrapers/AU and child folders
yarn start --location "AU/index.js": run this single scraper
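Because the location value is matched against paths under src/shared/scrapers/, a typo can be caught before starting a run. The following is a sketch under that assumption; location_exists is a hypothetical helper name, and its optional second argument exists only so the check can be pointed at another directory.

```shell
# Succeed if LOCATION names a file or folder under the scrapers directory.
# The second argument is optional and defaults to the path named in this guide.
# (location_exists is a hypothetical helper, not part of this repository.)
location_exists() {
  location="$1"
  scrapers_dir="${2:-src/shared/scrapers}"
  [ -e "$scrapers_dir/$location" ]
}

# Usage, from the repository root:
#   location_exists "US/CA" && yarn start --location "US/CA"
```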
To skip a scraper, use --skip/-s:
yarn start --skip "US/CA/alameda-county.js"
To re-generate old data from cache (or timeseries), use --date/-d:
yarn start -d 2020-3-12
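The dates in this guide are written without zero padding (2020-3-12 rather than 2020-03-12), and the option help describes the format as YYYY-M-D. Whether padded dates are also accepted is not stated here, so the sketch below simply converts a padded date to the unpadded form. to_ymd is a hypothetical helper name.

```shell
# Convert a zero-padded date such as 2020-03-12 to the YYYY-M-D form 2020-3-12.
# Dates that are already unpadded pass through unchanged.
# (to_ymd is a hypothetical helper, not part of this repository.)
to_ymd() {
  year="${1%%-*}"     # text before the first dash
  rest="${1#*-}"      # "month-day"
  month="${rest%%-*}"
  day="${rest#*-}"
  # ${var#0} strips a single leading zero, so "03" becomes "3" and "10" stays "10"
  printf '%s-%s-%s\n' "$year" "${month#0}" "${day#0}"
}

# Usage:
#   yarn start -d "$(to_ymd 2020-03-12)"
```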
To output files without the date suffix, use --outputSuffix/-o:
yarn start -d 2020-3-12 -o
To generate a timeseries for the entire history of the pandemic using cached data:
yarn timeseries
To generate it for a date range, use -d/-e:
yarn timeseries -d 2020-3-15 -e 2020-3-18
This can be combined with -l to test a single scraper:
yarn timeseries -d 2020-3-15 -e 2020-3-18 -l 'WA, USA'
Run yarn options to see the command line options, e.g.:
Options:
--version Show version number [boolean]
--date, -d Generate data for or start the timeseries at the provided
date in YYYY-M-D format [string]
...
We use Tape for tests.
# Run all tests
yarn test
# Run a single test file
node path/to/file.js
To build the website and start a development server at http://localhost:3000/:
yarn dev
To build the latest data, a full timeseries, and the website:
yarn build
To build only the website for production:
yarn buildSite