A command-line web scraper built with Dart. It scrapes paragraphs and links from web pages and saves the extracted data as TXT, CSV, or JSON.
- Scrape paragraphs and links from any web page
- Save scraped data in TXT, CSV, or JSON format
- Command-line interface with interactive prompts
- Flexible output options (specify format and filename)
- Error handling for network issues and invalid inputs
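As one illustration of the invalid-input handling listed above, here is a minimal sketch of how a URL and an output format might be checked before any request is made. The helper names `isValidUrl` and `isValidFormat` and the `supportedFormats` set are hypothetical and not taken from this project; the actual checks in the code base may differ.

```dart
// Hypothetical helpers for illustration; not the project's actual code.

/// Output formats named in the feature list.
const supportedFormats = {'txt', 'csv', 'json'};

/// Returns true when [input] parses as an absolute http(s) URL with a host.
bool isValidUrl(String input) {
  final uri = Uri.tryParse(input.trim());
  if (uri == null) return false;
  final isHttp = uri.scheme == 'http' || uri.scheme == 'https';
  return isHttp && uri.host.isNotEmpty;
}

/// Returns true when [input] names a supported format, ignoring case.
bool isValidFormat(String input) =>
    supportedFormats.contains(input.trim().toLowerCase());

void main() {
  for (final url in ['https://example.com', 'example.com', 'ftp://example.com']) {
    final verdict = isValidUrl(url) ? 'ok' : 'rejected';
    print('$url -> $verdict');
  }
  for (final format in ['CSV', 'xml']) {
    final verdict = isValidFormat(format) ? 'ok' : 'rejected';
    print('$format -> $verdict');
  }
}
```

Rejecting bad input up front keeps network errors (timeouts, DNS failures) separate from user mistakes, so each can get its own message.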
To run this project, you need the Dart SDK installed on your system. If you haven't installed Dart yet, follow the official Dart installation guide.
- Clone this repository:
    git clone https://github.com/Qharny/Dart_Web_Scraper.git
    cd Dart_Web_Scraper
- Install dependencies:
    dart pub get
You can run the web scraper using the following command:
dart run bin/main.dart [options]
You can run the tests using the following command:
dart run test
- --url or -u: Specify the URL to scrape
- --format or -f: Specify the output format (txt, csv, or json)
- --output or -o: Specify the output filename
If you don't provide these options, the script will prompt you to enter them interactively.
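The fallback from command-line options to interactive prompts can be sketched as follows. This is a hand-rolled stand-in that uses only `dart:io`; the project itself may rely on a package such as `args`, and the names `shortToLong`, `parseOptions`, and `valueOrPrompt` are assumptions made for this example.

```dart
import 'dart:io';

/// Maps each short flag to its long name, as listed in the options.
const shortToLong = {'-u': '--url', '-f': '--format', '-o': '--output'};

/// Collects `--name value` pairs from [args], normalizing short flags to
/// their long names. Unknown arguments are ignored in this sketch.
Map<String, String> parseOptions(List<String> args) {
  final options = <String, String>{};
  for (var i = 0; i < args.length - 1; i++) {
    final name = shortToLong[args[i]] ?? args[i];
    if (shortToLong.containsValue(name)) {
      // Strip the leading "--" and consume the next argument as the value.
      options[name.substring(2)] = args[++i];
    }
  }
  return options;
}

/// Returns the parsed value for [name], or asks on stdin when it is missing.
String valueOrPrompt(
    Map<String, String> options, String name, String question) {
  final given = options[name];
  if (given != null) return given;
  stdout.write('$question: ');
  return stdin.readLineSync()?.trim() ?? '';
}

void main(List<String> args) {
  final options = parseOptions(args);
  final url = valueOrPrompt(options, 'url', 'URL to scrape');
  final format =
      valueOrPrompt(options, 'format', 'Output format (txt, csv, json)');
  final output = valueOrPrompt(options, 'output', 'Output filename');
  print('Would scrape $url and write $format data to $output');
}
```

Run with all three options and no prompt appears; omit any of them and only the missing values are asked for.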
- Scrape a website and save as TXT (with interactive prompts):
    dart run bin/main.dart
- Scrape a specific URL and save as CSV:
    dart run bin/main.dart --url https://example.com --format csv
- Scrape a website, save as JSON with a custom filename:
    dart run bin/main.dart --url https://example.com --format json --output my_data.json
- bin/main.dart: The main entry point of the application
- lib/scraper.dart: Contains the core scraping logic
- lib/models/scraped_data.dart: Defines the data model for scraped content
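To make the role of the data model concrete, here is a rough sketch of what a class holding scraped paragraphs and links could look like, with one method per output format. The class shape, the method names `toTxt`, `toCsv`, and `toJson`, and the two-column CSV layout are all assumptions for illustration; the real definition in lib/models/scraped_data.dart may be organized differently.

```dart
import 'dart:convert';

/// Hypothetical model; the project's actual class may differ.
class ScrapedData {
  final List<String> paragraphs;
  final List<String> links;

  const ScrapedData({required this.paragraphs, required this.links});

  /// Plain text: one item per line under a simple label.
  String toTxt() => [
        'Paragraphs:',
        ...paragraphs,
        '',
        'Links:',
        ...links,
      ].join('\n');

  /// CSV with two columns (type, value). Values are wrapped in double
  /// quotes and embedded double quotes are doubled.
  String toCsv() {
    String quote(String value) {
      final escaped = value.replaceAll('"', '""');
      return '"$escaped"';
    }

    final rows = [
      'type,value',
      for (final p in paragraphs) 'paragraph,${quote(p)}',
      for (final l in links) 'link,${quote(l)}',
    ];
    return rows.join('\n');
  }

  /// Indented JSON object with a "paragraphs" and a "links" array.
  String toJson() => const JsonEncoder.withIndent('  ')
      .convert({'paragraphs': paragraphs, 'links': links});
}

void main() {
  const data = ScrapedData(
    paragraphs: ['Hello, "world"'],
    links: ['https://example.com/about'],
  );
  print(data.toTxt());
  print(data.toCsv());
  print(data.toJson());
}
```

Keeping serialization next to the model means the scraping logic only has to produce one object, and the chosen format decides which method is called when writing the file.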
Contributions are welcome! Please feel free to submit a Pull Request.