Installation
Before we begin, ensure you have Scrapy installed on your system. If you don't, you can easily install it using pip, the Python package installer:
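Scrapy is published on PyPI, so a standard pip install is enough (use `pip` or `pip3` depending on how Python is set up on your system):

```bash
# Install Scrapy from PyPI
pip3 install scrapy
```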
First, run the following command in your terminal to download the custom Scrapy spider, ReconSpider, and extract it to the current working directory.
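The exact download location depends on where the archive is hosted; the sketch below uses a placeholder URL that you should replace with the actual link provided alongside this material:

```bash
# Download the ReconSpider archive (replace the placeholder URL with the real download link)
wget -O ReconSpider.zip https://example.com/path/to/ReconSpider.zip

# Extract the spider into the current working directory
unzip ReconSpider.zip
```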
Basic command
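Assuming the spider takes the target URL as its only positional argument (the domain below is a placeholder), a typical invocation looks like this:

```bash
# Crawl the target site and collect data into results.json
python3 ReconSpider.py http://example.com
```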
results.json
After running ReconSpider.py, the data will be saved in a JSON file, results.json. This file can be explored using any text editor. Below is the structure of the JSON file produced:
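The exact values depend on the target, but based on the keys described below, the file follows this general shape (values shown here as empty lists for illustration):

```json
{
    "emails": [],
    "links": [],
    "external_files": [],
    "js_files": [],
    "form_fields": [],
    "images": [],
    "videos": [],
    "audio": [],
    "comments": []
}
```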
Each key in the JSON file represents a different type of data extracted from the target website:
| JSON Key | Description |
| --- | --- |
| emails | Lists email addresses found on the domain. |
| links | Lists URLs of links found within the domain. |
| external_files | Lists URLs of external files such as PDFs. |
| js_files | Lists URLs of JavaScript files used by the website. |
| form_fields | Lists form fields found on the domain (empty in this example). |
| images | Lists URLs of images found on the domain. |
| videos | Lists URLs of videos found on the domain (empty in this example). |
| audio | Lists URLs of audio files found on the domain (empty in this example). |
| comments | Lists HTML comments found in the source code. |
By exploring this JSON structure, you can gain valuable insights into the web application's architecture, content, and potential points of interest for further investigation.