add details to user documentation
parent
7e6d4ee581
commit
3f485011b8
59
README.md
@ -1,42 +1,61 @@
|
|||||||
# Help Center Spider
|
# Help Center Spider
|
||||||
## About
|
|
||||||
This is a spider tool with which you can visit all links on https://docs.otc.t-systems.com to find urls that are not correct.
|
|
||||||
|
|
||||||
## Requirements
|
The Open Telekom Cloud Helpcenter Spider is a spider tool visiting all
|
||||||
After you cloned the repository you need to prepare an environment to run the tool. You can easily do this with
|
links starting from its landing page on https://docs.otc.t-systems.com/
|
||||||
python virtual environment:
|
to find and identify URLs that are not correct. It parses all types of
|
||||||
|
hyperlinks and normalizes them into a canonical format. The spider
|
||||||
|
descends into the document tree via [...] breadth or width first search.
|
||||||
|
[and does what?] [when is logged which event?]
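
The normalization step mentioned above could look roughly like the sketch below. It is not taken from the spider's code: the function name `normalize_link` and the chosen rules (resolve relative links, drop the fragment, lowercase scheme and host, use `/` for an empty path) are assumptions made for illustration.

```python
from urllib.parse import urldefrag, urljoin, urlsplit, urlunsplit


def normalize_link(base_url: str, href: str) -> str:
    """Hypothetical helper: turn a raw href into a canonical absolute URL."""
    # Resolve relative references such as "../b/page.html" against the page URL.
    absolute = urljoin(base_url, href)
    # Drop the fragment, since "#top" does not identify a different document.
    absolute, _fragment = urldefrag(absolute)
    parts = urlsplit(absolute)
    # Scheme and host are case-insensitive; an empty path is equivalent to "/".
    path = parts.path or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )
```

With rules like these, two differently spelled links to the same page collapse into one entry, so the page is visited only once.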
|
||||||
|
|
||||||
|
## Getting started
|
||||||
|
|
||||||
|
Once you have installed the code and its required packages into a virtual
|
||||||
|
environment and checked its configuration file `config.json`, the web
|
||||||
|
spider starts when you invoke the tool without any arguments. Results are
|
||||||
|
listed in [... TBD].
|
||||||
|
|
||||||
|
## Requirements and Installation
|
||||||
|
After you have cloned this repository, you need to prepare an environment to
|
||||||
|
run the tool. You can easily do this with a Python virtual environment:
|
||||||
|
|
||||||
```
|
```
|
||||||
$ cd <local_folder>/
|
$ cd _local_folder_/
|
||||||
|
$ git clone https://gitea.eco.tsi-dev.otc-service.com/infra/hc-spider.git
|
||||||
|
$ cd hc-spider
|
||||||
$ python -m venv venv/
|
$ python -m venv venv/
|
||||||
$ source venv/bin/activate
|
$ source venv/bin/activate
|
||||||
(venv)$ python -m pip install -r requirements.txt
|
(venv)$ python -m pip install -r requirements.txt
|
||||||
```
|
```
|
||||||
|
|
||||||
## Configuration
|
## Configuration
|
||||||
In _config.json_ you can define a couple items:
|
In _config.json_ you can define several items:
|
||||||
|
|
||||||
- _watchdog_file_: if you run the tool in the background and want to stop it properly (not using `kill`),
|
- _watchdog_file_: if you run the tool in the background and want to
|
||||||
just send an exit message into the watchdog file: `echo exit > watchdog.fifo`
|
stop it properly (without sending a signal with `kill`), just send
|
||||||
- _timer_runtime_: maximum runtime limit in seconds
|
an exit message into the watchdog file: `echo exit > watchdog.fifo`.
|
||||||
- _log_dir_: logging folder
|
- _timer_runtime_: maximum runtime limit in seconds.
|
||||||
- _logging_interval_: frequency of dumping log files
|
- _log_dir_: logging folder.
|
||||||
- _workers_: number of workers (background processes) you want to run. If you set to 0 it will count from the number of cores (_number_of_cores_ - 1)
|
- _logging_interval_: frequency of dumping log files.
|
||||||
|
- _workers_: number of workers (background processes) you want to run.
|
||||||
|
If you set it to 0, the number is derived from the number of cores
|
||||||
|
(_number_of_cores_ - 1).
|
||||||
- _starting_point_: base url where to start
|
- _starting_point_: base URL where to start.
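
As an illustration of the _workers_ rule above, the following sketch derives the worker count from a parsed configuration. The function name `resolve_worker_count`, the example values, and the lower bound of one worker on single-core machines are assumptions, not the tool's actual code.

```python
import json
import os

# Hypothetical example values; only the key names come from the list above.
EXAMPLE_CONFIG = json.loads("""
{
    "watchdog_file": "watchdog.fifo",
    "timer_runtime": 3600,
    "log_dir": "log",
    "logging_interval": 60,
    "workers": 0,
    "starting_point": "https://docs.otc.t-systems.com/"
}
""")


def resolve_worker_count(config: dict) -> int:
    """Return the configured worker count, or cores - 1 when it is set to 0."""
    workers = int(config.get("workers", 0))
    if workers > 0:
        return workers
    cores = os.cpu_count() or 1  # os.cpu_count() may return None
    # Assumption: never go below one worker, even on a single-core machine.
    return max(1, cores - 1)
```

Calling `resolve_worker_count(EXAMPLE_CONFIG)` on an 8-core machine would give 7 under these assumptions.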
|
||||||
|
|
||||||
## How to run
|
## Operations
|
||||||
There are two ways to do it
|
There are two ways to start the spider:
|
||||||
|
|
||||||
### In foreground
|
### In the foreground
|
||||||
```
|
```
|
||||||
$ source venv/bin/activate
|
$ source venv/bin/activate
|
||||||
$ python main.py
|
(venv)$ python main.py
|
||||||
```
|
```
|
||||||
|
|
||||||
### In background
|
### In the background
|
||||||
```
|
```
|
||||||
$ source venv/bin/activate
|
$ source venv/bin/activate
|
||||||
$ nohup python main.py > log/hc_spider.log 2> log/hc_spider.err <&- &
|
(venv)$ nohup python main.py > log/hc_spider.log 2> log/hc_spider.err <&- &
|
||||||
```
|
```
|
||||||
|
|
||||||
In case you running the tool in background you can stop the execution with `$ echo exit > <watchdog_file>`
|
### Stopping the process politely
|
||||||
|
To stop the tool when run in the background, send a command to the
|
||||||
|
control fifo with: `(venv)$ echo exit > _watchdog_file_`
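
For readers curious how such a control FIFO can be consumed, here is a minimal sketch of the receiving side. It assumes a POSIX system and is not the spider's actual implementation; `ensure_fifo` and `wait_for_exit` are hypothetical names.

```python
import os
import stat


def ensure_fifo(path: str) -> None:
    """Create the named pipe if it does not exist yet (POSIX only)."""
    if not os.path.exists(path):
        os.mkfifo(path)
    elif not stat.S_ISFIFO(os.stat(path).st_mode):
        raise ValueError(f"{path} exists but is not a FIFO")


def wait_for_exit(path: str) -> None:
    """Block until a line reading 'exit' arrives on the FIFO."""
    while True:
        # open() blocks until a writer connects; iteration ends when it closes.
        with open(path) as fifo:
            for line in fifo:
                if line.strip() == "exit":
                    return
        # Any other message is ignored and the FIFO is reopened for the next writer.
```

A tool built this way would typically run `wait_for_exit` in a separate thread or process and shut its workers down once it returns.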
|
||||||
|