Mind Map: GetBooks

1. Are sources inspected one by one or in parallel?

2. Each source should be processed by only one process at a time

2.1. Lock by source
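
A minimal sketch of "lock by source", assuming a single Python process with several worker threads. The `SourceLocks` class and its method names are hypothetical and do not come from the notes; with several OS processes the same idea would need a shared store (for example the Mongo mentioned under Jobs) instead of an in-memory set.

```python
import threading
from contextlib import contextmanager


class SourceLocks:
    """In-memory registry that allows at most one holder per source id."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the set below
        self._held = set()              # ids of sources currently being processed

    def try_acquire(self, source_id: str) -> bool:
        """Return True if the caller now owns the source, False if it is busy."""
        with self._guard:
            if source_id in self._held:
                return False
            self._held.add(source_id)
            return True

    def release(self, source_id: str) -> None:
        """Give the source back; releasing an unheld source is a no-op."""
        with self._guard:
            self._held.discard(source_id)

    @contextmanager
    def holding(self, source_id: str):
        """Yield True/False for whether the lock was obtained; release on exit."""
        acquired = self.try_acquire(source_id)
        try:
            yield acquired
        finally:
            if acquired:
                self.release(source_id)
```

A worker would wrap its work in `with locks.holding(source_id) as ok:` and skip the source when `ok` is false, so a busy source is never processed twice at the same time.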

3. A source can be broken into pages (by one process only?), where each page contains multiple entries

3.1. Each entry can be processed separately and independently
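
The page/entry split could look roughly like this. It is only a sketch: `paginate`, `process_page`, and the thread pool are assumptions, and the notes leave open whether paging itself is done by one process.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def paginate(entries: Iterable[T], page_size: int) -> Iterator[List[T]]:
    """Split a stream of entries into pages of at most page_size items."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    page: List[T] = []
    for entry in entries:
        page.append(entry)
        if len(page) == page_size:
            yield page
            page = []
    if page:  # last, possibly shorter page
        yield page


def process_page(page: List[T], handle_entry: Callable[[T], R],
                 max_workers: int = 4) -> List[R]:
    """Handle every entry of a page independently; results keep page order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(handle_entry, page))
```

Because entries do not depend on each other, `handle_entry` can run concurrently within a page, while the paging loop stays sequential.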

4. Fault-tolerant

4.1. BookLoadingJob?

4.2. PageLoadingJob?

4.3. SourceRefreshJob?

4.4. retryability, elasticity, monitoring, ...
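
For the retryability point, one possible shape is a retry wrapper with exponential backoff. The helper name `run_with_retries` and its parameters are hypothetical; the notes do not say how retries should work.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(job: Callable[[], T], max_attempts: int = 3,
                     base_delay: float = 0.5,
                     sleep: Callable[[float], None] = time.sleep) -> T:
    """Call job() until it succeeds or max_attempts is reached.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between attempts and
    re-raises the last error once the attempts are used up.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the wrapper can be exercised without real waiting; a job such as BookLoadingJob would be passed in as the `job` callable.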

5. List of sources (free or not)

6. Each source has an associated struct used to find entries
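
One way to picture the per-source struct, assuming entries can be located with a regular expression over page text. The `SourceSpec` name, its fields, and the sample markup are made up for illustration; a real source may need a proper HTML parser or an API client instead.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SourceSpec:
    """Describes one source and how to find entries on its pages."""
    name: str
    is_free: bool
    entry_pattern: str  # regex with exactly one capture group: the entry text

    def find_entries(self, page_text: str) -> List[str]:
        """Return the captured text of every entry found on the page."""
        return re.findall(self.entry_pattern, page_text)


# Hypothetical source definition and page snippet.
demo = SourceSpec(name="demo", is_free=True,
                  entry_pattern=r'<li class="book">(.*?)</li>')
sample_page = '<li class="book">Book A</li><li class="book">Book B</li>'
titles = demo.find_entries(sample_page)
```

The list of sources then becomes a list of such structs, with `is_free` covering the "free or not" distinction.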

7. Jobs

7.1. A general part and a job-specific part

7.2. E.g. BookLoadingJob

7.3. Let's model it in Mongo

7.4. It's better to have a progress report
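
A rough sketch of a job document with a general part, a job-specific part, and a progress report. All field names (`status`, `attempts`, `progress`, `params`) and the example URL are assumptions; the notes only say that jobs should be modelled in Mongo.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def new_job(job_type: str, params: Dict[str, Any], total: int) -> Dict[str, Any]:
    """Build a job document: general fields at the top, specifics in params."""
    return {
        # General part, shared by every job type.
        "type": job_type,
        "status": "pending",
        "attempts": 0,
        "created_at": datetime.now(timezone.utc),
        "progress": {"done": 0, "total": total},
        # Specific part, different for each job type.
        "params": params,
    }


def report_progress(job: Dict[str, Any], done: int) -> float:
    """Record how many units are done and return the percentage complete."""
    total = job["progress"]["total"]
    job["progress"]["done"] = min(done, total)
    finished = job["progress"]["done"] >= total
    job["status"] = "done" if finished else "running"
    return 100.0 * job["progress"]["done"] / total if total else 100.0


# Hypothetical BookLoadingJob with four units of work.
book_job = new_job("BookLoadingJob", {"book_url": "https://example.com/book/1"}, 4)
```

A plain dict like this could be stored as a document in a Mongo collection (for example with pymongo's `insert_one`), and PageLoadingJob or SourceRefreshJob would differ only in `type` and `params`.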