Crawl feeds
Andy Jackson edited this page Nov 14, 2013 · 5 revisions
One of the primary goals of w3act is to produce a set of crawl feeds that can be used to drive the crawling process.
These feeds MUST contain ALL w3act Targets that are currently deemed in scope. They MUST NOT contain any entries that do not meet the current scope requirements.
Critically, we need two sets of feeds:
- The 'by permission' feed, which contains all items that we have an explicit license for, but which do NOT meet any of the Legal Deposit criteria.
- The 'Legal Deposit' feed, which contains all items that meet the Legal Deposit criteria.
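The split between the two feeds can be sketched as follows. This is only an illustration, not the w3act implementation: the `Target` class, its `meets_legal_deposit` and `has_license` flags, and the `build_feeds` function are hypothetical names, and the sketch assumes that scope has already been evaluated for each Target.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Target:
    """Hypothetical stand-in for a w3act Target (field names are assumptions)."""
    url: str
    meets_legal_deposit: bool  # assumed: result of the Legal Deposit scope checks
    has_license: bool          # assumed: an explicit license has been granted


def build_feeds(targets: Iterable[Target]) -> Dict[str, List[str]]:
    """Partition targets into the two feeds; out-of-scope targets go in neither."""
    legal_deposit: List[str] = []
    by_permission: List[str] = []
    for target in targets:
        if target.meets_legal_deposit:
            # Anything meeting the Legal Deposit criteria goes in that feed,
            # whether or not a license also exists.
            legal_deposit.append(target.url)
        elif target.has_license:
            # Licensed, but not covered by Legal Deposit.
            by_permission.append(target.url)
        # Otherwise the target is not in scope and is omitted from both feeds.
    return {"legal_deposit": legal_deposit, "by_permission": by_permission}
```

With this arrangement the two feeds are disjoint, and a Target that is neither licensed nor covered by Legal Deposit appears in neither feed.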
TODO: Describe the output in detail, including the time-wise scoping of validity, i.e. the permission should have a time-frame that is distinct from the crawl frequency/scheduling.