[backend/frontend] add configuration to have better chances to get RSS Feed contents (#8736) #9244
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #9244      +/-   ##
==========================================
- Coverage   65.12%   65.10%   -0.03%
==========================================
  Files         630      630
  Lines       60083    60117      +34
  Branches     6694     6703       +9
==========================================
+ Hits        39127    39137      +10
- Misses      20956    20980      +24
const { messages_number, messages_size } = await queueDetails(connectorIdFromIngestId(ingestion.id));
if (messages_number === 0) {
  const { last_execution_date } = ingestion;
  const shouldExecuteIngestion = !last_execution_date || sinceNowInMinutes(last_execution_date) > CSV_FEED_MIN_INTERVAL_MINUTES;
Why a strict > ? If CSV_FEED_MIN_INTERVAL_MINUTES is 5 minutes and sinceNowInMinutes(last_execution_date) is 5, it won't execute until the next minute?
And since this code is the same for RSS, I suggest refactoring it into a method like:
const shouldExecuteIngestion = (ingestion, min_interval_minutes) => {
  const { last_execution_date } = ingestion;
  return !last_execution_date || sinceNowInMinutes(last_execution_date) > min_interval_minutes;
};
so we could call it like this: shouldExecuteIngestion(ingestion, CSV_FEED_MIN_INTERVAL_MINUTES)
or shouldExecuteIngestion(ingestion, RSS_FEED_MIN_INTERVAL_MINUTES)
The interval is 30s, so it will only wait 30s (with the default values, at least). Anyway, I'm adding a <=, and OK for shouldExecuteIngestion.
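For reference, the helper discussed above could look roughly like the sketch below. It is only an illustration: sinceNowInMinutes here is a hypothetical stand-in for the project's helper (assumed to return whole elapsed minutes), the IngestionLike type and the optional now parameter are invented for the example, and the inclusive check is written as >= on elapsed minutes, which may not match how the PR finally expresses it.

```typescript
// Hypothetical stand-in for the project's sinceNowInMinutes helper.
// Assumption: it returns the whole number of minutes elapsed since the given date.
const sinceNowInMinutes = (lastDate: string | Date, now: Date = new Date()): number => {
  const elapsedMs = now.getTime() - new Date(lastDate).getTime();
  return Math.floor(elapsedMs / 60000);
};

// Minimal shape of an ingestion for this sketch (illustrative only).
interface IngestionLike {
  last_execution_date?: string | Date | null;
}

// Shared check for CSV and RSS feeds: run if the ingestion never ran,
// or if at least minIntervalMinutes have elapsed (inclusive comparison).
const shouldExecuteIngestion = (
  ingestion: IngestionLike,
  minIntervalMinutes: number,
  now: Date = new Date(),
): boolean => {
  const { last_execution_date } = ingestion;
  return !last_execution_date || sinceNowInMinutes(last_execution_date, now) >= minIntervalMinutes;
};
```

With the inclusive comparison, an ingestion last executed exactly 5 minutes ago runs when the minimum interval is 5, which is the case raised in the review.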
  className={classes.bodyItem}
  style={{ width: dataColumns.last_execution_date.width }}
>
  {nsdt(node.last_execution_date)}
No need for this || '-' check here?
const defaultHttpAgent: http.Agent | undefined = undefined;
let defaultHttpsAgent: https.Agent | undefined;

if (cert || key || ca) {
  defaultHttpsAgent = new https.Agent({ rejectUnauthorized: rejectUnauthorized === true, cert, key, ca });
}
I'm not sure I understand the logic behind these changes. Moreover, defaultHttpAgent is always undefined.
I'm going to test again because it's the riskiest change, but the issue was that defaultHttpAgent was set to new Agent() instead of undefined. When there is a proxy configuration, the agent is correctly initialized in
opencti/opencti-platform/opencti-graphql/src/config/conf.js
Lines 396 to 409 in b293e2e
export const getPlatformHttpProxyAgent = (uri) => {
  const platformProxies = getPlatformHttpProxies();
  const targetUrl = new URL(uri);
  const targetProxy = platformProxies[targetUrl.protocol]; // Select the proxy according to target protocol
  if (targetProxy) {
    // If proxy found, check if hostname is not excluded
    if (targetProxy.isExcluded(targetUrl.hostname)) {
      return undefined;
    }
    // If not generate the agent accordingly
    return targetProxy.build();
  }
  return undefined;
};
As you can see there, if there is no proxy the return value is undefined, which means the HTTP call carries no empty httpAgent. From my tests, having an empty httpAgent in the request is what causes it to be refused by Cloudflare.
Maybe I should just remove defaultHttpAgent in http-client
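To illustrate the idea in this thread, here is a minimal sketch of building request options so that an agent key is only present when an agent actually exists. The names buildAgentOptions and RequestAgentOptions are hypothetical and are not taken from the project's http-client; the agent type is left generic so the sketch does not depend on Node's http/https modules.

```typescript
// Illustrative shape of the agent-related request options (assumed, not the project's type).
interface RequestAgentOptions<A> {
  httpAgent?: A;
  httpsAgent?: A;
}

// Only attach an agent key when an agent is defined, so a request made
// without proxy or TLS material carries no agent at all.
function buildAgentOptions<A>(
  httpAgent: A | undefined,
  httpsAgent: A | undefined,
): RequestAgentOptions<A> {
  const options: RequestAgentOptions<A> = {};
  if (httpAgent !== undefined) {
    options.httpAgent = httpAgent;
  }
  if (httpsAgent !== undefined) {
    options.httpsAgent = httpsAgent;
  }
  return options;
}
```

Under this assumption, dropping defaultHttpAgent entirely would have the same effect as leaving it undefined: the httpAgent key never appears unless a proxy agent is supplied.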
Proposed changes
Add several configuration options to improve the chances of avoiding HTTP 403 responses from public RSS feeds:
Add some info- and warn-level logs:
Related issues
Checklist
Further comments