Elsevier articles not in the repo #249

Closed
agentilb opened this issue Dec 1, 2023 · 10 comments

@agentilb
Collaborator

agentilb commented Dec 1, 2023

Could you check why?

10.1016/j.physletb.2023.138271
10.1016/j.physletb.2023.138272
10.1016/j.physletb.2023.138273
10.1016/j.physletb.2023.138274
10.1016/j.physletb.2023.138275
10.1016/j.physletb.2023.138277
10.1016/j.physletb.2023.138278
10.1016/j.physletb.2023.138279
10.1016/j.physletb.2023.138280
10.1016/j.physletb.2023.138281
10.1016/j.physletb.2023.138282
10.1016/j.physletb.2023.138283
10.1016/j.physletb.2023.138285
10.1016/j.physletb.2023.138287
10.1016/j.physletb.2023.138288
10.1016/j.physletb.2023.138289
10.1016/j.physletb.2023.138263
10.1016/j.physletb.2023.138265
10.1016/j.physletb.2023.138266
10.1016/j.physletb.2023.138245

@ErnestaP

ErnestaP commented Dec 1, 2023

Just a note for DEVELOPERS:
I see that we have an article 10.1016/j.physletb.2023.138271 in our pods here: /data/harvesting/Elsevier/unpacked/CERNAB00000010814_SBdGkA/CERNAB00000010814/03702693/v847sC/S0370269323006056/main.xml

I see that it was uploaded at 11:42 on the 6th of Nov:
[Screenshot 2023-12-01 at 15 37 45]

In the completed workflows, I cannot see any record with this timestamp:
[Screenshot 2023-12-01 at 15 39 09]

The same goes for the workflows in error or halted states.

I cannot see logs older than the 30th of November in the scoap3 crawler job list.
The logs in the pod also only go back to the 30th of November:
[Screenshot 2023-12-01 at 15 44 54]
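
For developers who need to find which unpacked `main.xml` belongs to a given DOI (as with the path quoted above), here is a minimal sketch. It is not the harvester's own code: the names `index_dois` and `DOI_RE` are hypothetical, and it assumes every article file is called `main.xml` and carries its DOI in a `<ce:doi>` element, as in the files shown in this thread. It matches the element as plain text rather than parsing the XML, so it also works on files that an XML parser would reject.

```python
import re
from pathlib import Path

# Matches the <ce:doi> element as plain text, so it does not depend on the
# file being well-formed or on its namespace declarations being present.
DOI_RE = re.compile(r"<ce:doi>\s*([^<]+?)\s*</ce:doi>")


def index_dois(unpacked_root):
    """Map each DOI found under unpacked_root to the main.xml that contains it."""
    index = {}
    for path in sorted(Path(unpacked_root).rglob("main.xml")):
        text = path.read_text(encoding="utf-8", errors="replace")
        match = DOI_RE.search(text)
        if match:
            index[match.group(1)] = path
    return index
```

Calling `index_dois("/data/harvesting/Elsevier/unpacked")` in the pod and looking up `10.1016/j.physletb.2023.138271` in the result would be one way to confirm that a package containing the article was actually delivered.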

ErnestaP self-assigned this Dec 1, 2023
@ErnestaP

ErnestaP commented Dec 4, 2023

In CERNAB00000010814.zip:
ERROR: Not found referenced affiliations ([])
Articles:
'10.1016/j.physletb.2023.138289'
'10.1016/j.physletb.2023.138288'
'10.1016/j.physletb.2023.138266'
'10.1016/j.nuclphysb.2023.116378'

@ErnestaP

I managed to get the articles onto prod, except for one: 10.1016/j.physletb.2023.138245
It looks like it was never sent.
We also have an issue with one of the files: one of the XMLs is missing its namespace declarations.
How it should be:

<article xmlns="http://www.elsevier.com/xml/ja/dtd"
    xmlns:ce="http://www.elsevier.com/xml/common/dtd"
    xmlns:sa="http://www.elsevier.com/xml/common/struct-aff/dtd"
    xmlns:sb="http://www.elsevier.com/xml/common/struct-bib/dtd"
    xmlns:xlink="http://www.w3.org/1999/xlink" docsubtype="sco" xml:lang="en">
    <item-info>
        <jid>PLB</jid>
        <aid>138266</aid>
        <ce:article-number>138266</ce:article-number>
        <ce:pii>S0370-2693(23)00600-7</ce:pii>
        <ce:doi>10.1016/j.physletb.2023.138266</ce:doi>
        <ce:copyright year="2023" type="other">The Author(s)</ce:copyright>
        <ce:doctopics>
            <ce:doctopic id="doc0010">
                <ce:text>Theory</ce:text>
            </ce:doctopic>
        </ce:doctopics>
    </item-info>

How it actually is:

<article docsubtype="sco" xml:lang="en">
    <item-info>
        <jid>PLB</jid>
        <aid>138268</aid>
        <ce:article-number>138268</ce:article-number>
        <ce:pii>S0370-2693(23)00602-0</ce:pii>
        <ce:doi>10.1016/j.physletb.2023.138268</ce:doi>
        <ce:copyright year="2023" type="other">Oak Ridge National Laboratory</ce:copyright>
        <ce:doctopics>
            <ce:doctopic id="doc0010">
                <ce:text>Experiments</ce:text>
            </ce:doctopic>
        </ce:doctopics>
    </item-info>

The path of the file in the package Elsevier sent us:

CERNAB00000010814/03702693/v847sC/S0370269323006020/main.xml

So, in the end, the record was not updated:
https://repo.scoap3.org/records/81089
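
The missing-namespace problem can be reproduced outside the harvester. The sketch below uses Python's standard `xml.etree.ElementTree` (whether the SCOAP3 parser uses the same library is an assumption); `check_xml` and the two sample strings are hypothetical and trimmed down from the excerpts above. An element such as `<ce:doi>` whose `ce` prefix is never declared makes the document ill-formed for a namespace-aware parser, so parsing fails before any metadata can be read.

```python
import xml.etree.ElementTree as ET

# Trimmed sample with the namespace declarations present.
WITH_NAMESPACES = """<article xmlns="http://www.elsevier.com/xml/ja/dtd"
    xmlns:ce="http://www.elsevier.com/xml/common/dtd" docsubtype="sco" xml:lang="en">
    <item-info>
        <jid>PLB</jid>
        <ce:doi>10.1016/j.physletb.2023.138268</ce:doi>
    </item-info>
</article>"""

# Same content, but the root element declares no namespaces at all.
WITHOUT_NAMESPACES = """<article docsubtype="sco" xml:lang="en">
    <item-info>
        <jid>PLB</jid>
        <ce:doi>10.1016/j.physletb.2023.138268</ce:doi>
    </item-info>
</article>"""


def check_xml(text):
    """Return None if text parses, otherwise the parser's error message."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print("with namespaces:   ", check_xml(WITH_NAMESPACES))
    print("without namespaces:", check_xml(WITHOUT_NAMESPACES))
```

The first sample parses cleanly, while the second is rejected with an "unbound prefix" error. Running a check like this over every `main.xml` in a package before ingestion would flag such a file up front.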

@agentilb
Collaborator Author

Thank you Ernesta. I will contact Elsevier about:
10.1016/j.physletb.2023.138245
which is in halted mode because of a duplicate affiliation,

and about https://repo.scoap3.org/records/81089
which has an incorrect XML file that generates an error.

It seems we have the same problem with the following articles, which should have been harvested in November:
10.1016/j.physletb.2023.138249
10.1016/j.physletb.2023.138290
10.1016/j.physletb.2023.138291
10.1016/j.physletb.2023.138292
10.1016/j.physletb.2023.138295
10.1016/j.physletb.2023.138296
10.1016/j.physletb.2023.138297
10.1016/j.physletb.2023.138298
10.1016/j.physletb.2023.138299
10.1016/j.physletb.2023.138300
10.1016/j.physletb.2023.138301
10.1016/j.physletb.2023.138302
10.1016/j.physletb.2023.138303
10.1016/j.physletb.2023.138305
10.1016/j.physletb.2023.138306
10.1016/j.physletb.2023.138307
10.1016/j.physletb.2023.138309
10.1016/j.physletb.2023.138310
10.1016/j.physletb.2023.138311
10.1016/j.physletb.2023.138312
10.1016/j.physletb.2023.138313
10.1016/j.physletb.2023.138314
10.1016/j.physletb.2023.138315
10.1016/j.physletb.2023.138316
10.1016/j.physletb.2023.138317
10.1016/j.physletb.2023.138318
10.1016/j.physletb.2023.138320
10.1016/j.physletb.2023.138321
10.1016/j.physletb.2023.138323
10.1016/j.physletb.2023.138325

Could you check them as well?
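
A batch check like this can be scripted. The following is only a sketch: `missing_dois` is a hypothetical helper, and how the list of DOIs already in the repository is obtained (an export, a search query, and so on) is left open because the thread does not say. DOIs are compared case-insensitively, since DOI names are not case-sensitive.

```python
def missing_dois(expected, harvested):
    """Return the expected DOIs that are absent from harvested, in input order."""
    # Normalise whitespace and case before comparing.
    seen = {doi.strip().lower() for doi in harvested}
    return [doi for doi in expected if doi.strip().lower() not in seen]


if __name__ == "__main__":
    expected = [
        "10.1016/j.physletb.2023.138249",
        "10.1016/j.physletb.2023.138290",
    ]
    # Hypothetical: pretend only the second article is in the repository.
    harvested = ["10.1016/j.physletb.2023.138290"]
    print(missing_dois(expected, harvested))
```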

@ErnestaP

All the articles above have been reharvested.
The only problem is the same article as in the previous harvest:

CERNAB00000010814/03702693/v847sC/S0370269323006020/main.xml

@ErnestaP

ErnestaP commented Dec 15, 2023

@agentilb found the missing article: 10.1016/j.physletb.2023.138245
It's in a halted state because of duplicated affiliations:
https://repo.scoap3.org/admin/workflow/details/?url=%2Fadmin%2Fworkflow%2F%3Fflt0_21%3D2&id=bdc8bcc4-9b3f-11ee-9333-c6debf016353

@agentilb
Collaborator Author

agentilb commented Dec 18, 2023

Yes @ErnestaP, I already contacted Elsevier for this article.
I'm now waiting for their answer.

But it seems this one is still in halted mode: 10.1016/j.physletb.2022.137649
I believe it was due to the earlier problem with the address line, which was corrected some weeks ago.
Could you please try to re-harvest it?

@ErnestaP

ErnestaP commented Dec 19, 2023

Yes, I see: it is still halted because we never reharvested it. I will do it :)

@ErnestaP

Article is in the repo: https://repo.scoap3.org/records/82257

@ErnestaP

@agentilb Can we close the issue?
