
Clarify number of references on Authorities Page vs. Search Results #4865

Open
elisa-a-v opened this issue Dec 30, 2024 · 14 comments
@elisa-a-v
Contributor

elisa-a-v commented Dec 30, 2024

Noticed after doing #4134

On the “Authorities” page for a given docket, we display the references to opinions in the docket's documents. However, each "authority" in the list is not actually an opinion, but an element of a Docket's authorities:

    @property
    def authorities(self):
        """Returns a queryset that can be used for querying and caching
        authorities.
        """
        return OpinionsCitedByRECAPDocument.objects.filter(
            citing_document__docket_entry__docket_id=self.pk
        )

class OpinionsCitedByRECAPDocument(models.Model):
    citing_document = models.ForeignKey(
        RECAPDocument, related_name="cited_opinions", on_delete=models.CASCADE
    )
    cited_opinion = models.ForeignKey(
        Opinion, related_name="citing_documents", on_delete=models.CASCADE
    )
    depth = models.IntegerField(
        help_text="The number of times the cited opinion was cited "
        "in the citing document",
        default=1,
    )

This means an opinion cited by multiple documents in the same docket will be listed multiple times, like the authorities in this docket. Notice Norman v. United States, 429 F.3d 1081 (Fed. Cir. 2005) being listed 14 times, where all links point to the same search which yields 14 docket entries.

This also means the counts are confusing. Each "depth" represents only the number of times the opinion is cited in one citing document, not the total number of references across all documents in the docket. We could:

  1. Make the counts less confusing by making the authorities list include the citing document and changing the text to something like 5 references to <CITED_OPINION> in <CITING_DOC>, with a link to the citing doc.

  2. Group authorities by opinion, then aggregate the depth. This way we would get the summary of all references to a given opinion in a given docket, instead of having the same opinion repeated several times. This does sound like it could be a lot more work, but potentially more informative if we could also include a sub-list of all citing docs with their links.

    So instead of:

    • 5 references to Norman v. United States, 429 F.3d 1081 (Fed. Cir. 2005)
      Court of Appeals for the Federal Circuit Nov. 18, 2005
    • 5 references to Norman v. United States, 429 F.3d 1081 (Fed. Cir. 2005)
      Court of Appeals for the Federal Circuit Nov. 18, 2005
    • 2 references to Norman v. United States, 429 F.3d 1081 (Fed. Cir. 2005)
      Court of Appeals for the Federal Circuit Nov. 18, 2005

    It would be:

    • 12 references to Norman v. United States, 429 F.3d 1081 (Fed. Cir. 2005)
      Court of Appeals for the Federal Circuit Nov. 18, 2005
      • 5 references in citing doc 1
      • 5 references in citing doc 2
      • 2 references in citing doc 3
  3. Update the search results to display the depth of treatment when a cites query is made, so that each result shown says something like, "22 references to case XYZ".
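The grouping in option 2 can be sketched in plain Python, independent of Django. The sample rows below are hypothetical stand-ins for OpinionsCitedByRECAPDocument records in one docket, using the depths from the example above:

```python
from collections import defaultdict

# Hypothetical rows: (citing_document_id, cited_opinion_id, depth),
# standing in for OpinionsCitedByRECAPDocument records in a single docket.
rows = [
    (101, 9339585, 5),  # doc 101 cites this opinion 5 times
    (102, 9339585, 5),
    (103, 9339585, 2),
    (101, 7777777, 1),
]

grouped = defaultdict(lambda: {"filings": 0, "references": 0})
for citing_doc, cited_opinion, depth in rows:
    grouped[cited_opinion]["filings"] += 1         # one row per citing filing
    grouped[cited_opinion]["references"] += depth  # aggregate the depth

print(grouped[9339585])  # {'filings': 3, 'references': 12}
```

In the real view this aggregation would happen at the query level rather than in Python, but the shape of the result (one entry per opinion, with a filing count and a summed depth) is the same.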

Whatever the option, we should make sure the opinions' authorities page doesn't break: authorities_list.html is used for both docket and opinion authorities, and opinion authorities are OpinionsCited rather than OpinionsCitedByRECAPDocument, so the same attributes aren't always available.

@mlissner
Member

Fun, so this issue is indeed worse than we thought. I didn't realize that we're repeating the same case many times in the list of authorities. That's not great.

The easiest thing is to aggregate on the depth and have it say something like:

3 filings make 14 references to XYZ

We don't need to say which filings do that on this page, so we can spare ourselves that nested layout you describe.

Aggregating the depth should be easy. I can't think offhand how to get the filing count, but I'm guessing it's not too hard either and can be done at the query-level as well.

If we do the above, I think that fixes the confusion issue too, since it says the number of documents and then when you click on it, that's how many show up in the search results.

One last thing: We're introducing a second count to this page. Currently, it's ordered by the citation depth, but I think if we make this change, we should order by the number of filings citing a document instead. I guess a later feature could be to allow users to choose which ordering they prefer.
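The ordering switch could be applied to the aggregated rows themselves. A minimal sketch in plain Python (the row shape and field names here are assumptions, mirroring the proposed filing/reference counts):

```python
# Hypothetical aggregated rows, shaped like the eventual annotated queryset.
authorities = [
    {"cited_opinion_id": 1, "filing_count": 1, "total_depth": 40},
    {"cited_opinion_id": 2, "filing_count": 3, "total_depth": 12},
]

# Current behavior: order by total citation depth, descending.
by_depth = sorted(authorities, key=lambda a: a["total_depth"], reverse=True)

# Proposed: order by how many filings cite the opinion, descending.
by_filings = sorted(authorities, key=lambda a: a["filing_count"], reverse=True)

print([a["cited_opinion_id"] for a in by_filings])  # [2, 1]
```

With a queryset, the same switch would just be a different `order_by()` on the annotated field.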

@tactipus
Contributor

tactipus commented Jan 7, 2025

lol sup, i'm gonna look at this soon

@mlissner
Member

@tactipus, we're planning to fix this in our coming sprint, which starts Monday. Are you still thinking about helping with this one?

@tactipus
Contributor

> @tactipus, we're planning to fix this in our coming sprint, which starts Monday. Are you still thinking about helping with this one?

yeah. just wanted to know if elisa still wanted to do paired programming so it can be scheduled. i'm looking at the Docket class rn

@mlissner
Member

Up to you guys. I'll get out of the way. :) @elisa-a-v?

@elisa-a-v
Contributor Author

elisa-a-v commented Jan 10, 2025

Oh I'd love to! @tactipus if you're down, let me know when would be a good time for you and I can probably adjust my schedule :)

@tactipus
Contributor

@elisa-a-v I am good after 17:00 EST, usually. i can also do 13:00 to 16:00 EST

@elisa-a-v
Contributor Author

@tactipus that's great, 13:00 EST generally works for me as well, but I think we should probably move this discussion over to email so we don't keep cluttering up the issue thread 😅 my address is [email protected]—feel free to reach out!

@tactipus
Contributor

tactipus commented Jan 16, 2025

Good morning,

This comment is more a record than anything. Per our conversation yesterday, we will focus on views.py and map out the implementation @mlissner discussed. That way, we can avoid tinkering too much with models.py.

The solution is to map out the references using an array or a query set IIRC. It was all @elisa-a-v's idea, I just took notes ;p

async def docket_authorities(
    request: HttpRequest,
    docket_id: int,
    slug: str,
) -> HttpResponse:
    docket, context = await core_docket_data(request, docket_id)
    if not await docket.ahas_authorities():
        raise Http404("No authorities data for this docket at this time")

    context.update(
        {
            # Needed to show/hide parties tab.
            "parties": await docket.parties.aexists(),
            "docket_entries": await docket.docket_entries.aexists(),
            "authorities": docket.authorities_with_data,
        }
    )
    return TemplateResponse(request, "docket_authorities.html", context)

@elisa-a-v
Contributor Author

That's correct, we basically need two things:

  1. Map the authorities in the context of that view so that it's an iterable of Opinion instances instead of OpinionsCitedByRECAPDocuments, with the number of filings (how many RECAPDocuments cite this Opinion in this Docket) and the number of references (the aggregated depth of all those OpinionsCitedByRECAPDocument instances)
    • We now list Opinions instead of OpinionsCitedByRECAPDocuments so we only list each opinion once.
    • I believe the iterable could either be a QuerySet if you know your way around annotations, or simply a list. I imagine something like:
      >>> context["authorities"]
      [
          {
              "opinion": Opinion1,
          "filings": OpinionsCitedByRECAPDocument.objects.filter(cited_opinion=Opinion1, ...).count(),
          "references": OpinionsCitedByRECAPDocument.objects.filter(cited_opinion=Opinion1, ...).aggregate(...)["sum"],
          },
          {
              "opinion": Opinion2,
          "filings": OpinionsCitedByRECAPDocument.objects.filter(cited_opinion=Opinion2, ...).count(),
          "references": OpinionsCitedByRECAPDocument.objects.filter(cited_opinion=Opinion2, ...).aggregate(...)["sum"],
          },
      ]
  2. Update the template to display the info in the new format without breaking the document authorities view (they both use the same authorities_list.html template), which should probably remain unchanged.

@elisa-a-v
Contributor Author

@mlissner I did notice a small thing when looking through the authorities_list.html template: in line 14 it says {% if authority.blocked %}, but I don't see how that's ever True. I wonder if maybe that's a mistake and it should be authority.cited_opinion.cluster.blocked instead?

<a href="{{ authority.cited_opinion.cluster.get_absolute_url }}{% querystring %}" {% if authority.blocked %}rel="nofollow" {% endif %}>

@tactipus
Contributor

I just emailed @elisa-a-v about this. To recap that email, I suggested we use annotate() to get the counts: annotate() returns a QuerySet, which is iterable row by row, while aggregate() returns a single dict, which I believe is less flexible here.
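The annotate-vs-aggregate distinction can be mimicked in plain Python (the rows below are hypothetical): aggregate() collapses the whole queryset into one summary dict, while annotate() on a values() queryset yields one summary per group.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows: (cited_opinion_id, depth)
rows = [(1, 5), (1, 5), (1, 2), (2, 1)]

# aggregate()-style: one dict summarizing the whole queryset
aggregate_style = {"depth__sum": sum(d for _, d in rows)}

# annotate()-style: one dict per cited opinion,
# like .values("cited_opinion_id").annotate(total_depth=Sum("depth"))
rows.sort(key=itemgetter(0))
annotate_style = [
    {"cited_opinion_id": op, "total_depth": sum(d for _, d in grp)}
    for op, grp in groupby(rows, key=itemgetter(0))
]

print(aggregate_style)  # {'depth__sum': 13}
print(annotate_style)   # [{'cited_opinion_id': 1, 'total_depth': 12}, {'cited_opinion_id': 2, 'total_depth': 1}]
```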

@mlissner
Member

> I wonder if maybe that's a mistake and it should be authority.cited_opinion.cluster.blocked instead?

That's certainly possible. The idea here is to tell crawlers not to waste their time crawling pages that we already know are blocked. Google and other crawlers give you a budget of links that they'll follow (called the "Crawl Budget"), so it's important not to send them to pages they can't index anyway.

I think you can get all the information you need from the queryset with something like this:

cited = (
    OpinionsCitedByRECAPDocument.objects
        .filter(citing_document__docket_entry__docket_id=4214664)
        .values("cited_opinion_id")
        .annotate(opinion_count=Count('cited_opinion_id'), total_depth=Sum('depth'))
)

That returns objects like:

{'cited_opinion_id': 9339585, 'opinion_count': 1, 'total_depth': 1}

And if you iterate over the entire thing you get results like:

In [27]: for c in cited:
    ...:     print(f"citation_count: {c['opinion_count']}; total depth: {c['total_depth']}")
    ...
    citation_count: 1; total depth: 1
    citation_count: 4; total depth: 35
    citation_count: 1; total depth: 1
    citation_count: 1; total depth: 1
    citation_count: 1; total depth: 2
    ...

Does this work as a solution (if you pick the fields carefully, do the prefetches, etc)?

@elisa-a-v
Contributor Author

> That's certainly possible. The idea here is to tell crawlers not to waste their time crawling pages that we already know are blocked. Google and other crawlers give you a budget of links that they'll follow (called the "Crawl Budget"), so it's important not to send them to pages they can't index anyway.

I understand. Well, from what I saw, I don't think the authority actually has any blocked attribute, so I'd bet it's not working, unless I'm missing something 🤔

> Does this work as a solution (if you pick the fields carefully, do the prefetches, etc)?

Yes that makes a lot of sense to me!
