Align production with staging (#555)
* Some changes to cset_upload.py to allow site observation short codes (#527)

* stashing

* fix: cset_upload.py updated to allow new-style site observation codes

NB! this probably still won't work! I suspect the file I was given is
broken and I cannot test it further

* stashing

* stashing

* Short code prefix and tooltip to backend

Target loader now reads the short code prefix and tooltip from
meta_aligner.yaml. The tooltip is saved to the Experiment model.

TODO: make tooltip available via API

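A minimal sketch of what the loader now consumes, assuming `code_prefix` and
`code_prefix_tooltips` sit at the top level of meta_aligner.yaml (the exact
layout is an assumption, not confirmed by this commit):

```python
import yaml

# Hedged sketch: the key names match this commit (code_prefix,
# code_prefix_tooltips); their position in meta_aligner.yaml is assumed.
with open("meta_aligner.yaml", encoding="utf-8") as fh:
    meta = yaml.safe_load(fh)

prefix_tooltips = meta.get("code_prefix_tooltips", {})  # may be absent entirely
code_prefix = meta.get("code_prefix", "")               # may be empty
print(code_prefix, prefix_tooltips.get(code_prefix, ""))
```
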
* Prefix tooltip now served by api/site_observation

* stashing

* Site observation groups for short codes now by experiment

* feat: download structure fixed

TODO: add all the yamls

* All yaml files added to download

* New format to download zip (issue 1326) (#530)

* stashing

* stashing

* feat: download structure fixed

TODO: add all the yamls

* All yaml files added to download

* cset_upload.py: lhs_pdb renamed to ref_pdb

* Renamed canon- and conf-site tags

* Adds support for key-based SSH connections (#534)

* Centralised environment variables (#529)

* refactor: Restructured settings.py

* docs: Minor tweaks

* refactor: Move security and infection config to settings

* refactor: b/e & f/e tags now in settings (also fixed f/e tag value)

* refactor: Move Neo4j config to settings

* refactor: More variables into settings

* refactor: Moved remaining config

* docs: Adds configuration guide as comments

* docs: Variable prefix now 'stack_' not 'stack_env_'

---------

Co-authored-by: Alan Christie <[email protected]>

* feat: Adds support for private keys on SSH tunnel

* fix: Fixes key-based logic

---------

Co-authored-by: Alan Christie <[email protected]>

* build(deps): bump cryptography from 42.0.0 to 42.0.2 (#533)

Bumps [cryptography](https://github.com/pyca/cryptography) from 42.0.0 to 42.0.2.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](pyca/cryptography@42.0.0...42.0.2)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* docs: Updates documentation (#536)

Co-authored-by: Alan Christie <[email protected]>

* build(deps): bump django from 3.2.20 to 3.2.24 (#535)

Bumps [django](https://github.com/django/django) from 3.2.20 to 3.2.24.
- [Commits](django/django@3.2.20...3.2.24)

---
updated-dependencies:
- dependency-name: django
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix: reverting wrong changes

* fix: reverting wrong changes (#538)

* stashing

* add site observation's ligand sdf to aligned_files

* fix: custom pdb now downloadable

* fix: increased log level to error in the unexpected-exceptions block

* fix: Discourse service check now checks API key before creating a service (#544)

Co-authored-by: Alan Christie <[email protected]>

* build(deps): bump cryptography from 42.0.2 to 42.0.4 (#539)

Bumps [cryptography](https://github.com/pyca/cryptography) from 42.0.2 to 42.0.4.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](pyca/cryptography@42.0.2...42.0.4)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* metadata.csv populated

Started working on issue 1355 as well; it's too tightly coupled to
separate. Some work remaining on that:
- when a tag is created in the UI, make sure the upload_name attribute is populated

* upload_name automatically populated when creating tags in UI

Only populated on creation; updates won't touch it

* changes to api/download_structures

- apo_file, bound_file, sdf_info and smiles_info merged into
all_aligned_structures
- added pdb_info field

NB! download_structures was required to provide ligand_pdb as
well. This wasn't tracked previously, so I added a field to the
SiteObservation model. This means there's a migration, and on stack
deployment the data needs to be wiped and re-uploaded
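
A hedged sketch of a client request after this change. The flag names mirror
the serializer fields visible in the diff below (all_aligned_structures,
pdb_info, metadata_info, trans_matrix_info); the target name, URL and POST
shape are assumptions for illustration only:

```python
import requests

payload = {
    "target_name": "Mpro",           # hypothetical target
    "all_aligned_structures": True,  # replaces apo_file/bound_file/sdf_info/smiles_info
    "pdb_info": True,                # newly added field
    "metadata_info": True,
    "trans_matrix_info": False,
}
resp = requests.post("https://fragalysis.example/api/download_structures/", json=payload)
resp.raise_for_status()
```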

* don't download neighbourhoods.yaml unless trans_matrix_info is checked

* fixed error handling (errors.csv) and the combined sdf not being returned

* fix: Added parsing directives to DownloadStructuresSerializer

* Consecutive numbering of observations under canon site

* SiteObservation.tag split into tag and tag_prefix (issue 1361)

* fix: crystallographic_files folders in download now sans suffix (#550)

* fix: tag names and prefix in download's metadata.csv

* fix: return all proteins listed in api/download_structures

* fix: fixed 'All structures' option not working in download dialog

* Migrations for new file fields

* Issue 1326 - mol and smiles added to download bundle

NB! not production/staging ready, still contains a hack for testing
because XCA doesn't provide all the attributes.

* Target loader should handle empty code_prefix and tooltip

'Should' because it hasn't been tested yet with real data

* Column 'Downloaded' added to metadata.csv in downloads

* fix: restore 'upload_name' in site obvs tags to prefix-tag format

* Removed ligand_smiles workaround

All necessary files are now tracked by the database and returned in download.

* fix: Add force_error_display to connection functions (default False) (#559)

Co-authored-by: Alan Christie <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Kalev Takkis <[email protected]>
Co-authored-by: Warren Thompson <[email protected]>
Co-authored-by: Alan Christie <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Warren Thompson <[email protected]>
6 people authored Mar 12, 2024
1 parent 4a1e452 commit 3beafb4
Showing 6 changed files with 135 additions and 54 deletions.
14 changes: 8 additions & 6 deletions api/security.py
@@ -47,7 +47,7 @@
# response = view(request)


def get_remote_conn() -> Optional[SSHConnector]:
def get_remote_conn(force_error_display=False) -> Optional[SSHConnector]:
credentials: Dict[str, Any] = {
"user": settings.ISPYB_USER,
"pw": settings.ISPYB_PASSWORD,
@@ -71,7 +71,8 @@ def get_remote_conn() -> Optional[SSHConnector]:
# Assume the credentials are invalid if there is no host.
# If a host is not defined other properties are useless.
if not credentials["host"]:
logger.debug("No ISPyB host - cannot return a connector")
if logging.DEBUG >= logger.level or force_error_display:
logger.info("No ISPyB host - cannot return a connector")
return None

# Try to get an SSH connection (aware that it might fail)
@@ -81,14 +82,14 @@ def get_remote_conn() -> Optional[SSHConnector]:
except Exception:
# Log the exception if DEBUG level or lower/finer?
# The following will not log if the level is set to INFO for example.
if logging.DEBUG >= logger.level:
if logging.DEBUG >= logger.level or force_error_display:
logger.info("credentials=%s", credentials)
logger.exception("Got the following exception creating SSHConnector...")

return conn


def get_conn() -> Optional[Connector]:
def get_conn(force_error_display=False) -> Optional[Connector]:
credentials: Dict[str, Any] = {
"user": settings.ISPYB_USER,
"pw": settings.ISPYB_PASSWORD,
@@ -101,7 +102,8 @@ def get_conn() -> Optional[Connector]:
# Assume the credentials are invalid if there is no host.
# If a host is not defined other properties are useless.
if not credentials["host"]:
logger.debug("No ISPyB host - cannot return a connector")
if logging.DEBUG >= logger.level or force_error_display:
logger.info("No ISPyB host - cannot return a connector")
return None

conn: Optional[Connector] = None
@@ -110,7 +112,7 @@ def get_conn() -> Optional[Connector]:
except Exception:
# Log the exception if DEBUG level or lower/finer?
# The following will not log if the level is set to INFO for example.
if logging.DEBUG >= logger.level:
if logging.DEBUG >= logger.level or force_error_display:
logger.info("credentials=%s", credentials)
logger.exception("Got the following exception creating Connector...")

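A short usage sketch of the new flag, assuming the import path shown in this
diff; with force_error_display=True the credential and exception messages are
logged even when the logger is configured above DEBUG:

```python
from api.security import get_conn, get_remote_conn

# Hedged sketch: get_remote_conn/get_conn are the functions patched above.
conn = get_remote_conn(force_error_display=True)
if conn is None:
    # fall back to a direct (non-SSH) connector, still forcing diagnostics
    conn = get_conn(force_error_display=True)
```
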
57 changes: 47 additions & 10 deletions viewer/download_structures.py
@@ -20,7 +20,8 @@

import pandoc
from django.conf import settings
from django.db.models import Exists, OuterRef, Subquery
from django.db.models import CharField, Exists, F, OuterRef, Subquery, Value
from django.db.models.functions import Concat

from viewer.models import (
DownloadLinks,
@@ -48,6 +49,8 @@
'apo_desolv_file': ('aligned'), # SiteObservation: apo_desolv_file
'bound_file': ('aligned'), # SiteObservation: bound_file
'sdf_info': ('aligned'), # SiteObservation: ligand_mol_file (indirectly)
'ligand_mol': ('aligned'), # SiteObservation: ligand_mol
'ligand_smiles': ('aligned'), # SiteObservation: ligand_smiles
'ligand_pdb': ('aligned'), # SiteObservation: ligand_pdb
'smiles_info': (''), # SiteObservation: smiles_info (indirectly)
# those above are all controlled by serializer's all_aligned_structures flag
@@ -79,6 +82,7 @@ class TagSubquery(Subquery):
"""Annotate SiteObservation with tag of given category"""

def __init__(self, category):
# fmt: off
query = SiteObservationTag.objects.filter(
pk=Subquery(
SiteObvsSiteObservationTag.objects.filter(
@@ -88,8 +92,16 @@ def __init__(self, category):
),
).values('site_obvs_tag')[:1]
)
).values('tag')[0:1]
).annotate(
combitag=Concat(
F('tag_prefix'),
Value(' - '),
F('tag'),
output_field=CharField(),
),
).values('combitag')[0:1]
super().__init__(query)
# fmt: on


class CuratedTagSubquery(Exists):
@@ -126,6 +138,8 @@ class ArchiveFile:
'diff_file': {},
'sigmaa_file': {},
'ligand_pdb': {},
'ligand_mol': {},
'ligand_smiles': {},
},
'molecules': {
'sdf_files': {},
@@ -219,6 +233,10 @@ def _patch_molecule_name(site_observation):
lines = site_observation.ligand_mol_file.split('\n')
if not lines[0].strip():
lines[0] = site_observation.long_code

# the db content is a mol file but what's requested here is
# sdf; add the sdf record separator
lines.append('$$$$\n')
return '\n'.join(lines)

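Condensed from the hunk above into a runnable sketch (`long_code` stands in
for the model attribute of the same name):

```python
def mol_to_sdf(mol_text: str, long_code: str) -> str:
    # Sketch of _patch_molecule_name: give the MOL block a title if the first
    # line is blank, then terminate the record so the result is valid SDF.
    lines = mol_text.split("\n")
    if not lines[0].strip():
        lines[0] = long_code
    lines.append("$$$$\n")
    return "\n".join(lines)
```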

@@ -400,21 +418,21 @@ def _trans_matrix_files_zip(ziparchive, target):
_add_empty_file(ziparchive, archive_path)


def _metadate_file_zip(ziparchive, target):
def _metadata_file_zip(ziparchive, target, site_observations):
"""Compile and add metadata file to archive."""
logger.info('+ Processing metadata')

annotations = {}
values = ['code', 'longcode', 'cmpd__compound_code', 'smiles']
header = ['Code', 'Long code', 'Compound code', 'Smiles']
values = ['code', 'longcode', 'cmpd__compound_code', 'smiles', 'downloaded']
header = ['Code', 'Long code', 'Compound code', 'Smiles', 'Downloaded']

for category in TagCategory.objects.filter(category__in=TAG_CATEGORIES):
tag = f'tag_{category.category.lower()}'
values.append(tag)
header.append(category.category)
annotations[tag] = TagSubquery(category.category)

pattern = re.compile(r'\W+')
pattern = re.compile(r'\W+')  # non-word characters (anything except letters, digits, '_')
for tag in SiteObservationTag.objects.filter(
category__in=TagCategory.objects.filter(category__in=CURATED_TAG_CATEGORIES),
target=target,
@@ -431,6 +449,12 @@ def _metadate_file_zip(ziparchive, target):
).prefetch_related(
'cmpd',
'siteobservationtags',
).annotate(
downloaded=Exists(
site_observations.filter(
pk=OuterRef('pk'),
),
)
).annotate(**annotations).values_list(*values)
# fmt: on

@@ -488,7 +512,7 @@ def _extra_files_zip(ziparchive, target):
ziparchive.write(
filepath,
os.path.join(
_ZIP_FILEPATHS[f'extra_files_{num_extra_dir}'], file
f'{_ZIP_FILEPATHS["extra_files"]}_{num_extra_dir}', file
),
)
num_processed += 1
@@ -603,7 +627,9 @@ def _build_readme(readme, original_search, template_file, ziparchive):
readme.write(f'- {filename}' + '\n')


def _create_structures_zip(target, zip_contents, file_url, original_search, host):
def _create_structures_zip(
target, zip_contents, file_url, original_search, host, site_observations
):
"""Write a ZIP file containing data from an input dictionary."""

logger.info('+ _create_structures_zip(%s)', target.title)
@@ -666,7 +692,7 @@ def _create_structures_zip(target, zip_contents, file_url, original_search, host

# compile and add metadata.csv
if zip_contents['metadata_info']:
_metadate_file_zip(ziparchive, target)
_metadata_file_zip(ziparchive, target, site_observations)

if zip_contents['trans_matrix_info']:
_trans_matrix_files_zip(ziparchive, target)
@@ -766,6 +792,8 @@ def _create_structures_dict(site_obvs, protein_params, other_params):
'artefacts_file',
'pdb_header_file',
'ligand_pdb',
'ligand_mol',
'ligand_smiles',
'diff_file',
]:
# siteobservation object
@@ -880,6 +908,8 @@ def get_download_params(request):
'apo_solv_file': serializer.validated_data['all_aligned_structures'],
'apo_desolv_file': serializer.validated_data['all_aligned_structures'],
'ligand_pdb': serializer.validated_data['all_aligned_structures'],
'ligand_mol': serializer.validated_data['all_aligned_structures'],
'ligand_smiles': serializer.validated_data['all_aligned_structures'],
'cif_info': serializer.validated_data['cif_info'],
'mtz_info': serializer.validated_data['mtz_info'],
'map_info': serializer.validated_data['map_info'],
@@ -984,7 +1014,14 @@ def create_or_return_download_link(request, target, site_observations):
zip_contents = _create_structures_dict(
site_observations, protein_params, other_params
)
_create_structures_zip(target, zip_contents, file_url, original_search, host)
_create_structures_zip(
target,
zip_contents,
file_url,
original_search,
host,
site_observations,
)

download_link = DownloadLinks()
# Note: 'zip_file' and 'zip_contents' record properties are no longer used.
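
The two ORM patterns introduced above, condensed into a standalone sketch
(model imports as in the diff; the selected queryset is hypothetical):

```python
from django.db.models import CharField, Exists, F, OuterRef, Value
from django.db.models.functions import Concat

from viewer.models import SiteObservation, SiteObservationTag

# "prefix - tag" string, as TagSubquery now builds it
tags = SiteObservationTag.objects.annotate(
    combitag=Concat(F("tag_prefix"), Value(" - "), F("tag"), output_field=CharField()),
)

# 'Downloaded' column: True when the row is in the user's selection
selected = SiteObservation.objects.filter(pk__in=[1, 2, 3])  # hypothetical selection
observations = SiteObservation.objects.annotate(
    downloaded=Exists(selected.filter(pk=OuterRef("pk"))),
)
```
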
36 changes: 36 additions & 0 deletions viewer/migrations/0049_auto_20240307_1344.py
@@ -0,0 +1,36 @@
# Generated by Django 3.2.23 on 2024-03-07 13:44

from django.db import migrations, models


class Migration(migrations.Migration):
dependencies = [
('viewer', '0048_auto_20240305_1038'),
]

operations = [
migrations.AddField(
model_name='historicalsiteobservation',
name='ligand_mol',
field=models.TextField(max_length=255, null=True),
),
migrations.AddField(
model_name='historicalsiteobservation',
name='ligand_smiles',
field=models.TextField(max_length=255, null=True),
),
migrations.AddField(
model_name='siteobservation',
name='ligand_mol',
field=models.FileField(
max_length=255, null=True, upload_to='target_loader_data/'
),
),
migrations.AddField(
model_name='siteobservation',
name='ligand_smiles',
field=models.FileField(
max_length=255, null=True, upload_to='target_loader_data/'
),
),
]
6 changes: 6 additions & 0 deletions viewer/models.py
@@ -465,6 +465,12 @@ class SiteObservation(models.Model):
seq_id = models.IntegerField()
chain_id = models.CharField(max_length=1)
ligand_mol_file = models.TextField(null=True)
ligand_mol = models.FileField(
upload_to="target_loader_data/", null=True, max_length=255
)
ligand_smiles = models.FileField(
upload_to="target_loader_data/", null=True, max_length=255
)
ligand_pdb = models.FileField(
upload_to="target_loader_data/", null=True, max_length=255
)
32 changes: 21 additions & 11 deletions viewer/target_loader.py
@@ -635,7 +635,7 @@ def logfunc(key, message):

# memo to self: added type ignore directives to return line
# below and append line above because after small refactoring,
# mypy all of the sudden started throwing errors on bothe or
# mypy all of the sudden started throwing errors on both of
# these. the core of its grievance is that it expects the
# return type to be list[str]. no idea why, function signature
# clearly defines it as list[str | None]
@@ -734,7 +734,6 @@ def process_experiment(
"""
del kwargs
assert item_data
assert prefix_tooltips
logger.debug("incoming data: %s", item_data)
experiment_name, data = item_data

@@ -814,8 +813,12 @@ def process_experiment(
# version int old versions are kept target loader
version = 1

code_prefix = extract(key="code_prefix")
prefix_tooltip = prefix_tooltips.get(code_prefix, "")
# if empty or key missing entirely, ensure code_prefix returns empty
code_prefix = extract(key="code_prefix", level=logging.INFO)
# ignoring type because the tooltip dict can legitimately be empty
# and in that case the assert statement fails. need to remove it
# and use the ignore
prefix_tooltip = prefix_tooltips.get(code_prefix, "") # type: ignore[union-attr]

fields = {
"code": experiment_name,
@@ -1279,7 +1282,7 @@ def process_site_observation(
longcode = f"{experiment.code}_{chain}_{str(ligand)}_{str(idx)}"
key = f"{experiment.code}/{chain}/{str(ligand)}"

smiles = extract(key="ligand_smiles")
smiles = extract(key="ligand_smiles_string")

try:
compound = compounds[experiment_id].instance
@@ -1319,11 +1322,13 @@ def process_site_observation(
apo_desolv_file,
apo_file,
artefacts_file,
ligand_mol,
ligand_mol_file,
sigmaa_file,
diff_file,
event_file,
ligand_pdb,
ligand_mol,
ligand_smiles,
) = self.validate_files(
obj_identifier=experiment_id,
file_struct=data,
@@ -1340,16 +1345,19 @@ def process_site_observation(
"diff_map", # NB! keys in meta_aligner not yet updated
"event_map",
"ligand_pdb",
"ligand_mol",
"ligand_smiles",
),
validate_files=validate_files,
)

logger.debug('looking for ligand_mol: %s', ligand_mol)
logger.debug('looking for ligand_mol: %s', ligand_mol_file)

mol_data = None
if ligand_mol:
if ligand_mol_file:
with contextlib.suppress(TypeError, FileNotFoundError):
with open(
self.raw_data.joinpath(ligand_mol),
self.raw_data.joinpath(ligand_mol_file),
"r",
encoding="utf-8",
) as f:
@@ -1377,6 +1385,8 @@ def process_site_observation(
"event_file": str(self._get_final_path(event_file)),
"artefacts_file": str(self._get_final_path(artefacts_file)),
"ligand_pdb": str(self._get_final_path(ligand_pdb)),
"ligand_mol": str(self._get_final_path(ligand_mol)),
"ligand_smiles": str(self._get_final_path(ligand_smiles)),
"pdb_header_file": "currently missing",
"ligand_mol_file": mol_data,
}
@@ -1483,7 +1493,7 @@ def process_bundle(self):
self.version_number = meta["version_number"]
self.version_dir = meta["version_dir"]
self.previous_version_dirs = meta["previous_version_dirs"]
prefix_tooltips = meta["code_prefix_tooltips"]
prefix_tooltips = meta.get("code_prefix_tooltips", {})

# check transformation matrix files
( # pylint: disable=unbalanced-tuple-unpacking
@@ -1874,7 +1884,7 @@ def _tag_observations(self, tag, prefix, category, so_list):
so_tag = SiteObservationTag()
so_tag.tag = tag
so_tag.tag_prefix = prefix
so_tag.upload_name = tag
so_tag.upload_name = f"{prefix} - {tag}"
so_tag.category = TagCategory.objects.get(category=category)
so_tag.target = self.target
so_tag.mol_group = so_group
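
The tag-name change in _tag_observations, distilled into a sketch: upload_name
now stores the same "prefix - tag" string the download code assembles (note
the diff does not special-case an empty prefix):

```python
def upload_name_for(prefix: str, tag: str) -> str:
    # mirrors the change above: upload_name switches from the bare tag to
    # "prefix - tag"; an empty prefix would yield " - tag" as written
    return f"{prefix} - {tag}"

assert upload_name_for("A71", "Site 1") == "A71 - Site 1"
```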