Add pages documentation regarding supported file formats (#7)
awvwgk authored Jan 15, 2021
1 parent e09b473 commit f8eeba0
Showing 10 changed files with 713 additions and 2 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/docs.yml
@@ -19,5 +19,5 @@ jobs:
with:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
BRANCH: gh-pages
- FOLDER: docs
+ FOLDER: _docs
CLEAN: true
60 changes: 60 additions & 0 deletions doc/format-ctfile.md
@@ -0,0 +1,60 @@
---
title: Connection table format
---

## Specification

@Note [Reference](https://www.daylight.com/meetings/mug05/Kappler/ctfile.pdf)

The molfile is identified by the extension ``mol`` and the structure data format
is identified by ``sdf``.

## Example

Caffeine molecule in mol format:

```text
11262021073D
24 0 0 0 0 999 V2000
1.0732 0.0488 -0.0757 C 0 0 0 0 0 0 0 0 0 0 0 0
2.5137 0.0126 -0.0758 N 0 0 0 0 0 0 0 0 0 0 0 0
3.3520 1.0959 -0.0753 C 0 0 0 0 0 0 0 0 0 0 0 0
4.6190 0.7303 -0.0755 N 0 0 0 0 0 0 0 0 0 0 0 0
4.5791 -0.6314 -0.0753 C 0 0 0 0 0 0 0 0 0 0 0 0
3.3013 -1.1026 -0.0752 C 0 0 0 0 0 0 0 0 0 0 0 0
2.9807 -2.4869 -0.0738 C 0 0 0 0 0 0 0 0 0 0 0 0
1.8253 -2.9004 -0.0758 O 0 0 0 0 0 0 0 0 0 0 0 0
4.1144 -3.3043 -0.0694 N 0 0 0 0 0 0 0 0 0 0 0 0
5.4517 -2.8562 -0.0723 C 0 0 0 0 0 0 0 0 0 0 0 0
6.3893 -3.6597 -0.0723 O 0 0 0 0 0 0 0 0 0 0 0 0
5.6624 -1.4768 -0.0749 N 0 0 0 0 0 0 0 0 0 0 0 0
7.0095 -0.9365 -0.0752 C 0 0 0 0 0 0 0 0 0 0 0 0
3.9206 -4.7409 -0.0616 C 0 0 0 0 0 0 0 0 0 0 0 0
0.7340 1.0879 -0.0750 H 0 0 0 0 0 0 0 0 0 0 0 0
0.7124 -0.4570 0.8233 H 0 0 0 0 0 0 0 0 0 0 0 0
0.7124 -0.4558 -0.9755 H 0 0 0 0 0 0 0 0 0 0 0 0
2.9930 2.1176 -0.0748 H 0 0 0 0 0 0 0 0 0 0 0 0
7.7653 -1.7263 -0.0759 H 0 0 0 0 0 0 0 0 0 0 0 0
7.1486 -0.3218 0.8197 H 0 0 0 0 0 0 0 0 0 0 0 0
7.1480 -0.3208 -0.9695 H 0 0 0 0 0 0 0 0 0 0 0 0
2.8650 -5.0232 -0.0583 H 0 0 0 0 0 0 0 0 0 0 0 0
4.4023 -5.1592 0.8284 H 0 0 0 0 0 0 0 0 0 0 0 0
4.4002 -5.1693 -0.9478 H 0 0 0 0 0 0 0 0 0 0 0 0
M END
```

## Extensions

No extensions to the original format are implemented.

## Missing Features

The following features are currently not supported:

- Not all modifiers are supported for the connection table
- SDF key-value pair annotations are dropped

@Note Feel free to contribute support for missing features
or bring missing features to our attention by opening an issue.
69 changes: 69 additions & 0 deletions doc/format-ein.md
@@ -0,0 +1,69 @@
---
title: Gaussian external format
---

## Specification

@Note [Reference](https://gaussian.com/external/)

The first line of the input is read as four integers of width 10, ``(4i10)``.
The first integer is the number of atoms.
The second integer is specific to the requested run mode.
The third integer contains the total charge and the fourth integer the spin, given as
the number of unpaired electrons.
The total charge and the system's spin are stored in the [[structure_type]].
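
The header layout described above can be sketched in a few lines of Python. This is only an illustration, not the reader implemented in this library: the function name `parse_ein_header` and the returned dictionary keys are hypothetical, and the sketch assumes a well-formed line of at least 40 columns.

```python
def parse_ein_header(line: str) -> dict:
    """Split the first line of an ein file into its four i10 fields."""
    # Fortran (4i10): four right-aligned integers, each exactly 10 columns wide.
    fields = [line[start:start + 10] for start in range(0, 40, 10)]
    natoms, run_mode, charge, unpaired = (int(field) for field in fields)
    return {
        "natoms": natoms,      # number of atoms
        "run_mode": run_mode,  # run-mode-specific integer
        "charge": charge,      # total charge
        "unpaired": unpaired,  # spin as number of unpaired electrons
    }


# Header of the caffeine example, rebuilt with the same fixed-width layout.
header = parse_ein_header("{:10d}{:10d}{:10d}{:10d}".format(24, 1, 0, 0))
```

Slicing by column rather than splitting on whitespace keeps the sketch faithful to the fixed format, where adjacent fields may touch if a value fills all ten columns.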

The structure is specified by atomic numbers, Cartesian coordinates in atomic units
(Bohr), and a scalar quantity, usually the partial charge, using the fixed format
``(i10,4f20.12)``.
The element is identified by its atomic number,
which is converted to its capitalized element symbol internally.
Only positive, non-zero integers are allowed as atomic numbers.
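
A single atom record can be read in the same column-based way. The following Python sketch is an assumption-laden illustration rather than this library's reader: `parse_ein_atom` is a hypothetical name, the `SYMBOLS` table is deliberately truncated to the elements in the caffeine example, and the Bohr-to-Ångström factor is the CODATA 2018 value.

```python
BOHR_TO_ANGSTROM = 0.529177210903  # CODATA 2018 Bohr radius in Ångström

# Hypothetical, truncated lookup table; a real reader needs the full periodic table.
SYMBOLS = {1: "H", 6: "C", 7: "N", 8: "O"}


def parse_ein_atom(line: str):
    """Read one (i10,4f20.12) record: atomic number, x, y, z, scalar."""
    number = int(line[0:10])
    if number <= 0:
        # Only positive, non-zero atomic numbers are allowed.
        raise ValueError(f"invalid atomic number: {number}")
    # Four reals follow, each exactly 20 columns wide.
    x, y, z, scalar = (
        float(line[10 + 20 * i:30 + 20 * i]) for i in range(4)
    )
    return SYMBOLS[number], (x, y, z), scalar


# First atom of the caffeine example, rebuilt with the fixed-width layout.
record = "{:10d}{:20.12f}{:20.12f}{:20.12f}{:20.12f}".format(
    6, 2.027996941030, 0.092313100971, -0.143108928077, 0.0
)
symbol, xyz_bohr, scalar = parse_ein_atom(record)
xyz_angstrom = tuple(c * BOHR_TO_ANGSTROM for c in xyz_bohr)
```

Converting the first coordinate gives roughly 1.0732 Å, which agrees with the same atom in the connection table and gen examples of caffeine.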

The expected file extension is ``ein``.

## Examples

Caffeine molecule in Gaussian external format:

```text
24 1 0 0
6 2.027996941030 0.092313100971 -0.143108928077 0.000000000000
7 4.750109032883 0.023734954927 -0.143241208877 0.000000000000
6 6.334341685252 2.070988200950 -0.142353037792 0.000000000000
7 8.728605263543 1.380028892063 -0.142655393906 0.000000000000
6 8.653186310426 -1.193248402810 -0.142315243278 0.000000000000
6 6.238570386230 -2.083535979669 -0.142182962479 0.000000000000
6 5.632667631585 -4.699502178348 -0.139405065684 0.000000000000
8 3.449316339873 -5.480922657010 -0.143184517105 0.000000000000
7 7.775087464402 -6.244277357876 -0.131071375299 0.000000000000
6 10.302293246446 -5.397396780594 -0.136721655174 0.000000000000
8 12.074100072866 -6.915734697428 -0.136664963403 0.000000000000
7 10.700382864677 -2.790784724183 -0.141483763966 0.000000000000
6 13.245975677887 -1.769690333624 -0.142182962479 0.000000000000
6 7.408915313425 -8.959057313972 -0.116369309269 0.000000000000
1 1.387020877193 2.055757011721 -0.141786120079 0.000000000000
1 1.346221699097 -0.863566855309 1.555905663964 0.000000000000
1 1.346240596354 -0.861336978970 -1.843408533601 0.000000000000
1 5.655967949599 4.001720959646 -0.141313688652 0.000000000000
1 14.674305959118 -3.262309083535 -0.143449078705 0.000000000000
1 13.508968805056 -0.608151528241 1.548989267863 0.000000000000
1 13.507797175115 -0.606148418987 -1.832145768365 0.000000000000
1 5.414083058620 -9.492394601323 -0.110227700709 0.000000000000
1 8.319196188304 -9.749472887017 1.565392087032 0.000000000000
1 8.315114380769 -9.768540219438 -1.791082028670 0.000000000000
```

## Extensions

No extensions to the original format are implemented.

## Missing Features

The following features are currently not supported:

- The requested run mode is dropped while reading
- Scalar atomic quantities are not preserved

@Note Feel free to contribute support for missing features
or bring missing features to our attention by opening an issue.
106 changes: 106 additions & 0 deletions doc/format-gen.md
@@ -0,0 +1,106 @@
---
title: DFTB+ general format
---

## Specification

@Note [Reference](https://dftbplus.org/fileadmin/DFTBPLUS/public/dftbplus/latest/manual.pdf)

The general (gen) format is used for DFTB+ as geometry input format.
It is based on the [xyz format](./format-xyz.html).

The first line contains the number of atoms and the specific kind of provided
geometry.
Available types are cluster (``C``), supercell (``S``), fractional (``F``),
and helical (``H``). The letter defining the type is case-insensitive.

The second line gives the element symbols for each group of atoms, separated by
spaces. The groups are indexed starting from 1 and are referenced in the specification
of the atomic coordinates by this index rather than by their element symbol.

The following lines are specified as two integers and three reals separated by
spaces. The first integer is currently ignored. The second integer references
the element symbol in the second line.
The atomic coordinates are given in Ångström for cluster, supercell and helical input,
while they are given as fractions of the lattice vectors for fractional input.
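
The line layout described so far can be sketched as a small Python routine. It is a rough illustration under stated assumptions, not the reader shipped with this library: `parse_gen_atoms` is a hypothetical name, only the atom block is handled (any origin or lattice lines are left untouched), and lines starting with ``#`` are skipped as comments. The sample input is the first three atoms of the caffeine example.

```python
def parse_gen_atoms(text: str):
    """Parse the header, species line and atom block of a gen file."""
    # Drop blank lines and comment lines (those starting with '#').
    lines = [
        ln for ln in text.splitlines()
        if ln.strip() and not ln.lstrip().startswith("#")
    ]
    natoms_field, kind = lines[0].split()[:2]
    natoms = int(natoms_field)
    kind = kind.upper()  # the type letter is case-insensitive
    species = lines[1].split()
    atoms = []
    for ln in lines[2:2 + natoms]:
        # First integer is ignored; the second is a 1-based species index.
        _, index, x, y, z = ln.split()[:5]
        atoms.append((species[int(index) - 1], (float(x), float(y), float(z))))
    return kind, atoms


sample = """\
3 c
C N O H
1 1 1.07317000000000E+00 4.88500000000000E-02 -7.57300000000000E-02
2 2 2.51365000000000E+00 1.25600000000000E-02 -7.58000000000000E-02
3 1 3.35199000000000E+00 1.09592000000000E+00 -7.53300000000000E-02
"""
kind, atoms = parse_gen_atoms(sample)
```

Because the coordinates are separated by spaces rather than laid out in fixed columns, splitting on whitespace is sufficient here.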

For supercell and fractional input, the next line contains three reals giving
the origin of the structure, followed by three lines of three reals each for the
lattice vectors.
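
For fractional input, Cartesian positions follow from the lattice vectors as a weighted sum. The Python sketch below illustrates this conversion only; `fractional_to_cartesian` is a hypothetical helper, the lattice vectors are assumed to be stored as rows in the order they appear in the file, and the origin line is not used.

```python
def fractional_to_cartesian(frac, lattice):
    """Return sum_i frac[i] * lattice[i] as a Cartesian 3-tuple.

    lattice holds the three lattice vectors as rows.
    """
    return tuple(
        sum(frac[i] * lattice[i][k] for i in range(3))
        for k in range(3)
    )


# Cubic cell taken from the ammonia example (lengths in Ångström).
lattice = (
    (5.01336, 0.0, 0.0),
    (0.0, 5.01336, 0.0),
    (0.0, 0.0, 5.01336),
)
center = fractional_to_cartesian((0.5, 0.5, 0.5), lattice)
```

With this cubic cell, the fractional point (0.5, 0.5, 0.5) maps to the cell center, half the cell length along each axis.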

Lines starting with ``#`` are comments and are ignored while parsing.

The format is identified by the extension ``gen``.

## Example

Caffeine molecule in gen format:

```text
24 C
C N O H
1 1 1.07317000000000E+00 4.88500000000000E-02 -7.57300000000000E-02
2 2 2.51365000000000E+00 1.25600000000000E-02 -7.58000000000000E-02
3 1 3.35199000000000E+00 1.09592000000000E+00 -7.53300000000000E-02
4 2 4.61898000000000E+00 7.30280000000000E-01 -7.54900000000000E-02
5 1 4.57907000000000E+00 -6.31440000000000E-01 -7.53100000000000E-02
6 1 3.30131000000000E+00 -1.10256000000000E+00 -7.52400000000000E-02
7 1 2.98068000000000E+00 -2.48687000000000E+00 -7.37700000000000E-02
8 3 1.82530000000000E+00 -2.90038000000000E+00 -7.57700000000000E-02
9 2 4.11440000000000E+00 -3.30433000000000E+00 -6.93600000000000E-02
10 1 5.45174000000000E+00 -2.85618000000000E+00 -7.23500000000000E-02
11 3 6.38934000000000E+00 -3.65965000000000E+00 -7.23200000000000E-02
12 2 5.66240000000000E+00 -1.47682000000000E+00 -7.48700000000000E-02
13 1 7.00947000000000E+00 -9.36480000000000E-01 -7.52400000000000E-02
14 1 3.92063000000000E+00 -4.74093000000000E+00 -6.15800000000000E-02
15 4 7.33980000000000E-01 1.08786000000000E+00 -7.50300000000000E-02
16 4 7.12390000000000E-01 -4.56980000000000E-01 8.23350000000000E-01
17 4 7.12400000000000E-01 -4.55800000000000E-01 -9.75490000000000E-01
18 4 2.99301000000000E+00 2.11762000000000E+00 -7.47800000000000E-02
19 4 7.76531000000000E+00 -1.72634000000000E+00 -7.59100000000000E-02
20 4 7.14864000000000E+00 -3.21820000000000E-01 8.19690000000000E-01
21 4 7.14802000000000E+00 -3.20760000000000E-01 -9.69530000000000E-01
22 4 2.86501000000000E+00 -5.02316000000000E+00 -5.83300000000000E-02
23 4 4.40233000000000E+00 -5.15920000000000E+00 8.28370000000000E-01
24 4 4.40017000000000E+00 -5.16929000000000E+00 -9.47800000000000E-01
```

Ammonia molecular crystal:

```text
16 S
H N
1 1 2.19855889440000E+00 1.76390058240000E+00 8.80145481600000E-01
2 1 1.76390058240000E+00 8.80145481600000E-01 2.19855889440000E+00
3 1 8.80145481600000E-01 2.19855889440000E+00 1.76390058240000E+00
4 1 4.84115108400000E+00 1.61941554720000E+00 4.93981400880000E+00
5 1 4.35630903840000E+00 2.49981169680000E+00 3.63248012160000E+00
6 1 3.51957925440000E+00 1.15357413600000E+00 4.08403345680000E+00
7 1 4.08403345680000E+00 3.51957925440000E+00 1.15357413600000E+00
8 1 4.93981400880000E+00 4.84115108400000E+00 1.61941554720000E+00
9 1 3.63248012160000E+00 4.35630903840000E+00 2.49981169680000E+00
10 1 2.49981169680000E+00 3.63248012160000E+00 4.35630903840000E+00
11 1 1.15357413600000E+00 4.08403345680000E+00 3.51957925440000E+00
12 1 1.61941554720000E+00 4.93981400880000E+00 4.84115108400000E+00
13 2 1.37461317840000E+00 1.37461317840000E+00 1.37461317840000E+00
14 2 3.99815460000000E+00 1.99105592400000E+00 4.46364507600000E+00
15 2 4.46364507600000E+00 3.99815460000000E+00 1.99105592400000E+00
16 2 1.99105592400000E+00 4.46364507600000E+00 3.99815460000000E+00
0.00000000000000 0.00000000000000 0.00000000000000
5.01336000000000 0.00000000000000 0.00000000000000
0.00000000000000 5.01336000000000 0.00000000000000
0.00000000000000 0.00000000000000 5.01336000000000
```

## Extensions

No extensions to the original format are implemented.

## Missing Features

The following features are currently not supported:

- Helical boundary conditions

@Note Feel free to contribute support for missing features
or bring missing features to our attention by opening an issue.