Datasets

This page contains the datasets and the resulting fingerprint files supporting various papers and tests.

Fingerprint file format

This is the file named *-fp.dat.

Datatype   Count        Content
int8       1            Space dimension
int8       1            Number of fingerprints
int8       1            Number of sections of each fingerprint
float8     space dim.   First fingerprint
float8     space dim.   Second fingerprint
float8     space dim.   All the remaining fingerprints
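As an illustration of the layout in the table above, here is a minimal reader sketch. It assumes that int8 and float8 denote 8-byte integers and 8-byte floating-point values, that the file is little-endian, and that the fields are packed without padding; none of this is stated on this page, and the function name `read_fingerprints` is hypothetical.

```python
import struct


def read_fingerprints(path, byteorder="<"):
    """Read a *-fp.dat file: three header integers, then the fingerprints.

    Assumptions (not confirmed by the page): 8-byte integers, 8-byte
    floats, little-endian byte order, no padding between fields.
    """
    with open(path, "rb") as f:
        # Header: space dimension, number of fingerprints, sections per fingerprint.
        dim, count, sections = struct.unpack(byteorder + "3q", f.read(3 * 8))
        fingerprints = []
        for _ in range(count):
            raw = f.read(dim * 8)
            if len(raw) != dim * 8:
                raise ValueError("file ends before all fingerprints were read")
            fingerprints.append(list(struct.unpack(f"{byteorder}{dim}d", raw)))
    return dim, sections, fingerprints
```

If the header values look implausible (for example a huge space dimension), the byte order or integer width assumed here is probably wrong; passing `byteorder=">"` tries big-endian instead.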

Distance file format

This is the file named *-dist.dat.

Datatype   Count                 Content
int8       1                     Number of fingerprints
int8       1                     Number of fingerprints
float8     num. fp. × num. fp.   The distances
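The distance file can be read along the same lines. This is a sketch under the same assumptions as before (8-byte integers and floats, little-endian, no padding), plus the assumption that the matrix is stored row by row; `read_distances` is a hypothetical name.

```python
import struct


def read_distances(path, byteorder="<"):
    """Read a *-dist.dat file into a list of rows.

    Assumptions (not confirmed by the page): 8-byte integers, 8-byte
    floats, little-endian byte order, row-by-row storage.
    """
    with open(path, "rb") as f:
        # The header repeats the number of fingerprints: rows, then columns.
        rows, cols = struct.unpack(byteorder + "2q", f.read(2 * 8))
        matrix = []
        for _ in range(rows):
            raw = f.read(cols * 8)
            if len(raw) != cols * 8:
                raise ValueError("file ends before the full matrix was read")
            matrix.append(list(struct.unpack(f"{byteorder}{cols}d", raw)))
    return matrix
```

For a symmetric distance matrix the storage order does not change the result, so the row-by-row assumption only matters if the matrix turns out not to be symmetric.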

Sorted distances file format

This is the file named *-sorted.dat. It contains all the distances sorted in increasing order. For N structures there are N*(N-1) values (called the number of distances below).

Datatype   Count            Content
int8       1                Number of distances
float8     num. distances   The distances
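To make the N*(N-1) count concrete, the sketch below derives such a sorted list from a full distance matrix. It assumes the count corresponds to every off-diagonal entry of the matrix, so each pair appears twice; this is one reading of the count given above, not a documented procedure, and `sorted_offdiagonal` is a hypothetical name.

```python
def sorted_offdiagonal(matrix):
    """Return all off-diagonal entries of a square matrix in increasing order.

    Assumption: the sorted file holds each pair distance in both
    directions, which gives N*(N-1) values for N structures.
    """
    n = len(matrix)
    values = [matrix[i][j] for i in range(n) for j in range(n) if i != j]
    values.sort()
    return values
```

For three structures this yields 3 * 2 = 6 values, with every distance appearing twice.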

Datasets for the Journal of Chemical Physics paper

Here are the datafiles and resulting fingerprints for the following paper:

A. R. Oganov and M. Valle, How to quantify energy landscapes of solids, The Journal of Chemical Physics, vol. 130, p. 104504, Mar. 14 2009. bib doi file

The fingerprint datasets have been obtained with the following parameters:

cutoff-distance: 15.0
bin-size: 0.05
peak-size: 0.075

With these parameters the fingerprint space has a dimension of 900. There are exceptions. See below.
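The dimensions quoted on this page (900, 1800, 3600) are consistent with a simple relation: the number of bins (cutoff distance divided by bin size) times the number of unordered element pairs. The sketch below encodes that relation. It is inferred from the numbers on this page and is an assumption, not a formula given here; `fingerprint_dimension` is a hypothetical name.

```python
def fingerprint_dimension(cutoff, bin_size, n_species):
    """Estimate the fingerprint space dimension.

    Assumption: one section per unordered pair of element types,
    each section holding cutoff / bin_size bins.
    """
    bins = round(cutoff / bin_size)
    sections = n_species * (n_species + 1) // 2
    return bins * sections


# Two element types with a cutoff of 15.0: 300 bins x 3 pairs.
print(fingerprint_dimension(15.0, 0.05, 2))
# Two element types with the cutoff forced to 30.0: 600 bins x 3 pairs.
print(fingerprint_dimension(30.0, 0.05, 2))
# Three element types (as in MgNH) with a cutoff of 30.0: 600 bins x 6 pairs.
print(fingerprint_dimension(30.0, 0.05, 3))
```

Under this reading, the exceptions arise whenever the cutoff is forced to 30.0 instead of 15.0 or the structure contains more than two element types.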

Au8Pd4

12 atoms in the unit cell, 259 structures poscar energy

Computed cutoff distance: 13.1925 forced to: 15.0000

script

distance matrix and the corresponding AVS fld file to load it in AVS/Express. The sorted distances file

fingerprints file and the corresponding AVS fld file to load it in AVS/Express.

GaAs

8 atoms per unit cell, 2997 structures poscar energy

Computed cutoff distance: 20.8210 forced to: 30.0000 (note the difference). The space dimension here is 1800.

script

distance matrix and the corresponding AVS fld file to load it in AVS/Express. The sorted distances file

fingerprints file and the corresponding AVS fld file to load it in AVS/Express.

H2O

12 atoms per unit cell, 949 structures poscar energy

Computed cutoff distance: 9.1283 forced to: 15.0000

script

distance matrix and the corresponding AVS fld file to load it in AVS/Express. The sorted distances file

fingerprints file and the corresponding AVS fld file to load it in AVS/Express.

Lennard-Jones A4B8

12 atoms in the unit cell, 1949 structures poscar energy

Computed cutoff distance: 27.7744 forced to: 30.0000 (note the difference). The space dimension here is 1800.

script

distance matrix and the corresponding AVS fld file to load it in AVS/Express. The sorted distances file

fingerprints file and the corresponding AVS fld file to load it in AVS/Express.

MgNH

12 atoms per unit cell, 6979 structures poscar energy

Computed cutoff distance: 26.4909 forced to: 30.0000 (note the difference). The space dimension here is 3600.

script

distance matrix and the corresponding AVS fld file to load it in AVS/Express. The sorted distances file

fingerprints file and the corresponding AVS fld file to load it in AVS/Express.

MgO

32 atoms per unit cell, 967 structures poscar energy

Computed cutoff distance: 27.3328 forced to: 30.0000 (note the difference). The space dimension here is 1800.

script

distance matrix and the corresponding AVS fld file to load it in AVS/Express. The sorted distances file

fingerprints file and the corresponding AVS fld file to load it in AVS/Express.

Datasets for the Acta Crystallographica paper

Here are the datafiles and resulting fingerprints for the following paper:

M. Valle and A. R. Oganov, Crystal fingerprint space - a novel paradigm for studying crystal-structure sets, Acta Crystallographica Section A, vol. 66, pp. 507-517, Sept. 2010. bib doi PDF file (PDF from the publisher of Foundations of Crystallography Online) publisher site

A few datasets are in common with the previous paper.

The fingerprint datasets have been obtained with the following parameters:

cutoff-distance: 30.0
bin-size: 0.05
peak-size: 0.075

Duplicated points have been removed if their distance is less than 2 × 10⁻⁶, retaining the lowest-energy structure.

With these parameters the fingerprint space has a dimension of 1800.
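The duplicate removal described above could be sketched as a greedy pass over the structures in order of increasing energy. This is only an illustration under stated assumptions: the procedure actually used is not described on this page, the inputs (a full distance matrix and a list of energies) are assumed to be available in memory, and `remove_duplicates` is a hypothetical name.

```python
def remove_duplicates(distances, energies, threshold=2e-6):
    """Return the indices of the structures kept after duplicate removal.

    Structures are visited from lowest to highest energy; a structure is
    dropped when it lies closer than `threshold` to one already kept, so
    the lowest-energy member of each group of duplicates survives.
    """
    order = sorted(range(len(energies)), key=lambda i: energies[i])
    kept = []
    for i in order:
        if all(distances[i][j] >= threshold for j in kept):
            kept.append(i)
    return sorted(kept)
```

Visiting the structures in energy order is what guarantees that the retained representative is the lowest-energy one, at the cost of comparing each candidate against every structure kept so far.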

SiO2 (3 atoms/cell)

3 atoms per unit cell, 10000 structures reduced to 4738. poscar energy

Computed cutoff distance: 6.6407 forced to: 30.0000.

script

distance matrix and the corresponding AVS fld file to load it in AVS/Express. The sorted distances file

fingerprints file and the corresponding AVS fld file to load it in AVS/Express.

SiO2 (6 atoms/cell)

6 atoms per unit cell, 5728 structures reduced to 3384. poscar energy

Computed cutoff distance: 11.5061 forced to: 30.0000.

script

distance matrix and the corresponding AVS fld file to load it in AVS/Express. The sorted distances file

fingerprints file and the corresponding AVS fld file to load it in AVS/Express.

SiO2 (9 atoms/cell)

9 atoms per unit cell, 4758 structures reduced to 4132. poscar energy

Computed cutoff distance: 12.4774 forced to: 30.0000.

script

distance matrix and the corresponding AVS fld file to load it in AVS/Express. The sorted distances file

fingerprints file and the corresponding AVS fld file to load it in AVS/Express.

SiO2 (12 atoms/cell)

12 atoms per unit cell, 9999 structures reduced to 9245. poscar energy

Computed cutoff distance: 16.3994 forced to: 30.0000.

script

distance matrix and the corresponding AVS fld file to load it in AVS/Express. The sorted distances file

fingerprints file and the corresponding AVS fld file to load it in AVS/Express.

SiO2 (24 atoms/cell)

24 atoms per unit cell, 608 structures poscar energy

Computed cutoff distance: 20.1117 forced to: 30.0000.

script

distance matrix and the corresponding AVS fld file to load it in AVS/Express. The sorted distances file

fingerprints file and the corresponding AVS fld file to load it in AVS/Express.

SiO2 (36 atoms/cell)

36 atoms per unit cell, 9956 structures poscar energy

Computed cutoff distance: 45.1450 forced to: 30.0000.

script

distance matrix and the corresponding AVS fld file to load it in AVS/Express. The sorted distances file

fingerprints file and the corresponding AVS fld file to load it in AVS/Express.

SiO2 (48 atoms/cell)

48 atoms per unit cell, 3436 structures poscar energy

Computed cutoff distance: 33.1608 forced to: 30.0000.

script

distance matrix and the corresponding AVS fld file to load it in AVS/Express. The sorted distances file

fingerprints file and the corresponding AVS fld file to load it in AVS/Express.