Getting started

Tutorial videos

Videos for helping users to get started with colabfit-tools have been created and uploaded to Vimeo:

  1. Installation: link

  2. Overview of database structure: part 1 link and part 2 link)

  3. Example using the Si PRX dataset: link

  4. Dataset exploration example: link

Installing colabfit-tools

Using pip

Install directly from the GitHub repository using pip.

$ pip install git+https://github.com/colabfit/colabfit-tools.git@master

Installing Mongo

See the official MongoDB documentation for complete installation instructions. For convenience, the shell commands to download, install, and start a Mongo server have been included below (for Ubuntu 20.04).

# Instructions copied from MongoDB setup tutorial
wget -qO - https://www.mongodb.org/static/pgp/server-5.0.asc | sudo apt-key add -
echo "deb [ arch=amd64,arm64  ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/5.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-5.0.list
sudo apt-get update
sudo apt-get install -y mongodb-org
sudo systemctl start mongod

For installation without sudo or apt-get, it is suggested to use conda:

# Install using conda
conda install -c conda-forge mongodb

# Start the server without using sudo or systemctl
mongod --dbpath <path_to_folder_for_storing_mongo_data>

To confirm that this has been set up correctly, try opening a connection:

from colabfit.tools.database import MongoDatabase

database = MongoDatabase('test')

The logs from the mongod command should show the new connection, with output looking something like this:

2022-01-20T10:41:14.785-0600 I NETWORK  [conn1] received client metadata from 127.0.0.1:59890 conn1: { driver: { name: "PyMongo", version: "4.0.1" }, os: { type: "Linux", name: "Linux", architecture: "ppc64le", version: "4.18.0-305.3.1.el8_4.ppc64le" }, platform: "CPython 3.7.10.final.0" }

Note: in order for the MongoDatabase to be able to access the Mongo server, it must be able to open an SSH connection to the machine where the mongod command was run from. Refer to the PyMongo documentation for more details regarding setting up a connection to the Mongo server.

To enable access control (user/password authentication), see the following section of the MongoDB documentation.

First steps

Start your local Mongo server and confirm that it’s running.

$ sudo systemctl start mongod
$ sudo systemctl status mongod

Open a connection to the Mongo server from inside your Python script.

from colabfit.tools.database import MongoDatabase

client = MongoDatabase('my_database')

Build a Configuration just like you would build an ASE Atoms object

import numpy as np
from colabfit.tools.configuration import Configuration

atoms = Configuration(symbols='H2O', positions=np.random.random((3, 3)))

And finally, add the Configuration into the Database. Note that this command will not work if you haven’t first installed Mongo.

client.insert_data(
    [atoms],
    generator=False,
    verbose=True
)

Use mongosh for external verification that the data was added to your local database.

# In a Mongo terminal opened using the `mongosh` command-line-tool
$ show dbs
$ use my_database
$ my_database.configurations.findOne()

Next steps