SmotifCS Version 0.04 SmotifCS Hybrid Modeling Method PRE-REQUISITES: The hybrid modeling algorithm requires a BMRB formatted chemical shift file as input. Additionally, if the structure of the protein is known from any alternate resource, then a PDB-formatted structure file is required. This pdb-file can be present in a centralized local directory or a user-designated separate directory. Third-party Software: 1. MySQL 2. Phylip 3. Modeller 4. NMRPipe/TALOS Other requirements: 1. MySQL Smotif database (http://fiserlab.org/SmotifCS/vilas_loop_pred.sql.gz) 2. Smotif chemical shift library and related files (http://fiserlab.org/SmotifCS/chemical_shift.tar.gz) 3. Local PDB directory (central or user-designated) - updated (http://www.rcsb.org). The path to all the pre-requisites should be provided in smotifcs.ini configuration file. Smotics has been tested on 1. CentOS release 6.6 (Final) 2. Centos release 7 DOWNLOAD AND INSTALL Third-party Software: 1. MySQL Can be downloaded from (http://dev.mysql.com/downloads/mysql/) 1.a INSTALL MySQL AND perl DBI SUPPORT $ yum install mysql mysql-devel mysql-server perl-DBD-MySQL 1.b START MYSQL FOR FIRST TIME # /sbin/service mysqld start Initializing MySQL database: Installing MySQL system tables... OK Filling help tables... OK To start mysqld at boot time you have to copy support-files/mysql.server to the right place for your system PLEASE REMEMBER TO SET A PASSWORD FOR THE MySQL root USER ! To do so, start the server, then issue the following commands: /usr/bin/mysqladmin -u root password 'new-password' /usr/bin/mysqladmin -u root -h denali.aecom.yu.edu password 'new-password' Alternatively you can run: /usr/bin/mysql_secure_installation which will also give you the option of removing the test databases and anonymous user created by default. This is strongly recommended for production servers. See the manual for more instructions. You can start the MySQL daemon with: cd /usr ; /usr/bin/mysqld_safe & You can test the MySQL daemon with mysql-test-run.pl cd /usr/mysql-test ; perl mysql-test-run.pl Please report any problems with the /usr/bin/mysqlbug script! [ OK ] Starting mysqld: [ OK ] 1.c CHANGE ROOT PASSWORD # /usr/bin/mysqladmin -u root password 'new-password' 1.d CONNECT TO MYSQL TO VERIFY THAT NEW PASSWORD WORKS [root@denali tmp]# mysql -u root -h localhost -p Enter password: Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 4 Server version: 5.1.73 Source distribution Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> 1.e QUIT MYSQL CLIENT mysql> quit Bye 2. Phylip (version 3.69) PHYLIP is freely available from: http://evolution.genetics.washington.edu/phylip.html PHYLIP (the PHYLogeny Inference Package) is a package of programs for inferring phylogenies (evolutionary trees). It is available free over the Internet, and written to work on as many different kinds of computer systems as possible. The source code is distributed (in C), and executables are also distributed. INSTALLATION http://evolution.genetics.washington.edu/phylip/getme.html if you’re using a Windows machine, installation is easy. Download the three zip-files (phylip.exe,phylipwx.exe,phylipwy.exe ), and extract them to a preferred folder. The subfolder exe contains all the programs. Manual can be found from the subfolderdoc. For Macintosh OS X you may download the packaged disk image (Phylip3.66.dmg). It is compressed, so you need to expand it, and copy the resulting folder to a desired location. Alternatively, you may compile the programs from their sources as outlined in the UNIX installation below. There are source codes and ready made compilations available for older Macintosh systems, Mac OS 8 or 9, also. Installation for UNIX systems is also quite straight-forward. These instruction apply for RedHat-based Linux systems. Download the source code and documentation package (phylip-3.66.tar.gz) into a suitable folder. Unzip the package with gzip utility (gzip –d phylip-3.66.tar.gz) and expand the tar ball (tar xvf phylip-3.66.tar). Move to the newly formed folder containing the source codes (cd phylip3.6/src). The folder contains a file called Makefile. Installation of the PHYLIP programs is done simply by typing make install INSTALL PHYLIP ON LINUX and UNIX http://evolution.gs.washington.edu/phylip/download/phylip-3.696.tar.gz You can easily install PHYLIP and compile it yourself on a Linux or Unix system, provided that you have a C compiler on your system. tar -zxvf phylip-3.696.tar.gz This uncompresses the archive and a phylip3.696 folder is created that contains within it three folders, doc, exe, and src. To make executables, use your C compiler. It is probably as simple as going into the src directory, copying Makefile.unx and calling the copy Makefile, and then typing the command $ cp Makefile.unx Makefile $ make install With luck this will work. After the compilation the executables and their font files will be in folder exe. INSTALLATION SUMMARY $ wget http://evolution.gs.washington.edu/phylip/download/phylip-3.696.tar.gz $ tar -zxvf phylip-3.696.tar.gz $ cd phylip-3.696 $ cd src/ $ cp Makefile.unx Makefile $ make install 3. Modeller (version 9.14 ) https://salilab.org/modeller/ MODELLER is used for homology or comparative modeling of protein three-dimensional structures. The user provides an alignment of a sequence to be modeled with known related structures and MODELLER automatically calculates a model containing all non-hydrogen atoms. MODELLER is available for download for most Unix/Linux systems, Windows, and Mac. Libraries needed for Modeller: yum install libz.so.1 yum install libutil.so.1 yum install librt.so.1 yum install libpthread.so.0 yum install libm.so.6 yum install libdl.so.2 yum install libglib-2.0.so.0 yum install libc.so.6 Install MODELLER: rpm -ivh modeller-9.14-1.***.rpm 4. NMRPipe/TALOS http://spin.niddk.nih.gov/NMRPipe/ NMRPipe is an extensive software system for processing, analyzing, and exploiting NMR spectroscopic data. Libraries needed for NMRPIPE: yum install xorg-x11-fonts-100dpi yum install xorg-x11-fonts-75dpi yum install xorg-x11-fonts-misc yum install libX11.so.6 yum install libXext.so.6 yum install libstdc++.so.6 wget http://spin.niddk.nih.gov/NMRPipe/install/download/install.com wget http://spin.niddk.nih.gov/NMRPipe/install/download/binval.com wget http://spin.niddk.nih.gov/NMRPipe/install/download/NMRPipeX.tZ wget http://spin.niddk.nih.gov/NMRPipe/install/download/talos.tZ wget http://spin.niddk.nih.gov/NMRPipe/install/download/dyn.tZ chmod a+r *.tZ chmod u+rx *.com ./install.com +dest /home/user/software/talos DOWNLOAD AND INSTALL INSTRUCTIONS FOR OTHER REQUIREMENTS: 1. MySQL Smotif database MySQL Smotif database is freely available from: http://fiserlab.org/SmotifCS/vilas_loop_pred.sql.gz *** How to install a local copy of the Smotif Database *** - Log into the server running MySQL - Download Smotif database from http://fiserlab.org/SmotifCS/vilas_loop_pred.sql.gz and save it to /tmp directory - Uncompress vilas_loop_pred.sql.gz cd /tmp/ tar -zxvf vilas_loop_pred.sql.gz - Connect to the MySQL server and create a database named vilas_loop_pred $ mysql -u root -h localhost -p 'your_mysql_root_password' Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 2844 Server version: 5.1.73 Source distribution Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> mysql> create database vilas_loop_pred; mysql> quit - Load vilas_loop_pred database (This process might take some time). $ mysql -u root -h localhost -p vilas_loop_pred < vilas_loop_pred.sql Enter password: - Connect to the MySQL server as root and create a user with read access to vilas_loop_pred database [root@denali tmp]# mysql -u root -h localhost -p Enter password: Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 14 Server version: 5.1.73 Source distribution Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> use mysql Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Database changed Create New user: mysql> CREATE USER 'my_user'@'client_computer_where_you_will_run_Smotifcs' IDENTIFIED BY 'my_pass'; # Query OK, 0 rows affected (0.00 sec) mysql> GRANT SELECT ON vilas_loop_pred.* TO 'my_user'@'client_computer_where_you_will_run_Smotifcs' ; Query OK, 0 rows affected (0.00 sec) 2. Smotif chemical shift library and related files Smotif chemical shift library and related files is freely available from: http://fiserlab.org/SmotifCS/chemical_shift.tar.gz *** How to install a local copy of Smotif chemical shift library and related files *** - Download chemical shift database from http://fiserlab.org/SmotifCS/chemical_shift.tar.gz and save it to /tmp directory - Uncompress chemical_shift.tar.gz and move it to /usr/local/databases INSTALLATION (SmotifCS ) To install this SmotifCS-0.03, run the following commands: 1. Manually: Install where standard Perl modules are stored tar -zxvf SmotifCS-0.03.tar.gz cd SmotifCS-0.03/ perl Makefile.PL make make test make install 2. Install in a custom location (/home/user/MyPerlLib) tar -zxvf SmotifCS-0.03.tar.gz cd SmotifCS-0.03/ perl Makefile.PL PREFIX=/home/user/MyPerlLib/ make make test make install Please, do not forget to add the following line: use lib "$ENV{HOME}/MyLib/share/perl5/" in ./smotifcs.pl; 3. Using a CPAN client: as root type: perl -MCPAN -e shell > install SmotifCS 4. Using a CPAN client and installing in a custom location (/home/user/MyPerlLib) perl -MCPAN -e shell > conf makepl_arg PREFIX=/home/user/MyPerlLib/ > install SmotifCS Please, do not forget to add the following line: use lib "$ENV{HOME}/MyLib/share/perl5/" in ./smotifcs.pl; HOW TO RUN THE SmotifCS HYBRID MODELING ALGORITHM: INITIALIZE THE CONFIGURATION FILE: 1.Set up the configuration file: The configuration file, smotifcs_config.ini has all the information regarding the required library files and other pre-requisite software. Set all the paths and executables in this file correctly. Set environment varible in .bashrc file: export SMOTIFCS_CONFIG_FILE=/home/user/SmotifCS-0.01/smotifcs_config.ini in $HOME/smotifcs_config.ini file. 2. Set environment varible in .bashrc file: export SMOTIFCS_CONFIG_FILE=/home/user/smotifcs_config.ini 3. Create a subdirectory with a dummy pdb file name (eg: 1abc or 1zzz). 4. Put the chemical shift input file (in BMRB format) in this directory. Use the filename 1abc/pdb1abcshifts.dat or 1zzz/pdb1zzzshifts.dat for the BMRB formatted chemical shift input file. For testing purpose you can download an input file (in BMBR format) from: http://fiserlab.org/SmotifCS/pdb1aabshifts.dat 5. Optional: If structure is known, include a pdb format structure file in the same directory. 1abc/pdb1abc.ent or 1zzz/pdb1zzz.ent 6. Run steps 1 to 6 as given above sequentially. Output from previous steps are often required in subsequent steps. Wait for each step to be completed without errors before going to the next step. 7. To run all steps together use: perl smotifcs.pl --step=all --pdb=1zzz --chain=A --havestructure=0 8. Use multiple-cores or clusters as available, for steps 2, 3 & 4. These are slow and require a lot of computational resources. 8. If structure is known, use --havestructure=1. Else, use --havestructure=0 in all the steps. Results: Top 5 models are stored in the subdirectory (1abc or 1zzz) as: Model.1.pdb, Model.2.pdb, Model.3.pdb, Model.4.pdb & Model.5.pdb MODELING PROTEINS USING A SUPER-SECONDARY STRUCTURE LIBRARY AND NMR CHEMICAL SHIFT INFORMATION SMOTIFCS implement a hybrid protein modeling algorithms that relies on a library of protein super-secondary structure motifs (Smotifs) and easily obtainable NMR experimental data. MODELING ALGORITHM STEPS: Step 1: Run Talos+ Get SS, Phi/PSi, Smotif Information (Single-core task) Usage: perl smotifcs.pl --step=1 --pdb=1zzz --chain=A --havestructure=0 Step 2: Compare experimental CS of Query SmotifS to theoretical CS of library Smotifs (Multi-core task/ cluster job) Usage: perl smotifcs.pl --step=2 --pdb=1zzz --chain=A --havestructure=0 Step 3: Cluster and rank chosen SmotifS (Multi-core task/ cluster job) Usage: perl smotifcs.pl --step=3 --pdb=1zzz --chain=A --havestructure=0 Step 4: Enumerate all possible combinations of Smotifs (about a million models) (Multi-core task/ cluster job) Usage: perl smotifcs.pl --step=4 --pdb=1zzz --chain=A --havestructure=0 Step 5: Rank enumerated structures using a composite energy function (Single-core task) Usage: perl smotifcs.pl --step=5 --pdb=1zzz --chain=A --havestructure=0 Step 6: Run Modeller to generate top 5 complete models (Single-core task) Usage: perl smotifcs.pl --step=6 --pdb=1zzz --chain=A --havestructure=0 Reference: Menon V, Vallat BK, Dybas JM, Fiser A. Modeling proteins using a super-secondary structure library and NMR chemical shift information. Structure, 2013, 21(6):891-9. Authors: Vilas Menon, Brinda Vallat, Joe Dybas, Carlos Madrid and Andras Fiser. SUPPORT AND DOCUMENTATION After installing, you can find documentation for this module with the perldoc command. perldoc SmotifCS You can also look for information at: RT, CPAN's request tracker (report bugs here) http://rt.cpan.org/NoAuth/Bugs.html?Dist=SmotifCS AnnoCPAN, Annotated CPAN documentation http://annocpan.org/dist/SmotifCS CPAN Ratings http://cpanratings.perl.org/d/SmotifCS Search CPAN http://search.cpan.org/dist/SmotifCS/ LICENSE AND COPYRIGHT Copyright (C) 2015 Fiserlab Members This program is free software; you can redistribute it and/or modify it under the terms of the the Artistic License (2.0). You may obtain a copy of the full license at: L Any use, modification, and distribution of the Standard or Modified Versions is governed by this Artistic License. By using, modifying or distributing the Package, you accept this license. Do not use, modify, or distribute the Package, if you do not accept this license. If your Modified Version has been derived from a Modified Version made by someone other than you, you are nevertheless required to ensure that your Modified Version complies with the requirements of this license. This license does not grant you the right to use any trademark, service mark, tradename, or logo of the Copyright Holder. This license includes the non-exclusive, worldwide, free-of-charge patent license to make, have made, use, offer to sell, sell, import and otherwise transfer the Package with respect to any patent claims licensable by the Copyright Holder that are necessarily infringed by the Package. If you institute patent litigation (including a cross-claim or counterclaim) against any party alleging that the Package constitutes direct or contributory patent infringement, then this Artistic License to you shall terminate on the date that such litigation is filed. Disclaimer of Warranty: THE PACKAGE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS "AS IS' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES. THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT ARE DISCLAIMED TO THE EXTENT PERMITTED BY YOUR LOCAL LAW. UNLESS REQUIRED BY LAW, NO COPYRIGHT HOLDER OR CONTRIBUTOR WILL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING IN ANY WAY OUT OF THE USE OF THE PACKAGE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.