BC3203

Managing bioinformatics software with conda

The conda package manager is a convenient way to install and manage collections of bioinformatics software.

This page describes installing and using miniconda from within a cloud RStudio environment.

All of the commands below should be run from the Terminal window. The terminal window can be accessed by choosing Tools -> Terminal within RStudio.

  1. Download the miniconda installer
    wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
    
  2. Run the installer
    bash Miniconda3-latest-Linux-x86_64.sh
    

    You will need to accept the license conditions by typing yes. The installer will then ask several questions; press Enter to keep the default answer for each. At the very end of the installation it will ask:

    Do you wish the installer to initialize Miniconda3 ?

    Answer yes to this question. This adds conda's startup commands to the .bashrc file in your home directory.
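
    If you want to confirm that the installer really did modify .bashrc, a minimal sketch like the one below can help. It assumes a recent installer, which wraps its startup commands in a comment block beginning with ">>> conda initialize >>>"; has_conda_init is a made-up helper name, not a conda command.

    ```shell
    # has_conda_init is a hypothetical helper, not part of conda.
    # It succeeds if the given file contains the marker comment that
    # the installer writes at the top of its startup block.
    has_conda_init() {
        grep -q '>>> conda initialize >>>' "$1" 2>/dev/null
    }

    if has_conda_init "$HOME/.bashrc"; then
        echo "conda startup block found in ~/.bashrc"
    else
        echo "no conda startup block in ~/.bashrc; re-run the installer and answer yes"
    fi
    ```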

  3. Test your installation

    Close your Terminal window and then open a fresh terminal window. After you do this the conda tool should be available to you. Check that it is by typing the following command:
    conda info
    

    If everything is working you can now clean up the installer script:

    rm Miniconda3-latest-Linux-x86_64.sh
    
  4. Set up channels to install bioinformatics software from the bioconda project

    Set up the channels by running the following commands. You only need to do this once, but make sure you run them in the same order as shown below:

    conda config --add channels defaults
    conda config --add channels conda-forge
    conda config --add channels bioconda
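
    The order matters because each --add places its channel at the top of the list, so the channel added last ends up with the highest priority. The sketch below only imitates that behaviour with an ordinary shell variable (add_channel is a made-up helper, not a conda command) and then, if conda is available, prints the real list for comparison.

    ```shell
    # add_channel is a hypothetical helper that mimics how
    # "conda config --add channels NAME" prepends NAME to the list.
    channels=""
    add_channel() {
        channels="$1${channels:+ $channels}"
    }

    add_channel defaults
    add_channel conda-forge
    add_channel bioconda

    # Highest priority first.
    echo "simulated order: $channels"

    # Show the real configuration only when conda is installed.
    if command -v conda >/dev/null 2>&1; then
        conda config --show channels
    fi
    ```

    With the three commands in the order given, the simulated list reads bioconda, conda-forge, defaults.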
    
  5. Install software

    For example, two pieces of software required to create phylogenetic trees are:

    • mafft : A multiple sequence aligner
    • iqtree : A maximum likelihood phylogenetic inference program

    These can both be installed with a single command:

    conda install mafft iqtree
    

    As you progress through tutorials and assignments you may be prompted to install more software. Modify the command above, replacing the package names with the ones you need.
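
    After an install it can be useful to check that the new programs are actually reachable from your terminal. The following is a minimal sketch, assuming each package provides a program with the same name as the package; report_tools is a made-up helper name.

    ```shell
    # report_tools is a hypothetical helper, not part of conda.
    # For each program name given, print where it lives on the PATH,
    # or say that it could not be found.
    report_tools() {
        for tool in "$@"; do
            if command -v "$tool" >/dev/null 2>&1; then
                echo "$tool: $(command -v "$tool")"
            else
                echo "$tool: not found"
            fi
        done
    }

    report_tools mafft iqtree
    ```

    The same function can be reused for any software you install later by changing the names passed to it.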

  6. Running conda from within RMarkdown code chunks

    Bash code run from RMarkdown code chunks does not start in the same way as your Terminal, so it will not find conda unless you tell it where to look. To make conda-installed software available from within RMarkdown code chunks you will need to do the following.

    First, run the line of code below to create a .Renviron file that adds the miniconda bin directory to the PATH environment variable. Note that this command overwrites any existing .Renviron file in your home directory.

    echo "PATH=\"$HOME/miniconda3/bin:$PATH\"" > ~/.Renviron
    

    You only need to do this once. After you have done it you will need to restart R (Session -> Restart R) for it to take effect.
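
    If you already keep other settings in .Renviron, you may prefer a variant that appends to the file instead of replacing it. The sketch below is one way to do that, assuming miniconda was installed in the default location ($HOME/miniconda3); add_conda_path is a made-up helper name.

    ```shell
    # add_conda_path is a hypothetical helper, not part of conda or R.
    # It appends the PATH line to the given file only when no line
    # mentioning miniconda3/bin is present, so existing settings are
    # preserved and running it twice does not add a duplicate.
    add_conda_path() {
        if ! grep -q 'miniconda3/bin' "$1" 2>/dev/null; then
            echo "PATH=\"$HOME/miniconda3/bin:$PATH\"" >> "$1"
        fi
    }

    add_conda_path "$HOME/.Renviron"
    ```

    As with the original command, R must be restarted before the change is picked up.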