Jupyterhub

Accessing Jupyterhub

The new Jupyterhub is available at https://jupyterhub.kennesaw.edu. Log in with your netID and password (not the full email).

Set up conda

With the new Jupyterhub server comes the ability to manage your own conda environments.

Once logged in, click "Terminal". This opens a command prompt in your home folder. Initialize conda with the command below. It adds lines to your .bashrc file so that you automatically enter the system-wide base conda environment when you log in.

/opt/jupyterhub/miniforge/condabin/conda init

You can either exit the terminal and re-open it, or "dot source" using the following command to load the changes you just made.

. ~/.bashrc

You should see an environment in parentheses before your username.

Before, it will look like

[cdow1@jupyterhub ~]$

Afterward, it will look like

(base) [cdow1@jupyterhub ~]$

Creating conda environments

To create an environment, first set up conda as described in the previous section.

Once initialized, you can create environments. You should always create an environment outside the base environment, ideally one for each new project. If you want to use it inside Jupyter notebooks, make sure to install ipykernel.

conda create --name python310 python=3.10 ipykernel

Activate the new environment in your current shell

conda activate python310

If you always want to activate the same env when you log in, add the command to your .bashrc

echo "conda activate python310" >> ~/.bashrc
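Running the echo command more than once leaves duplicate lines in .bashrc. The following is a minimal sketch of an append that only happens when the line is missing; ensure_line is a hypothetical helper written for this page, not part of conda.

```python
from pathlib import Path


def ensure_line(path, line):
    """Append line to the file at path unless an identical line is already there.

    Returns True if the file was changed, False if the line was present.
    """
    path = Path(path).expanduser()
    text = path.read_text() if path.exists() else ""
    if line in text.splitlines():
        return False
    with path.open("a") as fh:
        # Start on a fresh line if the file does not end with a newline.
        if text and not text.endswith("\n"):
            fh.write("\n")
        fh.write(line + "\n")
    return True
```

For example, ensure_line("~/.bashrc", "conda activate python310") can be called repeatedly and changes the file at most once.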

To install a kernel for use in notebooks, run this command, substituting in your environment name in the file system path, name, and display-name.

~/.conda/envs/python310/bin/python -m ipykernel install --user --name 'python310' --display-name "My Python 3.10"
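The environment name appears in three places in that command. As a rough sketch, the command can be assembled from a single name so the three stay consistent. This assumes your environments live under ~/.conda/envs, as in the command above; kernel_install_command is a hypothetical helper, not a conda or Jupyter function.

```python
from pathlib import Path
import shlex


def kernel_install_command(env_name, display_name, envs_dir="~/.conda/envs"):
    """Build the ipykernel install command for one conda environment.

    Assumes the environment lives in envs_dir (the location used in the
    command above); adjust it if your environments are stored elsewhere.
    """
    python = Path(envs_dir).expanduser() / env_name / "bin" / "python"
    return [
        str(python), "-m", "ipykernel", "install", "--user",
        "--name", env_name,
        "--display-name", display_name,
    ]


cmd = kernel_install_command("python310", "My Python 3.10")
# Print a copy-pasteable shell command; shlex.join quotes the display name.
print(shlex.join(cmd))
```

The printed line can be pasted into the terminal, or the list can be passed to subprocess.run if you prefer to stay in Python.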

Tensorflow

Installing tensorflow through conda can be unreliable, so first create your environment, activate it, and then install tensorflow using pip. Check Tensorflow's website to make sure you have a supported version of Python. The tensorflow[and-cuda] extra installs the necessary CUDA and cuDNN dependencies. The last line verifies that Tensorflow can see the GPU device.

conda create -y -n tf217_py312 python=3.12 ipykernel
conda activate tf217_py312
pip install 'tensorflow[and-cuda]'
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

Pytorch

Check Pytorch's website for the commands to install using conda. Include numpy when you create the environment; without it, the dependency resolution may have issues.

conda create -y -n pytorch python=3.12 ipykernel numpy
conda activate pytorch
conda install -y pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

Installing packages in conda environments

Packages can be installed in your conda environment in one of several ways.

First, you can install packages as part of the conda env creation command. Just list all of the packages you want to install after python and ipykernel.

conda create --name python310 python=3.10 ipykernel <package1> <package2> ...

Or, if you already have a conda environment, you can install packages by opening a terminal session, activating your chosen conda environment, and installing the packages. Before installing anything, check that the command prompt shows the name of the environment you intended to activate.

conda activate python310
conda install <package1> <package2> ...

Or you can use magic commands after you’ve created the conda env, installed the kernel, and selected the kernel for your current notebook. In a notebook cell:

%conda install <package1> <package2> ...

Sharing conda environments

To share an environment, export it from conda to a spec file. The other person can then create the same environment from that file.

To export a conda environment, first activate your chosen environment, then export it. This command overwrites my_environment.yml if it already exists in the current directory (your home folder, unless you have changed directories).

conda activate my_environment
conda env export > my_environment.yml

You can copy the file into a shared location (for group work), download and send it via other means, or ask me to copy it between home folders.

To import an environment

conda env create -f my_environment.yml

See the above instructions for activating the new environment and installing it as a Jupyter notebook kernel.

Managing Jupyter kernels

Finding kernels

When you install a kernelspec, it goes to a location in your home directory. Some kernelspecs are installed system-wide. To get a list of all kernelspecs available to your account, use this command from the terminal.

jupyter kernelspec list

The output will look like

(base) [cdow1@jupyterhub ~]$ jupyter kernelspec list
Available kernels:
  cdow1_python310    /gpfs/user_home/os_home_dirs/cdow1/.local/share/jupyter/kernels/cdow1_python310
  pytorch_cu121      /gpfs/user_home/os_home_dirs/cdow1/.local/share/jupyter/kernels/pytorch_cu121
  tf217_py312        /gpfs/user_home/os_home_dirs/cdow1/.local/share/jupyter/kernels/tf217_py312
  python3            /opt/jupyterhub/anaconda/share/jupyter/kernels/python3
  anaconda2024.02    /usr/local/share/jupyter/kernels/anaconda2024.02
  old_jupyterhub     /usr/local/share/jupyter/kernels/old_jupyterhub
  python39           /usr/local/share/jupyter/kernels/python39
  r4.3               /usr/local/share/jupyter/kernels/r4.3

System-wide kernels are in /usr/local/share/jupyter/kernels/ and user kernels are in /gpfs/user_home/os_home_dirs/$netid/.local/share/jupyter/kernels/
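To separate your own kernels from the system-wide ones in a script, the listing can be parsed by path. This is a minimal sketch that assumes the two-column text format shown above and kernel paths without spaces; both function names are made up for this example.

```python
from pathlib import PurePosixPath


def parse_kernelspec_list(output):
    """Map kernel name -> install path from `jupyter kernelspec list` text."""
    kernels = {}
    for line in output.splitlines():
        parts = line.split()
        # Kernel rows have exactly two fields and the second is an absolute path.
        if len(parts) == 2 and parts[1].startswith("/"):
            kernels[parts[0]] = parts[1]
    return kernels


def user_kernels(kernels, home):
    """Return the names of kernels installed under the given home directory."""
    home = PurePosixPath(home)
    return sorted(
        name for name, path in kernels.items()
        if home in PurePosixPath(path).parents
    )


sample = """Available kernels:
  cdow1_python310    /gpfs/user_home/os_home_dirs/cdow1/.local/share/jupyter/kernels/cdow1_python310
  python3            /opt/jupyterhub/anaconda/share/jupyter/kernels/python3
  r4.3               /usr/local/share/jupyter/kernels/r4.3
"""
kernels = parse_kernelspec_list(sample)
print(user_kernels(kernels, "/gpfs/user_home/os_home_dirs/cdow1"))
```

In practice the sample text would come from running the command, for example with subprocess.run and capture_output=True.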

Removing kernels

If you delete an old conda environment, it doesn't automatically delete the jupyter kernelspec. To also delete the kernel:

conda env remove -n pytorch_cu121
jupyter kernelspec remove pytorch_cu121

Confirm with Y.
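If you are not sure which kernels were left behind by deleted environments, one approach is to check whether the interpreter named in each kernel.json still exists. The sketch below assumes the kernel.json layout that ipykernel writes, where the first entry of "argv" is the Python interpreter; orphaned_kernels is a hypothetical helper.

```python
import json
from pathlib import Path


def orphaned_kernels(kernels_dir):
    """List kernel names whose kernel.json points at a missing interpreter.

    Only absolute interpreter paths are checked; a bare "python" is resolved
    through PATH at launch time and cannot be judged from the file alone.
    """
    orphans = []
    for spec in sorted(Path(kernels_dir).expanduser().glob("*/kernel.json")):
        argv = json.loads(spec.read_text()).get("argv", [])
        if argv and Path(argv[0]).is_absolute() and not Path(argv[0]).exists():
            orphans.append(spec.parent.name)
    return orphans
```

Calling orphaned_kernels("~/.local/share/jupyter/kernels") returns candidate names to pass to jupyter kernelspec remove. Review the list before removing anything.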

Convert Anaconda to miniforge

Back up existing environments

You may want to export your old envs before moving to miniforge. Replace my_env with the name of your conda environment. (Note: you can list all of your environments with conda env list).

The --from-history flag builds the export file from the packages you explicitly requested when you created and modified the env, instead of exporting the env "as-is" with every dependency. The resulting file omits the packages' build information.

Note that if you use this file to build a new env after switching to miniforge, you should first remove the line containing Anaconda's "defaults" channel.

conda env export --from-history --name my_env > ~/my_env_backup.yml
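Removing the "defaults" line can be done in any text editor. For several backup files, a small script may be easier. This is a line-based sketch using only the standard library, assuming the usual layout of a conda export file with a top-level "channels:" list; drop_defaults_channel is a hypothetical helper, not a conda command.

```python
def drop_defaults_channel(yaml_text):
    """Return the environment file text without the "defaults" channel entry.

    A line-based sketch: it only removes a list item that is exactly
    "- defaults" inside the top-level "channels:" block.
    """
    kept = []
    in_channels = False
    for line in yaml_text.splitlines():
        stripped = line.strip()
        if line and not line[0].isspace() and not stripped.startswith("-"):
            # A new top-level key starts; track whether it is "channels:".
            in_channels = stripped == "channels:"
        if in_channels and stripped == "- defaults":
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


example = "name: my_env\nchannels:\n  - defaults\n  - conda-forge\ndependencies:\n  - python=3.10\n"
print(drop_defaults_channel(example))
```

To apply it to a backup, read ~/my_env_backup.yml, pass the text through the function, and write the result to a new file so the original stays untouched.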

Initialize miniforge

Log into Jupyterhub. Open a terminal. If you have already configured conda, you should see an environment name next to your prompt.

(base) [cdow1@jupyterhub ~]$

First remove the old conda. This command clears your .bashrc file of any modifications from conda.

conda init --reverse

Then deactivate the old conda environment

conda deactivate

Next, use miniforge to modify your .bashrc. Note that miniforge replaces anaconda in the path.

/opt/jupyterhub/miniforge/condabin/conda init

Finally, source the new environment.

. ~/.bashrc

You should once again see the base environment next to the command prompt. The commands for managing conda envs are the same as before, but they now use the new package resolver.

Installing packages inside a notebook

To install packages inside a notebook instead of using the terminal, use the %conda or %pip magic commands rather than !conda and !pip.

!conda and !pip instruct the notebook to run the command as a shell command, and that shell may not include crucial environment information. You'll probably see an error like /bin/bash: pip: command not found.

%conda or %pip will use the environment of whichever kernel you're running in the current notebook.
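To see the difference for yourself, you can compare the interpreter your kernel runs with the pip that a shell command would find. This is only a diagnostic sketch; shell_pip_matches_kernel is a made-up name, and the comparison can report a mismatch when symlinks are involved.

```python
import shutil
import sys
from pathlib import Path


def shell_pip_matches_kernel():
    """Return True if the pip found on PATH sits next to the kernel's Python."""
    shell_pip = shutil.which("pip")           # what a shell command would run, if anything
    kernel_bin = Path(sys.executable).parent  # directory of the running interpreter
    if shell_pip is None:
        return False
    return Path(shell_pip).parent == kernel_bin


print("kernel python:", sys.executable)
print("pip on PATH:  ", shutil.which("pip"))
print("same environment:", shell_pip_matches_kernel())
```

If the last line prints False, a shell-style install would either fail or land in a different environment than the one your notebook is using.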

Alternate language environments

In addition to Python, Conda also supports programming in R, Julia, Java, C/C++, etc.

To create an environment in another language, you just have to include the Jupyter kernel package for that language. You can find a list of available kernels on the Jupyter project's Github.

R

There is an R kernel installed system-wide that contains almost all of the packages installed on the Datascience RStudio server.

saspy (SAS 9.4 streaming)

Preparations

Note: The initial preparations have been complete for all users on the grid since October 2020. You can easily test by trying to SSH into the appropriate server from a Jupyterhub terminal. If that works, skip to "Using SASPy".

To connect to the SAS 9.4 grid using Jupyterhub, you must first generate an SSH private/public key pair and copy the public key to the remote SAS compute node. Open a terminal by clicking the + sign in the left menu in Jupyterlab, then clicking the Terminal button in the Other section. Once a session is open, generate your key.

ssh-keygen -t rsa -b 4096

After running the command, press Enter at each of the 3 prompts to save the key in the default location without a passphrase. Then copy the public key to the remote compute node.

ssh-copy-id dssasprdcmp06

When asked if you're sure you want to continue connecting, type yes and press Enter.

You can test whether it worked by using SSH. You should connect without a password prompt. If asked to store the host in the list of known hosts, type yes.

ssh dssasprdcmp06.kennesaw.edu
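The same check can be scripted so that it fails instead of waiting at a password prompt. The sketch below relies on the standard OpenSSH options BatchMode and ConnectTimeout; the function names are made up for this example, and the host must already be in your known hosts list or the check will fail.

```python
import subprocess


def ssh_check_command(host):
    """Build an ssh command that fails instead of prompting for a password."""
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=10", host, "true"]


def key_login_works(host):
    """Return True if `ssh host true` succeeds without any interactive prompt."""
    result = subprocess.run(ssh_check_command(host), capture_output=True)
    return result.returncode == 0
```

For example, key_login_works("dssasprdcmp06.kennesaw.edu") should return True once your key has been copied to the compute node.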

Using SASPy

Once you can successfully SSH using the Jupyterhub terminal, you can test in a Jupyter notebook.

import saspy
sas = saspy.SASsession(cfgname='ssh')

If successful, you will see something like the following:

SAS Connection established. Subprocess id is 27339

You can test it with the following code to open the SASHELP.CARS table and list the column information:

cars_sas = sas.sasdata('CARS', 'SASHELP')
cars_sas.columnInfo()

Remember to close the connection when you're done.

sas.endsas()

For more information on how to use saspy, check the documentation: https://sassoftware.github.io/saspy/getting-started.html

SWAT (SAS Viya Streaming)

Technical details

The Viya SSL certs need to be loaded. This is done in /etc/profile.d/sysadmin.sh and /etc/jupyterhub/jupyterhub_config.py with $CAS_CLIENT_SSL_CA_LIST. (Note that this is the same trustedcerts.pem that SAS 9.4 uses to connect to Viya.)

To do this within a notebook (Python):

import os
os.environ["CAS_CLIENT_SSL_CA_LIST"] = "/gpfs/sas/compute/sashome/SASSecurityCertificateFramework/1.1/cacerts/trustedcerts.pem"
print(os.environ["CAS_CLIENT_SSL_CA_LIST"])

Connecting to Viya from Python requires the SWAT package. There are two methods of connecting: direct binary access or the REST API. Binary access is faster since it doesn't have to translate from JSON to a SAS data format, but REST code is more portable. (Note: direct access to the CAS server is only possible from on-campus resources such as Jupyterhub. If you plan to copy the code to a different python environment, plan to use REST.)

Preparations

First, encode your password in SAS. Log into SAS Viya and start a new program. Use the following SAS code, substituting your netID password, then copy/paste the output into your .authinfo file.

proc pwencode in='your-password' method=sas004;  
run;

Note: Do not save this SAS program, because it contains your password in plaintext.

The output will look something like this. Do not share this with anyone. Make sure to include the whole string, including the portion in braces {SAS###}.

{SAS004}25E96978CF3B5CB4C8F2801B2932C8B23C5C20CB64C36BC5

Open a terminal window in Jupyterhub. Edit your .authinfo file and replace the password with the newly encoded one from the previous step.

nano .authinfo

Choose one of these methods (default is Binary):

# Binary
host dssasviyaprdctl02.kennesaw.edu port 5571 user <netID> password <password>

# REST
host dssasviya.kennesaw.edu port 443 user <netID> password <password>
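If you would rather not edit the file by hand, the entry can be written from Python. This is a minimal sketch that writes a single-entry file in the format shown above and replaces anything already in it. write_authinfo is a hypothetical helper, and restricting the file to owner-only access (mode 600) is a precaution assumed here, not a requirement stated on this page.

```python
import os
from pathlib import Path


def write_authinfo(path, host, port, user, encoded_password):
    """Write a single-entry authinfo file readable only by its owner."""
    path = Path(path).expanduser()
    line = f"host {host} port {port} user {user} password {encoded_password}\n"
    # Create the file with mode 0600 so the password is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        fh.write(line)
    os.chmod(path, 0o600)  # also tighten a file that already existed
    return line
```

For the binary method you would call write_authinfo("~/.authinfo", "dssasviyaprdctl02.kennesaw.edu", 5571, "<netID>", "<password>"), substituting your netID and the encoded string from the previous step.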

Using SWAT

Once the file is in place, you can create the connection by passing the path to your .authinfo file.

import swat
conn = swat.CAS('dssasviyaprdctl02.kennesaw.edu', 5571, authinfo='/gpfs/user_home/os_home_dirs/<netid>/.authinfo')

# Test the connection by getting the server status
out = conn.serverstatus()

Close your session when you're done.

conn.close()
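To make sure the session is closed even if a call fails partway through, the connection can be wrapped with contextlib.closing from the standard library. This is a sketch, not something SWAT requires; server_status is a hypothetical helper that accepts any zero-argument callable returning an object with serverstatus() and close() methods.

```python
from contextlib import closing


def server_status(connect):
    """Open a connection, fetch the server status, and always close the session.

    `connect` is any zero-argument callable returning an object with
    serverstatus() and close() methods, such as a swat.CAS connection.
    """
    with closing(connect()) as conn:
        return conn.serverstatus()
```

With SWAT you would pass something like lambda: swat.CAS('dssasviyaprdctl02.kennesaw.edu', 5571, authinfo='/gpfs/user_home/os_home_dirs/<netid>/.authinfo') as the connect argument.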

For more information, check out the documentation on SWAT.