
Tests #43 (Open)

wants to merge 18 commits into master
Conversation

@davidt0x

Set up pytest tests using testbook for all notebooks (see the sketch below).
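A minimal sketch of what one of these tests could look like; the notebook path and test name here are hypothetical, not the ones in this PR:

    # test_notebooks.py -- run with pytest
    from testbook import testbook

    # execute=True runs every cell up front; any uncaught exception
    # in a cell fails the test
    @testbook('notebooks/isc/isc.ipynb', execute=True)
    def test_isc_notebook(tb):
        # reaching this point means every cell executed without raising
        pass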

Any notebook that requires datasets to run should place the code for downloading/extracting that data in download_data.sh. This allows systematic downloading and caching of datasets for testing.
The path construction for the dataset was prepending the user's $HOME directory. I removed this to make it more portable and consistent with the other notebooks. This makes it easier to track whether the dataset has already been downloaded, so we can cache things on della.
The htfa notebook expects the dataset to be in /data, which is where it resides in the Docker image. I extracted the download commands for the data files into download_data.sh and modified them to extract the data into a local data/ folder. I then modified the notebook to look for the data in both of these locations and raise an exception otherwise (see the sketch below).
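Roughly, the lookup logic is the following (a sketch only, not the notebook's exact code):

    import os

    # Prefer the Docker image location (/data), then fall back to the
    # local data/ folder populated by download_data.sh.
    candidates = ['/data', 'data']
    data_dir = next((d for d in candidates if os.path.isdir(d)), None)
    if data_dir is None:
        raise FileNotFoundError(
            'Dataset not found in /data or ./data; run download_data.sh first.')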
The rt-cloud repo is a dependency of the notebook. It is not pip-installable, so I added it as a git submodule (see below). I have also extracted a dependency list from rt-cloud/environment.yml to include with the other notebook dependencies.
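For reference, adding a repo as a submodule looks roughly like this; the URL and target path are assumptions, not necessarily what this PR uses:

    # register the submodule and record it in .gitmodules
    git submodule add https://github.com/brainiak/rt-cloud.git rt-cloud
    # after cloning this repo, fetch the submodule contents
    git submodule update --init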
…into tests

Conflicts:
	notebooks/real-time/rtcloud_notebook.ipynb
One final issue to fix is that the last cell in the notebook executes a Python script on the command line. Even when this script fails, the cell in the notebook does not fail. Importing the main function and running it directly ensures that tests fail if there are errors in sample.py, because a command-line execution that fails inside a cell is not considered a cell failure by testbook (is this a bug?). The change is sketched below.
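The before/after shape of that change, assuming sample.py exposes a main() that takes no arguments (the signature is an assumption):

    # Before: a failing shell command in a cell is not a cell failure
    # under testbook
    # !python sample.py

    # After: an uncaught exception in main() fails the cell, and the test
    from sample import main
    main()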
@review-notebook-app

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

@mihaic (Member) left a comment
Great work, @davidt0x! I have a couple of suggestions.

@@ -92,8 +92,7 @@
 "source": [
 "# Download and extract example data from Zenodo\n",
 "!wget https://zenodo.org/record/4300904/files/brainiak-aperture-isc-data.tgz\n",
-"!tar -xzf brainiak-aperture-isc-data.tgz\n",
-"!rm brainiak-aperture-isc-data.tgz"
+"!tar -xzf brainiak-aperture-isc-data.tgz\n"
How about adding --skip-old-files to minimize I/O? Same goes for any other extraction code in notebooks.
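Applied to the extraction cell above, that would be (a sketch; --skip-old-files is a standard GNU tar option that leaves already-extracted files alone):

    !tar --skip-old-files -xzf brainiak-aperture-isc-data.tgz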

@@ -92,8 +92,7 @@
 "source": [
 "# Download and extract example data from Zenodo\n",
 "!wget https://zenodo.org/record/4300904/files/brainiak-aperture-isc-data.tgz\n",
Should we have --no-clobber so the file is not downloaded again? Same goes for any other download code in notebooks.
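For the download cell above, that would look like (a sketch; --no-clobber is a standard wget flag that skips the download if the file already exists):

    !wget --no-clobber https://zenodo.org/record/4300904/files/brainiak-aperture-isc-data.tgz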

requirements.txt Outdated
@@ -0,0 +1,32 @@
+testbook
I think it is easier for notebook authors in the future if each notebook has its own requirements file.
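Concretely, the layout could look something like this (directory names are illustrative, not necessarily the repo's actual ones):

    notebooks/isc/requirements.txt
    notebooks/htfa/requirements.txt
    notebooks/real-time/requirements.txt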

@@ -111,7 +111,7 @@
 "source": [
 "*1.2 Load participant data*<a id=\"load_ppt\"></a>\n",
 "\n",
-"Any 4 dimensional fMRI data that is readible by nibabel can be used as input to this pipeline. For this example, data is taken from the open access repository DataSpace: http://arks.princeton.edu/ark:/88435/dsp01dn39x4181. This file is unzipped and placed in the home directory with the name Corr_MVPA "
+"Any 4 dimensional fMRI data that is readible by nibabel can be used as input to this pipeline. For this example, data is taken from the open access repository DataSpace: http://arks.princeton.edu/ark:/88435/dsp01dn39x4181. This file is unzipped and placed same directory as this notebook with the name Corr_MVPA "
@manojneuro (Collaborator) commented on Sep 30, 2021
@CameronTEllis FYI: please note a minor update to the data directory to make it compatible for automated testing.

@davidt0x (Author)

Ok, I think I have addressed your comments @mihaic. Can you take a look?

"Clear any pre-existing plot for this run using 'clearRunPlot(runNum)'\n",
"###################################################################################\n",
"/tmp/notebook-simdata/labels.npy\n",
"Collected training data for TR 0\n",
A Contributor commented

Hi David, can you clear the "outputs" created from running the notebook before checking in? I think it's in the Jupyter menu: Cell -> All Output -> Clear.

@davidt0x (Author)

Done
