There are many ways to program. One of the most productive paradigms is interactive: you use a REPL (read-evaluate-print loop) to write and test your code as you code, and then copy the tested code to a file.
The REPL method, which originated in LISP development environments, is well suited to Python programming, since Python has always had good interactive development tools. The downside of this style of programming is that once you’ve written the code, you have to separately extract the tests, write the documentation, save all of that to a repository, build your package, and publish both the package and the documentation.
Donald Knuth’s literate programming paradigm prescribes writing the documentation and code in the same document, with documentation intended for humans interspersed with code intended for the computer. Literate programming has been widely used for scientific programming and data science, often in notebook environments such as Jupyter Notebooks, Jupyter Lab, Visual Studio Code, and PyCharm. One problem with notebooks is that they sometimes don’t work well with repositories because they hold too much information, including metadata that doesn’t matter to anyone. That creates a problem when there are merge conflicts, since notebooks are cell-oriented and source code repositories like Git are line-oriented.
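To make the metadata problem concrete, here is a minimal Python sketch. The notebook dictionary is hand-written for illustration, and strip_volatile_fields is a hypothetical helper, not part of any notebook tool. It only shows the kind of fields (execution counts, per-cell metadata) that change from run to run and tend to produce noisy diffs.

```python
import json

# A tiny hand-written notebook in the .ipynb JSON layout (nbformat 4).
raw_notebook = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 7,
            "metadata": {"collapsed": False},
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["Hello\n"]}
            ],
            "source": ["print('Hello')"],
        }
    ],
    "metadata": {"kernelspec": {"name": "python3", "display_name": "Python 3"}},
    "nbformat": 4,
    "nbformat_minor": 5,
}


def strip_volatile_fields(notebook):
    """Return a copy without execution counts and cell metadata.

    Hypothetical helper for illustration only; real cleaning tools
    make their own choices about what to keep.
    """
    cleaned = json.loads(json.dumps(notebook))  # cheap deep copy via JSON
    for cell in cleaned["cells"]:
        cell["metadata"] = {}
        if cell["cell_type"] == "code":
            cell["execution_count"] = None
    return cleaned


cleaned = strip_volatile_fields(raw_notebook)
print(json.dumps(cleaned, indent=1))
```

After stripping, rerunning a cell no longer changes the saved file unless the source or output actually changed, which is what keeps line-oriented diffs readable.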
Jeremy Howard and Hamel Husain of fast.ai, along with about two dozen minor contributors, have created nbdev, a set of command-line utilities that not only allow Jupyter Notebooks to work well with Git, but also enable a highly productive, interactive literate programming style. In addition to producing correct Python code quickly, you can produce documentation and tests at the same time, save everything to Git without fear of corruption from merge conflicts, and publish to PyPI and Conda with just a few commands. While there is a learning curve for these utilities, the investment pays dividends: you can finish your development project in about the time it would normally take to simply write the code.
As you can see in the following diagram, nbdev works with Jupyter Notebooks, GitHub, Quarto, Anaconda, and PyPI. To summarize what each piece of this system does:
- You can generate documentation using Quarto and host it on GitHub Pages. Docs support LaTeX, are searchable, and are automatically hyperlinked.
- You can publish packages to PyPI and Conda, with tools that simplify package releases. Python best practices are followed automatically; for example, only exported objects are included in __all__.
- There is two-way synchronization between the notebooks and the plain text source code, allowing you to use your IDE to navigate your code or make quick edits.
- Tests written as ordinary notebook cells are executed in parallel with a single command.
- There is seamless integration with GitHub Actions that runs your tests and rebuilds your docs.
- Notebooks are Git-friendly: Jupyter/Git hooks clean up unwanted metadata and render merge conflicts in a human-readable format.
The nbdev software works with Jupyter Notebooks, GitHub, Quarto, Anaconda, and PyPI to produce a productive, interactive environment for Python development.
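The point about __all__ is easier to see with a small example. The following is a minimal sketch in plain Python, not nbdev output: the module name hello_core and both functions are hypothetical, and the module is built in memory so the example is self-contained. It shows that a star import picks up only the names listed in __all__.

```python
import sys
import types

# Hypothetical module source, written by hand for this example.
module_source = '''
__all__ = ['say_hello']

def say_hello(to):
    "Say hello to somebody."
    return f"Hello {to}!"

def build_greeting(to):
    # Public-looking helper that is deliberately left out of __all__.
    return say_hello(to).upper()
'''

# Build the module in memory and register it so it can be imported by name.
module = types.ModuleType("hello_core")
exec(module_source, module.__dict__)
sys.modules["hello_core"] = module

# Run a star import into a fresh namespace and see which names arrive.
namespace = {}
exec("from hello_core import *", namespace)
star_imported = sorted(name for name in namespace if not name.startswith("__"))
print(star_imported)  # ['say_hello']
```

The helper build_greeting is still reachable as hello_core.build_greeting; it is simply not part of the advertised public interface.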
nbdev installation
nbdev works on macOS, Linux, and most Unix-style operating systems. It requires a recent version of Python 3; I used Python 3.9.6 on macOS Ventura, running on an M1 MacBook Pro. nbdev works on Windows under WSL (Windows Subsystem for Linux), but not under cmd or PowerShell. You can install nbdev with pip or conda. I used pip:
pip install nbdev
That installed 29 command-line utilities, which you can list using nbdev_help:
% nbdev_help
nbdev_bump_version Increment version in settings.ini by one
nbdev_changelog Create a CHANGELOG.md file from closed and labeled GitHub issues
nbdev_clean Clean all notebooks in `fname` to avoid merge conflicts
nbdev_conda Create a `meta.yaml` file ready to be built into a package, and optionally build and upload it
nbdev_create_config Create a config file.
nbdev_docs Create Quarto docs and README.md
nbdev_export Export notebooks in `path` to Python modules
nbdev_filter A notebook filter for Quarto
nbdev_fix Create working notebook from conflicted notebook `nbname`
nbdev_help Show help for all console scripts
nbdev_install Install Quarto and the current library
nbdev_install_hooks Install Jupyter and git hooks to automatically clean, trust, and fix merge conflicts in notebooks
nbdev_install_quarto Install latest Quarto on macOS or Linux, prints instructions for Windows
nbdev_merge Git merge driver for notebooks
nbdev_migrate Convert all markdown and notebook files in `path` from v1 to v2
nbdev_new Create an nbdev project.
nbdev_prepare Export, test, and clean notebooks, and render README if needed
nbdev_preview Preview docs locally
nbdev_proc_nbs Process notebooks in `path` for docs rendering
nbdev_pypi Create and upload Python package to PyPI
nbdev_readme None
nbdev_release_both Release both conda and PyPI packages
nbdev_release_gh Calls `nbdev_changelog`, lets you edit the result, then pushes to git and calls `nbdev_release_git`
nbdev_release_git Tag and create a release in GitHub for the current version
nbdev_sidebar Create sidebar.yml
nbdev_test Test in parallel notebooks matching `path`, passing along `flags`
nbdev_trust Trust notebooks matching `fname`
nbdev_update Propagate change in modules matching `fname` to notebooks that created them
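If nbdev_help is not found after installation, the console scripts may not be on your PATH. The following is a small sketch, using only the Python standard library, that checks where a handful of the commands listed above resolve. The locate_commands helper and the choice of commands are illustrative assumptions.

```python
import shutil

# A few of the command names from the nbdev_help listing above.
COMMANDS = ["nbdev_help", "nbdev_new", "nbdev_export", "nbdev_prepare", "nbdev_preview"]


def locate_commands(names):
    """Map each command name to its full path on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


for name, path in locate_commands(COMMANDS).items():
    print(f"{name:<16} {path or 'not found on PATH'}")
```

If everything reports as not found, the usual suspects are an inactive virtual environment or a user-level install directory that is missing from PATH.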
The developers of nbdev suggest either watching this 90-minute video or following this roughly one-hour written tutorial. I did both, and also read more of the documentation and some of the source code. I learned different material from each, so I suggest watching the video first and then doing the tutorial. The video gave me a clear enough idea of the package’s usefulness to motivate me to go through the tutorial.
Start the nbdev tutorial
The tutorial starts with installing Jupyter Notebook:
pip install notebook
And then launching Jupyter:
jupyter notebook
The installation continues in the notebook, first by creating a new terminal and then by using that terminal to install nbdev. You can skip that installation if you already did it in a shell, as I did.
You can then use nbdev to install Quarto:
nbdev_install_quarto
That requires root access, so you’ll need to enter your password. You can read the Quarto source code or docs to verify that it’s safe.
At this point, you need to go to GitHub and create an empty repository (repo). I followed the tutorial and called mine nbdev-hello-world, and added a fairly generic description. Create the repo; see the instructions if you need them. Then clone the repo to your local machine. The instructions suggest using the Git command line on your machine, but I prefer GitHub Desktop, which also worked fine.
In any case, cd into your repo in your terminal. It doesn’t matter whether you use a terminal on your desktop or in your notebook. Now run nbdev_new, which will create a bunch of files in your repo. Then commit and push your additions to GitHub:
git add .
git commit -m'Initial commit'
git push
Go back to your repository on GitHub and open the Actions tab. You will see something like this:
GitHub Actions after the initial commit. There are two: a continuous integration (CI) workflow for cleaning up your code, and a Deployment to GitHub Pages workflow for publishing your documentation.
Now enable GitHub Pages, following the optional instructions. It should look like this:
Enabling GitHub Pages.
Open the Actions tab again and you will see a third workflow:
There are now three workflows in your repository. The new one generates web documentation.
Now open your generated website, at https://{user}.github.io/{repo}. Mine is at https://meheller.github.io/nbdev-hello-world/. You can copy that and change meheller to your own GitHub identifier and see something similar to the following:
Web documentation home page for the package.
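The URL pattern above is simple enough to express as a one-line helper. This is only a sketch of the default project-site pattern quoted in the text; pages_url is a hypothetical function name, and it does not account for custom domains.

```python
def pages_url(user, repo):
    """Build the default GitHub Pages URL for a project site."""
    return f"https://{user}.github.io/{repo}/"


print(pages_url("meheller", "nbdev-hello-world"))
# https://meheller.github.io/nbdev-hello-world/
```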
Continue the nbdev tutorial
Now we’re finally getting to the good stuff. You’ll install Git hooks to automatically clean notebooks when you commit them,
nbdev_install_hooks
export your library,
nbdev_export
install your package,
pip install -e '.[dev]'
preview your docs,
nbdev_preview
(and click the link) and finally start editing your Python notebook:
jupyter notebook
(and click on nbs, and click on 00_core.ipynb).
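For orientation, here is a minimal sketch of the kind of cells you end up with in 00_core.ipynb. The function name say_hello comes from the commit message used later in this walkthrough, and the #| export comment is the directive nbdev uses to mark a cell for export to the module. The function body and the test value are assumptions for illustration, and the test uses a plain assert to stay self-contained.

```python
#| export
def say_hello(to):
    "Say hello to somebody."
    return f"Hello {to}!"


# A later, ordinary notebook cell doubles as a test:
# if the assertion fails, the notebook's test run fails.
assert say_hello("Isaac") == "Hello Isaac!"
```

Because the directive is just a Python comment, the cell runs the same way in a plain interpreter as it does in the notebook.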
Edit the notebook as described, then prepare your changes:
nbdev_prepare
Edit index.ipynb as described, then push your changes to GitHub:
git add .
git commit -m'Add `say_hello`; update index'
git push
If you want, you can go ahead and add advanced features.
The nbdev-hello-world repository after finishing the tutorial.
As you’ve seen, especially if you’ve worked through the tutorial yourself, nbdev enables a highly productive Python development workflow in notebooks, working smoothly with a GitHub repo and with Quarto documentation displayed on GitHub Pages. If you haven’t worked through the tutorial yet, what are you waiting for?
—
Contact: fast.ai, https://nbdev.fast.ai/
Cost: Free open source under Apache License 2.0.
Platforms: macOS, Linux, and most Unix-style operating systems. It works on Windows under WSL, but not under cmd or PowerShell.
Copyright © 2023 IDG Communications, Inc.