notebooks

The 'notebook pipeline' in this directory is a pseudo-templating system with Kaggle Kernels API integration.

Directory structure

To start a new track (i.e. a themed sequence of notebooks such as are found on the Kaggle Learn homepage, or in Kaggle Learn Challenges) called "spam", run new_track.sh spam. This will create a spam/ subdirectory, with the following contents:

raw/
__init__.py
track_meta.py
default.yaml

raw/ is where the notebooks you author should go. These will be straightforward ipynb notebooks which (perhaps with some setup such as downloading necessary datasets, installing libraries, and careful path manipulation) may be runnable locally. However, they're also recipes for generating ipynb notebooks.

track_meta.py defines some metadata, mostly about the notebooks you'll be syncing. A notebook in the raw/ subdirectory will only be rendered (and have a kernel-metadata.json file generated) if it has an entry in track_meta.py. (This means you're welcome to put as many throwaway testing notebooks as you like in raw without worrying about them breaking anything.) See examples/example_track/track_meta.py for an exhaustively commented example.
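The selection rule above can be sketched in a few lines. This is only an illustration: the entry format (a list of dicts with a filename key) and the function name notebooks_to_render are assumptions made for the example, and examples/example_track/track_meta.py remains the authoritative reference for the real format.

```python
def notebooks_to_render(raw_filenames, notebook_entries):
    """Return the raw notebooks that have a metadata entry, sorted by name.

    Hypothetical helper: the real pipeline's selection code may differ.
    """
    registered = {entry["filename"] for entry in notebook_entries}
    return sorted(name for name in raw_filenames if name in registered)


# Hypothetical entries, roughly as they might appear in a track_meta.py.
notebooks = [
    {"filename": "tut1.ipynb", "type": "tutorial"},
    {"filename": "ex1.ipynb", "type": "exercise"},
]

# Files present in raw/, including a throwaway notebook with no entry.
raw_files = ["tut1.ipynb", "ex1.ipynb", "scratch_test.ipynb"]

selected = notebooks_to_render(raw_files, notebooks)
print(selected)
```

In this sketch scratch_test.ipynb is simply skipped, which mirrors why throwaway notebooks in raw/ do not break anything.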

default.yaml is a config file specifying one way to build our raw notebooks into kernels. Whereas track_meta.py deals with the "what", this config file deals more with the "how". A single track may have many config files, each of which may generate a distinct set of kernels.

Once you've created some raw notebooks and run the pipeline (more on that below), your directory structure will look like:

raw/
default/
    rendered/
    kernels_api_metadata/
__init__.py
track_meta.py
default.yaml

rendered/ is where the notebooks generated from the raw/ recipes go. These are what get synced to Kaggle Kernels. You should not edit these directly (treat them as build products).

kernels_api_metadata/ contains kernel-metadata.json files for syncing notebooks to Kaggle Kernels via the API.

If you create further config files (e.g. testing.yaml), they will generate their own analogous subdirectories (e.g. testing/rendered/, testing/kernels_api_metadata/).
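The naming convention described above (one output subdirectory per config file, named after the config) can be expressed as a short sketch. The function name output_dirs is hypothetical and is not taken from the pipeline's code.

```python
from pathlib import Path


def output_dirs(track_dir, config_filename):
    """Map a config file to the two output directories described above.

    Hypothetical helper illustrating the convention, e.g.
    testing.yaml -> testing/rendered/ and testing/kernels_api_metadata/.
    """
    config_name = Path(config_filename).stem  # "testing.yaml" -> "testing"
    base = Path(track_dir) / config_name
    return base / "rendered", base / "kernels_api_metadata"


rendered_dir, metadata_dir = output_dirs("spam", "testing.yaml")
print(rendered_dir.as_posix())
print(metadata_dir.as_posix())
```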

Pipeline steps

Step 1: Prepare Push

prepare_push.py translates notebooks in raw/ to publishable notebooks in rendered/. The logic for this step mostly lives in lesson_preprocessor.py. Most of its work is in expanding macros which look like #$HIDE_OUTPUT$, or #$EXERCISE_URL(2)$. See MACROS.txt for a listing of available macros.
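To make the macro syntax concrete, here is a minimal sketch of how macros of the form #$NAME$ or #$NAME(args)$ could be recognised and expanded. It is not the code in lesson_preprocessor.py: the regular expression, the function names, the handler table, and the example.com URL are all assumptions for illustration, and some real macros (such as #$HIDE_OUTPUT$) likely change cell metadata rather than substitute text.

```python
import re

# Assumed shape: "#$", an upper-case name, optional "(args)", then "$".
MACRO_PATTERN = re.compile(r"#\$([A-Z_]+)(?:\(([^)]*)\))?\$")


def parse_args(raw_args):
    """Split a comma-separated argument string into a list of strings."""
    return [a.strip() for a in raw_args.split(",")] if raw_args else []


def find_macros(source):
    """Return a (name, args) pair for each macro found in a cell's source."""
    return [(m.group(1), parse_args(m.group(2)))
            for m in MACRO_PATTERN.finditer(source)]


def expand_macros(source, handlers):
    """Replace each macro that has a handler; leave unknown macros as-is."""
    def replace(match):
        handler = handlers.get(match.group(1))
        if handler is None:
            return match.group(0)
        return handler(*parse_args(match.group(2)))
    return MACRO_PATTERN.sub(replace, source)


# Placeholder handler; the real expansion of EXERCISE_URL is not shown here.
handlers = {"EXERCISE_URL": lambda n: f"https://example.com/exercise/{n}"}

print(find_macros("#$HIDE_OUTPUT$\nprint(1)"))
print(expand_macros("Continue at #$EXERCISE_URL(2)$", handlers))
```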

The Kaggle Kernels API requires a kernel-metadata.json file for any kernel being pushed to the site. prepare_push.py also generates these in the kernels_api_metadata subdirectory.
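For orientation, a kernel-metadata.json file could be generated along these lines. The field names follow the Kaggle API's kernel metadata format as I understand it; the function name, the username, the slug, and every value below are hypothetical, and the files prepare_push.py actually writes may set different options.

```python
import json


def build_kernel_metadata(username, slug, title, code_file, dataset_sources=()):
    """Build a dict in the shape of a kernel-metadata.json file.

    Hypothetical helper: values are illustrative defaults, not the
    pipeline's real settings.
    """
    return {
        "id": f"{username}/{slug}",
        "title": title,
        "code_file": code_file,
        "language": "python",
        "kernel_type": "notebook",
        "is_private": True,
        "enable_gpu": False,
        "enable_internet": False,
        "dataset_sources": list(dataset_sources),
        "competition_sources": [],
        "kernel_sources": [],
    }


metadata = build_kernel_metadata(
    username="someuser",            # placeholder account name
    slug="spam-tutorial-1",         # placeholder kernel slug
    title="Spam Tutorial 1",
    code_file="tut1.ipynb",
)
metadata_json = json.dumps(metadata, indent=2)
print(metadata_json)
```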

Step 2: Pushing

Run ./pushall.sh with a track and config name as its argument, e.g. ./pushall.sh deep_learning/prod
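The contents of pushall.sh are not reproduced here, but conceptually a push step of this kind amounts to running the Kaggle CLI's kaggle kernels push -p command once per kernel folder. The sketch below only builds the command lists and does not execute them. It assumes, purely for illustration, that each kernel has its own subdirectory under kernels_api_metadata/ holding a kernel-metadata.json file; the function name push_commands is likewise hypothetical.

```python
import tempfile
from pathlib import Path


def push_commands(config_dir):
    """Build one 'kaggle kernels push' command per kernel metadata folder.

    Assumed layout: <config_dir>/kernels_api_metadata/<kernel>/kernel-metadata.json
    """
    metadata_root = Path(config_dir) / "kernels_api_metadata"
    return [
        ["kaggle", "kernels", "push", "-p", str(meta_file.parent)]
        for meta_file in sorted(metadata_root.glob("*/kernel-metadata.json"))
    ]


# Demonstrate on a throwaway directory tree rather than a real track.
with tempfile.TemporaryDirectory() as tmp:
    for kernel in ("tut1", "ex1"):
        folder = Path(tmp) / "prod" / "kernels_api_metadata" / kernel
        folder.mkdir(parents=True)
        (folder / "kernel-metadata.json").write_text("{}")
    commands = push_commands(Path(tmp) / "prod")

for command in commands:
    print(" ".join(command))
```

Each list could be passed to subprocess.run to perform the push, provided the Kaggle CLI is installed and authenticated.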

To change how the pipeline itself behaves, modify the code in the nb_utils folder.

