Data Science Notebook Mode
by @pitchinnate · 📚 Data · 12d ago · 14 views
Keeps Claude focused on pandas, polars, and sklearn. Always shows Jupyter-compatible code and explains statistical choices.
# CLAUDE.md: Data Science Notebook

## Stack Preferences
- DataFrames: polars for performance, pandas for compatibility (state preference at session start)
- Visualisation: plotly for interactive, matplotlib for static publication-quality
- ML: scikit-learn for classical, PyTorch for deep learning
- Stats: scipy.stats, statsmodels

## Code Format
- Jupyter-compatible: each code block is a self-contained cell
- Import all libraries at the top of the first cell
- Print shape, dtypes, and head() after every significant transformation
- Always set random seed: `np.random.seed(42)`

## Analysis Workflow
1. Load and inspect (shape, nulls, dtypes, value counts)
2. Visualise distributions before modelling
3. State hypothesis before test, not after
4. Report effect sizes alongside p-values
5. Validate on a hold-out set, never the training set

## Plots
Every plot must have:
- Descriptive title
- Labelled axes with units
- Source annotation if data is external
- Colour-blind safe palette (use `colorblind` from seaborn)
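
## Example Cell
A minimal sketch of what workflow steps 3 to 5 might look like in a single cell. The data is synthetic and invented purely for illustration, and the helper `cohens_d`, the Welch t-test, and the logistic-regression model are assumptions chosen for the example, not requirements of this file.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

np.random.seed(42)  # fixed seed, per the Code Format rules

# Synthetic groups, made up for illustration only.
group_a = np.random.normal(loc=0.0, scale=1.0, size=200)
group_b = np.random.normal(loc=0.5, scale=1.0, size=200)

# Step 3: state the hypothesis BEFORE running the test.
# H0: mean(group_a) == mean(group_b); H1: the means differ (two-sided).
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)  # Welch's t-test


def cohens_d(x, y):
    """Hypothetical helper: Cohen's d using the pooled sample standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * np.var(x, ddof=1) + (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2)
    return (np.mean(x) - np.mean(y)) / np.sqrt(pooled_var)


# Step 4: report the effect size alongside the p-value.
effect_size = cohens_d(group_a, group_b)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}, Cohen's d = {effect_size:.3f}")

# Step 5: evaluate on a hold-out set, never on the training set.
X = np.concatenate([group_a, group_b]).reshape(-1, 1)
y = np.concatenate([np.zeros(len(group_a)), np.ones(len(group_b))])
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)
model = LogisticRegression().fit(X_train, y_train)
holdout_accuracy = model.score(X_test, y_test)
print(f"Train shape: {X_train.shape}, hold-out shape: {X_test.shape}")
print(f"Hold-out accuracy: {holdout_accuracy:.3f}")
```

The test choice and the effect-size measure would normally depend on the data; Welch's test and Cohen's d are used here only because they suit two roughly normal groups with possibly unequal variances.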
submitted March 22, 2026