Python¶
API¶
- List of API wrappers
- Layman guide to create APIs for DS
- How to build an API in Python
- Datamodel code generator: create pydantic model from an openapi file and others
- FastAPI code generator: create a FastAPI app from an openapi file
Asynchronous Programming¶
Audio recognition¶
- Kaldi and speech recognition
- FastAI for audio classification and frequency transforms
- Sound data analysis with librosa
- Whisper: OpenAI model for audio transcriptions
- Pedalboard: a Python library for working with audio: reading, writing, rendering, adding effects, and more
- WhisperX: Automatic Speech Recognition with Word-level Timestamps and Diarization
- Moonshine: Fast and accurate automatic speech recognition (ASR) for edge devices
- Amphion: An Open-Source Audio, Music, and Speech Generation Toolkit
Automate boring stuff¶
- Automate boring stuff
- Scheduling recurring jobs with Python
- Numerizer
- Calendar
- Mail with HTML template and charts
- Robotic Process Automation (RPA) with Python
- schedule as Python scheduling library
- PyAutoGUI
- Convert docx to HTML with mammoth
- How to build a serverless automation to cross-post blog articles
- Rocketry: a modern statement-based scheduling framework for Python
Caching¶
Carbon footprint¶
Cheatsheets¶
CLI¶
- How to create a Python CLI app
- Colorama
- Python Fire
- Typer
- CLI comparison
- Cyclopts: a modern, easy-to-use command-line interface framework
- radicli: radically lightweight command-line interfaces
Code freezing¶
Code maintenance¶
- Why your code is probably bad
- Coding mistakes made by DS
- How to make Python run faster
- Tools for production quality code
- Time complexity for DS
- Clean code Python
- Down with technical debt for Python DS
- py-spy code profiler
- line_profiler code profiler
- Mypy type checker
- Logging in Python with logzero
- Memory profiler for DS
- Python speedup skills
- Python coding mistakes
- Python best practices for DS
- Improving Python code efficiency
- How to write better Python code
- Reduce in Python
- Static code analysis
- Log, don't print
- Python type annotations
- pydantic type checking
- Joblib to speed up Python pipelines
- Avoid recursion in favour of closure
- Custom context managers
- Complexity theory and Big O notation
- Parallelization with Python
- Python functions to interact with JSON
- icecream for code debugging
- debugpy for code debugging
- Python features from 3.7 to 3.9
- How to avoid nested if-else statements
- Codetags for code comments
- kedro for code modularization and pipeline visualization
- retry decorator
- Monitoring Python code execution
- scalene for ML code memory consumption
- birdseye for visual code debugging
- Best practices for writing code comments
- Maintain and visualize Python dependencies
- pretty errors
- memray code memory profiler
- Custom classes supporting the with statement and context management
- PySnooper: a poor man's debugger
- Improve performance with caching via @lru_cache
- The art of naming things
- Scalene: a Python CPU+GPU+memory profiler with AI-powered optimization proposals
- Generates call graphs for dynamic programming language
- Google Python Style Guide
- pyinstrument: call stack profiler for Python
- Python Protocols: Leveraging Structural Subtyping
- Stamina: production-grade retries for Python
Colors¶
- ColorAid: a pure Python, object oriented approach to colors
- pypalettes: a large (+2500) collection of color maps for Python and its color palette finder
Dash¶
- dash regression
- Capturing mouse events position
- dash SVM
- Reactive dashboard with dash
- Introducing dash
- dash in PBP
- awesome-dash
- Metrics in dash
- Long callbacks in Dash
- What is a Data App?
Data Augmentation¶
- AugLy
- snorkel for training data labeling
- Cleanlab: automatically find and fix label issues in ML datasets
- Bulk: a quick developer tool to apply some bulk labels
Data Processing¶
- flowpy
- Data anonymization tutorial
- Unstructured: Open-Source Pre-Processing Tools for Unstructured Data
Data Structures¶
- Container data types
- Python dictionaries on steroids with munch
- Format comparison for large datasets
Data validation¶
- schema library for data validation
- pandera
- Joint usage of hypothesis and pandera to automatically create validation test examples
Datatable¶
Dates and times¶
Dependencies management¶
Digital clock¶
Documentation¶
- Autodocs with Python
- How to write an awesome readme
- diagrams as code
- pycco for source files inline docs
- pdoc: API Documentation for Python projects
mkdocs¶
- mktestdocs: run pytest against markdown files/docstrings
- mkdocs-jupyter: use Jupyter Notebooks in mkdocs
- Python markdown terminal built for mkdocs
- Mkdocs Newsletter: show the changes of documentation repositories in a user friendly format
- Python Code Playground in MkDocs
- mkdocs-charts-plugin: mkdocs plugin to add plots from data using vegalite
DTale¶
File system¶
Feature flags¶
- OpenFeature: an open specification that provides a vendor-agnostic, community-driven API for feature flagging
- GO Feature Flag: a completely open-source, simple and lightweight feature flag solution
Functional programming¶
- Functions attributes
- Python operators module
- funcy
- Lazy Evaluation Using Recursive Python Generators
- Writing Python like it's Rust
Game development¶
- Pygame
- Traffic intersection simulation
- A primer on game programming
- Build Tic-Tac-Toe with Python
- Build a Tic-Tac-Toe Game Engine With an AI Player in Python
GUI¶
- Toga: a Python native, OS native GUI toolkit
- Textual: a Rapid Application Development framework for Python
- DearPyGui: a modern, fast and powerful GUI framework for Python
Holidays¶
Hypothesis testing¶
Jupyter¶
- Jupyter tools to increase productivity
- Nbconvert
- Jupyter notebooks for data science
- Jupyter themes
- Jupyter lab
- Interactive dashboards in Jupyter
- Tips for writing in Jupyter notebooks
- Magic Python commands to boost productivity
- Jupyter notebook from cmd raises module not found error
- Jupyter notebook installation issue using pip
- Jupyter notebooks in Excel
- JupyterLab cells execution time
- Deploy Jupyter notebook with Binder
- Jupyter magic commands
- Tools for notebook reproducibility
- JupyterLite: Jupyter in web browser
- Jupyter Desktop App
- Enable notification for jupyter cells execution
- atoti for BI dashboard in Jupyter
- StickyLand for sticky notes in Jupyter
- Convert notebook to web app with Mercury
- Strip out notebook cell outputs with nbstripout
- In and Out variables and %store magic command
- Jupyter Book
- Quarto: an open-source scientific and technical publishing system
- papermill: a tool for parameterizing, executing, and analyzing Jupyter Notebooks
marimo¶
Logging¶
- Structured logging
- Whylogs for data logging
- Toolong: a terminal application to view, tail, merge, and search log files
Missing values¶
Numpy¶
OO Programming¶
- Python classes and objects
- Dunder methods
- Special methods
- Class decorators
- Python function tools
- Managing instance attributes in Python
- Methods in Python for DS
- Function overloading
- Easy tutorial with Cat vs Dog
- Practical intro to object oriented programming
- Tricks for Python classes
- Design patterns
- RealPython best practices collection
- Python class __slots__
- Strict constants in Python
- Python decorator patterns
- __init__ is not a constructor: a deep dive in Python object creation
- Python @property decorator
Object Relational Mapper (ORM)¶
OR-Tools¶
Pandas¶
- Drop all rows after first occurrence of column value
- Drop all rows after first occurrence of column value 2
- Vectorize data aggregation
- Selecting rows based on value counts of a column
- Filter rows of dataframe with operator chaining
- Python dictionaries get nested value
- Styling pandas
- bamboolib
- sidetable with an introductory blog post
- HTML table in PBP
- cut to transform numerical data into categorical
- swifter for parallel apply
- ConnectorX for fast SQL data loading into DataFrame
- Movingpandas for trajectory data
- Dataframe manipulations explained
- pandas-log
- Pandas illustrated: a visual guide
- PandasAI: a Python library to ask questions to your data in natural language
Numpy¶
Polars¶
- Polars
- Polars intro by Practical Business Python
- Great Tables: The Polars DataFrame Styler of Your Dreams
- Using Polars in a Pandas world
Daft¶
Ibis¶
Narwhals¶
- Narwhals: lightweight and extensible compatibility layer between dataframe libraries!
- How Narwhals and scikit-lego came together to achieve dataframe-agnosticism
Password Management¶
OS and Pathlib¶
Polymorphism¶
Privacy¶
Process simulation¶
- simpy for manufacturing simulation
- Modelling and simulations in data science
- mesa agent-based modeling framework
Project packaging¶
- Poetry
- Publish to PyPI via Poetry
- Publish to PyPI via Poetry v2
- Pooch: a friend to fetch your data files
- Robust Testing & Packaging with src layout
- Python Packages: modern and efficient workflows for creating Python packages
- uv: an extremely fast Python package and project manager, written in Rust
Regex¶
Scikit-learn¶
- Pipelines with sklearn
- Pipeline visualization
- Sklearn Pipelines for the Modern ML Engineer
- Pipeline for data preparation
Extensions¶
- combo for ML models combination
- scikit-multilearn for multi-label learning
- scikit-lego
- lazypredict
- human-learn for rule-based learning and interactively drawn rules
- scikit-mdn: a mixture density network, by PyTorch, for scikit-learn
Probabl¶
Search engine¶
Standard library¶
- Math module overview
- Boltons
- Priority queues and heapq
- Everything You Can Do with Python's textwrap Module
Extensions¶
Streamlit¶
- Coding ML tools like you code ML models
- Build and deploy streamlit applications
- Intermediate streamlit
- Sharing streamlit securely
- Download file in streamlit
- streamlit inside JupyterHub
- streamlit multi page hack
- streamlit app to make apps
- Session state for multipage apps
- Drag scatter point with Bokeh events
- streamlit-heroku deployment utils
- Display live 2D data in Streamlit
- Real time dashboard update with asyncio
- How to build a real time live dashboard
- Sync session state and app url via query params
- Table of contents in Streamlit
- Prettymaps Streamlit frontend
- Stlite: streamlit app running in browser
- Streamlit book
- Streamlit-Pydantic: auto-generate Streamlit UI elements from Pydantic models
- Package Streamlit into an Electron desktop app
- Prototyping Streamlit app via Figma and figma-to-streamlit plugin
- Streamlit type checking playground with mypy
- Creating repeatable elements
- Data analysis with Mito: a powerful spreadsheet in Streamlit
- Simplifying generative AI workflows
- Streamlit Contact Form Template
- Search grid for a Pandas DataFrame
- Streamlit auth via JWT and FastAPI
- Build a chatbot with custom data sources, powered by LlamaIndex
- Streamlit: an opinionated framework
Components¶
- Ant Design Menu and Tree
- Authenticator
- Barfi: a visual flow-based programming component
- Bokeh events
- Calendar
- Component for chat UI
- Datalist
- Drawable canvas
- Elements for Material UI tools integration
- Extras
- Extra components
- Float: fix the vertical position of containers relative to viewport instead of page
- Streamlit Component to quickly create Interactive Flow Diagrams using React Flow
- Image comparison
- Image cropper
- Image selection component
- LDAP authenticator
- Link analysis
- Lottie animations with streamlit-lottie
- Marquee banner
- Navigation bar
- Option menu
- Pyvista for 3D objects visualization
- Raw echarts
- RevealJS slides
- Shadcn-ui
- SHAP
- Sortables
- Star rating
- Text labeling and annotation tool
- Text rating component
- Timeline
- Toggle switch
- Tree-shaped nested selectbox component
- User feedback
- Vizzu
- Vertical slider
Build components¶
- End to end streamlit components tutorial
- How to create custom Streamlit components
- Introductory tutorial on Streamlit components
- Streamlit components tutorials
- Streamlit components video tutorial
- Streamlit tutorial app to build components
Strings¶
Structural Pattern Matching¶
Tensorflow¶
- Tensorflow playground
- ML classifying text with NN and tensorflow
- einsum for compact and efficient Einstein summation
Testing¶
- PyHamcrest: a framework for writing matcher objects, allowing you to declaratively define match rules
- behave: behavior-driven development based on Gherkin syntax
- hypothesis: generates simple and comprehensible examples that make your tests fail
- mutmut: a mutation testing system for Python, with a strong focus on ease of use
- freezegun: let your Python tests travel through time
Unit tests¶
- Unit testing for DS
- pytest
- pytest and Travis for GitHub CI
- nox
- locust as a test framework in pure Python
- Assertions vs Exceptions
- Python Mocking in Production
Misc utils¶
- Barcodes, captcha and num2words
- Barcode generation with Python
- Modern high-performance serialization utilities for Python
- pysentation: a CLI for displaying Python presentations
- Pint: a Python package to define, operate and manipulate physical quantities
- kanban-python: your terminal Kanban-board manager
- humanize: various common humanization utilities
- pycountry: a Python library to access ISO country, subdivision, language, currency and script definitions and their translations
- bigtree: Tree Implementation and Methods for Python, integrated with list, dictionary, pandas and polars DataFrame
- srsly: modern high-performance serialization utilities for Python
- xlwings: a Python library that makes it easy to call Python from Excel and vice versa
- wigglystuff: a collection of creative AnyWidgets for Python notebook environments
Python versions¶
Video editing¶
Vocal reader¶
Web App Framework¶
- Deploying ML model as a REST API
- Denzel
- Dash development and deployment
- Deploy ML model with Flask and Heroku
- FastAPI
- Website to host Python web app
- Deploy Dash app to Heroku
- Build and deploy a ML web app
- Deploy streamlit on Heroku
- Anvil
- Pythonanywhere
- Deploy Dash app for free
- streamlit deployment on Heroku
- Deploying streamlit on AWS Lightsail with nginx and docker
- gradio as a lightweight alternative to Streamlit
- ML model deployment on iPhone
- Deploy PyCaret model via FastAPI
- PyWebIO for web app development
- FastDash
- H2O Wave and its table component
- MLEM: package and deploy machine learning models
- 10 Python web frameworks
- Reflex: performant, customizable web apps in pure Python
- Dara
- NiceGUI
- Vizro: a toolkit for creating modular data visualization applications
- Panel for data apps
- Playwright: reliable end-to-end testing for modern web apps
- Grog: a CLI that creates a Gradio UI for a Cog application
- Gradio Themes Gallery
- DearPyGui: a fast and powerful Graphical User Interface Toolkit for Python with minimal dependencies
- Hyperdiv: Build reactive web UIs in Python
- Building the Same App Using Various Web Frameworks
- PuePy: PyScript Frontend Framework
Web scraping¶
- Comparison between BeautifulSoup, selenium and scrapy
- trafilatura: a Python package and command-line tool to gather text on the Web
- Helium: lighter web automation for Python