2025-05-06¶
ACID¶
API¶
- Hacker News API: documentation and samples for the official HN API
- sensei: a Python framework that provides a quick way to build robust HTTP requests and API clients
Audio recognition¶
Basics¶
- BI-as-Code and the New Era of GenBI
- Data Engineering Design Patterns (DEDP) book
- Data Engineering Vault
- Data Lake and Lakehouse Guide: Powered by Data Lake Table Formats (Delta Lake, Iceberg, Hudi)
- Data Modeling – The Unsung Hero of Data Engineering: An Introduction to Data Modeling (Part 1)
- Modern Data Stack: The Struggle of Enterprise Adoption
- The Data Engineering Toolkit: Essential Tools for Your Machine
- The Rise of the Declarative Data Stack
- The Rise of the Semantic Layer
Bayesian methods¶
Books¶
- Cosmic Python: simple patterns for building complex applications
- Forecasting: Principles and Practice, the Pythonic Way
- Probabilistic Artificial Intelligence
- Web Browser Engineering
CLI¶
Caching¶
Computer Vision¶
Data Storytelling¶
Data Structures¶
Data catalog¶
Data validation¶
- dataframely: a declarative, polars-native data frame validation library
- msgspec: a fast serialization and validation library, with builtin support for JSON, MessagePack, YAML, and TOML
Data, Art & Science¶
Database¶
- Instant SQL is here: speedrun ad-hoc queries as you type
- JameSQL: an in-memory NoSQL database implemented in Python
- drawdb: free, simple, and intuitive online database diagram editor and SQL generator
- icechunk: open-source, cloud-native transactional tensor storage engine
- xarray: N-D labeled arrays and datasets in Python
- zarr-python: an implementation of chunked, compressed, N-dimensional arrays for Python
Dataset¶
Dependencies management¶
Diagrams¶
Docker¶
Documentation¶
Embeddings¶
- How to Implement a Cosine Similarity Function in TypeScript for Vector Comparison
- Late Chunking: The Better Way to Embed Document Chunks
- model2vec: fast state-of-the-art static embeddings
GUI¶
Geo science¶
- Simplification of street networks
- Uber H3: Hexagonal hierarchical geospatial indexing system
- a5: pentagonal geospatial indexing system (DGGS)
Geodata¶
Git and versioning¶
Graphs¶
High-dimensional data¶
Javascript libraries¶
Jobs¶
Jupyter¶
- Jupyter AI
- marimo-snippets: JS snippet to send codeblock contents as a query string
- nu-jupyter-kernel: a WIP Jupyter raw kernel for nu
Knowledge Management¶
- koaning.io: time for a bespoke blog engine
- nb: a command line and local web note‑taking, bookmarking, archiving, and knowledge base application
- rucola: terminal-based markdown note manager
- techbook.digital: searchable database collection of IT keywords, terms, and concepts with descriptive explanations
Large Language Models (LLM)¶
- 36 Alternatives to LLM Context
- A cheat sheet for why using ChatGPT is not bad for the environment
- AI code is legacy code from day one
- Aider: AI pair programming in your terminal
- CoRT (Chain-of-Recursive-Thoughts): make AI think harder by having it argue with itself repeatedly
- Dummy's Guide to Modern LLM Sampling
- Emerging Patterns in Building GenAI Products
- I'd rather read the prompt
- The Cultural Divide between Mathematics and AI
- The Hidden Cost of AI Coding
- The Problem with "Vibe Coding"
- Transformers and Large Language Models cheatsheet for Stanford's CME 295
- Writing an LLM from scratch
- agenticSeek: fully local Manus AI
- agx: AI Powered Analytics App
- blast: browser-LLM Auto-Scaling Technology
- codegen: Python SDK to interact with intelligent code generation agents
- token-explorer: a simple tool to explore different possible paths that an LLM might sample
- torchexplorer: interactively inspect module inputs, outputs, parameters, and gradients
MCMC¶
Machine Learning¶
Markdown¶
- Marp: Markdown Presentation Ecosystem
- Slidev: presentation slides for developers
- presenterm: a markdown terminal slideshow tool
- typst: a new markup-based typesetting system
Mathematics¶
- 100 years to solve an integral
- A friendly introduction to triangular transport
- An Introduction to Stochastic Calculus
- Are polynomial features the root of all evil?
- Unsure Calculator
Misc¶
- Leaflet: create and share delightful documents on the web
- filepizza: peer-to-peer file transfers in your browser
Natural Language Processing (NLP)¶
Numpy¶
Optimization¶
Password Management¶
Physics¶
Project packaging¶
Publications plot¶
Regulation¶
Reinforcement Learning¶
- A (Long) Peek into Reinforcement Learning
- Mathematical Foundations of Reinforcement Learning
- Pokemon: Reinforcement Learning Edition
SQL¶
Scikit-learn¶
Services¶
- AWS MCP Servers: specialized MCP servers that bring AWS best practices directly to your development workflow
- Amazon Q Developer new context features
- AssumeRole vs. AssumeRole vs. PassRole
- cedar-py: python bindings for the Cedar Policy project
Software Development¶
- 50 things we’ve learned about building successful products
- Binary search as a bidirectional generator
- Modern Tech Stack
Standard library¶
- 14 Advanced Python Features
- Haskelling My Python: reimplementing Haskell lazy infinite lists using Python generators
- Python loop targets
- Self-destructing Python scripts
- bisect: array bisection algorithm
Structural Pattern Matching¶
Tables¶
Technical skills¶
- Concept Maps: mental models used in introductory data science lessons
- VisuAlgo: visualising data structures and algorithms through animation
Technical writing¶
Terminal¶
- "Rules" that terminal programs follow
- Atuin Desktop: Runbooks that Run
- Font Ligatures for your Code Editor and Terminal
- broot: a new way to see and navigate directory trees