2024-11-25
Annotations
Audio recognition
- Amphion: An Open-Source Audio, Music, and Speech Generation Toolkit
- Moonshine: Fast and accurate automatic speech recognition (ASR) for edge devices
Bayesian methods
Colors
Data, Art & Science
- 1 dataset 100 visualizations
- DataMorph: morph an input dataset of 2D points into select shapes while preserving the summary statistics
Database
- 15+ companies using DuckDB in production: a comprehensive guide
- A Beginner's Guide to DuckDB's Python Client
- DuckDB in Python in the Browser with Pyodide, PyScript and JupyterLite
- Ducklake: A journey to integrate DuckDB with Unity Catalog
Dataset
- Data Commons: aggregates global, open data, uncovering insights with natural language questions
- Foursquare Places OS Data Schemas
- Open-Meteo: Free Weather Forecast API for non-commercial use
Docker
Documentation
EDA
Git and versioning
Interactive visualizations
Knowledge Management
- A Single Text File Is My Productivity Hack
- How to Use AI to Build Your Company’s Collective Intelligence
- My Simple Knowledge Management and Time Tracking System
- jrnl: a simple journal application for the command line
Large Language Models (LLM)
- A RAG from scratch to query the scikit-learn documentation
- Beyond Traditional Testing: Addressing the Challenges of Non-Deterministic Software
- Chainlit: Build Conversational AI in minutes
- DataChain: AI-data warehouse to enrich, transform and analyze unstructured data
- Docling: parse documents and export them to the desired format with ease and speed
- Introduction to Large Language Models
- JIT Implementation: A Python Library That Implements Your Code at Runtime
- Large Chainsaw Model
- Model2Vec: Distill a Small Fast Model from any Sentence Transformer
- Official code repo for the O'Reilly Book "Hands-On Large Language Models"
- Open Source Frameworks for Building Generative AI Applications
- Posting: the modern API client that lives in your terminal
- Simplemind: Python client for AI providers
- ell: a language model programming library
Mathematics
- Counterintuitive Properties of High Dimensional Space
- Techniques and numbers for estimating system's performance from first-principles
- The Deceptively Asymmetric Unit Sphere
Methodology
Misc utils
Natural Language Processing (NLP)
Resampling
SQL
- SQLModel: a library for interacting with SQL databases from Python code, with Python objects
- Sampling with SQL