marimo is an open-source reactive notebook for Python and SQL that's tightly integrated with DuckDB's Python client, letting you mix and match Python and SQL in a single git-versionable notebook. Unlike traditional notebooks, when you run a cell or interact with a UI element, marimo automatically (or lazily) runs affected cells, keeping code and outputs consistent. Its integration with DuckDB makes it well-suited to interactively working with data, and its representation as a Python file makes it simple to run notebooks as scripts.
Installation
To get started, install marimo and DuckDB from your terminal:
pip install "marimo[sql]" # or uv add "marimo[sql]"
Install supporting libraries:
pip install "polars[pyarrow]" # or uv add "polars[pyarrow]"
Run a tutorial:
marimo tutorial sql
SQL in marimo
Create a notebook from your terminal with marimo edit notebook.py. Create SQL cells in one of three ways:
- Right-click the + button and pick SQL cell
- Convert any empty cell to SQL via the cell menu
- Hit the SQL button at the bottom of your notebook
In marimo, SQL cells give the appearance of writing SQL while being serialized as standard Python code using the mo.sql() function, which keeps your notebook as pure Python without requiring special syntax or magic commands.
df = mo.sql(f"SELECT 'Off and flying!' AS a_duckdb_column")
marimo stores notebooks as pure Python for many reasons, such as git-friendly diffs and the ability to run notebooks as Python scripts.
The SQL statement itself is an f-string, letting you interpolate Python values into the query with {} (shown later). In particular, this means your SQL queries can depend on the values of UI elements or other Python values, all part of marimo's dataflow graph.
Warning: if user-generated content flows into your SQL queries, be sure to sanitize your inputs to prevent SQL injection.
Connecting a Custom DuckDB Connection
To connect to a custom DuckDB connection instead of using the default global connection, create a cell that creates a DuckDB connection as a Python variable:
import duckdb
# Create a DuckDB connection
conn = duckdb.connect("path/to/my/duckdb.db")
marimo automatically discovers the connection and lets you select it in the SQL cell's connection dropdown.

Database, Schema, and Table Auto-Discovery
marimo introspects connections and displays the databases, schemas, tables, and columns in the Data Sources panel. This panel lets you quickly navigate your schemas to pull tables and columns into your SQL queries.

Reference a Local Dataframe
Reference a local dataframe in your SQL cell by using the name of the Python variable that holds the dataframe. If you have a database connection with a table of the same name, the database table will be used instead.
import polars as pl
df = pl.DataFrame({"column": [1, 2, 3]})
SELECT * FROM df WHERE column > 2
Reference the Output of a SQL Cell
Defining a non-private (non-underscored) output variable in the SQL cell allows you to reference the resulting dataframe in other Python and SQL cells.

Reactive SQL Cells
marimo allows you to create reactive SQL cells that automatically update when their dependencies change. Working with expensive queries or large datasets? You can configure marimo's runtime to be "lazy": dependent cells are only marked as stale, letting you choose when to re-run them.
digits = mo.ui.slider(label="Digits", start=100, stop=10000, step=200)
digits
CREATE TABLE random_data AS
SELECT i AS id, random() AS random_value,
FROM range({digits.value}) AS t(i);
SELECT * FROM random_data;
Interacting with UI elements, like a slider, makes your data more tangible.

DuckDB-Powered OLAP Analytics in marimo
marimo provides several features that work well with DuckDB for analytical workflows:
- Seamless integration between Python and SQL
- Reactive execution that automatically updates dependent cells when queries change
- Interactive UI elements that can be used to parameterize SQL queries
- Ability to export notebooks as standalone applications or Python scripts, or even run entirely in the browser with WebAssembly.
Next Steps
- Read the marimo docs.
- Try the SQL tutorial: marimo tutorial sql
- The code for this guide is available on GitHub. Run it with marimo edit ⟨github_url⟩