Query a DuckDB database.
type: "io.kestra.plugin.jdbc.duckdb.Query"
Execute a query that reads a CSV file and writes another CSV file.
id: query_duckdb
namespace: company.team
tasks:
  - id: http_download
    type: io.kestra.plugin.core.http.Download
    uri: "https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv"
  - id: query
    type: io.kestra.plugin.jdbc.duckdb.Query
    url: 'jdbc:duckdb:'
    timeZoneId: Europe/Paris
    sql: |-
      CREATE TABLE new_tbl AS SELECT * FROM read_csv_auto('in.csv', header=True);
      COPY (SELECT order_id, customer_name FROM new_tbl) TO '{{ outputFiles.out }}' (HEADER, DELIMITER ',');
    inputFiles:
      in.csv: "{{ outputs.http_download.uri }}"
    outputFiles:
      - out
Execute a query that reads from an existing database file using a URI.
id: query_duckdb
namespace: company.team
inputs:
  - id: my_db
    type: FILE
tasks:
  - id: query1
    type: io.kestra.plugin.jdbc.duckdb.Query
    databaseUri: "{{ inputs.my_db }}"
    sql: SELECT * FROM table_name;
    fetchType: STORE
Run a SQL query with DuckDB on MotherDuck and get the result as a CSV file.
id: motherduck
namespace: company.team
tasks:
  - id: query
    type: io.kestra.plugin.jdbc.duckdb.Query
    sql: |
      SELECT by, COUNT(*) as nr_comments
      FROM sample_data.hn.hacker_news
      GROUP BY by
      ORDER BY nr_comments DESC;
    fetchType: STORE
  - id: csv
    type: io.kestra.plugin.serdes.csv.IonToCsv
    from: "{{ outputs.query.uri }}"
pluginDefaults:
  - type: io.kestra.plugin.jdbc.duckdb.Query
    values:
      url: jdbc:duckdb:md:my_db?motherduck_token={{ secret('MOTHERDUCK_TOKEN') }}
      timeZoneId: Europe/Berlin
YES
jdbc:duckdb:
The JDBC URL to connect to the database.
The default value, jdbc:duckdb:, will use a local in-memory database.
Set this property when connecting to a persisted database instance, for example jdbc:duckdb:md:my_database?motherduck_token=<my_token> to connect to MotherDuck.
YES
true
DEPRECATED Whether autocommit is enabled.
Sets this connection's auto-commit mode to the given state. If a connection is in auto-commit mode, all of its SQL statements are executed and committed as individual transactions. Otherwise, its SQL statements are grouped into transactions that are terminated by a call to either the commit method or the rollback method. By default, new connections are in auto-commit mode, except when the store property is used, in which case auto-commit is disabled.
DEPRECATED: Use the Queries task with the 'transaction' property if you want to run multiple queries with or without autocommit.
YES
Database URI
Kestra's URI to an existing DuckDB database file.
NO
false
DEPRECATED, please use fetchType: FETCH instead.
Whether to fetch the data from the query result to the task output. This parameter is evaluated after fetchOne and store.
NO
false
DEPRECATED, please use fetchType: FETCH_ONE instead.
Whether to fetch only one data row from the query result to the task output. This parameter is evaluated before store and fetch.
YES
10000
Number of rows that should be fetched.
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
YES
NONE
STORE
FETCH
FETCH_ONE
NONE
How the query result should be stored.
FETCH_ONE - output the first row.
FETCH - output all rows as an output variable.
STORE - store all rows in a file.
NONE - do nothing.
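As an illustration of the FETCH_ONE mode, the following minimal sketch fetches a single row and logs it in a downstream task. The flow id, the task ids, and the query are hypothetical, and the output name row is an assumption based on the output described as "Map containing the first row of fetched data".

```yaml
id: duckdb_fetch_one          # hypothetical flow id
namespace: company.team
tasks:
  - id: query
    type: io.kestra.plugin.jdbc.duckdb.Query
    sql: SELECT 42 AS answer;
    fetchType: FETCH_ONE      # expose only the first row as a task output
  - id: log_row
    type: io.kestra.plugin.core.log.Log
    # Assumption: the first row is exposed under the "row" output
    message: "{{ outputs.query.row }}"
```

With fetchType: STORE, the rows would instead be written to a file in internal storage and referenced through {{ outputs.query.uri }}, as in the MotherDuck example.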
YES
Input files to be loaded by DuckDB.
A map of files that will be written and made available to DuckDB. You can reference files by their filename, for example: SELECT * FROM read_csv_auto('myfile.csv');
YES
false
Output the database file.
This property lets you define if you want to output the in-memory database as a file for further processing.
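A minimal sketch of how the database file could be passed between two tasks. The property name outputDbFile and the output name databaseUri are assumptions inferred from the descriptions on this page, and the flow id, task ids, and table name are hypothetical.

```yaml
id: duckdb_output_db_file     # hypothetical flow id
namespace: company.team
tasks:
  - id: build_db
    type: io.kestra.plugin.jdbc.duckdb.Query
    sql: CREATE TABLE numbers AS SELECT * FROM range(10);
    # Assumption: this is the property that outputs the in-memory database as a file
    outputDbFile: true
  - id: read_db
    type: io.kestra.plugin.jdbc.duckdb.Query
    # Assumption: the database file URI is exposed under the "databaseUri" output
    databaseUri: "{{ outputs.build_db.databaseUri }}"
    sql: SELECT COUNT(*) AS n FROM numbers;
    fetchType: FETCH_ONE
```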
YES
Output file list that will be uploaded to internal storage.
List of keys that will generate temporary files.
In the SQL query, you can use a variable named outputFiles.key for the corresponding file.
If you add a file with ["first"], you can use the special variable in a statement such as COPY tbl TO '{{ outputFiles.first }}' (HEADER, DELIMITER ','); and use this file in other tasks with {{ outputs.taskId.outputFiles.first }}.
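The pattern described above can be sketched as a two-task flow: the first task writes to the temporary file registered under the key first, and the second reads the resulting internal storage URI. The flow id, task ids, and the query are hypothetical.

```yaml
id: duckdb_output_files       # hypothetical flow id
namespace: company.team
tasks:
  - id: export
    type: io.kestra.plugin.jdbc.duckdb.Query
    sql: |-
      COPY (SELECT 1 AS id, 'a' AS label) TO '{{ outputFiles.first }}' (HEADER, DELIMITER ',');
    outputFiles:
      - first                 # key that generates the temporary file
  - id: log_uri
    type: io.kestra.plugin.core.log.Log
    # URI of the uploaded file in internal storage
    message: "{{ outputs.export.outputFiles.first }}"
```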
YES
Parameters
A map of parameters to bind to the SQL queries. The keys should match the parameter placeholders in the SQL string, e.g., :parameterName.
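A minimal sketch of parameter binding, assuming the property is named parameters and that the input database file contains a table called orders with order_id and customer_name columns (the table, the value, and the flow and task ids are all hypothetical).

```yaml
id: duckdb_parameters         # hypothetical flow id
namespace: company.team
inputs:
  - id: my_db
    type: FILE
tasks:
  - id: query
    type: io.kestra.plugin.jdbc.duckdb.Query
    databaseUri: "{{ inputs.my_db }}"
    # :customer is the placeholder; it matches the "customer" key below
    sql: |
      SELECT order_id FROM orders WHERE customer_name = :customer;
    parameters:               # assumption: property name for the parameter map
      customer: "Jane Doe"
    fetchType: FETCH
```

Binding values this way keeps them out of the SQL string itself, which avoids quoting issues when the value comes from a flow input.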
YES
The database user's password.
YES
The SQL query to run.
NO
false
DEPRECATED, please use fetchType: STORE instead.
Whether to fetch data row(s) from the query result to a file in internal storage. The file will be saved as Amazon Ion (text format). See the Amazon Ion documentation.
This parameter is evaluated after fetchOne but before fetch.
YES
The time zone id to use for date/time manipulation. Default value is the worker's default time zone id.
YES
The database user.
uri
The database output URI in Kestra's internal storage.
The output files' URI in Kestra's internal storage.
Map containing the first row of fetched data.
Only populated if the fetchOne parameter is set to true.
List of maps containing rows of fetched data.
Only populated if the fetch parameter is set to true.
The number of rows fetched.
Only populated if the store or fetch parameter is set to true.
uri
The URI of the result file on Kestra's internal storage (.ion file / Amazon Ion formatted text file).
Only populated if store is set to true.