= {
configuredClient const client = await DuckDBClient.of();
await client.sql`
SET s3_endpoint='minio.lab.sspcloud.fr'
`;
return client;
}
Quarto ojs DuckDB with a parquet file stored in a S3 storage
Context
We assume that we are in the following context: a parquet file stored in MinIO and publicly available.
Since this file is publicly available, we do not want to attach it in the Quarto notebook.
The penguins dataset is available at https://minio.lab.sspcloud.fr/f7sggu/diffusion/penguins.parquet.
DuckDB configuration
DuckDB can be configured to read from a S3 compatible storage, see https://duckdb.org/docs/sql/configuration.html.
Here, since the parquet file is publicly available, we only need to override the s3_endpoint
variable:
Read a parquet file
Since the S3 endpoint is properly configured, the address of the parquet file is now:
= 's3://f7sggu/diffusion/penguins.parquet' url
As explained in the DuckDB help page on parquet, we can create a view from a parquet file:
= {
db await configuredClient.query(`
CREATE VIEW penguins AS
SELECT * FROM read_parquet('${url}');
`);
return configuredClient
}
Request the parquet file
Now, we can request the parquet file:
= db.sql`
res SELECT * FROM penguins LIMIT 10
`
= Inputs.table(res) viewof table
Configuration informations
This notebook was rendered with Quarto 1.3.49