CSV / TSV
Parse comma-separated values (CSV) and tab-separated values (TSV) files with automatic type inference.
Usage
# CSV (default format)
loq "SELECT * FROM data.csv"
loq -i:CSV "SELECT * FROM data.csv"
# TSV
loq -i:TSV "SELECT * FROM data.tsv"File Format
CSV Example
name,age,city,salary
Alice,32,New York,75000
Bob,28,San Francisco,85000
Carol,35,Chicago,70000TSV Example
name age city salary
Alice 32 New York 75000
Bob 28 San Francisco 85000
Carol 35 Chicago 70000Features
Automatic Type Inference
loq automatically detects column types:
| Value | Detected Type |
|---|---|
42, -100 | Integer |
3.14, -0.5 | Float |
true, false | Boolean |
| Everything else | String |
Header Row
The first row is treated as column headers by default:
name,age,city
Alice,32,New YorkColumns are accessible by name:
SELECT name, age FROM data.csv WHERE city = 'New York'Quoted Fields
Fields containing commas, quotes, or newlines should be quoted:
name,description,price
Widget,"A small, useful item",9.99
Gadget,"Contains ""special"" characters",19.99Empty Values
Empty fields are parsed as NULL:
name,email,phone
Alice,alice@example.com,555-1234
Bob,,555-5678
Carol,carol@example.com,SELECT * FROM data.csv WHERE email IS NULL
-- Returns: BobOptions
Custom Delimiter
# Pipe-delimited
loq -i:CSV -iSeparator:"|" "SELECT * FROM data.txt"
# Semicolon-delimited (common in European CSV)
loq -i:CSV -iSeparator:";" "SELECT * FROM data.csv"Skip Header Rows
# Skip first 2 rows before header
loq -i:CSV -iHeaderRow:3 "SELECT * FROM data.csv"Input Codepage
# UTF-8 (default)
loq -i:CSV -iCodepage:UTF8 "SELECT * FROM data.csv"
# Latin-1
loq -i:CSV -iCodepage:LATIN1 "SELECT * FROM data.csv"
# Windows-1252
loq -i:CSV -iCodepage:1252 "SELECT * FROM data.csv"Schema Detection
loq samples the first 100 rows to infer column types. The most specific type that fits all values is chosen:
- Integer - All values are whole numbers
- Float - All values are numbers (including decimals)
- Boolean - All values are true/false
- String - Default fallback
Type Coercion
Mixed types promote to the more general type:
value
42
3.14
100Results in Float type (integers can be represented as floats).
Forcing Types
Use SQL functions to convert types:
SELECT
CAST(id AS INTEGER) AS id,
TO_TIMESTAMP(date_str, '%Y-%m-%d') AS date
FROM data.csvExamples
Basic Query
loq "SELECT * FROM users.csv LIMIT 10"Filtering
loq "SELECT name, salary FROM employees.csv WHERE salary > 50000"Aggregation
loq "SELECT department, COUNT(*), AVG(salary)
FROM employees.csv
GROUP BY department"Sorting
loq "SELECT * FROM products.csv ORDER BY price DESC LIMIT 20"Multiple Files
# Using wildcards
loq "SELECT * FROM 'logs/*.csv'"
# With recursive option
loq -recurse:2 "SELECT * FROM 'logs/*.csv'"Joining CSV Files
loq "SELECT u.name, o.total
FROM users.csv u
JOIN orders.csv o ON u.id = o.user_id"Output
CSV Output (Default)
loq "SELECT name, age FROM users.csv"Output:
name,age
Alice,32
Bob,28Other Formats
# JSON output
loq -o:JSON "SELECT * FROM users.csv"
# Table format
loq -o:DATAGRID "SELECT * FROM users.csv"Performance
CSV parsing is optimized for performance:
- 64KB I/O buffers reduce system calls
- Pre-allocated vectors minimize memory allocations
- Fast-path type parsing for common cases
- Parallel processing for large files (10K+ rows)
Tips for Large Files
# Use LIMIT during exploration
loq "SELECT * FROM huge.csv LIMIT 100"
# Filter early to reduce data
loq "SELECT * FROM huge.csv WHERE status = 'error'"
# Select only needed columns
loq "SELECT id, message FROM huge.csv"Troubleshooting
Wrong Column Types
If columns are detected as wrong type:
-- Force integer parsing
SELECT CAST(id AS INTEGER) FROM data.csv
-- Force date parsing
SELECT TO_TIMESTAMP(date_col, '%Y-%m-%d') FROM data.csvEncoding Issues
For non-UTF-8 files:
loq -i:CSV -iCodepage:LATIN1 "SELECT * FROM data.csv"Missing Headers
If the file has no header row, columns are named column1, column2, etc.
Quoted Delimiter Issues
Ensure fields with delimiters are properly quoted:
# Correct
"Smith, John",42
# Incorrect (will split incorrectly)
Smith, John,42