// article
Part 2 of 2 in the database performance series. Part 1 is the reference guide.
Atlas trades 0–3 DTE SPY options. The system needs to ingest market data, calculate features, and execute trades inside the same second. For most of last summer it could do exactly none of that, because it wouldn’t start. Database initialization was taking 6.6 seconds, and 37 features that depended on the database never got past their import-time checks.
This is the writeup of what was wrong, how I found it, and the fix — which turned out to be a single replaced pattern.
The first useful thing I did was add timing around the two suspect calls. Two log lines were enough.
start_time = time.time()
self._primary_connection = duckdb.connect(database=db_path, read_only=False)
logger.debug(f"Connection created in {(time.time() - start_time)*1000:.1f}ms")
# Output: Connection created in 12.3ms
schema_time = time.time()
self._initialize_schemas()
logger.debug(f"Schema check completed in {(time.time() - schema_time)*1000:.1f}ms")
# Output: Schema check completed in 6588.4ms
Connect was fast: 12ms. Schema initialization was 6,588ms. Almost the entire startup was in one function.
_initialize_schemas was doing what looked sensible: ask the catalog whether each table exists, and create it if not.
def _initialize_schemas(self):
"""Check if tables exist, create if missing"""
# Check trades table
result = self._primary_connection.execute("""
SELECT COUNT(*) FROM information_schema.tables
WHERE table_name = 'trades'
""").fetchone()
if result[0] == 0:
self._primary_connection.execute("""
CREATE TABLE trades (...)
""")
# Repeat for quotes table...
# Repeat for each index...
information_schema queries scan the catalog. In DuckDB, doing this repeatedly during startup is expensive — the catalog gets walked once per table, once per index. The cost is proportional to how much you check, not how much you actually create.
The pattern is correct logic and wrong implementation. The database already knows how to do this check itself.
Three coordinated changes:
information_schema checks with CREATE TABLE IF NOT EXISTSdef _initialize_schemas_fast(self):
"""Initialize schemas using CREATE TABLE IF NOT EXISTS"""
start_time = time.time()
self._primary_connection.execute("""
CREATE TABLE IF NOT EXISTS trades (
timestamp_ns BIGINT,
symbol VARCHAR,
price DOUBLE,
size INTEGER,
side VARCHAR,
exchange VARCHAR,
conditions VARCHAR,
notional DOUBLE,
date DATE
)
""")
self._primary_connection.execute("""
CREATE TABLE IF NOT EXISTS quotes (
timestamp_ns BIGINT,
symbol VARCHAR,
bid DOUBLE,
ask DOUBLE,
bid_size INTEGER,
ask_size INTEGER,
exchange VARCHAR,
conditions VARCHAR,
date DATE
)
""")
schema_time = (time.time() - start_time) * 1000
logger.debug(f"Fast schema initialization: {schema_time:.1f}ms")
CREATE TABLE IF NOT EXISTS is idempotent and cheap. DuckDB checks existence at the catalog level without scanning information_schema.
This single change took schema init from 6,644ms to 8.2ms — 810×.
self._primary_connection.execute("""
CREATE INDEX IF NOT EXISTS idx_trades_symbol_timestamp
ON trades(symbol, timestamp_ns)
""")
self._primary_connection.execute("""
CREATE INDEX IF NOT EXISTS idx_quotes_symbol_timestamp
ON quotes(symbol, timestamp_ns)
""")
self._primary_connection.execute("""
CREATE INDEX IF NOT EXISTS idx_trades_date
ON trades(date)
""")
self._primary_connection.execute("""
CREATE INDEX IF NOT EXISTS idx_quotes_date
ON quotes(date)
""")
The first two cover the three query patterns the trading code actually issues:
WHERE symbol = ? AND timestamp_ns > ?WHERE symbol = ? AND timestamp_ns BETWEEN ? AND ?WHERE symbol = ? ORDER BY timestamp_ns DESC LIMIT NThe date indexes serve daily aggregations and end-of-day reports.
def connect(self):
"""Create primary database connection"""
start_time = time.time()
with self._connection_lock:
db_path = self._config.get('db_path', 'market_data.db')
threads = self._config.get('threads', 4)
memory_limit = self._config.get('memory_limit', '2GB')
self._primary_connection = duckdb.connect(
database=db_path,
read_only=False
)
self._primary_connection.execute(f"SET threads TO {threads}")
self._primary_connection.execute(f"SET memory_limit = '{memory_limit}'")
self._initialize_schemas_fast()
total_time = (time.time() - start_time) * 1000
logger.info(f"DatabaseManager connected in {total_time:.1f}ms")
if total_time > 50:
logger.warning(f"Connection took {total_time:.1f}ms (target: <50ms)")
Target was sub-50ms. Measured 8.2ms.
Trading systems are concurrent — market data ingest, feature calc, and execution all run in their own threads. Cursors need to be thread-safe without serializing every access.
def get_cursor(self, component_name: str):
"""Get thread-local cursor with fast-path for existing cursors"""
thread_id = threading.current_thread().ident
cursor_key = f"{component_name}_{thread_id}"
# Fast path: no lock for existing cursors
if cursor_key in self._thread_cursors:
self._cursor_last_used[cursor_key] = time.time()
return self._thread_cursors[cursor_key]
# Slow path: lock only when creating
with self._cursor_lock:
if cursor_key not in self._thread_cursors:
self._thread_cursors[cursor_key] = self._primary_connection.cursor()
self._cursor_last_used[cursor_key] = time.time()
return self._thread_cursors[cursor_key]
Cached cursor lookup is lock-free. Only allocation pays the lock cost. Measured: 0.01ms first access, ~0.00ms thereafter, 200,000 lookups/second under load.
Threads come and go. Unused cursors get reclaimed on a 30-second sweep.
def _cleanup_unused_cursors(self):
"""Remove cursors not used in last 60 seconds"""
current_time = time.time()
with self._cursor_lock:
cursors_to_remove = []
for cursor_key, last_used in self._cursor_last_used.items():
if current_time - last_used > 60:
cursors_to_remove.append(cursor_key)
for cursor_key in cursors_to_remove:
try:
self._thread_cursors[cursor_key].close()
except:
pass
del self._thread_cursors[cursor_key]
del self._cursor_last_used[cursor_key]
Under bursty concurrent writes, DuckDB will occasionally return a lock conflict. The handler retries with exponential backoff, gives up after 10 attempts so a stuck call doesn’t hang forever.
def execute_with_retry(self, cursor, query: str, params=None):
"""Execute query with automatic retry on lock conflicts"""
delay = 0.1 # 100ms
for attempt in range(10):
try:
if params:
return cursor.execute(query, params)
else:
return cursor.execute(query)
except duckdb.IOException as e:
if "database is locked" in str(e).lower():
if attempt < 9:
logger.warning(f"Database locked, retry {attempt + 1}/10 after {delay:.1f}s")
time.sleep(delay)
delay *= 2.0
continue
raise
raise RuntimeError("Failed to execute query after 10 attempts")
Backoff sequence: 100ms, 200ms, 400ms, 800ms, …, capped at ten attempts.
| Metric | Before | After | Change |
|---|---|---|---|
| Connection time | 6,644ms | 8.2ms | 810× |
| Health check | — | 0.1ms | — |
| Cursor access, first | — | 0.01ms | — |
| Cursor access, cached | — | ~0.00ms | — |
| Simple query | — | 0.05ms | — |
The 37 features that depended on the database now load. The relevant categories:
The optimized manager exposes a fast health check for production monitoring:
def health_check(self) -> Dict[str, Any]:
"""Fast health check for production monitoring"""
start_time = time.time()
try:
if self._lifecycle_phase != LifecyclePhase.CONNECTED:
return {
'healthy': False,
'phase': self._lifecycle_phase.value,
'error': 'Not connected',
'check_time_ms': 0
}
cursor = self.get_cursor('health_check')
cursor.execute('SELECT 1').fetchone()
check_time = (time.time() - start_time) * 1000
return {
'healthy': True,
'phase': self._lifecycle_phase.value,
'check_time_ms': round(check_time, 2),
'active_cursors': len(self._thread_cursors),
'max_connections': self._max_connections,
'config': self._config
}
except Exception as e:
check_time = (time.time() - start_time) * 1000
return {
'healthy': False,
'error': str(e),
'check_time_ms': round(check_time, 2)
}
Thresholds: critical alert if connect time >50ms, warning if health check >1ms. Current measured: 0.1ms health checks, 8.2ms connect.
1. information_schema queries are expensive in startup paths. Don’t ask the catalog whether a table exists if you can just write CREATE IF NOT EXISTS.
2. One composite index serves many query shapes. (symbol, timestamp_ns) covered three different live query patterns. Don’t over-index — make each index earn its place.
3. Fast paths matter where lookups are frequent. Lock-free reads for the cached case, lock only on allocation. The cost difference between 0.01ms and 0.00ms doesn’t sound like much until you do it 200,000 times a second.
4. Production monitoring belongs in the database manager, not bolted on later. The 0.1ms health check is cheap; not having it is more expensive than the cost of every check combined.
5. The fastest code is the code that doesn’t run. The 810× win wasn’t algorithmic — it was 6,644ms of unnecessary catalog scans, removed.
Part 1 of the series: database performance optimization reference.