10. Process data (proc.*) and dynamic backtesting (design specification)
This chapter explains the definition and access methods of process data in qteasy, as well as the selection of static branches vs dynamic branches during backtesting and the consistency conventions between them, for reference when implementing and extending. For usage-level explanations, read it together with “How Strategies Declare and Use Data” and the API documentation.
10.1. 1. Background and Objectives
1.1 Meaning of Process Data
Some strategies need to rely on data that changes with the backtest or live trading execution path, for example:
Current/historical positions, available cash;
Historical filled quantity, fill price, transaction costs;
Market value, total assets, etc. derived from positions and prices.
Such data cannot be pre-generated in one go before a backtest starts; it can only be maintained during runtime by the Backtester (backtesting) or Trader (live trading), and provided to the strategy at each step of signal generation according to the “currently visible scope”. We collectively refer to it as process data.
1.2 Design Objectives
Unified entry point: Like static historical data, process data is obtained via Strategy.get_data(), reducing the learning curve.
No look-ahead: When generating the signal for step k, the strategy cannot see the execution results of step k; it can only use the history of completed steps.
Backtest/live consistency: The same set of strategies and the same get_data('proc.xxx') calling pattern can be used in both backtesting and live trading; when process data is needed, execution follows the dynamic path, otherwise it can remain consistent with the original static path in terms of results.
10.2. 2. Unified Definition of Process Data (proc.*)
2.1 Naming and Exposure Method
All process data is exposed to the strategy in the form proc.<field_name>, such as proc.own_cash and proc.trade_records.
Process data does not need to be declared via data_types in the strategy’s __init__. It is injected at runtime by the Backtester / Trader, and the strategy only needs to call get_data('proc.xxx', ...) in realize() as needed.
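The calling pattern described above can be sketched with a stub. This is an illustration only: StubStrategy is a hypothetical stand-in rather than qteasy's Strategy class, the dictionary keyed by "own_cash" is an assumed storage layout, and the return value of realize() is simplified to a single number.

```python
import numpy as np


class StubStrategy:
    """Hypothetical stand-in for a strategy base class (not qteasy's Strategy)."""

    def __init__(self):
        # Assumed layout: the runner fills this mapping before each step.
        self._process_data_sources = {}

    def get_data(self, name):
        # Only the proc.* lookup is modelled in this stub.
        if not name.startswith("proc."):
            raise ValueError(f"this stub only serves proc.* fields, got {name!r}")
        return np.asarray(self._process_data_sources[name[len("proc."):]])


class CashAwareStrategy(StubStrategy):
    """Emits a buy signal only while the latest visible cash exceeds a threshold."""

    def __init__(self, min_cash=1000.0):
        super().__init__()  # note: no proc.* declaration is made here
        self.min_cash = min_cash

    def realize(self):
        cash = self.get_data("proc.own_cash")  # cash series up to the current step
        return 1.0 if cash[-1] > self.min_cash else 0.0


stg = CashAwareStrategy(min_cash=1000.0)
stg._process_data_sources = {"own_cash": [5000.0, 1200.0]}  # injected by the runner
signal_rich = stg.realize()
stg._process_data_sources = {"own_cash": [5000.0, 1200.0, 800.0]}
signal_poor = stg.realize()
```

The strategy body never mentions where the cash series comes from, which is what lets the same realize() run under a backtester or a live trader.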
2.2 Built-in fields implemented
The process data fields implemented in the current version and available for use in strategies include:
| Category | Field name | Meaning |
|---|---|---|
| Account scalar | | Total cash in the account at the start of the current step |
| Account scalar | | Cash available for placing orders at the start of the current step |
| Account scalar | | Total asset market value at the start of the current step (position valuation + cash) |
| Position vector | | Total position quantity of each instrument at the start of the current step |
| Position vector | | Sellable quantity for each instrument at the start of the current step |
| Position vector | | Position market value for each instrument at the start of the current step (calculated from the internal price and positions) |
| Execution results | | Actual executed quantity for each instrument at each step (positive for buys, negative for sells) |
| Execution results | | Transaction costs for each instrument at each step |
| Execution results | | Execution price for each instrument at each step |
For the time semantics and visibility constraints of the above fields in backtesting and live trading, see Section 4.
2.3 Future extensible fields (optional)
Fields that can be further extended by design include: proc.realized_pnl, proc.unrealized_pnl, proc.last_trade_price, proc.last_trade_volume, etc. Refer to the implementation and documentation for details.
10.3. 3. Access Interface: Strategy.get_data() and proc.*
3.1 Static data (no proc. prefix)
Single source: self.get_data('close_E_d'); multiple sources: self.get_data('close_E_d', 'high_E_d').
Static data does not support the lag/window parameters; if either is provided, a ValueError with an English message is raised.
3.2 Process data (proc. prefix)
Call examples:
self.get_data('proc.own_cash'): the cash series up to the current step;
self.get_data('proc.own_cash', lag=0): the cash value at the most recent step;
self.get_data('proc.own_cash', lag='1d'): the step corresponding to looking back 1 day by time;
self.get_data('proc.own_cash', window='5d'): a window slice over the past 5 days.
Constraints:
A single call allows only one proc.* field; if multiple proc.* fields are used, or a proc.* field is mixed with static data in the same call, a ValueError with an English message is raised.
lag and window cannot be specified at the same time; lag can be an integer (steps) or a string (e.g., '1d', '8h'), and window is a string (e.g., '5d', '8h').
Return value: always np.ndarray; the shape and data type are subject to the API documentation.
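The constraints above amount to a small set of argument checks. The following is a minimal sketch of such checks, not the library's code: check_get_data_args is a hypothetical helper, the error messages are invented, and raising ValueError for a wrongly typed lag or window is an assumption (the text only specifies ValueError for mixed calls and for lag/window on static data).

```python
def check_get_data_args(*names, lag=None, window=None):
    """Validate a get_data-style call and return 'static' or 'proc'."""
    if not names:
        raise ValueError("at least one data name is required")
    proc_names = [n for n in names if n.startswith("proc.")]
    if not proc_names:
        # Static data: one or more sources, but never lag/window.
        if lag is not None or window is not None:
            raise ValueError("lag/window are only supported for proc.* data")
        return "static"
    if len(names) > 1:
        # Covers both several proc.* fields and proc.* mixed with static names.
        raise ValueError("a call may contain exactly one proc.* field")
    if lag is not None and window is not None:
        raise ValueError("lag and window cannot be specified together")
    # Assumption: type errors are also reported as ValueError.
    if lag is not None and (isinstance(lag, bool) or not isinstance(lag, (int, str))):
        raise ValueError("lag must be an int (steps) or a string such as '1d'")
    if window is not None and not isinstance(window, str):
        raise ValueError("window must be a string such as '5d'")
    return "proc"


kind_static = check_get_data_args("close_E_d", "high_E_d")
kind_proc = check_get_data_args("proc.own_cash", lag="1d")
```

Checking the name list before looking at lag and window keeps the two error families separate: a mixed call is rejected regardless of which keyword arguments accompany it.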
10.4. 4. Backtest Branch Selection and Coordination with Process Data
4.1 Static branch and dynamic branch
Static branch (_backtest_static_operator): Call run_strategies once for all time steps to generate all signals, then complete the backtest using Numba vectorized functions such as backtest_batch_steps. Suitable for strategies that do not depend on process data.
Dynamic branch (_backtest_dynamic_operator): Loop over time steps; at each step generate signals → parse and simulate fills → update positions and cash, then move to the next step. Process data is maintained by the Backtester and injected into the Operator before each step, so strategies can access it via get_data('proc.xxx').
Before running, the Backtester decides which branch to take via Operator.check_dynamic_data().
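The difference between the two branches can be illustrated with a toy single-asset backtest. This is a minimal sketch and not qteasy's implementation: run_static, run_dynamic, the price array, and the rule that a signal is simply a quantity to buy are all assumptions made for illustration, and plain NumPy stands in for the Numba functions.

```python
import numpy as np

prices = np.array([10.0, 11.0, 12.0, 11.5])
start_cash = 100.0


def run_static(signal_fn):
    """Generate every signal up front, then settle all steps in one vectorized pass."""
    qty = signal_fn(prices)                      # one call covering all steps
    cash = start_cash - np.cumsum(qty * prices)  # cash after each step
    return qty, cash


def run_dynamic(step_fn):
    """Loop over steps: signal -> fill -> update account, then move on."""
    qty = np.zeros(len(prices))
    cash = np.zeros(len(prices))
    current = start_cash
    for k in range(len(prices)):
        # The strategy sees prices up to step k and cash at the start of step k.
        qty[k] = step_fn(k, prices[: k + 1], current)
        current -= qty[k] * prices[k]            # simulate the fill
        cash[k] = current
    return qty, cash


def buy_one_everywhere(all_prices):
    return np.ones(len(all_prices))


def buy_one_each_step(k, seen_prices, cash_now):
    return 1.0


def buy_two_while_cash_rich(k, seen_prices, cash_now):
    # Depends on the account state, so it cannot be computed up front.
    return 2.0 if cash_now >= 60.0 else 0.0


static_qty, static_cash = run_static(buy_one_everywhere)
dyn_qty, dyn_cash = run_dynamic(buy_one_each_step)
aware_qty, aware_cash = run_dynamic(buy_two_while_cash_rich)
```

The first two runs produce the same cash path because the signal ignores the account state. The third cannot be expressed in run_static at all, since each signal depends on cash that only exists after the previous fill.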
4.2 Decision logic of check_dynamic_data() (current implementation)
If any of the following is true, return True and take the dynamic branch:
op_type == 'stepwise': The Operator is explicitly configured to stepwise mode.
Using proc.* in strategy source code: _strategies_use_proc_data() checks whether the source code of each strategy’s realize() contains 'proc.' or "proc.". If so, the strategy is considered to depend on process data.
Therefore, as long as get_data('proc.xxx') is called in realize(), it will automatically take the dynamic branch without any declaration. Process data is accessed only via proc.*; declaring it via DataType is no longer supported (legacy op_* types have been removed).
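The decision logic can be sketched as a text check over strategy source code. This is a hedged illustration, not the actual check_dynamic_data(): both function names are hypothetical, the source text is passed in as a string (how it is obtained is outside the sketch), and matching an opening quote followed by proc. is an assumed reading of the rule above. The value 'batch' is used only as a placeholder for any non-stepwise mode.

```python
def source_uses_proc(realize_source: str) -> bool:
    """Return True if the text of a realize() method contains a quoted proc.* name."""
    return "'proc." in realize_source or '"proc.' in realize_source


def needs_dynamic_branch(op_type: str, realize_sources) -> bool:
    """Dynamic branch if stepwise mode is set or any strategy mentions proc.* data."""
    if op_type == "stepwise":
        return True
    return any(source_uses_proc(src) for src in realize_sources)


static_src = "def realize(self):\n    return self.get_data('close_E_d')\n"
proc_src = "def realize(self):\n    cash = self.get_data(\"proc.own_cash\")\n    return cash[-1]\n"

static_only = needs_dynamic_branch("batch", [static_src])
with_proc = needs_dynamic_branch("batch", [static_src, proc_src])
forced = needs_dynamic_branch("stepwise", [static_src])
```

A text match of this kind is deliberately conservative: a quoted proc. name inside a comment or an unused branch also selects the dynamic path, which costs speed but never hides a real dependency.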
4.3 No look-ahead guarantee
When the strategy generates a signal at step k:
Account/position data (e.g., own_cashes, own_amounts) is visible at most up to index [0..k] (i.e., the state at the start of the current step);
Trade-related data (e.g., trade_records, trade_cost) is visible at most up to [0..k-1], excluding trades that have not yet occurred in this step.
In the implementation, Operator’s _current_signal_index and Strategy’s _get_process_data_single() slice by the above ranges. The Backtester updates the index before generating signals at each step, ensuring no look-ahead.
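The two visibility ranges translate directly into two slice bounds. A minimal sketch follows, assuming each field is stored as an array with one row per step; visible_part and the two field sets are hypothetical names and do not reproduce _get_process_data_single().

```python
import numpy as np

ACCOUNT_FIELDS = {"own_cash", "own_amounts"}    # state at the start of each step
TRADE_FIELDS = {"trade_records", "trade_cost"}  # results produced during each step


def visible_part(field, full_array, k):
    """Return the rows of full_array that a strategy may see at step k."""
    if field in ACCOUNT_FIELDS:
        return full_array[: k + 1]  # indices 0..k
    if field in TRADE_FIELDS:
        return full_array[:k]       # indices 0..k-1, empty when k == 0
    raise KeyError(field)


cash = np.array([100.0, 90.0, 79.0, 67.0])
fills = np.array([1.0, 1.0, 1.0, 1.0])

seen_cash = visible_part("own_cash", cash, k=2)
seen_fills = visible_part("trade_records", fills, k=2)
first_step_fills = visible_part("trade_records", fills, k=0)
```

At step 2 the strategy sees three cash values but only two fills, because the fill of step 2 is the consequence of the signal being generated.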
4.4 The injection relationship between Backtester and Operator
Backtester (dynamic branch): at the _backtest_dynamic_operator entry point, inject own_cashes, available_cashes, own_amounts_array, available_amounts_array, trade_records_array, trade_cost_array, trade_price_array, trade_price_data, etc. into the Operator as _process_data_sources, and set _process_time_index to a timeline aligned with op_signal_index.
Operator: In run_strategy(step_index), before each call to stg.generate(), compute and update _current_signal_index based on group_timing_table and group_merge_type, so that the Strategy can slice the “currently visible” process data.
4.5 Process data in live trading (Trader)
When operator.check_dynamic_data() is True, Trader will, in _run_strategy(), do the following:
Assemble the current account cash, positions, available quantities, current prices, etc. into single-step _process_data_sources and _process_time_index (a single live-trading run is treated as one step);
Within this step, the strategy can call get_data('proc.own_cash'), etc. to obtain the current account/position view; the trade history is empty within this step, consistent with the semantics of “no trades have occurred in this step yet”.
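The single-step assembly can be pictured as packing one account snapshot into arrays with a leading axis of length 1. The sketch below is an assumption-laden illustration: build_single_step_sources is a hypothetical helper, the array shapes are guesses, and "current_prices" is an invented key. The other keys reuse names listed in Section 4.4.

```python
import numpy as np


def build_single_step_sources(cash, available_cash, amounts, available_amounts, prices):
    """Pack a live account snapshot into one-step process-data arrays."""
    amounts = np.asarray(amounts, dtype=float)
    n = amounts.shape[0]  # number of instruments
    return {
        "own_cashes": np.array([cash], dtype=float),            # shape (1,)
        "available_cashes": np.array([available_cash], dtype=float),
        "own_amounts_array": amounts.reshape(1, n),             # shape (1, n)
        "available_amounts_array": np.asarray(available_amounts, dtype=float).reshape(1, n),
        "current_prices": np.asarray(prices, dtype=float).reshape(1, n),  # hypothetical key
        # No trade has happened inside this step yet, so the history has zero rows.
        "trade_records_array": np.empty((0, n)),
        "trade_cost_array": np.empty((0, n)),
        "trade_price_array": np.empty((0, n)),
    }


sources = build_single_step_sources(
    cash=1000.0,
    available_cash=800.0,
    amounts=[10, 0],
    available_amounts=[10, 0],
    prices=[12.5, 30.0],
)
```

Giving the trade arrays zero rows rather than omitting them lets a strategy call get_data on a trade field in live trading and receive an empty result instead of an error.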
10.5. 5. Dynamic/Static Path Consistency Conventions
When a strategy does not use process data: it should take the static branch; if it takes the dynamic branch for other reasons, the backtest results should be exactly identical numerically to the static branch (under the same configuration and data).
When a strategy uses process data: it must go through the dynamic branch; otherwise, because _process_data_sources has not been injected, a RuntimeError is raised.
Test convention: use StaticSignalStg (purely static) and ProcAwareButStaticLogicStg (calls proc.* but does not use it for signals) to backtest under the same configuration, and assert numerical consistency for own_cashes, own_amounts_array, trade_records_array, etc.; see Group B in tests/test_process_data_api.py.
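The consistency convention can be expressed as an exact array comparison. The helper below is a sketch under stated assumptions: assert_same_backtest is a hypothetical name, and representing each backtest result as a dictionary of arrays is an assumption. It does not reproduce the code in tests/test_process_data_api.py.

```python
import numpy as np

COMPARED_KEYS = ("own_cashes", "own_amounts_array", "trade_records_array")


def assert_same_backtest(static_result, dynamic_result, keys=COMPARED_KEYS):
    """Fail if any compared array differs between the two results."""
    for key in keys:
        np.testing.assert_array_equal(
            static_result[key],
            dynamic_result[key],
            err_msg=f"static and dynamic branches disagree on {key}",
        )


# Toy results standing in for the output of the two branches.
static_result = {
    "own_cashes": np.array([100.0, 90.0]),
    "own_amounts_array": np.array([[0.0], [1.0]]),
    "trade_records_array": np.array([[1.0], [0.0]]),
}
dynamic_result = {key: value.copy() for key, value in static_result.items()}

assert_same_backtest(static_result, dynamic_result)
```

Exact equality is used rather than a tolerance because the convention asks for numerically identical results, so even a difference in floating-point summation order would be reported.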
10.6. 6. Test and Documentation Index
Dedicated tests: tests/test_process_data_api.py
Group A: the behavior of check_dynamic_data() under purely static strategies / strategies using proc.*;
Group B: backtest array consistency between static strategies and “calling proc but logically equivalent” strategies;
Group C: get_data error behavior for static multi-source, rejecting lag/window, proc single-field, and mixed calls;
Group D: no look-ahead validation for proc.trade_records;
Group E: correctness of the real dynamic-strategy path based on process data.
Project memory: .cursor/rules/process-data-and-dynamic-backtest.mdc (implementation and convention summary).
Strategy data overview: How Strategies Declare and Use Data.
Backtest entry points and modes: Backtesting, live trading, and optimization.