Data

Data modules bring time series data into AWTS. Modules are available for most major brokers and data vendors. Data sources typically provide price data for tradable instruments, but they can also carry any other data, including fundamentals and alternative data.

In essence, a data source is a time series: a timestamped sequence of data points.

All data modules/sources expose the same data API, so you can interact with data from any source in the same way using the same code.

Key benefits

  • Consistent streaming: stream historic and realtime data in the same way, including transparently streaming historic data into realtime.
  • Reliability: data modules handle disconnects and retries automatically, so your code doesn't have to.
  • Performance: local storage and smart caching mean data can be streamed into backtests at full speed.
  • Less boilerplate: no need to cope with rate limits, connection limits, or disconnect/retry logic; the data services handle all of this for you.
  • Consistent API: one clean, standard API across all data providers, with full unary and streaming support.
  • Quality of life: make life easier for your code; e.g. start "streaming" from the last 24 hours of data right through to "now", with no need to get, then stream, then handle the overlap in your own logic.

Essentially, the data services make it quicker and easier for you to build robust and performant strategies.
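For illustration, here is what the consistent API can mean in practice. This is a minimal sketch: the client object, the get_candles method, and the second prefix are assumed names for illustration, not the documented AWTS API.

    # Hypothetical AWTS Python client; names here are illustrative assumptions.
    from datetime import datetime, timedelta, timezone

    def last_day_of_closes(client, symbol: str) -> list[float]:
        """Fetch 24 hours of hourly candles; the call is identical for any source."""
        end = datetime.now(timezone.utc)
        candles = client.get_candles(
            symbol=symbol,                 # the prefix selects the data module
            timeframe="1h",
            start=end - timedelta(days=1),
            end=end,
        )
        return [c.close for c in candles]

    # The same code path works regardless of the upstream broker/vendor:
    #   last_day_of_closes(client, "OANDA:GBP_USD")
    #   last_day_of_closes(client, "EXAMPLEVENDOR:AAPL")   # hypothetical prefix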

Instruments & prefixes

Each data service registers one or more "prefixes" which form part of the symbol you subsequently use when querying data. The prefix will typically be the broker/vendor name.

E.g. the dataoanda module provides market data from the Oanda CFD broker and uses the prefix OANDA - so instruments there are referenced as OANDA:NAS100_USD or OANDA:GBP_USD.
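As a quick sketch of the convention (the prefix is simply the part before the colon):

    # A symbol is "<PREFIX>:<INSTRUMENT>"; the prefix routes to a data module.
    prefix, instrument = "OANDA:NAS100_USD".split(":", 1)
    assert prefix == "OANDA"
    assert instrument == "NAS100_USD"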

See the Symbology documentation for more on how AWTS handles instrument symbology.

Latency & timeliness

The term 'real time' is broad and tends to be used in an approximate sense.

E.g. 'real time' price data for equities or futures is generally treated as immediate, but there is inevitably some delay or latency involved: partly from the exchange(s) to your broker/vendor, and again from your broker/vendor to you. 'Near-time' would be a more precise description.

For other data sources, the lag can be far greater; e.g. economic data often represents the previous month or quarter and is often not available until days or weeks into the next period.

Furthermore, data can be revised after the fact. This is the case for equity price data (e.g. EOD reporting of dark pool trades) and for economic data (e.g. revisions of non-farm payrolls data).

AWTS data sources include latency metadata and update/revision support so that, whether running historic backtests or realtime, your strategies 'know' each data point only from the point in time at which it could have been known. This is key to how the Execution Engine delivers an accurate time-based simulation for backtesting and market replays.

The latency characteristics of each data source are source-specific and are documented in the relevant vendor's Data module.
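As a sketch of what point-in-time correctness means in practice (the field names observed_at and available_at are assumptions for illustration, not the documented AWTS schema):

    from dataclasses import dataclass
    from datetime import datetime

    # Illustrative schema: each point records both the period it describes and
    # the moment it first became knowable (publication time plus latency).
    @dataclass
    class DataPoint:
        observed_at: datetime    # the time the value refers to (e.g. the month measured)
        available_at: datetime   # the earliest moment you could have known it
        value: float

    def known_as_of(points: list[DataPoint], sim_time: datetime) -> list[DataPoint]:
        """Return only the points a strategy could legitimately see at sim_time."""
        return [p for p in points if p.available_at <= sim_time]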

Caching

One of the mechanisms AWTS data modules use to increase performance is to dynamically cache historic market data.

Then, when you request data for backtesting or analysis, the cached data held locally can be used to serve results extremely quickly.

The caching is smart, and automatically self-invalidates when events occur that affect historic data; e.g. after stock splits.

Data is cached automatically when it is first received from the upstream broker/vendor. However, you can also use the API or CLI tool to 'precache' data you expect to need; this is particularly useful ahead of large backtests.
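For example, precaching a year of low-resolution data ahead of a backtest might look like the sketch below; the precache method and the CLI form in the comment are hypothetical names for illustration.

    from datetime import datetime, timezone

    # Hypothetical precache call; `client` stands for a connected AWTS data client.
    client.precache(
        symbol="OANDA:GBP_USD",
        timeframe="1s",    # lowest resolution, used for simulating fills
        start=datetime(2022, 1, 1, tzinfo=timezone.utc),
        end=datetime(2023, 1, 1, tzinfo=timezone.utc),
    )
    # A hypothetical CLI equivalent might be:
    #   awts data precache OANDA:GBP_USD --timeframe 1s --from 2022-01-01 --to 2023-01-01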

Note: for backtesting, fills are simulated using the lowest available data resolution; e.g. 1-second bars or tick data. It is worth ensuring ample disk space is available to keep this data cached for maximum backtesting performance.

Streaming

The data service supports streaming candles in any timeframe, from any point in time. Use this to stream unlimited backdata into an export file, or to seamlessly transition a realtime strategy from populating backdata (e.g. to seed moving averages) into live streaming data.
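A sketch of the backdata-into-live pattern, assuming the same hypothetical client as above (stream_candles is an assumed name, not the documented API):

    from datetime import datetime, timedelta, timezone

    start = datetime.now(timezone.utc) - timedelta(hours=24)
    closes: list[float] = []

    # Historic candles are replayed first at full speed, then the stream
    # transitions seamlessly into live data; no get/stream/overlap handling here.
    for candle in client.stream_candles("OANDA:NAS100_USD", timeframe="1m", start=start):
        closes.append(candle.close)
        sma20 = sum(closes[-20:]) / min(len(closes), 20)   # moving average seeds itself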

Live streaming considerations

Each broker/vendor has its own peculiarities around how real time market data is provided.

In particular, 'live' data is not always completely live, and 'closed' candles are not always completely closed. As well as the latency between you and the broker, the broker itself has upstream latency to the exchange(s) where trades actually happen. So your hourly candle can't close immutably on the hour; some data comes in after the fact.

Each broker/vendor behaves slightly differently, and the peculiarities of each are documented in that vendor's Data module documentation. However, the AWTS modules handle these peculiarities for you (with configurability where relevant), so your code can just stream data freely.

Primary considerations

  • Fast vs accurate: do we delay candles so they don't get corrected, or deliver them ASAP? (see the configuration sketch below)
  • Padding: do we deliver a record when there's no data, or not?
  • Candle latency vs correctness: how long to hold a candle before treating it as final.
  • Updating already-delivered candles: corrections can arrive after a candle has been sent (see the Update record type below).
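A configuration sketch for these trade-offs; the option names (delivery, padding) are assumptions for illustration, not documented AWTS settings:

    # Illustrative only: option names are assumed; `client` is an AWTS data client.
    stream = client.stream_candles(
        "OANDA:GBP_USD",
        timeframe="1h",
        delivery="fast",   # "fast": deliver ASAP, corrections may follow as Updates;
                           # "accurate": hold candles briefly so they rarely change
        padding=True,      # emit NewBlank records for periods with no trades
    )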

Streaming records

  • NewClosed - a new candle that can be considered 'closed'
  • NewBlank - a new blank candle for a period where no trades happened
  • Partial - a partial/incomplete candle for a period that hasn't ended yet
  • Update - a correction to an already-sent candle (or blank)
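A sketch of a consumer handling the four record types; the Record class and its fields are assumed names for illustration:

    from dataclasses import dataclass

    @dataclass
    class Record:
        kind: str             # "NewClosed" | "NewBlank" | "Partial" | "Update"
        time: str             # period start, e.g. "12:00:00"
        candle: dict | None   # OHLCV payload; None for blanks

    bars: dict[str, dict | None] = {}
    partial: dict | None = None

    def on_record(rec: Record) -> None:
        global partial
        if rec.kind in ("NewClosed", "Update"):
            bars[rec.time] = rec.candle    # a finished candle, or a correction to one
        elif rec.kind == "NewBlank":
            bars[rec.time] = None          # no trades this period
        elif rec.kind == "Partial":
            partial = rec.candle           # still forming; superseded by NewClosed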

Example

  • NewClosed: candle 12:00:00, vol=1234
  • NewClosed: candle 12:01:00, vol=3456
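Expressed as hypothetical record objects (the trailing Update is an invented continuation, showing a late correction to the 12:01 candle):

    records = [
        {"kind": "NewClosed", "time": "12:00:00", "volume": 1234},
        {"kind": "NewClosed", "time": "12:01:00", "volume": 3456},
        # Hypothetical continuation: late trades revise the 12:01 candle.
        {"kind": "Update",    "time": "12:01:00", "volume": 3460},
    ]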