Stream

CREATE STREAM

Stream is a key concept in Timeplus. All data lives in streams, no matter static data or data in motion. We don't recommend you to create or manage TABLE in Timeplus.

Append-only Stream

By default, the streams are append-only and immutable. You can create a stream, then use INSERT INTO to add data.

Syntax:

CREATE STREAM [IF NOT EXISTS] [db.]<stream_name>
(
    <col_name1> <col_type_1> [DEFAULT <col_expr_1>] [compression_codec_1],
    <col_name1> <col_type_2> [DEFAULT <col_expr_2>] [compression_codec_2]
)
SETTINGS <event_time_column>='<col>', <key1>=<value1>, <key2>=<value2>, ...

info

Stream creation is an async process.

If you omit the database name, default will be used. Stream name can be any utf-8 characters and needs backtick quoted if there are spaces in between. Column name can be any utf-8 characters and needs backtick quoted if there are spaces in between.

Data types

Timeplus Proton supports the following column types:

int8/16/32/64/128/256
uint8/16/32/64/128/256
bool
decimal(precision, scale) : valid range for precision is [1: 76], valid range for scale is [0: precision]
float32/64
date
datetime
datetime64(precision, [time_zone])
string
fixed_string(N)
array(T)
uuid
ipv4/ipv6

For more details, please check Data Types.

Event Time

In Timeplus, each stream with a _tp_time as Event Time. If you don't create the _tp_time column when you create the stream, the system will create such a column for you, with now64() as the default value. You can also choose a column as the event time, using

SETTINGS event_time_column='my_datetime_col'

It can be any sql expression which results in datetime64 type.

Retention Policies

Proton supports retention policies to automatically remove out-of-date data from the streams.

For Historical Storage

Proton leverages ClickHouse TTL expression for the retention policy of historical data. When you create the stream, you can add TTL to_datetime(_tp_time) + INTERVAL 12 HOUR to remove older events based a specific datetime column and retention period.

For Streaming Storage

You can set the retention policies for streaming storage when you create the stream or update the setting after creation.

CREATE STREAM .. SETTINGS logstore_retention_bytes=.., logstore_retention_ms=..;

ALTER STREAM .. MODIFY SETTINGS logstore_retention_bytes=.., logstore_retention_ms=..;

Versioned Stream

Versioned Stream allows you to specify the primary key(s) and focus on the latest value. For example:

CREATE STREAM versioned_kv(i int, k string, k1 string)
PRIMARY KEY (k, k1)
SETTINGS mode='versioned_kv', version_column='i';

The default version_column is _tp_time. For the data with same primary key(s), Proton will use the ones with maximum value of version_column. So by default, it tracks the most recent data for same primary key(s). If there are late events, you can use specify other column to determine the end state for your live data.

Changelog Stream

Changelog Stream allows you to specify the primary key(s) and track the add/delete/update of the data. For example:

CREATE STREAM changelog_kv(i int, k string, k1 string)
PRIMARY KEY (k, k1)
SETTINGS mode='changelog_kv', version_column='i';

CREATE RANDOM STREAM

You may use this special stream to generate random data for tests. For example:

CREATE RANDOM STREAM devices(
  device string default 'device'||to_string(rand()%4),
  location string default 'city'||to_string(rand()%10),
  temperature float default rand()%1000/10);

The following functions are available to use:

rand to generate a number in uint32
rand64 to generate a number in uint64
random_printable_ascii to generate printable characters
random_string to generate a string
random_fixed_string to generate string in fixed length
random_in_type to generate value with max value and custom logic

When you run a Timeplus SQL query with a random stream, the data will be generated and analyzed by the query engine. Depending on the query, all generated data or the aggregated states can be kept in memory during the query time. If you are not querying the random stream, there is no data generated or kept in memory.

By default, Proton tries to generate as many data as possible. If you want to (roughly) control how frequent the data is generated, you can use the eps setting. For example, the following SQL generates 10 events every second:

CREATE RANDOM STREAM rand_stream(i int default rand()%5) SETTINGS eps=10

You can further customize the rate of data generation via the interval_time setting. For example, you want to generate 1000 events each second, but don't want all 1000 events are generated at once, you can use the following sample SQL to generate events every 200 ms. The default interval is 5ms (in Proton 1.3.27 or the earlier versions, the default value is 100ms)

CREATE RANDOM STREAM rand_stream(i int default rand()%5) SETTINGS eps=1000, interval_time=200

Please note, the data generation rate is not accurate, to balance the performance and flow control.

info

New in Proton v1.4.2, you can set eps less than 1. Such as eps=0.5 will generate 1 event every 2 seconds. eps less than 0.00001 will be treated as 0.

For testing or demonstration purpose, you can create a random stream with multiple columns and use the table function to generate random data at once. The number of rows generated by this way is predefined and subject to change. The current value is 65409.

CREATE EXTERNAL STREAM

Please check Read/Write Kafka with External Stream.

CREATE STREAM​

Append-only Stream​

Data types​

Event Time​

Retention Policies​

For Historical Storage​

For Streaming Storage​

Versioned Stream​

Changelog Stream​

CREATE RANDOM STREAM​

CREATE EXTERNAL STREAM​