Implementation of revolving door data compression algorithm in PostgreSQL

Background

In application fields such as the Internet of Things, monitoring, sensors, and finance, data is generated in a streaming manner in the time dimension, and the amount of data is very large.

For example, the performance monitoring view we often see is a curve drawn by many points in the time dimension.

Another example is the trend data of the financial industry and so on.

Implementation of revolving door data compression algorithm in PostgreSQL

Let\’s imagine that if each sensor or indicator generates 1 point every 100 milliseconds, that is 864,000 points in a day.

There are a lot of sensors or indicators. For example, if there are 1 million sensors or indicators, the amount per day is close to 100 million.

Suppose we want to draw a graph for a time period. With so many points, it will take a long time to render.

So is there a good compression algorithm that can ensure distortion and compress data well?

Principle of the revolving door compression algorithm

The revolving door compression algorithm (SDT) is a linear trend compression algorithm. Its essence is to replace a series of continuous compression algorithms with a straight line determined by the starting point and the end point. data points.

This algorithm needs to record the length of each time interval, starting point data and end point data. The end point data of the previous period is the starting point data of the next period.

The basic principle is relatively simple, see the figure.

Implementation of revolving door data compression algorithm in PostgreSQL

Implementation of revolving door data compression algorithm in PostgreSQL

The first data point a has one point above and one below, and the distance between them and point a is E (i.e. door width), These two points serve as the two fulcrums of the \”door\”.

When there is only the first data point, both doors are closed; as the number of points increases, the door will gradually open; Notice that the width of each door can be retracted. Within a certain period of time, once the door is opened, Cannot be closed;

As long as the two doors are not parallel, or the sum of the two internal angles is less than 180° (the algorithm in this article will use this to make judgments), this \”revolving door\” operation can be Continue.

The first time period in the figure is from a to e, The result is to replace the data points (a, b, c, d, e) with a straight line from point a to point e; which plays a compression role of controllable distortion (E)

The second one. The time interval starts from point e, and the two doors are closed at the beginning, and then gradually opened. The subsequent operation is similar to the previous paragraph.

Implementing the revolving door compression algorithm in PostgreSQL

Through the revolving door. The principle of the algorithm can be understood as having several necessary inputs.

  • A point with x coordinate and y coordinate (if it is a point on the time axis, it can be converted into this form through epoch)

  • E, the width of the gate, plays a role in controlling the compression distortion

Example

Create a test table

create table tbl(id int, — ID, optional val numeric, — value (such as sensor or point value in the financial industry) t timestamp — value timestamp);

Insert 100,000 pieces of test data

insert into tbl select generate_series (1,100000), round((random()*100)::numeric, 2), clock_timestamp()+(generate_series(1,100000) || \’ second\’)::interval ; test=> select * from tbl limit 10;

id | val | t

—-+——-+- —————————

1 | 31.79 | 2016-08-12 23:22:27.530318

2 | 18.23 | 2016-08-12 23:22:28.530443

3 | 5.14 | 2016-08-12 23:22:29.530453

4 | 90.25 | 2016-08-12 23:22:30.530459

5 | 8.17 | 2016-08-12 23:22:31.530465

6 | 97.43 | 2016-08-12 23:22:32.53047

7 | 17.41 | 2016-08-12 23:22:33.530476

8 | 0.23 | 2016-08-12 23:22 :34.530481

9 | 84.67 | 2016-08-12 23:22:35.530487

10 | 16.37 | 2016-08-12 23:22:36.530493

(10 rows)

How to convert time into a value on the X-axis, assuming that every 1 second is 1 unit of the X coordinate

test=> select (extract(epoch from t)-extract( epoch from first_value(t) over())) / 1 as x, — divided by 1 second as 1 unit

val, t from tbl limit 100;

x | val|t

——————+——-+—————- ————

0 | 31.79 | 2016-08-12 23:22:27.530318 1.00012493133545 | 18.23 | 2016-08-12 23:22:28.530443 2.00013494491577 | 5.14 | 2016-08-12 23:22:29.530453 3.00014090538025 | 90.25 | 2016-08-12 23:22:30.530459 4.00014686584473 | 8.17 | 2016-08-12 23:22:31.530465 5.00015187263489 | 97.43 | 2016-08-12 23:22:32.53047 6.00015807151794 | 17.41 | 2016-08-12 23:22:33.530476 7.00016307830811 | 0.23 | 2016-08-12 23:22:34.530481 8.00016903877258 | 84.67 | 2016-08-12 23:22:35.530487

Write a function to implement the spiral door algorithm

create or replace function f (

i_radius numeric, — compression radius

i_time timestamp, — start time

i_interval_s numeric, — time conversion interval (seconds, for example, every 5 seconds represents 1 unit on the coordinates interval, 5) is used here

query text, — data that needs to be compressed by revolving door, example \’select t, val from tbl where t>=%L order by t limit 100\’, the select clause must be fixed and must be sorted by t

OUT o_val numeric, — value, ordinate y (jump point y)

OUT o_time timestamp, — time , abscissa x (jump point x)

OUT o_x numeric — jump point x, converted by o_time)

returns setof record as $

declare

v_time timestamp; — time variable

v_x numeric; — v_time is converted to v_x

v_val numeric; – – y coordinate

v1_time timestamp; — previous point time variable

v1_x numeric; — previous point v_time Convert to v_x

v1_val numeric; — The y coordinate of the previous point

v_start_time numeric; — Record the time coordinate of the first point, used to calculate the x offset

v_rownum int8 := 0; –Used to mark whether it is the first row

v_max_angle1 numeric; — Maximum upper door angle

v_max_angle2 numeric; — Maximum lower door angle

v_angle1 numeric; — Angle of upper door angle

v_angle2 numeric; — Angle of lower door begin

for v_time, v_val in execute format(query, i_time)

LOOP

— The first line, the first point, is the actual point to be recorded

v_rownum := v_rownum + 1; if v_rownum=1 then

v_start_time := extract(epoch from v_time);

v_x := 0;

o_val := v_val;

o_time := v_time;

o_x := v_x;

— raise notice \’rownum=1 %, %\’, o_val,o_time;

return next; — Return to the first point

else

v_x := (extract(epoch from v_time) – v_start_time) / i_interval_s; — Generate X coordinates

SELECT 180-ST_Azimuth(

ST_MakePoint(o_x, o_val+i_radius), — Point on the door

ST_MakePoint( v_x, v_val) — next point

)/(2*pi())*360 as degAz, — Upper angle

ST_Azimuth(

ST_MakePoint(o_x, o_val-i_radius), — lower point

ST_MakePoint(v_x, v_val) — next point

p>

)/(2*pi())*360 As degAzrev — lower angle

INTO v_angle1, v_angle2;

select GREATEST(v_angle1, v_max_angle1), GREATEST(v_angle2, v_max_angle2) into v_max_angle1, v_max_angle2; if (v_max_angle1 + v_max_angle2) >= 180 then — Find the point outside the quadrilateral, output the previous point, and start again from the previous point Calculate quadrilateral

— raise notice \’max1 %, max2 %\’, v_max_angle1 , v_max_angle2;

— Restore

v_angle1 := 0;

v_max_angle1 := 0;

v_angle2 : = 0;

v_max_angle2 := 0; — The door is fully opened and the value of the previous point is output

o_val := v1_val;

o_time := v1_time;

v1_x := (extract(epoch from v1_time) – v_start_time) / i_interval_s; — Generate the X coordinate of the previous point

o_x := v1_x;

— Use the new door to calculate the new angle with the current point

SELECT 180-ST_Azimuth(

ST_MakePoint(o_x, o_val+i_radius), — door point

ST_MakePoint(v_x, v_val) — next point

) /(2*pi())*360 as degAz, — upper angle

ST_Azimuth(

ST_MakePoint(o_x, o_val-i_radius), — next point

ST_MakePoint(v_x, v_val) — next point

)/(2*pi())*360 As degAzrev — Lower angle

INTO v_angle1, v_angle2; select GREATEST(v_angle1, v_max_angle1), GREATEST(v_angle2, v_max_angle2) into v_max_angle1, v_max_angle2;

— raise notice \’new max %, new max %\’, v_max_angle1 , v_max_angle2;

— raise notice \’rownum1 %, %\’, o_val, o_time;

return next; end if;

— Record the current value and save it as the previous point of the next point

v1_val := v_val;

v1_time := v_time;

end if;

END LOOP;

end;

$ language plpgsql strict;

Compression test

The door width is 15, the starting time is \’2016-08-12 23:22:27.530318\’, and every 1 second represents 1 X coordinate unit.

test=>

select * from f (

15, — door width=15

\’2016-08-12 23:22:27.530318\’, — Start time

1, — Time coordinate conversion interval, 1 second

\’select t, val from tbl where t>=%L order by t limit 100\’ — query

);

o_val | o_time | o_x

——-+ —————————-+——————

18.23 | 2016-08-12 23:22:28.530443 | 0

5.14 | 2016-08-12 23:22:29.530453 | 1.00001287460327

90.25 | 2016-08-12 23:22:30.530459 | 2.00001883506775

……

87.90 | 2016-08-12 23:24:01.53098 | 93.0005400180817

29.94 | 2016-08-12 23:24:02.530985 | 94.0005450248718

63.53 | 2016-08-12 23:24:03.53099 | 95.0005497932434

12.25 | 2016-08-12 23:24:04.530996 | 96.0005559921265

83.21 | 2016-08-12 23:24:05.531001 | 97.0005609989166

(71 rows)

You can see 100 points, compressed into 71 points.

Compare the original 100 point values

test=> select val, t, (extract(epoch from t)-extract(epoch from first_value(t) over() ))/1 as x from tbl where t>\’2016-08-12 23:22:27.530318\’ order by t limit 100;

val | t | x

——-+——————— ——-+——————

18.23 | 2016-08-12 23:22:28.530443 | 0

5.14 | 2016-08-12 23:22:29.530453 | 1.00001001358032

90.25 | 2016-08-12 23:22:30.530459 | 2.0000159740448

……

83.21 | 2016-08-12 23:24:05.531001 | 97.0005581378937

87.97 | 2016-08-12 23:24:06.531006 | 98.0005631446838

58.97 | 2016-08-12 23:24:07.531012 | 99.0005691051483

(100 rows)

Use excel drawing to compare before and after compression

The above is after compression The data plot, the following is the data plot before compression

The position marked in red is the data compressed through the revolving door algorithm.

Distortion is controllable.

Implementation of revolving door data compression algorithm in PostgreSQL

Implementation of streaming compression

This article is omitted, but it is actually very simple. Change this function to create a The array is a function of input parameters.

In the form of lambda, the number can be obtained from the streaming input pipeline in real time and executed.

It can also be written as an aggregate function based on PostgreSQL Called in the streaming database pipelineDB to implement streaming computing

http://www.pipelinedb.com/

Summary

Through the revolving door algorithm, Real-time compression of streaming data such as IT monitoring, finance, electricity, water conservancy, Internet of Things, etc.

The data does not need to be extracted from the data. Once the database is loaded, the calculation and compression can be completed in the database.

Users can also perform streaming data compression according to actual needs. Similarly, the data does not need to be loaded from the database, it can be done on the database side. Complete.

PostgreSQL is as powerful and easy to use as ever.

Reference

  1. http://baike.baidu.com/view/3478397.htm

  2. http://postgis.net/ docs/manual-2.2/ST_Azimuth.html

  3. https://www.postgresql.org/docs/devel/static/functions-co nditional.html

  4. http://gis.stackexchange.com/questions/25126/how-to-calculate-the-angle-at-which-two-lines-intersect -in-postgis

  5. http://gis.stackexchange.com/questions/668/how-can-i-calculate-the-bearing-between-two-points-in-postgis

  6. http://www.pipelinedb.com/

For more in-depth technical content, please follow Yunqi Community

本站内容及图片来自网络,版权归原作者所有,内容仅供读者参考,不承担相关法律责任,如有侵犯请联系我们:609448834

Like (0)
华夏门网的头像华夏门网
Previous 2024年12月23日
Next 2024年12月23日

相关推荐

  • 科拓旋转闸(全高双门)是一款可靠性高的全高十字旋转门

    旋转闸(全高双门)是一款可靠性高的全高十字旋转门,整体结构精巧,易于安装和维修,配备了标准化的机械和电气系统,可避免两个人同时穿越,水力阻尼系统,确保工作稳定。同时,为符合安全出口的需要,在发生突发事件或停电时,可以将转轴设置在失效保护状态,也就是转动轴可以自由转动,并转换成双向自由通过状态。 旋转闸(全高双门) 从外观上来说,90度的全高转闸有四根柱子,当…

    旋转门 2023年10月29日
    310
  • 会有哪个大冤种跟我一样,把卫生间门做成外开的?赶紧改了吧

    会有哪个大冤种跟我一样会把卫生间的门做成外开的? 我就是那个把卫生间的门做成外开的大冤种,今天跟大家讲一下我是怎么一步一步掉进这个坑里的。 我这个房子原始户型是一个卫生间,计划一改二,所以走廊左边的这些墙在拆旧的时候就已经敲掉了。水电交底的时候设计师和项目经理都说卫生间的照明开关和浴霸开关放在门外面也可以,放在门里面也可以,看我怎么选择。 但是我想我的浴霸是…

    旋转门 2024年7月18日
    200
  • 旋转门、蝴蝶门这些奇特的车门你都见过吗?

    在全球汽车工业长达一百多年的发展历史中,出现过诸多稀奇古怪的设计,当然车门也包含其中,旋转门、蝴蝶门这些奇特的车门你都见过吗? 旋转门 旋转门这一概念是由柯尼塞格提出,英文为dihedral door,开启方式是向前旋转,旋转轴同时向前移动,开门的样子真的非常炫酷,但缺点就是两侧距离一定要保持充足,要不然很容易磕到。 蝴蝶门 蝴蝶门是以剪刀门为原型打造出来的…

    旋转门 2023年9月26日
    1690
  • 疼!女童旋转门内追逐嬉戏被卡,监控还原惊险一幕

    8月28日 苏州某大厦内 一名4岁女童的右脚卡在 大厅的旋转门中无法动弹 通过监控得知 三个小孩在无大人照看的情况下 在大厅旋转门中奔跑嬉闹 一个不小心 该4岁女童重重地滑倒在地 右脚踝也顺势被门缝牢牢卡住 消防员接到报警后立即到场 为女童做好万全的防护措施后 利用切割机、撬棒等工具 将金属门框拆除 经过10分钟的紧张救援 成功救出被卡女童 女童大腿根部受轻…

    旋转门 2023年10月16日
    400
  • 两翼自动旋转门的空间效果怎么样?

      作为旋转门里的佼佼者,两翼自动旋转门显得大气,它开阔的门面、高贵的风格,可以说是大楼的画龙点睛,屹立在建筑的潮流前沿。它具有独特的性能和设计,使工作在这里的人可以充分感受到现代技术的便利和舒适。今天西恩电气小编就和大家来聊一聊两翼自动旋转门。   与其他自动门相比,两翼自动旋转门的宽度更大,有更好的空间效果,能和大厅形成呼应,从而提高了大楼的整体格调,被…

    旋转门 2023年11月7日
    40

联系我们

400-800-8888

在线咨询: QQ交谈

邮件:[email protected]

工作时间:周一至周五,9:30-18:30,节假日休息

关注微信