PyFlink query is very slow to fetch data: doesn't the WHERE clause filter the data?


肖越
The connector reads the entire table from the database, and then I execute:
env.sql_query("select a , b, c from table1 left join table2 on a = d where b = '103' and c = '203' and e = 'AC' and a between 20160701 and 20170307 order by biz_date")
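Table a is very large, roughly 10 million rows, but only 250 rows match the query. Running it on my local machine takes 10 minutes!
I understand that in Flink 1.11 the WHERE clause does not filter the data first. Does Flink 1.12 still have this problem? How can I optimize this?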