Array index out of bounds


Array index out of bounds

allanqinjy
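Hi all,
     I am on Flink 1.10. My Flink batch SQL passes syntax validation without problems, but running the script fails with the error below. The same script runs fine on Hive. I don't understand why Flink reports an array index out of bounds. What is the problem?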
     


 


Re: Array index out of bounds

Yichao Yang
The image is not visible.
Flink's built-in UDFs differ from Hive's UDFs; some of them use indexes that start at 1.
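
To make the index-convention point concrete, here is a minimal sketch in plain Java. It assumes the usual documented conventions (0-based element access in Hive, 1-based in Flink SQL); the class and method names are hypothetical and are not part of either project.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: contrasts 0-based and 1-based element access.
// IndexConventions and its methods are hypothetical names, not Flink or Hive APIs.
final class IndexConventions {

    // 0-based access: the first element is at index 0.
    static <T> T getZeroBased(List<T> values, int index) {
        return values.get(index);
    }

    // 1-based access: the first element is at index 1.
    // Rejects anything outside 1..size, then translates to a 0-based position.
    static <T> T getOneBased(List<T> values, int index) {
        if (index < 1 || index > values.size()) {
            throw new IllegalArgumentException(
                "index " + index + " is outside 1.." + values.size());
        }
        return values.get(index - 1);
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("a", "b", "c");
        // Both calls read the first element, each under its own convention.
        System.out.println(getZeroBased(parts, 0));
        System.out.println(getOneBased(parts, 1));
    }
}
```

A query ported from a 0-based engine that reads index 0 would hit the rejected case here, which is one way a script can behave differently between the two systems.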








     




 

Re: Re: Array index out of bounds

allanqinjy
I think that if the indexes started at 1, the exception should be raised at compile time, not when the submitted job runs.




Caused by: java.lang.ArrayIndexOutOfBoundsException: 22369621
18-05-2020 16:27:14 CST INFO - at org.apache.flink.table.runtime.util.SegmentsUtil.getByteMultiSegments(SegmentsUtil.java:598)
18-05-2020 16:27:14 CST INFO - at org.apache.flink.table.runtime.util.SegmentsUtil.getByte(SegmentsUtil.java:590)
18-05-2020 16:27:14 CST INFO - at org.apache.flink.table.runtime.util.SegmentsUtil.bitGet(SegmentsUtil.java:534)
18-05-2020 16:27:14 CST INFO - at org.apache.flink.table.dataformat.BinaryArray.isNullAt(BinaryArray.java:117)
18-05-2020 16:27:14 CST INFO - at BatchCalc$822.processElement(Unknown Source)
18-05-2020 16:27:14 CST INFO - at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.pushToOperator(OperatorChain.java:550)
18-05-2020 16:27:14 CST INFO - at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.collect(OperatorChain.java:527)
18-05-2020 16:27:14 CST INFO - at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.collect(OperatorChain.java:487)
18-05-2020 16:27:14 CST INFO - at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingBroadcastingOutputCollector.collect(OperatorChain.java:748)
18-05-2020 16:27:14 CST INFO - at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingBroadcastingOutputCollector.collect(OperatorChain.java:734)
18-05-2020 16:27:14 CST INFO - at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:730)
18-05-2020 16:27:14 CST INFO - at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:708)
18-05-2020 16:27:14 CST INFO - at org.apache.flink.streaming.api.operators.StreamSourceContexts$NonTimestampContext.collect(StreamSourceContexts.java:104)
18-05-2020 16:27:14 CST INFO - at org.apache.flink.streaming.api.functions.source.InputFormatSourceFunction.run(InputFormatSourceFunction.java:93)
18-05-2020 16:27:14 CST INFO - at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:100)
18-05-2020 16:27:14 CST INFO - at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:63)









Re: Re: Array index out of bounds

Benchao Li
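The array length is a runtime property; it is not known at compile time. Also, it seems that there is currently no check on whether the index is valid (for example, that it must be greater than 0). We have run into this kind of problem often before.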
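
As a rough illustration of why a missing index check surfaces only at runtime, and with a confusing number in the message, here is a toy model in Java. It is not Flink's `BinaryArray` code. It only assumes a null bitmap packed into bytes, and `ToyBitmapArray` with all of its methods is hypothetical.

```java
// Toy model only: NOT Flink's BinaryArray. It keeps a null bitmap in a byte[]
// to show how an unchecked index can turn into a far out-of-range byte offset.
final class ToyBitmapArray {
    private final byte[] nullBits;
    private final int size;

    ToyBitmapArray(int size) {
        this.size = size;
        this.nullBits = new byte[(size + 7) / 8]; // one bit per element
    }

    void setNullAt(int pos) {
        checkIndex(pos);
        nullBits[pos >>> 3] |= (byte) (1 << (pos & 7));
    }

    // Unchecked read: a negative pos becomes a huge byte offset after >>> 3,
    // so the failure mentions an offset that has no obvious link to the input.
    boolean isNullAtUnchecked(int pos) {
        return (nullBits[pos >>> 3] & (1 << (pos & 7))) != 0;
    }

    // Checked read: fails early with a message that names the bad position.
    boolean isNullAtChecked(int pos) {
        checkIndex(pos);
        return isNullAtUnchecked(pos);
    }

    private void checkIndex(int pos) {
        if (pos < 0 || pos >= size) {
            throw new IndexOutOfBoundsException(
                "position " + pos + " is outside 0.." + (size - 1));
        }
    }

    public static void main(String[] args) {
        ToyBitmapArray arr = new ToyBitmapArray(10);
        arr.setNullAt(3);
        System.out.println(arr.isNullAtChecked(3));
        try {
            arr.isNullAtUnchecked(-1); // -1 >>> 3 == 536870911
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unchecked read failed: " + e.getMessage());
        }
    }
}
```

The length is only known once a row arrives, so neither variant can fail before the job runs. The checked variant just fails with a clearer message. Whether this is what happened in the job above is not something the trace alone confirms.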



--

Benchao Li
School of Electronics Engineering and Computer Science, Peking University
Tel:+86-15650713730
Email: [hidden email]; [hidden email]

Re: Array index out of bounds

Leonard Xu
Hi, allanqinjy

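Throwing an ArrayIndexOutOfBoundsException at runtime is not the expected behavior; it looks like a bug.
If you can reproduce it, could you share the SQL and data needed to reproduce it?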

Best,
Leonard Xu




Re: Array index out of bounds

Leonard Xu
Hi,  allanqinjy

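Could you paste the query? I ran into this same problem today while investigating a different issue, and I opened an issue to track it [1]. I would like to see whether the cause is the same.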


Best,
Leonard
[1] https://issues.apache.org/jira/browse/FLINK-17847
