The smartest stupid problem that you might see today

A few days ago, someone much smarter than me shared with me the following problem. He said it involves "nothing else but simple arrays and for-loops, but has some weird observations to make". I couldn't solve it, even though the solution is simple enough that it can probably be understood by higher-rated pupils on CF. I thought it might be interesting to you, so here it is:

You are given an array $$$a$$$ with $$$n$$$ positive integers (you don't know anything about their maximum value). Find the maximum difference between two adjacent elements of $$$a$$$ after it has been sorted.

Example:
Input: $$$n = 4, a = [11, 2, 7, 5]$$$
Output: $$$4$$$, because if we were to sort the array $$$a$$$, then we would have $$$[2, 5, 7, 11]$$$, so the maximum difference between two adjacent elements is $$$4$$$.

Of course, you can just sort the array in $$$O(n \log n)$$$ and find the difference that way. Value-based sorting (like radix sort) is not an option because you don't know what the maximum value is. But there is actually a way to find it in $$$O(n)$$$, without sorting the array.
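For reference, the sorting baseline can be sketched as below. The function name `max_gap_by_sorting` and the convention of returning 0 for fewer than two elements are assumptions made for this illustration, not part of the problem statement.

```python
def max_gap_by_sorting(a):
    """Baseline: sort, then scan adjacent pairs. O(n log n) overall."""
    if len(a) < 2:
        return 0  # assumption: no gap exists with fewer than two elements
    s = sorted(a)
    # Largest difference between neighbours in sorted order.
    return max(y - x for x, y in zip(s, s[1:]))


print(max_gap_by_sorting([11, 2, 7, 5]))  # 4
```

This is handy as a checker when testing a faster solution on random inputs.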

Solution

I will call $$$max$$$ the maximum value in $$$a$$$ and $$$min$$$ -- the minimum value. You can calculate these in $$$O(n)$$$ time.

The solution kind of uses a "grouping" strategy you might see in sqrt-decomposition problems. The maximum difference is always at least $$$d = \frac{max-min}{n-1}$$$. (For a proof, draw $$$min$$$ and $$$max$$$ on the number line and imagine that you start at $$$min$$$ and want to get to $$$max$$$ by doing $$$n-1$$$ jumps to the right. If all jumps were shorter than $$$d$$$, they would cover less than $$$(n-1)d = max - min$$$ in total and you could not reach $$$max$$$, so there must be at least one jump of length at least $$$d$$$.)
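A quick numeric check of this bound on the example array might look like the sketch below. The helper name `lower_bound_d` is hypothetical, and rounding $$$d$$$ up to an integer is an assumption added here; it keeps the bound valid because every difference between integers is itself an integer.

```python
def lower_bound_d(a):
    """Return ceil((max - min) / (n - 1)) using integer arithmetic only."""
    n = len(a)
    lo, hi = min(a), max(a)
    return -(-(hi - lo) // (n - 1))  # ceiling division; assumes n >= 2


a = [11, 2, 7, 5]
s = sorted(a)
gaps = [y - x for x, y in zip(s, s[1:])]
print(lower_bound_d(a), gaps)  # 3 [3, 2, 4]
```

Here the largest gap is 4, which is indeed at least $$$d = 3$$$.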

With this $$$d$$$ calculated (rounded up to an integer when the division is not exact; the bound still holds because every difference is an integer), we can go through the array $$$a$$$ and place its values into $$$n$$$ buckets of width $$$d$$$:

first bucket: values from $$$min$$$ to $$$min + d-1$$$
second bucket: values from $$$min + d$$$ to $$$min + 2d - 1$$$
third bucket: values from $$$min + 2d$$$ to $$$min + 3d - 1$$$
...
$$$(n-1)$$$-th bucket: values from $$$min + (n-2)d$$$ to $$$min + (n-1)d - 1$$$
$$$n$$$-th bucket: values from $$$min + (n-1)d$$$ upward (when $$$d$$$ divides $$$max - min$$$ exactly, $$$min + (n-1)d = max$$$ and this bucket stores only $$$max$$$).
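In code, the bucket of a value does not need to be searched for; it can be computed directly by integer division. A minimal sketch, with a hypothetical helper `bucket_index` that uses 0-based bucket numbers (so the "first bucket" above is bucket 0):

```python
def bucket_index(x, lo, d):
    """0-based bucket of x when buckets have width d and start at lo."""
    return (x - lo) // d


a = [11, 2, 7, 5]
lo, d = min(a), 3  # d = (11 - 2) / (4 - 1) = 3 for this array
print([bucket_index(x, lo, d) for x in a])  # [3, 0, 1, 1]
```

For the example, 2 lands in bucket 0, both 5 and 7 land in bucket 1, bucket 2 stays empty, and 11 lands in the last bucket.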

Now, iterate through the buckets from first to last. We have two types of candidates for the maximum difference:

numbers within the same bucket: calculate max value — min value from the bucket
numbers from different buckets: minimum number from the next non-empty bucket minus the maximum number from the current bucket. This drawing might be helpful:

[Figure: buckets]

Why does this work? Well, we always have two non-empty buckets (one contains $$$min$$$, another one contains $$$max$$$), and if we want to take the maximum difference using elements from two buckets, then:

we can only choose adjacent buckets (ignoring empty buckets in-between): we can't jump over a non-empty bucket, because that means we don't take adjacent numbers in the sorted version of array $$$a$$$.
for a similar reason we cannot jump over values within a bucket, so we have to choose the maximum from the left bucket and the minimum from the right bucket.

For each bucket you just have to store its maximum and minimum value (because those are the only ones used in the calculations). So the solution is to go through the array once to find the $$$max$$$ and $$$min$$$ of the whole array. Then go through the array again, updating the max and min values of the bucket that the current element belongs to. Finally, go through the buckets in order and calculate the differences described above (this can be done in $$$O(n)$$$ if you keep a variable that points to the current bucket and one that points to the next non-empty bucket).

Total runtime: $$$O(n)$$$.
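Putting the steps together, one possible implementation is sketched below. The names `max_gap`, `bucket_min` and `bucket_max` are made up for this illustration, and $$$d$$$ is rounded up so that bucket boundaries are integers (an assumption of this sketch). It only uses the cross-bucket candidates: two values in the same bucket of width $$$d$$$ differ by at most $$$d-1$$$, which is below the lower bound, so they can never give the answer.

```python
def max_gap(a):
    """Maximum difference between adjacent elements of sorted(a), in O(n)."""
    n = len(a)
    if n < 2:
        return 0  # assumption: no gap with fewer than two elements
    lo, hi = min(a), max(a)
    if lo == hi:
        return 0  # all values equal, and d would be 0
    d = -(-(hi - lo) // (n - 1))  # ceil((max - min) / (n - 1)), at least 1

    # Only the minimum and maximum of every bucket are needed.
    bucket_min = [None] * n
    bucket_max = [None] * n
    for x in a:
        k = (x - lo) // d  # 0-based bucket index, always in [0, n - 1]
        if bucket_min[k] is None:
            bucket_min[k] = bucket_max[k] = x
        else:
            bucket_min[k] = min(bucket_min[k], x)
            bucket_max[k] = max(bucket_max[k], x)

    best = 0
    prev_max = lo  # bucket 0 is never empty because it contains lo
    for k in range(n):
        if bucket_min[k] is None:
            continue  # skip empty buckets
        best = max(best, bucket_min[k] - prev_max)
        prev_max = bucket_max[k]
    return best


print(max_gap([11, 2, 7, 5]))  # 4
```

Every loop runs over either the $$$n$$$ elements or the $$$n$$$ buckets, which is where the linear bound comes from (treating arithmetic on the values as constant time).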

What is pretty neat about this solution is that you can quite easily extend this problem to solve the case for non-integers and negative numbers.
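One way this extension could look is sketched below. It is only an illustration: the name `max_gap_rational` is hypothetical, and using `fractions.Fraction` is a choice made here to keep $$$d$$$ exact, so that bucket boundaries do not suffer from floating-point rounding. Buckets are the half-open ranges $$$[min + kd, min + (k+1)d)$$$, so two values in the same bucket differ by less than $$$d$$$.

```python
from fractions import Fraction


def max_gap_rational(a):
    """Same bucket idea for negative and non-integer values (exact arithmetic)."""
    n = len(a)
    if n < 2:
        return Fraction(0)
    vals = [Fraction(x) for x in a]
    lo, hi = min(vals), max(vals)
    if lo == hi:
        return Fraction(0)
    d = (hi - lo) / (n - 1)  # exact rational bucket width, no rounding

    mins = [None] * n
    maxs = [None] * n
    for x in vals:
        k = int((x - lo) / d)  # floor, since x - lo >= 0; hi maps to n - 1
        if mins[k] is None:
            mins[k] = maxs[k] = x
        else:
            mins[k] = min(mins[k], x)
            maxs[k] = max(maxs[k], x)

    best = Fraction(0)
    prev_max = lo
    for k in range(n):
        if mins[k] is None:
            continue
        best = max(best, mins[k] - prev_max)
        prev_max = maxs[k]
    return best


print(max_gap_rational([-3, 1.5, 0, 4]))  # 3
```

Nothing in the argument depends on the values being positive, because everything is measured relative to $$$min$$$.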

Hope you enjoyed the problem!

Comments (29)

lotusblume

11 days ago, # |

+15

https://leetcode.com/problems/maximum-gap/description/

tvmpqx_8601

https://csacademy.com/contest/interview-archive/task/consecutive-max-difference/

Maksim1744

+154

Cool! Now find minimum. Seriously though.

Um_nik

11 days ago, # ^ |

+91

The cool thing about it

lrvideckis

9 days ago, # ^ |

Spoiler

+22

I get the joke, but I can't just let it slide. The whole point is that it is $$$O(n)$$$.

second_saturday

This is a famous interview problem: https://www.interviewbit.com/problems/maximum-consecutive-gap/

HedayatYamin

-8

nice problem

Ar7e9

it is wrong, think harder.

Bibingka

spitting this nonsense. I'm not sure if he used even 1% of his brain power.

avighnakc

-21

You probably just implemented it wrong lmao

sorry, I don't see how that solution is correct.

a = [5 2 1 4]
sorted_a = [1 2 4 5]
max = 5
second_max = 4
diff = 1
max_diff = 2 ???

1e3bonzakura

10 days ago, # ^ |

found the guessforces expert

nskybytskyi

+54

A mandatory pedantic comment

Value-based sorting (like radix sort) is not an option because you don't know what the maximum value is

This is a horrible way to phrase it because you immediately proceed to contradict yourself:

I will call max the maximum value in a ... You can calculate these in O(n) time

Maybe you wanted to say that the radix sort has a pseudopolynomial complexity? However, this is not true, because its complexity is O(wn), where n is the number of keys, and w is the key length. It is polynomial because the input size is also O(wn).

Maybe you wanted to say that w is large, in which case O(wn) is polynomial but slow? However, your proposed solution also becomes slow if you account for the increasing cost of arithmetic operations. For example, subtraction becomes O(w), and division by n-1 may be even slower.

Maybe it's possible to construct a computational system where the complexities of primitive operations work out in favor of your proposed solution, but I doubt that it will be practical.

hashman

I think for $$$a_i \le 10^9,$$$ we can assume that the cost of such arithmetic operations is basically constant (because of word size), but it is not feasible to create an array of size $$$10^9$$$ in order to perform radix sort.

estoy-re-sebado

You can radix sort by the first bit, then by the second bit, etc. This way, you can sort 32-bit integers in 32 linear-time passes. (or be smarter and do 4 passes, each sorting by a group of 8 bits)

In general, you can sort $$$N$$$ numbers of $$$W$$$ bits in $$$O(WN)$$$ time.

+24

I see. For some reason, I mistook radix sort for counting sort. 🤡

You must be thinking about counting sort, not radix sort.

+37

Yes, I realized that later. I'm a clown

For me, it was pretty clear that he meant that you don't know the maximum value at the time of writing the program.

As if someone hardcodes key length in radix sort.

Well, radix sort wouldn't be significantly slower even with big numbers, but still. I agree with the things you said, but trying out both versions and measuring their runtime is worth more than talking about them.

+39

Cool problem! I think there is a small mistake in your explanation.

explanation

dimastrakhal

' with n positive integers (you don't know anything about their maximum value'
' I will call max the maximum value in a and min -- the minimum value. You can calculate these in O(n) time'

Totally not contradictory.

TwentyOneHundredOrBust

10 days ago, # |

+16

numbers within the same bucket: calculate max value — min value from the bucket

isn't this both not correct and unnecessary?

I agree, if the lower bound is the box size you never need to consider 2 elements from the same box. Also it would only be a valid candidate if the number of elements in the box were exactly 2.

xcx0902

Why this?

Value-based sorting (like radix sort) is not an option because you don't know what the maximum value is.

If you read the input array, you will definitely know the maximum value.

They are saying that it can be really large, so you can't use radix sort, because it takes $$$\mathcal{O}(n \cdot \ell)$$$. But as described in nskybytskyi's comment above, if the numbers are really quite large, the comparison and division operations described in this solution take $$$\mathcal{O}(\ell)$$$ time anyway (where $$$\ell$$$ is the length of the binary representation of the number).

Cyclonestopper9000

8 days ago, # |

What about using a vector containing ints (or lls) called values? Then, as you traverse the elements of the input array, you can set values[index]=value. Then, couldn't you just loop through 0 to n-1, and find the maximum value of the absolute value of (values[index+1]-values[index]), and output the max? Please let me know if I misunderstood the problem.

PurpleThinker's blog