-
Notifications
You must be signed in to change notification settings - Fork 0
/
README III Advanced Binary search + sweep line
372 lines (339 loc) · 18.5 KB
/
README III Advanced Binary search + sweep line
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
1 Advanced Binary Search:
The purpose of binary search is to divide possible solutions into half, the detailed procedure is as follow:
1) estimate the range of results: what is the upperbound ? what is the lowerbound ?
2) guess the possible solution within the estimated range: always get the middle of the range
3) establish examination condition to check if guessed result is valid: according to the object of the question
4) Adjust estimated range according to the results of examination condition test: either move forward or backward
when should we use binary search ? we can estimate the range of solution and the question is to ask some optimal case(max/min) within
the estimated range
Several sample code of while loop in binary search:
1) exit when both upperbound and lowerbound are the same, the upperbound is never reachable thus we need to make end+1 to make sure
the biggest upperbound we can get is the reachable value !! The goal is the get the biggest one that is smaller than target
while st < end:
mid = int((st + end) / 2)
if mid < target:
st = mid + 1
else:
end = mid
return st - 1
2) exit when both upperbound and lowerbound are the same, the upperbound is always reachable. The goal is to get the smallest one
that is bigger than target
while st < end:
mid = int((st + end) / 2)
if mid <= target:
st = mid + 1
else:
end = mid
return st
3) exit when upperbound is less than lowerbound, the result has to be one single point within the range. This usually running
slower than 1) and 2)
while st <= end:
mid = int((st + end) / 2)
if mid > target:
end = mid - 1
elif mid < target:
st = mid + 1
else:
return mid
4) exit when there is just two points st and end, there have to be three points within the range at least, usually applicable to
non-integer operation
while st + 1 < end:
mid = (st + end) / 2
if mid < target:
st = mid
else:
end = mid
return st
Application I: find the duplicated number between 1 and n, the length of input array is between 1 and n + 1
analysis: 1) Since the range of duplicated number is between 1 and n 2) the condition is that there is a duplicated one in the
array. Thus we can use binary search to solve it
while loop strategy: the smallest one that is bigger than target
Application II: Wood cut, given n picecs of length of wood, cut them into pieces with equal length from n picecs that have to
be bigger than or equal to k, find the longgest length of each k piece
analysis: 1) Since the range of the length of each piece is between 1 and max(picecs), let res be length of our result
2) the condition is sum([int(piece/res) for piece in pieces]) >= k. Thus we can use binary search to solve it
while loop strategy: the biggest one that is smaller than target(remember to let end + 1 !!)
Application III: Copy Books, given n book pages and we have k person, each person can only consecutive books in the pages array
assuming 1 page takes 1 minute, find the minimal minute to finish reading all n pages by k person
analysis: 1) Since the range of minutes is between min(pages) and sum(pages), let res be minimal minutes takes to complete
2) let a[i] be each person's allocation minutes to finish reading allocated books, the condition is sum(a[i] where i in 1..k)
is equals to sum(pages) and res = max(a[i] where i in 1..k) to be minimize
while loop strategy: the smallest one that is bigger than target
Application IV: Find peak element I, given n elements find the peak index such that p[i-1] < p[i] > p[i+1]
analysis: 1) the peak element index is within range from 0 to len(p)-1 2) the condition is that p[i-1] < p[i] > p[i+1]
while loop strategy: single point
Application V: First Bad Version, given n element, once there is a code version bad, all following version will become bad
as well, our goal is to find the first one that is False
analysis: 1) the result index is between 1 and n 2) the condition is to find the first False
while loop stragegy: the smallest one that is bigger than target
Application VI: Find peak element II, given a matrix that outermost rows/columns are always smaller than inner rows/columns
find the index of the peak element such that p[i-1][j] < p[i][j] & p[i+1][j] < p[i][j] & p[i][j-1] < p[i][j] & p[i][j+1] < p[i][j]
analysis: 1) the range of index is within matrix 2) the condition is as above
while loop strategy: we get the mid of rows to find the maximum columns,i+1/i-1 rows in the same column bigger than maximum
columns will contains peak(up/down), thus we cut matrix into half according to rows vertically and we get the mid of columns
to find the maximum row in vertical and j+1/j-1 columns in the same row bigger than maximum rows will contains peak(left/right),
thus we cut matrix into half according to column horizontally. Combing, we get a quarter of target area
the termination principle is single point
Application VII: Divide two integers, given two integers, without multipli0 cation/divide/module to get result of division in
integer
analysis:1) the range of result is between 0 and dividend 2) the condition is res * divisor + remain == dividend where remain
is less than divisor
while loop strategy: we use bit operation to perform multiplication: a << c is equal to a * pow(2,c) and start from bit 31 to
bit 0
for i in range(end,-1,-1): #end is 31
if remain + (divisor << i) <= dividend:
remain += divisor << i
quotient += pow(2,i)
Application VIII: sqrt(x) II, perform square root on double digits
analysis:1) the range of sqrt(x) is between 0 to x 2) let res be sqrt(x), res*res <= x
while loop strategy: the triple strategy
while st < end - base:
mid = (st + end) / 2
if mid * mid >= x:
end = mid
else:
st = mid
return st
Applicatio IX: sqrt(x), perform square root on integer digits
analysis: 1) the range of sqrt(x) is between 1 and x 2) let res be sqrt(x), res*res <= x
while loop strategy: the biggest one that is smaller than target
while st <= end:
mid = int((st+end)/2)
if mid * mid < x:
st = mid + 1
elif mid * mid > x:
end = mid - 1
else:
return mid
return st - 1
Application X: Smallest Rectangle Enclosing Black Pixels, given a position i,j of a index which value is 1 in a binary matrix,
there is only one black region that connects with each block points vertically and horizontally, find the smallest rectangle
area that contains all black points
analysis: 1) we need to find the four extreme points of the rectangle, they all within binary matrix 2) the condition is that
we start stretching from given position in up/down/left/right four directions, in up/left direction we find the first "1" from
st(lower) to end(upper) direction, which applies the smallest one that is bigger than target while loop strategy; in down/right
direction we find the last "1" from st(lower) to end(upper) direction, which applies the biggest one that is smaller than target
(note let end + 1)
while low < high:
mid = int((low + high) / 2)
if ((is_vertical == True and "1" in image[mid][min:max]) or (is_horizontal == True and "1" in [image[x][mid] for x in range(min,max)])) and is_upleft == True:
high = mid
elif ((is_vertical == True and "1" in image[mid][min:max]) or (is_horizontal == True and "1" in [image[x][mid] for x in range(min,max)])) and is_upleft == False:
low = mid + 1
elif ((is_vertical == True and "1" not in image[mid][min:max]) or (is_horizontal == True and "1" not in [image[x][mid] for x in range(min,max)])) and is_upleft == True:
low = mid + 1
elif ((is_vertical == True and "1" not in image[mid][min:max]) or (is_horizontal == True and "1" not in [image[x][mid] for x in range(min,max)])) and is_upleft == False:
high = mid
if is_upleft == True:
return low #this strategy we can get the end value
else:
return low - 1 #this strategy we cannot get the end value, thus we need to make high + 1
stretch from four directions:
max_row, max_col = len(image), len(image[0])
lowest_row = self.binary_stretch(image, 0, x, 0, max_col, True, False, True)
highest_row = self.binary_stretch(image, x+1, max_row, 0, max_col, True, False, False)
left_col = self.binary_stretch(image, 0, y, lowest_row, highest_row+1, False, True, True)
right_col = self.binary_stretch(image, y+1, max_col, lowest_row, highest_row+1, False, True, False)
return (highest_row - lowest_row + 1) * (right_col - left_col + 1)
Application XI: Maximum Average Subarray II, given an array, find the subarray with the maximum average with length >= k
analysis: 1) the maximum average is between min(array) and max(array) 2) the condition is that we have to check if the given
array have such subarray whose average is >= estimated average
while loop strategy: since we are allowed to have some degree of inaccuracy, we can use triple strategy on float
The trick is to determine if given array has a subarray whose average is bigger than or equal to the estimated average
Below is how we achieve this:
for i in range(len(nums)):
nums[i] -= average #we minus average, so that we just need to find if there is subarray whose sum is bigger or equal to 0 !!
sum_array, sum_part, min_before = 0, 0,
for i in range(k):
sum_array += nums[i]
if sum_array >= 0:
return True
#This is how we determine if there exsits a subarray with length >= k, its sum is bigger than or equal to zero !!!
for i in range(k,len(nums)):
sum_array += nums[i]
sum_part += nums[i-k]
min_before = min(min_before, sum_part)
if sum_array - min_before >= 0: #sum_array is the maximum sum value of subarray with length >= k
return True
return False
Then, we use triple strategy to get our result in while loop manner:
while st < end - base: #ase is 1e-8 which is the degree of precision
mid = (end + st) / 2
if self.check_ifbiggest(mid, nums, k) == False:
end = mid
else:
st = mid
return st
2 sweep line:
we just need to remember if we encounters [] range, we should consider using sweep line
1) split starting and ending point into a whole new array
2) sort the array (if the value ties occur, we sort according to their type in decreasing order, which means ending point comes
before starting point([10,20],[20,30] we should sort like [10<starting>,20<ending>,20<starting>,30<ending>]
Application Building outline: HashHeap + sweep line
when we encounters a starting x-pxis, we push into hashheap, when we encounters a ending x-pxis, we delete from hashheap
there are two types of things we need to take care while moving our sweep line:
1 when encounters a start bracket bar(left side of bar,"enter" the bar)
when we encounters a new starting bracket, we compare the height with our current highest bar in the hashheap,
if the new one is higher and also this is not a vertical line which means the left x axis of our previous highest bar is not
equal to current new starting bracket's left x axis then we append this into our result . The reason is because there could be
a higher bar on the same vertical line
2 when encounters a end bracket bar(right side of bar,"leave" the bar)
we first save the previous highest bar and then delete new bar height from our hashheap, then if heap is empty, we add this into
our result; if heap is not empty and we compare the deleted height with the new highest bar in our new heap after deleting this
one, if new height is less than deleted height, then we add this into our result !
Sample code:
max_heap, sweep_line_table, result_list, index = hashheap(), [], [], 0
for boundary in buildings:
sweep_line_table.append([boundary[0],boundary[2],1])
sweep_line_table.append([boundary[1],boundary[2],2])
sweep_line_table.sort(key=lambda x:(x[0],x[2],x[1]))
while index < len(sweep_line_table):
if sweep_line_table[index][2] == 1:
if max_heap.get_size() > 0:
top_ele = max_heap.get_peek()
if top_ele[0] < sweep_line_table[index][1]:
x_row = sweep_line_table[index][0]
#consider the case when x_row is equal to top_ele then we do nothing because there could be a higher bar on this vertical and this does not form a outline
if x_row != top_ele[1]:
result_list.append([top_ele[1],x_row,top_ele[0]])
max_heap.push([sweep_line_table[index][1],sweep_line_table[index][0]])
elif sweep_line_table[index][2] == 2:
last_top = max_heap.get_peek()
max_heap.delete(sweep_line_table[index][1])
if max_heap.get_size() > 0:
top_ele = max_heap.get_peek()
if sweep_line_table[index][1] > top_ele[0]:
result_list.append([last_top[1],sweep_line_table[index][0],last_top[0]])
max_heap.update_x_val(sweep_line_table[index][0])
else:
result_list.append([last_top[1],sweep_line_table[index][0],last_top[0]])
index += 1
return result_list
Standard sample code to find the last position of target in sorted array by using binary search:
if len(nums) == 0:
return -1
st, end = 0, len(nums)-1
while st < end:
middle = int((st + end) / 2)
if nums[middle] > target:
end = middle
else:
st = middle + 1
if nums[st] == target and st + 1 == len(nums):
return st
if st > 0 and nums[st-1] == target and st + 1 < len(nums) and nums[st + 1] > target:
return st - 1
if (st == 0 and nums[st] != target) or (st > 0 and nums[st-1] != target):
return -1
Standard sample code to find higher index and lower index of target in sorted array:
st, end, low_idx = 0, len(A)-1, 0
while st < end:
middle = int((st + end) / 2)
if A[middle] < target:
st = middle + 1
else:
end = middle
if A[st] == target:
low_idx = st
else:
return 0 #does not exist at all
#find the higher index of target:
st, end, high_idx = 0, len(A)-1, 0
while st < end:
middle = int((st + end) / 2)
if A[middle] <= target:
st = middle + 1
else:
end = middle
if A[st] == target:
high_idx = st
else:
high_idx = st - 1
return high_idx - low_idx + 1
Standard sample code to find target in rotated sorted array:
while st <= end:
middle = int((st + end) / 2)
if A[middle] == target:
return middle
else:
if A[middle] > A[end]: #case when more than half on the left is bigger increasing subarray
if A[middle] < target:
st = middle + 1
elif A[middle] > target and target > A[end]:
end = middle - 1
elif A[middle] > target and target <= A[end]:
st = middle + 1
else: #case when more than half on the right is smaller increasing subarray
if A[middle] > target:
end = middle - 1
elif A[middle] < target and target <= A[end]:
st = middle + 1
elif A[middle] < target and target > A[end]:
end = middle - 1
return -1
Attention about performance when perform "x" in array operation in binary search while loop:
Normally, we prefer to use "x" in array than to use "x" not in array, because in operation returns much quickly than not in
since not in have to scan the whole array !!!
Using binary search to solve problem find the longest increasing subsequence !!
idea: we prepare an inserted array with first([0]) be 0-math.inf and the rest be math.inf
we insert number into that array by finding the first element that >= number and replace that original element to insert
thus, the [0] will be the smallest number and the last non-math.inf be the biggest number, which are in the longest increasing
sequence !
#find the first >= target and replace it !
def binary_search(self, nums, target):
st, end = 0, len(nums)-1
while st < end:
middle = int((st + end) / 2)
if nums[middle] >= target:
end = middle
else:
st = middle + 1
return st
#main function
dp_table = [math.inf for i in range(size+1)]
dp_table[0] = 0-math.inf
for i in range(size):
index = self.binary_search(dp_table, nums[i])
dp_table[index] = nums[i]
for i in range(size, 0, -1):
if dp_table[i] != math.inf:
return i
return 0
Similar problem: Russian Doll Envelope(sort array first, then using the method from finding the longest increasing subsequence)
sort the array in increasing order by weight([0]) and if ties sort by decreasing order of height([1])
--> sorted(input_array, key=lambda x:(x[0],0-x[1]))
the reason we sort the ties case in decreasing order of height is to prevent increasing invalid envelopes size such as (4,3),
(4,5), but if we put the order like (4,5),(4,3) , the overall number will not increase at all !!!
#find the first >= target
def binary_search(self, nums, target):
st, end = 0, len(nums)-1
while st < end:
middle = int((st + end) / 2)
if nums[middle] < target:
st = middle + 1
else:
end = middle
return st
main function:
size = len(envelopes)
if size < 1:
return 0
input_array, result = sorted(envelopes, key=lambda x:(x[0],-x[1])), [math.inf]*(size+1)
result[0] = 0-math.inf
for i in range(size):
index = self.binary_search(result, input_array[i][1])
result[index] = input_array[i][1]
for i in range(size, 0, -1):
if result[i] != math.inf:
return i
return 0
using binary search to look for the first element >= target
st, end = 0, len(input_array)
while st < end:
mid = int((st + end) / 2)
if input_array[mid] >= target:
end = mid
else:
st = mid + 1
if end == 0: #meaning all elements in the input_array are >= target !!!
if end == len(input_array): #meaning all elements in the input_array are < target !!!