G12FCO -- Formal Computation

Coursework 1 -- Algorithms -- Solutions

bottom of page

  1. The median problem: The intermediate stage is that we have a certain range of the array within which to search for the median. Within the range, which is initially the whole array, we use DNF to partition into three smaller ranges, which we can use to refine the search. The search stops when the median element is in the `white' part of the array, for all the white elements are equal. Thus:
    		lo = 1, hi = N, target = N/2
    		loop pivot = A[target], red = lo, white = lo, blue = hi
    		     while white <= blue
    			do if A[white] < pivot
    			     then swap (red, white), red = red + 1, white = white + 1
    			     elseif A[white] = pivot then white = white + 1
    			     else swap (blue, white), blue = blue - 1
    			   endif
    		     endwhile
    		     if red > target then hi = red - 1
    		       elseif blue < target then lo = blue + 1
    		       else return pivot
    		     endif
    		endloop

    Here, swap (red, white) means something like dummy = A[red], A[red] = A[white], A[white] = dummy. This is basically the Quicksort algorithm but without bothering to sort the bits of the array that cannot contain the median. For fuller discussion of the invariants, of correctness and of termination, refer back to DNF and Quicksort in the sorting lecture. With ordinary luck, the successive ranges will roughly halve each time, so we DNF the entire array once, then half of it a second time, then a quarter a third time, then an eighth, etc., so the total work is O(N). However, if we are extremely unlucky, then the pivot will always be a particularly large or small element, so that [like Quicksort] the worst case is O(N^2). We choose to use the element in the target' position as pivot; this works extremely well in the special cases where the array is sorted or anti-sorted, and is as good as anything else if the array is randomly shuffled; and it slightly simplifies evaluation of the median when it turns white. But we could get slightly better worst-case and average-case performance by choosing the pivot more carefully, as discussed under Quicksort. You could, but it's not worthwhile, pick out the case where lo = hi, so that only one element is left in range and therefore has to be the median.

    Marks: [out of 10] 3 for the general strategy [a gift, since this was discussed in the problem class!], 4 for more-or-less correct pseudo-code [or actual code!], 3 for sensible attempts at discussion of termination, correctness and time taken, basically along the lines of the discussion in lectures. Bonus points for extras, such as those discussed above.

  2. The integer square-root problem: The suggestion was to use as invariant that we have two integers lo and hi such that lo² <= N < hi², which we first have to establish. We can obviously start with lo set to zero [or one, if you believe that N must be greater than zero]. What about hi? The value N obviously works; this has the minor disadvantage that in the early stages we're going to have to square our guesses to see whether to update lo or hi, and if the guesses are comparable with N, then squaring them may cause overflows -- on a 32-bit machine, we won't be able to find the square root of a number bigger than around 128000, even if we use unsigned arithmetic. So I prefer to start hi at one, and double up until its square is bigger than N; this will work for any N less than 2^31. Having found an appropriate hi, we can use binary chop until lo + 1 = hi, at which stage, combining this equation with the invariant, we know that lo is the desired integer square root.

    Putting these bits together, we have:

    		lo = 0, hi = 1
    		while hi*hi <= N do hi = 2*hi endwhile
    		while hi - lo > 1
    		   do mid = (hi + lo)/2
    		      if mid*mid > N then hi = mid else lo = mid endif
    		endwhile
    		result = lo

    Analysis: If lg N is k, then it takes roughly k/2 steps in the first loop before hi*hi exceeds N -- which it will eventually [ignoring overflows] as hi is increasing. After that, each time round the second loop hi - lo is being halved, so that loop too will terminate in about k/2 steps. Correctness is guaranteed by the invariant.

    Marks: [out of 10] 2 for setting up the invariant correctly, 3 for correctly programming the binary chop, 2 for correct test of termination and result, 3 for sensible comments about termination and correctness. Bonus marks for: testing that N is sensible, ensuring that hi*hi doesn't overflow, and sensible comments about efficiency.

  3. Bipartite Graphs: The basis of the attack was given in the question -- we paint the nodes black and white in a consistent way, and either complete the painting or else arrive at a contradiction. There are two somewhat different styles of progression. In one, we `provisionally' paint the nodes, then firm up the painting as we check each node in turn. In the other, we paint the nodes when we first know what colour they must be, but also mark all the edges, then we check each edge in turn. The following implements the second method.

    We start with everything grey. We arbitrarily colour one node white, and all the edges out of that node red [meaning `to be processed']. Then we look at each red edge in turn. If the edge connects two black nodes or two white nodes [it can't connect two grey nodes as we only paint an edge red when we colour one of its nodes], the graph is proven to be non-bipartite. If it connects a black to a white, we paint the edge blue [meaning `processed'] and carry on. If it connects black to grey, we paint the grey white, and the grey edges out of the new white are painted red, for later processing; if white to grey, we paint the grey black, and again mark its grey edges for processing; in either case, we again paint the current red edge blue, to mark that we've already done this edge. When we run out of red edges, we go back to see if there are still any grey nodes; if so, we paint one white, and proceed as before, if not then we've painted all nodes black and white with no inconsistency, and so the graph is bipartite.

    		paint all nodes grey, and all edges grey
    		while there are any grey nodes
    		   do choose a grey node, paint it white, and paint
    				all grey edges out of it red
    		      while there are are red edges
    			 do choose a red edge
    			    if its nodes are the same colour
    			      then return "not bipartite"
    			      elseif either node is grey
    			      then if the other end is white
    				     then paint this end black
    				     else paint this end white
    				   endif
    				   paint all grey edges out of this end red
    			    endif
    			    paint this edge blue
    		      endwhile
    		endwhile
    		return "bipartite"

    Implementing the other method is left as an exercise. Basically, we initially paint grey nodes [say] yellow and brown, to mean white or black, but not yet processed. Then we process a yellow node by painting it white and then looking at the nodes it connects to -- white or yellow means failure, brown or black is OK, and a grey node must be painted brown. Similarly, we process brown nodes by painting them black, and checking that they are connected only to yellow or white nodes, or to grey nodes which now turn yellow.

    Analysis: Each edge [in the method pseudo-coded above] is painted grey, then red, then blue, with no back-tracking. Similarly, each node is painted grey then either black or white. Every time round the inner loop paints one red edge blue, so this process must terminate within time O(N^2), where N is the number of nodes and thus N^2/2 a bound on the number of edges. Each outer loop repaints a grey node, so that loop too must terminate. Every grey node must be repainted eventually, else the outer loop will continue; every time a node is repainted, all grey edges connected to it are painted red, so every edge must eventually be painted red and then blue, at which stage we ensure that its nodes are of opposite colours. So the algorithm is correct and terminates in time O(N^2). We've left some inessential choices in the pseudo-code.

    Marks: [out of 15] 5 for a more-or-less correct description of the algorithm, 5 for more-or-less correct pseudo-code, 5 for analysis showing that the algorithm is correct and terminates. Bonus points for anything neat and/or extra.

top of page


border=0 G12FCO home page. border=0 This coursework. border=0 Lecture 3 .
Copyright © Dr A. N. Walker, 1996, 2016.