COMP3600/6466 Algorithms Assignment 2

COMP3600/6466 — Algorithms

Convenor: Hanna Kurniawati

Lecturer: Hanna Kurniawati & Pascal Bercher

E-mail: comp 3600 [email protected]

Assignment 2

Due: Wednesday, 6 October 2021 23:59 Canberra Time

Grace Period Ends: Thursday, 7 October 2021 13:00 Canberra Time

Late Penalty: 100%

IMPORTANT NOTES (PLEASE READ BEFORE WORKING ON THE SOLUTION):

• Please read the entire of this question sheet before starting to work on the solution.

• Your assignment should consist of a single .pdf, named A2-[studentID].pdf and a single .cpp/.java file, named

A2-[studentID].cpp or A2-[studentID].java. You should zip these two files together into a single file, named the

.zip file A2-[studentID].zip and submit via Wattle before the due date.

• We provide 13 hours grace period. This means, there will be no penalty if you submit before the grace period

ends. However, we will NOT accept assignment submission beyond this time (i.e., late penalty after the grace

period ends is 100%).

– You can update your assignment until the grace period ends.

• Assignment marking:

– The total mark you can get in this assignment is 100 points.

– This assignment will contribute 20% to your overall mark.

– In each question, you need to provide a convincing argument (including mathematical derivation) that

supports your answer. The argument is worth 80% of the marks, while the solution alone is 20%.

• Discussion with your colleagues are allowed and encouraged. However, you need to work on the assignment

on your own AND write the names you discussed this assignment with.

In all the questions below, you can assume that all n are positive integers.

[30 pts] Part A

1. [10 pt] Given the following recurrence, please answer the questions below

T (n) =

{

1 n = 1

T (n − c) + T (c) + log(n) n > 1

where c is a constant.

(a) [2.5 pt] Can Master Theorem be used to solve this recurrence?

(b) [7.5 pt] Please find a tight asymptotic bound (i.e., Θ) to the solution of the above recurrence.

2. [5pt] Please solve the recurrence T (n) = 5T (n2 ) + n

2 √n

3. [5 pt] As a programmer in a start-up company, you have been tasked to write a general sorting program (i.e.,

one that sort any type of data given a comparison function) that runs in O(n) time, where n is the number of data

to be sorted. Is this task feasible? Please explain.

4. [10 pt] For each statement below, please identify if it is Always True, Always False, or sometimes True. Please

provide an explanation for your answer.

(a) [5 pt] Any non-stable sorting algorithm can be made stable.

(b) [5 pt] Finding the minimum value in a MAX-HEAP takes O(2h) time in the worst case, where h is the

height of the tree.

1

COMP3600/6466 Algorithms Assignment 2

[40 pts] Part B

5. [10 pt] The town pub is planning an end-of-lockdown celebration to be held the day the lockdown ends. All

100 tickets to this celebration has been sold out. To make things fun, the pub owner, Ms Pub, plans to serve

drinks on different types of cups, where people born on the same month would be served with the same type of

cups. He wonders how many of each of the 12 different types of cups should he prepare. Can you help Ms Pub?

Please use the expected value of attendees born in each month to estimate the number of cups (per type) that

Ms Pub needs to prepare. You can assume that everyone who has bought the ticket will come and the month

each attendee were born is distributed uniformly at random.

6. [15 pt]

Figure 1: Some of the glasswares sent to Arrebnac’s Science Museum. Photos taken from Related Objects section of https:

//www.metmuseum.org/art/collection/search/249475.

The city Arrebnac has just received a new exhibit for its Science Museum. The exhibit consists of n artistic

glasswares (some of them are shown in Figure 1), each with a unique volume. Together with these glasswares,

comes n jars of pebbles. Each jar has a unique mixed of small pebbles and indicates the volume of exactly

one of the glasswares, in the sense that when poured into the glassware, the entire pebbles in the jar will fill it

exactly, no more and no less. In the exhibit, the glassware will be put side by side with the jar of pebbles that

indicate its volume, and a robot mechanism has been built, such that with a touch of a button, one can see the

jar of pebbles being poured gently into the glassware and vice versa.

All glasswares and the jars of pebbles have arrived in good conditions. However, information about which jar

of pebbles should be associated with which glassware were nowhere to be found. Unfortunately, due to the

odd geometry of the glasswares and the different mixed of pebbles in each jar, direct comparisons between the

the glasswares’ volumes and direct comparisons between the jars of pebbles’ volumes are impossible. In fact,

the only safe way to find the right association between the glassware and the jar of pebbles is to use the robot

mechanism to pour the pebbles into the glassware until it is full and check if the entire pebbles in the jar fits

exactly in the glassware.

The exhibition manager knows that he can resolve the problem, but with Θ(n2) comparisons, and hence time,

even in the best case. Considering the exhibition will start the next morning, he is looking for an asymptotically

faster method, one that takes only Θ(n log n) time on average. Can such a method be devised? If it can, please

provide the method as a pseudo-code and show that the time complexity of your method is indeed Θ(n log n) on

average. Otherwise, please explain why such a method is impossible.

7. [15 pt] The Arrebnac Streaming Company (ASC) Pty. Ltd. needs a fast method to sort its entire video col-

lections based on the number of views it has over the past week. These videos have been sorted based on the

number of past week views, but they are only sorted within the same genre and released year. Considering

ASC have a collection of almost any type of genre known in the film industry, dating back from the early 1900,

efficiency of the sorting method matters a lot. Please help ASC by answering the following questions:

(a) [10 pt] Develop a sorting mechanism that utilises the sorted order of the videos within each group, such

that the time complexity of sorting the entire video collection is O(n logm) in the worst case, where n is

the number of videos in the entire collection, while m is the number of groups (genre and released year).

Please provide a pseudo-code.

(b) [5 pt] Show that the method you propose indeed have a worst case time complexity of O(n logm).

2

COMP3600/6466 Algorithms Assignment 2

[30 pts] Part C

8. In a galaxy far far away, the Planet Htrae is still fighting a health crisis over the spread of a new highly contagious

and deadly disease, called DV. However, the Htrae’s scientists have now produced vaccines that can make the

recipient becomes immune to DV. To vaccinate as many people as soon as possible, the Senate of Htrae has

ordered each Local Area (LA) to setup a vaccination hub. The Senate requires that the hub be setup at a

location that is as close as possible to everyone, especially to the vulnerables and essential workers who may

have difficulties travelling far from their homes or offices. To help LA government decide the location, the

Senate asked that a software be developed to automatically decide the optimal location for the vaccination hub.

The idea is the software will utilise information about the center coordinates (x, y) of clusters of homes, offices,

nursing homes, hospitals, etc., as well as a weighting w ∈ (0, 1] that indicates the importance of each of

those clusters. The weights are set such that the sum of the weights of the entire data would be 1. Based on

these information, the software computes the position v = (xv, yv) of the vaccination hub that will minimise

Dv =

∑n

i=1 wi(|xi − xv| + |yi − yv|), where (xi, yi) and wi are the coordinate and the weight of cluster-i, while n is

the number of clusters in a particular LA.

It is known that the position of v that will minimize Dv can be found by finding the weighted median for the

X-axis and Y-axis independently. Suppose P = [(w1, p1), (w2, p2), · · · , (wn, pn)] is a weighted data points sorted

in ascending order, first based on its data values (p) and second based on its weight (w). Then, we compute the

weighted median of P as pk, where k = argmink(

∑k

i=1 wi) ≥ 0.5. Note that although this definition relies on

sorted arrays, given an unsorted data, the weighted median can be found without sorting the data first.

Your company, and eventually you as one of the company’s lead programmer, has been tasked to develop the

software. For this purpose, you need to answer the following questions:

(a) [10 pt] Design a randomized algorithm to find the weighted median of a set of n weighted unsorted data.

The algorithm should have an average running time of Θ(n) and use Θ(1) extra storage outside of the input

arrays. Your answer should provide the pseudo-code and explanation to show that the average running

time and extra storage requirement are satisfied.

You can assume the input will be 2 arrays: An array, say XPos, which is a 1-dimensional position of the

data (not necessarily sorted), and an array W, which contains the weights that indicate the importance of

the data. Suppose n is the length of these arrays, then an example input would be XPos[i] = xi for i ∈ [1, n]

while W[i] is the weight for xi. You can assume the data in XPos are independently and uniformly dis-

tributed in [XMin, XMax], where XMin and XMax refer to very large negative and positive numbers.

The output is the weighted median of the input data.

(b) [10 pt] Please develop a program for deciding the vaccine hub position in each LA using the algorithm

you have developed in 8a and named your program a2-[your UID].cpp or a2-[your UID].java.

If you use C/C++, please use the template as provided in A2-prgInfo to ensure you use the right input and

output format when run on the terminal / command prompt. If you use Java, please make sure that your

program can run from terminal / command prompt, accept the input file, and output the solution to the

terminal / command prompt, as per specification.

The following are the program specifications

Input to the Program: The program will accept one argument, which is the name of the input file. The

input file contains n + 1 lines, where n is the number of building clusters in the LA. The first line consists

of one integer number, n. Each line in the next n lines consists of three integer numbers, separated by

a white space. They represent the weight w along with the x and y coordinate of each cluster. You can

assume x and y are integer values, where the minimum and maximum follows the bound of int data type,

while w is of type float. Example:

3

0.25 1 2

0.4 3 1

0.35 4 5

3

COMP3600/6466 Algorithms Assignment 2

The above input means the LA has 3 residential/office clusters that the vaccine hub must serve. The cluster

centered at (1, 2) has a weight (importance) of 0.25, the one centered at (3, 1) has a weight of 0.4, and the

one centered at (4, 5) has a weight of 0.35.

Output of the Program: A single line in the standard output, containing two integer numbers separated

by a white space. The first number is xv and the second number is yv (aka., the position where the vaccine

hub should be set up). The output of the example input is:

3 2

Program Marking: If your program compiles and runs, you will get 2 points. We will then run your

program on 4 test cases: with increasingly many clusters (up to 100,000,000 clusters). For each test

case, your program will be given

(

50·#clusters

100,000 + 0.5

)

ms CPU time to find a solution. The time limit will

be rounded up to 2 decimal digit and is only for finding the position of the vaccination hub. It does not

include the loading time. For marking purposes, we will run a test on the loading time for out test cases,

and offset the loading time from the timing of your program. You can assume your program will have

access to at most 8GB RAM. It will be run as a single thread process on a computer with Intel i7 3.4GHz

processor. For each test case that your program solves correctly within the given time and memory limit,

you will get 2 points.

Tips:

• Be careful not to use arrays allocated in the program stack.

• We provide additional test cases in A2-prgInfo. However, these are just examples. You should NOT

extract specifications from these test cases. Specifications should follow the description in this ques-

tion sheet.

(c) [10 pt] Experimental analysis of the time complexity of your algorithm and comparison against the theo-

retical analysis (8a). You will need to have sufficient data to fit a function well and will need to generate

your own test cases for this purpose.

oOo That’s all for the questions folks oOo

Somehints:

Q6:Sortingmightbeuseful.

Q7:Heapcouldbeuseful.

Q8a:RandQuickSortmightbeworththinkingabout.

4

欢迎咨询51作业君