Homework Help, Tuesday Feb 9th, 2010

Two students just showed me something that I never saw before.
Their program did not process any dbms_output.put_line() statements.

If that happens to you, try to click ALT-F9.

Homework Help, Monday Feb 8th, 2010

Some students did not find any TAB characters in the saved file.
This seems to depend on the version of Word and/or browser used.
However, the students figured out a solution. Copy and paste the file into an XCEL spread sheet.
Then save the file as CSV type (comma separated values).
Then inside of IMPORT, set the separator to comma.

As to the question how to generate substrings, that is part of the assignment. There are a number of Oracel web sites which explain string management.

As to the question how to create the new table, the easiest way is to use a
create new_table ... as select ....
construct. In the select - from part you can extract the short column name, but you need to use an alias for it to create the new table correctly.
This was meant as a hint, not as a complete solution.
Figuring this out is part of the assignment.

Homework 1

Due Date: 2/15/2010
Points: 50

(1) a) Build a data table as follows:   [5 points]

- Go to http://www.bloomberg.com/markets/stocks/movers_index_spx.html

- Write down the meaning of all the columns.  You will NEED them later.

- Copy and paste 500 data rows into a WORD file.
(Start at 3M and end at ZIONS BANK).

- Now create a database table called STOCKS using the IMPORT feature of AQUA.
Use the column names you wrote down.

- Show (print out and hand in) the complete table.   Yes, I know it is 500 rows.


b) Build a table as follows:  [5 points]

- Go to http://ab.hoovers.com/information/12320725.html?serv=GGLFTN22580055&t=180&num=8664643191&file=information/12320725.html&cm_ven=PAID&cm_cat=GGL&cm_pla=FTN&cm_ite=fortune%20500%20companies

- Use the same approach as in a) to build the table.  I showed it in class and it worked.

There is another way of doing it, but it is more complicated.  But now that I wrote
it already I leave it here.

- View the page source.

- Open a WORD file.

- Paste all the date from the page source starting at the TABLE tag 
before the first company and ending at the /TABLE tag after the last
company into a file.  Add the HTML and BODY tags at the beginning and the
corresponding /BODY and /HTML tags at the end.  Save the result as 
a .html (or .htm) file in your home directory public_html.  

If you don't have such a directory... consider setting up a home
directory as part of this homework. [No grade on that.]

- Open that .html file in a web browser.

- Now copy and paste all the data AGAIN into a WORD file (a different one).

- Save the file as .txt

- Import all the data into AQUA as a table called FORTUNE.
Use the column names as given on the Web page.

- Show (hand in) the complete table. 


WE WILL REUSE THESE TABLES IN LATER ASSIGNMENTS.

-------------

Now we would like to know which companies are in both tables
so that we can find out interesting things about them.
Like... what is the stock volume of all companies in NYC.

The problem is that one table contains full names in mixed
case, while the other table contains abbreviations in upper case.

Thus...

c) Create a table STOCKS_SHORT and a table FORTUNE_SHORT that
should each contain an ADDITIONAL COLUMN with the first 6 letters 
of the company names as capital letters.  Let's call these columns
STOCKS_ABBREV and FORTUNE_ABBREV.

Hand in the FIRST PAGE of each new table only.
[2 point]

d) Write a SELECT statement that will show the abbreviations and
full names from both tables where the abbreviations are the same
in both columns.
[1 point]

MAKE SURE THAT YOU HAND IN ALL SQL CODE AND ALSO THE RESULT
AS A PRINTOUT.  HERE AND EVERYWHERE ELSE.


e) LOOK AT THE RESULTS.
Yes, there are a couple of hundred.

You will see that there are cases where the abbreviations match
but the real names in the two tables are obviously different companies.
Mark those cases of mismatches by hand for yourself.
You don't need to hand them in.

But here is what you need to do:  Write UPDATE statements for every
such case that changes one of the abbreviations (however and whichever
you like) so that this mismatch does not happen again.

After you are done, rerun d) and make sure the problem does not
occur anymore.  SHOW THIS RESULT.

The point of this exercise is to see that real data is ugly.
Not what you normally get in a homework.
But this is what you will deal with in a job.

NOTE that we did not even solve all possible problems.
It could happen that two abbreviations do NOT match, but the
companies are the same.  But we will NOT deal with this problem
in this homework.

[5 points]


f) Now run a query that shows all companies in FORTUNE_SHORT that
are missing in STOCKS_SHORT.  Use your abbreviation column, of course.
[1 point]

g) Now run a query that shows all companies in STOCKS_SHORT that
are missing in FORTUNE_SHORT.  Use your abbreviation column, of course.
[1 point]


MAKE SURE THAT YOU HAND IN ALL SQL CODE AND ALSO THE RESULT
AS A PRINTOUT.  HERE AND EVERYWHERE ELSE.



(2) Write SQL queries to find the answers for the following questions.
Wherever a join is necessary you use the abbreviation columns.
[1 point each]

a) Write a SELECT statement to find all companies in NY City.

b) Write a SELECT statement to find all companies with stock prices greater than $100.

c) Write a SELECT statement using a join of the two tables to find companies in 
New York with a stock price between $50 and $100.  

d) SELECT all companies with a volume of more than 10,000,000 and a stock price
between $20 and $30.

e) SELECT all companies from Houston, TX and display them in DESCENDING order by
stock price.


(3) Write a PL/SQL program, using an explicit or implicit cursor
that prints to the screen the following result:

Show pairs of companies that are next to each other (in alphabetical order)
and that have stock prices between $40 and $50.  
In other words, companies in two consecutive lines in the STOCKS tables
with stock prices between $40 and $50.

If there are three (or more) such companies in three rows, then display them as two 
pairs.

Make sure you use a loop.
[10 points]

(4) Write a PL/SQL program, using an explicit or implicit cursor
that prints to the screen the following result:

Which location has the largest number of Fortune 500 companies and how
many are there at this location?

Do NOT PRINT OUT ANYTHING MORE than what is necessary to answer this question.
Providing more than the correct answer would result in points taken off.

You can divide "the work" between SELECT statements and PL/SQL statements
however you like as long as there is one loop.
[10 points]


(5) Write SQL queries to find the answers for the following questions.
[1 point each]

a) Which company had the smallest volume?

b) Which company had the largest percent increase?

c) How many companies have a stock price between $30 and $60?

d) What was the average volume for all companies?

e) What is the total sum of volumes for all companies in Minneapolis, MN.


===============================================================

VOLUNTARY QUESTION:  No extra credit.

Solve (5) e) with at most 3 SELECT statements (they may be nested).

================================================================

THE FOLLOWING APPLIES TO ALL ASSIGNMENTS:

- Create a nice cover page with 
  your name, assignment number, class number and a descriptive
  title of the assignment.
  
  All parts of your assignment need to be stapled 
  together, or held together by a folder.  Points will be taken off
  for not complying with these rules.

  If you use a folder, write your name ALSO on the folder.  Or use
  a folder with a transparent cover page.

  No handwriting anywhere.
  No handwriting anywhere.
  No handwriting anywhere.

  Make things look nice and professional.  

- Documentation:
  Do a reasonable amount of internal documentation.
  Internal documentation typically means that you explain
  your code or SQL statements.
  No external documentation unless I specifically ask for it.

- Test runs:
  Show your table after you build it.  Show testruns.
  Show as much as you can.  The burden of proof is up to you.
  NO TESTRUNS, NO POINTS. 
  (Or very, very few.)  You may record the results in a file.
  Thus, you don't need to do screen dumps.
  You will not be punished for too many test runs.

- Hardcopy:
  All code, documentation, and testruns will be accepted as Hardcopy
  (Printout) ONLY.  No email, No fax etc.

- Grading:
  Grading takes into account correctness of the results, but also
  how well they are documented, and what their visual appearance is.
  Things should look nice and PROFESSIONAL.  Not sloppy.  Use a spelling
  corrector for your comments or (later on) write-ups.  Some people do this
  very nicely.  Don't cut corners here, or you will look bad in comparison.
  PROGRAMS MUST BE INDENTED.

- Cheating:
  The usual policies apply.  0 points for "suspiciously similar" results.
  Report to the Dean in case of repetition.

- Files:
  You must keep all your files of every assignment up to the end of
  the semester.  If there is any doubt about your assignment I will
  ask you to demonstrate it online.

- Late policy:
  I will take off 10 points for being late up to 1 week.  Programs
  that are handed in more than a week late will receive no credit
  at all, unless you have a good reason with WRITTEN documentation
  (usually medical).