This lecture is based on a lecture on the 
Web of Sibel Adali and partially on the
Database Security chapter in Elmasri & Navathe.

Database Security and Authorization
-----------------------------------

Access control deals with stopping users from
logging into computers that they are not authorized 
to use.  "Network security" deals with stopping
unauthorized people from eavesdropping (listening 
in) on network traffic.

Database security is considered a separate topic
from the above.

Absolute security is not possible.  We are trying to
make it expensive for an unauthorized user to read
and/or destroy our data.  Ideally it should be more
expensive to steal the data than to obtain it legally.
At the same time, we try to limit our own costs due to
security.

----------
Basic Terms:

Security:  Protection of information against unauthorized
disclosure, alteration or destruction.

Policy:  The rules you want to enforce.  May be expressed
in English.  

Example rules: 1) Normal users cannot read each other's 
data.
2) Database administrators can read all data.

Mechanisms: The SW and/or Hardware that enforces the policy.
----------

Types of Security Policy:

Access Control:  

Who can access what.  Two parts:
Authentication: Are you who say you are?
Authorization: Are you permitted to see this data?

Flow Control:

Are you permitted to copy the information?

Inference Control:

Stop people from inferring data they are not permitted to see
from data that they are permitted to see.  This is a problem
in Statistical Databases.  (More about that later.)


Cryptographic Control:

Encrypt data on disk.
-----------

Basic Model of Access Control:

Every data unit has an owner.
Read, modify, delete are privileges of the owner.

An owner may grant privileges to other people.
An owner may revoke privileges from other people.
(This assumes that authentication works correctly.)

Information about privileges is stored under
heavy protection.  (Separate from other data.)
------------

Access Matrix:
(Example)

       |Object 1  |  Object 2  |  Object 3 |
------------------------------------------------------
User 1 |  rwd     |   r        |  r        |
User 2 |  r       |   r        |  rwd      |
User 3 |  r       |   null     |  r        |
User 4 |  r       |   rwd      |  r        |

Imagine that Object 1 is the table Employees that is 
owned by User 1.

Capability List: A row of the Access Matrix
Access List: A column in the Access Matrix
-------------

Example Mechanism for Access Control

Authorization Table:

Which user has access to which column of which table?
And which kind of access, r? w? d?

This is implemented as a VIEW.  Every "normal" user can see
the view only, but not the table. The administrator can see the
table. The view does not contain all columns.

Example: Master Table of Employees

Name|Address| Office Number| Home Phone| Cell Phone| Login| Password
--------------------------------------------------------------------

Create a view that shows everything, except for the Password column.
Everybody can see that view, but NOT the table.


Query Modification:

The user's query is automatically modified so that
the user sees only what he is permitted to see.

Example: User types in

select * from Master_Table

The system translates this into

select Name, Address, Office_Number, Home_Phone, Cell_Phone, Login
from Master_Table

Or

Materialized View:

That means, create the view as a table and update automatically.
Needs more space, but retrieval is faster!

Example:

create table Master_Table_View as
    (select Name, Address, Office_Number, Home_Phone, Cell_Phone, Login
    from Master_Table)

Create a table from another table.
    
---------------

DBA: 

In non-military systems and non-high-security systems, 
the DBA has all the privileges
and can see everything.  In military, medical environment: 
"Needs to know" principle.  You can only see it if you need to know
it to do your job.

---------------

Flow Control:

Definition Flow: A flow from object X to object Y exists 
when a user (or his program) reads from X and writes to Y.

If X is not accessible to User 1 and Y is accessible to
User 1 then this flow allows User 1 to read something he
is not permitted to read if somebody copied data from X to Y.

Flow control is difficult.
---------------


Multilevel Security:

Commonly used in military environments.  

Most popular: Bell and LaPadula model.

Security classes:

TS (Top Secret)
S  (Secret)
C  (Confidential)
U  (Unclassified)  Everybody may see this.

Confidential, Secret and Top Secret together are called Classified.

There are subjects (user, program) and objects (tables,
rows, columns, operations, etc.)

Every subject and every object is assigned one 
security class from the above four choices.

Simple Security Property:

A subject may READ an object only if the subject's security
class is greater-or-equal to the security class of
the object.

Thus a "Secret user" may read a "Secret document"
and a "Classified document" but he may not read a
"Top Secret document."  (Everybody may read "unclassified.")

More unintuitive is the 
Star Property (*-Property)

A subject may WRITE an object only if the subject's security
class is less-or-equal to the security class of the
object. 

Why?  Otherwise a subject that has Top Secret
clearance might copy a Top Secret document and write
it as an Unclassified document, making it readable
to everybody. We don't want that. 

Intuitively... users may "read down" or equal.
Users may "write up" or equal.
-------------

Different possible mechanisms for Multilevel Policies

1) A table with data of different classifications
is broken into several tables, one for each
classification.  Then the several tables are stored
at different places (disks).

DB 1: U
DB 2: C 
DB 3: S 
DB 4: TS 

2) Reference Monitor: A heavily guarded program that
checks that you only get data you are entitled to.

3) Distributed Mechanism:  Have a separate copy
of the whole database with data for a specific
classification (or lower). 4 Overlapping databases!
Problem: An update to the Unclassified Database 
has to be propagated to all other
databases.  (Synchronization problem).

Example:

DB 1: U
DB 2: C U
DB 3: S C U
DB 4: TS S C U
--------------

Inference Controls:

(Needed for Statistical Databases)

Statistical Database:

A database that contains many confidential
records.  Users are only allowed to access
count, sum, average, min, max from the database.

Users are not allowed to access individual
records.

However, a smart user can (sometimes) derive the information
he wants by inference.  Inference control
tries to stop that.

A tuple (row, record) is called "compromised"
if it is possible to infer it.
----------------

Example:

Hospital Statistical Database.

How many patients in the hospital are male, with four children,
age 40-45?
Answer: 1

How many patients in the hospital are male, with four children,
age 40-45, and have AIDS?
Answer: 1

If we have prior knowledge that John Smith is
42, has 4 children, and was admitted to the 
hospital, we have just verified that John Smith
has AIDS.
------------------
The result of a query BEFORE applying the
aggregation operator (avg, sum, count,...) is called Query Set.

By prohibiting Query Sets of size 1, we can
block the above inference. (Or of "small size.")
We have done inference control.
(If the query set is small, the system does not give an answer.)
------------------
Example 2:  In the company...

Q: How many employees are there?
A: 20
Q: What is the average salary of all employees?
A: 43000
Q: How many employees have fewer than 4 children?
A: 19
Q: What is the average salary of employees with
fewer than 4 children?
A: 28000

X ... Salary of the 20th person.

We can now infer that 
19 * 28000 + X = 20 * 43000
If we know that John Smith has 4 children or more,
then we can infer that John Smith has a
salary of X = $328000.  

-----------
Here requiring a minimum query set size does not help!
We could limit both the minimum size and the maximum size.
But maximum is different for every table!

Other Solution: Add extra "random noise" records 
to the query set.

If the query set is large, those extra records have 
little effect.  If the query set is small, then this is
probably a query we are trying to block.
This is called error inoculation.

Or: Do not compute the aggregate on the whole query set,
but on a random large sample.  (This is considered
the best solution nowadays.)  

Another technique is to block repeated queries that
refer to the same query set.  However, the intruder can
come back another day and ask his questions.  The database
cannot remember all the queries.

(In military applications, indeed, all access is recorded.
This is called an Audit Trail.  If we cannot stop the
bad guys, at least we can find out after the fact what
they did.)
------------

There is no guaranteed safety.
Inference control rarely done nowadays.

If it is important enough, check Audit Trails for
unusual query patterns.  (Today this is already done
semi-automatically, with data mining.)
-------------

Cryptography:

This is a whole semester lecture by itself.

Two main kinds:  Public Key Encryption.
Symmetric Encryption.

In Symmetric Encryption the same key is used
for encryption and decryption.  Once the key
has fallen into the wrong hands, everything is lost.  
We need to change keys frequently.  We need secure 
channels to send keys.

In public key encryption, the decryption key
is different from the encryption key, and it
is computationally very hard to derive the
decryption key from the encryption key.  Now
the encryption key is made public!

Breakthrough work on public key encryption
was made by Ron Rivest, Adi Shamir, Len Adleman. 
(RSA)

But symmetric encryption is more efficient than
public key encryption.  So one method is to
send the KEY for a symmetric encryption session
using public key encryption.

Typically encryption based on 
- prime factorization of very large numbers. 
- or on integer discrete logarithms
[discrete means using MODULO arithmetic.]
(Both of these are difficult to do.)

Look up PGP (Pretty Good Privacy!) on the Web.
(Check Philip Zimmermann's book.)







--------------

Access Control in Oracle

CREATE ROLE  (but I don't have the privilege to do this).

This defines a group of users.  Then I can assign
users to this role.  And I can assign tables to this
role.  Then all users of this role can access all tables
of this role.

Examples of simple access control:

grant insert on x to public
/

Everybody with an Oracle account can now insert into my
table x.

Same thing for:

delete, select, update, alter, index & all (all means all of the above)

If you already granted it and grant it again, the system does
not tell you it was already granted.

revoke all on x from rd37    -- take away access from rd37
/
revoke alter on x from bv7
/

grant insert, update on x to yc7
/
grant alter on x to yc7 with grant option
/

That means that user yc7 may herself grant
alter on the table x to other people!


grant insert on UCIDS632 to mr293, as739
/

This complicates revoke:

revoke alter on x from yc7 cascade constraints
/
(Revoke from yc7 and everybody she passed it on to.)

Which users may access my tables?
select * from user_tab_privs;

There are a bunch of other useful tables.
Check here: http://www.ss64.com/ora/revoke.html

e.g.:
user_tab_privs_made
user_tab_privs_recd

Who did I give access to my table X to?

select * from user_tab_privs where table_name = 'X'