This lecture is based on a lecture on the Web of Sibel Adali and partially on the Database Security chapter in Elmasri & Navathe. Database Security and Authorization ----------------------------------- Access control deals with stopping users from logging into computers that they are not authorized to use. "Network security" deals with stopping unauthorized people from eavesdropping (listening in) on network traffic. Database security is considered a separate topic from the above. Absolute security is not possible. We are trying to make it expensive for an unauthorized user to read and/or destroy our data. Ideally it should be more expensive to steal the data than to obtain it legally. At the same time, we try to limit our own costs due to security. ---------- Basic Terms: Security: Protection of information against unauthorized disclosure, alteration or destruction. Policy: The rules you want to enforce. May be expressed in English. Example rules: 1) Normal users cannot read each other's data. 2) Database administrators can read all data. Mechanisms: The SW and/or Hardware that enforces the policy. ---------- Types of Security Policy: Access Control: Who can access what. Two parts: Authentication: Are you who say you are? Authorization: Are you permitted to see this data? Flow Control: Are you permitted to copy the information? Inference Control: Stop people from inferring data they are not permitted to see from data that they are permitted to see. This is a problem in Statistical Databases. (More about that later.) Cryptographic Control: Encrypt data on disk. ----------- Basic Model of Access Control: Every data unit has an owner. Read, modify, delete are privileges of the owner. An owner may grant privileges to other people. An owner may revoke privileges from other people. (This assumes that authentication works correctly.) Information about privileges is stored under heavy protection. (Separate from other data.) ------------ Access Matrix: (Example) |Object 1 | Object 2 | Object 3 | ------------------------------------------------------ User 1 | rwd | r | r | User 2 | r | r | rwd | User 3 | r | null | r | User 4 | r | rwd | r | Imagine that Object 1 is the table Employees that is owned by User 1. Capability List: A row of the Access Matrix Access List: A column in the Access Matrix ------------- Example Mechanism for Access Control Authorization Table: Which user has access to which column of which table? And which kind of access, r? w? d? This is implemented as a VIEW. Every "normal" user can see the view only, but not the table. The administrator can see the table. The view does not contain all columns. Example: Master Table of Employees Name|Address| Office Number| Home Phone| Cell Phone| Login| Password -------------------------------------------------------------------- Create a view that shows everything, except for the Password column. Everybody can see that view, but NOT the table. Query Modification: The user's query is automatically modified so that the user sees only what he is permitted to see. Example: User types in select * from Master_Table The system translates this into select Name, Address, Office_Number, Home_Phone, Cell_Phone, Login from Master_Table Or Materialized View: That means, create the view as a table and update automatically. Needs more space, but retrieval is faster! Example: create table Master_Table_View as (select Name, Address, Office_Number, Home_Phone, Cell_Phone, Login from Master_Table) Create a table from another table. --------------- DBA: In non-military systems and non-high-security systems, the DBA has all the privileges and can see everything. In military, medical environment: "Needs to know" principle. You can only see it if you need to know it to do your job. --------------- Flow Control: Definition Flow: A flow from object X to object Y exists when a user (or his program) reads from X and writes to Y. If X is not accessible to User 1 and Y is accessible to User 1 then this flow allows User 1 to read something he is not permitted to read if somebody copied data from X to Y. Flow control is difficult. --------------- Multilevel Security: Commonly used in military environments. Most popular: Bell and LaPadula model. Security classes: TS (Top Secret) S (Secret) C (Confidential) U (Unclassified) Everybody may see this. Confidential, Secret and Top Secret together are called Classified. There are subjects (user, program) and objects (tables, rows, columns, operations, etc.) Every subject and every object is assigned one security class from the above four choices. Simple Security Property: A subject may READ an object only if the subject's security class is greater-or-equal to the security class of the object. Thus a "Secret user" may read a "Secret document" and a "Classified document" but he may not read a "Top Secret document." (Everybody may read "unclassified.") More unintuitive is the Star Property (*-Property) A subject may WRITE an object only if the subject's security class is less-or-equal to the security class of the object. Why? Otherwise a subject that has Top Secret clearance might copy a Top Secret document and write it as an Unclassified document, making it readable to everybody. We don't want that. Intuitively... users may "read down" or equal. Users may "write up" or equal. ------------- Different possible mechanisms for Multilevel Policies 1) A table with data of different classifications is broken into several tables, one for each classification. Then the several tables are stored at different places (disks). DB 1: U DB 2: C DB 3: S DB 4: TS 2) Reference Monitor: A heavily guarded program that checks that you only get data you are entitled to. 3) Distributed Mechanism: Have a separate copy of the whole database with data for a specific classification (or lower). 4 Overlapping databases! Problem: An update to the Unclassified Database has to be propagated to all other databases. (Synchronization problem). Example: DB 1: U DB 2: C U DB 3: S C U DB 4: TS S C U -------------- Inference Controls: (Needed for Statistical Databases) Statistical Database: A database that contains many confidential records. Users are only allowed to access count, sum, average, min, max from the database. Users are not allowed to access individual records. However, a smart user can (sometimes) derive the information he wants by inference. Inference control tries to stop that. A tuple (row, record) is called "compromised" if it is possible to infer it. ---------------- Example: Hospital Statistical Database. How many patients in the hospital are male, with four children, age 40-45? Answer: 1 How many patients in the hospital are male, with four children, age 40-45, and have AIDS? Answer: 1 If we have prior knowledge that John Smith is 42, has 4 children, and was admitted to the hospital, we have just verified that John Smith has AIDS. ------------------ The result of a query BEFORE applying the aggregation operator (avg, sum, count,...) is called Query Set. By prohibiting Query Sets of size 1, we can block the above inference. (Or of "small size.") We have done inference control. (If the query set is small, the system does not give an answer.) ------------------ Example 2: In the company... Q: How many employees are there? A: 20 Q: What is the average salary of all employees? A: 43000 Q: How many employees have fewer than 4 children? A: 19 Q: What is the average salary of employees with fewer than 4 children? A: 28000 X ... Salary of the 20th person. We can now infer that 19 * 28000 + X = 20 * 43000 If we know that John Smith has 4 children or more, then we can infer that John Smith has a salary of X = $328000. ----------- Here requiring a minimum query set size does not help! We could limit both the minimum size and the maximum size. But maximum is different for every table! Other Solution: Add extra "random noise" records to the query set. If the query set is large, those extra records have little effect. If the query set is small, then this is probably a query we are trying to block. This is called error inoculation. Or: Do not compute the aggregate on the whole query set, but on a random large sample. (This is considered the best solution nowadays.) Another technique is to block repeated queries that refer to the same query set. However, the intruder can come back another day and ask his questions. The database cannot remember all the queries. (In military applications, indeed, all access is recorded. This is called an Audit Trail. If we cannot stop the bad guys, at least we can find out after the fact what they did.) ------------ There is no guaranteed safety. Inference control rarely done nowadays. If it is important enough, check Audit Trails for unusual query patterns. (Today this is already done semi-automatically, with data mining.) ------------- Cryptography: This is a whole semester lecture by itself. Two main kinds: Public Key Encryption. Symmetric Encryption. In Symmetric Encryption the same key is used for encryption and decryption. Once the key has fallen into the wrong hands, everything is lost. We need to change keys frequently. We need secure channels to send keys. In public key encryption, the decryption key is different from the encryption key, and it is computationally very hard to derive the decryption key from the encryption key. Now the encryption key is made public! Breakthrough work on public key encryption was made by Ron Rivest, Adi Shamir, Len Adleman. (RSA) But symmetric encryption is more efficient than public key encryption. So one method is to send the KEY for a symmetric encryption session using public key encryption. Typically encryption based on - prime factorization of very large numbers. - or on integer discrete logarithms [discrete means using MODULO arithmetic.] (Both of these are difficult to do.) Look up PGP (Pretty Good Privacy!) on the Web. (Check Philip Zimmermann's book.) -------------- Access Control in Oracle CREATE ROLE (but I don't have the privilege to do this). This defines a group of users. Then I can assign users to this role. And I can assign tables to this role. Then all users of this role can access all tables of this role. Examples of simple access control: grant insert on x to public / Everybody with an Oracle account can now insert into my table x. Same thing for: delete, select, update, alter, index & all (all means all of the above) If you already granted it and grant it again, the system does not tell you it was already granted. revoke all on x from rd37 -- take away access from rd37 / revoke alter on x from bv7 / grant insert, update on x to yc7 / grant alter on x to yc7 with grant option / That means that user yc7 may herself grant alter on the table x to other people! grant insert on UCIDS632 to mr293, as739 / This complicates revoke: revoke alter on x from yc7 cascade constraints / (Revoke from yc7 and everybody she passed it on to.) Which users may access my tables? select * from user_tab_privs; There are a bunch of other useful tables. Check here: http://www.ss64.com/ora/revoke.html e.g.: user_tab_privs_made user_tab_privs_recd Who did I give access to my table X to? select * from user_tab_privs where table_name = 'X'