This is one of the webpages of Libarid A. Maljian at the Department of Physics at CSLA at NJIT.

New Jersey Institute of Technology

College of Science and Liberal Arts

Department of Physics

Introductory Astronomy and Cosmology, Section 101

Phys 202-101

Fall 2024

Third Examination lecture notes

Our Star, the Sun

The Sun is a star. We know more about the Sun than any other star, since the Sun is by far the closest star to the Earth. The Sun is roughly one hundred and fifty million kilometers from the Earth. This seems distant by human standards, but in fact this is extremely close by astronomical standards. The nearest stars besides the Sun are more than two hundred thousand times further from the Earth than the Sun is. Therefore, the Sun is indeed extremely close to the Earth by astronomical standards, enabling astrophysicists to learn much more about the Sun than any other star in the universe.

Astrophysicists use the symbol R_☉ to denote the radius of the Sun, as we discussed earlier in the course. The radius of the Sun R_☉ is such a fundamental unit in stellar astrophysics that it is called a solar radius. The Sun is enormous; one solar radius R_☉ is roughly equal to seven hundred thousand kilometers. This is roughly one hundred times the Earth’s radius. In other words, one solar radius R_☉ is roughly equal to 100R_⊕, where astrophysicists use the symbol R_⊕ to denote the radius of the Earth, as we discussed earlier in the course. Since one solar radius R_☉ is roughly equal to 100R_⊕, the Sun has a volume roughly one million times the Earth’s volume, since the volume of a sphere is directly proportional to the cube of its radius and one hundred cubed is one million. In other words, we could fit roughly one million Earths inside the Sun! Astrophysicists use the symbol M_☉ to denote the mass of the Sun, as we discussed earlier in the course. The mass of the Sun M_☉ is such a fundamental unit in stellar astrophysics that it is called a solar mass. The mass of the Sun is tremendous; one solar mass M_☉ is roughly one thousand times the mass of Jupiter, which is itself more massive than the rest of the mass of the Solar System combined. Therefore, one solar mass M_☉ is roughly one thousand times the mass of the rest of the Solar System combined. More precisely, one solar mass M_☉ is roughly two nonillion kilograms. (Please refer to the following multiplication table, where each number is one thousand times the previous number: one, one thousand, one million, one billion, one trillion, one quadrillion, one quintillion, one sextillion, one septillion, one octillion, one nonillion, one decillion. Caution: this multiplication table is only correct in American English. These same words are used for different numbers in British English.) Astrophysicists have determined the mass of the Sun using Kepler’s third law. 
There are eight planets and millions of asteroids and millions of comets all orbiting the Sun, and astrophysicists use their orbital parameters together with Kepler’s third law to determine the mass of the Sun. Even though all these different objects have completely different orbits, Kepler’s third law always yields the same result for the mass of the Sun. We can combine the distance to the Sun with the intensity of sunlight we receive from the Sun to calculate the luminosity of the Sun. The luminosity of any object is the total amount of energy it radiates every second, commonly known as the power output. The luminosity or the power output of any object is measured in watts. Astrophysicists use cursive (script) ℒ for luminosity. For any object with luminosity ℒ, the intensity of the light I at a distance r from the object is given by the equation I = ℒ / 4πr². This equation assumes that the object radiates energy isotropically (equally in all directions). Thus, the total energy radiated by the object cuts through a sphere centered on the object, and the surface area of a sphere of radius r is 4πr². This equation also reveals why a lightbulb for example looks brighter when closer and dimmer when further. Doesn’t the lightbulb radiate a constant luminosity (constant power output) regardless of distance? Indeed it does, but that same luminosity has spread out over a large sphere if we are far from the lightbulb. Hence, that constant luminosity is diluted over the large sphere, and thus a smaller fraction of that luminosity enters our eye. Conversely, that same luminosity is concentrated over a small sphere if we are close to the lightbulb, and thus a larger fraction of that luminosity enters our eye. We certainly know our distance from the Sun, and we can measure the intensity of sunlight at our distance from the Sun. Thus, the only unknown remaining in the equation I = ℒ / 4πr² is the luminosity of the Sun. 
Astrophysicists use the symbol ℒ_☉ to denote the luminosity of the Sun. The luminosity of the Sun ℒ_☉ is such a fundamental unit in stellar astrophysics that it is called a solar luminosity. The luminosity of the Sun is enormous; one solar luminosity ℒ_☉ is roughly four hundred septillion watts. (Again, please refer to the above multiplication table.) The Sun has been radiating roughly four hundred septillion joules of energy every second for roughly five billion years, and it will continue to do so every second for the next roughly five billion years! This raises the following question: what is the source of this incredible luminosity? More plainly, why does the Sun shine? We will reveal the answer to this question shortly.
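The two measurements described above can be sketched numerically. The following is only a rough illustration, not a precise determination. It assumes values that are not stated in these notes: the gravitational constant, the number of seconds in a year, and a measured intensity of sunlight at the Earth of roughly 1361 watts per square meter. The function names are hypothetical.

```python
import math

# Assumed values (not stated in the notes).
G = 6.674e-11             # gravitational constant, m^3 kg^-1 s^-2
YEAR_S = 3.156e7          # seconds in one year (the Earth's orbital period)
SOLAR_INTENSITY = 1361.0  # W/m^2, intensity of sunlight at the Earth

# From the notes: the Earth-Sun distance is roughly 150 million kilometers.
EARTH_SUN_DISTANCE = 1.5e11  # m


def solar_mass_from_orbit(a_m, period_s):
    """Kepler's third law (in Newton's form) for a small body orbiting the Sun."""
    return 4 * math.pi**2 * a_m**3 / (G * period_s**2)


def luminosity_from_intensity(intensity, r_m):
    """Solve I = L / (4 pi r^2) for L, assuming isotropic radiation."""
    return intensity * 4 * math.pi * r_m**2


mass = solar_mass_from_orbit(EARTH_SUN_DISTANCE, YEAR_S)
lum = luminosity_from_intensity(SOLAR_INTENSITY, EARTH_SUN_DISTANCE)
print(f"M_sun ~ {mass:.2e} kg")
print(f"L_sun ~ {lum:.2e} W")
```

With these inputs the results come out close to two nonillion kilograms and four hundred septillion watts, in agreement with the values quoted above.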

The surface temperature of the Sun is roughly six thousand kelvins. Astrophysicists have determined the Sun’s surface temperature using two different methods. Firstly, we can graph the amount of light from the Sun as a function of the wavelength of the light. The resulting graph is a continuous blackbody spectrum, although there is an absorption spectrum superimposed upon that continuous blackbody spectrum as we will discuss shortly. From the peak of this continuous blackbody spectrum, we can calculate the surface temperature of the Sun. More plainly, we are calculating the surface temperature of the Sun from its color. As we discussed earlier in the course, the amount of energy radiated from a hot, dense object often follows the blackbody spectrum, which is a continuous spectrum with its peak radiation within a band of the Electromagnetic Spectrum determined by the temperature of the object. In particular, hotter temperatures correspond to higher photon energies (which are also at higher frequencies and shorter wavelengths), while cooler temperatures correspond to lower photon energies (which are also at lower frequencies and longer wavelengths). In other words, a hot, dense object’s primary radiation is displaced as its temperature changes. This is the Wien displacement law. More precisely, the Wien displacement law states that the wavelength of a hot, dense object’s primary radiation is inversely proportional to its temperature, assuming we measure temperature with correct units such as kelvins or rankines. At one or two thousand kelvins, objects radiate primarily red visible light. At three or four thousand kelvins, objects radiate primarily orange visible light. At five or six thousand kelvins, objects radiate primarily yellow visible light. At roughly ten thousand kelvins, objects radiate primarily blue visible light. 
Notice how hotter temperatures displace the primary radiation to higher and higher photon energies (which are also higher and higher frequencies and shorter and shorter wavelengths), while cooler temperatures displace the primary radiation to lower and lower photon energies (which are also lower and lower frequencies and longer and longer wavelengths). It is commonly known that the Sun is a yellow star. For example, every young child will use a yellow crayon when asked to draw the Sun. From that yellow color, we can use the Wien displacement law to calculate that the surface temperature of the Sun is roughly six thousand kelvins. We can also calculate the surface temperature of any hot, dense object using the Stefan-Boltzmann law, which states that the luminosity of any hot, dense object is directly proportional to the product of its surface area and the fourth power of its surface temperature. Since the shape of the Sun is very nearly a sphere and the surface area of a sphere of radius R is 4πR², the Stefan-Boltzmann law for the Sun states ℒ = σ(4πR²)T⁴, where T is the surface temperature. Also, σ (the lowercase Greek letter sigma) is a fixed number called the Stefan-Boltzmann constant. Warning: we use lowercase r for the distance from the hot object, and we use uppercase (capital) R for the actual radius of the hot object. In particular for the Sun, r is roughly one hundred and fifty million kilometers (our distance from the Sun), while R is roughly seven hundred thousand kilometers (the actual radius of the Sun). We already determined the luminosity of the Sun, and we certainly know the radius of the Sun. Therefore, the only unknown remaining in the Stefan-Boltzmann law ℒ = σ(4πR²)T⁴ is the surface temperature of the Sun, which we again calculate to be roughly six thousand kelvins, consistent with the Wien-displacement-law method.
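As a rough check of the Stefan-Boltzmann method, we can solve ℒ = σ(4πR²)T⁴ for T using the rounded values of ℒ and R quoted in these notes. The numerical values of the Stefan-Boltzmann constant and the Wien displacement constant below are assumptions, since neither is stated in these notes, and the function names are hypothetical.

```python
import math

SIGMA = 5.670e-8   # Stefan-Boltzmann constant, W m^-2 K^-4 (assumed value)
WIEN_B = 2.898e-3  # Wien displacement constant, m K (assumed value)

L_SUN = 4e26       # W, "roughly four hundred septillion watts"
R_SUN = 7e8        # m, "roughly seven hundred thousand kilometers"


def surface_temperature(luminosity, radius):
    """Solve L = sigma * (4 pi R^2) * T^4 for T."""
    area = 4 * math.pi * radius**2
    return (luminosity / (SIGMA * area)) ** 0.25


def peak_wavelength(temperature):
    """Wien displacement law: peak wavelength is inversely proportional to T."""
    return WIEN_B / temperature


T = surface_temperature(L_SUN, R_SUN)
print(f"T ~ {T:.0f} K")
print(f"peak wavelength ~ {peak_wavelength(T) * 1e9:.0f} nm")
```

The temperature comes out near six thousand kelvins, and the corresponding peak wavelength falls within the visible band of the Electromagnetic Spectrum.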

From the absorption spectral lines (the Fraunhofer lines, as we discussed earlier in the course) superimposed upon the Sun’s continuous blackbody spectrum, we can determine the composition of the Sun. We discover that the Sun is composed of all the atoms on the Periodic Table of Elements, but not in equal amounts. Only two atoms account for close to one hundred percent of the Sun’s mass; all the other atoms on the Periodic Table of Elements account for only a tiny fraction (tiny percentage) of the Sun’s mass. Roughly seventy-five percent (three-quarters) of the Sun’s mass is hydrogen, and roughly twenty-five percent (one-quarter) of the Sun’s mass is helium. Again, all the other atoms on the Periodic Table of Elements make up a tiny fraction (tiny percentage) of the Sun’s mass.

The Sun radiates roughly four hundred septillion watts, meaning roughly four hundred septillion joules of energy every second. The Sun has been radiating this tremendous luminosity for roughly five billion years, and it will continue to do so for the next roughly five billion years. What is the source of this incredible luminosity? More plainly, why does the Sun shine? This question was one of the great scientific debates of the 1800s (the nineteenth century). Chemical reactions provide nowhere near enough energy to account for the Sun’s luminosity over its long lifetime. It is not difficult to calculate that the Sun would consume all of its mass in only several thousand years if it derived its luminosity from chemical reactions, but the Sun has been shining for roughly five billion years. Gravitational contraction is not the Sun’s energy source either. Although gravitational contraction does convert gravitational energy into heat and light, it is not difficult to calculate that the Sun would need to collapse in several million years to account for its incredible luminosity. Although this several-million-year lifespan is an improvement over the several-thousand-year lifespan chemical reactions could provide, it is nevertheless still nowhere near the Sun’s actual lifetime, which is in the billions of years. Gravitational contraction is also called Kelvin-Helmholtz contraction, named for British physicist William Thomson (Lord Kelvin) and German physicist Hermann von Helmholtz, the two physicists who developed the mathematical details of gravitational contraction. Note that while the Sun was being born as a collapsing cloud of gas from within a diffuse nebula, it did derive its energy from Kelvin-Helmholtz (gravitational) contraction as we will discuss shortly, and indeed the Sun collapsed in only several million years while it was being born. However, the Sun eventually attained gravitational equilibrium, meaning outward pressure balances inward self-gravity. 
The Sun has been in gravitational equilibrium for roughly five billion years, and so Kelvin-Helmholtz (gravitational) contraction does not explain why the Sun has been shining for most of its lifetime. The 1800s (the nineteenth century) ended with this fundamental question unanswered. Why does the Sun shine? At the beginning of the 1900s (the twentieth century), the atomic theory of matter became firmly established. Moreover, physicists discovered that atoms are composed of even smaller particles: the nucleus at the center of the atom and electrons around the nucleus. Physicists discovered that chemical reactions involve the electrons around the nucleus. Physicists also discovered nuclear reactions, which involve the nuclei themselves. These nuclear reactions can generate thousands, even millions, of times more energy than chemical reactions. Perhaps the Sun derives its energy from nuclear reactions. Before we explore this idea further, we must discuss some fundamental physics.
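The two failed explanations can be checked with order-of-magnitude arithmetic. The sketch below is only a crude estimate. It assumes a chemical fuel that releases about fifty million joules per kilogram (a hypothetical but typical figure for combustion), and it estimates the available gravitational energy as GM²/R with the numerical prefactor omitted, so the contraction timescale is only trustworthy to within a factor of several. The function names are hypothetical.

```python
G = 6.674e-11            # gravitational constant, m^3 kg^-1 s^-2 (assumed value)
YEAR_S = 3.156e7         # seconds in one year (assumed value)

M_SUN = 2e30             # kg, from the notes
R_SUN = 7e8              # m, from the notes
L_SUN = 4e26             # W, from the notes

# Assumption: energy released per kilogram of a good chemical fuel.
CHEMICAL_J_PER_KG = 5e7


def chemical_lifetime_years(mass=M_SUN, luminosity=L_SUN):
    """Time to burn the Sun's entire mass as chemical fuel."""
    return mass * CHEMICAL_J_PER_KG / luminosity / YEAR_S


def contraction_lifetime_years(mass=M_SUN, radius=R_SUN, luminosity=L_SUN):
    """Crude Kelvin-Helmholtz timescale: (G M^2 / R) / L, prefactor omitted."""
    return G * mass**2 / (radius * luminosity) / YEAR_S


print(f"chemical:    ~{chemical_lifetime_years():.0e} years")
print(f"contraction: ~{contraction_lifetime_years():.0e} years")
```

The chemical estimate lands in the thousands of years and the contraction estimate lands in the millions to tens of millions of years, depending on the omitted prefactor. Both fall far short of the billions of years required.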

There are four fundamental forces in the universe. Starting with the strongest force in the universe, we have the strong nuclear force, the electromagnetic force, the weak nuclear force, and finally the gravitational force is the weakest force in the universe. Actually, the gravitational force is by far by far by far by far the weakest force in the entire universe. The gravitational force is much much much much weaker than the other three forces. All of us have some familiarity with gravity. As we discussed earlier in the course, gravity causes everything in the universe to attract everything else in the universe. All of us also have some familiarity with the electromagnetic force. As we discussed earlier in the course, there are both positive and negative electrical charges in our universe. Positive and positive repel, negative and negative repel, and positive and negative attract. In other words, like charges repel, and unlike charges attract. Protons are positively charged, while electrons are negatively charged. Since unlike charges attract, the positive protons within the atomic nucleus attract the negative electrons around the atomic nucleus. This is what holds the atom together, the attraction between the positive protons within the nucleus and the negative electrons around the nucleus. However, what holds the nucleus of an atom together? The atomic nucleus is composed of protons and neutrons. The neutrons are neutral; this is why they are called neutrons! Since neutrons are neutral, they are not attracted to or repelled from anything electromagnetically. More importantly, the protons are positive. Hence, they repel each other electromagnetically. If the neutrons feel no electromagnetic attraction and if all the protons feel electromagnetic repulsion from each other, then what is holding the atomic nucleus together? 
We deduce that there must be another force within the nucleus that is stronger than the electromagnetic force and hence overpowers the electromagnetic repulsion of the protons, thus holding the nucleus together. This force is the strongest force in the entire universe, and it is called the strong nuclear force. The strong nuclear force must be stronger than the electromagnetic force, since the strong nuclear force must overpower the electromagnetic repulsion among the protons to hold the atomic nucleus together. The strong nuclear force attracts protons and protons together, the strong nuclear force attracts neutrons and neutrons together, and the strong nuclear force even attracts protons and neutrons together. Protons and neutrons are composed of even smaller particles called quarks and gluons, and the strong nuclear force is also responsible for holding quarks and gluons together to build protons and neutrons. More precisely, gluons are quantum-mechanical particles that are ultimately responsible for the strong nuclear force, just as an electromagnetic wave (light) is composed of photons and hence photons are ultimately responsible for the electromagnetic force. If the strong nuclear force is so powerful, why don’t all the protons and neutrons in the universe attract each other to form one giant nucleus? This does not occur because the strong nuclear force has a limited range. The gravitational force does not have a limited range. Regardless how close or how distant two objects are from one another, they will attract each other gravitationally with a strength that depends upon their masses and the distance between them. The electromagnetic force also does not have a limited range. Regardless how close or how distant two objects are from one another, they will attract or repel each other electromagnetically with a strength that depends upon their charges and the distance between them. However, the strong nuclear force does have a limited range. 
Protons and neutrons will not feel the strong nuclear force if they are further than a certain limited range. The range of the strong nuclear force is roughly the size of the nucleus of an atom, which is a few quadrillionths of a meter or a few trillionths of a millimeter or a few billionths of a micrometer. Hence, the strong nuclear force does overpower the electromagnetic force within the nucleus of an atom, but the strong nuclear force essentially vanishes outside of the nucleus of an atom. Hence, the limited range of the strong nuclear force ensures that all the protons and neutrons in the universe do not attract each other to form a giant nucleus. The limited range of the strong nuclear force only permits this force to hold quarks and gluons together within protons and neutrons and to hold protons and neutrons together within the nucleus of an atom. The weak nuclear force is responsible for certain weak nuclear reactions, hence its name. It also has a limited range, like the strong nuclear force.
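To appreciate how weak gravity is compared with the electromagnetic force, we can compare the two forces between a pair of protons. Both forces weaken with the square of the distance, so the distance cancels in the ratio. The constants below are assumed values that are not stated in these notes, and the function name is hypothetical.

```python
K_COULOMB = 8.988e9    # Coulomb constant, N m^2 C^-2 (assumed value)
G = 6.674e-11          # gravitational constant, N m^2 kg^-2 (assumed value)
E_CHARGE = 1.602e-19   # proton charge, C (assumed value)
M_PROTON = 1.673e-27   # proton mass, kg (assumed value)


def electric_to_gravity_ratio():
    """Ratio of electric repulsion to gravitational attraction for two protons.

    Both forces scale as 1/r^2, so the separation r cancels.
    """
    electric = K_COULOMB * E_CHARGE**2
    gravity = G * M_PROTON**2
    return electric / gravity


ratio = electric_to_gravity_ratio()
print(f"electric / gravitational ~ {ratio:.1e}")
```

The ratio is of order 10³⁶, which is more than an American-English decillion, the last entry in the multiplication table above.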

The incredible strength of the strong nuclear force reveals why nuclear reactions generate so much more energy than chemical reactions. We will focus on two particular categories of nuclear reactions: nuclear fission reactions and nuclear fusion reactions. A nuclear fission reaction is the splitting of a larger, more massive (or heavier) nucleus into smaller, less massive (or lighter) nuclei. In fact, to fission anything means to split it in colloquial English. A nuclear fusion reaction is the merging of two smaller, less massive (or lighter) nuclei into a larger, more massive (or heavier) nucleus. In fact, to fuse anything means to merge them together in colloquial English. Nuclear fission reactions generate thousands of times more energy than chemical reactions, and nuclear fusion reactions generate thousands of times more energy than nuclear fission reactions, meaning that nuclear fusion reactions generate millions of times more energy than chemical reactions. To initiate a nuclear fission reaction, we must use a particle as a projectile that will collide with a more massive (heavier) nucleus; the collision causes the nucleus to split. We cannot use a proton as the projectile, since protons are positively charged, and the target nucleus is itself positively charged. Hence, the proton and the target nucleus will repel each other electromagnetically. We cannot use an electron as the projectile either. Although electrons are negatively charged and would be attracted to the positively charged nucleus that we are trying to split, an electron is almost two thousand times less massive (lighter) than a proton or a neutron. Hence, the electron has too little mass to split the target nucleus. If we wanted to demolish a condemned building, firing a bullet at the building would be fruitless. Regardless how fast the bullet may be moving, it has such little mass that it will not have sufficient momentum to demolish the condemned building. 
However, a wrecking ball is so massive that it carries sufficient momentum to demolish the condemned building, even if the wrecking ball is not moving particularly fast. If a proton is repelled by the target nucleus and if an electron has insufficient mass and thus insufficient momentum to split the target nucleus, the only particle left to try is a neutron. Although neutrons are neutral and thus will not be attracted to the target nucleus, they will not be repelled either. More importantly, the mass of a neutron is comparable to the mass of a proton. In fact, the mass of the neutron is a little bit more than the mass of a proton. Therefore, a neutron need not be moving particularly fast to carry sufficient momentum to split the target nucleus. Examples of massive (heavy) nuclei commonly used in nuclear fission reactions include uranium and plutonium. A particular example of a nuclear fission reaction is a neutron splitting a uranium nucleus into a krypton nucleus, a barium nucleus, and three neutrons. This nuclear fission reaction is more properly written n + ²³⁵U → ⁹²Kr + ¹⁴¹Ba + 3n. Note that n is the symbol of the neutron in nuclear physics. Also note that in nuclear physics, we use the same symbol for the nucleus of an atom as a chemist would use for the entire atom. For example, chemists use the symbol ¹⁴¹Ba for the barium-141 atom, but nuclear physicists use the same symbol for the barium-141 nucleus. Note that this nuclear reaction releases three neutrons, which can be used to split further nuclei. The result is a chain reaction. A chain reaction can be controlled, as is the case in nuclear power plants. A chain reaction can also be uncontrolled, as is the case in a nuclear fission bomb. In a nuclear power plant, control rods made of neutron-absorbing materials (such as boron or cadmium) are used to control the reaction rate. If the chain reaction is proceeding too quickly, control rods are inserted into the reactor core; these control rods absorb some of the neutrons to reduce the splitting of the nuclei, thus slowing down the reaction. 
If the chain reaction is proceeding too slowly, control rods are pulled out of the reactor core, leaving more neutrons to split more nuclei, thus speeding up the reaction.
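The difference between a controlled and an uncontrolled chain reaction can be illustrated with a toy model. This is a deliberately simplified sketch, not a model of a real reactor: it assumes every fission releases exactly three neutrons, and that every neutron not absorbed by the neutron-absorbing rods goes on to split another nucleus. The function name and the `absorbed_fraction` parameter are hypothetical.

```python
from fractions import Fraction

NEUTRONS_PER_FISSION = 3  # from the example reaction in the notes


def neutron_population(generations, absorbed_fraction, start=1):
    """Return the neutron count in each generation of the chain reaction.

    absorbed_fraction is the share of neutrons removed by the rods
    before they can split another nucleus (between 0 and 1).
    """
    # Number of new fissions triggered, on average, by each fission.
    k = NEUTRONS_PER_FISSION * (1 - Fraction(absorbed_fraction))
    counts = [Fraction(start)]
    for _ in range(generations):
        counts.append(counts[-1] * k)
    return counts


uncontrolled = neutron_population(5, 0)           # no absorption at all
steady = neutron_population(5, Fraction(2, 3))    # two of every three absorbed
damped = neutron_population(5, Fraction(3, 4))    # rods inserted further

print([int(c) for c in uncontrolled])
print([float(c) for c in steady])
print([float(c) for c in damped])
```

With no absorption the population triples every generation. Absorbing two of every three neutrons holds the population steady, and absorbing more than that makes the reaction die away.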

A nuclear fusion reaction is the merging of two less massive (lighter) nuclei into a more massive (heavier) nucleus. However, all nuclei are positively charged. Therefore, all nuclei repel each other electromagnetically, which should prevent a fusion (a merging) of nuclei from ever occurring. As we discussed earlier in the course, temperature is a measure of the average energy of individual particles. In this course, we may assume that the average energy of individual particles corresponds to their average speed. In other words, particles move relatively faster at hotter temperatures, while particles move relatively slower at cooler temperatures. Imagine incredibly hot temperatures when nuclei are moving so fast that although they repel each other electromagnetically as they approach each other, their tremendous energies at these incredibly hot temperatures bring them within a few quadrillionths of a meter of one another despite their mutual electromagnetic repulsion. It is within this range that the strong nuclear force operates. Hence, the strong nuclear force will overpower the electromagnetic repulsion, and the nuclei will fuse together. Hydrogen is the least massive (the lightest) atom in the entire universe, and helium is the second least massive (the second lightest) atom in the entire universe. Hence, a particular example of a nuclear fusion reaction is hydrogen nuclei fusing into a helium nucleus. The threshold temperature at which hydrogen fuses into helium is several million kelvins. This is incredibly hot by human standards, but this threshold temperature would have been even hotter if it were not for Quantum Mechanics, the correct theory of molecules, atoms, and subatomic particles. At the foundation of Quantum Mechanics is the Heisenberg Uncertainty Principle, named for the German physicist Werner Heisenberg who not only formulated this fundamental principle but was also one of the physicists who formulated Quantum Mechanics itself. 
The Heisenberg Uncertainty Principle states that it is impossible for a subatomic particle to have a definite position (location) and a definite velocity (speed) at the same time. Because of this Heisenberg Uncertainty Principle, there is a fair probability for subatomic particles to overcome energy barriers even when they have insufficient energy to overcome the barrier. This is called quantum-mechanical tunneling. At first glance, this quantum-mechanical tunneling seems like unscientific nonsense, but quantum-mechanical tunneling has been proven for many subatomic particles, including electrons for example. In fact, modern electronic devices such as mobile telephones and computers would not function correctly without the quantum-mechanical tunneling of electrons. At several million kelvins of temperature, most hydrogen nuclei are still not moving sufficiently fast to quantum-mechanically tunnel through the electromagnetic repulsion between them, but temperature is a statistical measure of average speed. In other words, at any given temperature, whereas many particles move with a certain average speed, a small number of particles move much slower than the average speed, and a small number of particles move much faster than the average speed. At several million kelvins of temperature, a small fraction of hydrogen nuclei do move sufficiently fast that they are able to quantum-mechanically tunnel through the electromagnetic repulsion between them, enabling the strong nuclear force to fuse them together. Humans have achieved uncontrolled nuclear fusion reactions using nuclear fusion bombs. Humans have not yet succeeded in controlled nuclear fusion reactions. Since we do not yet have the technology to control nuclear fusion reactions at these incredible temperatures, all nuclear power plants use nuclear fission reactions, not nuclear fusion reactions.

The energy yield of both nuclear fission bombs and nuclear fusion bombs is measured in units of tons of trinitrotoluene (TNT), a chemical explosive. One ton of TNT has an explosive yield of roughly four billion joules of energy. One kiloton of TNT has an explosive yield of one thousand tons of TNT, since the prefix kilo- always means thousand. For example, there are one thousand meters in a kilometer, and there are one thousand grams in a kilogram. Since one kiloton of TNT has an explosive yield of one thousand tons of TNT and since one ton of TNT has an explosive yield of roughly four billion joules of energy, one kiloton of TNT therefore has an explosive yield of roughly four trillion joules of energy. One megaton of TNT has an explosive yield of one million tons of TNT, since the prefix mega- always means million. Since one megaton of TNT has an explosive yield of one million tons of TNT and since one ton of TNT has an explosive yield of roughly four billion joules of energy, one megaton of TNT therefore has an explosive yield of roughly four quadrillion joules of energy. The typical yield of a nuclear fission bomb is a few kilotons of TNT, and the typical yield of a nuclear fusion bomb is a few megatons of TNT. These incredible yields help us appreciate the extraordinary energies released from nuclear reactions. We can also appreciate the vast quantities of energy released from nuclear reactions by discussing the activation energy required to detonate these nuclear weapons. We require a powerful chemical explosive to rapidly compress uranium or plutonium into a configuration dense enough for an uncontrolled chain reaction to begin. Hence, the detonator of a nuclear fission bomb is a chemical explosive, such as TNT. We require a fission bomb to heat hydrogen to millions of kelvins of temperature so that the hydrogen nuclei can move sufficiently fast to fuse into helium nuclei. Hence, the detonator of a nuclear fusion bomb is a nuclear fission bomb! 
These activation energies also give us a comparative scale. Comparing a chemical explosion to a nuclear fission explosion is rather like comparing a nuclear fission explosion to a nuclear fusion explosion!
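The unit conversions above, together with a comparison to the Sun, can be written out as a short calculation. This is only an illustration using the rounded figures quoted in these notes (four billion joules per ton of TNT and four hundred septillion watts for the Sun); the variable names are hypothetical.

```python
TON_TNT_J = 4 * 10**9               # joules, "roughly four billion joules"
KILOTON_J = 1_000 * TON_TNT_J       # kilo- means thousand
MEGATON_J = 1_000_000 * TON_TNT_J   # mega- means million

L_SUN_W = 4 * 10**26                # joules per second, from the notes

# How many megatons of TNT match the energy the Sun radiates in one second?
megatons_per_second = L_SUN_W // MEGATON_J

print(f"one kiloton = {KILOTON_J:.0e} J")
print(f"one megaton = {MEGATON_J:.0e} J")
print(f"Sun ~ {megatons_per_second:.0e} megatons of TNT every second")
```

With these rounded figures, the Sun radiates the equivalent of roughly one hundred billion megatons of TNT every second.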

As we discussed, the Sun is roughly three-quarters hydrogen and roughly one-quarter helium. We now suspect that the Sun derives its energy from the nuclear fusion of hydrogen into helium. Unfortunately, the surface temperature of the Sun is only six thousand kelvins, as we discussed. This is nowhere nearly hot enough to fuse hydrogen into helium. However, the interior of the Sun is much hotter than six thousand kelvins. Theoretical calculations reveal that the core of the Sun is at roughly fifteen million kelvins of temperature. Even at this incredibly hot temperature, it is only a small fraction of hydrogen nuclei that move sufficiently fast to quantum-mechanically tunnel through the electromagnetic repulsion between them. However, the Sun is also incredibly massive. Although these incredibly hot temperatures are only attained in the Sun’s core, the solar core is massive enough that a small fraction of the enormous number of hydrogen nuclei that compose the solar core is an appreciable number. In other words, the Sun’s core is composed of such an incredible number of hydrogen nuclei that a fair amount of nuclear fusion occurs, even though nuclear fusion is somewhat improbable even at several million kelvins of temperature. In conclusion, the Sun shines because of the nuclear fusion of hydrogen into helium in its core at roughly fifteen million kelvins of temperature. Warning: most of the Sun is not hot enough for any nuclear fusion to occur. Only the Sun’s core is hot enough to fuse some hydrogen into helium. Therefore, the solar core is slowly but progressively becoming less and less hydrogen and more and more helium, while the rest of the Sun remains roughly three-quarters hydrogen and roughly one-quarter helium. In roughly five billion years, the solar core will exhaust its hydrogen, becoming nearly entirely helium. At that point, the Sun will begin to die, as we will discuss shortly. We emphasize again that the entire Sun will never become pure helium. 
Most of the Sun will remain roughly three-quarters hydrogen and roughly one-quarter helium, since the nuclear fusion of hydrogen into helium only occurs in the solar core.
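We can estimate why only a small fraction of hydrogen nuclei manage to fuse, even at fifteen million kelvins. The sketch below compares the typical thermal energy of a particle, taken to be of order the Boltzmann constant times the temperature, with the electromagnetic repulsion energy of two protons separated by one quadrillionth of a meter. The constants are assumed values not stated in these notes, and the choice of exactly one quadrillionth of a meter for the separation is an assumption standing in for "a few quadrillionths of a meter."

```python
K_BOLTZMANN = 1.381e-23   # Boltzmann constant, J/K (assumed value)
K_COULOMB = 8.988e9       # Coulomb constant, N m^2 C^-2 (assumed value)
E_CHARGE = 1.602e-19      # proton charge, C (assumed value)

CORE_TEMPERATURE = 1.5e7  # K, "roughly fifteen million kelvins"
STRONG_RANGE = 1e-15      # m, assumed representative range of the strong force

# Typical thermal energy of one particle in the solar core.
thermal_energy = K_BOLTZMANN * CORE_TEMPERATURE

# Electromagnetic repulsion energy of two protons at the strong-force range.
barrier = K_COULOMB * E_CHARGE**2 / STRONG_RANGE

ratio = barrier / thermal_energy
print(f"thermal energy ~ {thermal_energy:.1e} J")
print(f"barrier        ~ {barrier:.1e} J")
print(f"barrier / thermal ~ {ratio:.0f}")
```

The barrier is roughly a thousand times larger than the typical thermal energy. This is why fusion in the solar core relies on the rare fastest nuclei together with quantum-mechanical tunneling.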

The first step of the nuclear fusion reactions occurring in the solar core is the fusion of two protons into a deuteron. This reaction is more properly written ¹H + ¹H → ²H + e⁺ + ν_e. Again, in nuclear physics we use the same symbol for the nucleus of an atom as a chemist would use for the entire atom. For example, chemists use the symbol ¹H for the hydrogen-1 atom (the protium atom), but nuclear physicists use the same symbol for the hydrogen-1 nucleus (simply a proton). Also, chemists use the symbol ²H for the hydrogen-2 atom (the deuterium atom), but nuclear physicists use the same symbol for the hydrogen-2 nucleus (a deuteron). The symbol ν_e (the lowercase Greek letter nu) represents a neutrino, a quantum-mechanical particle that we will discuss shortly. The symbol e⁺ represents the antielectron, commonly known as the positron. For every particle in the universe, there is a corresponding antimatter particle. A particle of antimatter has the same mass as its corresponding particle of ordinary matter, but the antimatter particle has the opposite electric charge from the ordinary matter particle. Other parameters are opposite as well. We have discussed that the proton is positively charged, but there is another particle with the same mass as the proton called the antiproton, which is negatively charged instead of positively charged. We have discussed that the electron is negatively charged, but there is another particle with the same mass as the electron called the antielectron, which is positively charged instead of negatively charged. This is why the antielectron is commonly known as the positron. Notice that the symbol of the antielectron (the positron) is e⁺, since we may regard this antimatter particle as a positive electron. Indeed, the symbol of the ordinary electron is e⁻, since the ordinary electron is negatively charged. We emphasize that antimatter is not science fiction; antimatter is proven science fact. 
Physicists have synthesized antimatter particles for many decades. Antiprotons and antineutrons compose antinuclei, and antielectrons (positrons) can be attracted by these antinuclei to form antiatoms. Antiatoms can even chemically bond with each other to form antimolecules. Antimatter is extraordinarily rare in our universe, but this is fortunate actually. When a matter particle and its corresponding antimatter particle meet, they completely annihilate each other, becoming pure energy. This is the complete conversion of matter into energy. The overwhelming majority of particles in the universe are ordinary matter particles; antimatter particles are extraordinarily rare. All the stars, planets, moons, asteroids, and comets in the entire universe are composed of matter, not antimatter. In particular, the Sun is composed of ordinary matter. Thus, when the antielectron (the positron) is generated in this first step of the nuclear fusion in the solar core, the antielectron (positron) immediately annihilates with an ordinary electron, generating energy. The next step of the nuclear fusion reactions occurring in the solar core is the fusion of a proton and a deuteron into a helium-3 nucleus. This reaction is more properly written ¹H + ²H → ³He. The third and final step of the nuclear fusion reactions occurring in the solar core is the fusion of two helium-3 nuclei into a helium-4 nucleus (an alpha particle) plus two protons. This reaction is more properly written ³He + ³He → ⁴He + ¹H + ¹H. The two protons produced by this final step can then fuse, thus initiating the first step of this nuclear reaction chain. Hence, the overall reaction of all of these nuclear fusion reactions is called the proton-proton cycle, since the fusion of two protons begins the reaction chain and two protons are produced by the end of the reaction chain which can begin the entire reaction chain over again. However, this may lead us to suspect that this nuclear reaction chain continues indefinitely, but this is false. 
If we construct the overall reaction, we discover that four protons fuse into a helium-4 nucleus (an alpha particle) plus energy plus two neutrinos. This overall reaction is more properly written 4 ¹H → ⁴He + energy + 2ν_e. Hence, hydrogen is being converted into helium in the Sun’s core. Therefore, the solar core is slowly but progressively becoming less and less hydrogen and more and more helium. Again, only the solar core is hot enough for these nuclear fusion reactions to occur. Nuclear reactions do not occur throughout most of the Sun. Hence, most of the Sun remains three-quarters hydrogen and one-quarter helium. There will never come a time when the entire Sun is pure helium. However, the solar core will become nearly entirely helium in roughly five billion years. This will begin the death of the Sun, which we will discuss shortly. Hence, this proton-proton cycle will not continue indefinitely, since the solar core will eventually exhaust its supply of hydrogen, thus ending this nuclear reaction chain. Note that the first step of this reaction chain is governed by the weak nuclear force, which is a slow force. This contributes to the Sun’s long lifetime. Instead of consuming all of the hydrogen in its core in a short amount of time, the proton-proton cycle is slowed by the first step in the nuclear reaction chain, stretching out the conversion of hydrogen into helium in the solar core over a timescale of billions of years. The energy generated in the proton-proton cycle is in the form of high-energy photons in the gamma-ray part of the Electromagnetic Spectrum.
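
The bookkeeping behind the overall reaction can be checked with a short Python sketch. It assumes the three steps described above (two protons into a deuteron plus a positron plus a neutrino; a proton and a deuteron into helium-3; two helium-3 nuclei into helium-4 plus two protons) and assumes that the first two steps must each occur twice to supply the two helium-3 nuclei needed by the third step. The particle labels and the function names are illustrative choices, not standard notation.

```python
from collections import Counter

# Each step is (reactants, products), with particle counts.
# Illustrative labels: "p" = proton, "d" = deuteron, "He3", "He4",
# "e+" = positron, "nu" = electron neutrino.
STEP1 = (Counter({"p": 2}), Counter({"d": 1, "e+": 1, "nu": 1}))
STEP2 = (Counter({"p": 1, "d": 1}), Counter({"He3": 1}))
STEP3 = (Counter({"He3": 2}), Counter({"He4": 1, "p": 2}))

# Electric charge (in units of the proton charge) and number of
# nucleons (protons plus neutrons) carried by each particle.
CHARGE = {"p": 1, "d": 1, "He3": 2, "He4": 2, "e+": 1, "nu": 0}
NUCLEONS = {"p": 1, "d": 2, "He3": 3, "He4": 4, "e+": 0, "nu": 0}


def net_reaction(steps_with_multiplicity):
    """Add up all steps, then cancel particles that appear on both sides."""
    reactants, products = Counter(), Counter()
    for (lhs, rhs), times in steps_with_multiplicity:
        for name, count in lhs.items():
            reactants[name] += count * times
        for name, count in rhs.items():
            products[name] += count * times
    common = reactants & products  # intermediate particles made and then used
    return reactants - common, products - common


def total(side, table):
    """Sum a conserved quantity (charge or nucleon number) over one side."""
    return sum(table[name] * count for name, count in side.items())


if __name__ == "__main__":
    lhs, rhs = net_reaction([(STEP1, 2), (STEP2, 2), (STEP3, 1)])
    print("net reactants:", dict(lhs))
    print("net products: ", dict(rhs))
    print("charge conserved:  ", total(lhs, CHARGE) == total(rhs, CHARGE))
    print("nucleons conserved:", total(lhs, NUCLEONS) == total(rhs, NUCLEONS))
```

The net result is four protons on the left and one helium-4 nucleus, two positrons, and two neutrinos on the right. The two positrons annihilate with ordinary electrons, which is part of the energy in the overall reaction.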

Although hydrogen and helium are gases at ordinary temperatures, the interior of the Sun is so hot that the hydrogen and helium atoms are ionized. The composition of the Sun is actually positively-charged nuclei, negatively-charged electrons, and high-energy photons all colliding with one another. This hot state of matter is called a plasma. Therefore, the high-energy photons created by the proton-proton cycle in the solar core cannot easily escape the Sun. They continuously collide with positive nuclei and negative electrons. Therefore, the trajectory (the path) of these photons is randomized. Of course, these photons do propagate in a straight line at the speed of light between collisions, but their overall trajectory (path) is not a straight line; it is a random trajectory (path) resulting from continuous collisions with nuclei and electrons. This type of trajectory is called a random walk, since it is rather like the path a pedestrian would take while aimlessly walking the streets of a city. Note therefore that light cannot travel easily through the Sun. In other words, the Sun is not transparent; the Sun is opaque. The layer of the Sun around the core where the photons execute this random walk is called the radiation zone. It takes somewhere between one hundred thousand years and one million years for a typical photon to escape out of the radiation zone. It would only take photons roughly two seconds to travel from the core of the Sun to the surface of the Sun if they could move in straight lines at the speed of light without suffering any collisions. However, photons take somewhere between one hundred thousand years and one million years to travel out of the radiation zone, due to their random walks resulting from their continuous collisions with nuclei and electrons. 
The next layer of the Sun around the radiation zone is the convection zone, where energy is transported much faster through rising masses of more hot plasma and sinking masses of less hot plasma. These are convection cells similar to the convection cells in the Earth’s asthenosphere that we discussed earlier in the course, although the convection cells in the Sun’s convection zone are much, much hotter. The outermost layer of the Sun around the convection zone is the photosphere, the actual surface of the Sun that we can see. At the photosphere, energy leaves the Sun in the form of electromagnetic waves (photons) from across the entire Electromagnetic Spectrum. More precisely, electromagnetic waves (photons) radiate from the photosphere with a continuous blackbody spectrum, primarily in visible light (peaking in yellow visible light) in accordance with the temperature of the photosphere (roughly six thousand kelvins) as determined by the Wien displacement law. The photons that leave the photosphere travel out into the surrounding outer space at the speed of light. Some of these photons spend roughly eight minutes traveling to the Earth. Each time we feel the warmth of sunlight upon us, we should reflect upon the journey that sunlight traveled before finally arriving upon us. First, the energy was created in the solar core through nuclear fusion reactions (the proton-proton cycle). Then, the energy spent between one hundred thousand years and one million years trying to escape from the Sun’s radiation zone. Then, the energy was transported faster by convection through the Sun’s convection zone. Then, the energy escaped the photosphere (the surface of the Sun), traveling through outer space toward the Earth for roughly eight minutes before finally bathing us with its warmth.
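
The travel times quoted above can be estimated with a rough Python sketch. The straight-line times use only the solar radius and the Earth-Sun distance from these notes together with a rounded speed of light. The random-walk part is a deliberately crude model and rests on assumptions that are not in the notes: every step has the same length (a single mean free path), the walk covers one full solar radius rather than only the radiation zone, and the number of steps needed to drift a distance R is taken to be (R / step length)². All function names are hypothetical.

```python
C_KM_PER_S = 3.0e5         # speed of light, rounded
R_SUN_KM = 7.0e5           # one solar radius, as quoted in these notes
AU_KM = 1.5e8              # Earth-Sun distance, as quoted in these notes
SECONDS_PER_YEAR = 3.156e7


def straight_line_time_s(distance_km):
    """Time for light (or a nearly light-speed neutrino) with no collisions."""
    return distance_km / C_KM_PER_S


def random_walk_time_s(distance_km, mean_free_path_km):
    """Escape time for a random walk with equal-length steps (crude model)."""
    steps = (distance_km / mean_free_path_km) ** 2
    return steps * mean_free_path_km / C_KM_PER_S


def implied_mean_free_path_km(distance_km, escape_time_s):
    """Step length that would reproduce a given escape time in this model."""
    return distance_km ** 2 / (C_KM_PER_S * escape_time_s)


if __name__ == "__main__":
    print("core to surface, straight line (s):", straight_line_time_s(R_SUN_KM))
    print("Sun to Earth (min):", straight_line_time_s(AU_KM) / 60)
    for years in (1e5, 1e6):
        step_km = implied_mean_free_path_km(R_SUN_KM, years * SECONDS_PER_YEAR)
        print(f"{years:.0e} years implies steps of about {step_km * 1e6:.2f} mm")
```

Under these assumptions, escape times of one hundred thousand to one million years correspond to steps of only a fraction of a millimeter between collisions, which illustrates how opaque the solar interior is. The numbers should be read as order-of-magnitude estimates only.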

Our understanding of the interior of the Sun comes from theoretical calculations together with computer simulations. The results of this theoretical work can be tested through the observation of vibrations on the photosphere. The study of these vibrations is called helioseismology, since we may regard these vibrations as sunquakes. It is remarkable that our understanding of the interior of the Sun is tested through measuring sunquakes, just as our understanding of the interior of the Earth is tested through measuring earthquakes and our understanding of the interior of the Earth’s Moon is tested through measuring moonquakes, as we discussed earlier in the course. Our understanding of the interior of the Sun is also confirmed through the actual appearance of the photosphere. The surface of the Sun does not look smooth; the surface of the Sun looks grainy or sandy. This grainy or sandy appearance of the photosphere is called granulation. The photosphere is composed of both more bright granules and less bright granules. These granules on the photosphere reveal the convection in the convection zone beneath the photosphere. Rising masses of more hot plasma manifest themselves as more bright granules on the photosphere, while sinking masses of less hot plasma manifest themselves as less bright granules on the photosphere.

The Sun creates a powerful magnetic field. As we discussed earlier in the course, the Earth’s magnetic field is generated by its rotation together with circulating currents of molten metal in its outer core, and the magnetic field of a jovian, gas-giant (outer) planet is generated by its rotation together with circulating currents of electrically conducting hydrogen (metallic hydrogen) in deeper layers of the planet. Similarly, the Sun’s magnetic field is generated by its rotation together with convection cells of circulating hot plasma in its convection zone. However, the dynamics of the Sun’s magnetic field is complicated by the Sun’s differential rotation. We use the term rigid body rotation when every part of an object rotates together at the same rate, while we use the term differential rotation when different parts of an object rotate at different rates. Fluids suffer from differential rotation. For example, the jovian, gas-giant (outer) planets suffer from differential rotation, since their outer layers are composed primarily of hydrogen gas and helium gas. Solids suffer from rigid body rotation. For example, the terrestrial (inner) planets suffer from rigid body rotation, since they are composed primarily of metal and rock. Caution: this is actually an oversimplification. As we discussed earlier in the course, different parts of the Earth actually rotate at different rates. Nevertheless, as compared with the jovian, gas-giant (outer) planets, we may regard the Earth and all the terrestrial (inner) planets as suffering from rigid body rotation. The Sun is not a solid object. The Sun is a hot plasma, which is a type of fluid. Therefore, the Sun suffers from differential rotation. On average, the Sun rotates roughly once per month, but in actuality different parts of the Sun rotate at different rates. This differential rotation drags and stretches the Sun’s magnetic field lines. 
As magnetic field lines are stretched, they increase in tension, just as strings or elastic bands increase in tension when stretched. Eventually, magnetic field lines may break from too much tension, again just as strings or elastic bands may break from too much tension. When the Sun’s magnetic field lines break, they reconnect with complex patterns. After a magnetic break, a magnetic reconnection often causes magnetic field lines to anchor themselves at two places on the photosphere (the surface of the Sun). The magnetic field lines point out of the photosphere at one anchor, bend above the photosphere, and point back into the photosphere at the other anchor. Wherever they anchor themselves on the photosphere will be regions of very strong magnetic fields that block convection in the convection zone beneath these anchors, causing these regions of the photosphere to be less hot than the rest of the photosphere. These less hot regions with strong magnetic fields are called sunspots, since they appear black as compared with the rest of the surface of the Sun (the photosphere). The temperatures of these sunspots are still in the thousands of kelvins however; sunspots are simply not as hot as the rest of the photosphere at roughly six thousand kelvins. If the temperatures of sunspots are still in the thousands of kelvins, then these sunspots are hot enough to radiate visible light. Indeed, these sunspots are actually quite luminous; sunspots only appear black because we are comparing them with the rest of the surface of the Sun. Since broken and then reconnected magnetic field lines often anchor themselves at two places on the photosphere, sunspots often occur in pairs. One sunspot will have an outwardly directed magnetic field, while the other sunspot will have an inwardly directed magnetic field. Plasma eruptions on the photosphere often follow the Sun’s magnetic field lines. As such, a plasma eruption often forms an arch anchored at a pair of sunspots. 
This arched plasma eruption is called a solar prominence. If a tremendous amount of tension in the Sun’s magnetic field lines is finally liberated through a magnetic break followed by a magnetic reconnection, a violent plasma eruption will burst outward from the photosphere; this plasma eruption is called a solar flare. These solar flares travel outward from the Sun, and hence some of these solar flares travel toward the direction of the Earth. Fortunately, the Earth’s magnetic field shields us from most solar activities such as solar flares. However, our artificial satellites in orbit around the Earth are not well protected from solar activity. Our artificial satellites are continuously bombarded, damaged, and even on occasion completely destroyed by solar activities such as solar flares.

Astronomers have directly observed for roughly four hundred years (since the invention of the telescope) that the number of sunspots goes through a roughly eleven-year cycle. In one complete cycle, the number of sunspots increases then decreases over a time period of roughly eleven years. Furthermore, measurements of the radioactive isotope carbon-fourteen within trees have revealed that this roughly eleven-year solar cycle itself goes through a roughly two-hundred-year cycle. This is the de Vries cycle, named for the Dutch physicist Hessel de Vries, one of the pioneers of radiocarbon dating. According to the de Vries cycle, the Sun gradually increases in activity to what is called a solar maximum then gradually decreases in activity to what is called a solar minimum. Caution: the eleven-year solar cycles continue to occur throughout each two-century de Vries cycle. Since one complete de Vries cycle lasts for roughly two centuries, each solar maximum and each solar minimum lasts for roughly one hundred years. Over the past twelve thousand years (since the beginning of the current interglacial period of the Current Ice Age), there have been roughly sixty complete de Vries cycles, with each de Vries cycle having one solar maximum and one solar minimum. The Modern Maximum occurred throughout most of the twentieth century, and the Modern Minimum began toward the beginning of the twenty-first century (the current century). The roughly eleven-year sunspot cycle and the roughly two-century de Vries sunspot cycle both strongly determine variations in global temperatures on planet Earth, as we discussed earlier in the course. In particular, the Modern Maximum that occurred throughout most of the twentieth century contributed to the warming temperatures of that century, and the Modern Minimum that began toward the beginning of the twenty-first century (the current century) has already caused cooling temperatures that will continue for the rest of the current century.

The Sun’s atmosphere is composed primarily of hydrogen and helium. As we leave the photosphere (the surface of the Sun) and climb the solar atmosphere and ultimately travel into the surrounding outer space, we expect the temperature to become cooler and cooler, but this is not the case. As we leave the photosphere, the temperature actually becomes hotter. The lower layer of the solar atmosphere is the chromosphere. The temperature approaches roughly one hundred thousand kelvins as we climb the chromosphere. Because of these hot temperatures, the chromosphere radiates primarily ultraviolet light, in accordance with the Wien displacement law. The upper layer of the solar atmosphere is the corona, the main part of the Sun’s atmosphere. The solar corona is even hotter, roughly one million kelvins in temperature. Because of these even hotter temperatures, the solar corona radiates primarily X-rays, again in accordance with the Wien displacement law. It is only when we climb beyond the solar corona and travel into the surrounding outer space that the temperature finally cools. We do not understand why the solar atmosphere is so hot. Perhaps the Sun’s atmosphere is heated by prominences, flares, and other solar activities from the photosphere. Although this sounds reasonable, this theory is nevertheless not well developed. Since the solar atmosphere is so hot, its composition is primarily not hydrogen gas and helium gas but primarily ionized hydrogen (protons and electrons) and ionized helium (alpha particles and electrons). Moreover, the hot temperatures of the solar atmosphere cause many of these particles to move sufficiently fast that they can escape from the Sun’s gravitational attraction. The result is the solar wind, a stream of charged particles radiating outward from the Sun composed primarily of protons (hydrogen nuclei), electrons, and alpha particles (helium nuclei). 
This solar wind is capable of completely ionizing the Earth’s atmosphere in a fairly short amount of time. Fortunately, the Earth’s magnetic field is sufficiently strong to deflect most of the Sun’s solar wind. Some of the charged particles in the solar wind do however become trapped within the Earth’s magnetic field. These charged particles execute helical trajectories around the Earth’s magnetic field lines. These regions of the Earth’s magnetic field are called the Van Allen belts, named for the American physicist James Van Allen who discovered them. The charged particles within the Van Allen belts may create an aurora, either aurora borealis (or more commonly the northern lights) near the Earth’s north magnetic pole or aurora australis (or more commonly the southern lights) near the Earth’s south magnetic pole, as we discussed earlier in the course. If the Sun happens to be less active, its solar wind would be weaker, the resulting aurorae would appear less spectacular, and we would only be able to enjoy them near the Earth’s magnetic poles. If the Sun happens to be more active, its solar wind would be stronger, the resulting aurorae would appear more spectacular, and we would be able to enjoy them further from the Earth’s magnetic poles.
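
The statements above that the chromosphere radiates primarily ultraviolet light and the corona radiates primarily X-rays can be checked against the Wien displacement law with a short Python sketch. The Wien constant (roughly 2.898 × 10⁻³ meter-kelvins) and the wavelength boundaries between the bands are not given in these notes; the boundaries used here (X-rays below 10 nanometers, ultraviolet from 10 to 400 nanometers, visible from 400 to 700 nanometers) are conventional approximations, and the function names are hypothetical.

```python
WIEN_CONSTANT_M_K = 2.898e-3  # Wien displacement constant, in meter-kelvins


def peak_wavelength_nm(temperature_k):
    """Wavelength (in nanometers) at which a blackbody spectrum peaks."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    return WIEN_CONSTANT_M_K / temperature_k * 1e9


def band(wavelength_nm):
    """Classify a wavelength using approximate, conventional boundaries."""
    if wavelength_nm < 10:
        return "X-ray"
    if wavelength_nm < 400:
        return "ultraviolet"
    if wavelength_nm < 700:
        return "visible"
    return "infrared or longer"


if __name__ == "__main__":
    layers = {
        "photosphere": 6.0e3,   # kelvins, as quoted in these notes
        "chromosphere": 1.0e5,
        "corona": 1.0e6,
    }
    for name, temperature in layers.items():
        peak = peak_wavelength_nm(temperature)
        print(f"{name}: peak near {peak:.1f} nm ({band(peak)})")
```

Because the peak wavelength is inversely proportional to the temperature, each factor of ten in temperature shortens the peak wavelength by a factor of ten, which moves the emission from visible light to ultraviolet light to X-rays.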

Neutrinos are extremely weakly interacting quantum-mechanical particles. Neutrinos do not participate in the strong nuclear force for example. Neutrinos also refuse to participate in the electromagnetic force, since they are electrically neutral. This is why they are called neutrinos! Of course, everything in the universe feels gravity, but the mass of a neutrino is such a tiny number that physicists have not yet succeeded in even measuring its value. Since the mass of a neutrino is so extraordinarily tiny, neutrinos do not noticeably feel gravity. Therefore, for all practical purposes neutrinos do not participate in the gravitational force. Whereas the photons that are created in the solar core spend between one hundred thousand years and one million years trying to escape from within the Sun as we discussed, neutrinos are so weakly interacting that they immediately escape from within the Sun after being created in the solar core. Since neutrinos propagate almost at the speed of light, the neutrinos created by the proton-proton cycle in the Sun’s core travel in straight lines from the solar core to the photosphere in roughly two seconds. The neutrinos continue to travel outward from the Sun, through its atmosphere and then into the surrounding outer space. Some of these neutrinos spend roughly eight minutes traveling to the Earth. Neutrinos are so weakly interacting that when these neutrinos arrive at the Earth, they simply pass through the Earth. Billions and billions of neutrinos from the Sun pass through our bodies every second of every day! Neutrinos are so weakly interacting that they do virtually nothing with the atoms that compose our bodies. This is not just the case during daytime when we are on the side of the Earth facing toward the Sun. This is also true during nighttime when we are on the side of the Earth facing away from the Sun. 
In this case, these solar neutrinos arrive at the Earth, pass straight through the Earth, and pass straight through our bodies on the nighttime side of the Earth. Every second of every day of our lives, billions and billions of solar neutrinos continuously pass through our bodies!

If we could detect these solar neutrinos, this would provide nearly real-time information about the solar core. The light we collect from the Sun may have taken roughly eight minutes to travel from the photosphere to the Earth, but those photons were actually created in the solar core at least one hundred thousand years ago and even up to one million years ago. If we only rely upon the light from the Sun to understand the interior of the Sun, then our knowledge about the solar core is actually up to one million years out of date. Of course, one million years is actually rather recent as compared to the Sun’s age of roughly five billion years. Nevertheless, it would be exciting to have information about the solar core that is only eight minutes old. Unfortunately, neutrinos are so weakly interacting that detecting them is virtually impossible. Although neutrinos do not participate in the gravitational force (practically speaking) or the strong nuclear force or the electromagnetic force, neutrinos do on occasion participate in the weak nuclear force. As we discussed, the first step of the proton-proton cycle is governed by the weak nuclear force, and note above that that nuclear reaction involves a neutrino. Several decades ago, physicists built neutrino detectors using the principles of neutrinos participating in the weak nuclear force. Nevertheless, neutrinos are so weakly interacting that even though billions and billions of solar neutrinos pass through these detectors every second of every day, a neutrino detector only detects one neutrino per day! Working at a neutrino detector is the most boring job in the world. On one day, we see a single blip on a screen. The following day, we see another single blip. The day after that, we see one single blip yet again. Boring! This is also frustrating, since we know that billions and billions of neutrinos are actually passing through the detector every second of every day, but we only detect one neutrino per day! 
Over several decades, physicists have only detected one-third of the number of neutrinos that theoretical calculations predict that we should be detecting from the Sun. This is called the solar neutrino problem. There have been many theories proposed over the decades to resolve the solar neutrino problem. One such idea is the theory of neutrino oscillations. There are three different flavors (or varieties or types) of neutrinos. According to the theory of neutrino oscillations, there is a certain probability that a neutrino can spontaneously change its flavor from one type to another type. Only one type of neutrino is created by the proton-proton cycle in the solar core. According to the theory of neutrino oscillations, some of these neutrinos spontaneously change their flavor during their roughly eight-minute journey from the Sun to the Earth. Perhaps we have only been detecting one-third of the number of neutrinos we should be detecting because our neutrino detectors can only detect one flavor of neutrino instead of all three flavors of neutrinos. This theory of neutrino oscillations was attacked and ridiculed by some physicists for decades until it was proven to be the correct theory to resolve the solar neutrino problem. Several years ago, physicists finally built neutrino detectors that could detect all three flavors of neutrinos. Not only have we detected all three flavors of neutrinos from the Sun, but totaling all three detected flavors has finally yielded results consistent with theoretical calculations. Hence, the resolution of the solar neutrino problem is indeed the theory of neutrino oscillations.

Stellar Properties

Other stars besides the Sun are at least two hundred thousand times further from the Earth as compared with the Sun. Therefore, we know much less about other stars as compared with our Sun. Nevertheless, we will attempt to determine the properties of other stars by applying the same procedures we applied to our Sun. Firstly, from the absorption spectral lines within a star’s light, we can determine the composition of the star. We discover that all stars are composed of all the atoms on the Periodic Table of Elements, but not in equal amounts. Only two atoms account for close to one hundred percent of the mass of all stars; all the other atoms on the Periodic Table of Elements account for only a tiny fraction (tiny percentage) of the mass of stars. All stars are composed of roughly seventy-five percent (three-quarters) hydrogen and roughly twenty-five percent (one-quarter) helium. Again, all the other atoms on the Periodic Table of Elements make up a tiny fraction (tiny percentage) of the mass of stars.

To determine the distance to stars, we measure their parallax. As we discussed earlier in the course, parallax is the apparent motion of an object, not because it is moving but because the observer is in fact moving. The motion of the Earth around the Sun causes the stars to appear to shift their positions in the sky by tiny amounts. By measuring the angle of this shift, we can determine the distance to the star. As we discussed earlier in the course, the orbit of the Earth around the Sun is an ellipse with a semi-major axis equal to one astronomical unit (1 au), roughly equal to one hundred and fifty million kilometers. We also discussed earlier in the course that the eccentricity of the Earth’s orbit around the Sun is so close to zero that its orbit is nearly a circle, and so we may regard one astronomical unit as the radius of the Earth’s roughly circular orbit around the Sun. More plainly, we may regard one astronomical unit as the distance between the Earth and the Sun. Astronomers define the parallax angle as the apparent angular shift of a star over a baseline of the Earth’s orbital radius. Further distances result in smaller parallax angles. Even the nearest stars (besides the Sun) are so distant that their parallax shifts are much smaller than even a one-degree angle. A one-degree angle is already small, since one degree is one full circle divided into three hundred and sixty equal parts. The parallax shifts of even the nearest stars (besides the Sun) are much smaller than even one degree! One sixtieth of a degree is written 1′ and is called one arcminute or one minute of arc. Notice that minutes of arc are indicated with a single prime. Caution: the single prime is also used for feet of length in the United States. One sixtieth of one arcminute is written 1″ and is called one arcsecond or one second of arc. Notice that seconds of arc are indicated with a double prime. Caution: the double prime is also used for inches of length in the United States. 
Since sixty multiplied by sixty is 3600, this means that one arcsecond is one degree divided into 3600 equal angles. A one-degree angle is already small, but now imagine dividing that small angle into 3600 equal angles! The nearest stars (besides the Sun) suffer parallax shifts even smaller than one arcsecond! Since stars must be incredibly distant to suffer such tiny parallax shifts, astronomers have defined a new unit of distance to measure distances to stars. The distance at which a star would appear to suffer a parallax of 1″ (one arcsecond or one second of arc) is called a parsec, abbreviated pc. The word parsec is derived from the three words parallax, arc, and second. It is not difficult to calculate that one parsec of distance is slightly more than two hundred thousand astronomical units. If we multiply two hundred thousand astronomical units by roughly one hundred and fifty million kilometers for each astronomical unit, we deduce that one parsec is roughly thirty-one trillion kilometers! This is an incredible distance, and the nearest stars (besides the Sun) are further than even this! One parsec is also equal to 3.26 light-years, where one light-year is the distance that light travels in a time of one year, as we discussed toward the beginning of the course. If a star suffers a parallax of 1″ (one arcsecond or one second of arc), then it is 1 pc (one parsec) distant, by the definition of the parsec. If a star suffers an even smaller parallax (as all stars besides the Sun do), then the star is at a proportionally further distance. For example, if a star suffers a parallax of one-half of one arcsecond, then it is two parsecs distant. If a star suffers a parallax of one-tenth of one arcsecond, then it is ten parsecs distant. We can also invert this argument and predict the parallax from the distance. For example, if a star is twenty parsecs distant, then it must suffer a parallax of one-twentieth of one arcsecond. 
If a star is fifty parsecs distant, then it must suffer a parallax of one-fiftieth of one arcsecond.
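
The inverse relationship between parallax and distance described above can be written as a short Python sketch. The conversion of one parsec into astronomical units uses the small-angle approximation (one astronomical unit divided by one arcsecond expressed in radians); the kilometer and light-year conversions use the rounded values quoted in these notes. The function names are hypothetical.

```python
import math

# One arcsecond in radians is pi / (180 * 3600) = pi / 648000, so the
# distance at which 1 au subtends 1 arcsecond is 648000 / pi au.
AU_PER_PARSEC = 648000 / math.pi
KM_PER_AU = 1.5e8            # rounded value from these notes
LIGHT_YEARS_PER_PARSEC = 3.26


def distance_pc(parallax_arcsec):
    """Distance in parsecs from a parallax angle in arcseconds."""
    if parallax_arcsec <= 0:
        raise ValueError("parallax must be a positive angle")
    return 1.0 / parallax_arcsec


def parallax_arcsec(distance_in_pc):
    """Parallax angle in arcseconds from a distance in parsecs."""
    if distance_in_pc <= 0:
        raise ValueError("distance must be positive")
    return 1.0 / distance_in_pc


if __name__ == "__main__":
    print("1 pc in au:", round(AU_PER_PARSEC))
    print("1 pc in km:", AU_PER_PARSEC * KM_PER_AU)
    for p in (1.0, 0.5, 0.1):
        d = distance_pc(p)
        print(f"parallax {p} arcsec: {d} pc, about {d * LIGHT_YEARS_PER_PARSEC:.1f} light-years")
    for d in (20, 50):
        print(f"distance {d} pc: parallax {parallax_arcsec(d)} arcsec")
```

The two functions are the same formula read in opposite directions, which is why halving the parallax doubles the distance.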

The Cosmological Distance Ladder is a list of methods to determine distances to astronomical objects. Any given method can only be used over a certain range of distances, and so we must use other methods for further distances. That new method can only be used over its own range of further distances, and so we must use yet another method for even further distances, and so on and so forth. The parallax method of determining distances is the lowest rung of the Cosmological Distance Ladder, since parallax angles are so tiny that we can only measure them for nearby stars within the so-called solar neighborhood. Beyond distances of a couple thousand parsecs, parallax angles become too tiny to measure even with modern telescopes. Therefore, we cannot measure the parallax for most of the stars of our Milky Way Galaxy, and measuring parallaxes beyond our Milky Way Galaxy is hopeless. We will spend the rest of this course adding higher and higher rungs to the Cosmological Distance Ladder until we have a list of methods that will enable us to determine distances from nearby stars in the solar neighborhood all the way to the edge of the observable universe. Nearby stars are within the so-called solar neighborhood, nearby galaxies slightly beyond our Milky Way Galaxy are within the so-called galactic neighborhood, and the edge of the observable universe is called the cosmic horizon. Although we cannot use parallax to determine distances beyond the solar neighborhood, astrophysicists nevertheless continue to use the parsec as the unit of distance even for astronomical objects whose distances are determined using non-parallax methods. One thousand parsecs is called one kiloparsec (abbreviated kpc), since the prefix kilo- always means thousand. For example, there are one thousand meters in one kilometer, and there are one thousand grams in one kilogram. One million parsecs is called one megaparsec (abbreviated Mpc), since the prefix mega- always means million. 
One billion parsecs is called one gigaparsec (abbreviated Gpc), since the prefix giga- always means billion. Theoretically, one trillion parsecs would be called one teraparsec (abbreviated Tpc), since the prefix tera- always means trillion. However, the entire observable universe is only a few gigaparsecs across. As we will discuss toward the end of this course, the universe is expanding, and hence the observable universe is continuously growing in size. In many billions of years, the observable universe will eventually expand to become teraparsecs in size. However, the observable universe is presently only a few gigaparsecs across. Therefore, the teraparsec is not yet a physically meaningful unit of distance. Caution: current cosmological models suggest that the entire universe beyond the observable universe is actually infinite in size. It is the observable universe that is only a few gigaparsecs across, not the entire universe. We will make clear the distinction between the observable universe and the entire universe toward the end of the course.

Until we discuss higher rungs of the Cosmological Distance Ladder, for now our discussion can only focus upon the parallax method to measure distances to stars within a couple thousand parsecs (within the solar neighborhood). Nevertheless, there are still millions of stars within this distance. Therefore, we can discuss the determination of the luminosities of these nearby stars within the solar neighborhood. As we discussed, we can calculate the luminosity of any object from the intensity of its light I and its distance from us r using the equation I = ℒ / 4πr², where ℒ is the luminosity of the object. The intensity of a star’s light is often expressed as an apparent magnitude, while the luminosity of the star is often expressed as an absolute magnitude. This magnitude scale was formulated by the ancient Greek mathematician and astronomer Hipparchus of Nicaea. Hipparchus called the brightest stars we can see in the night sky first-magnitude stars. Bright stars that were not as bright as first-magnitude stars were called second-magnitude stars. Stars of intermediate brightness in the night sky were called third-magnitude stars. Dim stars were called fourth-magnitude stars, and the dimmest stars visible to the human eye were called fifth-magnitude stars. This magnitude scale is rather illogical, since dimmer stars are assigned higher magnitude numbers, while brighter stars are assigned lower magnitude numbers. Nevertheless, modern astrophysicists not only continue to use this magnitude scale, but modern astrophysicists have even quantified this magnitude scale. Firstly, there are decimal magnitudes. For example, a 4.3-magnitude star is brighter than a 4.7-magnitude star. As another example, a 2.5-magnitude star is dimmer than a 2.1-magnitude star. Secondly, the invention of the telescope enables us to observe stars much dimmer than even the dimmest stars that the naked eye is able to see. 
A sixth-magnitude star is even dimmer than a fifth-magnitude star, and a seventh-magnitude star is dimmer still. The Hubble Space Telescope has imaged stars all the way down to roughly thirtieth-magnitude! Thirdly, the magnitude scale is also quantified in the other direction. A zeroth-magnitude star is brighter than a first-magnitude star. A star with magnitude negative-one is even brighter than a zeroth-magnitude star, and a star with magnitude negative-two is brighter still. Our Sun has a magnitude of roughly negative-twenty-seven! More precisely, the modern quantified magnitude scale is a logarithmic scale. In particular, every unit on the magnitude scale corresponds to a factor of roughly 2.5 in brightness. For example, a sixth-magnitude star is roughly 2.5 times brighter than a seventh-magnitude star. A fifth-magnitude star is roughly 2.5 times brighter than a sixth-magnitude star, which makes a fifth-magnitude star roughly 6.25 times brighter than a seventh-magnitude star (since 2.5 times 2.5 is 6.25). A fourth-magnitude star is roughly 2.5 times brighter than a fifth-magnitude star, which makes a fourth-magnitude star roughly 6.25 times brighter than a sixth-magnitude star, which makes a fourth-magnitude star roughly 15.625 times brighter than a seventh-magnitude star (since 2.5 times 2.5 times 2.5 is 15.625). In brief, one magnitude of separation is roughly a factor of 2.5 in brightness, two magnitudes of separation is roughly a factor of 6.25 in brightness, and three magnitudes of separation is roughly a factor of 15.625 in brightness. Four magnitudes of separation is nearly a factor of 40 in brightness, and five magnitudes of separation is nearly a factor of 100 in brightness! This reveals that lower magnitude stars are much brighter than higher magnitude stars, since we must multiply by a string of factors to calculate their relative brightnesses. 
Stated the other way around, higher magnitude stars are much dimmer than lower magnitude stars, since we must divide by a string of factors to calculate their relative brightnesses. The apparent magnitude of a star is how bright the star appears, depending upon its distance. The absolute magnitude of a star expresses its luminosity or its intrinsic brightness. More precisely, astronomers define the absolute magnitude of a star as the apparent magnitude the star would have if it were ten parsecs distant. It is easy to prove that this precise definition of absolute magnitude relates directly to luminosity or intrinsic brightness. Therefore, we will casually regard all three of these variables (luminosity, absolute magnitude, and intrinsic brightness) as essentially the same quantity. If a star has a relatively constant luminosity or intrinsic brightness as most stars do, then its absolute magnitude is a fixed number. However, the star will appear dimmer from further away, and the star will appear brighter when closer. This is precisely the same as the appearance of a lightbulb. Most lightbulbs have a fixed luminosity (power output), but a lightbulb will still appear dimmer from further away, and the lightbulb will still appear brighter when closer. Because of the illogical magnitude scale, the apparent magnitude of a star will be a higher number (since the star appears dimmer) when further from the star, and the apparent magnitude of a star will be a lower number (since the star appears brighter) when closer to the star. Again, Hipparchus of Nicaea assigned lower magnitude numbers to brighter stars, and Hipparchus of Nicaea assigned higher magnitude numbers to dimmer stars.
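
The magnitude arithmetic above can be sketched in a few lines of Python. This sketch assumes the modern convention that five magnitudes of separation correspond to exactly a factor of 100 in brightness, so that one magnitude is the fifth root of 100 (about 2.512, which is the "roughly 2.5" used above). It also assumes the standard relation between apparent magnitude m, absolute magnitude M, and distance d in parsecs, M = m - 5 log10(d / 10), which follows from the ten-parsec definition of absolute magnitude together with the inverse-square law. The function names are hypothetical and chosen only for this example.

```python
import math

# Five magnitudes are defined as exactly a factor of 100 in brightness,
# so a single magnitude is 100 ** (1/5), which is about 2.512.
FACTOR_PER_MAGNITUDE = 100 ** 0.2


def brightness_ratio(dimmer_mag, brighter_mag):
    """How many times brighter the lower-magnitude star appears."""
    return FACTOR_PER_MAGNITUDE ** (dimmer_mag - brighter_mag)


def absolute_magnitude(apparent_mag, distance_pc):
    """Apparent magnitude the star would have if it were ten parsecs distant."""
    return apparent_mag - 5.0 * math.log10(distance_pc / 10.0)


if __name__ == "__main__":
    # Third magnitude versus seventh magnitude: four magnitudes of separation.
    print(brightness_ratio(7.0, 3.0))
    # A star at exactly ten parsecs has equal apparent and absolute magnitudes.
    print(absolute_magnitude(4.0, 10.0))
```

Notice that a star further than ten parsecs has an absolute magnitude that is a lower number than its apparent magnitude, since the star would appear brighter if we brought it in to ten parsecs.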

As we discussed, astrophysicists use two methods to determine the surface temperature of our Sun. Perhaps we can apply these same two methods to determine the surface temperatures of other stars. One of these methods uses the Wien displacement law (essentially using the color of the star), and the other method uses the Stefan-Boltzmann law ℒ = σ(4πR²)T⁴, where T is the surface temperature of the star, ℒ is the luminosity (or absolute magnitude or intrinsic brightness) of the star, and σ is the Stefan-Boltzmann constant. Warning: we use lowercase r for the distance from the star, and we use uppercase (capital) R for the actual radius (physical size) of the star. Let us first consider the Stefan-Boltzmann law. To use this equation to calculate the surface temperature of the star, we need the luminosity and the actual radius (physical size) of the star. Although we just discussed the determination of the luminosities of nearby stars in the solar neighborhood, our telescopes are not powerful enough to magnify even these nearby stars enough to actually see their physical radii (their physical sizes). Stars appear to be twinkling points of light to the naked eye, and most stars still appear to be twinkling points of light through even our most powerful telescopes. If we cannot measure the actual physical radii of stars (their physical sizes), then we cannot use the Stefan-Boltzmann law to calculate their surface temperatures. We are now forced to consider the Wien displacement law. Unfortunately, even nearby stars in the solar neighborhood are very dim, and so we receive insufficient light from them to graph their continuous blackbody spectra to find the primary wavelength of their light, which we require to calculate the surface temperature using the Wien displacement law. All seems lost, but roughly a century ago astronomers formulated an ingenious method to construct the continuous blackbody spectrum of a star in a coarse but effective way. 
We place a red filter on our telescope that permits only red light to enter the telescope. Thus, we measure the brightness of a star in red light only. This is called the star’s red magnitude with the symbol m_R. After removing the red filter, we then place a blue filter on the telescope that permits only blue light to enter the telescope. Thus, we measure the brightness of the same star in blue light only. This is called the star’s blue magnitude with the symbol m_B. After removing the blue filter, we then place a yellow-green filter on the telescope that permits only yellow-green light to enter the telescope. Thus, we measure the brightness of the same star in yellow-green light only. This is called the star’s visual magnitude with the symbol m_V. (Astronomers use the word visual since yellow-green corresponds with the primary wavelength of light emitted by our own Sun.) After measuring the brightness of the star at these different wavelengths, we then subtract these color magnitudes. The difference between two color magnitudes of the same star is called a color index. The three possible color indices we may calculate using these three filters are m_B–m_V (blue minus visual), m_V–m_R (visual minus red), and m_B–m_R (blue minus red). These color indices yield estimates for the surface temperature of the star. For example, if the star radiates more blue light than any other wavelength, its surface temperature must be hotter than the surface temperature of our own Sun. If the star radiates more red light than any other wavelength, its surface temperature must be cooler than the surface temperature of our own Sun. If the star radiates more yellow-green (visual) light than any other wavelength, its surface temperature must be roughly the same as the surface temperature of our own Sun. Because of the illogical magnitude scale, both m_B–m_V and m_V–m_R will be negative numbers for hot, blue stars. 
Also because of this illogical magnitude scale, both m_B–m_V and m_V–m_R will be positive numbers for cool, red stars. Moreover because of this illogical magnitude scale, m_B–m_V will be a positive number and m_V–m_R will be a negative number for intermediate-temperature, yellow-green stars like our Sun. In summary, we can estimate the surface temperature of a star by measuring its color magnitudes (brightnesses at different wavelengths) and calculating color indices (differences of color magnitudes). By using many more filters and carefully measuring the brightness of the star at many different wavelengths (colors), we can calculate many color indices (perform many subtractions) to coarsely but effectively pinpoint the peak wavelength of a star’s continuous blackbody spectrum, enabling us to fairly accurately calculate its surface temperature using the Wien displacement law. Now that we have calculated the surface temperature of the star, we can then use the Stefan-Boltzmann law to calculate the actual radius (physical size) of the star, since the actual radius (physical size) of the star is the only unknown remaining in that equation. This is remarkable. Even though our most powerful telescopes cannot magnify most stars to actually see their physical radii (their physical sizes), astronomers have nevertheless succeeded in calculating the physical radii (physical sizes) of stars using this procedure. As the decades have passed, astronomers have constructed larger and larger and hence more and more powerful telescopes. For stars that are close enough and large enough, astronomers have eventually been able to magnify them sufficiently to actually see their physical radii (their physical sizes) through these very powerful telescopes. 
The actual radius (physical size) of stars that astronomers have directly measured through these very powerful telescopes is consistent with calculations from decades earlier using the distances, the luminosities, and the surface temperatures of stars.
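
The two-step procedure just described (temperature from the Wien displacement law, then radius from the Stefan-Boltzmann law) can be sketched as follows. This is only an illustration under stated assumptions: the numerical constants are standard rounded values that do not appear in these notes, the peak wavelength is assumed to have already been pinpointed from the color indices, and the function names are hypothetical.

```python
import math

WIEN_B = 2.898e-3   # Wien displacement constant, in meter-kelvins
SIGMA = 5.670e-8    # Stefan-Boltzmann constant, in W m^-2 K^-4
L_SUN = 3.828e26    # luminosity of the Sun, in watts
R_SUN = 6.957e8     # radius of the Sun, in meters


def temperature_from_peak(peak_wavelength_m):
    """Wien displacement law: surface temperature from the peak wavelength."""
    return WIEN_B / peak_wavelength_m


def radius_from_luminosity(luminosity_w, temperature_k):
    """Solve the Stefan-Boltzmann law L = sigma (4 pi R^2) T^4 for R."""
    return math.sqrt(luminosity_w / (4.0 * math.pi * SIGMA * temperature_k ** 4))


if __name__ == "__main__":
    # A peak wavelength of 500 nanometers gives a temperature near 5800 kelvins.
    print(temperature_from_peak(500e-9))
    # Using the Sun's own luminosity and temperature recovers about one solar radius.
    print(radius_from_luminosity(L_SUN, 5772.0) / R_SUN)
```

The second function is simply the Stefan-Boltzmann law rearranged, with the uppercase R (the physical radius of the star) as the only unknown.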

As we discussed earlier in the course, the only reliable method to calculate the mass of any object in the universe is to use Kepler’s third law. Fortunately, most stars are members of binary star systems: two stars orbiting each other, as we will discuss shortly. Therefore, we may use the orbital parameters of the two stars (the orbital period and the semi-major axes of the orbits) to calculate the masses of the stars. In summary, astrophysicists have determined the composition of stars, the distance to stars, the luminosity or the absolute magnitude or the intrinsic brightness of stars, the surface temperature of stars, the physical radius (physical size) of stars, and the mass of stars.
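
As a minimal sketch of this mass calculation, assume we use Newton's form of Kepler's third law in solar units: the sum of the two masses in solar masses equals a³ / P², where a is the sum of the two semi-major axes in astronomical units and P is the orbital period in years. Assume also that each star orbits the common center of mass, so that the more massive star has the smaller orbit (M₁a₁ = M₂a₂). The function name binary_masses is hypothetical, and the sketch presumes that the semi-major axes have already been converted to astronomical units, which in practice requires knowing the distance to the binary star system.

```python
def binary_masses(period_years, a1_au, a2_au):
    """Return (m1, m2) in solar masses for a binary star system.

    a1_au and a2_au are the semi-major axes of each star's orbit around
    the common center of mass, in astronomical units.
    """
    a_total = a1_au + a2_au
    # Newton's form of Kepler's third law in solar units: M1 + M2 = a^3 / P^2.
    total_mass = a_total ** 3 / period_years ** 2
    # Center-of-mass condition M1 * a1 = M2 * a2: the star with the
    # smaller orbit is the more massive star.
    m1 = total_mass * a2_au / a_total
    m2 = total_mass * a1_au / a_total
    return m1, m2


if __name__ == "__main__":
    # Illustrative numbers only: an 8-year period with orbits of 1 AU and 3 AU.
    print(binary_masses(8.0, 1.0, 3.0))
```

With these illustrative numbers, the star on the smaller orbit carries three quarters of the total mass.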

At first, astronomers classified stars based on the strength of their hydrogen lines in their absorption spectra, since stars are composed mostly of hydrogen. Stars with the strongest hydrogen absorption lines were called A-type stars. Stars with strong hydrogen absorption lines but not as strong as A-type stars were called B-type stars. Stars with strong hydrogen absorption lines but not as strong as A-type stars or B-type stars were called C-type stars, and so on and so forth. In brief, stars with strong hydrogen absorption lines have a spectral type near the beginning of the English alphabet, while stars with weak hydrogen absorption lines have a spectral type near the end of the English alphabet. When astronomers determined the surface temperatures of stars using color magnitudes and color indices, they realized that stars should be classified based on their temperatures, not based on the strength of their hydrogen absorption lines. Therefore, astronomers reordered the stellar spectral types based on surface temperature. Astronomers discovered that the hottest, bluest stars are O-type stars. Stars that are hot and blue, but not as hot and not as blue as O-type stars, were the B-type stars. Next come A-type stars, which are white-hot stars, but not as hot as O-type or B-type stars. After A-type stars come F-type stars which are also white-hot stars, but not as white-hot as A-type stars. Next come G-type stars which are yellow-hot stars, like our own Sun. In fact, our Sun is considered a G-type star. Even cooler than G-type stars are K-type stars, which are orange in color. Finally, the coolest, reddest stars are M-type stars. In summary, the spectral types of stars in the correct order starting with the hottest stars are O, B, A, F, G, K, and finally M for the coolest stars. For several decades, all astronomers memorized this temperature sequence using the mnemonic, “Oh be a fine guy/gal, kiss me!” Astronomers have also quantified this spectral sequence. 
In particular, each of these spectral types is subdivided into ten subclasses running from zero through nine. The hottest, bluest stars have a spectral type O0 followed by O1, O2, O3, O4, O5, O6, O7, O8, and O9. After O9 would come B0, B1, B2, B3, B4, B5, B6, B7, B8, and B9. After B9 would come A0 through A9, then F0 through F9, G0 through G9, K0 through K9, and finally M0 through M9, the spectral type of the coolest, reddest stars. As a simple exercise, a K4-star is hotter than a K7-star. As another simple exercise, a B6-star is cooler than a B3-star. Using this quantified temperature sequence, our Sun is more precisely classified as a G2-star. We will discuss shortly that stars also have a luminosity type in addition to the spectral type. The luminosity type of a star is labeled with a Roman numeral, such as I, II, III, IV, and V. We will discuss the meaning of each of these luminosity types shortly. Our Sun’s luminosity type is Roman numeral V, as we will discuss. Therefore, our Sun’s full spectral-luminosity type is G2V. Again, G2 is our Sun’s spectral type, which indicates that our Sun is a yellow star. The Roman numeral V is our Sun’s luminosity type, as we will discuss shortly.
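
The quantified temperature sequence above amounts to an ordering of seventy spectral types from O0 (hottest) through M9 (coolest). The following Python sketch ranks a spectral type within that ordering and compares two stars; the function names temperature_rank and is_hotter are hypothetical, and the sketch reads only the letter and the single digit that follows it.

```python
SPECTRAL_ORDER = "OBAFGKM"  # hottest to coolest


def temperature_rank(spectral_type):
    """Position in the sequence O0, O1, ..., M9; a lower rank means hotter."""
    letter = spectral_type[0].upper()
    if letter not in SPECTRAL_ORDER:
        raise ValueError(f"unknown spectral letter: {letter}")
    subclass = int(spectral_type[1])  # the digit zero through nine
    return SPECTRAL_ORDER.index(letter) * 10 + subclass


def is_hotter(type_a, type_b):
    """True if a star of type_a is hotter than a star of type_b."""
    return temperature_rank(type_a) < temperature_rank(type_b)


if __name__ == "__main__":
    print(is_hotter("K4", "K7"))  # the first simple exercise above
    print(is_hotter("B6", "B3"))  # the second simple exercise above
```

Because only the first two characters are read, a full spectral-luminosity type such as G2V is ranked the same as G2.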

The Hertzsprung-Russell Diagram

The Hertzsprung-Russell diagram (or the H-R diagram for short) is the single most important diagram in all of astrophysics. This diagram is named for Danish astronomer Ejnar Hertzsprung and American astronomer Henry Norris Russell, the two astronomers who first constructed this diagram. The vertical axis of the Hertzsprung-Russell diagram is luminosity or absolute magnitude or intrinsic brightness. More luminous (intrinsically brighter) stars are toward the top of the Hertzsprung-Russell diagram, while less luminous (intrinsically dimmer) stars are toward the bottom of the Hertzsprung-Russell diagram. The horizontal axis of the Hertzsprung-Russell diagram is temperature or spectral type or color. Hotter, bluer stars are toward the left on the Hertzsprung-Russell diagram, while cooler, redder stars are toward the right on the Hertzsprung-Russell diagram. Since the horizontal axis of the Hertzsprung-Russell diagram is temperature or spectral type or color, the horizontal axis can be labeled with the spectral types O, B, A, F, G, K, and M. Again, notice that the hotter, bluer stars are toward the left, while the cooler, redder stars are toward the right. We emphasize that the vertical axis of the Hertzsprung-Russell diagram is the absolute magnitude, not the apparent magnitude. Therefore, we must measure the distance to a star to calculate its absolute magnitude (or luminosity or intrinsic brightness) before we can plot the star on the Hertzsprung-Russell diagram. Thus far in this course, we have only discussed the measurement of distances to nearby stars within the solar neighborhood, within a couple thousand parsecs. Until we discuss higher rungs of the Cosmological Distance Ladder, we can only discuss the construction of the Hertzsprung-Russell diagram for nearby stars, within the solar neighborhood. Fortunately, there are millions of stars within the solar neighborhood. 
Assuming that there is nothing particularly unusual with the stars in the solar neighborhood as compared with all other stars throughout the universe, we should be able to determine the fundamental properties of all the stars in the entire universe by constructing the Hertzsprung-Russell diagram for the stars within the solar neighborhood.

The first thing we notice when we construct the Hertzsprung-Russell diagram for the solar neighborhood is that the vast majority of the stars on the diagram are along a band from the upper left corner of the diagram to the lower right corner of the diagram. The astronomers Hertzsprung and Russell called this band the main part of the diagram. Hence, this band on the Hertzsprung-Russell diagram was eventually named the main sequence. We will clearly define what we mean by a main sequence star shortly. For now, hotter main sequence stars are more luminous (intrinsically brighter), while cooler main sequence stars are less luminous (intrinsically dimmer). Therefore, we may naïvely consider main sequence stars to be normal stars, since we simplistically expect hotter stars to be more luminous and cooler stars to be less luminous. Also notice that the vast majority of stars on the Hertzsprung-Russell diagram are main sequence stars, again persuading us to naïvely consider these main sequence stars to be normal stars. The entire main sequence is assigned the luminosity type Roman numeral V. Our Sun is a main sequence star, as are the vast majority of all stars. Hence, our Sun’s luminosity type is Roman numeral V. Thus, our Sun’s spectral-luminosity type is G2V, where G2 is the spectral type (meaning that our Sun is yellow hot) and Roman numeral V is the luminosity type (meaning that our Sun is a main sequence star).

Although the vast majority of stars on the Hertzsprung-Russell diagram are along the main sequence, there is a collection of stars on the upper right corner of the diagram and another collection of stars on the lower left corner of the diagram. The collection of stars in the upper right corner of the Hertzsprung-Russell diagram are intrinsically bright (since they are toward the top of the diagram) and cool (since they are toward the right on the diagram). How is it possible for a cool star to be intrinsically bright? Some students argue that these stars are only apparently bright, since they are closer to us, but this argument is incorrect. Again, the vertical axis of the Hertzsprung-Russell diagram is the absolute magnitude, not the apparent magnitude. Stars that are toward the top of the Hertzsprung-Russell diagram are not apparently bright because they happen to be close to us; stars that are toward the top of the Hertzsprung-Russell diagram are intrinsically bright. Thus, the stars on the upper right corner of the Hertzsprung-Russell diagram are truly intrinsically bright even though they are cool. How can this be the case? The Stefan-Boltzmann law ℒ = σ(4πR²)T⁴ reveals the answer. The luminosity is determined by two variables: radius (size) and temperature. The temperature is the more important variable, since it is raised to the fourth power in the Stefan-Boltzmann law. The radius (size) is the less important variable, since it is raised to only the second power in the Stefan-Boltzmann law. However, imagine a star with a radius (a size) so enormous that squaring its radius overpowers its cool temperature to the fourth power, resulting in a large luminosity. Thus, the stars on the upper right corner of the Hertzsprung-Russell diagram have high luminosities (intrinsically bright) because they are giant (since they are enormous) even though they are red (since they are cool). This is precisely why these stars are called red giants. 
This collection of stars on the upper right corner of the Hertzsprung-Russell diagram is more properly subdivided into red supergiants (the largest stars since they are the most luminous), the red bright giants, the red ordinary giants, and the red subgiants (the smallest red giants since they are the least luminous). The red supergiants have luminosity type Roman numeral I, the red bright giants have luminosity type Roman numeral II, the red ordinary giants have luminosity type Roman numeral III, and the red subgiants have luminosity type Roman numeral IV. Red supergiants are the largest stars in the entire universe; they have a radius comparable to the radius of the Earth’s orbit around the Sun! If we could replace our Sun with a red supergiant star, it would engulf the entire inner Solar System! We will often casually refer to the entire collection of stars on the upper right corner of the Hertzsprung-Russell diagram as simply red giants.

The stars in the lower left corner of the Hertzsprung-Russell diagram are intrinsically dim (since they are toward the bottom of the diagram) and hot (since they are toward the left on the diagram). How is it possible for a hot star to be intrinsically dim? Some students argue that these stars are only apparently dim, since they are further from us, but this argument is incorrect. Again, the vertical axis of the Hertzsprung-Russell diagram is the absolute magnitude, not the apparent magnitude. Stars that are toward the bottom of the Hertzsprung-Russell diagram are not apparently dim because they happen to be far from us; stars that are toward the bottom of the Hertzsprung-Russell diagram are intrinsically dim. Thus, the stars on the lower left corner of the Hertzsprung-Russell diagram are truly intrinsically dim even though they are hot. How can this be the case? The Stefan-Boltzmann law ℒ = σ(4πR²)T⁴ again reveals the answer. The luminosity is determined by two variables: radius (size) and temperature. The temperature is the more important variable, since it is raised to the fourth power in the Stefan-Boltzmann law. The radius (size) is the less important variable, since it is raised to only the second power in the Stefan-Boltzmann law. However, imagine a star with a radius (a size) so small that squaring its radius overpowers its hot temperature to the fourth power, resulting in a small luminosity. Thus, the stars on the lower left corner of the Hertzsprung-Russell diagram have low luminosities (intrinsically dim) because they are dwarfs (since they are small) even though they are white hot. This is precisely why these stars are called white dwarfs. Besides neutron stars and black holes, both of which we will discuss shortly, white dwarfs are the smallest stars in the entire universe; they have a radius roughly equal to the radius of planet Earth! 
In summary, the vast majority of stars are main sequence stars, where the main sequence runs from the upper left corner of the Hertzsprung-Russell diagram to the lower right corner of the Hertzsprung-Russell diagram. Some stars are red giants, which are toward the upper right corner of the Hertzsprung-Russell diagram, and some stars are white dwarfs, which are toward the lower left corner of the Hertzsprung-Russell diagram. Red giants are intrinsically bright because they are so large even though they are cool, hence their name red giants. White dwarfs are intrinsically dim because they are so small even though they are hot, hence their name white dwarfs.
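
To see the competition between radius and temperature numerically, we can divide the Stefan-Boltzmann law for a star by the same law for our Sun, which cancels the constants and leaves ℒ/ℒ_☉ = (R/R_☉)² (T/T_☉)⁴. The sketch below evaluates this ratio. The solar surface temperature of 5772 kelvins and the example radii and temperatures are illustrative assumptions that do not come from these notes, and the function name luminosity_solar is hypothetical.

```python
T_SUN = 5772.0  # assumed surface temperature of the Sun, in kelvins


def luminosity_solar(radius_solar, temperature_k):
    """Luminosity in solar luminosities from L = sigma (4 pi R^2) T^4.

    radius_solar is the radius in solar radii; the constants cancel
    when we divide by the same equation written for the Sun.
    """
    return radius_solar ** 2 * (temperature_k / T_SUN) ** 4


if __name__ == "__main__":
    # Illustrative red giant: cool (3500 K) but one hundred solar radii.
    print(luminosity_solar(100.0, 3500.0))
    # Illustrative white dwarf: hot (20000 K) but one hundredth of a solar radius.
    print(luminosity_solar(0.01, 20000.0))
```

With these illustrative numbers, the cool giant comes out more than a thousand times as luminous as our Sun, while the hot dwarf comes out at roughly one percent of our Sun's luminosity.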

The main sequence is both a temperature sequence and a luminosity sequence. In particular, given any two stars on the main sequence, the hotter star will be more luminous (intrinsically brighter), while the cooler star will be less luminous (intrinsically dimmer). Warning: this is only true on the main sequence. Is it possible for a hotter star to be less luminous? Yes, white dwarfs are hot but are intrinsically dim. Is it possible for a cooler star to be more luminous? Yes, red giants are cool but are intrinsically bright. However, given two stars on the main sequence, the hotter star is indeed more luminous, and the cooler star is indeed less luminous. For example, suppose the spectral types of two stars are A9 and F2. Although the A9 star is certainly hotter since it has an earlier spectral type and the F2 star is certainly cooler since it has a later spectral type (recall OBAFGKM), we cannot draw any conclusion about the luminosities of these two stars. If however in addition to the spectral types of the two stars we are also told that both stars are on the main sequence, only then may we draw the conclusion that the A9V star (Roman numeral V for main sequence) is more luminous, while the F2V star (Roman numeral V for main sequence) is less luminous.

In addition to being a temperature sequence and a luminosity sequence, the main sequence is also a radius (size) sequence. In particular, given any two stars on the main sequence, the hotter, more luminous star will be larger, while the cooler, less luminous star will be smaller. Warning: this is only true on the main sequence. Is it possible for a hotter star to be smaller? Yes, white dwarfs are hot but are small. Is it possible for a cooler star to be larger? Yes, red giants are cool but are large. However, given two stars on the main sequence, the hotter, more luminous star is indeed larger, and the cooler, less luminous star is indeed smaller. For example, suppose the spectral types of two stars are B2 and M8. Although the B2 star is certainly hotter since it has an earlier spectral type and the M8 star is certainly cooler since it has a later spectral type (recall OBAFGKM), we cannot draw any conclusion about the luminosities or the sizes of these two stars. If however in addition to the spectral types of the two stars we are also told that both stars are on the main sequence, only then may we draw the conclusion that the B2V star (Roman numeral V for main sequence) is more luminous and larger, while the M8V star (Roman numeral V for main sequence) is less luminous and smaller.

In addition to being a temperature sequence, a luminosity sequence, and a radius (size) sequence, the main sequence is also a mass sequence. In particular, given any two stars on the main sequence, the hotter, more luminous, larger star will be more massive, while the cooler, less luminous, smaller star will be less massive. Warning: this is only true on the main sequence. For example, suppose the spectral types of two stars are G7 and K5. Although the G7 star is certainly hotter since it has an earlier spectral type and the K5 star is certainly cooler since it has a later spectral type (recall OBAFGKM), we cannot draw any conclusion about the luminosities, the radii (sizes), or the masses of these two stars. If however in addition to the spectral types of the two stars we are also told that both stars are on the main sequence, only then may we draw the conclusion that the G7V star (Roman numeral V for main sequence) is more luminous, larger, and more massive, while the K5V star (Roman numeral V for main sequence) is less luminous, smaller, and less massive.
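
The logic of the last three paragraphs can be summarized in a short Python sketch: the spectral type alone always tells us which star is hotter, but only when both stars carry luminosity type V may we also conclude which star is more luminous, larger, and more massive. The function names parse_type and compare_stars are hypothetical, and the sketch assumes a type is written as one letter, one digit, and an optional Roman numeral.

```python
SPECTRAL_ORDER = "OBAFGKM"  # hottest to coolest


def parse_type(full_type):
    """Split a type such as 'G2V' into (temperature rank, luminosity type)."""
    letter = full_type[0].upper()
    if letter not in SPECTRAL_ORDER:
        raise ValueError(f"unknown spectral letter: {letter}")
    rank = SPECTRAL_ORDER.index(letter) * 10 + int(full_type[1])
    return rank, full_type[2:].upper()  # luminosity type may be empty


def compare_stars(type_a, type_b):
    """Return (hotter star, star that is more luminous, larger, more massive).

    The second entry is None unless both stars are on the main sequence
    (luminosity type V), since only then may we draw that conclusion.
    """
    rank_a, lum_a = parse_type(type_a)
    rank_b, lum_b = parse_type(type_b)
    if rank_a == rank_b:
        return None, None  # same spectral type: no ordering to report
    hotter = type_a if rank_a < rank_b else type_b
    both_main_sequence = lum_a == "V" and lum_b == "V"
    return hotter, (hotter if both_main_sequence else None)


if __name__ == "__main__":
    print(compare_stars("A9", "F2"))    # hotter star only
    print(compare_stars("G7V", "K5V"))  # main sequence: all conclusions hold
```

The first call returns None for the second entry because we were not told that both stars are on the main sequence.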

In nearly every way imaginable, our Sun is an ordinary star. Firstly, our Sun is a main sequence star, just as the vast majority of stars are main sequence stars. Recall that the spectral-luminosity type of our Sun is G2V, and notice that its spectral type G2 places it roughly in the middle of the main sequence. Our Sun is not toward the beginning of the main sequence such as an O-type or a B-type main sequence star, nor is our Sun toward the end of the main sequence such as a K-type or an M-type main sequence star. Therefore, our Sun is not particularly hot, nor is our Sun particularly cool; our Sun is intermediate in temperature. Our Sun is not particularly intrinsically bright, nor is our Sun particularly intrinsically dim; our Sun is intermediate in luminosity. Our Sun is not particularly large, nor is our Sun particularly small; our Sun is intermediate in size. Our Sun is not particularly high mass, nor is our Sun particularly low mass; our Sun is intermediate in mass. Recall that our Sun has been fusing hydrogen into helium in its core for roughly five billion years, and our Sun will continue to fuse hydrogen into helium in its core for another roughly five billion years. Therefore, our Sun is not particularly young, nor is our Sun particularly old; our Sun is intermediate in age. In nearly every way imaginable, our Sun is an ordinary star.

The main sequence is a temperature sequence, a luminosity sequence, a radius (size) sequence, a mass sequence, and two more types of sequences that we will discuss shortly. We are compelled to ask the following question: is there any type of sequence that the main sequence is not? When the astronomers Hertzsprung and Russell first constructed the Hertzsprung-Russell diagram, they believed that the main sequence was an evolutionary sequence. In other words, they believed that supposedly stars are born hot, bright O-type stars, and supposedly stars cool as they shine, becoming B-type followed by A-type then F-type then G-type then K-type until finally they supposedly die cool, dim M-type stars. Today, we realize that this is completely incorrect. Stars do not evolve along the main sequence. Unfortunately, the astronomers Hertzsprung and Russell believed so strongly that the main sequence was an evolutionary sequence that they called the main sequence stars toward the upper left corner of the Hertzsprung-Russell diagram early-type stars, and they called the main sequence stars toward the lower right corner of the Hertzsprung-Russell diagram late-type stars. Most unfortunately, this incorrect nomenclature persists among astronomers and astrophysicists to the present day. For example, an astronomer or astrophysicist may refer to a K3V star as being earlier than a K5V star. As another example, an astronomer or astrophysicist may refer to an O9V star as being later than an O3V star. Since this incorrect nomenclature persists to the present day, we will also use this incorrect nomenclature in this course. To summarize, the main sequence is a temperature sequence, a luminosity sequence, a radius (size) sequence, a mass sequence, and two more types of sequences that we will discuss shortly. 
By these sequences, we mean that given any two stars on the main sequence, the star earlier in the sequence OBAFGKM will be hotter, more luminous, larger, and more massive, while the star later in the sequence OBAFGKM will be cooler, less luminous, smaller, and less massive. However, the main sequence is not an evolutionary sequence, even though we will refer to main sequence stars toward the left of the OBAFGKM sequence as being early-type and main sequence stars toward the right of the OBAFGKM sequence as being late-type. We emphasize this again: the main sequence is not an evolutionary sequence. If stars do not evolve along the main sequence, then how do stars actually evolve? How are stars actually born? How do stars actually live? How do stars actually die? This is the next major topic of this course, and our entire discussion of stellar evolution will be in the context of the Hertzsprung-Russell diagram.

Stellar Evolution: Birth, Life, and Death

Stars are born from a diffuse nebula, a giant cloud of gas many light-years across composed primarily of hydrogen and helium. The gases within a diffuse nebula are pushed by many different forces, including thermal pressures, gravitational forces, magnetic pressures, and even cosmic rays (ultra high-energy particles). All these different forces are comparable in strength with each other in interstellar space (the space between star systems). Thus, the gases within a diffuse nebula are pushed in seemingly random directions, causing some regions within the diffuse nebula to be more dense than average and other regions within the diffuse nebula to be less dense (or more tenuous) than average. Small regions within a diffuse nebula may become dense enough that gravity dominates over all other forces. Thus, those small regions of the diffuse nebula will collapse from their self-gravity (under their own weight). We can gain insight into how stars are born by considering only gravitational forces and thermal pressures. Note that this simplified argument ignores other forces, such as magnetic pressures and cosmic rays for example. Consider a self-gravitating cloud of gas with thermal pressures resulting from its own temperature. If this cloud of gas is more massive than a certain critical mass, then its self-gravity will dominate over its own thermal pressures, and the cloud will contract. If the cloud of gas is less massive than that critical mass, then its own thermal pressures will dominate over its self-gravity, and the cloud will expand. If the cloud of gas is equal in mass to this critical mass, then its self-gravity will balance its own thermal pressures, and the cloud will remain in equilibrium. This critical mass is called the Jeans limit, named for the British physicist James Jeans who first performed this simplified calculation. 
Even in this simplified analysis, note that the Jeans limit is not a particular amount of mass, since the Jeans limit itself depends upon the temperature as well as the density of the gas. In other words, the Jeans limit is actually a range of masses that depends upon the temperature and the density of the gas. As a result, there is a range of masses that a star can be born with, as we will discuss shortly. We again emphasize that this is a simplified analysis. A cloud of gas more massive than the Jeans limit may still not contract if magnetic pressures for example are sufficiently strong. Astrophysicists can measure the magnetic fields within a diffuse nebula from the polarization of starlight that passes through the nebula, and astrophysicists have discovered magnetic fields within regions of diffuse nebulae that are sufficiently strong to prevent the contraction of gas within those regions of the diffuse nebula. Nevertheless, if a small region of a diffuse nebula is dense enough for gravity to dominate over all other forces, then that small region of the diffuse nebula will contract, collapsing from its self-gravity (under its own weight). At first, the collapse does not significantly change the temperature of the gas, since the gas is so tenuous (low density) that its constituent particles are so far from one another that they almost never collide with one another. However, as the cloud continues to collapse, it becomes more and more dense and hence more and more opaque (less and less transparent). Eventually, the cloud becomes so dense that if it continues to collapse, its constituent particles begin to collide with one another more and more frequently, thus causing the collapsing cloud to become warmer. The collapsing cloud has now become sufficiently dense that it is able to convert gravitational energy into heat, which is Kelvin-Helmholtz (gravitational) contraction as we discussed. 
Although this collapsing cloud is not yet a star, we now call it a protostar beginning with this transition in density and hence increase in opacity (decrease in transparency). As a protostar continues to collapse, it becomes hotter and hotter due to Kelvin-Helmholtz (gravitational) contraction. These hotter temperatures cause greater thermal pressures, which push against the self-gravity of the protostar. Hence, the collapse of the protostar slows. This imbalance between gravitational forces and thermal pressures may cause pulsations within the protostar, causing its size to oscillate from large to small and back again. As a result, the luminosity of the protostar oscillates from bright to dim and back again. These protostars are called T Tauri variable stars, which we will discuss later in the course. For now, if the protostar is sufficiently massive for its self-gravity to continue to dominate over all other forces, then it will continue to collapse, becoming hotter and hotter. Eventually, the protostar has collapsed to such a small size that its core temperature reaches millions of kelvins, and hydrogen begins fusing into helium. These nuclear fusion reactions provide an outward pressure to balance inward self-gravity. When the protostar attains gravitational equilibrium, we say that a star is born.
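The dependence of the Jeans limit upon temperature and density can be illustrated with a short Python sketch. The formula used here is not derived in these notes; it is one common textbook form of the Jeans mass for a uniform sphere of gas, and like the simplified argument above it ignores magnetic pressures and cosmic rays. The function name jeans_mass_kg, the mean molecular weight of 2.33 (a typical assumption for cold molecular gas), and the sample temperature and density are all assumptions chosen only for illustration.

```python
import math

BOLTZMANN = 1.380649e-23          # J/K
GRAVITATIONAL_CONSTANT = 6.674e-11  # m^3 kg^-1 s^-2
HYDROGEN_MASS = 1.6735e-27        # kg, mass of one hydrogen atom
SOLAR_MASS = 1.989e30             # kg, roughly two nonillion kilograms


def jeans_mass_kg(temperature_k: float,
                  number_density_per_m3: float,
                  mean_molecular_weight: float = 2.33) -> float:
    """Estimate the Jeans mass of a uniform, self-gravitating gas cloud.

    Assumed textbook form (gravity versus thermal pressure only):
        M_J = (5 k T / (G mu m_H))**1.5 * (3 / (4 pi rho))**0.5
    """
    if temperature_k <= 0 or number_density_per_m3 <= 0:
        raise ValueError("temperature and density must be positive")
    particle_mass = mean_molecular_weight * HYDROGEN_MASS
    density = number_density_per_m3 * particle_mass  # kg per cubic meter
    thermal_term = (5 * BOLTZMANN * temperature_k
                    / (GRAVITATIONAL_CONSTANT * particle_mass)) ** 1.5
    geometric_term = math.sqrt(3 / (4 * math.pi * density))
    return thermal_term * geometric_term


if __name__ == "__main__":
    # Illustrative values: a cold (10 K) region with 1e10 particles per cubic meter.
    mass = jeans_mass_kg(10.0, 1e10)
    print(f"Jeans mass: {mass / SOLAR_MASS:.1f} solar masses")
```

In this sketch, a warmer cloud has a larger Jeans limit and a denser cloud has a smaller Jeans limit, which is why the Jeans limit is a range of masses and not one particular mass.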

All stars are born main sequence stars. If all stars are born main sequence stars, then where do red giants and white dwarfs come from? These stars come from stellar death, as we will discuss shortly. For now, all stars are born main sequence stars, but where along the main sequence are stars born? With which spectral type, O, B, A, F, G, K, or M, is a star born? As we discussed, the Jeans limit is actually a range of masses. Hence, there is a range of many different masses a star can be born with, and it is the mass that a star is born with that determines the spectral type of the star. In fact, the mass of a star is the single most important physical quantity of a star. The mass of a star determines how it will be born, how it will live, and how it will die. We will discuss stellar life and stellar death shortly. For now, if a star happens to be born with high mass because it had to overcome a large Jeans limit, then it will be born early on the main sequence, perhaps O-type or B-type. If a star happens to be born with low mass because it had to overcome a small Jeans limit, then it will be born late on the main sequence, perhaps K-type or M-type. If a star happens to be born with intermediate mass because it had to overcome an intermediate Jeans limit, then it will be born roughly in the middle of the main sequence, perhaps A-type, F-type, or G-type. In brief, the mass a star is born with determines its spectral type on the main sequence. Our argument is as follows. If a star happens to be born with high mass, it will have strong self-gravity. Therefore, a strong outward pressure is necessary to balance that strong inward self-gravity, and there will be a correspondingly hot temperature associated with that strong pressure. Hence, the star will be born hot and bright. If a star happens to be born with low mass, it will have weak self-gravity. 
Therefore, only a weak outward pressure is necessary to balance that weak inward self-gravity, and there will be a correspondingly cool temperature associated with that weak pressure. Hence, the star will be born cool and dim. This explains why the main sequence is a temperature sequence, a luminosity sequence, and a mass sequence. High-mass stars must be born hot and bright to provide the strong outward pressure necessary to balance the strong inward self-gravity created by their high mass, while low-mass stars must be born cool and dim, since only a weak outward pressure is necessary to balance the weak inward self-gravity created by their low mass.

There is an upper limit of mass that a star is permitted to be born with. This limit is called the Eddington limit, named for the British physicist Arthur Eddington who first calculated this upper mass limit. The Eddington limit is roughly equal to 100M_☉ (one hundred solar masses or one hundred times the mass of our Sun). If a protostar happens to have a mass greater than this Eddington limit, then the outward radiation pressure generated by its incredible luminosity will not just balance its inward self-gravity; that enormous outward radiation pressure will overpower its inward self-gravity. The protostar collapses at first as usual, but the enormous outward radiation pressure eventually halts the collapse and actually forces the protostar to expand. Essentially, the protostar blows itself apart before it could ever be born a main sequence star. Indeed, astronomers have never discovered a star with a mass significantly greater than roughly 100M_☉ (one hundred solar masses or one hundred times the mass of our Sun). This Eddington limit defines the beginning of the main sequence. The earliest main sequence star has spectral-luminosity type O0V, and these stars have a mass roughly equal to the Eddington limit of roughly 100M_☉ (one hundred solar masses or one hundred times the mass of our Sun). If a protostar happens to have a mass greater than the Eddington limit, it will blow itself apart before it can even be born a main sequence star. If a protostar happens to have a mass less than the Eddington limit, it will be born a main sequence star, fusing hydrogen into helium in its core.

There is a lower limit of mass that a star is permitted to be born with, roughly equal to 0.08M_☉ (0.08 solar masses or eight percent the mass of our Sun). If a protostar happens to have a mass less than this lower limit, then its self-gravity will be so weak that the outward pressure necessary to balance its weak inward self-gravity is also extraordinarily weak. The corresponding temperature is so cool that nuclear fusion is never ignited in the core. The protostar does eventually stop collapsing and attains gravitational equilibrium with outward pressure balancing inward self-gravity, but the outward pressure is not provided by the nuclear fusion of hydrogen into helium. The outward pressure is provided by electron degeneracy pressure, which we will discuss in detail shortly. In this course, we strictly define a main sequence star as a star that fuses hydrogen into helium in its core. Therefore, main sequence stars are also called hydrogen-burning stars. The use of the word burning is technically incorrect, since the word burning implies chemical reactions instead of nuclear reactions. Nevertheless, astronomers and astrophysicists use this word burning not just for hydrogen fusing into helium but for any nuclear reaction. Again, the strict definition of a main sequence star is a hydrogen-burning star, a star that fuses hydrogen into helium in its core. If a protostar happens to have a mass less than 0.08M_☉ (0.08 solar masses or eight percent the mass of our Sun), then it will not be born a main sequence star. The protostar becomes a very low mass sphere of mostly hydrogen and helium that is not hot enough to fuse hydrogen into helium in its core. These are called brown dwarf stars, although they are not strictly stars. The simple term brown dwarf instead of the term brown dwarf star would be more correct. This lower limit of 0.08M_☉ (0.08 solar masses or eight percent the mass of our Sun) defines the end of the main sequence. 
The latest main sequence star has spectral-luminosity type M9V, and these stars have a mass roughly equal to 0.08M_☉ (0.08 solar masses or eight percent the mass of our Sun). We can actually plot brown dwarfs on the Hertzsprung-Russell diagram. Since brown dwarfs are less massive and cooler and dimmer and smaller than even M9V stars at the end of the main sequence, brown dwarfs would be further to the right (since they are cooler) and further down (since they are dimmer) than the end of the main sequence. These brown dwarfs even have their own spectral type; brown dwarfs are classified as L-type stars, even cooler and dimmer than M-type main sequence stars. Therefore, a more complete listing of spectral types in the correct order from hottest to coolest is OBAFGKML. These L-type stars (brown dwarfs) are sufficiently cool that they radiate more infrared light and less visible light as compared with main sequence (hydrogen-burning) stars. If a protostar happens to have a mass greater than 0.08M_☉ (0.08 solar masses or eight percent the mass of our Sun), then the protostar will be born a main sequence star, fusing hydrogen into helium in its core. If a protostar happens to have a mass less than 0.08M_☉ (0.08 solar masses or eight percent the mass of our Sun), then the protostar will be born a brown dwarf, a very low mass sphere of mostly hydrogen and helium that is not hot enough to fuse hydrogen into helium in its core. These brown dwarf stars should sound familiar. A gas-giant planet is a sphere of mostly hydrogen and helium that is much smaller and much less massive than a star and is not hot enough to fuse hydrogen into helium in its core, as we discussed earlier in the course. We suspect that the term brown dwarf star is simply another name for gas-giant planet. Indeed, there is virtually no difference between a brown dwarf star and a gas-giant planet. The only difference is the circumstances of their formation (their birth). 
If the object formed from a collapsing cloud of gas within a diffuse nebula, then we name it a brown dwarf star. If the object formed within the protoplanetary disk orbiting a true main sequence star, then we name it a gas-giant planet. Other than their formation (how they are born), there is virtually no difference between a brown dwarf star and a gas-giant planet. However, many students then conclude that Jupiter is a failed star. These students argue that if Jupiter had been just a little more massive, then it would have become a true main sequence star, resulting in us living in a binary star system. (We will discuss binary star systems shortly.) This conclusion is false. The minimum mass necessary to become a true main sequence star is roughly 0.08M_☉ (0.08 solar masses or eight percent the mass of our Sun), but Jupiter has a mass of only 0.001M_☉ (one-thousandth of a solar mass), as we discussed earlier in the course. The ratio between 0.08 and 0.001 is eighty. Hence, the minimum mass necessary to become a true main sequence star is roughly eighty jovian masses. In other words, Jupiter only has a very small fraction (one-eightieth) of the mass necessary to become a true main sequence star. Thus, the mass of Jupiter is not close to the minimum mass necessary to become a true main sequence star. Therefore, Jupiter should certainly be regarded as a gas-giant planet, not as a failed star. On the other hand, Jupiter might be incorrectly regarded as a brown dwarf star by intelligent alien lifeforms living billions of years from now, as we will discuss.
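The two mass limits and the Jupiter arithmetic can be summarized in a short Python sketch. The limits of 0.08 and 100 solar masses and the mass of Jupiter are the rough values quoted in these notes; the function name birth_outcome and the wording of its results are hypothetical, and the sharp cutoffs are a simplification of limits that are only approximate.

```python
LOWER_LIMIT_MSUN = 0.08       # rough minimum mass to fuse hydrogen (from the notes)
EDDINGTON_LIMIT_MSUN = 100.0  # rough maximum birth mass (from the notes)
JUPITER_MASS_MSUN = 0.001     # mass of Jupiter in solar masses (from the notes)


def birth_outcome(mass_msun: float) -> str:
    """Classify a collapsing protostar by its mass in solar masses.

    Hypothetical helper; it treats both approximate limits as sharp cutoffs.
    """
    if mass_msun <= 0:
        raise ValueError("mass must be positive")
    if mass_msun < LOWER_LIMIT_MSUN:
        return "brown dwarf"
    if mass_msun > EDDINGTON_LIMIT_MSUN:
        return "blows itself apart"
    return "main sequence star"


# Minimum mass of a main sequence star expressed in jovian masses.
jovian_masses_needed = round(LOWER_LIMIT_MSUN / JUPITER_MASS_MSUN)

if __name__ == "__main__":
    for mass in (0.05, 1.0, 150.0):
        print(mass, "->", birth_outcome(mass))
    print("jovian masses needed:", jovian_masses_needed)
```

The ratio is rounded only to avoid floating-point noise in the division; the arithmetic is the same division of 0.08 by 0.001 performed above.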

As a protostar collapses, it spins faster and faster in accordance with the Law of Conservation of Angular Momentum. The amount by which a protostar collapses is so tremendous that we can easily calculate that incredibly strong centrifugal forces should rip apart all protostars during their collapse, thus preventing any stars from ever forming. Since stars obviously are born, protostars must lose angular momentum as they collapse. Firstly, a protoplanetary disk forms around the protostar from which planets will eventually form, as we discussed earlier in the course. Most of the angular momentum of the collapsing gas resides in the material orbiting around the protostar, not in the protostar itself. In the case of our own Solar System for example, although the mass of the Sun accounts for roughly 99.9 percent of the total mass of the entire Solar System as we discussed earlier in the course, the rotational angular momentum of the Sun accounts for less than four percent of the total angular momentum of the entire Solar System. In other words, the orbiting planets account for more than ninety-six percent of the total angular momentum of the entire Solar System. Secondly, the magnetic field of a protostar strengthens as it collapses. This strengthening magnetic field ejects ionized gases at fast speeds, and these ejected ionized gases carry angular momentum away from the protostar. These ionized gases are often ejected as narrow columns or jets near the angular momentum axis of the forming star system, and these jets illuminate surrounding gases as the fast-moving jets collide with the surrounding gases. These illuminated gases, together with the colliding ionized gases ejected from the young star system, are called Herbig-Haro objects or HH objects for short, named for the American astronomer George Herbig and the Mexican astronomer Guillermo Haro who discovered them. 
Although protostars lose most of their angular momentum through these mechanisms, they nevertheless collapse by such tremendous amounts that they do rotate faster as they collapse. A protostar eventually rotates so fast that the centrifugal force becomes sufficiently strong that the protostar usually rips itself apart into two protostars. These two protostars remain orbiting each other, and both protostars are eventually born as two main sequence stars orbiting each other. This process is called fragmentation, and it results in a binary star system: two stars orbiting each other with possibly planets orbiting both of the stars. Most star systems are binary star systems, since fragmentation usually occurs from the strong centrifugal forces from the fast rotation due to the tremendous collapse in size of the protostar. Fragmentation can be even more severe, resulting in three protostars that are eventually born as a trinary star system: three stars orbiting each other with possibly planets orbiting all three stars. Usually, the two more massive stars orbit each other on a tighter orbit while the least massive third star orbits those two stars along a larger orbit. Planets would then orbit all three stars on even larger orbits. The closest star system to our Solar System, the α (alpha) Centauri star system, happens to be a trinary star system. This star system is slightly more than one parsec distant from our Solar System, and the three stars are named α (alpha) Centauri A, α (alpha) Centauri B, and α (alpha) Centauri C. The star α (alpha) Centauri A happens to be a G2V star, just like our own Sun. The star α (alpha) Centauri C happens to be closer to our Solar System than the other two stars. Hence, α (alpha) Centauri C is the closest star to us, besides the Sun of course. For this reason, the star α (alpha) Centauri C is also called Proxima Centauri, since the word proximity means near or close. 
Other nearby star systems include the Barnard star system nearly two parsecs distant, the Luhman 16 binary star system roughly two parsecs distant, the Wolf 359 star system roughly 2.4 parsecs distant, and the Sirius binary star system nearly three parsecs distant. We will discuss the Sirius binary star system in more detail shortly. Fragmentation may result in a quadruplet star system: four stars orbiting each other with possibly planets orbiting all four stars. Often, two of the stars orbit each other on one tight orbit, the other two stars orbit each other on another tight orbit, and both pairs of stars orbit each other on a larger orbit. Essentially, a quadruplet star system is often a double binary star system. Planets would then orbit all four stars on even larger orbits. Fragmentation may result in quintuplet star systems (five stars orbiting each other with possibly planets orbiting all five stars), sextuplet star systems (six stars orbiting each other with possibly planets orbiting all six stars), and so on and so forth. All such star systems are rare. Again, most star systems are binary star systems, with a fair number of star systems as single-star star systems, such as our own Solar System. We discussed that our Sun is an ordinary star in several respects. However, there is one thing unusual about our Sun: it did not suffer from fragmentation while it was being born as a protostar, since it is the only star in our Solar System. This is unusual, since protostars usually suffer from fragmentation, as we discussed. Since fragmentation usually occurs, a high mass protostar will usually become fragmented into lower mass protostars. Hence, high mass main sequence stars are less abundant (more rare), while low mass main sequence stars are more abundant (more common). 
We conclude that the main sequence is also a population-abundance sequence, meaning earlier main sequence stars (hotter, more luminous, larger, more massive main sequence stars) are less abundant (more rare), while later main sequence stars (cooler, less luminous, smaller, less massive main sequence stars) are more abundant (more common). Therefore, most stars are born M-type main sequence stars. Many stars are also born K-type main sequence stars, but not as commonly as M-type main sequence stars. A fair number of stars are born G-type main sequence stars. Few stars are born F-type main sequence stars, and even fewer stars are born A-type main sequence stars. A small fraction of all stars are born B-type main sequence stars, and only a tiny fraction of stars are born O-type main sequence stars. To summarize, the main sequence is a temperature sequence, a luminosity sequence, a radius (size) sequence, a mass sequence, a population-abundance sequence, and one more type of sequence that we will discuss shortly. By these sequences, we mean that given any two stars on the main sequence, the star earlier in the sequence OBAFGKM will be hotter, more luminous, larger, more massive, and less abundant (more rare), while the star later in the sequence OBAFGKM will be cooler, less luminous, smaller, less massive, and more abundant (more common). Caution: the main sequence is not an evolutionary sequence!
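The spin-up of a collapsing protostar that drives fragmentation can be estimated with a short Python sketch of the Law of Conservation of Angular Momentum. We assume, only for illustration, that the protostar keeps all of its mass and all of its angular momentum and that its moment of inertia scales as its mass multiplied by the square of its radius. Real protostars lose most of their angular momentum, as we discussed, so this sketch gives an upper bound. The function names are hypothetical.

```python
def spin_up_factor(initial_radius: float, final_radius: float) -> float:
    """Factor by which the rotation rate increases during collapse.

    Assumes conserved mass and angular momentum with moment of inertia
    proportional to M * R**2, so the rotation rate scales as 1 / R**2.
    """
    if initial_radius <= 0 or final_radius <= 0:
        raise ValueError("radii must be positive")
    return (initial_radius / final_radius) ** 2


def final_period(initial_period: float,
                 initial_radius: float,
                 final_radius: float) -> float:
    """Rotation period after collapse, in the same units as initial_period."""
    return initial_period / spin_up_factor(initial_radius, final_radius)


if __name__ == "__main__":
    # A cloud that shrinks to one-tenth of its radius spins one hundred times faster.
    print(spin_up_factor(10.0, 1.0))
    print(final_period(100.0, 10.0, 1.0))
```

Since a protostar collapses by a far larger factor than ten, the square of that factor shows why the centrifugal forces would be overwhelming if no angular momentum were lost.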

After a main sequence star is born, it spends its life fusing hydrogen into helium in its core. The duration of time that a main sequence star spends fusing hydrogen into helium in its core depends upon its mass. Again, we see that the mass of a star is the single most important physical quantity of a star. The mass of a star determines how it will be born, how it will live, and how it will die. We have already discussed stellar birth, and we will discuss stellar death shortly. For now, the mass of a star determines the duration of its main sequence (hydrogen-burning) lifetime. Many students argue that high mass stars should live longer lives, since they have more mass and therefore more hydrogen to use as fuel for the nuclear fusion reactions in the core. These students also argue that low mass stars should live shorter lives, since they have less mass and therefore less hydrogen to use as fuel for the nuclear fusion reactions in the core. Although this argument seems reasonable, it is completely wrong. In fact, the opposite is true: high mass main sequence stars have shorter lifetimes, while low mass main sequence stars have longer lifetimes. Firstly, simple calculations using the basic properties of stars reveal that this must be the case. Early-type main sequence stars may have more mass and therefore more hydrogen to use as fuel for the nuclear fusion reactions in the core, but early-type main sequence stars are also much more luminous. This luminosity is ultimately coming from the nuclear reactions in the core, and so we conclude that the nuclear reactions are proceeding at a faster rate. Late-type main sequence stars may have less mass and therefore less hydrogen to use as fuel for the nuclear fusion reactions in the core, but late-type main sequence stars are also much less luminous. Again, this luminosity is ultimately coming from the nuclear reactions in the core, and so we conclude that the nuclear reactions are proceeding at a slower rate. 
These conclusions we have drawn from simple calculations are consistent with more complex calculations. Nuclear reactions should proceed at faster rates at hotter temperatures, and nuclear reactions should proceed at slower rates at cooler temperatures. (The same is also true for chemical reactions.) High mass main sequence stars are so hot that the nuclear fusion reactions proceed so quickly that these high mass stars burn through all their hydrogen in an extremely brief amount of time, even though they have much more hydrogen to burn than low mass stars. Low mass main sequence stars are so cool that the nuclear fusion reactions proceed so slowly that it takes these low mass stars an extremely long amount of time to burn through their hydrogen, even though they have much less hydrogen to burn than high mass stars. Many students argue that the lifetime of a high mass star may be relatively shorter but must in fact be actually longer since it has much more hydrogen to burn. These students also argue that the lifetime of a low mass star may be relatively longer but must in fact be actually shorter since it has much less hydrogen to burn. This argument is again false. High mass main sequence stars are so hot with such tremendous luminosity that the nuclear fusion reactions proceed so quickly that their actual lifetime is truly shorter, even though these high mass stars have much more hydrogen to burn. Low mass main sequence stars are so cool with such little luminosity that the nuclear fusion reactions proceed so slowly that their actual lifetime is truly longer, even though these low mass stars have much less hydrogen to burn. The following analogy is helpful. Most students believe that a rich person who earns millions of dollars per year will be able to survive much longer than a poor slob who only earns a few thousand dollars per year, but in fact the opposite is true. 
A rich person who earns millions of dollars per year almost always spends their money at a furious pace. The rich spend their money so fast that they burn through their money in a short amount of time, even though they have more money to burn. A poor slob who earns only a few thousand dollars per year hardly has any money to spend. Hence, the poor spend their money at such a slow pace that they can survive their entire lifetimes on their miserable salaries. Indeed, this is actually the case; the highest bankruptcy rates are among the rich, not among the poor. Similarly, high mass main sequence stars burn through their hydrogen more quickly even though they have more hydrogen to burn, while low mass main sequence stars burn through their hydrogen more slowly even though they have less hydrogen to burn. O-type main sequence stars are so hot and so luminous that their nuclear fusion reactions proceed so quickly that they burn through their hydrogen in an incredibly short amount of time even though they have much more hydrogen to burn; the main-sequence lifetime of an O-type star is roughly one million years, incredibly short by astronomical terms. B-type main sequence stars are also hot enough and luminous enough that their nuclear fusion reactions proceed so quickly that they burn through their hydrogen in a very short amount of time, although they are not as hot and not as luminous as O-type stars. Therefore, B-type stars live somewhat longer than O-type stars; the main-sequence lifetime of a B-type star is roughly ten million years, still short by astronomical terms. A-type stars are also hot and luminous, but not as hot and not as luminous as O-type or B-type stars. Therefore, their nuclear fusion reactions proceed somewhat more slowly, giving A-type stars a somewhat longer main-sequence lifetime of roughly one hundred million years. 
F-type stars are not as hot and not as luminous as A-type stars; therefore, their nuclear fusion reactions proceed somewhat more slowly, giving F-type stars a somewhat longer main-sequence lifetime of roughly one billion years. G-type stars have an even longer main-sequence lifetime of roughly ten billion years. Recall that our Sun is a G2V star. Also recall that our Sun has been fusing hydrogen into helium in its core for roughly five billion years, and also recall that our Sun will continue fusing hydrogen into helium in its core for the next roughly five billion years. Therefore, the entire main-sequence lifetime of our Sun is roughly ten billion years, as it should be for a G-type main sequence star. K-type stars are so cool and so dim that their nuclear fusion reactions proceed so slowly that the main-sequence lifetime of a K-type star is roughly one hundred billion years. This is longer than the current age of the universe, which is only roughly fourteen billion years. Therefore, every K-type star that has ever been born has not died yet. We must wait at least an additional roughly eighty-six billion years before K-type stars begin to die. Finally, M-type stars are so cool and so dim that their nuclear fusion reactions proceed so slowly that the main-sequence lifetime of an M-type star is roughly one trillion years, much much longer than the current age of the universe. Therefore, every M-type star that has ever been born has not died yet. We must wait countless billions of years before any M-type stars begin to die. In brief, the main sequence is a lifetime sequence. Given any two stars on the main sequence, the star earlier in the sequence OBAFGKM will have a shorter main-sequence lifetime, while the star later in the sequence OBAFGKM will have a longer main-sequence lifetime. Brown dwarf stars are so cool that they do not fuse hydrogen into helium in their cores. 
Consequently, brown dwarf stars do not expend their hydrogen, and so we may regard brown dwarf stars as living indefinitely.
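The lifetime sequence can be tabulated in a short Python sketch. The lifetimes and the age of the universe are the rough values quoted in these notes; the function names are hypothetical and are introduced only for illustration.

```python
# Rough main-sequence lifetimes in years, by spectral class (from the notes).
MAIN_SEQUENCE_LIFETIME_YEARS = {
    "O": 1e6,   # one million years
    "B": 1e7,   # ten million years
    "A": 1e8,   # one hundred million years
    "F": 1e9,   # one billion years
    "G": 1e10,  # ten billion years
    "K": 1e11,  # one hundred billion years
    "M": 1e12,  # one trillion years
}
AGE_OF_UNIVERSE_YEARS = 14e9  # roughly fourteen billion years


def has_any_died(spectral_class: str) -> bool:
    """True if the universe is old enough for stars of this class to have died."""
    return MAIN_SEQUENCE_LIFETIME_YEARS[spectral_class] < AGE_OF_UNIVERSE_YEARS


def years_until_first_deaths(spectral_class: str) -> float:
    """Minimum additional wait before stars of this class begin to die."""
    remaining = MAIN_SEQUENCE_LIFETIME_YEARS[spectral_class] - AGE_OF_UNIVERSE_YEARS
    return max(0.0, remaining)


if __name__ == "__main__":
    for spectral_class in "OBAFGKM":
        print(spectral_class,
              has_any_died(spectral_class),
              years_until_first_deaths(spectral_class))
```

In this rough table, each spectral class lives roughly ten times longer than the class before it, and only the K-type and M-type lifetimes exceed the current age of the universe.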

We will discuss how the mass of a star determines its death shortly. For now, we briefly mention that the mass of a star determines not only its main-sequence lifetime but also the duration of its death. The main-sequence lifetime of low-mass stars is in the billions of years, while the main-sequence lifetime of high-mass stars is only in the millions of years. The processes involved with stellar death are shorter in duration as compared with a star’s main-sequence lifetime, but these shorter durations are in approximate proportion with the corresponding main-sequence lifetimes. In particular, the death of a low-mass star is millions of years in duration, while the death of a high-mass star is only thousands of years in duration. The mass of a star even determines its protostar-lifetime. In particular, the collapse of a low-mass protostar is millions of years in duration, while the collapse of a high-mass protostar is only thousands of years in duration. Note that in all cases, the main-sequence lifetime of a star is overwhelmingly longer than the duration of its birth as a collapsing protostar and overwhelmingly longer than the duration of its death. The main-sequence lifetime of a star is so overwhelmingly longer than its birth and its death that the total lifetime of a star may be regarded as its main-sequence lifetime as an excellent approximation. Note that the total lifetime of a high-mass star is in the millions of years, but it takes that long just for a low-mass protostar to collapse. In other words, a high-mass star could be born, could live its entire main-sequence lifetime, and could die all in a time shorter than the time it takes a low-mass protostar to collapse, meaning that a high-mass star is born, lives, and dies before a low-mass star can even be born!

Strictly, there are gradual changes in the luminosity, the temperature, and even the radius (the size) of a star over its main-sequence lifetime. Nevertheless, these main-sequence changes are small as compared with the changes in these quantities during stellar birth, when the changes are much more severe. Also, main-sequence changes are small as compared with the changes in these quantities during stellar death, when the changes are also much more severe, as we will discuss shortly. Therefore, we will regard the luminosity, the temperature, and the radius (the size) of a star as approximately constant (or fixed) over its main-sequence lifetime as a satisfactory approximation. Since the main-sequence lifetime of a star is overwhelmingly longer than its birth and its death, we conclude that a star remains at approximately the same location on the main sequence on the Hertzsprung-Russell diagram during most of its life. This validates our comparison of temperatures, luminosities, radii (sizes), masses, population abundances, and lifetimes of main sequence stars as physically meaningful. It is therefore appropriate to summarize the sequences of physical quantities by spectral-type along the main sequence. The main sequence is a temperature sequence, a luminosity sequence, a radius (size) sequence, a mass sequence, a population-abundance sequence, and a lifetime sequence. By these sequences, we mean that given any two stars on the main sequence, the main sequence star earlier in the sequence OBAFGKM will be hotter, more luminous, larger, more massive, less abundant (more rare), with a shorter main-sequence lifetime, while the main sequence star later in the sequence OBAFGKM will be cooler, less luminous, smaller, less massive, more abundant (more common), with a longer main-sequence lifetime. Warning: all of these conclusions can only be drawn if both stars are on the main sequence. 
If even one of the two stars is not on the main sequence, we cannot easily make any comparisons between the two stars. Finally, the main sequence is not an evolutionary sequence, as we are currently discussing. In actuality, a star can be born anywhere along the main sequence depending on its mass, and a star will remain at its particular location on the main sequence on the Hertzsprung-Russell diagram throughout its main-sequence lifetime, fusing hydrogen into helium in its core. When stars die, they actually evolve off of the main sequence, as we now discuss.
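
The ordering rules summarized above can be sketched in a few lines of Python. This is only an illustration: the function name compare_main_sequence is hypothetical, and the comparison is valid only under the assumption stated in the text, namely that both stars are on the main sequence.

```python
# Spectral types along the main sequence, earliest (hottest) first.
SEQUENCE = "OBAFGKM"

def compare_main_sequence(type_a: str, type_b: str) -> str:
    """Compare two MAIN SEQUENCE stars by spectral-type letter.

    Hypothetical helper for illustration; it assumes both stars are on the
    main sequence, since the orderings do not hold otherwise.
    """
    index_a = SEQUENCE.index(type_a)  # raises ValueError for unknown letters
    index_b = SEQUENCE.index(type_b)
    if index_a == index_b:
        return "same spectral type: the letter alone gives no ordering"
    # The star with the smaller index is earlier in OBAFGKM.
    earlier, later = (type_a, type_b) if index_a < index_b else (type_b, type_a)
    return (f"{earlier} is hotter, more luminous, larger, more massive, "
            f"rarer, and shorter-lived than {later}")

print(compare_main_sequence("G", "B"))
print(compare_main_sequence("M", "K"))
```

The single index lookup reflects the point of the paragraph: one position along OBAFGKM fixes all six orderings at once.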

Stellar death is defined to begin when a main sequence star has exhausted the hydrogen in its core, having fused the hydrogen into helium. Without hydrogen in its core to fuse into helium, the star’s main-sequence (hydrogen-burning) lifetime has ended, and stellar death begins. For the purposes of stellar death, we divide all main sequence stars into two categories: low mass main sequence stars and high mass main sequence stars. A low mass main sequence star has a mass less than 7M_☉, 8M_☉, or 9M_☉ (seven, eight, or nine solar masses or seven, eight, or nine times the mass of our Sun). A high mass main sequence star has a mass greater than 7M_☉, 8M_☉, or 9M_☉ (seven, eight, or nine solar masses or seven, eight, or nine times the mass of our Sun). Note that our Sun is a low mass main sequence star as far as stellar death is concerned, since the mass of our Sun is 1M_☉ (one solar mass), and one is less than seven, eight, or nine! In terms of spectral types, low mass main sequence stars have spectral types A, F, G, K, or M, while high mass main sequence stars have spectral types O or B. Again, our Sun is a G2V star, which falls into the low mass main sequence category. The vast majority of all main sequence stars are low mass; only a very small fraction of all main sequence stars are high mass. We divide all main sequence stars into these two categories because low mass death and high mass death are sufficiently different that we must discuss them separately. Actually, low mass death and high mass death are somewhat similar to each other. High mass death is simply more violent as compared with low mass death. In other words, low mass death is more gentle as compared with high mass death. Since the vast majority of all main sequence stars are low mass, most stars die gently. Since only a very small fraction of all main sequence stars are high mass, few stars die violently. 
Even though high mass death is rare, we must devote a thorough discussion to high mass death, since we owe our very existence to violent high mass death, as we will discuss shortly. Nevertheless, we begin our discussion with low mass death, since the vast majority of all stars die gently, including our own Sun.

Low mass stars have long main-sequence lifetimes. After exhausting the hydrogen in the core, the nuclear fusion reactions end. Thus, there is no outward pressure to balance the inward self-gravity of the helium core. Hence, the helium core begins to collapse under its self-gravity. As the helium core collapses, it becomes hotter, since it is converting gravitational energy into heat. A layer of hydrogen around that collapsing helium core becomes hot enough to itself fuse into helium. This fusion layer around the collapsing helium core provides pressure that pushes the outer layers of the star further outward. If the outer layers expand, then they must become cooler. The core of the star and the outer layers of the star are doing two opposite things at the same time! The core collapses and becomes hotter, while the outer layers of the star expand and become cooler! We can only observe the outer layers of a star; the inner layers of a star are hidden beneath its outer layers. Hence, we observe the outer layers of the star become larger and cooler. Cooler temperatures correspond to redder colors. Therefore, the star becomes larger and redder. In other words, the star has become a red giant. As we discussed, all stars are born main sequence stars, while red giants are essentially dying stars. More correctly, the outer layers of the star gradually expand and cool over millions of years, turning the star from a main sequence star to an orange subgiant star to a red giant star. Although this gradual expansion over millions of years seems long as compared with human timescales, this expansion is relatively short as compared with the billions of years the star spent as a main sequence star. The imbalance between gravitational forces and thermal pressures during the expansion from a main sequence star to an orange subgiant star to a red giant star may cause pulsations within the star, causing its size to oscillate from large to small and back again. 
As a result, the luminosity of the star oscillates from bright to dim and back again. These stars are called Cepheid variable stars, which we will discuss later in the course. The helium core continues to collapse, becoming hotter. Eventually, the helium core becomes so hot that helium nuclei begin fusing into heavier nuclei, in particular carbon nuclei. This is called helium burning, although again the use of the word burning is incorrect nomenclature. The moment when helium begins fusing into carbon is called the helium flash. The nuclear fusion of helium into carbon is more properly written $3\,{}^{4}\mathrm{He} \rightarrow \text{energy} + {}^{12}\mathrm{C}$. This nuclear reaction is called the triple-alpha process, since three helium nuclei (three alpha particles) fuse into a carbon nucleus. Note that the electromagnetic repulsion between electrical charges is directly proportional to the product of the charges. Hence, the temperature necessary to overpower the electromagnetic repulsion between two helium nuclei (two alpha particles) each having two positive protons is hotter than the temperature necessary to overpower the electromagnetic repulsion between two hydrogen nuclei (two protons), each having one positive proton. In the case of helium-helium fusion, the electromagnetic repulsion is proportional to two times two, which is four. In the case of hydrogen-hydrogen fusion, the electromagnetic repulsion is proportional to one times one, which is one. Four is significantly greater than one, meaning more electromagnetic repulsion. Hence, a hotter temperature is required for helium-helium fusion (the basis of the triple-alpha process) as compared with hydrogen-hydrogen fusion (the basis of the proton-proton cycle). The helium flash causes a small expansion of the core and hence a slight decrease in the core temperature. This in turn causes the outer layers of the star to contract and warm. 
This imbalance between gravitational forces and thermal pressures may cause pulsations within the star, causing its size to oscillate from large to small and back again. As a result, the luminosity of the star oscillates from bright to dim and back again. These stars are called RR Lyrae variable stars, which we will discuss later in the course. Eventually, the entire star attains a new gravitational equilibrium as a helium-burning star, although note that there is a layer of hydrogen fusing into helium around the core where helium fuses into carbon. The helium-burning lifetime of the star is much shorter than its hydrogen-burning (main-sequence) lifetime, since helium fusing into carbon occurs at much hotter temperatures than hydrogen fusing into helium. The star spends millions of years as a helium-burning star. Although this seems long as compared with human timescales, these millions of years as a helium-burning star are relatively short as compared with the billions of years the star spent as a hydrogen-burning (main sequence) star. Eventually, the star exhausts the helium in its core, ending the triple-alpha process. Again, there is no outward pressure to balance the inward self-gravity of the carbon core. Hence, the core again collapses, becoming hotter. A layer of helium around that collapsing carbon core becomes hot enough to fuse into carbon, and a layer of hydrogen around that helium-burning layer becomes hot enough to fuse into helium. These two fusion layers around the collapsing carbon core provide pressure that again pushes the outer layers of the star further outward, causing the outer layers to become cooler and hence redder. The star has become a red giant a second time! The imbalance between gravitational forces and thermal pressures during the expansion from a helium-burning star to a red giant star may cause pulsations within the star, causing its size to oscillate from large to small and back again. 
As a result, the luminosity of the star oscillates from bright to dim and back again. These stars are called Mira variable stars, which we will discuss later in the course. Since the star is low mass, its self-gravity is too weak to compress the carbon core sufficiently to ignite the nuclear fusion of carbon nuclei into even heavier nuclei. In other words, a carbon flash does not occur. Hence, the outer layers of the star continue to expand until they become divorced from the very small, very hot carbon core. The outer layers have become a slowly expanding shell of gas. This is called a planetary nebula, which is a truly incorrect term since a planetary nebula has nothing to do with planets! The planetary nebula exposes the very small, very hot carbon core. This naked core is very small and very hot since it has collapsed twice. We might suspect that this naked core is intrinsically bright, since it is so hot. However, this naked core is very small; it is roughly the size of the Earth! According to the Stefan-Boltzmann law, such a small size results in a low luminosity, even though the temperature is hot. Therefore, this naked core is small, hot, and intrinsically dim. The naked core has become a white dwarf. As we discussed, all stars are born main sequence stars, while red giants and white dwarfs result from stellar death. In summary, a low-mass main sequence (hydrogen-burning) star dies by first becoming a red giant, enters a helium-burning phase, becomes a red giant a second time, and finally dies as a slowly expanding planetary nebula surrounding a white dwarf. White dwarfs have incredible densities, since they have roughly the mass of our Sun squeezed into roughly the size of the Earth. The radius of the Earth, and therefore the radius of a white dwarf, is roughly 0.01R_☉ (one-hundredth of a solar radius or one-hundredth the radius of our Sun). 
Therefore, the volume of the Earth, and therefore the volume of a white dwarf, is roughly one-millionth the volume of our Sun. With roughly the mass of the Sun squeezed into roughly one-millionth the volume of the Sun, white dwarfs therefore have densities roughly one million times normal densities! White dwarfs also have sufficiently hot surface temperatures to radiate a fair amount of ultraviolet light. The gases of the surrounding planetary nebula absorb some of these ultraviolet photons radiated by the hot white dwarf, bringing the electrons within these gases to higher energy quantum states. The electrons then transition back down to lower energy quantum states, emitting visible light photons. As a result, a planetary nebula surrounding a white dwarf often displays a variety of beautiful colors. The planetary nebula continues to expand, becoming cooler and cooler and more and more diffuse (less and less dense). Eventually, the gases of the planetary nebula return to the interstellar medium. The interstellar medium (which astrophysicists always abbreviate ISM) is the very diffuse gas that fills the Milky Way Galaxy. In fact, a nebula is actually a part of the interstellar medium where densities are greater than the average densities of most of the gas of the interstellar medium. As we discussed, stars are born from within a diffuse nebula; therefore, stars are born from the interstellar medium. Low mass stars live their lives fusing hydrogen into helium, begin dying by fusing helium into carbon, and finally die by returning the gas of their outer layers back to the interstellar medium. These gases may someday form a new diffuse nebula from which new stars will be born. Hence, stellar evolution is actually a cycle, since stellar death ultimately leads to stellar birth again. 
Beautiful examples of planetary nebulae include the Ring Nebula in the constellation Lyra (the harp), the Little Ghost Nebula in the constellation Ophiuchus (the serpent bearer), and the Helix Nebula in the constellation Aquarius (the water bearer). The white dwarf at the center of a planetary nebula spends billions of years becoming cooler and cooler and hence dimmer and dimmer. After many more billions of years, a white dwarf becomes so cool and so dim that it is renamed a black dwarf.
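
The arithmetic used in the discussion of low mass death can be checked with a short Python sketch. The function names are hypothetical, and the white dwarf surface temperature used here (four times that of our Sun) is only an illustrative assumption, not a value taken from these notes.

```python
def charge_product(protons_a: int, protons_b: int) -> int:
    """Electromagnetic repulsion is proportional to the product of the charges."""
    return protons_a * protons_b

def luminosity_ratio(radius_ratio: float, temperature_ratio: float) -> float:
    """Stefan-Boltzmann law: luminosity is proportional to R^2 * T^4.

    Both arguments and the result are ratios relative to our Sun.
    """
    return radius_ratio ** 2 * temperature_ratio ** 4

# Hydrogen-hydrogen versus helium-helium repulsion.
hydrogen_hydrogen = charge_product(1, 1)   # one times one
helium_helium = charge_product(2, 2)       # two times two

# White dwarf: roughly the mass of our Sun in roughly the size of the Earth.
radius_ratio = 0.01                        # about 0.01 solar radii
volume_ratio = radius_ratio ** 3           # volume scales as radius cubed
density_factor = 1.0 / volume_ratio        # same mass in a smaller volume

# ASSUMPTION (illustrative only): surface temperature four times our Sun's.
assumed_temperature_ratio = 4.0
white_dwarf_luminosity = luminosity_ratio(radius_ratio, assumed_temperature_ratio)

print(f"He-He repulsion relative to H-H: {helium_helium // hydrogen_hydrogen}x")
print(f"white dwarf volume / solar volume: {volume_ratio:.1e}")
print(f"white dwarf density factor: {density_factor:.1e}")
print(f"white dwarf luminosity / solar luminosity: {white_dwarf_luminosity:.3f}")
```

Under that assumed temperature, the small radius still wins: the white dwarf comes out far dimmer than our Sun even though it is hotter, which is the point the Stefan-Boltzmann argument makes.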

We subdivide low mass main sequence stars into two subcategories: ordinary low mass stars and very low mass stars. Ordinary low mass main sequence stars have masses from 7M_☉, 8M_☉, or 9M_☉ (seven, eight, or nine solar masses) down to roughly 0.5M_☉ (one-half of one solar mass). Very low mass main sequence stars have masses from roughly 0.5M_☉ (one-half of one solar mass) all the way down to the lower limit of all main sequence stars of roughly 0.08M_☉ (0.08 solar masses). Note that our Sun is an ordinary low mass star, since the mass of our Sun is 1M_☉ (one solar mass), and one is between one-half and seven, eight, or nine! In terms of spectral types, ordinary low mass main sequence stars have spectral types of A, F, or G, while very low mass main sequence stars have spectral types of K or M. Recall that our Sun is a G2V star, again placing our Sun into the ordinary low mass subcategory. The stellar death we have discussed thus far strictly applies to ordinary low mass main sequence stars, like our Sun. Very low mass main sequence stars die somewhat differently. A very low mass star spends countless billions of years fusing hydrogen into helium in its core. After exhausting the hydrogen in its core, the nuclear fusion reactions end. Thus, there is no outward pressure to balance the inward self-gravity of the helium core. Hence, the helium core begins to collapse under its self-gravity. As the helium core collapses, it becomes hotter, since it is converting gravitational energy into heat. A layer of hydrogen around that collapsing helium core becomes hot enough to fuse into helium. This fusion layer around the collapsing helium core provides pressure that pushes the outer layers of the star further outward. If the outer layers expand, then they must become cooler. Again, the core of the star and the outer layers of the star are doing two opposite things at the same time. 
The core collapses and becomes hotter, while the outer layers of the star expand and become cooler. Again, we can only observe the outer layers of a star; the inner layers of a star are hidden beneath its outer layers. Hence, we observe the outer layers of the star become larger and cooler. Cooler temperatures correspond to redder colors. Therefore, the star becomes larger and redder. In other words, the outer layers of the star gradually expand and cool over millions of years, turning the star from a main sequence star to a subgiant star to a giant star. The death of very low mass stars seems identical with the death of ordinary low mass stars, but now the differences begin. Very low mass stars have such weak self-gravity that they cannot compress their cores to reach the threshold temperatures at which helium fuses into carbon. In other words, the helium flash never occurs, and the star only becomes a red giant once instead of twice. The outer layers of the star continue to expand, eventually becoming a planetary nebula surrounding a helium white dwarf instead of a carbon white dwarf. As we discussed, every K-type or M-type main sequence star that has ever been born has not died yet. Hence, there are no helium white dwarfs in the entire universe as of yet. We must wait at least an additional roughly eighty-six billion years before K-type stars begin to die. There are more K-type stars than A-type stars, F-type stars, or G-type stars, since the main sequence is a population-abundance sequence. Hence, when K-type stars begin to die, helium white dwarfs will become the majority of the white dwarfs in the universe, turning the carbon white dwarfs into a minority of the white dwarfs in the universe. Countless billions of years after that, M-type main sequence stars will begin to die, and there are even more M-type main sequence stars than K-type main sequence stars, since again the main sequence is a population-abundance sequence. 
Hence, when M-type stars begin to die, helium white dwarfs will become the overwhelming majority of all white dwarfs in the universe, while carbon white dwarfs will become a very small minority of all white dwarfs in the universe.
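
The mass categories used for stellar death can be summarized as a small Python sketch. The function name classify_main_sequence_star is hypothetical. The notes quote the high mass boundary as seven, eight, or nine solar masses, so the single value of eight used here is an assumption, and the other boundaries are the rough values quoted above.

```python
# Boundaries in solar masses. All are approximate.
HIGH_MASS_BOUNDARY = 8.0       # ASSUMPTION: the notes say 7, 8, or 9
VERY_LOW_MASS_BOUNDARY = 0.5   # roughly one-half of one solar mass
MINIMUM_STAR_MASS = 0.08       # lower limit of all main sequence stars

def classify_main_sequence_star(mass_in_solar_masses: float) -> tuple[str, str]:
    """Return (category, how the star dies) for a main sequence star."""
    if mass_in_solar_masses < MINIMUM_STAR_MASS:
        raise ValueError("below the lower mass limit of main sequence stars")
    if mass_in_solar_masses < VERY_LOW_MASS_BOUNDARY:
        # No helium flash: the star becomes a red giant only once.
        return ("very low mass", "planetary nebula around a helium white dwarf")
    if mass_in_solar_masses < HIGH_MASS_BOUNDARY:
        # Helium flash occurs, but no carbon flash.
        return ("ordinary low mass", "planetary nebula around a carbon white dwarf")
    return ("high mass", "violent death, discussed separately")

for mass in (0.2, 1.0, 20.0):
    category, death = classify_main_sequence_star(mass)
    print(f"{mass} solar masses: {category}; {death}")
```

Our Sun, at one solar mass, falls in the ordinary low mass branch, in agreement with the text.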

Ordinary low mass stars do not have sufficient self-gravity to compress their carbon cores to sufficient temperatures for the carbon flash to occur. Hence, an ordinary low mass star dies as a non-burning carbon white dwarf surrounded by a slowly expanding planetary nebula. Very low mass stars have such weak self-gravity that not even the helium flash occurs. Hence, a very low mass star dies as a non-burning helium white dwarf surrounded by a slowly expanding planetary nebula. In either case, low mass stars die as a slowly expanding planetary nebula surrounding a non-burning white dwarf. If there are no nuclear reactions occurring in a white dwarf, what is providing the outward pressure to balance the inward self-gravity to keep a white dwarf in gravitational equilibrium? As we discussed, white dwarfs have densities roughly one million times normal densities. At such incredible densities, electrons are squeezed close to each other. However, electrons obey the Pauli Exclusion Principle, named for the Austrian physicist Wolfgang Pauli who first formulated this fundamental statement of Quantum Mechanics. According to the Pauli Exclusion Principle, certain quantum-mechanical particles are forbidden from occupying the same quantum state at the same time. Thus, any attempt to squeeze such particles into the same quantum state will result in a pressure against this compression. This pressure is called degeneracy pressure. Electrons are one type of quantum-mechanical particle that obey the Pauli Exclusion Principle. In other words, electrons are forbidden from occupying the same quantum state at the same time. It is because of this exclusion that electrons within atoms must occupy higher energy states when lower energy states happen to be already filled with electrons. It is the electrons in the higher energy quantum states of atoms that participate in chemical reactions and chemical bonding. 
Therefore, all of chemistry, including all of the biochemistry essential for all life, would not occur if electrons did not obey the Pauli Exclusion Principle. Also since electrons obey the Pauli Exclusion Principle, it is electron degeneracy pressure that provides the outward pressure to balance the inward self-gravity of a white dwarf. Electron degeneracy pressure also provides the outward pressure to balance the inward self-gravity of brown dwarfs. Many students argue that this electron degeneracy pressure must come from the electromagnetic repulsion of the electrons. As we discussed earlier in the course, like charges repel, and unlike charges attract. Since electrons are negatively charged, they must repel each other electromagnetically, and students argue that this is the source of the electron degeneracy pressure. Although this argument seems reasonable, it is nevertheless wrong. Electron degeneracy pressure has nothing to do with electromagnetic repulsion. Of course, the electromagnetic repulsion of the electrons provides some extra pressure in addition to the electron degeneracy pressure. However, electron degeneracy pressure has nothing to do with the charge of electrons. The source of electron degeneracy pressure is the spin of the electrons. The spin of any quantum-mechanical particle is its intrinsic angular momentum. As a crude picture, we can imagine that the electron is spinning or turning around an axis. According to Quantum Mechanics, it is this spinning of the electron around an axis that is the source of the electron degeneracy pressure. We will discuss another type of degeneracy pressure shortly that will beautifully emphasize how degeneracy pressure has nothing to do with electromagnetic repulsion. 
To summarize, white dwarfs (as well as brown dwarfs) remain in gravitational equilibrium not due to nuclear reactions but due to electron degeneracy pressure, which arises because the Pauli Exclusion Principle prevents electrons (and certain other quantum-mechanical particles) from occupying the same quantum state at the same time. Degeneracy pressure has nothing to do with electromagnetic repulsion; degeneracy pressure arises from the intrinsic angular momentum (the spin) of certain quantum-mechanical particles.

It is instructive to discuss how our particular Solar System will die. Our Sun is an ordinary low mass star with a main sequence (hydrogen-burning) lifetime of roughly ten billion years. Our Sun has spent roughly five billion years fusing hydrogen into helium in its core, and our Sun will spend an additional roughly five billion years fusing hydrogen into helium in its core. After exhausting the hydrogen in its core, our Sun will begin to die. Gradually over millions of years (which is brief as compared with its ten-billion-year main-sequence lifetime), our Sun’s helium core will collapse and become hotter while its outer layers expand and become cooler, turning our Sun from a yellow main sequence star to an orange subgiant star to a red giant star. The helium flash will then occur, and our Sun will become a helium-burning star. Our Sun’s helium-burning lifetime will last millions of years, which is again brief as compared with its ten-billion-year main-sequence lifetime. After exhausting the helium in its core, our Sun’s carbon core will collapse and become hotter, while its outer layers expand and become cooler. Our Sun will become a red giant a second time. When the outer layers of our Sun expand to become a red giant the second time, its outer layers will consume the inner planets (Mercury, Venus, Earth, and Mars). However, the outer layers of a red giant are cool, only one or two thousand kelvins in temperature. Although this temperature is hot by human standards, it is not hot enough to melt most rocks, and it is certainly not hot enough to melt most metals. Hence, the inner planets will not immediately be destroyed when our Sun’s second red giant phase consumes them. In fact, the inner planets will at first continue to orbit the red giant Sun while being inside the red giant Sun! This will not continue long however, since the outer layers of the red giant Sun will cause drag as the inner planets orbit within these outer layers of gas. 
This drag will cause the inner planets to spiral inward toward the red giant Sun’s core, which is certainly hot enough to melt metal and rock. This is how the inner planets will be destroyed. The outer layers of the red giant Sun will continue to expand and cool. By the time these gases reach the outer planets, they will be so tenuous (low density) that they will have a negligible effect on the outer planets. These outer gas layers will pass the outer planets, continuing to become cooler and cooler and more and more diffuse (less and less dense). These outer gas layers will eventually become a planetary nebula, returning these gases to the surrounding interstellar medium. Now the only gravitational attraction the outer planets will feel is from the carbon white dwarf, the naked core of the former red giant Sun. However, the Sun will have lost most of its mass, since it ejected its outer gas layers, which became an expanding planetary nebula. The carbon white dwarf was once the Sun’s core, which is only a small fraction of the Sun’s original mass. With significantly less mass, the carbon white dwarf will not have sufficient gravitational attraction to hold the outer planets in orbit. Hence, the outer planets will leave their orbits, becoming rogue planets (or orphan planets). A rogue (or orphan) planet does not orbit any particular star but instead moves along its own trajectory through our Milky Way Galaxy. Finally, all that will remain of our Solar System will be a carbon white dwarf, which was once our Sun’s core. All of these processes will begin in roughly five billion years, and they will take many millions of years to occur. If we could return to our Solar System roughly six billion years from now, all of these processes would be complete, and a carbon white dwarf would be all that remains of our Solar System. Billions of years from now, intelligent life may evolve on another planet orbiting another star. 
These intelligent lifeforms may even build telescopes and discover the rogue (or orphan) planet Jupiter moving through the Milky Way Galaxy. However, these intelligent lifeforms will have no direct evidence that Jupiter once orbited our Sun, since our Sun will have long since died. Hence, these intelligent lifeforms will probably mistakenly believe that Jupiter is a brown dwarf star. Perhaps some of the brown dwarf stars we observe today were once gas-giant planets that were once orbiting an ancient star that has long since died. In other words, perhaps some of the brown dwarf stars we observe today are not brown dwarf stars at all but are actually rogue (or orphan) planets.

High mass main sequence stars have masses greater than 7M_☉, 8M_☉, or 9M_☉ (seven, eight, or nine solar masses). In terms of spectral types, high mass main sequence stars are either O-type stars or B-type stars. High mass death is somewhat similar to low mass death but more violent. Since high mass stars are rare, the vast majority of main sequence stars die gently, while only a small fraction of main sequence stars die violently. Nevertheless, we must discuss high mass death, since we owe our very existence to violent high mass death, as we will discuss shortly.

A high mass main sequence star spends a short amount of time fusing hydrogen into helium in its core, only several million years. After exhausting the hydrogen in its core, the helium center collapses and becomes hotter, while a new layer of hydrogen fusion causes the outer layers of the star to expand further outward and become cooler. The core is compressed until the triple-alpha process $3\,{}^{4}\mathrm{He} \rightarrow \text{energy} + {}^{12}\mathrm{C}$ begins, and the star becomes a helium-burning star, having a core where helium fuses into carbon surrounded by a layer where hydrogen fuses into helium. The helium-burning lifetime of the star is hundreds of thousands of years, shorter than the star’s hydrogen-burning (main-sequence) lifetime, since helium fusion occurs at hotter temperatures than hydrogen fusion. Eventually, the central helium is exhausted, the carbon center collapses and becomes hotter, while two surrounding fusion layers cause the outer layers of the star to expand further outward and become cooler. Thus far, high mass death seems nearly identical with low mass death, but now the differences begin. High mass stars have such strong self-gravity that their cores are compressed until they attain the threshold temperature where carbon nuclei fuse into even heavier nuclei, in particular oxygen nuclei. More strictly, carbon nuclei fuse with helium nuclei (alpha particles) to yield oxygen nuclei. This nuclear reaction is more properly written ${}^{12}\mathrm{C} + {}^{4}\mathrm{He} \rightarrow \text{energy} + {}^{16}\mathrm{O}$. Note that the electromagnetic repulsion between electrical charges is directly proportional to the product of the charges. Hence, the temperature necessary to overpower the electromagnetic repulsion between a carbon nucleus with six positive protons and a helium nucleus (an alpha particle) with two positive protons is not as hot as the temperature necessary to overpower the electromagnetic repulsion between two carbon nuclei, each having six positive protons. 
In the case of carbon-helium fusion, the electromagnetic repulsion is proportional to six times two, which is twelve. In the case of carbon-carbon fusion, the electromagnetic repulsion is proportional to six times six, which is thirty-six. Twelve is significantly less than thirty-six, meaning less electromagnetic repulsion and hence a less hot temperature is required for carbon-helium fusion as compared with carbon-carbon fusion. Although there is an even weaker electromagnetic repulsion between a hydrogen nucleus (a proton) and a carbon nucleus, the nuclear fusion of a hydrogen nucleus with any other nucleus is slow, since it involves the weak nuclear force. The star is now a carbon-burning star, having a core where carbon fuses into oxygen surrounded by two less hot fusion layers. The carbon-burning lifetime of the star is tens of thousands of years, even shorter than its helium-burning lifetime, since carbon burning occurs at even hotter temperatures than helium burning, since carbon-helium fusion temperatures are proportional to twelve (six times two), a larger number as compared with helium-helium fusion temperatures which are proportional to four (two times two). Eventually, the central carbon is exhausted, the oxygen center collapses and becomes hotter, while three surrounding fusion layers cause the outer layers of the star to expand further outward and become cooler. These high mass stars have such strong self-gravity that their cores are compressed until they attain the threshold temperature where oxygen nuclei fuse into even heavier nuclei, in particular neon nuclei. More strictly, oxygen nuclei fuse with helium nuclei (alpha particles) to yield neon nuclei. This nuclear reaction is more properly written ${}^{16}\mathrm{O} + {}^{4}\mathrm{He} \rightarrow \text{energy} + {}^{20}\mathrm{Ne}$. Again, the electromagnetic repulsion between electrical charges is directly proportional to the product of the charges. 
Hence, the temperature necessary to overpower the electromagnetic repulsion between an oxygen nucleus with eight positive protons and a helium nucleus (an alpha particle) with two positive protons is not as hot as the temperature necessary to overpower the electromagnetic repulsion between two oxygen nuclei, each having eight positive protons. In the case of oxygen-helium fusion, the electromagnetic repulsion is proportional to eight times two, which is sixteen. In the case of oxygen-oxygen fusion, the electromagnetic repulsion is proportional to eight times eight, which is sixty-four. Sixteen is significantly less than sixty-four, meaning less electromagnetic repulsion and hence a less hot temperature is required for oxygen-helium fusion as compared with oxygen-oxygen fusion. Although there is an even weaker electromagnetic repulsion between a hydrogen nucleus (a proton) and an oxygen nucleus, the nuclear fusion of a hydrogen nucleus with any other nucleus is again slow, since it involves the weak nuclear force. The star is now an oxygen-burning star, having a core where oxygen fuses into neon surrounded by three less hot fusion layers. The oxygen-burning lifetime of the star is several thousand years, even shorter than its carbon-burning lifetime, since oxygen burning occurs at even hotter temperatures than carbon burning, since oxygen-helium fusion temperatures are proportional to sixteen (eight times two), a larger number as compared with carbon-helium fusion temperatures which are proportional to twelve (six times two). Eventually, the central oxygen is exhausted, the neon center collapses and becomes hotter, while four surrounding fusion layers cause the outer layers of the star to expand further outward and become cooler. These high mass stars have such strong self-gravity that their cores are compressed until they attain the threshold temperature where neon nuclei fuse into even heavier nuclei, in particular magnesium nuclei. 
More strictly, neon nuclei fuse with helium nuclei (alpha particles) to yield magnesium nuclei. This nuclear reaction is more properly written + → energy + . The star is now a neon-burning star, having a core where neon fuses into magnesium surrounded by four less hot fusion layers. The neon-burning lifetime of the star is several hundred years, even shorter than its oxygen-burning lifetime, since neon burning occurs at even hotter temperatures than oxygen burning, since neon-helium fusion temperatures are proportional to twenty (ten times two), a larger number as compared with oxygen-helium fusion temperatures which are proportional to sixteen (eight times two). Eventually, the central neon is exhausted, the magnesium center collapses and becomes hotter, while five surrounding fusion layers cause the outer layers of the star to expand further outward and become cooler. These high mass stars have such strong self-gravity that their cores are compressed until they attain the threshold temperature where magnesium nuclei fuse into even heavier nuclei, in particular silicon nuclei. More strictly, magnesium nuclei fuse with helium nuclei (alpha particles) to yield silicon nuclei. This nuclear reaction is more properly written + → energy + . The star is now a magnesium-burning star, having a core where magnesium fuses into silicon surrounded by five less hot fusion layers. The magnesium-burning lifetime of the star is several decades, even shorter than its neon-burning lifetime, since magnesium burning occurs at even hotter temperatures than neon burning, since magnesium-helium fusion temperatures are proportional to twenty-four (twelve times two), a larger number as compared with neon-helium fusion temperatures which are proportional to twenty (ten times two). Eventually, the central magnesium is exhausted, the silicon center collapses and becomes hotter, while six surrounding fusion layers cause the outer layers of the star to expand further outward and become cooler. 
These high mass stars have such strong self-gravity that their cores are compressed until they attain the threshold temperature where silicon nuclei fuse into even heavier nuclei, in particular sulfur nuclei. More strictly, silicon nuclei fuse with helium nuclei (alpha particles) to yield sulfur nuclei. This nuclear reaction is more properly written ^{28}_{14}Si + ^{4}_{2}He → energy + ^{32}_{16}S. The star is now a silicon-burning star, having a core where silicon fuses into sulfur surrounded by six less hot fusion layers. The silicon-burning lifetime of the star is several years, even shorter than its magnesium-burning lifetime, since silicon burning occurs at even hotter temperatures than magnesium burning, since silicon-helium fusion temperatures are proportional to twenty-eight (fourteen times two), a larger number as compared with magnesium-helium fusion temperatures which are proportional to twenty-four (twelve times two). Eventually, the central silicon is exhausted, the sulfur center collapses and becomes hotter, while seven surrounding fusion layers cause the outer layers of the star to expand further outward and become cooler. These high mass stars have such strong self-gravity that their cores are compressed until they attain the threshold temperature where sulfur nuclei fuse into even heavier nuclei, in particular argon nuclei. More strictly, sulfur nuclei fuse with helium nuclei (alpha particles) to yield argon nuclei. This nuclear reaction is more properly written ^{32}_{16}S + ^{4}_{2}He → energy + ^{36}_{18}Ar. The star is now a sulfur-burning star, having a core where sulfur fuses into argon surrounded by seven less hot fusion layers. 
The sulfur-burning lifetime of the star is several months, even shorter than its silicon-burning lifetime, since sulfur burning occurs at even hotter temperatures than silicon burning, since sulfur-helium fusion temperatures are proportional to thirty-two (sixteen times two), a larger number as compared with silicon-helium fusion temperatures which are proportional to twenty-eight (fourteen times two). Eventually, the central sulfur is exhausted, the argon center collapses and becomes hotter, while eight surrounding fusion layers cause the outer layers of the star to expand further outward and become cooler. These high mass stars have such strong self-gravity that their cores are compressed until they attain the threshold temperature where argon nuclei fuse into even heavier nuclei, in particular calcium nuclei. More strictly, argon nuclei fuse with helium nuclei (alpha particles) to yield calcium nuclei. This nuclear reaction is more properly written ^{36}_{18}Ar + ^{4}_{2}He → energy + ^{40}_{20}Ca. The star is now an argon-burning star, having a core where argon fuses into calcium surrounded by eight less hot fusion layers. The argon-burning lifetime of the star is several days, even shorter than its sulfur-burning lifetime, since argon burning occurs at even hotter temperatures than sulfur burning, since argon-helium fusion temperatures are proportional to thirty-six (eighteen times two), a larger number as compared with sulfur-helium fusion temperatures which are proportional to thirty-two (sixteen times two). Eventually, the central argon is exhausted, the calcium center collapses and becomes hotter, while nine surrounding fusion layers cause the outer layers of the star to expand further outward and become cooler. These high mass stars have such strong self-gravity that their cores are compressed until they attain the threshold temperature where calcium nuclei fuse into even heavier nuclei, in particular titanium nuclei. 
More strictly, calcium nuclei fuse with helium nuclei (alpha particles) to yield titanium nuclei. This nuclear reaction is more properly written ^{40}_{20}Ca + ^{4}_{2}He → energy + ^{44}_{22}Ti. The star is now a calcium-burning star, having a core where calcium fuses into titanium surrounded by nine less hot fusion layers. The calcium-burning lifetime of the star is even shorter than its argon-burning lifetime, since calcium burning occurs at even hotter temperatures than argon burning, since calcium-helium fusion temperatures are proportional to forty (twenty times two), a larger number as compared with argon-helium fusion temperatures which are proportional to thirty-six (eighteen times two). Eventually, the central calcium is exhausted, the titanium center collapses and becomes hotter, while ten surrounding fusion layers cause the outer layers of the star to expand further outward and become cooler. These high mass stars have such strong self-gravity that their cores are compressed until they attain the threshold temperature where titanium nuclei fuse into even heavier nuclei, in particular chromium nuclei. More strictly, titanium nuclei fuse with helium nuclei (alpha particles) to yield chromium nuclei. This nuclear reaction is more properly written ^{44}_{22}Ti + ^{4}_{2}He → energy + ^{48}_{24}Cr. The star is now a titanium-burning star, having a core where titanium fuses into chromium surrounded by ten less hot fusion layers. The titanium-burning lifetime of the star is even shorter than its calcium-burning lifetime, since titanium burning occurs at even hotter temperatures than calcium burning, since titanium-helium fusion temperatures are proportional to forty-four (twenty-two times two), a larger number as compared with calcium-helium fusion temperatures which are proportional to forty (twenty times two). Eventually, the central titanium is exhausted, the chromium center collapses and becomes hotter, while eleven surrounding fusion layers cause the outer layers of the star to expand further outward and become cooler. 
These high mass stars have such strong self-gravity that their cores are compressed until they attain the threshold temperature where chromium nuclei fuse into even heavier nuclei, in particular iron nuclei and nickel nuclei. More strictly, chromium nuclei fuse with helium nuclei (alpha particles) to yield iron nuclei, and iron nuclei fuse with helium nuclei (alpha particles) to yield nickel nuclei. These nuclear reactions are more properly written ^{48}_{24}Cr + ^{4}_{2}He → energy + ^{52}_{26}Fe and ^{52}_{26}Fe + ^{4}_{2}He → energy + ^{56}_{28}Ni. The star is now a chromium-burning star, having a core where chromium fuses into iron and nickel surrounded by eleven less hot fusion layers. The chromium-burning lifetime of the star is even shorter than its titanium-burning lifetime, since chromium burning occurs at even hotter temperatures than titanium burning, since chromium-helium fusion temperatures are proportional to forty-eight (twenty-four times two), a larger number as compared with titanium-helium fusion temperatures which are proportional to forty-four (twenty-two times two). Eventually, the central chromium is exhausted, the iron-nickel center collapses and becomes hotter, while twelve surrounding fusion layers cause the outer layers of the star to expand further outward and become cooler. In brief, each successive nuclear reaction occurs at hotter and hotter temperatures. The first hydrogen-burning stage occurs at tens of millions of kelvins. Helium burning, carbon burning, and oxygen burning each occur at hundreds of millions of kelvins. All the remaining burning (fusion) stages occur at a few billion kelvins! Also, each successive lifetime of the star is shorter and shorter, again since each successive nuclear reaction occurs at hotter and hotter temperatures. The first hydrogen-burning stage (the main-sequence lifetime) is itself relatively short for these high-mass stars, lasting only millions of years. 
Helium burning, carbon burning, and oxygen burning each last only thousands of years, neon burning lasts only centuries, and magnesium burning lasts only decades. Silicon burning lasts only years, sulfur burning only months, and argon burning only days! Calcium burning and titanium burning last only hours, and chromium burning lasts only minutes!
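
The bookkeeping of this chain of helium captures can be sketched in a few lines of Python. This is only an illustration of the simplified picture presented in these notes, in which each stage adds one helium-4 nucleus (two protons and four nucleons) and the mass number of each nucleus is twice its atomic number. The names ELEMENTS and alpha_ladder are hypothetical, chosen only for this sketch.

```python
# Simplified alpha ladder as presented in these notes (an illustration only):
# every step adds one helium-4 nucleus, so the atomic number rises by 2
# and the mass number rises by 4.
ELEMENTS = {
    6: "carbon", 8: "oxygen", 10: "neon", 12: "magnesium",
    14: "silicon", 16: "sulfur", 18: "argon", 20: "calcium",
    22: "titanium", 24: "chromium", 26: "iron", 28: "nickel",
}


def alpha_ladder(start_z=6, end_z=28):
    """Return (fuel, fuel_mass_number, product, product_mass_number) per step."""
    steps = []
    z = start_z
    while z < end_z:
        # In this simplified picture the mass number is twice the atomic number.
        steps.append((ELEMENTS[z], 2 * z, ELEMENTS[z + 2], 2 * (z + 2)))
        z += 2
    return steps


for fuel, a_fuel, product, a_product in alpha_ladder():
    print(f"{fuel}-{a_fuel} + helium-4 -> energy + {product}-{a_product}")
```

Each printed line corresponds to one burning stage described above, beginning with carbon fusing into oxygen and ending with iron fusing into nickel.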

Many students now conclude that successively hotter and hotter nuclear reactions continue to occur, synthesizing heavier and heavier nuclei all the way to the end of the Periodic Table of Elements. However, this nuclear reaction chain actually ends at iron and nickel, which is roughly halfway through the Periodic Table of Elements. As we discussed, nuclear fission is the splitting of more massive (heavier) nuclei into less massive (lighter) nuclei, while nuclear fusion is the merging or fusing of less massive (lighter) nuclei into more massive (heavier) nuclei. Both of these types of nuclear reactions generate energy because atoms of intermediate mass (roughly halfway through the Periodic Table of Elements) have the most stable nuclei among all atoms. The most massive (heaviest) nuclei attain greater stability by splitting into less massive (lighter) nuclei, hence releasing energy. The least massive (lightest) nuclei attain greater stability by merging or fusing into more massive (heavier) nuclei, again releasing energy. Hence, attempting to merge or fuse nuclei of intermediate mass (roughly halfway through the Periodic Table of Elements) into more massive (heavier) nuclei would result in those more massive (heavier) nuclei spontaneously splitting back into the intermediate nuclei. Similarly, attempting to split nuclei of intermediate mass (roughly halfway through the Periodic Table of Elements) into less massive (lighter) nuclei would result in those less massive (lighter) nuclei spontaneously merging or fusing back into the intermediate nuclei. Iron and nickel are intermediate-mass atoms, roughly halfway through the Periodic Table of Elements. In fact, iron and nickel nuclei are among the most stable of all the nuclei in the universe. Thus, the nuclear reaction chain at the center of a high mass star ends at iron and nickel. 
Caution: the physical strength of iron has nothing to do with its nuclear stability; the physical strength of iron arises from interactions among its electrons that reside in atomic states around the nucleus, not nuclear states within the nucleus. The core of the high mass star now has several layers, rather like the layers of an onion. Starting at the center of the many-layered core, we have non-burning iron and nickel surrounded by a layer of chromium burning (fusing) into iron and nickel surrounded by a layer of titanium burning (fusing) into chromium surrounded by a layer of calcium burning (fusing) into titanium surrounded by a layer of argon burning (fusing) into calcium surrounded by a layer of sulfur burning (fusing) into argon surrounded by a layer of silicon burning (fusing) into sulfur surrounded by a layer of magnesium burning (fusing) into silicon surrounded by a layer of neon burning (fusing) into magnesium surrounded by a layer of oxygen burning (fusing) into neon surrounded by a layer of carbon burning (fusing) into oxygen surrounded by a layer of helium burning (fusing) into carbon surrounded by a layer of hydrogen burning (fusing) into helium. Surrounding this many-layered core is the rest of the star, which is not hot enough for any nuclear fusion reactions to occur. Hence, most of the star is composed of roughly seventy-five percent (three-quarters) hydrogen and roughly twenty-five percent (one-quarter) helium. With each core collapse, these outer layers of the star have expanded further outward, becoming cooler and therefore redder. Since the outer layers of the star have expanded many times with each of the many collapses of the core, the star has become enormous. The star has become a red supergiant. While low mass stars begin to die by becoming red giants, high mass stars begin to die by becoming red supergiants. 
Since non-burning iron and nickel constitute the center of the many-layered core of this supergiant star, the non-burning iron and nickel center must be supported by electron degeneracy pressure. Note that the center of the core has compressed many times, squeezing the electrons closer and closer to each other. As we discussed, white dwarfs are supported by electron degeneracy pressure. Therefore, we may regard the center of the many-layered core of this supergiant star as an iron-nickel white dwarf. This iron-nickel white dwarf core has collapsed many times, making it small and hot. In brief, at this stage of the life of a high mass star, it has become a supergiant star with a many-layered core, and the center of that many-layered core of the supergiant star is an iron-nickel white dwarf supported by electron degeneracy pressure.

The iron-nickel white dwarf that comprises the center of the many-layered core of a red supergiant star is under such tremendous pressure that exotic nuclear reactions can occur. One such exotic nuclear reaction is called electron capture, where a proton devours an electron thus transmuting itself into a neutron and emitting a neutrino. This nuclear reaction is more properly written ^{1}_{1}H + e^– → ^{1}_{0}n + ν_e. Caution: in nuclear physics, the symbol ^{1}_{1}H is used for the hydrogen-1 nucleus, which is simply a proton. Also note that ^{1}_{0}n is the symbol of the neutron in nuclear physics, as we discussed. Also as we discussed, e^– is the symbol of the (ordinary) electron, and ν_e is the symbol of the neutrino. Neutrinos are extremely weakly interacting particles, as we also discussed. Hence, the neutrinos generated by this nuclear reaction simply fly out of the center of the many-layered core, passing through all the other layers of the core, flying through the outer layers of the red supergiant, and propagating into the surrounding outer space at nearly the speed of light. The iron-nickel white dwarf center was being supported by electron degeneracy pressure. If protons are devouring electrons, then the electron degeneracy pressure that was supporting the center of the many-layered core vanishes. The neutrons that were synthesized by this nuclear reaction go into free fall, since there is no pressure to balance self-gravity. According to Quantum Mechanics, neutrons obey the Pauli Exclusion Principle, just as electrons obey the Pauli Exclusion Principle. In other words, no two neutrons can occupy the same quantum state at the same time, and thus attempting to squeeze neutrons together results in a repulsion called neutron degeneracy pressure. This beautifully illustrates that degeneracy pressure has nothing to do with electromagnetic repulsion. Neutrons are neutral; they do not attract or repel each other electromagnetically. 
However, neutrons do repel each other through neutron degeneracy pressure if they are squeezed too close to each other. The neutrons therefore stop collapsing when neutron degeneracy pressure halts their collapse. It is not difficult to calculate that neutron degeneracy pressure halts the collapse when the iron-nickel white dwarf has collapsed from the size of the Earth (the white dwarf size scale) down to a radius of roughly ten kilometers. This is roughly the size of a city! This incredibly small and dense sphere of neutrons supported by neutron degeneracy pressure is called a neutron star. The existence of white dwarfs is already difficult to comprehend, since they have compressed roughly the mass of our Sun into roughly the size of the Earth, with a density roughly one million times normal densities. Now imagine compressing roughly the mass of our Sun into roughly the size of a city! The resulting density of a neutron star is hundreds of millions of times more dense than even a white dwarf, making a neutron star hundreds of trillions of times more dense than normal densities! These densities are fantastic, far beyond human comprehension. As a result of these fantastic densities, the gravity near a neutron star significantly warps the fabric of space and time around it, as we will discuss shortly in the context of Einstein’s theory of gravity, the General Theory of Relativity. Although the density of a neutron star is far beyond human imagination, its density is actually roughly equal to the density of every nucleus of every atom composing everything in the universe, including our own bodies. Therefore, we may regard a neutron star as an enormous atomic nucleus! The most massive atoms at the end of the Periodic Table of Elements have atomic masses of nearly three hundred, but far far far beyond those atoms are neutron stars with atomic masses of roughly one octillion nonillion or one septillion decillion! 
It is not difficult to calculate that the free-fall collapse of the iron-nickel white dwarf from roughly the size of the Earth to a neutron star roughly the size of a city occurs in roughly one millisecond, one-thousandth of one second! It is also not difficult to calculate the amount of energy liberated when roughly one solar mass collapses from roughly the size of the Earth to roughly the size of a city. The energy liberated is comparable to the total energy radiated by our Sun over its entire ten billion year lifetime! The resulting luminosity of this high mass star is in the billions of solar luminosities! This is roughly the total power output of an entire galaxy of stars! Such fantastic quantities of energy liberated in such an incredibly short amount of time is obviously a cataclysmic explosion. This is how high mass stars die; they obliterate themselves in a spectacularly violent explosion called a supernova. Strictly, this is a Type II (Roman numeral) supernova. We will discuss Type I (Roman numeral) supernovae later in the course. To summarize, high mass stars live short main-sequence lifetimes, swell to become red supergiants, and explode as Type II supernovae. The violence of this explosion throws the outer layers of the star away from the explosion at very high speeds and heats these gases to millions of kelvins of temperature. This rapidly expanding, hot gas is called a supernova remnant, which astrophysicists always abbreviate SNR. Beautiful examples of supernova remnants include the Crab Nebula in the constellation Taurus (the bull), the Tycho Nebula in the constellation Cassiopeia (the queen of Aethiopia), and the Kepler Nebula in the constellation Ophiuchus (the serpent bearer).
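
The density comparisons and the count of nucleons quoted above can be checked with rough arithmetic. The following is a minimal sketch, assuming round values not all stated in these notes: a solar mass of two nonillion kilograms, an Earth radius of roughly 6,400 kilometers for the white dwarf size scale, a neutron star radius of ten kilometers, and a nucleon mass of roughly 1.67e-27 kilograms. The function and constant names are hypothetical.

```python
import math

# Assumed round values (for illustration only).
M_SUN_KG = 2.0e30         # one solar mass, roughly two nonillion kilograms
R_EARTH_KM = 6.4e3        # white dwarf size scale (roughly the Earth's radius)
R_NEUTRON_STAR_KM = 10.0  # neutron star radius, roughly the size of a city
M_NUCLEON_KG = 1.67e-27   # mass of one proton or neutron


def density_ratio(r_large_km, r_small_km):
    """Factor by which density rises when a fixed mass shrinks in radius."""
    return (r_large_km / r_small_km) ** 3


def mean_density(mass_kg, radius_km):
    """Mean density in kilograms per cubic meter of a uniform sphere."""
    radius_m = radius_km * 1.0e3
    return mass_kg / ((4.0 / 3.0) * math.pi * radius_m ** 3)


ratio = density_ratio(R_EARTH_KM, R_NEUTRON_STAR_KM)
rho_ns = mean_density(M_SUN_KG, R_NEUTRON_STAR_KM)
nucleons = M_SUN_KG / M_NUCLEON_KG

print(f"neutron star / white dwarf density ratio: {ratio:.1e}")
print(f"neutron star mean density: {rho_ns:.1e} kg/m^3")
print(f"nucleons in one solar mass: {nucleons:.1e}")
```

With these assumed values, the density ratio comes out in the hundreds of millions, and the number of nucleons is of order ten to the fifty-seventh power, which is one octillion nonillion.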

The Type II supernova of a high mass star is so violent that all the nuclei across the entire Periodic Table of Elements are synthesized by this cataclysmic explosion. The nuclear reactions do not end at iron and nickel, roughly halfway through the Periodic Table of Elements. The nuclear reactions actually proceed all the way to the end of the Periodic Table of Elements, synthesizing even the most massive (heaviest) of all nuclei, such as uranium and plutonium. As we will discuss toward the end of the course, the universe was essentially pure hydrogen and helium when it was born in the fires of the Big Bang. If the universe was born pure hydrogen and helium, where did all the other atoms of the Periodic Table of Elements come from? Most stars are born low mass, and these low mass stars fuse hydrogen into helium. At best, they can fuse helium into carbon. However, high mass stars synthesize all the elements up to iron and nickel within their cores, and then synthesize all the elements through to the end of the Periodic Table of Elements within their violent Type II supernovae. The rapidly expanding supernova remnant throws all these nuclei into the surrounding outer space. As the supernova remnant expands, it becomes cooler and cooler and more and more diffuse (less and less dense). Eventually, the gases of the supernova remnant return to the interstellar medium, enriching or polluting the interstellar medium with these new nuclei. These enriched or polluted gases may someday form a new diffuse nebula from which new stars will be born, but now this diffuse nebula has been enriched or polluted with new nuclei. We now realize why we owe our very existence to high mass death. Our bodies are composed of these atoms, such as the iron in our blood, the sodium and potassium in our nerves, the calcium in our bones, and the oxygen that composes the water that makes up most of our bodies. 
All the terrestrial planets, including our own planet Earth, are also composed of these atoms, such as iron and nickel and silicon and oxygen, as we discussed earlier in the course. Without high mass stellar death, there would be no terrestrial planets and no life. If the universe was essentially pure hydrogen and helium when it was born in the fires of the Big Bang, then the first generation of stars born in the universe were essentially pure hydrogen and helium, and therefore they could not have had terrestrial planets orbiting them. At best, they had jovian, gas-giant planets orbiting them. The deaths of these first generation stars were essential to creating future generations of stars that could have terrestrial planets orbiting them and therefore the potential for life on some of these terrestrial planets, in particular our planet Earth.

We subdivide high mass main sequence stars into two subcategories: ordinary high mass stars and very high mass stars. Ordinary high mass stars have masses from 7M_☉, 8M_☉, or 9M_☉ (seven, eight, or nine solar masses) up to 20M_☉ to 25M_☉ (twenty to twenty-five solar masses). Very high mass stars have masses from 20M_☉ to 25M_☉ (twenty to twenty-five solar masses) all the way up to the Eddington limit of roughly 100M_☉ (one hundred solar masses). In terms of spectral types, we will regard ordinary high mass stars as B-type stars and very high mass stars as O-type stars. The stellar death we have discussed strictly applies to ordinary high mass stars. A very high mass star also lives a very short main-sequence lifetime and also swells to become a red supergiant with an iron-nickel white dwarf center surrounded by a many-layered core. The red supergiant also explodes with a Type II supernova, again initiated by electron capture in the core that emits neutrinos, which again throws outward a very hot and rapidly expanding supernova remnant. Very high mass death seems identical to ordinary high mass death, but the difference is as follows. Very high mass stars have such tremendous self-gravity that not even neutron degeneracy pressure can halt the core collapse. If neutron degeneracy pressure cannot halt the collapse of the core, then nothing can halt the collapse of the core. The core continues collapsing all the way down to a mathematical point. This is the ultimate triumph of gravity. This mathematical point is called a black hole, which we will discuss in more detail shortly in the context of Einstein’s theory of gravity, the General Theory of Relativity. Recall that the main sequence is a population-abundance sequence, with higher mass main sequence stars being less abundant (more rare) than lower mass main sequence stars, which are more abundant (more common). Thus, very high mass stars are more rare than ordinary high mass stars. 
Therefore, most Type II supernovae leave behind a hot supernova remnant rapidly expanding away from a neutron star. On rare occasions, a Type II supernova will leave behind a hot supernova remnant rapidly expanding away from a black hole. To summarize high mass death, the star spends a short time as a hydrogen-burning (main sequence) star, swells to become a red supergiant with an iron-nickel white dwarf center surrounded by a many-layered core, and explodes as a Type II supernova. The Type II supernova is triggered by electron capture in the core that emits neutrinos, collapses the core, and throws outward a hot supernova remnant that rapidly expands away from either a neutron star or a black hole. Notice that high mass death is similar to low mass death, just more violent. As we discussed, a low mass star spends a long time as a hydrogen-burning (main sequence) star, swells to become a red giant, and finally dies as a planetary nebula slowly expanding away from a white dwarf. The supernova remnant for high mass death is analogous to the planetary nebula for low mass death, and the neutron star or the black hole for high mass death is analogous to the white dwarf for low mass death. We can turn this logic completely around and claim that low mass death is similar to high mass death, just more gentle. The planetary nebula for low mass death is analogous to the supernova remnant for high mass death, and the white dwarf for low mass death is analogous to the neutron star or the black hole for high mass death.

Supernovae are rare, since only a tiny fraction of all stars are high mass stars that die in Type II supernova explosions. In a typical galaxy like our Milky Way Galaxy that is composed of roughly one hundred billion stars, there is only one supernova per century on average. If a supernova occurs in a typical galaxy roughly once every century (once every one hundred years), then if astronomers continuously observe one hundred galaxies, we should observe roughly one supernova every year on average. If astronomers continuously observe one thousand galaxies, we should observe roughly ten supernovae every year on average; this would be roughly once a month. If astronomers continuously observe ten thousand galaxies, we should observe roughly one hundred supernovae every year on average; this would be roughly twice a week. Over the past few decades, astronomers have used telescopes to continuously observe tens of thousands of galaxies. Thus, we observe several hundred supernovae every year; this is roughly once every day. However, these supernovae are in distant galaxies, far beyond our own Milky Way Galaxy. These supernovae cannot be observed with the naked eye; they can only be observed with very powerful telescopes. The procedure for discovering a supernova in a distant galaxy is as follows. We use a powerful telescope to photograph a galaxy night after night after night. One night, we see a point of light in the galaxy that is as bright as the entire galaxy. We conclude that one of the stars in that galaxy has suffered a supernova explosion. The point of light remains bright for a couple of weeks, and then it fades away over the next several months.
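
The arithmetic in this paragraph can be sketched as follows, assuming the average rate of one supernova per galaxy per century stated above. The function names are hypothetical, and the figure of thirty thousand galaxies is only an illustrative stand-in for "tens of thousands."

```python
# Assumed average from the notes: one supernova per galaxy per century.
YEARS_PER_SUPERNOVA_PER_GALAXY = 100
DAYS_PER_YEAR = 365.25


def expected_supernovae_per_year(n_galaxies):
    """Average number of supernovae seen per year when monitoring n galaxies."""
    return n_galaxies / YEARS_PER_SUPERNOVA_PER_GALAXY


def mean_days_between_supernovae(n_galaxies):
    """Average waiting time in days between observed supernovae."""
    return DAYS_PER_YEAR / expected_supernovae_per_year(n_galaxies)


# 30000 is an illustrative stand-in for "tens of thousands" of galaxies.
for n in (100, 1_000, 10_000, 30_000):
    per_year = expected_supernovae_per_year(n)
    gap = mean_days_between_supernovae(n)
    print(f"{n} galaxies: {per_year:.0f} per year, one every {gap:.1f} days")
```

Monitoring ten thousand galaxies gives one supernova every three to four days (roughly twice a week), and monitoring a few tens of thousands brings the average close to one per day.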

Our Sun will never suffer from a supernova, since our Sun is a low mass star. This is fortunate, since if our Sun were to suffer from a supernova, the explosion would obliterate our entire Solar System! There are no nearby high mass stars that may suffer a supernova that could harm us in any way, which stands to reason since high mass stars are rare. There are however some high mass stars close enough to be visible with the naked eye that have already entered the red supergiant phase, such as Betelgeuse in the constellation Orion (the hunter) and Antares in the constellation Scorpius (the scorpion). How would Betelgeuse or Antares appear in the sky if they were to suffer a supernova? A supernova has a luminosity of billions of solar luminosities. Hence, the star would appear to become billions of times brighter. This would be so bright that we could see the star in the daytime! The star would remain this bright for a couple weeks. Over the next several months, the star would remain fairly bright but would gradually fade in intensity. Within roughly one year, the star would vanish from our sky, forever changing the appearance of the constellation Orion (the hunter) or Scorpius (the scorpion), since a bright star in the constellation has now been forever erased from our sky! Again, this sequence of events would be visible to the naked eye, making nearby supernovae within our own Milky Way Galaxy exciting to observe. Over the past millennium (one thousand years), we have observed roughly one supernova per century within our own Milky Way Galaxy. Warning: astronomers have observed supernovae roughly once every day over the past few decades, but these are supernovae in distant galaxies that can only be observed with very powerful telescopes. 
Only a handful of naked-eye supernovae over the past millennium have been observed, including in April 1006, July 1054 creating the Crab Nebula, August 1181, November 1572 creating the Tycho Nebula, and October 1604 creating the Kepler Nebula. Note that the last supernova on this list, the most recent naked-eye supernova from within our Milky Way Galaxy, occurred more than four hundred years ago. If a supernova occurs in a typical galaxy roughly once per century, then we are long overdue for a naked-eye supernova from within our Milky Way Galaxy. We could almost guarantee that we will observe one or perhaps two or perhaps even three naked-eye supernovae from within our Milky Way Galaxy within our lifetimes. Frustratingly, the last naked-eye supernova from within our Milky Way Galaxy occurred before Galileo Galilei made his historic observations of the sky with his primitive telescope in the year 1609, as we discussed earlier in the course. Thus, the model we have presented of a Type II supernova being triggered by the core collapse and subsequent explosion of a high mass star remained an untested theoretical model for many years. This all changed in the historic year 1987. As we discussed, working at a neutrino detector is the most boring job in the world, since a neutrino detector only detects a single neutrino per day. However, on Monday, February 23, 1987, at 07:35:35 universal time, neutrino detectors around the world detected twenty-five neutrinos within a time span of less than thirteen seconds! Physicists all around the world had no idea how to explain this incredible burst of neutrinos. The source of these neutrinos was revealed a couple hours later, when astronomers witnessed the star named CPD-69 402 (also named GSC 09162-00821) violently explode, becoming extraordinarily more luminous. 
This star was not within our own Milky Way Galaxy however; this star was within a nearby galaxy called the Large Magellanic Cloud, nearly two hundred thousand light-years from our Solar System. It suddenly became clear what caused the neutrino burst. Nearly two hundred thousand years ago, the high mass star CPD-69 402 (GSC 09162-00821) swelled to become a supergiant star until electron capture was initiated in its iron-nickel white dwarf center, triggering a supernova explosion. Neutrinos flew out of the core, with the light from the explosion following right behind the neutrinos. Over the next nearly two hundred thousand years, the neutrinos propagated spherically outward, with the light from the explosion also propagating spherically outward. On the 23rd day of February in the historic year 1987, the neutrinos from this supernova passed through planet Earth, and neutrino detectors around the world detected twenty-five of them. A couple hours later, the light from the supernova arrived at the Earth, and this light was not only observed by astronomers through telescopes but was actually witnessed by humans (in the southern hemisphere) with the naked eye. This is the closest supernova to occur in roughly four hundred years. The name of this supernova is SN1987A, since the name of a supernova always begins with the letters SN (for supernova) followed by the year astronomers first observed the supernova followed by the letter of the English alphabet indicating numerically which observed supernova it was in that year. For example, the first supernova astronomers observed in the year 2017 was named SN2017A, the second supernova astronomers observed that same year was named SN2017B, the third supernova astronomers observed that same year was named SN2017C, and so on and so forth. There are only twenty-six letters in the English alphabet. 
Therefore, the twenty-seventh supernova astronomers observed in the year 2017 was named SN2017aa, the twenty-eighth supernova astronomers observed in that same year was named SN2017ab, the twenty-ninth supernova astronomers observed in that same year was named SN2017ac, and so on and so forth. Again, astronomers observe hundreds of supernovae every year from distant galaxies. However, SN1987A was the closest supernova observed in roughly four hundred years. This supernova was close enough and hence bright enough to be visible to the naked eye (but only from the southern hemisphere). This supernova provided strong evidence that our theoretical model of supernova explosions is correct. In summary, a Type II supernova is caused by a dying high mass star that swells to become a supergiant star. The nuclear reaction electron capture in the core triggers the Type II supernova. Neutrinos fly out of the core, the core collapses, and the energy of the collapse is liberated in a cataclysmic explosion with a brightness in the billions of solar luminosities. The final result of a Type II supernova is a very hot supernova remnant rapidly expanding away from either a neutron star or a black hole. The next time neutrino detectors around the world detect a burst of neutrinos, every astronomical telescope in the world will immediately point to supergiant stars such as Betelgeuse or Antares to witness the actual explosion of the supergiant star. Over the past few decades since SN1987A, astronomers have witnessed the formation of the supernova remnant that resulted from that supernova. Astrophysicists will continue to study SN1987A for many centuries, just as astrophysicists still continue to study the Crab Nebula for example, which resulted from a supernova observed in July 1054, almost one thousand years ago. By making many observations over several decades, astronomers have measured the growing size of several supernova remnants. 
We can calculate the speed with which the supernova remnant expands from these observations, and we can then extrapolate backwards to calculate how long ago the supernova occurred. In the cases where astronomers from previous centuries actually witnessed the supernova occur, our extrapolated date of the supernova is always roughly equal to the date that was recorded by astronomers centuries ago.
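
The backward extrapolation described above can be sketched as a short calculation. This is only a minimal sketch, assuming the remnant has expanded at a constant speed ever since the explosion (real remnants change speed as they interact with the surrounding gas). The function name remnant_age_years and the input values are hypothetical round numbers chosen for illustration; they are not measurements of any particular supernova remnant.

```python
# Illustrative sketch: estimate how long ago a supernova occurred from the
# present radius and expansion speed of its remnant.
# Assumption: the expansion speed has been constant since the explosion.

SECONDS_PER_YEAR = 3.156e7      # approximate number of seconds in one year
KM_PER_LIGHT_YEAR = 9.461e12    # approximate number of kilometers in one light-year


def remnant_age_years(radius_ly: float, speed_km_s: float) -> float:
    """Return the time (in years) needed to reach radius_ly at speed_km_s."""
    if radius_ly <= 0 or speed_km_s <= 0:
        raise ValueError("radius and speed must both be positive")
    radius_km = radius_ly * KM_PER_LIGHT_YEAR
    age_seconds = radius_km / speed_km_s
    return age_seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical remnant: radius of 5 light-years, expanding at 1500 km/s.
    age = remnant_age_years(radius_ly=5.0, speed_km_s=1500.0)
    print(f"Estimated time since the explosion: {age:.0f} years")
```

With these hypothetical inputs the estimate comes out near one thousand years. In practice, astronomers compare such an estimate with the date recorded by historical observers, as described above.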

When an ordinary high mass star suffers a supernova explosion, the iron-nickel white dwarf core collapses to a neutron star, as we discussed. By the Law of Conservation of Angular Momentum, the collapsing core must spin faster. Since the iron-nickel white dwarf roughly the size of the Earth collapses to a neutron star roughly the size of a city, the rotational speed of the neutron star after the collapse is hundreds of thousands of times faster than the rotational speed of the iron-nickel white dwarf from which it collapsed! If the iron-nickel white dwarf rotated once a day for example, the neutron star that formed from it must rotate once in less than one second! Furthermore, the magnetic field lines of the star are pulled with the collapsing core. Hence, the magnetic field lines tighten, increasing the strength of the magnetic field by a tremendous amount. The magnetic field at the surface of a neutron star can be trillions of times stronger than the Earth’s magnetic field! It would be improbable for the magnetic poles of the neutron star to precisely coincide with its rotational poles, just as the Earth’s magnetic poles do not coincide with its own rotational poles, as we discussed earlier in the course. Hence, as the neutron star rotates, its magnetic axis precesses around its rotational axis. The incredibly strong magnetic field that precesses at the incredibly fast rotational speed causes electromagnetic waves to be radiated away from the neutron star, and these radiated electromagnetic waves also rotate with the neutron star. If the precessing magnetic axis of the neutron star happens to direct these emissions in our general direction, we will observe regular pulses of electromagnetic waves as the neutron star rotates, rather like the rotating light from a lighthouse. These neutron stars are called pulsars. 
The first pulsars ever discovered were the Crab Pulsar at the center of the Crab Supernova Remnant in the constellation Taurus (the bull) and the Vela Pulsar at the center of the Vela Supernova Remnant in the constellation Vela (the sails). The discovery of these and other pulsars at the center of supernova remnants provides further strong evidence that our theories of supernova explosions are correct. Presumably, all neutron stars are born as pulsars, but the continuous emission of electromagnetic waves carries energy and angular momentum away from the pulsar, thus slowing the rotation of the neutron star. Eventually, the neutron star would no longer emit pulses. Astronomers have measured the gradual rotational slowing of several pulsars to be a few microseconds per year, and several non-pulsar neutron stars have been discovered. Note however that some of these non-pulsar neutron stars may in fact be pulsars. If a neutron star happens to have an axis of rotation that precesses its magnetic axis to radiate pulses that do not happen to be emitted in our general direction, then we would not observe its pulses. Hence, we would incorrectly conclude that the pulsar neutron star is instead a non-pulsar neutron star, and it would be virtually impossible for us to discover that this particular neutron star is in fact a pulsar. Neutron stars can also have their rotational speed increased. As we will discuss shortly, gases may fall toward a neutron star, and these gases may add angular momentum to the neutron star, speeding up its rotation. These are called millisecond pulsars, since they rotate once in only a few milliseconds! These millisecond pulsars are also called recycled pulsars, since they were at first rotationally slowing from a pulsar neutron star to a non-pulsar neutron star, but the additional angular momentum gave the neutron star a second life as a pulsar.
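
The spin-up of the collapsing core discussed above follows from the conservation of angular momentum, and the numbers can be checked with a short calculation. This is a minimal sketch assuming the core behaves like a uniform sphere that loses no mass during the collapse, so that its rotation rate scales as one over the square of its radius. The radii used (6400 kilometers for an Earth-sized core and 10 kilometers for a city-sized neutron star) are assumed round numbers, and the function names are hypothetical.

```python
# Illustrative sketch: spin-up of a collapsing core from conservation of
# angular momentum. For a uniform sphere of fixed mass, angular momentum is
# proportional to (radius squared) x (rotation rate), so the rotation rate
# grows by the factor (initial radius / final radius) squared.

SECONDS_PER_DAY = 86400.0


def spin_up_factor(initial_radius_km: float, final_radius_km: float) -> float:
    """Factor by which the rotation rate increases during the collapse."""
    return (initial_radius_km / final_radius_km) ** 2


def final_period_seconds(initial_period_s: float,
                         initial_radius_km: float,
                         final_radius_km: float) -> float:
    """Rotation period after the collapse, in seconds."""
    return initial_period_s / spin_up_factor(initial_radius_km, final_radius_km)


if __name__ == "__main__":
    factor = spin_up_factor(6400.0, 10.0)            # assumed radii
    period = final_period_seconds(SECONDS_PER_DAY, 6400.0, 10.0)
    print(f"Rotation rate increases by a factor of about {factor:,.0f}")
    print(f"A one-day rotation becomes a rotation of about {period:.2f} seconds")
```

With these assumed radii, the rotation rate increases by a factor of roughly four hundred thousand, and a core that rotated once a day ends up rotating once in roughly one-fifth of a second, consistent with the discussion above.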

As we discussed, the cutoff between low mass main sequence stars and high mass main sequence stars is 7M_☉, 8M_☉, or 9M_☉ (seven, eight, or nine solar masses). Many students demand to know the exact cutoff: is it 7M_☉ (seven solar masses), 8M_☉ (eight solar masses), or 9M_☉ (nine solar masses)? Unfortunately, we cannot specify this cutoff more precisely, since there is uncertainty in our theoretical calculations. The Type II supernova of a high mass star is triggered by the failure of electron degeneracy pressure to support the white dwarf core of the supergiant. Therefore, we might be able to specify an exact cutoff between a low mass star and a high mass star by calculating the maximum mass electron degeneracy pressure is able to support. Caution: this would be the cutoff mass for only the core of the star, not the cutoff mass for the entire star. The maximum mass that electron degeneracy pressure is able to support is called the Chandrasekhar limit, named for the Indian astrophysicist Subrahmanyan Chandrasekhar who first performed this calculation. The Chandra observatory, NASA’s great X-ray space telescope as we discussed earlier in the course, is also named for this astrophysicist. The Chandrasekhar limit is equal to 1.4M_☉ (1.4 solar masses or 1.4 times the mass of our Sun). This is the maximum mass that electron degeneracy pressure is able to support. Therefore, this is the core-mass cutoff between a low mass star and a high mass star. Again, this is the cutoff mass for the core only; the cutoff mass for the entire star is 7M_☉, 8M_☉, or 9M_☉ (seven, eight, or nine solar masses). In other words, a star with a total mass of 7M_☉, 8M_☉, or 9M_☉ (seven, eight, or nine solar masses) has a core mass equal to 1.4M_☉ (1.4 solar masses). If the mass of the entire star is less than 7M_☉, 8M_☉, or 9M_☉ (seven, eight, or nine solar masses), then the mass of its core is less than 1.4M_☉ (1.4 solar masses). 
In this case, electron degeneracy pressure will be able to support the core. Therefore, the star must be a low mass star, and it will die gently as a slowly expanding planetary nebula surrounding a white dwarf. If the mass of the entire star is greater than 7M_☉, 8M_☉, or 9M_☉ (seven, eight, or nine solar masses), then the mass of its core is greater than 1.4M_☉ (1.4 solar masses). In this case, electron degeneracy pressure will not be able to support the core. Therefore, the star must be a high mass star, and it will die violently as a Type II supernova resulting in a very hot supernova remnant rapidly expanding away from a neutron star that is supported by neutron degeneracy pressure. Since the Chandrasekhar limit is the maximum mass that electron degeneracy pressure is able to support, it is not only the core-mass cutoff between low mass stars and high mass stars. The Chandrasekhar limit is also the maximum possible mass of a white dwarf. This has been verified by observations. No white dwarf has ever been discovered with a mass greater than the Chandrasekhar limit of 1.4M_☉ (1.4 solar masses). The brightest star in the constellation Canis Major (the big dog) is Sirius the dog star, as we discussed earlier in the course. Sirius is actually a binary star, and the two stars are named Sirius A and Sirius B. Whereas Sirius A is a main sequence star, Sirius B is a white dwarf, the closest white dwarf to our Solar System and one of the first white dwarfs ever discovered. The mass of the Sirius B white dwarf is roughly 1.0M_☉ (1.0 solar masses), less than the 1.4M_☉ (1.4 solar mass) Chandrasekhar limit.

As we discussed, the cutoff between ordinary high mass main sequence stars and very high mass main sequence stars is 20M_☉ to 25M_☉ (twenty to twenty-five solar masses). Many students demand to know the exact cutoff: is it 20M_☉, 21M_☉, 22M_☉, 23M_☉, 24M_☉, or 25M_☉? Unfortunately, we cannot specify this cutoff more precisely, since there is uncertainty in our theoretical calculations. The formation of a black hole results from the failure of neutron degeneracy pressure to support the core. Therefore, we might be able to specify an exact cutoff between an ordinary high mass star and a very high mass star by calculating the maximum mass neutron degeneracy pressure is able to support. Caution: this would be the cutoff mass for only the core of the star, not the cutoff mass for the entire star. The maximum mass that neutron degeneracy pressure is able to support is called the Tolman-Oppenheimer-Volkoff limit or the TOV limit for short, named for the three physicists who first attempted this calculation: American physicist Richard Tolman, American physicist J. Robert Oppenheimer, and Russian-born Canadian physicist George Volkoff. The American physicist J. Robert Oppenheimer is most famous for being the father of the nuclear bomb, since he was the head physicist of the secret Manhattan Project during the Second World War. We are not certain of the precise value of the Tolman-Oppenheimer-Volkoff limit. Although these three physicists (and other physicists) have attempted this calculation, neutron stars have such fantastically high densities that the precise properties of the state of matter within neutron stars are unknown. Presumably, the outer layers of the neutron star are composed of neutrons; this is called the crust of the neutron star. However, the interior of a neutron star is under such incredible pressures that the quarks and gluons (which compose both protons and neutrons) are squeezed out of the neutrons. 
Hence, we no longer have individual neutrons toward the center of a neutron star. The core of a neutron star is actually composed of a fantastically dense soup of quarks and gluons all colliding with each other. This new state of matter at the core of a neutron star is called a quark-gluon plasma, about which we know very little. Therefore, calculating the exact value of the Tolman-Oppenheimer-Volkoff limit remains elusive. Nevertheless, theoretical estimates have revealed its approximate value. The Tolman-Oppenheimer-Volkoff limit is very roughly equal to 2.4M_☉ (2.4 solar masses), and it is definitely less than 3M_☉ (three solar masses). Although we do not know the precise value of the Tolman-Oppenheimer-Volkoff limit, for the purposes of this discussion we will use roughly 2.4M_☉ (2.4 solar masses). This is the maximum mass that neutron degeneracy pressure is able to support. Therefore, this is the core-mass cutoff between an ordinary high mass star and a very high mass star. Again, this is the cutoff mass for the core only; the cutoff mass for the entire star is 20M_☉ to 25M_☉ (twenty to twenty-five solar masses). In other words, a star with a total mass of 20M_☉ to 25M_☉ (twenty to twenty-five solar masses) has a core mass equal to roughly 2.4M_☉ (2.4 solar masses). To summarize, if the mass of the entire star is less than 7M_☉, 8M_☉, or 9M_☉ (seven, eight, or nine solar masses) and of course greater than the lower limit of 0.08M_☉ (0.08 solar masses) of all main sequence stars, then the mass of its core is less than 1.4M_☉ (1.4 solar masses). In this case, electron degeneracy pressure will be able to support the core. Therefore, the star must be a low mass star, and it will die gently as a slowly expanding planetary nebula surrounding a white dwarf. 
If the mass of the entire star is greater than 7M_☉, 8M_☉, or 9M_☉ (seven, eight, or nine solar masses) but less than 20M_☉ to 25M_☉ (twenty to twenty-five solar masses), then the mass of its core is greater than 1.4M_☉ (1.4 solar masses) but less than roughly 2.4M_☉ (2.4 solar masses). In this case, electron degeneracy pressure will not be able to support the core, but neutron degeneracy pressure will be able to support the core. Therefore, the star must be an ordinary high mass star, and it will die violently as a Type II supernova resulting in a very hot supernova remnant rapidly expanding away from a neutron star that is supported by neutron degeneracy pressure. If the mass of the entire star is greater than 20M_☉ to 25M_☉ (twenty to twenty-five solar masses) and of course less than the Eddington limit of roughly 100M_☉ (one hundred solar masses) of all main sequence stars, then the mass of its core is greater than roughly 2.4M_☉ (2.4 solar masses) and probably less than roughly 10M_☉ (ten solar masses), the approximate core mass of the most massive stars at the Eddington limit. In this case, not even neutron degeneracy pressure is able to support the core. Therefore, the star must be a very high mass star, and it will die violently as a Type II supernova resulting in a very hot supernova remnant rapidly expanding away from a black hole. As we discussed, since the Chandrasekhar limit is the maximum mass that electron degeneracy pressure is able to support, it is not only the core-mass cutoff between low mass stars and ordinary high mass stars; the Chandrasekhar limit is the maximum possible mass of a white dwarf. We also now realize that this Chandrasekhar limit is also the minimum mass of a neutron star. 
Since the Tolman-Oppenheimer-Volkoff limit is the maximum mass that neutron degeneracy pressure is able to support, it is not only the core-mass cutoff between ordinary high mass stars and very high mass stars; the Tolman-Oppenheimer-Volkoff limit is the maximum possible mass of a neutron star. It is also the minimum mass of a black hole. In conclusion, the mass of a white dwarf must be less than the 1.4M_☉ Chandrasekhar limit, the mass of a neutron star must be greater than the 1.4M_☉ Chandrasekhar limit but less than the roughly 2.4M_☉ Tolman-Oppenheimer-Volkoff limit, and the mass of a black hole must be greater than the roughly 2.4M_☉ Tolman-Oppenheimer-Volkoff limit but less than the 10M_☉ rough estimate for the core mass of the most massive stars at the Eddington limit. Caution: we will discuss shortly that since nothing can escape from a black hole, a black hole will gain more and more mass over time. After billions of years, a black hole may attain a mass of millions and even billions of solar masses. These are called supermassive black holes, which we will discuss later in the course. We will use the term stellar black holes for black holes recently born from the Type II supernova of a very high mass star, and some stellar black holes grow over billions of years to become supermassive black holes. We will also discuss toward the end of the course that there may be microscopic black holes in our universe. These microscopic black holes are also called primordial black holes, since they were born in the fires of the Big Bang, the creation of the entire universe.
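
The mass cutoffs summarized above can be collected into a short classification sketch. The two core-mass limits are the values used in this discussion (1.4 solar masses and very roughly 2.4 solar masses). The total-mass cutoffs are uncertain, as discussed above; the default values of 8 and 22.5 solar masses in the second function are assumptions chosen from within the quoted ranges purely for illustration. The function names are hypothetical.

```python
# Illustrative sketch: classify the end state of a star from the mass limits
# discussed in these notes. All masses are in solar masses.

CHANDRASEKHAR_LIMIT = 1.4   # maximum mass electron degeneracy pressure supports
TOV_LIMIT = 2.4             # very rough value; the precise limit is uncertain


def remnant_from_core_mass(core_mass: float) -> str:
    """Return the type of compact object for a given core (remnant) mass."""
    if core_mass <= 0:
        raise ValueError("core mass must be positive")
    if core_mass < CHANDRASEKHAR_LIMIT:
        return "white dwarf"
    if core_mass < TOV_LIMIT:
        return "neutron star"
    return "black hole"


def death_from_total_mass(total_mass: float,
                          low_high_cutoff: float = 8.0,          # assumed, within 7 to 9
                          high_very_high_cutoff: float = 22.5    # assumed, within 20 to 25
                          ) -> str:
    """Return how a main sequence star of the given total mass dies."""
    if not 0.08 <= total_mass <= 100.0:
        raise ValueError("main sequence stars range from 0.08 to roughly 100 solar masses")
    if total_mass < low_high_cutoff:
        return "planetary nebula surrounding a white dwarf"
    if total_mass < high_very_high_cutoff:
        return "Type II supernova leaving a neutron star"
    return "Type II supernova leaving a black hole"


if __name__ == "__main__":
    print(remnant_from_core_mass(1.0))    # a mass like that of Sirius B
    print(death_from_total_mass(1.0))     # a star like our Sun
    print(death_from_total_mass(15.0))
    print(death_from_total_mass(40.0))
```

Because the cutoffs are passed as parameters, the same sketch can be rerun with 7 or 9 solar masses, or with 20 or 25 solar masses, to see which stars fall into the uncertain ranges.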

At normal densities, solids and liquids are highly incompressible, resulting in solids and liquids having roughly constant densities. In other words, at normal densities the volume of solids and liquids is directly proportional to their mass, meaning more mass will occupy a proportionally larger volume. For example, twice as much metal or twice as much rock or twice as much liquid water will all occupy twice as much volume. However, white dwarfs and neutron stars are both supported by degeneracy pressure. Therefore, more massive white dwarfs and more massive neutron stars must in fact have smaller volumes to provide greater pressure to balance the significantly stronger self-gravity from their higher mass. A particular white dwarf or a particular neutron star will actually shrink in volume (shrink in size) if it happens to gain mass, as we will discuss shortly.
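
The contrast between ordinary matter and degenerate matter can be made concrete with a simple scaling sketch. For ordinary solids and liquids at constant density, the radius of a sphere grows as the cube root of its mass. For a white dwarf, we assume here the standard textbook scaling for non-relativistic degenerate matter, in which the radius is inversely proportional to the cube root of the mass. This scaling is an assumption brought in for illustration; it is not derived in these notes, and it becomes inaccurate as the mass approaches the Chandrasekhar limit. The function names are hypothetical.

```python
# Illustrative sketch: how the radius changes when the mass changes, for
# ordinary matter versus degenerate matter.
# Assumption: degenerate radius scales as mass to the power -1/3
# (non-relativistic approximation, inaccurate near the Chandrasekhar limit).


def ordinary_radius_ratio(mass_ratio: float) -> float:
    """Radius ratio for constant density: volume is proportional to mass."""
    return mass_ratio ** (1.0 / 3.0)


def degenerate_radius_ratio(mass_ratio: float) -> float:
    """Radius ratio for degenerate matter under the assumed scaling."""
    return mass_ratio ** (-1.0 / 3.0)


if __name__ == "__main__":
    for ratio in (1.0, 1.5, 2.0):
        print(f"mass x {ratio:.1f}: "
              f"ordinary radius x {ordinary_radius_ratio(ratio):.3f}, "
              f"degenerate radius x {degenerate_radius_ratio(ratio):.3f}")
```

Under this assumed scaling, doubling the mass of a ball of rock increases its radius by roughly twenty-six percent, while doubling the mass of a white dwarf decreases its radius by roughly twenty-one percent.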

Stars are born from within a diffuse nebula, a giant cloud of gas many light-years across composed primarily of hydrogen and helium. Since a diffuse nebula is so enormous, many stars are born within a diffuse nebula simultaneously. Therefore, stars are born in clusters. However, most stars do not remain in clusters indefinitely. After a star cluster is born from a diffuse nebula, the individual stars move apart from one another, moving along their own trajectories through our Milky Way Galaxy. Therefore, most stars are not members of star clusters. For example, our Sun is not presently a member of a star cluster, although it was presumably born a member of an ancient star cluster that has long since dispersed. Although most stars are not members of star clusters, it is the study of star clusters that has truly revealed that our models of stellar evolution are correct. In our discussion of star clusters, we will see the power of the Hertzsprung-Russell diagram in explaining and predicting stellar properties and stellar evolution.

When we construct the Hertzsprung-Russell diagram for a star cluster, we can see the main sequence, the red giants, and the white dwarfs on the diagram. Astronomers often abbreviate the main sequence MS. The red giants appear along the asymptotic giant branch, which astronomers often abbreviate AGB. The asymptotic giant branch connects the main sequence with another collection of stars called the horizontal branch, which astronomers often abbreviate HB. The horizontal branch connects the asymptotic giant branch with another grouping of stars called the clump. In almost every Hertzsprung-Russell diagram for almost every star cluster, the early part of the main sequence is missing. This confirms that high mass main sequence stars live shorter lifetimes than low mass main sequence stars, which live longer lifetimes. The star cluster is sufficiently old that the stars from the missing early part of the main sequence have already died, since they live short lifetimes. However, the star cluster is not sufficiently old for the stars in the late part of the main sequence to have died as of yet. These stars are still hydrogen-burning main sequence stars, since they have longer lifetimes. The earliest main sequence star in the Hertzsprung-Russell diagram for a star cluster is called the main-sequence turnoff, since it connects the main sequence to the asymptotic giant branch. In other words, the hottest, most luminous, largest, and most massive (earliest) main sequence star in the Hertzsprung-Russell diagram for a star cluster is at the main-sequence turnoff. The main-sequence turnoff reveals the age of the star cluster. If the main-sequence turnoff is early, then the star cluster must be young, since there are still short-lifetime main sequence stars that have not yet evolved into red giants. If the main-sequence turnoff is late, then the star cluster must be old, since there are only long-lifetime stars remaining on the main sequence. 
For example, if the main-sequence turnoff is in the spectral type B, then the star cluster must be roughly ten million years old, since the main-sequence lifetime of a B-type star is roughly ten million years. As another example, if the main-sequence turnoff is in the spectral type F, then the star cluster must be roughly one billion years old, since the main-sequence lifetime of an F-type star is roughly one billion years. The main-sequence lifetime of a G-type star like our Sun is roughly ten billion years. No star cluster has ever been discovered with a main-sequence turnoff later than the G spectral type. This is one way we know the age of the entire universe. The universe cannot be much older than ten billion years since we have never discovered a star cluster with a main-sequence turnoff later than spectral type G. At the very end of this course, we will discuss that the age of the universe is more precisely fourteen billion years, which we have determined from the expansion of the entire universe. Notice these two different methods of determining the age of the universe are fairly consistent with each other. Since the asymptotic giant branch connects with the main sequence at the main-sequence turnoff, the red giants along the asymptotic giant branch must be expanding to become red giants after ending their main-sequence lifetimes. The stars near the beginning of the asymptotic giant branch are orange subgiant stars; they have only recently left the main sequence. The stars suffering from the helium flash are at the end of the asymptotic giant branch, where the asymptotic giant branch connects with the horizontal branch. We also find Cepheid variable stars along the asymptotic giant branch, since Cepheid variable stars suffer from the instability of transitioning from a main sequence star to a red giant star. We will discuss Cepheid variable stars later in the course. 
The stars along the horizontal branch are in the process of attaining gravitational equilibrium from the new pressure provided by the helium fusion in the stellar core. We also find RR Lyrae variable stars along the horizontal branch, since RR Lyrae variable stars suffer from the instability of transitioning from a red giant star to a helium-burning star. We will discuss RR Lyrae variable stars later in the course. The clump is the collection of helium-burning stars that have attained gravitational equilibrium. There is often a second asymptotic giant branch connected to the clump. The stars along this second asymptotic giant branch have exhausted the helium in their cores. The carbon core collapses, while the outer layers of the star expand. Hence, these stars have become red giants for the second time. We find Mira variable stars along the second asymptotic giant branch, since Mira variable stars suffer from the instability of transitioning from a helium-burning star to a red giant star. We will discuss Mira variable stars later in the course. Astronomers informally refer to the upper-middle part of the Hertzsprung-Russell diagram as the instability strip, since we find Cepheid variable stars, RR Lyrae variable stars, Mira variable stars, and even T Tauri variable stars on that part of the diagram. Eventually, the slowly expanding outer layers of the star will divorce themselves from the core. The slowly expanding outer layers will become a planetary nebula, while the naked core will become a white dwarf. Indeed, we see white dwarfs in the Hertzsprung-Russell diagram for star clusters. If we plot the stars of a newly born star cluster on the Hertzsprung-Russell diagram, we would see the entire main sequence with no red giants and no white dwarfs, since a newly born cluster has not had time for any main sequence stars to die. 
If we could wait millions of years as the stars within this newly born star cluster evolve and if we could plot these stars accordingly on the Hertzsprung-Russell diagram, we would see the early-type main sequence stars become red supergiant stars and then disappear from the Hertzsprung-Russell diagram as they live their short lifetimes and explode as Type II supernovae. As a result, the main-sequence turnoff would appear to advance from spectral type O to spectral type B, thus shrinking the appearance of the main sequence on this Hertzsprung-Russell diagram. As the main-sequence turnoff advances to spectral type A, we would see these main sequence stars evolve into orange subgiant stars and then into red giant stars, forming the first asymptotic giant branch within the instability strip. When these stars eventually suffer from the helium flash, we would then see the formation of the horizontal branch within the instability strip, ultimately forming the clump on the Hertzsprung-Russell diagram. When these stars exhaust the helium in their cores, we would then see the formation of the second asymptotic giant branch within the instability strip. We would then see white dwarfs begin to appear on the Hertzsprung-Russell diagram. If we could wait billions of more years and if we could continue to plot these stars accordingly on the Hertzsprung-Russell diagram, we would see the main-sequence turnoff continue to advance from spectral type A to spectral type F to spectral type G as more and more main sequence stars begin the process of stellar death, thus further shrinking the main sequence on the Hertzsprung-Russell diagram. We would continue to see stars move from the main sequence toward and along the first asymptotic giant branch within the instability strip, along the horizontal branch within the instability strip, through the clump, and along the second asymptotic giant branch within the instability strip. 
We would also see more and more white dwarfs appear on this Hertzsprung-Russell diagram as these stars reach the very end of their evolution.
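
The use of the main-sequence turnoff as a clock can be sketched as a simple lookup. The lifetimes in the table below are the rough values quoted in this discussion for spectral types B, F, and G; lifetimes for the other spectral types are not given here, so the sketch deliberately refuses to guess them. The function names and the sample cluster are hypothetical.

```python
# Illustrative sketch: estimate the age of a star cluster from the earliest
# spectral type still present on its main sequence (the turnoff).

SPECTRAL_ORDER = "OBAFGKM"   # earliest (hottest) to latest (coolest)

# Rough main-sequence lifetimes quoted in these notes, in years.
MAIN_SEQUENCE_LIFETIME_YEARS = {"B": 1e7, "F": 1e9, "G": 1e10}


def turnoff_type(main_sequence_types):
    """Return the earliest spectral type among the cluster's main sequence stars."""
    if not main_sequence_types:
        raise ValueError("the cluster has no main sequence stars listed")
    return min(main_sequence_types, key=SPECTRAL_ORDER.index)


def cluster_age_years(main_sequence_types):
    """Rough cluster age: the main-sequence lifetime of the turnoff type."""
    turnoff = turnoff_type(main_sequence_types)
    if turnoff not in MAIN_SEQUENCE_LIFETIME_YEARS:
        raise KeyError(f"no lifetime quoted in these notes for type {turnoff}")
    return MAIN_SEQUENCE_LIFETIME_YEARS[turnoff]


if __name__ == "__main__":
    # Hypothetical cluster whose main sequence contains only F, G, K, and M stars.
    cluster = ["K", "F", "M", "G", "G", "K"]
    print(f"Turnoff type: {turnoff_type(cluster)}")
    print(f"Approximate age: {cluster_age_years(cluster):.0e} years")
```

For this hypothetical cluster the turnoff is at spectral type F, so the cluster is roughly one billion years old.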

The Hertzsprung-Russell diagram for a star cluster can be used to determine the distance to the cluster. Suppose a star cluster is significantly beyond the solar neighborhood. Therefore, the star cluster is too distant for parallax to be used to determine its distance. Hence, we need another method to determine the distance to this star cluster. The procedure to determine the distance to this cluster is as follows. First, we construct the Hertzsprung-Russell diagram for the star cluster. At this suggestion, we should all protest. The vertical axis of the Hertzsprung-Russell diagram is the luminosity or the absolute magnitude or the intrinsic brightness, and we must have the distances to stars to determine their luminosities or absolute magnitudes or intrinsic brightnesses. Suppose we use the apparent magnitude as the vertical axis instead of the absolute magnitude. At this suggestion, we should all protest even more strongly. The apparent magnitude or the apparent brightness of a star depends upon its distance from us; the apparent magnitude is not an intrinsic property of a star! Here is the crux of the argument. The star cluster is distant enough for all of the stars within the cluster to be roughly the same distance from us; therefore, all of their apparent brightnesses are directly related to their intrinsic brightnesses. A concrete example will make this clear. Suppose we observe two stars in the sky. We will name these two stars Star Alpha and Star Beta. Suppose Star Alpha appears to be brighter than Star Beta; that is, suppose Star Beta appears to be dimmer than Star Alpha. We cannot draw any conclusion about the intrinsic brightness or the luminosity of these two stars without knowing the distance of each star from us. Star Alpha could be intrinsically brighter than Star Beta, but Star Beta might in fact be intrinsically brighter than Star Alpha. 
In this case, Star Beta only appears dimmer since it is further from us, and Star Alpha only appears brighter since it is closer to us. However, now suppose Star Alpha appears to be brighter than Star Beta, and in addition suppose we have determined using whatever method that both stars are the same distance from us. We can now be certain that Star Alpha is indeed intrinsically brighter than Star Beta; that is, we can be certain that Star Beta is intrinsically dimmer than Star Alpha. Again, without knowing distances, we cannot draw any conclusions, but if we happen to know that two stars are the same distance from us, then the apparently brighter star is indeed intrinsically brighter and the apparently dimmer star is indeed intrinsically dimmer. If a star cluster is distant enough, which we are certain is the case if parallax angles are too small to measure, then all the stars within the cluster are roughly the same distance from us. Of course, the stars in front of the cluster are somewhat closer to us; of course, the stars in the back of the cluster are somewhat further from us. Nevertheless, these are small variations if the entire star cluster is distant enough from us. If all of the stars in the star cluster are roughly the same distance from us, then the stars that appear to be brighter truly are more luminous, and the stars that appear to be dimmer truly are less luminous. Therefore, we can construct the Hertzsprung-Russell diagram for a distant star cluster using the apparent magnitude instead of the absolute magnitude as the vertical axis. After constructing the Hertzsprung-Russell diagram, we should see the main sequence on the diagram, among other features such as the asymptotic giant branch, the horizontal branch, and the clump. We already know the absolute magnitudes of main sequence stars as a function of their spectral types from studying nearby stars within the solar neighborhood. 
Thus, we assign these absolute magnitudes to the corresponding main sequence stars we see on the Hertzsprung-Russell diagram for the star cluster. Essentially, we are sliding the star cluster’s entire Hertzsprung-Russell diagram vertically (up and down) until all main sequence stars on the diagram attain their appropriate absolute magnitudes. Now that we have both the absolute magnitudes and the apparent magnitudes of the main sequence stars in the cluster, the only unknown remaining in the equation I = ℒ / (4πr²) is the distance r, meaning that we have successfully determined the distance to the star cluster. This procedure is called the main sequence fitting method, since we are determining the distance to the cluster by combining the apparent magnitudes of the main sequence stars with their established absolute magnitudes from nearby main sequence stars in the solar neighborhood. The main sequence fitting method is the next major rung of the Cosmological Distance Ladder above the parallax method, which is the lowest rung of the Cosmological Distance Ladder.
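
The main sequence fitting method described above can be sketched numerically. Expressed in magnitudes, the inverse-square law becomes the standard distance modulus relation: apparent magnitude minus absolute magnitude equals five times the base-ten logarithm of the distance in parsecs divided by ten parsecs. The sketch below finds the vertical shift that slides the cluster's main sequence onto the calibrated main sequence and converts that shift into a distance. The calibrated absolute magnitudes and the cluster measurements are hypothetical placeholder numbers, not real data, and the sketch assumes no dimming of starlight by interstellar dust.

```python
# Illustrative sketch of main sequence fitting using the distance modulus:
#     m - M = 5 * log10(d / 10 parsecs)
# Assumption: no dimming by interstellar dust. All data are hypothetical.

from statistics import mean

# Hypothetical calibration from nearby stars: spectral type -> absolute magnitude.
CALIBRATED_ABSOLUTE_MAGNITUDE = {"A0": 1.0, "F0": 3.0, "G0": 4.5, "K0": 6.0}

# Hypothetical cluster main sequence stars: (spectral type, apparent magnitude).
CLUSTER_STARS = [("A0", 11.1), ("F0", 12.9), ("G0", 14.5), ("K0", 16.0)]


def vertical_shift(cluster_stars, calibration):
    """Average of (apparent - absolute) magnitude over the main sequence stars."""
    return mean(m - calibration[spectral_type] for spectral_type, m in cluster_stars)


def distance_parsecs(distance_modulus: float) -> float:
    """Invert the distance modulus relation to obtain the distance in parsecs."""
    return 10.0 ** ((distance_modulus + 5.0) / 5.0)


if __name__ == "__main__":
    shift = vertical_shift(CLUSTER_STARS, CALIBRATED_ABSOLUTE_MAGNITUDE)
    print(f"Vertical shift (distance modulus): {shift:.2f} magnitudes")
    print(f"Distance to the cluster: {distance_parsecs(shift):.0f} parsecs")
```

Averaging over many main sequence stars, rather than relying on a single star, reduces the effect of measurement scatter and of the small differences in distance between the front and the back of the cluster.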

As we discussed, most star systems are binary star systems. This is reason enough to devote some discussion to binary star systems. Most binary star systems are detached binaries, meaning that the two stars orbit each other sufficiently far from one another that they do not significantly affect each other’s evolution. Whichever star is more massive will live a shorter main-sequence lifetime. That star will then swell to become a red giant star. The helium flash will occur, and that star will then become a helium-burning star. After the star’s helium-burning lifetime, the star will swell to become a red giant star a second time, eventually ejecting a slowly expanding planetary nebula and leaving behind a white dwarf. Eventually, the other star will experience the same sequence of stages of stellar death. If one of the stars in a binary star system is high mass, it will live an extremely short main-sequence lifetime. That star will then swell to become a supergiant star and suffer from a Type II supernova. Extraordinarily, the other star survives this supernova explosion. The supernova ejects a hot supernova remnant rapidly expanding away from either a neutron star or a black hole. The other star is usually a low mass star that will eventually experience its own appropriate sequence of stages of stellar death. In conclusion, since most binary star systems are detached binaries where the two stars orbit each other sufficiently far from one another that they do not significantly affect each other’s evolution, all of the stellar evolution we have discussed applies to most binary star systems. However, if the two stars in a binary star system are orbiting sufficiently close to each other, they can affect each other’s evolution. Therefore, much of the stellar evolution we have discussed requires modifications. Such binary star systems are called close binaries. 
Caution: the term close binary does not mean the binary star system is close to our Solar System; the term close binary means the two stars in the binary star system orbit close to each other. In a close binary, whichever main sequence star is more massive will live a shorter main-sequence lifetime. That star will then swell to become a giant star. However, the two stars orbit sufficiently close to each other that the outer layers of the giant star approach the other less massive star. These outer layers will then feel more gravitational attraction from the less massive star. Hence, the outer layers of the giant star will fall toward the less massive star. The gas does not fall directly toward the second star, since the gas has angular momentum from the orbital motion of both stars around each other. Therefore, these gases settle into an orbit around the less massive star, forming a flat disk. Gases within the disk that are closer to the star orbit faster while gases within the disk that are further from the star orbit slower, in accordance with Kepler’s laws. As a result, neighboring gases within the disk move at different speeds; hence, the gases within the disk rub against each other, resulting in friction that heats the disk. This increase in thermal energy (heat energy) must come at the expense of gravitational orbital energy, since energy must be conserved. Therefore, the gas within the disk migrates inward, toward the less massive star. Eventually, the gas collides onto the surface of the star. The less massive star therefore gains mass through these collisions. The gaining of mass from collisions is called accretion, as we discussed earlier in the course. Hence, the flat disk around the less massive star is called an accretion disk. In summary, there is a mass transfer from the giant star to the less massive star through an accretion disk around the less massive star. 
Eventually, the less massive star may gain so much mass that it becomes more massive than the giant star, which has now lost so much mass that the giant star is less massive than the second star! This explains why we sometimes discover binary star systems with a giant star that is less massive than the other star. More massive stars live shorter main-sequence lifetimes; therefore, the giant star should be the more massive star. Indeed, the giant star was formerly the more massive star, but it lost much of its mass through a mass transfer to the other star. A giant star in a close binary may lose so much mass from its outer layers through this mass transfer that it may become an exotic subgiant star with a disproportionately large core. Eventually, the other star may gain so much mass that it begins to evolve off of the main sequence prematurely. It swells to become a giant star, and its outer layers approach the first star. These gases will then feel more gravitational attraction to the first star, eventually falling toward the first star. Hence, there is a second mass transfer from the second star back to the first star!
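The claim that gas closer to the star orbits faster than gas further from the star can be checked with a short calculation. The following is a minimal sketch, assuming circular orbits and the standard Newtonian result that the circular orbital speed is the square root of GM divided by r, which is consistent with Kepler's laws but is not written out in these notes. The function name and the numerical constants are our own illustrative choices.

```python
import math

G = 6.674e-11          # gravitational constant in SI units
M_SUN_KG = 1.989e30    # one solar mass, roughly two nonillion kilograms
AU_METERS = 1.496e11   # one astronomical unit in meters

def circular_orbital_speed(central_mass_kg, radius_m):
    """Speed (m/s) of gas on a circular orbit of the given radius (assumed circular)."""
    return math.sqrt(G * central_mass_kg / radius_m)

# Gas at two different radii in a disk around a one-solar-mass star.
inner_speed = circular_orbital_speed(M_SUN_KG, 0.01 * AU_METERS)
outer_speed = circular_orbital_speed(M_SUN_KG, 0.02 * AU_METERS)
print(inner_speed > outer_speed)  # the inner gas orbits faster
```

Because neighboring rings of gas move at different speeds, they rub against each other, which is the friction that heats the disk.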

Usually, both stars in a close binary are low mass stars. Eventually, one of the stars will reach the very end of its life, ejecting a planetary nebula and leaving behind a white dwarf. The star has lost most of its mass when it ejects the planetary nebula, and hence the gravitational attraction between the white dwarf and the second star is greatly weakened. As a result, the center of mass of the close binary is displaced to be much closer to the second star, and the trajectories of both stars around that new center of mass are greatly altered. Often, the gravitational attraction of the two stars is sufficiently weakened that both stars subsequently move on unbound trajectories; the stars leave each other, and the binary system ends. The two stars may however continue to orbit each other, and the two stars may even continue to orbit close to each other, maintaining the close binary system. Eventually, the second star ends its main-sequence lifetime and expands to become a giant star. In the case where the two stars continue to orbit close to each other, the outer layers of the giant star approach the white dwarf. These gases will then feel more gravitational attraction from the white dwarf. Hence, the outer layers of the giant star will fall toward the white dwarf. Again, the gas does not fall directly toward the white dwarf, since the gas has angular momentum from the orbital motion of both stars around each other. Therefore, these gases settle into an orbit around the white dwarf, forming an accretion disk. Again, the gases within the disk rub against each other, resulting in friction that heats the disk. This increase in thermal energy (heat energy) must come at the expense of gravitational orbital energy, since energy must be conserved. Therefore, the gas within the disk migrates inward, toward the white dwarf. Eventually, the gas collides onto the surface of the white dwarf, causing the white dwarf to gain mass. 
In summary, there is a mass transfer from the giant star to the white dwarf through an accretion disk around the white dwarf. However, a white dwarf is small, roughly the size of the Earth, as we discussed. Hence, the gravitational well of a white dwarf is sufficiently deep that the gas that collides onto the surface of the white dwarf is strongly compressed and significantly heated. This gas is predominantly hydrogen, since it came from the outer layers of the giant star. As gas continues to fall onto the white dwarf, the hydrogen on its surface may eventually be heated to millions of kelvins, causing it to fuse into helium. This causes the white dwarf to suddenly increase in brightness to thousands of solar luminosities for a few weeks. This is called a nova. The sudden increase in luminosity ejects material from the surface of the white dwarf, resulting in an expanding shell of hot gas away from the close binary system. This is called a nova remnant. The nova and the ejected nova remnant do not stop the mass transfer from the giant star to the white dwarf from continuing. Eventually, another nova may occur, ejecting another nova remnant. In other words, novae and nova remnants from a white dwarf in a close binary are periodic, occurring regularly. Novae and nova remnants from a white dwarf in a close binary may occur once every few decades, once every few centuries, or once every few millennia. To summarize, there are several important differences between a nova and a supernova. Firstly, novae from a white dwarf in a close binary occur regularly, while the Type II supernova of a high mass star occurs only once. Secondly, novae last a few weeks, while a supernova lasts a few months. Thirdly, novae have luminosities of thousands of solar luminosities, while supernovae have luminosities of billions of solar luminosities. Note however that observationally a nova and a supernova may appear identical, at least at first glance. 
A nova that occurs sufficiently close to us may appear just as bright (same apparent magnitude) as a supernova that occurred much further from us. We can discriminate between a nova and a supernova by determining the distance to the event and then using that distance to calculate the luminosity (the absolute magnitude or the intrinsic brightness) of the event. If the luminosity is thousands of solar luminosities, the event was a nova, not a supernova. If instead the luminosity is billions of solar luminosities, the event was a supernova, not a nova. We may also discriminate between a nova and a supernova from the duration of time of the event. If the increase in brightness lasts for a few weeks, we may conclude that the event was a nova, not a supernova. If instead the increase in brightness lasts for a few months, we may conclude that the event was a supernova, not a nova. We may also discriminate between a nova and a supernova by observing the space surrounding the event. If we observe a large slowly expanding planetary nebula around the event, we may conclude that the event was a nova, not a supernova. In this case, the surrounding large planetary nebula was ejected when the white dwarf first formed. If instead we observe a small very hot (in the millions of kelvins) supernova remnant rapidly expanding away from the event, we may conclude that the event was a supernova, not a nova. In this case, the small very hot rapidly expanding supernova remnant was just ejected when the supernova occurred. Note that the word nova is derived from a Latin word meaning new. Observationally, a nova simply appears to be a new star. A supernova also appears to be a new star, but with much greater luminosity or absolute magnitude or intrinsic brightness. Caution: a white dwarf in a close binary may suffer from its own unique type of supernova, as we will discuss later in the course.
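The distance-based test described above can be sketched in a few lines. This is only an illustration, assuming the inverse-square law (luminosity equals four pi times the distance squared times the measured flux), which is standard but is not written out in these notes. The function names and the dividing line of one million solar luminosities are hypothetical choices of ours, placed between the "thousands" of a nova and the "billions" of a supernova.

```python
import math

L_SUN_WATTS = 3.828e26     # nominal solar luminosity
PARSEC_METERS = 3.0857e16  # one parsec in meters

def apparent_flux(luminosity_solar, distance_pc):
    """Flux (W/m^2) received from an event, by the inverse-square law."""
    d = distance_pc * PARSEC_METERS
    return luminosity_solar * L_SUN_WATTS / (4 * math.pi * d ** 2)

def luminosity_in_solar_units(flux_w_per_m2, distance_pc):
    """Invert the inverse-square law: luminosity from flux and distance."""
    d = distance_pc * PARSEC_METERS
    return 4 * math.pi * d ** 2 * flux_w_per_m2 / L_SUN_WATTS

def classify_event(luminosity_solar, cut=1e6):
    """Hypothetical cut: below it we call the event a nova, above it a supernova."""
    return "nova" if luminosity_solar < cut else "supernova"

# A nearby nova and a distant supernova that deliver the same flux to us.
flux = apparent_flux(1e4, 1000.0)
print(classify_event(luminosity_in_solar_units(flux, 1000.0)))
print(classify_event(luminosity_in_solar_units(flux, 1000.0 * math.sqrt(1e5))))
```

The two events look equally bright in the sky; only the distance reveals which one is intrinsically more luminous.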

If one of the stars in a close binary is a high mass star, it will live a short main-sequence lifetime. It then swells to become a supergiant star and explodes as a Type II supernova, throwing out a hot supernova remnant rapidly expanding away from a neutron star or a black hole. Extraordinarily, the other star survives the supernova, even though the two stars orbit close to each other. We now have a neutron star or a black hole, called the compact object, orbiting a main sequence star, called the primary object. The primary object will eventually end its main-sequence lifetime and swell to become a giant star. The outer layers of the giant star approach the compact object and hence feel stronger gravitational attraction from the compact object. Thus, the outer layers of the giant star fall toward the compact object. Again, the gas does not fall directly toward the compact object, since the gas has angular momentum from the orbital motion of both stars around each other. Therefore, these gases settle into an orbit around the compact object, forming an accretion disk where friction heats the disk causing the gas to migrate inward toward the compact object. However, the gravitational well of a neutron star or a black hole is so incredibly deep that the gas is heated to millions of kelvins of temperature as it falls toward the compact object. At these very hot temperatures, the accretion disk radiates X-rays. These binary star systems are called X-ray binaries, which astrophysicists often abbreviate XRBs. The incredibly deep gravitational well of the compact object also accelerates the falling gas to nearly the speed of light. Some of this gas may be ejected as narrow columns or jets near the rotational angular momentum axis of the accretion disk around the compact object. For all of these reasons, some types of X-ray binaries are often called microquasars. We will discuss quasars later in the course. 
For now, we simply mention that the accretion disk of an X-ray binary together with the high-speed jets of gas ejected along the rotational angular momentum axis of the accretion disk around the compact object makes these X-ray binaries similar to quasars, but on a much smaller size scale than quasars. This is why some types of X-ray binaries are often called microquasars.

The compact object of an X-ray binary is either a neutron star or a black hole. A neutron star has a solid surface. Therefore, very hot gas falling toward a neutron star that has been accelerated to nearly the speed of light will eventually collide onto the surface of the neutron star, causing sudden and intense X-ray bursts. These X-ray bursts can have luminosities of many thousands of solar luminosities, entirely in X-rays! Black holes however do not have a solid surface, as we will discuss shortly. Therefore, very hot gas falling toward a black hole that has been accelerated to nearly the speed of light will not collide with a solid surface; the gas rather quietly disappears from the observable universe as it falls into the black hole. Therefore, there are no sudden and intense X-ray bursts from a black hole. This is one way astrophysicists determine whether the compact object in an X-ray binary is a neutron star or a black hole. If we detect sudden and intense X-ray bursts, then the compact object is a neutron star. If we do not detect sudden and intense X-ray bursts, then the compact object is a black hole. Another way astrophysicists make this determination is by calculating the mass of the compact object using Kepler’s third law. If the mass of the compact object is greater than the Tolman-Oppenheimer-Volkoff limit, then the compact object must be a black hole. If the mass of the compact object is less than the Tolman-Oppenheimer-Volkoff limit but greater than the Chandrasekhar limit, then the compact object must be a neutron star. The first black hole ever discovered was the compact object in an X-ray binary in the constellation Cygnus (the swan). This X-ray binary was named Cygnus X-1. The primary object in this binary star system is a supergiant star. The compact object in this binary star system was calculated to have a mass significantly greater than the Tolman-Oppenheimer-Volkoff limit, revealing that it is indeed a black hole. 
Yet another way astrophysicists determine whether the compact object in an X-ray binary is a neutron star or a black hole is the observation of pulses from the compact object. As we discussed, a pulsar is a neutron star, and the observation of electromagnetic pulses from the X-ray binary would reveal that the compact object is a neutron star. The mass transferred from the primary object to the neutron star through the accretion disk may add angular momentum to the neutron star, thus speeding up its rotation. The result is a millisecond pulsar, since it rotates once in only a few milliseconds. These millisecond pulsars are also called recycled pulsars, since they were at first rotationally slowing from a pulsar neutron star to a non-pulsar neutron star through the loss of angular momentum carried away by their pulses, but the additional angular momentum from the accreting gases gave them a second life as pulsars.
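The mass-based test for the compact object can be sketched as follows. This is a rough illustration only. We assume Kepler's third law in solar units (combined mass in solar masses equals the cube of the semi-major axis in astronomical units divided by the square of the orbital period in years), a Chandrasekhar limit of roughly 1.4 solar masses, and a Tolman-Oppenheimer-Volkoff limit of 3 solar masses. The last value is an assumed placeholder, since the true limit is uncertain, and the function names are our own.

```python
CHANDRASEKHAR_LIMIT = 1.4  # solar masses, approximate
TOV_LIMIT = 3.0            # solar masses, ASSUMED placeholder value

def combined_mass_solar(semi_major_axis_au, period_years):
    """Kepler's third law in solar units: combined mass of both objects."""
    return semi_major_axis_au ** 3 / period_years ** 2

def classify_compact_object(mass_solar):
    """Apply the mass rule stated in the text to the compact object's mass."""
    if mass_solar > TOV_LIMIT:
        return "black hole"
    if mass_solar > CHANDRASEKHAR_LIMIT:
        return "neutron star"
    return "not classified by this rule"

print(combined_mass_solar(1.0, 1.0))    # the Earth-Sun system: one solar mass
print(classify_compact_object(10.0))
print(classify_compact_object(2.0))
```

Note that Kepler's third law yields the combined mass of the binary; the mass of the primary object must be estimated separately and subtracted before the rule is applied to the compact object.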

The Theories of Relativity: Galilean-Newtonian Relativity, Einsteinian Special Relativity, and Einsteinian General Relativity

Galilean-Newtonian Relativity theory was formulated between three hundred and four hundred years ago. This relativity theory may also be called common-sense relativity theory, since many of us understand this relativity theory intuitively from our daily experiences. Fundamental to Galilean-Newtonian Relativity theory is the Galilean-Newtonian velocity addition law, which states that the velocity of Object A relative to Object B (written v_AB) plus the velocity of Object B relative to Object C (written v_BC) is equal to the velocity of Object A relative to Object C (written v_AC). This law is more properly written v_AC = v_AB + v_BC. This Galilean-Newtonian velocity addition law may seem intimidating at first, but in fact many of us already understand this law intuitively from our daily experiences, even if we cannot state this law mathematically. Let us discuss several examples to illustrate that this law is indeed consistent with our common sense. As our first example, suppose a train is moving at ten miles per hour to the right relative to the ground, and suppose someone on the train fires a bullet moving at one hundred miles per hour to the right relative to the train. Then, the velocity of the bullet relative to the ground is one hundred and ten miles per hour to the right. As our second example, suppose that a train is moving at ten miles per hour to the right relative to the ground, and suppose someone on the ground fires a bullet moving at one hundred miles per hour to the right relative to the ground. Then, the velocity of the bullet relative to the train is ninety miles per hour to the right. As our third example, suppose a car is moving at seventy miles per hour on one side of a highway relative to the ground, and suppose another car is moving at fifty miles per hour on the same side of the highway and hence is moving in the same direction relative to the ground. Then, the seventy-car is moving at twenty miles per hour relative to the fifty-car. 
Also, the fifty-car is moving at twenty miles per hour backwards relative to the seventy-car. As our fourth example, suppose a car is moving at seventy miles per hour on one side of a highway relative to the ground, and suppose another car is moving at fifty miles per hour on the opposite side of the highway and hence is moving in the opposite direction relative to the ground. Then, either car is moving at one hundred and twenty miles per hour relative to the other car. Why did we subtract seventy miles per hour and fifty miles per hour to obtain twenty miles per hour in our third example? Why did we add seventy miles per hour and fifty miles per hour to obtain one hundred and twenty miles per hour in our fourth example? The simple, common-sense arguments are as follows. If we are driving at fifty miles per hour on a highway and if a car on the same side of the highway comes up from behind us at seventy miles per hour and collides with us, the collision will be mild, since the relative speed between the two cars is only twenty miles per hour. This collision is exactly as if our car were parked and we were hit by a car moving at twenty miles per hour. However, if we are driving at fifty miles per hour on a highway and if a car moving in the opposite direction at seventy miles per hour collides with us (a head-on collision), we would be dead, since the relative speed between the two cars is one hundred and twenty miles per hour. This collision is exactly as if our car were parked and we were hit by a car moving at one hundred and twenty miles per hour. As a fifth example, suppose a car is moving at sixty miles per hour on one side of a highway relative to the ground, and suppose another car is moving at sixty miles per hour on the same side of the highway and hence is moving in the same direction relative to the ground. Then either car is not moving (is at rest) relative to the other car.
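The five examples above can be reproduced with a few lines of code. This is a minimal sketch for motion along a single line, where velocities to the right are positive and velocities to the left are negative. The function names are our own, and we use the fact that the velocity of the ground relative to an object is the negative of the velocity of that object relative to the ground.

```python
def galilean_add(v_ab, v_bc):
    """Galilean-Newtonian velocity addition: v_AC = v_AB + v_BC."""
    return v_ab + v_bc

def velocity_relative_to(v_a_ground, v_b_ground):
    """Velocity of A relative to B, given both velocities relative to the ground."""
    # v_AB = v_A,ground + v_ground,B, and v_ground,B = -v_B,ground
    return galilean_add(v_a_ground, -v_b_ground)

print(galilean_add(100, 10))          # first example: bullet relative to ground, 110
print(velocity_relative_to(100, 10))  # second example: bullet relative to train, 90
print(velocity_relative_to(70, 50))   # third example: same direction, 20
print(velocity_relative_to(70, -50))  # fourth example: opposite directions, 120
print(velocity_relative_to(60, 60))   # fifth example: at rest relative to each other, 0
```

The subtraction in the third example and the addition in the fourth example are the same law; the only difference is the sign of the second car's velocity.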

After centuries of physicists believing that Galilean-Newtonian Relativity theory (common-sense relativity theory) is correct, new physics was discovered that began to reveal that these laws, although seemingly indisputable, are nevertheless incorrect. In the 1860s, the brilliant Scottish physicist James Clerk Maxwell formulated classical electromagnetic theory with four equations, later named the Maxwell equations in his honor. These four Maxwell equations are mathematically beautiful. These four Maxwell equations completely summarize classical electromagnetism. These four Maxwell equations even revealed that light is an electromagnetic wave, and indeed the entire wave theory of light can be derived from these four Maxwell equations, including all the laws of classical optics. However, these four Maxwell equations also stated that the vacuum speed of light is always the same number, written c and equal to roughly three hundred thousand kilometers per second or roughly one hundred and eighty-six thousand miles per second. This cannot be true, can it? According to common sense, the vacuum speed of light cannot always be the same number, as the following few examples will illustrate. Suppose a train is moving at velocity V to the right relative to the ground, and suppose someone on the train with a flashlight sends a light beam moving at velocity c to the right relative to the train. Then, the velocity of the light beam relative to the ground is c plus V, isn’t it? Please review our first example from Galilean-Newtonian Relativity theory for help with this example, since they are in fact identical examples. Suppose a train is moving at velocity V to the right relative to the ground, and suppose someone on the ground with a flashlight sends a light beam moving at velocity c to the right relative to the ground. Then, the velocity of the light beam relative to the train is c minus V, isn’t it? 
Please review our second example from Galilean-Newtonian Relativity theory for help with this example, since they are in fact identical examples. Suppose a train is moving at velocity c to the right relative to the ground, and suppose someone on the ground with a flashlight sends a light beam moving at velocity c to the right relative to the ground. Then, the light beam is not moving (is at rest) relative to the train, is it? Please review our fifth example from Galilean-Newtonian Relativity theory for help with this example, since they are in fact identical examples. Suppose a train is moving at velocity V relative to the ground, and suppose someone on the ground with a flashlight sends a light beam moving at velocity c relative to the ground at right angles to the train’s velocity. Then, the light beam is moving at a speed √(c² + V²) relative to the train, isn’t it? All of these examples persuade us that according to the common sense of our daily experiences, the vacuum speed of light should depend upon which direction we are moving, how fast we are moving, and in which direction the light itself is moving. Our examples, using the common sense of our daily experience, tell us that sometimes the vacuum speed of light might be c, but other times it might be c plus V, sometimes it could be c minus V, sometimes it could be zero, sometimes it could be √(c² + V²), and so on and so forth. The two American physicists Albert A. Michelson and Edward W. Morley set out to show that this is the case in the 1880s with what was later called the Michelson-Morley experiment in their honor. However, their experiment showed that the vacuum speed of light does not depend upon which direction we are moving or how fast we are moving or even in which direction the light itself is moving! Their measurements showed that the vacuum speed of light is always the same number, always c! 
Our common sense tells us that this cannot be true, and indeed many physicists believed that Michelson and Morley performed their experiment incorrectly. Some physicists did believe the result, but they could not explain how this can possibly be true.

This brings us to the person who would explain all of these mysteries. Albert Einstein was a mediocre physicist who struggled with mathematics. One of his elementary-school math teachers told young Albert’s father, “Nothing good will ever come from this boy!” In the year 1905, Albert Einstein worked at a patent office in Switzerland. Although many physicists would feel humiliated working in such a position, this job gave Einstein plenty of time to think about fundamental physics. Einstein was so enraptured by the beauty of the Maxwell equations that he became convinced that they must be true. Most physicists would have responded that the Maxwell equations cannot be completely true, since they state that the vacuum speed of light is always c, which common sense says is impossible. Only Einstein would dare assert the following. The Maxwell equations are so beautiful that they must be true. Therefore, if they state that the vacuum speed of light is always c, then the vacuum speed of light is always c! This is Special Relativity theory in one sentence. Einstein’s Special Relativity theory states that the vacuum speed of light does not depend upon which direction we are moving or how fast we are moving or even in which direction light happens to be moving. Einstein’s Special Relativity theory states that the vacuum speed of light is always the same number, written c and equal to roughly three hundred thousand kilometers per second or roughly one hundred and eighty-six thousand miles per second. In other words, Einstein’s Special Relativity theory states that the vacuum speed of light is an invariant.

Einstein’s Special Relativity theory is simple to state, but this theory confounds our common sense. How can the vacuum speed of light possibly be an invariant? The basic argument is as follows. If the vacuum speed of light is always the same number c, then space and time must change to ensure that the vacuum speed of light c does not change. For example, Einstein made the following incredible deduction from his new theory. Time slows down when we move; moving clocks actually run slow! This is called time dilation. Consider any clock whatsoever, such as a mechanical clock or the electronic clock within our mobile telephones. According to Einstein’s Special Relativity theory, a clock must run slower if it moves. Suppose all of us had identical mobile telephones, and suppose we synchronized their clocks. If one of us walks with our mobile telephone, our mobile telephone runs slower than everyone else’s mobile telephones! As a result, our time runs behind everyone else’s time! Is Einstein actually claiming that whenever we walk or ride a bicycle or drive a car or ride a train or ride an airplane that our time slows down? Yes! But then why do we never notice in our daily experience that our time slows down? The time dilation effect is very tiny for objects moving at speeds very slow compared with c, and everything in our daily experiences does indeed move very slowly compared with the vacuum speed of light c. We would only notice these temporal changes if we moved incredibly fast, close to the vacuum speed of light c. The implications of this time dilation effect are staggering. For example, consider two identical twins who have lived together in the same house their entire lives. Hence, they are the same age. However, if one of them walks down the street, that twin will age a tiny amount slower, since their time is now running slower. When that twin returns home, that twin will be a tiny amount younger than the twin who remained at home! 
Time dilation was considered outrageous a century ago, but this effect has actually been observed in recent decades. For example, suppose we synchronize two extraordinarily accurate atomic clocks. Then, suppose we place one of these atomic clocks on an airplane. After the flight, physicists have actually experimentally measured that the airplane’s atomic clock is behind the ground’s atomic clock by a tiny amount! As another example, consider an unstable subatomic particle that decays after a short lifetime. If this particle is accelerated close to the vacuum speed of light c, it lives much longer before decaying since its lifetime is much longer. When the particle moves, its time slows down, permitting it to live a longer lifetime before decaying. Not only is time dilation real, but in fact computers, mobile telephones, and the global positioning system (GPS) would all not function correctly without taking into account the fact that all of their clocks run at different rates since they all move at different speeds!
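To see just how tiny time dilation is at everyday speeds, we can put in numbers. The following is a sketch using the standard Lorentz factor of Special Relativity, one divided by the square root of one minus v squared over c squared, which these notes describe in words but do not write out. The function names are our own. The effect is so small at walking speed that a naive calculation rounds to exactly zero on a computer, so the sketch computes the excess of the Lorentz factor over one with log1p and expm1 to preserve precision. It ignores the gravitational effects of General Relativity, which also matter for real airplane experiments.

```python
import math

C = 299_792_458.0  # vacuum speed of light in meters per second

def gamma_minus_one(speed):
    """Lorentz factor minus one, computed stably even for very slow speeds."""
    beta_squared = (speed / C) ** 2
    # gamma = (1 - beta^2)^(-1/2), so gamma - 1 = expm1(-0.5 * log1p(-beta^2))
    return math.expm1(-0.5 * math.log1p(-beta_squared))

def clock_lag_seconds(speed, duration_seconds):
    """Time by which a moving clock falls behind a stationary clock."""
    g1 = gamma_minus_one(speed)
    return duration_seconds * g1 / (1.0 + g1)  # duration * (1 - 1/gamma)

print(clock_lag_seconds(1.5, 3600.0))     # walking for one hour
print(clock_lag_seconds(250.0, 36000.0))  # ten-hour flight: roughly ten nanoseconds
print(gamma_minus_one(0.6 * C))           # at 60% of c the effect is large
```

A ten-hour flight puts the airplane's clock behind by only about a hundredth of a millionth of a second, which is why atomic clocks are needed to measure the effect.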

Einstein drew another incredible conclusion from his new theory: space contracts when we move; moving objects actually contract! This is called length contraction. Consider any object whatsoever. According to Einstein’s Special Relativity theory, the object must contract in the direction it is moving. While we are walking, we are skinnier than usual, and not because we are getting exercise! When we stop moving, our shape returns to normal. Is Einstein actually claiming that moving cars and moving trains are shorter than normal? Yes! But then why do we never notice in our daily experience that moving cars and moving trains are shorter than normal? The length contraction effect is very tiny for objects moving at speeds very slow compared with c, and everything in our daily experiences does indeed move very slowly compared with the vacuum speed of light c. We would only notice these spatial changes if we moved incredibly fast, close to the vacuum speed of light c.
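The same kind of estimate shows why we never notice length contraction. This sketch assumes the standard contraction formula of Special Relativity, in which the length along the direction of motion is the rest length multiplied by the square root of one minus v squared over c squared. The formula is not written out in these notes, and the function name is our own.

```python
import math

C = 299_792_458.0  # vacuum speed of light in meters per second

def contracted_length(rest_length, speed):
    """Length along the direction of motion for an object moving at the given speed."""
    return rest_length * math.sqrt(1.0 - (speed / C) ** 2)

# A four-meter car at highway speed (30 meters per second) shrinks by a
# distance far smaller than a single atom.
print(4.0 - contracted_length(4.0, 30.0))

# The same car at 80% of the vacuum speed of light is visibly shorter.
print(contracted_length(4.0, 0.8 * C))
```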

We now begin to have some understanding how it could possibly be true that the vacuum speed of light is an invariant, always equal to the same number c. Speed is equal to distance divided by time. Distance is measured with graduated rods such as meter sticks, and time is measured with clocks. However, moving objects contract and moving clocks run slow! To ensure that the vacuum speed of light is always equal to the same number c, every graduated rod in the universe contracts by just the right amount and every clock in the universe slows down by just the right amount to ensure that the distance traveled by light divided by the time for light to travel always equals the same number c. Space and time must change to ensure that c does not change!

When Einstein deduced length contraction and time dilation from his new theory, he realized that the Galilean-Newtonian velocity addition law could no longer be correct. Physicists believed that the Galilean-Newtonian velocity addition law was correct for centuries, and even today the common sense of our daily experience tells us that it seems to be true. We must keep in mind that time dilation and length contraction are very tiny effects for objects moving at speeds very slow compared with c, such as walking people, driving cars, moving trains, and flying airplanes. This makes the Galilean-Newtonian velocity addition law almost correct, but still not exactly correct, at these slow speeds. At very fast speeds approaching c, we would actually notice that this law is severely wrong. Einstein deduced the correct velocity addition law by taking time dilation and length contraction into account. This new law is called the Lorentz-Einstein velocity addition law, named for both Albert Einstein and Dutch physicist Hendrik Lorentz. The Lorentz-Einstein velocity addition law states that v_AC = (v_AB + v_BC) / (1 + v_AB v_BC / c²), where, as before, v_AB is the velocity of Object A relative to Object B, v_BC is the velocity of Object B relative to Object C, and v_AC is the velocity of Object A relative to Object C. This new velocity addition law correctly ensures that the vacuum speed of light is an invariant, always equal to the same number c.
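We can verify numerically that the Lorentz-Einstein velocity addition law keeps the vacuum speed of light invariant. The following is a minimal sketch for motion along a single line, using the standard form of the law, in which the sum of the two velocities is divided by one plus their product over c squared. Velocities are measured in units of c, so that c equals one; the function name is our own.

```python
def lorentz_einstein_add(v_ab, v_bc, c=1.0):
    """Lorentz-Einstein velocity addition: v_AC = (v_AB + v_BC) / (1 + v_AB v_BC / c^2)."""
    return (v_ab + v_bc) / (1.0 + v_ab * v_bc / c ** 2)

# A light beam (speed c relative to the train) sent from a train moving at
# half of c relative to the ground still moves at c relative to the ground.
print(lorentz_einstein_add(1.0, 0.5))

# A light beam sent from the ground, as seen from the same train: still c.
print(lorentz_einstein_add(1.0, -0.5))

# Two velocities of half of c combine to 0.8 c, not to c.
print(lorentz_einstein_add(0.5, 0.5))

# At everyday speeds the result is almost exactly the Galilean-Newtonian sum.
print(lorentz_einstein_add(1e-7, 1e-7))
```

In the Galilean-Newtonian law, the first two cases would have given one and a half times c and one half of c; the extra denominator is what restores c in both cases.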

Einstein drew yet another incredible conclusion from his new theory: the mass of an object increases as it moves faster. For centuries, physicists believed that the mass of an object is fixed, and even today the common sense of our daily experience tells us that the mass of an object is fixed. We must realize that the additional mass is very tiny at speeds very slow compared with c, such as the speeds of walking people, driving cars, moving trains, and flying airplanes. We would need to move at very fast speeds approaching c to actually notice this extra mass. If we stand still, we have a certain amount of mass, but while we are walking we have more mass! The next time someone urges us to go jogging to lose some weight, we should respond that we will gain mass if we jog! The extra mass is tiny at such a slow speed, but it is nevertheless real! The equation for this extra mass (which we will not present in this course) reveals yet another outrageous consequence of this theory: the vacuum speed of light c is the speed limit of the universe. An object gains mass when it moves faster, but this means that we would then require more force to speed it up further. If a force does speed the object up further, then the object would gain even more mass, and thus we would require even more force to speed it up further still. According to the extra-mass equation, the mass of an object approaches infinity as its speed approaches c. This means that we would need an infinite force to speed the object up to c, but it is impossible to exert an infinite force. Not only does the universe forbid anything from moving faster than c, the universe forbids any object to even reach c! We could accelerate a spaceship faster and faster, making it move closer and closer to c, but the spaceship can never reach c. 
The only things permitted to actually move at c are things already moving at c, such as light or any electromagnetic wave (composed of photons) from across the entire Electromagnetic Spectrum. We will discuss shortly that according to Einstein’s General Relativity theory, gravity also moves at c. Any object moving slower than c is forever constrained to move slower than c. Such an object can move faster and faster approaching c, but actually reaching c is forbidden. Moving faster than c is out of the question. These conclusions were considered outrageous a century ago, but they have been proven in recent decades. In particle accelerators, we can speed up subatomic particles, and physicists have experimentally verified that these particles do indeed gain more and more mass as they move faster and faster, in precise accordance with the extra-mass equation that Einstein discovered. Moreover, physicists have experimentally verified that the vacuum speed of light c acts as a bottleneck, precisely as predicted by the extra-mass equation that Einstein discovered. We can accelerate particles very close to c, but we cannot accelerate particles to actually reach c. Speeds faster than c are out of the question. Modern particle accelerators can accelerate protons to speeds faster than 99.9999% of c, but nevertheless still slower than c itself. In summary, Einstein’s Special Relativity theory states that our universe has a speed limit, the vacuum speed of light c!

Einstein also deduced his famous mass-energy relation from his new theory, which states that energy is equal to mass multiplied by the square of the vacuum speed of light. This law is most commonly written E = mc². The consequences of this equation are staggering. For example, consider nuclear reactions. An exothermic nuclear reaction liberates energy, while an endothermic nuclear reaction absorbs energy. These two terms exothermic and endothermic are used to describe chemical reactions as well. If an exothermic nuclear reaction liberates energy, then the reaction must liberate mass as well. Thus, the products of an exothermic reaction have less mass than the reactants! If an endothermic nuclear reaction absorbs energy, then the reaction must absorb mass as well. Thus, the products of an endothermic reaction have more mass than the reactants! Einstein stated this mass-energy relation in the year 1905. It would be a few years later before physicists even discovered that an atom has a nucleus, and it would be forty years later before the first nuclear weapons were built. Nevertheless, Einstein actually stated in the year 1905 that his mass-energy relation could be tested by studying radioactive materials. It would be years before physicists even realized that radioactivity is a type of nuclear reaction! Let us spend a moment reflecting upon Einstein’s genius. Forty years before nuclear weapons were built and even a few years before the nucleus of an atom was discovered, Einstein discovered the mass-energy equation and applied it to radioactivity, a type of nuclear reaction!
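A quick calculation shows why the consequences of E = mc² are staggering. The following is a small sketch that converts between mass and energy in SI units (kilograms and joules). The function names and the one-gram example are our own illustrative choices.

```python
C = 299_792_458.0  # vacuum speed of light in meters per second

def energy_from_mass(mass_kg):
    """Energy in joules equivalent to the given mass: E = m c^2."""
    return mass_kg * C ** 2

def mass_from_energy(energy_joules):
    """Mass in kilograms equivalent to the given energy: m = E / c^2."""
    return energy_joules / C ** 2

# If the products of an exothermic reaction have one gram less mass than the
# reactants, the reaction liberates roughly ninety trillion joules.
print(energy_from_mass(0.001))
```

Because c² is such an enormous number, even a tiny loss of mass corresponds to an enormous release of energy.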

The mass-energy relation may be the last of Einstein’s contributions to Special Relativity theory, but one of his former math teachers, the German-Polish-Russian mathematician Hermann Minkowski, realized what this new theory is really trying to tell us. According to Special Relativity theory, we live in a four-dimensional universe. According to the common sense of our daily experience, we live in a three-dimensional universe. These three dimensions are length, width, and height, mathematically written as x, y, and z. However, time is the fourth dimension according to relativity theory. Time is usually written as t, but time is written as ct in relativity theory. In other words, we live in a four-dimensional universe: three spatial dimensions (x, y, and z) and one temporal dimension (ct). Moreover, these four dimensions mix into one another, and the mixing of the temporal dimension with the three spatial dimensions is the fundamental cause of time dilation, length contraction, the invariance of c, the universal speed limit of c, and even the mass-energy relation. Minkowski invented a new word to describe our four-dimensional universe. Minkowski took the word space and the word time, and he put them together to form a new word: spacetime. Notice that there is no space or even a hyphen between the two words used to construct this new word. To summarize Einstein’s Special Relativity theory, we live in a four-dimensional spacetime with three spatial dimensions (x, y, and z) and one temporal dimension (ct) that all mix into one another thus causing time dilation, length contraction, the invariance of c, the universal speed limit of c, and the mass-energy relation.

Fictitious forces or pseudoforces are forces that do not actually exist; they only seem to exist in certain frames of reference. For example, suppose we are in a stationary car waiting at a red traffic light. When the red traffic light turns green, we place our foot upon the car’s accelerator pedal. As the car accelerates forward, everyone and everything in the car feels a backward force. We actually feel ourselves pulled backward into the backrest of our chair. Anything hanging from the rearview mirror also swings backward. This backward force is a fictitious force or a pseudoforce. It does not exist; it only seems to exist within the car as the car accelerates forward. Although everyone and everything within the car feels this backward force, it nevertheless does not actually exist. In actuality, everyone and everything within the car remains stationary for a moment as the car and its chairs accelerate forward, and hence the backrests of the chairs accelerate forward and collide with our own backs. This is amusing: within the car we feel pulled backward into the backrests of the chairs, but in actuality we remain stationary while the backrests of the chairs accelerate forward into our backs! Although we feel a backward force within the car, we nevertheless conclude that this backward force is a fictitious force or a pseudoforce. It does not actually exist; it only seems to exist within the car as the car accelerates forward. As another example, suppose we are in a moving car when we see a green traffic light turn yellow, and so we place our foot upon the car’s brake pedal. As the car slows down, everyone and everything in the car feels a forward force. We actually feel ourselves pulled forward off of the backrest of our chair. Anything hanging from the rearview mirror also swings forward. In extreme cases, we may feel pulled forward so strongly that our heads may collide with the windshield. This forward force is a fictitious force or a pseudoforce. 
It does not exist; it only seems to exist within the car as the car slows down. Although everyone and everything within the car feels this forward force, it nevertheless does not actually exist. In actuality, everyone and everything within the car remains in motion for a moment as the car and its chairs and its windshield slow down, and hence the backrests of the chairs move away from our own backs while the windshield moves toward our heads. This is amusing: within the car we feel pulled forward off of the backrests of the chairs and toward the windshield, but in actuality the backrests of the chairs move away from our backs and the windshield moves toward our heads! Although we feel a forward force within the car, we nevertheless conclude that this forward force is a fictitious force or a pseudoforce. It does not actually exist; it only seems to exist within the car as the car slows down. As yet another example, suppose we are in a moving car when we see that the highway ramp ahead curves to the left, and so we turn the steering wheel to the left so that the car will remain on the highway ramp. As the car turns left, everyone and everything in the car feels a rightward force. We actually feel ourselves pulled rightward away from the driver’s side of the car and toward the passenger’s side of the car. Anything hanging from the rearview mirror also swings rightward and continues to remain suspended rightward in apparent defiance of the Earth’s downward gravity as the car turns left! This rightward force is a fictitious force or a pseudoforce. It does not exist; it only seems to exist within the car as the car turns left. Although everyone and everything within the car feels this rightward force, it nevertheless does not actually exist. In actuality, everyone and everything within the car remains in forward motion as the car turns left, and hence the driver’s side of the car turns away from us while the passenger’s side of the car turns toward us. 
This is amusing: within the car we feel pulled rightward toward the passenger’s side of the car, but in actuality we remain in forward motion while the passenger’s side of the car turns leftward toward us! Although we feel a rightward force within the car, we nevertheless conclude that this rightward force is a fictitious force or a pseudoforce. It does not actually exist; it only seems to exist within the car as the car turns left. As a fourth example, projectiles will appear to suffer from deflections within a rotating frame of reference. This deflecting force is a fictitious force or a pseudoforce. It does not exist; it only seems to exist within the rotating frame of reference. In actuality, the projectiles are not deflected; the projectiles in fact continue moving along straight paths. The frame of reference is rotating, and the rotation of the entire frame of reference seems to cause projectiles to deviate from straight trajectories. This particular fictitious force or pseudoforce is called the Coriolis force, named for the French physicist Gaspard-Gustave de Coriolis who first derived the mathematical equations describing this particular fictitious force or pseudoforce. The Coriolis force appears to cause rightward deflections in frames of reference rotating counterclockwise, and the Coriolis force appears to cause leftward deflections in frames of reference rotating clockwise. The Coriolis force appears to cause stronger deflections if the frame of reference is rotating faster and appears to cause weaker deflections if the frame of reference is rotating slower. The Coriolis force appears to vanish if the frame of reference stops rotating. The Coriolis force only appears to cause deflections; it does not cause projectiles to speed up or slow down.
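The apparent deflection within a rotating frame of reference can be sketched with a few lines of Python. This is a purely geometric illustration under simple assumptions of our own: the projectile starts at the rotation axis and moves along the x axis at constant speed in the non-rotating frame, and positive omega means counterclockwise rotation. The sketch does not separate the Coriolis force from any other fictitious force; it only shows that a straight path appears curved.

```python
import math

def position_in_rotating_frame(speed, omega, t):
    """Coordinates, as seen from a frame rotating at angular speed omega
    (radians per second, positive meaning counterclockwise), of a projectile
    that actually moves in a straight line along the x axis."""
    x = speed * t              # true straight-line position; true y is zero
    angle = omega * t          # angle through which the frame has turned
    x_rot = x * math.cos(angle)
    y_rot = -x * math.sin(angle)
    return x_rot, y_rot

# The projectile is launched along +x, so negative y is to its right.
for omega in (0.2, 0.1, 0.0, -0.1):
    x_rot, y_rot = position_in_rotating_frame(speed=10.0, omega=omega, t=1.0)
    print(f"omega = {omega:+.1f} rad/s  ->  sideways displacement y = {y_rot:+.3f} m")
```

With counterclockwise rotation the sideways displacement is negative (rightward), with clockwise rotation it is positive (leftward), it grows with faster rotation, and it vanishes when the frame does not rotate, in agreement with the description above.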

As we discussed earlier in the course, physicists use the word acceleration for the rate at which an object’s motion changes, where the object could be suffering from any change in motion whatsoever. An object that is speeding up is said to be accelerating, but an object that is slowing down is also said to be accelerating. (In colloquial English, we would use the word decelerating instead.) Moreover, an object that is neither speeding up nor slowing down but only changing the direction that it moves is also said to be accelerating. In all four of our examples of fictitious forces or pseudoforces, notice that the frame of reference is accelerating. In the first example, the car was accelerating forward. In the second example, the car was slowing down, which again is a form of acceleration. In the third example, the car was changing the direction that it was moving, which again is a form of acceleration. In our fourth example, the entire frame of reference was rotating, which is also a form of acceleration. A frame of reference where there are no fictitious forces or pseudoforces is called an inertial frame of reference, while a frame of reference where fictitious forces or pseudoforces appear to exist is called a non-inertial frame of reference. It is not difficult to prove mathematically that all inertial frames of reference (where there are no fictitious forces or pseudoforces) are not accelerating relative to one another. It is also not difficult to prove mathematically that all non-inertial frames of reference (where fictitious forces or pseudoforces appear to exist) are accelerating relative to all inertial frames of reference, as our four examples illustrate. Since fictitious forces or pseudoforces appear to exist within non-inertial (accelerating) frames of reference, the laws of physics require exotic modifications when used within non-inertial frames of reference. 
Since there are no fictitious forces or pseudoforces within inertial (non-accelerating) frames of reference, the laws of physics do not require these exotic modifications when used within inertial frames of reference. More plainly, the laws of physics apply naturally from within inertial (non-accelerating) frames of reference, but the laws of physics do not naturally apply from within non-inertial (accelerating) frames of reference. All of the laws of physics we have discussed thus far in this course apply naturally from within inertial (non-accelerating) frames of reference. In particular, Galilean-Newtonian Relativity theory, Newton’s laws of motion, Newton’s theory of gravitation, Maxwell’s electromagnetic theory, Quantum Mechanics, and even Einstein’s Special Relativity theory all apply naturally from within inertial (non-accelerating) frames of reference. All of the laws of physics we have discussed thus far in this course do not apply naturally from within non-inertial (accelerating) frames of reference. Note that this is why Einstein’s Special Relativity theory is called Special Relativity. This theory only applies naturally from within special frames of reference, inertial (non-accelerating) frames of reference, just as all the laws of physics we have discussed thus far in this course apply naturally from within inertial (non-accelerating) frames of reference.

Einstein was extremely bothered by this restriction upon the laws of physics, in particular upon his Special Relativity theory. If the laws of physics are the mathematical equations that describe the universe, then we should feel free to apply them from within any frame of reference whatsoever. Consequently, Einstein realized that he must generalize his Special Relativity theory to a new theory of physics that could be applied from within any frame of reference whatsoever, whether inertial (non-accelerating) or non-inertial (accelerating). This new more general theory Einstein called General Relativity theory, since it is more general than his Special Relativity theory and indeed more general than all other laws of physics. Einstein insisted that this new General Relativity theory must apply naturally from within not only inertial (non-accelerating) frames of reference but from within non-inertial (accelerating) frames of reference as well. Fictitious forces or pseudoforces appear to act upon all objects from within non-inertial (accelerating) frames of reference. Einstein then realized that there is another force that acts upon all objects: gravitation. Einstein began to imagine that fictitious forces or pseudoforces must act like gravitational forces, and therefore his General Relativity theory must ultimately be a theory of gravity. To illustrate how fictitious forces or pseudoforces act like gravitational forces, consider a spaceship far from all stars and planets or any other large gravitating objects. The astronauts within this spaceship would feel weightless as long as the spaceship were not accelerating. However, suppose the spaceship had sufficient fuel to thrust the spaceship, causing an acceleration. 
While the spaceship accelerates, everyone and everything within the spaceship would feel fictitious forces or pseudoforces, and hence these fictitious forces or pseudoforces would feel like gravitational forces, even though the spaceship is far from all stars and planets or any other large gravitating objects. In fact, if the spaceship had sufficient fuel to thrust the spaceship with an acceleration of 9.8 meters per second per second, then the astronauts would feel the same gravity within the spaceship that they would feel if they were standing on the surface of the Earth. As long as the spaceship has sufficient fuel to continue to accelerate the spaceship, everyone and everything within the spaceship would feel gravity as if they were standing on Earth instead of in a spaceship in outer space! This example persuades us that we can turn gravity on within non-inertial (accelerating) frames of reference. We can also turn gravity off within non-inertial (accelerating) frames of reference. For example, suppose we are standing within an elevator on planet Earth. Now suppose the elevator cable breaks, causing the elevator to fall. We present two arguments to persuade us that everyone and everything within this falling elevator would feel weightless while falling. Firstly, everything falls toward the Earth with the same acceleration ignoring non-gravitational forces such as air resistance, as we discussed earlier in the course. Hence, everyone and everything within the elevator accelerates downward together. Consequently, if we were to take our keys out of our pocket for example and let go, our keys would not appear to fall down but would instead appear to simply float in front of us, since we ourselves and our keys and everything within the elevator are accelerating downward along with the elevator with the same acceleration. Secondly, since the elevator is accelerating downward, it is a non-inertial frame of reference. 
Therefore, everyone and everything within the elevator should feel a fictitious force or a pseudoforce upward that would exactly cancel the Earth’s downward gravity. These two arguments persuade us that everyone and everything within the falling elevator feels weightless. More generally, gravity is always turned off within all freely falling frames of reference. Caution: just as physicists use the word acceleration for any change in motion whatsoever, physicists use the term freely falling for any frame of reference moving only under the influence of gravity. Someone who is falling downward is said to be freely falling, but someone who is shot upward out of a cannon is also said to be freely falling even while they are moving upward. Someone who is shot out of a cannon at an angle is also said to be freely falling even though they are moving along a trajectory that at first takes them upward and then later takes them downward. Moons orbiting planets are freely falling even if the moon and the planet are not actually approaching each other. Planets orbiting stars are also freely falling even if the planet and the star are not actually approaching each other. In all such cases, gravity is turned off within freely falling frames of reference. For example, astronauts feel weightless while orbiting the Earth even though astronauts almost always orbit close enough to the Earth that its gravity is essentially as strong as the gravity on the surface of the Earth. As a counterintuitive example of this principle, consider a spaceship falling toward a planet. Most students believe that the astronauts within the spaceship would feel stronger and stronger gravity as their spaceship approaches the planet, but this is false. In actuality, the astronauts feel weightless during their entire journey falling toward the planet, since they are in a freely falling frame of reference. 
Assuming the planet has no atmosphere that would slow the spaceship down or burn the spaceship up, the astronauts within the spaceship would feel weightless during their entire journey, right up to the moment just before they crash upon the planet. Other astronauts standing upon the planet right next to the crash site feel the planet’s gravity, but the astronauts within the spaceship feel weightless even immediately before crashing, although they are right next to the astronauts standing upon the planet who do feel the planet’s gravity! Einstein struggled for roughly ten years to mathematically express all of these ideas, and in the year 1915 he finally formulated his General Relativity theory. Firstly, this new General Relativity theory states that we live in a four-dimensional spacetime with three spatial dimensions and one temporal dimension that all mix into one another. Although this is precisely what Special Relativity theory already asserts, this new theory in addition states that gravity is the curvature of our four-dimensional spacetime. According to Special Relativity theory, our four-dimensional spacetime has a flat (uncurved) geometry because Special Relativity does not include the effects of gravity. According to General Relativity theory, our four-dimensional spacetime has a curved geometry, since this new theory states that gravity is the curvature of our four-dimensional spacetime. To the present day, Einstein’s General Relativity is the only theory in all of physics that places all frames of reference, both inertial (non-accelerating) and non-inertial (accelerating), on equal footing. Einstein’s General Relativity theory may be applied from within any frame of reference whatsoever, whether or not there appear to be fictitious forces or pseudoforces from within the frame of reference.

All of the outrageous conclusions that Einstein deduced from Special Relativity are still true in General Relativity, but these conclusions are even more outrageous in this new theory. For example, does time dilation still occur? Does a clock still run slow when it moves according to General Relativity theory? Yes, but this effect is now even worse. According to General Relativity theory, a clock does not even need to be moving for it to run slow because gravity itself slows down time! In particular, stronger gravity will slow down time more, while weaker gravity will slow down time less. Time dilation that is caused by motion is called kinematic time dilation, which is predicted by both Special Relativity theory and General Relativity theory. However, the slowing of time by gravity is called gravitational time dilation, which is predicted only by General Relativity theory. This gravitational time dilation was considered outrageous a century ago, but this effect has actually been observed in recent decades. For example, suppose we place one atomic clock on the ground floor of a building, and suppose we place another atomic clock on the roof of that building. Even after synchronizing these two atomic clocks, they do not remain synchronized! The atomic clock on the ground floor is closer to the Earth and thus feels stronger gravity than the atomic clock on the roof, which is further from the Earth and thus feels weaker gravity. Therefore, the atomic clock on the ground floor will run slower and will lag further and further behind the atomic clock on the roof! Is Einstein actually claiming that whenever we walk upstairs or downstairs that our clocks are not synchronized with everyone else’s clocks? Yes! But then why do we never notice in our daily experience that all of our clocks read different times? The Earth’s gravity is sufficiently weak that this gravitational time dilation is so tiny that we do not notice it in our daily experience. 
Even the Sun’s gravity causes only tiny amounts of this gravitational time dilation. We would only notice these temporal changes if we were subjected to incredibly strong gravity, such as near a neutron star or a black hole. We will discuss black holes in detail shortly. The implications of this gravitational time dilation effect are staggering. For example, consider two identical twins who have lived together on the second floor of a building their entire lives. Hence, they are the same age. However, if one of these twins walks downstairs to the ground floor, that twin will age a tiny amount slower, since that twin’s time is now running slower. After walking back upstairs, that twin will now be a tiny amount younger than the twin who remained on the second floor! Our feet are younger than our head, since our feet are a little closer to the Earth than our head, thus causing our feet to age more slowly! As we discussed, the satellites orbiting the Earth all move at different speeds, resulting in kinematic time dilation. Moreover, all of the satellites orbiting the Earth are at various distances from the Earth. Hence, the satellites orbiting the Earth are subjected to varying gravitational field strengths from the Earth. Our own mobile telephones are with us on the surface of the Earth and therefore feel a stronger gravitational field strength than all satellites in orbit. As a result, all satellites as well as all of our mobile telephones suffer from gravitational time dilation. Indeed, the global positioning system (GPS) would not function correctly without taking into account both kinematic time dilation and gravitational time dilation.
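To see just how tiny gravitational time dilation is near the surface of the Earth, here is a minimal Python sketch. It assumes the weak-gravity approximation in which two clocks separated by a height h differ in rate by the fraction gh/c², with g equal to 9.8 meters per second per second. The thirty-meter building height and the function names are hypothetical values chosen only for illustration.

```python
G_SURFACE = 9.8                      # meters per second per second
C = 299_792_458.0                    # vacuum speed of light in meters per second
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def fractional_rate_difference(height_m):
    """Approximate fraction by which the lower clock runs slow, g h / c^2.
    Valid only for weak gravity and heights small compared with the Earth's radius."""
    return G_SURFACE * height_m / C ** 2

def lag_seconds(height_m, elapsed_seconds):
    """Time by which the lower clock falls behind the upper clock."""
    return fractional_rate_difference(height_m) * elapsed_seconds

building_height = 30.0  # meters, an assumed height for the building
print(f"Fractional rate difference: {fractional_rate_difference(building_height):.3e}")
print(f"Lag after one year: {lag_seconds(building_height, SECONDS_PER_YEAR):.3e} seconds")
```

Under these assumptions, the clock on the ground floor falls behind the clock on the roof by only about one tenth of one millionth of a second over an entire year, which is why we never notice this effect in our daily experience.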

Just as the vacuum speed of light c is an invariant according to Special Relativity theory, the vacuum speed of light c is still an invariant according to General Relativity theory. If we deduced kinematic time dilation from the invariance of the vacuum speed of light c in Special Relativity theory, we may deduce kinematic time dilation from the invariance of the vacuum speed of light c in General Relativity as well. Length contraction caused by motion is called kinematic length contraction, in correspondence with kinematic time dilation. If we deduced kinematic length contraction from the invariance of the vacuum speed of light c in Special Relativity theory, we may deduce kinematic length contraction from the invariance of the vacuum speed of light c in General Relativity as well. However, this effect is now even worse. According to General Relativity theory, an object does not even need to be moving for it to contract because gravity itself causes length contraction! In particular, stronger gravity will contract objects more, while weaker gravity will contract objects less. Just as the slowing of time by gravity is called gravitational time dilation, the contraction of space by gravity is called gravitational length contraction, which is predicted only by General Relativity theory.

Consider light that is emitted from the roof of a building that propagates to its ground floor. Because of gravitational length contraction, the wavelength of the light must contract as it approaches the ground floor, since the lower floors are closer to the Earth where gravity is stronger. However, the light must continue to propagate at the same speed, the vacuum speed of light c. The speed of any wave with wavelength λ and frequency f is determined by the equation v = f λ, where v is the speed (the velocity) of propagation of the wave, as we discussed earlier in the course. If the speed remains fixed and if the wavelength is contracted, then the frequency must increase by an appropriate amount to keep the product of the larger frequency f with the shorter wavelength λ equal to a fixed speed v (specifically c in this case). We may interpret this increased frequency as a blueshift. Conversely, consider light that is emitted from the ground floor of a building that propagates to its roof. Because of gravitational length contraction, the wavelength of the light must become less contracted (hence stretched) as it approaches the roof, since the higher floors are further from the Earth where gravity is weaker. However, the light must continue to propagate at the same speed, the vacuum speed of light c. Again, the speed of any wave with wavelength λ and frequency f is determined by the equation v = f λ, where v is the speed (the velocity) of propagation of the wave. If the speed remains fixed and if the wavelength is less contracted (hence stretched), then the frequency must decrease by an appropriate amount to keep the product of the smaller frequency f with the longer wavelength λ equal to a fixed speed v (specifically c in this case). We may interpret this decreased frequency as a redshift. As we discussed earlier in the course, motion causes the Doppler-Fizeau shift for any wave to occur. 
We now rename this Doppler-Fizeau shift the kinematic redshift (as well as the kinematic blueshift). The kinematic redshift (and blueshift) is predicted by both Special Relativity theory and General Relativity theory. However, we have just presented an argument for the gravitational redshift (as well as the gravitational blueshift), which is predicted only by General Relativity theory. More precisely, light that propagates from stronger gravitational fields toward weaker gravitational fields suffers from a gravitational redshift, while light that propagates from weaker gravitational fields toward stronger gravitational fields suffers from a gravitational blueshift. This gravitational redshift (and gravitational blueshift) has actually been observed. When an electron in an atom undergoes a transition from a higher-energy quantum state to a lower-energy quantum state, it must emit a photon with a specific frequency and a specific wavelength in accordance with the spectrum of the atom, as we discussed earlier in the course. If an atom on the ground floor of a building emits a photon that propagates toward the roof of the building, anyone on the roof will measure the frequency of that photon to be lower (or its wavelength to be longer) than the photon emitted from the same transition in an identical atom that happens to be located at the roof instead! Conversely, if an atom on the roof of a building emits a photon that propagates toward the ground floor of the building, anyone on the ground floor will measure the frequency of that photon to be higher (or its wavelength to be shorter) than the photon emitted from the same transition in an identical atom that happens to be located at the ground floor instead! Of course, the Earth’s gravity is sufficiently weak to cause only tiny amounts of this gravitational redshift (and gravitational blueshift). Even the Sun’s gravity causes only tiny amounts of this gravitational redshift (and gravitational blueshift). 
This gravitational redshift (and gravitational blueshift) only becomes severe with incredibly steep changes in gravity, such as near a neutron star or a black hole. Again, we will discuss black holes in detail shortly. Einstein’s General Relativity theory actually predicts a third type of redshift caused by the expansion of the universe called cosmological redshift, as we will discuss toward the end of the course. In summary, Einstein’s General Relativity theory predicts three different types of redshift: kinematic redshift caused by motion, gravitational redshift caused by the curvature of spacetime, and cosmological redshift caused by the expansion of the universe. All three of these redshifts have been observed for several decades, providing further evidence of the validity of Einstein’s General Theory of Relativity.
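The relation v = f λ with v fixed at c can be combined with a small gravitational shift in a short Python sketch. We assume the weak-gravity approximation in which light climbing through a height h has its wavelength stretched by the fraction gh/c² (and contracted by the same fraction when descending). The one-hundred-meter height, the five-hundred-nanometer wavelength, and the function name are hypothetical illustrative choices.

```python
G_SURFACE = 9.8        # meters per second per second
C = 299_792_458.0      # vacuum speed of light in meters per second

def shifted_light(emitted_wavelength_m, height_change_m):
    """Return (fractional shift, observed wavelength, observed frequency).
    A positive height change (light climbing) gives a redshift; a negative
    height change (light descending) gives a blueshift."""
    shift = G_SURFACE * height_change_m / C ** 2
    observed_wavelength = emitted_wavelength_m * (1.0 + shift)
    observed_frequency = C / observed_wavelength  # keeps f times lambda equal to c
    return shift, observed_wavelength, observed_frequency

emitted = 500e-9  # meters, an assumed wavelength of visible light
for height_change in (100.0, -100.0):
    shift, wavelength, frequency = shifted_light(emitted, height_change)
    kind = "redshift" if shift > 0 else "blueshift"
    print(f"height change {height_change:+.0f} m: {kind}, fractional shift {shift:+.2e}")
```

The fractional shift is of order 10⁻¹⁴ for a one-hundred-meter climb, so the frequency decreases and the wavelength increases by only about one part in one hundred trillion while their product remains c.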

Just as the vacuum speed of light c is the speed limit of the universe according to Special Relativity theory, the vacuum speed of light c is still the speed limit of the universe according to General Relativity theory. If c is still the speed limit of the universe, then nothing can move faster than that speed. We now realize that not even gravity can move faster than c! In fact, Einstein’s General Relativity theory states that gravity itself moves at the speed c, just as light moves at the speed c. The implications of this are shocking. For example, suppose that the Sun were abruptly removed from the Solar System right now at this very moment. Since it takes light roughly eight minutes to propagate from the Sun to the Earth, we would continue to see the Sun shining in the sky for roughly eight minutes after its removal. Then, we would see the Sun removed. Furthermore, since gravity also propagates at the same vacuum speed of light c, the Earth would continue moving along its elliptical orbit as if the Sun still attracted it for roughly eight minutes after the Sun’s removal! Then, the Earth would gravitationally feel that the Sun has been removed and would finally move off of its elliptical orbit!
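The figure of roughly eight minutes follows from dividing the distance between the Sun and the Earth by c. Here is a minimal Python sketch using the rounded distance of one hundred and fifty million kilometers; the function name is our own.

```python
C = 299_792_458.0              # vacuum speed of light in meters per second
SUN_EARTH_DISTANCE_M = 1.5e11  # roughly one hundred and fifty million kilometers

def travel_time_minutes(distance_m):
    """Minutes needed for anything moving at c to cover the given distance."""
    return distance_m / C / 60.0

delay = travel_time_minutes(SUN_EARTH_DISTANCE_M)
print(f"Light and gravity both take about {delay:.1f} minutes to reach the Earth")
```

The result is a little more than eight minutes, which is the delay before the Earth would either see or gravitationally feel the removal of the Sun.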

As we discussed earlier in the course, light is electromagnetic radiation or electromagnetic waves. More precisely, light is a propagating disturbance through an electromagnetic field. If gravity also moves at the same speed c, then there must be gravitational waves that are propagating disturbances through a gravitational field. According to Einstein’s General Relativity theory, gravity is actually the curvature of our four-dimensional spacetime, and hence this new theory predicts the existence of gravitational waves that are propagating disturbances through the curvature of our four-dimensional spacetime. In the year 1974, the American astrophysicists Russell Alan Hulse and Joseph Hooton Taylor discovered a binary neutron star system. These two neutron stars are orbiting sufficiently close to each other and orbiting sufficiently fast that they should be radiating significant amounts of gravitational waves. As these two neutron stars radiate gravitational waves, they must lose orbital energy, since energy must be conserved. Therefore, these two neutron stars must approach each other. Indeed, Hulse and Taylor measured that these two neutron stars are approaching each other by the precise amount that Einstein’s General Relativity theory predicts. Hulse and Taylor received the Nobel Prize for their achievement, and this binary neutron star system was named the Hulse-Taylor system in their honor. Nevertheless, this is not a direct detection of the gravitational wave itself. A direct detection of gravitational waves would require extraordinarily sensitive measurements of varying time dilation and varying length contraction as the crests and the troughs of the gravitational waves pass through the detector. The technology to make such measurements was not achieved until the year 2015, the one hundredth anniversary of Einstein’s General Relativity theory! 
Ever since that historic year, astronomers have directly detected several gravitational waves passing through the Earth. Most of these gravitational waves that astrophysicists have directly detected since the year 2015 were radiated from the collision and merger of binary black holes into single black holes in distant galaxies. This is a splendid manifestation of Einstein’s genius. His theory predicted the existence of gravitational waves, but it took one century for technologies to be developed that could directly detect their existence! Just as there is an entire Electromagnetic Spectrum of different wavelengths or frequencies of electromagnetic waves, there is an entire Gravitational Spectrum of different wavelengths or frequencies of gravitational waves. Although astrophysicists have spent decades observing the universe using electromagnetic waves from different bands across the Electromagnetic Spectrum to gain a more complete understanding of our universe, astrophysicists have just barely begun to observe the universe using gravitational waves from different bands across the Gravitational Spectrum. A completely new window has now been opened for astrophysicists to explore to gain an even more complete understanding of our universe.

If a certain amount of mass were concentrated into a single mathematical point of zero volume, then this point-mass would have infinite density. According to General Relativity theory, the gravity near this point-mass would be incredibly strong, since the curvature of the four-dimensional spacetime near this point-mass would be incredibly severe. The gravity would be so strong because of the severe spacetime curvature that an object too close to this point-mass would need to move faster than c to escape its gravity, but moving faster than c is forbidden. Mathematically, there is a sphere surrounding this point-mass that marks the boundary of no return. An object outside of this mathematical sphere may have hope of escaping the gravity of the point-mass, but an object that crosses inside of this mathematical sphere would have no hope of escaping the incredibly strong gravity of the point-mass. The object’s light would not even be able to escape from within the mathematical sphere. Thus, it would appear as if the object disappeared from our universe, as if it fell into a hole. This hole would appear black, since light cannot escape from within the mathematical sphere. So, objects falling toward the infinite-density point-mass would appear as if they are falling into a hole that is black. For several decades, these fantastic objects have been called black holes. The center of the black hole is its singularity, the point-mass of infinite density. The mathematical sphere surrounding the singularity is the event horizon, the boundary of no return. The radius of the event horizon is sometimes called the black hole radius but is more often called the Schwarzschild radius, named for the German physicist Karl Schwarzschild who mathematically derived the simplest black-hole solution to Einstein’s General Relativity theory. 
Karl Schwarzschild derived the following equation for the Schwarzschild radius (black hole radius) of the event horizon of a black hole: r_s = 2GM / c², where r_s is the Schwarzschild radius (black hole radius) of the event horizon, G is Newton’s gravitational constant of the universe, M is the mass of the black hole, and c is as usual the vacuum speed of light. Using this equation, we can easily calculate that the Schwarzschild radius (black hole radius) of a typical stellar black hole born from the Type II supernova of a very high mass star is very roughly eight kilometers. So, any object further than very roughly eight kilometers from the singularity of such a black hole may have hope of escaping its gravity, but any object closer than this distance from the singularity of such a black hole has crossed the event horizon and has no hope of escaping its gravity.
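As a rough numerical check of the eight-kilometer figure, the equation r_s = 2GM / c² can be inverted to ask what mass has a Schwarzschild radius of eight kilometers. The following Python sketch is only illustrative; the rounded values of G, c, and the solar mass are standard figures assumed here rather than taken from these notes, and the function names are our own.

```python
# Rounded physical constants in SI units (assumed standard values).
G = 6.674e-11          # Newton's gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8            # vacuum speed of light, m/s
SOLAR_MASS = 1.989e30  # one solar mass, kg


def schwarzschild_radius(mass_kg):
    """Return r_s = 2GM / c^2 in meters for a mass given in kilograms."""
    return 2.0 * G * mass_kg / C**2


def mass_from_schwarzschild_radius(radius_m):
    """Invert r_s = 2GM / c^2 to obtain the mass in kilograms."""
    return radius_m * C**2 / (2.0 * G)


# Eight kilometers, expressed in meters.
mass = mass_from_schwarzschild_radius(8.0e3)
print(f"Mass with an 8 km Schwarzschild radius: {mass / SOLAR_MASS:.1f} solar masses")
```

With these rounded constants the result is a little under three solar masses, which is consistent with describing eight kilometers as only a very rough figure for a stellar black hole.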

Nothing can escape from within the event horizon of a black hole, not even light. Hence, the event horizon of a black hole appears black, hence the name black hole. Many students believe that outer space is also black, thus preventing us from ever imaging the event horizon of a black hole against the surrounding space. Although this would indeed be the case for a completely isolated black hole, in actuality diffuse gas fills the entire universe. Hence, it should be possible to see the black event horizon of a black hole against the gases of the surrounding space. Some black holes have accretion disks around them, such as in X-ray binaries, and it should be possible to see the black event horizon of such a black hole against the surrounding accretion disk. Unfortunately, the Schwarzschild radius of a typical stellar black hole is very roughly eight kilometers, and no telescope is large enough to provide sufficient resolution (magnification) to image such a small size, even if the black hole resided as near as within the solar neighborhood. However, there are supermassive black holes in our universe, as we will discuss later in the course. The Schwarzschild radius of a typical supermassive black hole is at least one million kilometers! As we discussed earlier in the course, two telescopes on opposite sides of planet Earth used together as a single interferometer would in principle have the same resolving power as a single telescope the size of planet Earth. Using many radio telescopes working together as a single interferometer, astronomers succeeded in the year 2019 in imaging the event horizon of a supermassive black hole at the center of a distant galaxy. Although this galaxy is roughly sixteen megaparsecs (roughly fifty million light-years) distant, the supermassive black hole at its center has a Schwarzschild radius of roughly fifteen billion kilometers. 
This is roughly the size of our Solar System, from the Sun all the way out to the Kuiper belt just beyond the orbit of Neptune! In the radio images of this supermassive black hole, we actually see its black event horizon against the gases of the accretion disk that surrounds the supermassive black hole. In the year 2022, astronomers published a radio image of the event horizon of the supermassive black hole at the center of our own Milky Way Galaxy. Again, this image was produced by many radio telescopes working together as a single interferometer.
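To see why an interferometer the size of the Earth is needed, we can estimate how small this event horizon appears on the sky. The Python sketch below uses the distance and the Schwarzschild radius quoted above together with the small-angle approximation. The observing wavelength of 1.3 millimeters, the diameter of the Earth, and the standard diffraction estimate 1.22 λ / D for the finest angle that a telescope of diameter D can resolve are all assumptions introduced here for illustration.

```python
import math

KM_PER_MEGAPARSEC = 3.086e19  # kilometers in one megaparsec
MICROARCSEC_PER_RADIAN = math.degrees(1.0) * 3600.0 * 1.0e6

# Figures quoted in the notes.
distance_km = 16.0 * KM_PER_MEGAPARSEC
schwarzschild_radius_km = 15.0e9

# Assumptions for illustration (not from the notes).
wavelength_m = 1.3e-3       # assumed radio observing wavelength
earth_diameter_m = 1.274e7  # approximate diameter of the Earth

# Small-angle approximation: angle = physical diameter / distance.
horizon_angle = (2.0 * schwarzschild_radius_km / distance_km
                 * MICROARCSEC_PER_RADIAN)

# Diffraction estimate for a telescope as wide as the Earth.
resolution = 1.22 * wavelength_m / earth_diameter_m * MICROARCSEC_PER_RADIAN

print(f"Angular diameter of the event horizon: {horizon_angle:.0f} microarcseconds")
print(f"Earth-sized telescope resolution:      {resolution:.0f} microarcseconds")
```

Under these assumptions both angles come out at a few tens of microarcseconds at most, so they are comparable in order of magnitude. This suggests why nothing smaller than a planet-sized interferometer could hope to image such an object.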

Although astronomers have been certain for decades that black holes actually exist, the first black hole ever discovered, the compact object in the X-ray binary Cygnus X-1, was not discovered until after Einstein died. In fact, Einstein himself did not believe that these strange objects actually existed in our universe. Nevertheless, even while Einstein was alive, physicists recognized an important application of Karl Schwarzschild’s mathematical discovery of this point-mass solution (black hole solution) to Einstein’s General Relativity theory. The spacetime curvature (the gravity) outside of an isotropic (spherically symmetric) distribution of mass is exactly the same as if all of its mass were concentrated at its center. More plainly, the spacetime curvature (the gravity) outside of a spherical distribution of mass (beginning at its surface and extending outward) would be the same spacetime curvature (the same gravity) as a black hole of the same total mass placed at the center of the spherical distribution. This statement is called the Birkhoff theorem, named for the American mathematician George David Birkhoff who first mathematically proved this important result. As an application of the Birkhoff theorem, the Earth is a spherical distribution of mass to an excellent approximation. Therefore, the gravity outside of the Earth (beginning at its surface and extending outward) is nearly exactly equal to the gravity of a black hole with the same mass as the Earth placed at the center of the Earth. Students often ask for a description of how the gravity of a black hole would feel if we were near the black hole but still outside its event horizon. According to the Birkhoff theorem, every moment of our lives we feel nearly precisely the same gravity from the Earth as we would feel from a black hole equal in mass to the Earth and placed at the center of the Earth, roughly 6400 kilometers beneath our feet! 
As another application of the Birkhoff theorem, the Sun is also a spherical distribution of mass to an excellent approximation. Therefore, the gravity outside of the Sun (beginning at its surface and extending outward) is nearly exactly equal to the gravity of a black hole with the same mass as the Sun placed at the center of the Sun. More plainly, the gravity with which the Sun attracts the planets and everything else in the Solar System is nearly exactly the same gravity as a black hole with the same mass as the Sun placed at the center of the Sun. Although the Sun is a low mass star and will never suffer from a supernova, imagine that the entire Sun were to suddenly collapse into a black hole with no change in mass. Most students believe that its gravity would now become so strong that it would begin to suck in the planets one by one, beginning with Mercury then Venus then Earth and so on and so forth, but this is false. According to the Birkhoff theorem, the Sun already creates nearly the same gravity as a black hole of the same mass placed at its center. Therefore, virtually nothing would happen to the orbits of the planets if the Sun were to collapse into a black hole. The planets would continue to orbit that black hole with almost precisely the same orbits that they have always enjoyed! Of course, all of the planets would also begin to cool, since they would no longer receive any light from this hypothetically collapsed Sun. All of us on Earth would eventually freeze to death, although we would die long before then, since nearly all life on Earth depends entirely upon sunlight, as we discussed earlier in the course. Nevertheless, the orbits of the planets as well as the orbits of everything else in the Solar System would remain almost exactly the same. 
Note that the Birkhoff theorem is also true in Newton’s theory of gravitation: the gravity outside of a spherical distribution of mass (beginning at its surface and extending outward) would be the same gravity as a point-mass of the same total mass placed at the center of the spherical distribution. For example, if we were to calculate the gravitational force between the Earth and the Moon according to Newton’s theory of gravitation, we would use Newton’s law of universal gravitation: F = G m_1 m_2 / r², where m_1 and m_2 are the masses of the Earth and the Moon and r is the distance between the Earth and the Moon, as we discussed earlier in the course. However, what value do we use for this distance? After all, different parts of the Earth are different distances from different parts of the Moon. However, both the Earth and the Moon are spherical distributions of mass to an excellent approximation. Therefore, the gravity outside of each of them (beginning at their surfaces and extending outward) is nearly exactly equal to the gravity of point-masses placed at their centers, with the same masses as the Earth and the Moon of course. Hence, the Birkhoff theorem reveals that we must always use the center-to-center distance whenever calculating the gravitational force between the Earth and the Moon or between almost any pair of objects in the entire universe.
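The point-mass reasoning above can be illustrated numerically with Newton’s law of universal gravitation. The following Python sketch is a minimal illustration, not a precise calculation: the masses and the Earth-Moon distance are rounded standard values assumed here, the function names are our own, and the orbital period uses the standard Newtonian result for a circular orbit around a point-mass. Notice that the radius of the central body never appears in any of these formulas, only its mass and the center-to-center distance.

```python
import math

G = 6.674e-11          # Newton's gravitational constant, m^3 kg^-1 s^-2
EARTH_MASS = 5.972e24  # kg (assumed standard value)
MOON_MASS = 7.342e22   # kg (assumed standard value)
SUN_MASS = 1.989e30    # kg (assumed standard value)


def gravitational_force(m1, m2, r):
    """Newton's law of universal gravitation with center-to-center distance r."""
    return G * m1 * m2 / r**2


def orbital_period(central_mass, orbital_radius):
    """Period in seconds of a circular orbit around a point-mass."""
    return 2.0 * math.pi * math.sqrt(orbital_radius**3 / (G * central_mass))


# Force between the Earth and the Moon (assumed center-to-center distance).
earth_moon_distance = 3.844e8  # meters
force = gravitational_force(EARTH_MASS, MOON_MASS, earth_moon_distance)

# Force on one kilogram at the Earth's surface, roughly 6400 km from the center.
surface_gravity = gravitational_force(EARTH_MASS, 1.0, 6.4e6)

# Earth's orbital period, with the Sun treated as a point-mass
# roughly one hundred and fifty million kilometers away.
period_days = orbital_period(SUN_MASS, 1.5e11) / 86400.0

print(f"Earth-Moon force: {force:.2e} newtons")
print(f"Force on 1 kg at the Earth's surface: {surface_gravity:.1f} newtons")
print(f"Earth's orbital period: {period_days:.0f} days")
```

Since only the mass of the Sun enters the orbital period, replacing the Sun with a black hole of the same mass would leave this estimate unchanged, in agreement with the argument above.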

Today, we know that black holes do actually exist in our universe, and we also know that black holes form when the core of a very high mass star is able to overcome neutron degeneracy pressure, since its mass is greater than the Tolman-Oppenheimer-Volkoff limit. If neutron degeneracy pressure cannot halt the collapse of the core, then nothing can halt the collapse of the core. The core continues collapsing all the way down to a mathematical point, a black hole. This is the ultimate triumph of gravity. This reveals another interpretation of the black hole radius (the Schwarzschild radius). As we discussed, the equation for the black hole radius (the Schwarzschild radius) is r_s = 2GM / c². Notice that the only variable that determines the black hole radius (the Schwarzschild radius) according to this equation is the mass M, since G (Newton’s gravitational constant of the universe) and c (the vacuum speed of light) and certainly 2 are all fixed numbers. Therefore, there is nothing stopping us from calculating the Schwarzschild radius of any object in the universe, not just black holes. For example, we can easily calculate that the Schwarzschild radius of our Sun is roughly three kilometers. Many students protest this calculation, since the Sun is not a black hole and will in fact never become a black hole, since the Sun is a low mass star. Nevertheless, there is nothing stopping us from using the mass of the Sun in this Schwarzschild radius equation. If the Sun is not a black hole and will in fact never become a black hole, then how do we interpret this three-kilometer Schwarzschild radius for the Sun? Firstly, we note that the actual radius of the Sun (roughly seven hundred thousand kilometers) is much much larger than its Schwarzschild radius (roughly three kilometers). As a result, the Sun’s gravity is weak as compared to what the Sun’s gravity would be if we could collapse it into a black hole. 
Furthermore, this three-kilometer Schwarzschild radius for the Sun would be the size we would need to crush the Sun down into before its own self-gravity became strong enough to crush itself all the way down into a black hole. In other words, if we were to crush the entire mass of the Sun down to a radius of less than three kilometers, the Sun would not be able to escape from its own self-gravity, and the Sun would crush itself all the way down to a black hole. The Earth’s Schwarzschild radius is roughly nine millimeters. The actual radius of the Earth is nearly 6400 kilometers, which is much much larger than nine millimeters. Hence, the Earth’s gravity is weak as compared to what the Earth’s gravity would be if we could collapse it into a black hole. Furthermore, this nine-millimeter Schwarzschild radius for the Earth would be the size we would need to crush the Earth down into before its own self-gravity became strong enough to crush itself all the way down into a black hole. In other words, if we were to crush the entire mass of the Earth down to a radius of less than nine millimeters, the Earth would not be able to escape from its own self-gravity, and the Earth would crush itself all the way down to a black hole. Note that nine millimeters is almost ten millimeters, which is equal to one centimeter. This would be a Schwarzschild diameter of roughly two centimeters, which is roughly one inch since one inch is exactly 2.54 centimeters. Therefore, we would have to crush the entire mass of the Earth down to a size of roughly one inch to turn it into a black hole! The Schwarzschild radius of a typical human is ten billion times smaller than the nucleus of an atom! In other words, we would have to crush our bodies down to this size before our own self-gravity becomes strong enough to crush our bodies all the way down into a black hole!
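The Schwarzschild radii quoted above can be reproduced directly from r_s = 2GM / c². The following Python sketch is illustrative only; the masses of the Sun and the Earth are rounded standard values, while the human mass of seventy kilograms and the nuclear size of one femtometer are assumptions chosen here for the comparison.

```python
G = 6.674e-11  # Newton's gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8    # vacuum speed of light, m/s


def schwarzschild_radius(mass_kg):
    """Return r_s = 2GM / c^2 in meters for a mass given in kilograms."""
    return 2.0 * G * mass_kg / C**2


masses_kg = {
    "Sun": 1.989e30,
    "Earth": 5.972e24,
    "human (assumed 70 kg)": 70.0,
}

for name, mass in masses_kg.items():
    print(f"{name}: r_s = {schwarzschild_radius(mass):.3g} meters")

# Compare the human value with an assumed nuclear size of one femtometer.
NUCLEUS_SIZE_M = 1.0e-15
ratio = NUCLEUS_SIZE_M / schwarzschild_radius(70.0)
print(f"An atomic nucleus is roughly {ratio:.1e} times larger")
```

With these inputs the Sun comes out near three kilometers, the Earth near nine millimeters, and the ratio for a human near ten billion, in line with the figures quoted above.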

Einstein was ridiculed for Special Relativity theory, and he was even more harshly ridiculed for General Relativity theory, but it would only be a couple years after he proposed General Relativity theory that his theories would be successfully tested. For example, although the orbits of the planets around the Sun are ellipses, those elliptical orbits do not remain fixed. A planet’s elliptical orbit around the Sun actually suffers a very slow orbital precession. This orbital precession is caused mostly by the gravitational tugs of the other planets, primarily Jupiter as we discussed earlier in the course. However, Mercury’s orbit has a very tiny anomalous orbital precession that could not be explained using Newton’s theory of gravity. The amount of Mercury’s anomalous orbital precession is roughly forty-three arcseconds per century. This is a fantastically tiny orbital shift. According to General Relativity theory, the curvature of spacetime caused by the Sun’s gravity causes a planet’s orbit to suffer a tiny orbital precession in addition to the orbital precession caused by the gravitational tugs from the other planets, primarily Jupiter. This tiny extra precession is called general-relativistic orbital precession. When Einstein calculated the amount of this general-relativistic orbital precession for Mercury’s orbit caused by the spacetime curvature of the Sun’s gravity, he obtained exactly forty-three arcseconds per century! Although this superb achievement convinced Einstein that his General Relativity theory was superior to Newton’s theory of gravity, most physicists were still not convinced, and so Einstein proposed the following experiment. According to his General Relativity theory, everything in the universe feels gravity, including light itself. More correctly, the curvature (the gravity) of our four-dimensional spacetime deflects the trajectory of everything, including light itself. 
More plainly, light falls in gravity just as everything else falls in gravity. In recent decades, astronomers have observed that the gravity of galactic clusters bends the light from distant galaxies, thus distorting the image of the distant galaxies. Since this is rather like the glass of a lens bending light, the gravity of a galactic cluster is called a gravitational lens. If the distant galaxy, the gravitational lens, and our own Milky Way Galaxy happen to form a nearly straight line, the light of the distant galaxy bends into the shape of a ring around the galactic cluster. This is called an Einstein ring. If the gravitational lens happens to be slightly displaced from the line connecting the distant galaxy and our own Milky Way Galaxy, the light of the distant galaxy bends into two duplicate images in the shape of arcs around the galactic cluster. These are called Einstein arcs. If the gravitational lens happens to be even more displaced from the line connecting the distant galaxy and our own Milky Way Galaxy, the light of the distant galaxy bends into four duplicate images around the galactic cluster. This is called an Einstein cross. The amount by which light is deflected by the Earth’s weak gravity is fantastically tiny. This is why we do not notice light falling downward in our daily experience. The Sun’s gravity is stronger than the Earth’s gravity, but even the Sun’s gravity is so weak that no one ever noticed the deflection of light around the Sun before Einstein. Using his General Relativity theory, Einstein calculated that light should be deflected by roughly 1.75″ (1.75 arcseconds or 1.75 seconds of arc) around the surface of the Sun. Although this is an incredibly small angle, it was measurable a century ago. However, we cannot see any stars in the daytime, besides the Sun of course! So, measuring the deflection of starlight around the surface of the Sun seemed hopeless. 
However, during the totality of a total solar eclipse, the sky becomes sufficiently dark that stars become visible, as we discussed earlier in the course. A total solar eclipse was scheduled to occur in the year 1919, and many physicists gathered for this eclipse for the purpose of proving Einstein wrong. When totality occurred, starlight was indeed deflected around the surface of the Sun, and astronomers measured the deflection to be 1.75″ (1.75 arcseconds or 1.75 seconds of arc), in precise agreement with Einstein’s General Relativity theory! Practically overnight, Einstein went from being ridiculed to being considered one of the most brilliant men, if not the most brilliant man, who ever lived. A mediocre physicist who struggled with mathematics had discovered correct theories of our universe using only the power of his own genius.
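Both of these classical tests can be checked with short calculations. The Python sketch below is a minimal illustration under stated assumptions: it uses the standard general-relativistic results for the deflection of light grazing a mass, 4GM / (c²R), and for the extra precession per orbit, 6πGM / (c²a(1 − e²)), neither of which is written out in these notes. The solar radius and Mercury’s semi-major axis a, eccentricity e, and orbital period are rounded standard values assumed here.

```python
import math

G = 6.674e-11        # Newton's gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # vacuum speed of light, m/s
SUN_MASS = 1.989e30  # kg (assumed standard value)
SUN_RADIUS = 6.957e8  # meters (assumed; roughly seven hundred thousand km)
ARCSEC_PER_RADIAN = math.degrees(1.0) * 3600.0


def light_deflection_arcsec(mass, closest_approach):
    """Deflection of light passing a mass at the given distance, in arcseconds."""
    return 4.0 * G * mass / (C**2 * closest_approach) * ARCSEC_PER_RADIAN


def precession_arcsec_per_century(mass, semi_major_axis, eccentricity, period_days):
    """General-relativistic orbital precession accumulated over one century."""
    per_orbit = (6.0 * math.pi * G * mass
                 / (C**2 * semi_major_axis * (1.0 - eccentricity**2)))
    orbits_per_century = 36525.0 / period_days
    return per_orbit * orbits_per_century * ARCSEC_PER_RADIAN


deflection = light_deflection_arcsec(SUN_MASS, SUN_RADIUS)

# Assumed orbital elements for Mercury.
precession = precession_arcsec_per_century(SUN_MASS, 5.791e10, 0.2056, 87.969)

print(f"Deflection of starlight at the Sun's surface: {deflection:.2f} arcseconds")
print(f"Mercury's extra precession: {precession:.1f} arcseconds per century")
```

With these rounded inputs the deflection comes out close to 1.75 arcseconds and the precession close to forty-three arcseconds per century, matching the values discussed above.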

Einstein once claimed that when he studied physics, “I want to know how God created this world … I want to know His thoughts.” In other words, Einstein believed that God not only created the universe, but God also authored the mathematical equations that describe the universe, the laws of physics. Moreover, Einstein believed that God authored a single ultimate mathematical equation that completely describes the universe. If the universe is described by a single ultimate mathematical equation, Einstein believed that that equation should be deducible from pure logic, from pure mathematics. Einstein once claimed that when he studied physics he wanted to know “whether God had any choice in the creation of the world.” In other words, Einstein believed that if the ultimate mathematical equation that describes the universe is deducible from pure logic, from pure mathematics, then the laws of the universe could not possibly be different from what they actually are. Einstein also believed that this ultimate equation must be mathematically beautiful, just as he believed his own General Relativity theory was mathematically beautiful. In fact, Einstein claimed that if the deflection of starlight around the Sun was not as his theory predicted, then he would have “pitied the Lord because it would have proven that He did not create the universe correctly.” Although this quotation seems to imply that Einstein was so arrogant that he believed himself to be more intelligent than God, this quotation actually reveals that Einstein was humbled by the mathematical beauty of the universe that God created.
This is summarized by another quotation by Einstein, “Subtle is the Lord, but malicious He is not.” In other words, the laws of physics that describe the universe may not be obvious and thus may require a genius to discover them, but God is not evil and hence God would not create a universe that was so complicated that humans would not be able to discover the mathematical equations that describe it. On the other hand, Einstein also once said, “The most incomprehensible thing about the universe is that it is comprehensible.” In other words, why did God decide to create the universe governed by beautiful mathematical equations? We could ask this question the other way around. What is it about the human mind that it is able to not only study and to understand the universe but beyond this to actually discover the mathematical equations that describe the universe? What is it about a genius such as Albert Einstein that he is able to capture the mind of God, to actually discover the mathematical equations that God authored when He created the universe? Einstein spent the last few decades of his life living in New Jersey trying to discover the ultimate theory of the universe that he believed God authored when He created the universe. Today, we would call such a theory a Super Unification Theory or a Theory of Everything, as we will discuss toward the end of the course. Einstein did not succeed in his quest, but other physicists in the decades after Einstein have succeeded in bringing us much closer to this ultimate theory than Einstein could have ever dreamt, as we will also discuss toward the end of the course. Nevertheless, many physicists agree that no other person single-handedly advanced our understanding of the universe more than Albert Einstein.

This webpage was most recently modified on Wednesday, the twentieth day of November, anno Domini MMXXIV, at 03:45 ante meridiem EST.