This is one of the webpages of Libarid A. Maljian at the Department of Physics at CSLA at NJIT.
New Jersey Institute of Technology
College of Science and Liberal Arts
Department of Physics
Introductory Astronomy and Cosmology, Section 101
Fall 2024
Third Examination lecture notes
Our Star, the Sun
The Sun is a star. We know more about the Sun than about any other
star, since the Sun is by far the closest star to the Earth. The Sun is roughly one hundred and fifty
million kilometers from the Earth. This
seems distant by human standards, but in fact this is
extremely close by astronomical standards.
The nearest stars besides the Sun are more than two hundred thousand
times farther from the Earth than the Sun is. Therefore, the Sun is indeed extremely close
to the Earth by astronomical standards, enabling astrophysicists to learn much
more about the Sun than any other star in the universe.
Astrophysicists use the
symbol R☉ to
denote the radius of the Sun, as we discussed earlier in the course. The radius of the Sun R☉ is such a fundamental unit in stellar astrophysics
that it is called a solar radius. The Sun is enormous; one solar radius R☉ is roughly equal to seven hundred thousand
kilometers. This is roughly one hundred
times the Earth’s radius. In other
words, one solar radius R☉ is
roughly equal to 100R⊕, where
astrophysicists use the symbol R⊕ to
denote the radius of the Earth, as we discussed earlier in the course. Since one solar radius R☉ is roughly equal to 100R⊕, the Sun
has a volume roughly one million times the Earth’s volume, since the volume of
a sphere is directly proportional to the cube of its radius and one hundred
cubed is one million. In other words, we
could fit roughly one million Earths inside the Sun! Astrophysicists use the symbol M☉ to denote the mass of the Sun, as we discussed
earlier in the course. The mass of the
Sun M☉ is such
a fundamental unit in stellar astrophysics that it is called
a solar mass. The mass of the Sun is
tremendous; one solar mass M☉ is roughly one thousand times the mass of Jupiter,
which is itself more massive than all the other planets of the Solar System
combined. Therefore, one solar mass M☉ is several hundred times the mass of everything else in
the Solar System combined. More
precisely, one solar mass M☉ is
roughly two nonillion kilograms. (Please
refer to the following multiplication table, where each number is one thousand
times the previous number: one, one thousand, one million, one billion, one
trillion, one quadrillion, one quintillion, one sextillion, one septillion, one
octillion, one nonillion, one decillion.
Caution: this multiplication table is only correct in American
English. These same words are used for different numbers in British English.) Astrophysicists have determined the mass of
the Sun using Kepler’s third law. There
are eight planets and millions of asteroids and millions of comets all orbiting
the Sun, and astrophysicists use their orbital parameters together with
Kepler’s third law to determine the mass of the Sun. Even though all these different objects have
completely different orbits, Kepler’s third law always yields the same result
for the mass of the Sun. We can combine
the distance to the Sun with the intensity of sunlight we receive from the Sun
to calculate the luminosity of the Sun.
The luminosity of any object is the total amount of energy it radiates
every second, commonly known as the power output. The luminosity or the power output of any
object is measured in watts. Astrophysicists use cursive (script) ℒ for
luminosity. For any object with
luminosity ℒ, the intensity of the light I
at a distance r from the object is
given by the equation I = ℒ / (4πr²). This equation assumes that the object
radiates energy isotropically (equally in all
directions). Thus, the total energy
radiated by the object cuts through a sphere centered on the object, and the
surface area of a sphere of radius r
is 4πr². This equation also reveals why a lightbulb,
for example, looks brighter when closer and dimmer when farther away. Doesn’t the
lightbulb radiate a constant luminosity (constant power output) regardless of
distance? Indeed it does, but that same
luminosity has spread out over a large sphere if we are far from the lightbulb. Hence, that constant luminosity is diluted over the large sphere, and thus a smaller
fraction of that luminosity enters our eye.
Conversely, that same luminosity is concentrated over a small sphere if
we are close to the lightbulb, and thus a larger fraction of that luminosity enters
our eye. We certainly know our distance
from the Sun, and we can measure the intensity of sunlight at our distance from
the Sun. Thus, the only unknown
remaining in the equation I = ℒ / (4πr²) is
the luminosity of the Sun.
Astrophysicists use the symbol ℒ☉ to denote the luminosity of the Sun. The luminosity of the Sun ℒ☉
is such a fundamental unit in stellar astrophysics that it is
called a solar luminosity. The
luminosity of the Sun is enormous; one solar luminosity ℒ☉
is roughly four hundred septillion watts.
(Again, please refer to the above multiplication table.) The Sun has been radiating roughly four
hundred septillion joules of energy every second for roughly five billion years, and it
will continue to do so every second for the next roughly five billion
years! This raises the following question:
what is the source of this incredible luminosity? More plainly, why does the Sun shine? We will reveal the answer to this question
shortly.
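As a rough numerical check of the two measurements described above, the following Python sketch applies Kepler’s third law (in Newton’s form) to the Earth’s orbit and then inverts I = ℒ / (4πr²) for the luminosity. The gravitational constant, the length of the year, and the measured intensity of sunlight at the Earth (roughly 1361 watts per square meter) are standard values that these notes do not quote, so treat them as assumptions; the function names are purely illustrative.

```python
import math

# Assumed standard values (not quoted in these notes).
G = 6.674e-11                   # gravitational constant, N m^2 / kg^2
YEAR_S = 365.25 * 24 * 3600     # Earth's orbital period, seconds
SUNLIGHT_W_M2 = 1361.0          # intensity of sunlight at the Earth, W / m^2

# From the notes: the Sun is roughly 150 million kilometers away.
R_EARTH_SUN_M = 1.5e11


def solar_mass_from_orbit(a_m, period_s):
    """Newton's form of Kepler's third law: M = 4 pi^2 a^3 / (G T^2)."""
    return 4 * math.pi**2 * a_m**3 / (G * period_s**2)


def luminosity_from_intensity(intensity_w_m2, r_m):
    """Invert I = L / (4 pi r^2) for a source that radiates isotropically."""
    return intensity_w_m2 * 4 * math.pi * r_m**2


mass_kg = solar_mass_from_orbit(R_EARTH_SUN_M, YEAR_S)
lum_w = luminosity_from_intensity(SUNLIGHT_W_M2, R_EARTH_SUN_M)
print(f"Mass of the Sun:       {mass_kg:.2e} kg")
print(f"Luminosity of the Sun: {lum_w:.2e} W")
```

Both results land close to the rounded figures used in these notes: about two nonillion kilograms and about four hundred septillion watts.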
The surface temperature of
the Sun is roughly six thousand kelvins.
Astrophysicists have determined the Sun’s surface temperature using two
different methods. Firstly, we can graph
the amount of light from the Sun as a function of the wavelength of the
light. The resulting graph is a continuous
blackbody spectrum, although there is an absorption spectrum superimposed upon
that continuous blackbody spectrum as we will discuss
shortly. From the peak of this
continuous blackbody spectrum, we can calculate the surface temperature of the
Sun. More plainly, we are calculating
the surface temperature of the Sun from its color. As we discussed earlier in the course, the
amount of energy radiated from a hot, dense object often follows the blackbody
spectrum, which is a continuous spectrum with its peak radiation within a band
of the Electromagnetic Spectrum determined by the temperature of the
object. In particular, hotter
temperatures correspond to higher photon energies (which are also at higher
frequencies and shorter wavelengths), while cooler temperatures correspond to
lower photon energies (which are also at lower frequencies and longer
wavelengths). In other words, a hot,
dense object’s primary radiation is displaced as its
temperature changes. This is the Wien
displacement law. More precisely, the
Wien displacement law states that the wavelength of a hot, dense object’s
primary radiation is inversely proportional to its temperature, assuming we
measure temperature on an absolute scale, in units such as kelvins or degrees Rankine. At one or two thousand kelvins, objects
radiate primarily red visible light. At
three or four thousand kelvins, objects radiate primarily orange visible
light. At five or six thousand kelvins,
objects radiate primarily yellow visible light.
At roughly ten thousand kelvins, objects radiate primarily blue visible
light. Notice how hotter temperatures
displace the primary radiation to higher and higher photon energies (which are
also higher and higher frequencies and shorter and shorter wavelengths), while
cooler temperatures displace the primary radiation to lower and lower photon
energies (which are also lower and lower frequencies and longer and longer
wavelengths). It is
commonly known that the Sun is a yellow star. For example, every young child will use a
yellow crayon when asked to draw the Sun.
From that yellow color, we can use the Wien displacement law to
calculate that the surface temperature of the Sun is roughly six thousand
kelvins. We can also calculate the
surface temperature of any hot, dense object using the Stefan-Boltzmann law,
which states that the luminosity of any hot, dense object is directly
proportional to the product of its surface area and the fourth power of its
surface temperature. Since the shape of
the Sun is very nearly a sphere and the surface area of a sphere of radius R is 4πR², the
Stefan-Boltzmann law for the Sun states ℒ = σ(4πR²)T⁴, where T is
the surface temperature. Also, σ (the
lowercase Greek letter sigma) is a fixed number called the Stefan-Boltzmann
constant. Warning: we use lowercase r for the distance from the hot object,
and we use uppercase (capital) R for
the actual radius of the hot object. In
particular for the Sun, r is roughly
one hundred and fifty million kilometers (our distance from the Sun), while R is roughly seven hundred thousand
kilometers (the actual radius of the Sun).
We already determined the luminosity of the Sun, and we certainly know
the radius of the Sun. Therefore, the
only unknown remaining in the Stefan-Boltzmann law ℒ = σ(4πR²)T⁴ is the surface temperature of the Sun, which we again
calculate to be roughly six thousand kelvins, consistent with the
Wien-displacement-law method.
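The Stefan-Boltzmann method described above can be sketched numerically as follows. This is only an illustration under stated assumptions: the Stefan-Boltzmann constant and the Wien displacement constant are standard values not quoted in these notes, the luminosity and radius are the rounded figures from the notes, and the function names are hypothetical.

```python
import math

SIGMA = 5.670e-8        # Stefan-Boltzmann constant, W / (m^2 K^4) (assumed standard value)
WIEN_B = 2.898e-3       # Wien displacement constant, m K (assumed standard value)

L_SUN_W = 4e26          # from the notes: roughly four hundred septillion watts
R_SUN_M = 7e8           # from the notes: roughly seven hundred thousand kilometers


def surface_temperature(luminosity_w, radius_m):
    """Solve L = sigma * (4 pi R^2) * T^4 for the surface temperature T."""
    area = 4 * math.pi * radius_m**2
    return (luminosity_w / (SIGMA * area)) ** 0.25


def peak_wavelength_nm(temperature_k):
    """Wien displacement law: peak wavelength = b / T, returned in nanometers."""
    return WIEN_B / temperature_k * 1e9


t_sun = surface_temperature(L_SUN_W, R_SUN_M)
peak_nm = peak_wavelength_nm(t_sun)
print(f"Surface temperature: {t_sun:.0f} K")
print(f"Peak wavelength:     {peak_nm:.0f} nm")
```

With these rounded inputs the temperature comes out a little under six thousand kelvins, and the corresponding peak wavelength falls near five hundred nanometers, within the visible band.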
From the absorption spectral
lines (the Fraunhofer lines, as we discussed earlier
in the course) superimposed upon the Sun’s continuous blackbody spectrum, we
can determine the composition of the Sun.
We discover that the Sun is composed of all the atoms on the Periodic
Table of Elements, but not in equal amounts.
Only two atoms account for close to one hundred percent of the Sun’s
mass; all the other atoms on the Periodic Table of Elements account for only a
tiny fraction (tiny percentage) of the Sun’s mass. Roughly seventy-five percent (three-quarters)
of the Sun’s mass is hydrogen, and roughly twenty-five percent (one-quarter) of
the Sun’s mass is helium. Again, all the
other atoms on the Periodic Table of Elements make up a tiny fraction (tiny
percentage) of the Sun’s mass.
The Sun radiates roughly four
hundred septillion joules of energy every second.
The Sun has been radiating this tremendous luminosity for roughly five
billion years, and it will continue to do so for the next roughly five billion
years. What is the source of this
incredible luminosity? More plainly, why
does the Sun shine? This question was
one of the great scientific debates of the 1800s (the
nineteenth century). Chemical reactions
provide nowhere near enough energy to account for the Sun’s luminosity over
its long lifetime. It is not difficult
to calculate that the Sun would consume all of its mass in only several
thousand years if it derived its luminosity from chemical reactions, but the
Sun has been shining for roughly five billion years. Gravitational contraction is not the Sun’s
energy source either. Although
gravitational contraction does convert gravitational energy into heat and
light, it is not difficult to calculate that the Sun would need to collapse in
several million years to account for its incredible luminosity. Although this several-million-year lifespan
is an improvement over the several-thousand-year lifespan chemical reactions
could provide, it is nevertheless still nowhere near the Sun’s actual lifetime,
which is in the billions of years.
Gravitational contraction is also called
Kelvin-Helmholtz contraction, named for the British physicist William Thomson (Lord
Kelvin) and the German physicist Hermann von Helmholtz, the two physicists who
developed the mathematical details of gravitational contraction. Note that while the Sun was being born as a
collapsing cloud of gas from within a diffuse nebula, it did derive its energy
from Kelvin-Helmholtz (gravitational) contraction as we will discuss shortly,
and indeed the Sun collapsed in only several million years while it was being
born. However, the Sun eventually
attained gravitational equilibrium, meaning outward pressure balances inward
self-gravity. The Sun has been in
gravitational equilibrium for roughly five billion years, and so
Kelvin-Helmholtz (gravitational) contraction does not explain why the Sun has been
shining for most of its lifetime. The 1800s (the nineteenth century) ended with this fundamental
question unanswered. Why does the Sun
shine? At the beginning of the 1900s (the twentieth century), the atomic theory of matter
became firmly established. Moreover,
physicists discovered that atoms are composed of even smaller particles: the
nucleus at the center of the atom and electrons around the nucleus. Physicists discovered that chemical reactions
involve the electrons around the nucleus.
Physicists also discovered nuclear reactions, which involve the nuclei
themselves. These nuclear reactions can
generate thousands, even millions, of times more energy than chemical
reactions. Perhaps the Sun derives its
energy from nuclear reactions. Before we
explore this idea further, we must discuss some fundamental physics.
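The two lifetime estimates quoted above can be reproduced to order of magnitude with a short Python sketch. The inputs beyond the notes are assumptions: a chemical fuel releasing of order thirty million joules per kilogram (comparable to coal), and a Kelvin-Helmholtz estimate that treats the Sun as a uniform-density sphere that has radiated half of its gravitational binding energy. The function names are illustrative only.

```python
G = 6.674e-11                   # gravitational constant, N m^2 / kg^2 (assumed standard value)
YEAR_S = 365.25 * 24 * 3600     # seconds in one year

M_SUN_KG = 2e30                 # from the notes
R_SUN_M = 7e8                   # from the notes
L_SUN_W = 4e26                  # from the notes

# Assumption: a good chemical fuel releases of order 3e7 joules per kilogram.
CHEMICAL_J_PER_KG = 3e7


def chemical_lifetime_years():
    """Time to burn the Sun's entire mass chemically at its present luminosity."""
    return M_SUN_KG * CHEMICAL_J_PER_KG / L_SUN_W / YEAR_S


def contraction_lifetime_years():
    """Kelvin-Helmholtz estimate for a uniform-density sphere.

    Assumes (3/10) G M^2 / R has been radiated, which is half of the
    gravitational binding energy (3/5) G M^2 / R.
    """
    energy_j = 0.3 * G * M_SUN_KG**2 / R_SUN_M
    return energy_j / L_SUN_W / YEAR_S


t_chem = chemical_lifetime_years()
t_kh = contraction_lifetime_years()
print(f"Chemical burning:          {t_chem:.1e} years")
print(f"Gravitational contraction: {t_kh:.1e} years")
```

Under these assumptions the chemical estimate is several thousand years and the contraction estimate is several million years, both far short of the Sun’s age of roughly five billion years.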
There are four fundamental
forces in the universe. Starting with
the strongest force in the universe, we have the strong nuclear force, the
electromagnetic force, the weak nuclear force, and finally the gravitational
force is the weakest force in the universe.
Actually, the gravitational force is by far the
weakest force in the entire universe.
The gravitational force is many orders of magnitude weaker than the other
three forces. All of us have some familiarity
with gravity. As we discussed earlier in
the course, gravity causes everything in the universe to attract everything
else in the universe. All of us also
have some familiarity with the electromagnetic force. As we discussed earlier in the course, there
are both positive and negative electrical charges in our universe. Positive and positive repel, negative and
negative repel, and positive and negative attract. In other words, like charges repel, and
unlike charges attract. Protons are positively charged, while electrons are negatively
charged. Since unlike charges attract,
the positive protons within the atomic nucleus attract the negative electrons
around the atomic nucleus. This is what
holds the atom together, the attraction between the positive protons within the
nucleus and the negative electrons around the nucleus. However, what holds the nucleus of an atom
together? The atomic nucleus is composed
of protons and neutrons. The neutrons
are neutral; this is why they are called
neutrons! Since neutrons are neutral,
they are not attracted to or repelled from anything electromagnetically. More importantly, the protons are
positive. Hence, they repel each other
electromagnetically. If the neutrons
feel no electromagnetic attraction and if all the protons feel electromagnetic
repulsion from each other, then what is holding the atomic nucleus together? We deduce that there must be another force within the nucleus that is
stronger than the electromagnetic force and hence overpowers the electromagnetic
repulsion of the protons, thus holding the nucleus together. This force is the strongest force in the
entire universe, and it is called the strong nuclear
force. The strong nuclear force must be
stronger than the electromagnetic force, since the strong nuclear force must
overpower the electromagnetic repulsion among the protons to hold the atomic
nucleus together. The strong nuclear
force attracts protons and protons together, the strong nuclear force attracts
neutrons and neutrons together, and the strong nuclear force even attracts
protons and neutrons together. Protons
and neutrons are composed of even smaller particles called quarks and gluons,
and the strong nuclear force is also responsible for
holding quarks and gluons together to build protons and neutrons. More precisely, gluons are quantum-mechanical
particles that are ultimately responsible for the strong nuclear force, just as
an electromagnetic wave (light) is composed of photons and hence photons are
ultimately responsible for the electromagnetic force. If the strong nuclear force is so powerful,
why don’t all the protons and neutrons in the universe
attract each other to form one giant nucleus?
This does not occur because the strong nuclear force has a limited
range. The gravitational force does not
have a limited range. Regardless of how
close or how distant two objects are from one another, they will attract each
other gravitationally with a strength that depends upon their masses and the distance
between them. The electromagnetic force
also does not have a limited range.
Regardless of how close or how distant two objects are from one another, they will attract or repel each other
electromagnetically with a strength that depends upon their charges and the
distance between them. However, the
strong nuclear force does have a limited range.
Protons and neutrons will not feel the strong nuclear force if they are
further than a certain limited range. The range of the strong nuclear force is
roughly the size of the nucleus of an atom, which is a few quadrillionths of a
meter or a few trillionths of a millimeter or a few billionths of a
micrometer. Hence, the strong nuclear
force does overpower the electromagnetic force within the nucleus of an atom,
but the strong nuclear force essentially vanishes outside of the nucleus of an
atom. Hence, the limited range of the
strong nuclear force ensures that all the protons and
neutrons in the universe do not attract each other to form a giant
nucleus. The limited range of the strong
nuclear force only permits this force to hold quarks and gluons together within
protons and neutrons and to hold protons and neutrons together within the
nucleus of an atom. The weak nuclear
force is responsible for certain weak nuclear reactions, hence its name. It also has a limited range, like the strong
nuclear force.
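To put a number on how weak gravity is compared with the electromagnetic force, the following Python sketch compares the electric repulsion between two protons with their gravitational attraction. It assumes both forces fall off as the inverse square of the distance (so the separation cancels in the ratio) and uses standard values of the physical constants, none of which are quoted in these notes.

```python
# Assumed standard values (not quoted in these notes).
K_E = 8.988e9            # Coulomb constant, N m^2 / C^2
E_CHARGE = 1.602e-19     # charge of the proton, C
G = 6.674e-11            # gravitational constant, N m^2 / kg^2
M_PROTON = 1.673e-27     # mass of the proton, kg


def electric_to_gravity_ratio():
    """Electric repulsion divided by gravitational attraction for two protons.

    Both forces scale as 1 / r^2, so the separation r cancels out.
    """
    electric = K_E * E_CHARGE**2
    gravity = G * M_PROTON**2
    return electric / gravity


ratio = electric_to_gravity_ratio()
print(f"Electric / gravitational force between two protons: {ratio:.1e}")
```

The ratio is of order ten to the thirty-sixth power (an undecillion in the American naming scheme), which is why gravity plays no role inside the atomic nucleus.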
The incredible strength of
the strong nuclear force reveals why nuclear reactions generate so much more
energy than chemical reactions. We will
focus on two particular categories of nuclear reactions: nuclear fission
reactions and nuclear fusion reactions.
A nuclear fission reaction is the splitting of a larger, more massive
(or heavier) nucleus into smaller, less massive (or lighter) nuclei. In fact, to fission anything means to split
it in colloquial English. A nuclear
fusion reaction is the merging of two smaller, less massive (or lighter) nuclei
into a larger, more massive (or heavier) nucleus. In fact, to fuse things means to merge them
together in colloquial English. Per kilogram of fuel, nuclear
fission reactions release roughly millions of times more energy than chemical
reactions, and the nuclear fusion of hydrogen into helium releases several times more energy
than nuclear fission. To initiate a nuclear fission reaction, we
must use a particle as a projectile that will collide with a more massive
(heavier) nucleus; the collision causes the nucleus to split. We cannot use a proton as the projectile,
since protons are positively charged, and the target nucleus is itself
positively charged. Hence, the proton
and the target nucleus will repel each other electromagnetically. We cannot use an electron as the projectile
either. Although
electrons are negatively charged and would be attracted to the positively
charged nucleus that we are trying to split, an electron is almost two
thousand times less massive (lighter) than a proton or a neutron. Hence, the electron has too little mass to
split the target nucleus. If we wanted
to demolish a condemned building, firing a bullet at the building would be
fruitless. Regardless of how fast the
bullet may be moving, it has such little mass that it will not have sufficient
momentum to demolish the condemned building.
However, a wrecking ball is so massive that it carries sufficient
momentum to demolish the condemned building, even if the wrecking ball is not
moving particularly fast. If a proton is repelled by the target nucleus and if an electron has
insufficient mass and thus insufficient momentum to split the target nucleus,
the only particle left to try is a neutron.
Although neutrons are neutral and thus will not be attracted to the
target nucleus, they will not be repelled either. More importantly, the mass of a neutron is
comparable to the mass of a proton. In
fact, the mass of the neutron is a little bit more than the mass of a
proton. Therefore, a neutron need not be
moving particularly fast to carry sufficient momentum to split the target
nucleus. Examples of massive (heavy)
nuclei commonly used in nuclear fission reactions include uranium and
plutonium. A particular example of a
nuclear fission reaction is a neutron splitting a uranium nucleus into a
krypton nucleus, a barium nucleus, and three neutrons. This nuclear fission reaction is more
properly written ¹₀n + ²³⁵₉₂U → ⁹²₃₆Kr + ¹⁴¹₅₆Ba + 3 ¹₀n. Note that ¹₀n is the symbol of the neutron in nuclear
physics. Also
note that in nuclear physics, we use the same symbol for the nucleus of an atom
as a chemist would use for the entire atom.
For example, chemists use the symbol ¹⁴¹₅₆Ba for the barium-141 atom, but nuclear
physicists use the same symbol for the barium-141 nucleus. Note that this nuclear reaction releases
three neutrons, which can be used to split further
nuclei. The result is a chain
reaction. A chain reaction can be controlled, as is the case in nuclear power
plants. A chain reaction can also be
uncontrolled, as is the case in a nuclear fission bomb. In a nuclear power plant, control rods made of a neutron-absorbing material (such as boron or cadmium) are used to control the reaction rate. If the chain reaction is proceeding too
quickly, control rods are inserted further into the reactor
core; these control rods absorb some of the neutrons to reduce the splitting
of the nuclei, thus slowing down the reaction.
If the chain reaction is proceeding too slowly, control rods are partially withdrawn from the reactor core, leaving more
neutrons to split more nuclei, thus speeding up the reaction.
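The difference between a controlled and an uncontrolled chain reaction can be illustrated with a toy model in Python. The parameter k below is hypothetical: it stands for the average number of neutrons from each fission that go on to split another nucleus. It can be as large as three for the reaction described above, and absorbing neutrons lowers it. This is a sketch of the idea, not a model of a real reactor.

```python
def neutrons_per_generation(k, generations):
    """Return the neutron count in each generation, starting from one neutron.

    k is the (hypothetical) average number of neutrons per fission that
    cause a further fission. k > 1 grows, k = 1 is steady, k < 1 dies out.
    """
    counts = [1.0]
    for _ in range(generations):
        counts.append(counts[-1] * k)
    return counts


uncontrolled = neutrons_per_generation(3, 10)    # every released neutron splits a nucleus
steady = neutrons_per_generation(1, 10)          # two of every three neutrons are absorbed
damped = neutrons_per_generation(0.5, 10)        # too many neutrons absorbed; reaction fades

print("Uncontrolled:", uncontrolled[-1])
print("Steady:      ", steady[-1])
print("Damped:      ", damped[-1])
```

After only ten generations the uncontrolled case has grown to tens of thousands of neutrons, while the steady case still has one; holding k at exactly one is the goal in a power plant.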
A nuclear fusion reaction is
the merging of two less massive (lighter) nuclei into a more massive (heavier)
nucleus. However, all nuclei are
positively charged. Therefore, all
nuclei repel one another electromagnetically, which should prevent a fusion (a
merging) of nuclei from ever occurring.
As we discussed earlier in the course, temperature is a measure of the
average energy of individual particles.
In this course, we may assume that the average energy of individual
particles corresponds to their average speed.
In other words, particles move relatively faster at hotter temperatures,
while particles move relatively slower at cooler temperatures. Imagine incredibly hot temperatures when
nuclei are moving so fast that although they repel each other
electromagnetically as they approach each other, their tremendous energies at
these incredibly hot temperatures bring them within a few quadrillionths of a
meter of one another despite their mutual electromagnetic repulsion. It is within this range that the strong
nuclear force operates. Hence, the
strong nuclear force will overpower the electromagnetic repulsion, and the
nuclei will fuse together. Hydrogen is
the least massive (the lightest) atom in the entire universe, and helium is the
second least massive (the second lightest) atom in the entire universe. Hence, a particular example of a nuclear
fusion reaction is hydrogen nuclei fusing into a helium nucleus. The threshold temperature at which hydrogen
fuses into helium is several million kelvins.
This is incredibly hot by human standards, but this threshold
temperature would have been even hotter if it were not for Quantum Mechanics,
the correct theory of molecules, atoms, and subatomic particles. At the foundation of Quantum Mechanics is the
Heisenberg Uncertainty Principle, named for the German physicist Werner
Heisenberg who not only formulated this fundamental principle but was also one
of the physicists who formulated Quantum Mechanics itself. The Heisenberg Uncertainty Principle states that
it is impossible for a subatomic particle to have a definite position
(location) and a definite velocity (speed) at the same time. Because of this Heisenberg Uncertainty
Principle, there is a fair probability for subatomic particles to overcome
energy barriers even when they have insufficient energy to overcome the
barrier. This is
called quantum-mechanical tunneling.
At first glance, this quantum-mechanical tunneling seems like
unscientific nonsense, but quantum-mechanical tunneling has
been proven for many subatomic particles, including electrons for
example. In fact, modern electronic
devices such as mobile telephones and computers would not function correctly
without the quantum-mechanical tunneling of electrons. At several million kelvins of temperature,
most hydrogen nuclei are still not moving sufficiently fast to
quantum-mechanically tunnel through the electromagnetic repulsion between them,
but temperature is a statistical measure of average speed. In other words, at any given temperature,
whereas many particles move with a certain average speed, a small number of
particles move much slower than the average speed, and a small number of
particles move much faster than the average speed. At several million kelvins of temperature, a
small fraction of hydrogen nuclei do move sufficiently fast that they are able
to quantum-mechanically tunnel through the electromagnetic repulsion between
them, enabling the strong nuclear force to fuse them together. Humans have achieved uncontrolled nuclear fusion
reactions using nuclear fusion bombs.
Humans have not yet succeeded in controlled nuclear fusion
reactions. Since we do not yet have the
technology to control nuclear fusion reactions at these incredible temperatures,
all nuclear power plants use nuclear fission reactions, not nuclear fusion
reactions.
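A rough calculation shows why quantum-mechanical tunneling matters for fusion. The Python sketch below compares the electric potential energy of two protons brought within range of the strong nuclear force against a characteristic thermal energy at ten million kelvins. The separation of two quadrillionths of a meter, the temperature, and the use of the Boltzmann constant times the temperature as the characteristic thermal energy are all illustrative assumptions; the physical constants are standard values not quoted in these notes.

```python
# Assumed standard values (not quoted in these notes).
K_E = 8.988e9            # Coulomb constant, N m^2 / C^2
E_CHARGE = 1.602e-19     # charge of the proton, C
K_B = 1.381e-23          # Boltzmann constant, J / K

# Illustrative choices.
STRONG_RANGE_M = 2e-15   # "a few quadrillionths of a meter"
TEMPERATURE_K = 1e7      # of order ten million kelvins

# Electric potential energy of two protons at the edge of the strong force's range.
barrier_j = K_E * E_CHARGE**2 / STRONG_RANGE_M

# Characteristic thermal energy of a particle at the chosen temperature.
thermal_j = K_B * TEMPERATURE_K

# Temperature at which the characteristic thermal energy would equal the barrier.
classical_temperature_k = barrier_j / K_B

print(f"Barrier / thermal energy: {barrier_j / thermal_j:.0f}")
print(f"Temperature needed without tunneling: {classical_temperature_k:.1e} K")
```

Under these assumptions the barrier is hundreds of times larger than the typical thermal energy, and matching it without tunneling would take billions of kelvins rather than millions.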
The energy yield of both
nuclear fission bombs and nuclear fusion bombs is measured
in units of tons of trinitrotoluene (TNT), a chemical explosive. One ton of TNT has an explosive yield of
roughly four billion joules of energy.
One kiloton of TNT has an explosive yield of one thousand tons of TNT,
since the prefix kilo- always means thousand.
For example, there are one thousand meters in a kilometer, and there are
one thousand grams in a kilogram. Since
one kiloton of TNT has an explosive yield of one thousand tons of TNT and since
one ton of TNT has an explosive yield of roughly four billion joules of energy,
one kiloton of TNT therefore has an explosive yield of roughly four trillion
joules of energy. One megaton of TNT has
an explosive yield of one million tons of TNT, since the prefix mega- always
means million. Since one megaton of TNT
has an explosive yield of one million tons of TNT and since one ton of TNT has
an explosive yield of roughly four billion joules of energy, one megaton of TNT
therefore has an explosive yield of roughly four quadrillion joules of
energy. The typical yield of a nuclear
fission bomb is a few kilotons of TNT, and the typical yield of a nuclear
fusion bomb is a few megatons of TNT.
These incredible yields help us appreciate the extraordinary energies
released from nuclear reactions. We can
also appreciate the vast quantities of energy released from nuclear reactions
by discussing the activation energy required to detonate these nuclear
weapons. We require a powerful chemical
explosive to drive the uranium or plutonium together into a sufficiently dense mass
for the chain reaction to run away. Hence, the detonator of a nuclear fission
bomb is a chemical explosive, such as TNT.
We require a fission bomb to heat hydrogen to millions of kelvins of
temperature so that the hydrogen nuclei can move sufficiently fast to fuse into
helium nuclei. Hence, the detonator of a
nuclear fusion bomb is a nuclear fission bomb!
These activation energies also give us a comparative scale. Comparing a chemical explosion to a nuclear
fission explosion is rather like comparing a nuclear fission explosion to a
nuclear fusion explosion!
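The unit conversions above can be written as a short Python sketch. It uses the conventional definition of one ton of TNT as 4.184 billion joules (the notes round this to four billion) and, as an added illustration, expresses the Sun’s luminosity from these notes in megatons of TNT per second. The function name is illustrative.

```python
TON_TNT_J = 4.184e9     # conventional energy of one ton of TNT, joules
L_SUN_W = 4e26          # from the notes: joules radiated by the Sun every second


def tnt_to_joules(tons):
    """Convert an explosive yield in tons of TNT to joules."""
    return tons * TON_TNT_J


kiloton_j = tnt_to_joules(1e3)    # kilo- means thousand
megaton_j = tnt_to_joules(1e6)    # mega- means million
sun_megatons_per_second = L_SUN_W / megaton_j

print(f"One kiloton of TNT: {kiloton_j:.1e} J")
print(f"One megaton of TNT: {megaton_j:.1e} J")
print(f"Sun's output: {sun_megatons_per_second:.1e} megatons of TNT every second")
```

The last line gives a sense of scale: every second the Sun radiates the energy of roughly one hundred billion one-megaton bombs.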
As we discussed, the Sun is
roughly three-quarters hydrogen and roughly one-quarter helium. We now suspect that the Sun derives its
energy from the nuclear fusion of hydrogen into helium. Unfortunately, the surface temperature of the
Sun is only six thousand kelvins, as we discussed. This is nowhere near hot enough to fuse hydrogen
into helium. However, the interior of
the Sun is much hotter than six thousand kelvins. Theoretical calculations reveal that the core
of the Sun is at roughly fifteen million kelvins of temperature. Even at this incredibly hot
temperature, it is only a small fraction of hydrogen nuclei that move
sufficiently fast to quantum-mechanically tunnel through the electromagnetic
repulsion between them. However,
the Sun is also incredibly massive.
Although these incredibly hot temperatures are only
attained in the Sun’s core, the solar core is massive enough that a
small fraction of the enormous number of hydrogen nuclei that compose the solar
core is an appreciable number. In other
words, the Sun’s core is composed of such an incredible number of hydrogen
nuclei that a fair amount of nuclear fusion occurs, even though nuclear fusion
is somewhat improbable even at several million kelvins of temperature. In conclusion, the Sun shines because of the
nuclear fusion of hydrogen into helium in its core at roughly fifteen million
kelvins of temperature. Warning: most of
the Sun is not hot enough for any nuclear fusion to occur. Only the Sun’s core is hot enough to fuse
some hydrogen into helium. Therefore,
the solar core is slowly but progressively becoming less and less hydrogen and
more and more helium, while the rest of the Sun remains roughly three-quarters
hydrogen and roughly one-quarter helium.
In roughly five billion years, the solar core will exhaust its hydrogen,
becoming nearly entirely helium. At that
point, the Sun will begin to die, as we will discuss shortly. We emphasize again that the entire Sun will
never become pure helium. Most of the
Sun will remain roughly three-quarters hydrogen and roughly one-quarter helium,
since the nuclear fusion of hydrogen into helium only occurs in the solar core.
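The claim that a small fraction of an enormous number is still an appreciable number can be made concrete with a rough Python sketch. Two inputs are assumptions not stated in these notes: each helium-4 nucleus formed releases about 26.7 million electron volts, and four hydrogen nuclei are consumed per helium-4 nucleus. For simplicity the sketch counts hydrogen nuclei in the entire Sun rather than in the core alone.

```python
# From the notes.
L_SUN_W = 4e26              # luminosity of the Sun, watts
M_SUN_KG = 2e30             # mass of the Sun, kilograms
HYDROGEN_FRACTION = 0.75    # roughly three-quarters hydrogen by mass

# Assumed standard values (not quoted in these notes).
M_PROTON_KG = 1.673e-27     # mass of a hydrogen nucleus, kg
MEV_TO_J = 1.602e-13        # joules per million electron volts
ENERGY_PER_HELIUM_MEV = 26.7
PROTONS_PER_HELIUM = 4

energy_per_helium_j = ENERGY_PER_HELIUM_MEV * MEV_TO_J

# Helium nuclei that must form each second to supply the luminosity.
helium_per_second = L_SUN_W / energy_per_helium_j
protons_fused_per_second = PROTONS_PER_HELIUM * helium_per_second

# Total hydrogen nuclei in the whole Sun (an overestimate of the core supply).
hydrogen_nuclei = HYDROGEN_FRACTION * M_SUN_KG / M_PROTON_KG
fraction_per_second = protons_fused_per_second / hydrogen_nuclei

print(f"Hydrogen nuclei fused per second: {protons_fused_per_second:.1e}")
print(f"Hydrogen nuclei in the Sun:       {hydrogen_nuclei:.1e}")
print(f"Fraction fused each second:       {fraction_per_second:.1e}")
```

Only a minuscule fraction of the Sun’s hydrogen fuses in any given second, yet the absolute number of fusions is enormous, which is the point made above.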
The first step of the nuclear
fusion reactions occurring in the solar core is the fusion of two protons into
a deuteron. This reaction is more
properly written ¹₁H + ¹₁H → ²₁H + e⁺ + νₑ. Again, in nuclear physics we use the same
symbol for the nucleus of an atom as a chemist would use for the entire
atom. For example, chemists use the
symbol ¹₁H for the hydrogen-1 atom (the protium atom),
but nuclear physicists use the same symbol for the hydrogen-1 nucleus (simply a
proton). Also,
chemists use the symbol ²₁H for the hydrogen-2 atom (the deuterium atom),
but nuclear physicists use the same symbol for the hydrogen-2 nucleus (a deuteron). The symbol νₑ (the lowercase
Greek letter nu) represents a neutrino, a quantum-mechanical particle that we
will discuss shortly. The symbol e⁺ represents the
antielectron, commonly known as the positron.
For every particle in the universe, there is a corresponding antimatter
particle. A particle of antimatter has
the identical mass as its corresponding particle of ordinary matter, but the
antimatter particle has the opposite electric charge as the ordinary matter
particle. Other parameters are opposite
as well. We have discussed that the
proton is positively charged, but there is another particle with identical mass
as the proton called the antiproton, which is negatively
charged instead of positively charged.
We have discussed that the electron is negatively charged, but there is
another particle with identical mass as the electron called the antielectron,
which is positively charged instead of negatively
charged. This is why the antielectron is commonly known as the positron. Notice that the symbol of the antielectron
(the positron) is e⁺, since
we may regard this antimatter particle as a positive electron. Indeed, the symbol of the ordinary electron
is e⁻, since the ordinary
electron is negatively charged. We
emphasize that antimatter is not science fiction; antimatter is proven science
fact. Physicists have synthesized
antimatter particles for many decades.
Antiprotons and antineutrons compose antinuclei,
and antielectrons (positrons) can be attracted by
these antinuclei to form antiatoms. Antiatoms can even chemically bond with each
other to form antimolecules. Antimatter is extraordinarily rare in our
universe, but this is fortunate actually.
When a matter particle and its corresponding antimatter particle meet,
they completely annihilate each other, becoming pure energy. This is the complete conversion of matter
into energy. The overwhelming majority
of particles in the universe are ordinary matter particles; antimatter
particles are extraordinarily rare. All
the stars, planets, moons, asteroids, and comets in the entire universe are
composed of matter, not antimatter. In
particular, the Sun is composed of ordinary matter. Thus, when the antielectron (the positron) is generated in this first step of the nuclear fusion in the
solar core, the antielectron (positron) immediately annihilates with an
ordinary electron, generating energy.
The next step of the nuclear fusion reactions occurring in the solar
core is the fusion of a proton and a deuteron into a helium-3 nucleus. This reaction is more properly written ¹H + ²H → ³He. The third and final step of the nuclear
fusion reactions occurring in the solar core is the fusion of two helium-3
nuclei into a helium-4 nucleus (an alpha particle) plus two protons. This reaction is more properly written ³He + ³He → ⁴He + ¹H + ¹H.
The two protons produced by this final
step can then fuse, thus initiating the first step of this nuclear reaction
chain. Hence, the overall reaction of
all of these nuclear fusion reactions is called the
proton-proton cycle, since the fusion of two protons begins the reaction chain
and two protons are produced by the end of the reaction chain which can begin
the entire reaction chain over again.
This may lead us to suspect that this nuclear reaction chain
continues indefinitely, but this is false.
If we construct the overall reaction, we discover that four protons fuse
into a helium-4 nucleus (an alpha particle) plus energy plus two
neutrinos. This overall reaction is more
properly written 4 ¹H → ⁴He + energy + 2νe. Hence, hydrogen is being converted
into helium in the Sun’s core.
Therefore, the solar core is slowly but progressively becoming less and
less hydrogen and more and more helium.
Again, only the solar core is hot enough for these nuclear fusion
reactions to occur. Nuclear reactions do
not occur throughout most of the Sun.
Hence, most of the Sun remains three-quarters hydrogen and one-quarter
helium. There will never come a time
when the entire Sun is pure helium.
However, the solar core will become nearly entirely helium in roughly
five billion years. This will begin the
death of the Sun, which we will discuss shortly. Hence, this proton-proton cycle will not
continue indefinitely, since the solar core will eventually exhaust its supply
of hydrogen, thus ending this nuclear reaction chain. Note that the first step of this reaction
chain is governed by the weak nuclear force, which is
a slow force. This contributes to the
Sun’s long lifetime. Instead of
consuming all of the hydrogen in its core in a short amount of time, the
proton-proton cycle is slowed by the first step in the
nuclear reaction chain, stretching out the conversion of hydrogen into helium
in the solar core over a timescale of billions of years. The energy generated in the proton-proton
cycle is in the form of high-energy photons in the gamma-ray part of the
Electromagnetic Spectrum.
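The construction of the overall reaction described above can be checked with a little bookkeeping. The following Python sketch is only an illustration: the species labels (such as "H-1" and "nu_e") and the function name net_reaction are our own hypothetical choices, not standard notation. We assume that the first and second steps each occur twice, since the third step requires two helium-3 nuclei.

```python
from collections import Counter

# Each step is (reactants, products), written as counts of each species.
# The string labels are illustrative names chosen for this sketch.
STEP_1 = (Counter({"H-1": 2}), Counter({"H-2": 1, "e+": 1, "nu_e": 1}))
STEP_2 = (Counter({"H-1": 1, "H-2": 1}), Counter({"He-3": 1}))
STEP_3 = (Counter({"He-3": 2}), Counter({"He-4": 1, "H-1": 2}))


def net_reaction(steps):
    """Add up (multiplicity, step) pairs and cancel species found on both sides."""
    left, right = Counter(), Counter()
    for times, (reactants, products) in steps:
        for species, count in reactants.items():
            left[species] += times * count
        for species, count in products.items():
            right[species] += times * count
    # Counter subtraction keeps only positive counts, so any species that
    # appears equally on both sides (deuterons, helium-3) drops out.
    return dict(left - right), dict(right - left)


# Assumption: steps 1 and 2 run twice to supply the two helium-3 nuclei.
reactants, products = net_reaction([(2, STEP_1), (2, STEP_2), (1, STEP_3)])
print("net reactants:", reactants)
print("net products: ", products)
```

The deuterons and helium-3 nuclei cancel, leaving four protons on one side and one helium-4 nucleus, two positrons, and two neutrinos on the other. The two positrons annihilate with ordinary electrons, which is part of the energy in the overall reaction.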
Although hydrogen and helium
are gases at ordinary temperatures, the interior of the Sun is so hot that the
hydrogen and helium atoms are ionized.
The composition of the Sun is actually positively-charged
nuclei, negatively-charged electrons, and high-energy photons all colliding
with one another. This hot state of
matter is called a plasma. Therefore, the high-energy photons created by
the proton-proton cycle in the solar core cannot easily escape the Sun. They continuously collide with positive
nuclei and negative electrons.
Therefore, the trajectory (the path) of these photons is
randomized. Of course, these
photons do propagate in a straight line at the speed of light between
collisions, but their overall trajectory (path) is not a straight line; it is a
random trajectory (path) resulting from continuous collisions with nuclei and
electrons. This type of trajectory is called a random walk, since it is rather like the path a
pedestrian would take while aimlessly walking the streets of a city. Note therefore that light cannot travel
easily through the Sun. In other words,
the Sun is not transparent; the Sun is opaque.
The layer of the Sun around the core where the photons execute this
random walk is called the radiation zone. It takes somewhere between one hundred
thousand years and one million years for a typical photon to escape out of the
radiation zone. It would only take photons
roughly two seconds to travel from the core of the Sun to the surface of the
Sun if they could move in straight lines at the speed of light without
suffering any collisions. However,
photons take somewhere between one hundred thousand years and one million years
to travel out of the radiation zone, due to their random walks resulting from
their continuous collisions with nuclei and electrons. The next layer of the Sun around the
radiation zone is the convection zone, where energy is
transported much faster through rising masses of hotter plasma and
sinking masses of less hot plasma. These are convection cells similar to the convection cells in the
Earth’s asthenosphere that we discussed earlier in the course, although the
convection cells in the Sun’s convection zone are much, much hotter. The outermost layer of the Sun around the
convection zone is the photosphere, the actual surface of the Sun that we can
see. At the photosphere, energy leaves
the Sun in the form of electromagnetic waves (photons) from across the entire
Electromagnetic Spectrum. More
precisely, electromagnetic waves (photons) radiate from the photosphere with a
continuous blackbody spectrum, primarily in visible light (peaking in yellow
visible light) in accordance with the temperature of the photosphere (roughly
six thousand kelvins) as determined by the Wien displacement law. The photons that leave the photosphere travel
out into the surrounding outer space at the speed of light. Some of these photons spend roughly eight
minutes traveling to the Earth. Each
time we feel the warmth of sunlight upon us, we should reflect upon the journey
that sunlight traveled before finally arriving upon us. First, the energy was
created in the solar core through nuclear fusion reactions (the
proton-proton cycle). Then, the energy spent
between one hundred thousand years and one million years trying to escape from
the Sun’s radiation zone. Then, the
energy was transported faster by convection through
the Sun’s convection zone. Then, the
energy escaped the photosphere (the surface of the Sun), traveling through
outer space toward the Earth for roughly eight minutes before finally bathing
us with its warmth.
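The travel times quoted above can be estimated with a short calculation. The following Python sketch uses the rounded values from these notes for the solar radius, the Earth-Sun distance, and the speed of light. The random-walk estimate rests on two loud assumptions that are not from these notes: we treat the entire solar radius as the distance to be crossed, and we choose a mean free path of a tenth of a millimeter purely for illustration. The true distance between collisions varies enormously with depth inside the Sun.

```python
C_KM_PER_S = 3.0e5         # speed of light, roughly 300,000 km/s
SOLAR_RADIUS_KM = 7.0e5    # one solar radius, roughly 700,000 km
AU_KM = 1.5e8              # Earth-Sun distance, roughly 150 million km
SECONDS_PER_YEAR = 3.156e7

# Assumption for illustration only: 0.1 millimeter between collisions.
ASSUMED_MEAN_FREE_PATH_KM = 1.0e-7


def straight_line_time_s(distance_km):
    """Time for light to cover a distance with no collisions at all."""
    return distance_km / C_KM_PER_S


def random_walk_time_years(radius_km, mean_free_path_km):
    """Rough escape time for a photon executing a random walk.

    A random walk of N steps of length l ends a net distance of about
    l * sqrt(N) from its start, so escaping a radius R takes about
    N = (R / l)**2 steps, each lasting l / c.
    """
    steps = (radius_km / mean_free_path_km) ** 2
    return steps * mean_free_path_km / C_KM_PER_S / SECONDS_PER_YEAR


print("core to surface, straight line (s):", straight_line_time_s(SOLAR_RADIUS_KM))
print("Sun to Earth (min):", straight_line_time_s(AU_KM) / 60)
print("random walk (years):",
      random_walk_time_years(SOLAR_RADIUS_KM, ASSUMED_MEAN_FREE_PATH_KM))
```

With these assumed numbers, the straight-line time is a couple of seconds, the trip to the Earth is a little over eight minutes, and the random walk lasts hundreds of thousands of years. The random-walk result scales inversely with the assumed mean free path, so it should be read as an order-of-magnitude estimate only.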
Our understanding of the
interior of the Sun comes from theoretical calculations together with computer
simulations. The results of this
theoretical work can be tested through the observation
of vibrations on the photosphere. The
study of these vibrations is called helioseismology,
since we may regard these vibrations as sunquakes. It is remarkable that our understanding of
the interior of the Sun is tested through measuring sunquakes, just as our
understanding of the interior of the Earth is tested through measuring
earthquakes and our understanding of the interior of the Earth’s Moon is tested
through measuring moonquakes, as we discussed earlier in the course. Our understanding of the interior of the Sun is also confirmed through the actual appearance of the
photosphere. The surface of the Sun does
not look smooth; the surface of the Sun looks grainy or sandy. This grainy or sandy appearance of the
photosphere is called granulation. The photosphere is composed
of both brighter granules and less bright granules. These granules on the photosphere reveal the
convection in the convection zone beneath the photosphere. Rising masses of hotter plasma manifest
themselves as brighter granules on the photosphere, while sinking masses of
less hot plasma manifest themselves as less bright granules on the photosphere.
The Sun creates a powerful
magnetic field. As we discussed earlier
in the course, the Earth’s magnetic field is generated by its rotation together
with circulating currents of molten metal in its outer core, and the magnetic
field of a jovian, gas-giant (outer) planet is
generated by its rotation together with circulating currents of electrically
conducting hydrogen (metallic hydrogen) in deeper layers of the planet. Similarly, the Sun’s magnetic field is generated by its rotation together with convection cells
of circulating hot plasma in its convection zone. However, the dynamics of the Sun’s magnetic
field is complicated by the Sun’s differential
rotation. We use the term rigid body
rotation when every part of an object rotates together at the same rate, while
we use the term differential rotation when different parts of an object rotate
at different rates. Fluids suffer from
differential rotation. For example, the jovian, gas-giant (outer) planets
suffer from differential rotation, since their outer layers are composed
primarily of hydrogen gas and helium gas.
Solids suffer from rigid body rotation.
For example, the terrestrial (inner) planets suffer from rigid body
rotation, since they are composed primarily of metal and rock. Caution: this is actually an
oversimplification. As we discussed
earlier in the course, different parts of the Earth actually rotate at
different rates. Nevertheless, as
compared with the jovian,
gas-giant (outer) planets, we may regard the Earth and all the terrestrial
(inner) planets as suffering from rigid body rotation. The Sun is not a solid object. The Sun is a hot plasma, which is a type of
fluid. Therefore, the Sun suffers from
differential rotation. On average, the
Sun rotates roughly once per month, but in actuality different parts of the Sun
rotate at different rates. This
differential rotation drags and stretches the Sun’s magnetic field lines. As magnetic field lines are
stretched, they increase in tension, just as strings or elastic bands
increase in tension when stretched.
Eventually, magnetic field lines may break from too much tension, again
just as strings or elastic bands may break from too much tension. When the Sun’s magnetic field lines break,
they reconnect with complex patterns.
After a magnetic break, a magnetic reconnection often causes magnetic
field lines to anchor themselves at two places on the photosphere (the surface
of the Sun). The magnetic field lines
point out of the photosphere at one anchor, bend above the photosphere, and
point back into the photosphere at the other anchor. Wherever they anchor themselves on the
photosphere will be regions of very strong magnetic fields that block
convection in the convection zone beneath these anchors, causing these regions
of the photosphere to be less hot than the rest of the photosphere. These less hot regions with strong magnetic
fields are called sunspots, since they appear black as
compared with the rest of the surface of the Sun (the photosphere). The temperatures of these sunspots are still
in the thousands of kelvins however; sunspots are simply not as hot as the rest
of the photosphere at roughly six thousand kelvins. If the temperatures of sunspots are still in
the thousands of kelvins, then these sunspots are hot enough to radiate visible
light. Indeed, these sunspots are
actually quite luminous; sunspots only appear black because we are comparing
them with the rest of the surface of the Sun.
Since broken and then reconnected magnetic field lines often anchor
themselves at two places on the photosphere, sunspots often occur in
pairs. One sunspot will have an
outwardly directed magnetic field, while the other sunspot will have an
inwardly directed magnetic field. Plasma
eruptions on the photosphere often follow the Sun’s magnetic field lines. As such, a plasma eruption often forms an arch
anchored at a pair of sunspots. This
arched plasma eruption is called a solar prominence. If a tremendous amount of tension in the
Sun’s magnetic field lines is finally liberated
through a magnetic break followed by a magnetic reconnection, a violent plasma
eruption will burst outward from the photosphere; this plasma eruption is
called a solar flare. These solar flares
travel outward from the Sun, and hence some of these solar flares travel toward
the direction of the Earth. Fortunately,
the Earth’s magnetic field shields us from most solar activities such as solar
flares. However, our artificial
satellites in orbit around the Earth are not well protected
from solar activity. Our artificial
satellites are continuously bombarded, damaged, and even on
occasion completely destroyed by solar activities such as solar flares.
Astronomers have directly
observed for roughly four hundred years (since the invention of the telescope)
that the number of sunspots goes through a roughly eleven-year cycle. In one complete cycle, the number of sunspots
increases then decreases over a time period of roughly
eleven years. Furthermore, measurements
of the radioactive isotope carbon-fourteen within trees have revealed that this roughly
eleven-year solar cycle itself goes through a roughly
two-hundred-year cycle. This is the de Vries cycle, named for the Dutch physicist Hessel de Vries, one of the pioneers of radiocarbon dating. According to the de Vries
cycle, the Sun gradually increases in activity to what is called
a solar maximum then gradually decreases in activity to what is called a solar
minimum. Caution: the eleven-year solar
cycles continue to occur throughout each two-century de Vries
cycle. Since one complete de Vries cycle lasts for roughly two centuries, each solar
maximum and each solar minimum lasts for roughly one hundred years. Over the past twelve thousand years (since
the beginning of the current interglacial period of the Current Ice Age), there
have been roughly sixty complete de Vries cycles,
with each de Vries cycle having one solar maximum and
one solar minimum. The Modern Maximum
occurred throughout most of the twentieth century, and the Modern Minimum began
toward the beginning of the twenty-first century (the current century). The roughly eleven-year sunspot cycle and the
roughly two-century de Vries sunspot cycle both
strongly determine variations in global temperatures on planet Earth, as we
discussed earlier in the course. In
particular, the Modern Maximum that occurred throughout most of the twentieth
century contributed to the warming temperatures of that century, and the Modern
Minimum that began toward the beginning of the twenty-first century (the
current century) has already caused cooling temperatures that will continue for
the rest of the current century.
The Sun’s atmosphere is
composed primarily of hydrogen and helium.
As we leave the photosphere (the surface of the Sun)
and climb the solar atmosphere and ultimately travel into the
surrounding outer space, we expect the temperature to become cooler and cooler,
but this is not the case. As we leave
the photosphere, the temperature actually becomes hotter. The lower layer of the solar atmosphere is
the chromosphere. The temperature
approaches roughly one hundred thousand kelvins as we climb the
chromosphere. Because of these hot
temperatures, the chromosphere radiates primarily ultraviolet light, in
accordance with the Wien displacement law.
The upper layer of the solar atmosphere is the corona, the main part of
the Sun’s atmosphere. The solar corona
is even hotter, roughly one million kelvins in temperature. Because of these even hotter temperatures,
the solar corona radiates primarily X-rays, again in accordance with the Wien
displacement law. It is only when we climb
beyond the solar corona and travel into the surrounding outer space that the
temperature finally cools. We do not
understand why the solar atmosphere is so hot.
Perhaps the Sun’s atmosphere is heated by
prominences, flares, and other solar activities from the photosphere. Although this sounds reasonable, this theory
is nevertheless not well developed.
Since the solar atmosphere is so hot, its composition is primarily not
hydrogen gas and helium gas but primarily ionized hydrogen (protons and
electrons) and ionized helium (alpha particles and electrons). Moreover, the hot temperatures of the solar
atmosphere cause many of these particles to move sufficiently fast that they
can escape from the Sun’s gravitational attraction. The result is the solar wind, a stream of
charged particles radiating outward from the Sun composed primarily of protons
(hydrogen nuclei), electrons, and alpha particles (helium nuclei). This solar wind is capable of completely
ionizing the Earth’s atmosphere in a fairly short amount
of time. Fortunately, the Earth’s
magnetic field is sufficiently strong to deflect most of the Sun’s solar
wind. Some of the charged particles in
the solar wind do however become trapped within the
Earth’s magnetic field. These charged
particles execute helical trajectories around the Earth’s magnetic field
lines. These regions of the Earth’s
magnetic field are called the Van Allen belts, named
for the American physicist James Van Allen who discovered them. The charged particles within the Van Allen
belts may create an aurora, either aurora borealis (or more commonly the
northern lights) near the Earth’s north magnetic pole or aurora australis (or more commonly the southern lights) near the
Earth’s south magnetic pole, as we discussed earlier in the course. If the Sun happens to be less active, its
solar wind would be weaker, the resulting aurorae would appear less
spectacular, and we would only be able to enjoy them near the Earth’s magnetic
poles. If the Sun happens to be more
active, its solar wind would be stronger, the resulting aurorae would appear
more spectacular, and we would be able to enjoy them further from the Earth’s
magnetic poles.
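The Wien displacement law used above states that the peak wavelength of a blackbody spectrum equals a constant divided by the temperature. The following Python sketch applies it to the rough temperatures quoted in these notes for the photosphere, the chromosphere, and the corona. The function names are hypothetical, and the band boundaries (10, 400, and 700 nanometers) are approximate conventional values that we assume for illustration; different references draw these boundaries slightly differently.

```python
WIEN_CONSTANT_NM_K = 2.898e6  # Wien displacement constant in nanometer-kelvins


def peak_wavelength_nm(temperature_k):
    """Peak wavelength of a blackbody spectrum, from the Wien displacement law."""
    return WIEN_CONSTANT_NM_K / temperature_k


def band(wavelength_nm):
    """Classify a wavelength using approximate, assumed band boundaries.

    Gamma rays are not separated from X-rays here, since none of the
    temperatures in this example reach that part of the spectrum.
    """
    if wavelength_nm < 10:
        return "X-ray"
    if wavelength_nm < 400:
        return "ultraviolet"
    if wavelength_nm < 700:
        return "visible"
    return "infrared or longer"


LAYERS = {
    "photosphere": 6.0e3,    # roughly six thousand kelvins
    "chromosphere": 1.0e5,   # roughly one hundred thousand kelvins
    "corona": 1.0e6,         # roughly one million kelvins
}

for name, temperature_k in LAYERS.items():
    wavelength = peak_wavelength_nm(temperature_k)
    print(name, round(wavelength, 1), "nm", band(wavelength))
```

The hotter the layer, the shorter the peak wavelength, which is why the photosphere peaks in visible light while the chromosphere and the corona peak in the ultraviolet and X-ray bands respectively.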
Neutrinos are extremely
weakly interacting quantum-mechanical particles. Neutrinos do not participate in the strong
nuclear force for example. Neutrinos
also refuse to participate in the electromagnetic force, since they are
electrically neutral. This is why they are called neutrinos!
Of course, everything in the universe feels gravity, but the mass of a
neutrino is such a tiny number that physicists have not yet succeeded in even
measuring its value. Since the mass of a
neutrino is so extraordinarily tiny, neutrinos do not noticeably feel gravity. Therefore, for all practical purposes
neutrinos do not participate in the gravitational force. Whereas the photons that are
created in the solar core spend between one hundred thousand years and
one million years trying to escape from within the Sun as we discussed,
neutrinos are so weakly interacting that they immediately escape from within
the Sun after being created in the solar core.
Since neutrinos propagate almost at the speed of light, the neutrinos
created by the proton-proton cycle in the Sun’s core travel in straight lines
from the solar core to the photosphere in roughly two seconds. The neutrinos continue to travel outward from
the Sun, through its atmosphere and then into the surrounding outer space. Some of these neutrinos spend roughly eight
minutes traveling to the Earth.
Neutrinos are so weakly interacting that when these neutrinos arrive at
the Earth, they simply pass through the Earth.
Billions and billions of neutrinos from the Sun pass through our bodies
every second of every day! Neutrinos are
so weakly interacting that they do virtually nothing with the atoms that
compose our bodies. This is not just the
case during daytime when we are on the side of the Earth facing toward the
Sun. This is also true during nighttime
when we are on the side of the Earth facing away from the Sun. In this case, these solar neutrinos arrive at
the Earth, pass straight through the Earth, and pass straight through our bodies
on the nighttime side of the Earth.
Every second of every day of our lives, billions and billions of solar
neutrinos continuously pass through our bodies!
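We can roughly estimate how many solar neutrinos arrive at the Earth. The following Python sketch assumes that all of the Sun's luminosity comes from the proton-proton cycle and that each helium-4 nucleus formed releases two neutrinos, as in the overall reaction above. The numerical values for the solar luminosity, the energy released per helium-4 nucleus, and the astronomical unit are standard figures that do not appear in these notes, and the small amount of energy carried away by the neutrinos themselves is ignored. This is an estimate under stated assumptions, not a precise prediction.

```python
import math

SOLAR_LUMINOSITY_W = 3.828e26    # assumed standard value, in watts
ENERGY_PER_HELIUM_MEV = 26.73    # assumed energy released per helium-4 formed
JOULES_PER_MEV = 1.602e-13
NEUTRINOS_PER_HELIUM = 2         # from the overall reaction
AU_M = 1.496e11                  # one astronomical unit in meters


def solar_neutrino_flux_per_cm2():
    """Estimated solar neutrinos crossing one square centimeter per second at Earth."""
    helium_per_second = SOLAR_LUMINOSITY_W / (ENERGY_PER_HELIUM_MEV * JOULES_PER_MEV)
    neutrinos_per_second = NEUTRINOS_PER_HELIUM * helium_per_second
    # The neutrinos spread over a sphere whose radius is the Earth-Sun distance.
    sphere_area_cm2 = 4 * math.pi * (AU_M * 100) ** 2
    return neutrinos_per_second / sphere_area_cm2


flux = solar_neutrino_flux_per_cm2()
print(f"about {flux:.1e} neutrinos per square centimeter per second")
```

The result is tens of billions of neutrinos through every square centimeter every second, consistent with the statement that billions and billions of solar neutrinos pass through our bodies every second.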
If we could detect these
solar neutrinos, this would provide nearly real-time information about the
solar core. The light we collect from
the Sun may have taken roughly eight minutes to travel from the photosphere to
the Earth, but those photons were actually created in
the solar core at least one hundred thousand years ago and even up to one
million years ago. If we only rely upon
the light from the Sun to understand the interior of the Sun, then our
knowledge about the solar core is actually up to one million years out of
date. Of course, one million years is
actually rather recent as compared to the Sun’s age of roughly five billion
years. Nevertheless, it would be
exciting to have information about the solar core that is only eight minutes
old. Unfortunately, neutrinos are so
weakly interacting that detecting them is virtually impossible. Although neutrinos do not participate in the
gravitational force (practically speaking) or the strong
nuclear force or the electromagnetic force, neutrinos do on occasion
participate in the weak nuclear force.
As we discussed, the first step of the proton-proton cycle is governed by the weak nuclear force, and note above that
that nuclear reaction involves a neutrino.
Several decades ago, physicists built neutrino detectors using the
principles of neutrinos participating in the weak nuclear force. Nevertheless, neutrinos are so weakly
interacting that even though billions and billions of solar neutrinos pass
through these detectors every second of every day, a neutrino detector only
detects one neutrino per day! Working at
a neutrino detector is the most boring job in the world. On one day, we see a single blip on a
screen. The following day, we see
another single blip. The day after that,
we see one single blip yet again.
Boring! This is also frustrating,
since we know that billions and billions of neutrinos are actually passing
through the detector every second of every day, but we only detect one neutrino
per day! Over several decades,
physicists have only detected one-third of the number of neutrinos that
theoretical calculations predict that we should be detecting from the Sun. This is called the
solar neutrino problem. There have been
many theories proposed over the decades to resolve the solar neutrino
problem. One such idea is the theory of
neutrino oscillations. There are three
different flavors (or varieties or types) of neutrinos. According to the theory of neutrino
oscillations, there is a certain probability that a neutrino can spontaneously
change its flavor from one type to another type. Only one type of neutrino is
created by the proton-proton cycle in the solar core. According to the theory of neutrino
oscillations, some of these neutrinos spontaneously change their flavor during
their roughly eight-minute journey from the Sun to the Earth. Perhaps we have only been detecting one-third
of the number of neutrinos we should be detecting because our neutrino
detectors can only detect one flavor of neutrino instead of all three flavors
of neutrinos. This theory of neutrino
oscillations was attacked and ridiculed by some physicists for decades until it
was proven to be the correct theory to resolve the
solar neutrino problem. Several years
ago, physicists finally built neutrino detectors that could detect all three
flavors of neutrinos. Not only have we
detected all three flavors of neutrinos from the Sun, but
totaling all three detected flavors has finally yielded results consistent with
theoretical calculations. Hence, the
resolution of the solar neutrino problem is indeed the theory of neutrino
oscillations.
Stellar Properties
Other stars besides the Sun
are at least two hundred thousand times further from the Earth as compared with
the Sun. Therefore, we know much less
about other stars as compared with our Sun.
Nevertheless, we will attempt to determine the properties of other stars
by applying the same procedures we applied to our Sun. Firstly, from the absorption spectral lines
within a star’s light, we can determine the composition of the star. We discover that all stars are composed of
all the atoms on the Periodic Table of Elements, but not in equal amounts. Only two atoms account for close to one
hundred percent of the mass of all stars; all the other atoms on the Periodic
Table of Elements account for only a tiny fraction (tiny percentage) of the
mass of stars. All stars are composed of
roughly seventy-five percent (three-quarters) hydrogen and roughly twenty-five
percent (one-quarter) helium. Again, all
the other atoms on the Periodic Table of Elements make up a tiny fraction (tiny
percentage) of the mass of stars.
To determine the distance to
stars, we measure their parallax. As we
discussed earlier in the course, parallax is the apparent motion of an object,
not because it is moving but because the observer is in fact moving. The motion of the Earth around the Sun causes
the stars to appear to shift their positions in the sky by tiny amounts. By measuring the angle of this shift, we can
determine the distance to the star. As
we discussed earlier in the course, the orbit of the Earth around the Sun is an
ellipse with a semi-major axis equal to one astronomical unit (1 au), roughly
equal to one hundred and fifty million kilometers. We also discussed earlier in the course that
the eccentricity of the Earth’s orbit around the Sun is so close to zero that
its orbit is nearly a circle, and so we may regard one astronomical unit as the
radius of the Earth’s roughly circular orbit around the Sun. More plainly, we may regard one astronomical
unit as the distance between the Earth and the Sun. Astronomers define the parallax angle as the
apparent angular shift of a star over a baseline of the Earth’s orbital radius. Further distances result in smaller parallax
angles. Even the nearest stars (besides
the Sun) are so distant that their parallax shifts are much smaller than even a
one-degree angle. A one-degree angle is
already small, since one degree is one full circle divided into three hundred
and sixty equal parts. The parallax
shifts of even the nearest stars (besides the Sun) are much smaller than even
one degree! One sixtieth of a degree is written 1′ and is called one arcminute or one
minute of arc. Notice that minutes of
arc are indicated with a single prime. Caution: the single prime is
also used for feet of length in the United States. One sixtieth of one arcminute is written 1″ and is called one arcsecond
or one second of arc. Notice that
seconds of arc are indicated with a double prime. Caution: the double prime is
also used for inches of length in the United States. Since sixty multiplied by sixty is 3600, this
means that one arcsecond is one degree divided into
3600 equal angles. A one-degree angle is
already small, but now imagine dividing that small angle into 3600 equal
angles! The nearest stars (besides the
Sun) suffer parallax shifts even smaller than one arcsecond! Since stars must be incredibly distant to
suffer such tiny parallax shifts, astronomers have defined a new unit of
distance to measure distances to stars.
The distance at which a star would appear to suffer a parallax of 1″ (one arcsecond or one
second of arc) is called a parsec, abbreviated pc. The word parsec is derived from the three
words parallax, arc, and second. It is
not difficult to calculate that one parsec of distance is slightly more than
two hundred thousand astronomical units.
If we multiply two hundred thousand astronomical units by roughly one
hundred and fifty million kilometers for each astronomical unit, we deduce that
one parsec is roughly thirty-one trillion kilometers! This is an incredible distance, and the
nearest stars (besides the Sun) are further than even this! One parsec is also equal to 3.26 light-years,
where one light-year is the distance that light travels in a time of one year,
as we discussed toward the beginning of the course. If a star suffers a parallax of 1″ (one arcsecond or one second of
arc), then it is 1 pc (one parsec) distant, by the definition of the
parsec. If a star suffers an even
smaller parallax (as all stars besides the Sun do), then the star is at a
proportionally further distance. For
example, if a star suffers a parallax of one-half of one arcsecond,
then it is two parsecs distant. If a
star suffers a parallax of one-tenth of one arcsecond,
then it is ten parsecs distant. We can
also invert this argument and predict the parallax from the distance. For example, if a star is twenty parsecs
distant, then it must suffer a parallax of one-twentieth of one arcsecond. If a star
is fifty parsecs distant, then it must suffer a parallax of one-fiftieth of one
arcsecond.
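The parallax arithmetic above can be summarized in a few lines. The following Python sketch computes the number of astronomical units in one parsec from the definition (the distance at which a baseline of one astronomical unit subtends one arcsecond) and then applies the inverse relationship between parallax and distance. The function names are hypothetical, and we use the rounded value of one hundred and fifty million kilometers per astronomical unit from these notes.

```python
import math

ARCSECONDS_PER_DEGREE = 60 * 60   # sixty arcminutes of sixty arcseconds each
AU_KM = 1.5e8                     # roughly 150 million kilometers


def parsec_in_au():
    """Distance, in astronomical units, at which 1 au subtends one arcsecond."""
    one_arcsecond_rad = math.radians(1 / ARCSECONDS_PER_DEGREE)
    return 1 / math.tan(one_arcsecond_rad)


def distance_pc(parallax_arcsec):
    """Distance in parsecs from a parallax in arcseconds."""
    if parallax_arcsec <= 0:
        raise ValueError("parallax must be positive")
    return 1 / parallax_arcsec


def parallax_arcsec(distance_in_pc):
    """Parallax in arcseconds from a distance in parsecs."""
    if distance_in_pc <= 0:
        raise ValueError("distance must be positive")
    return 1 / distance_in_pc


print("one parsec in au:", round(parsec_in_au()))
print("one parsec in km:", parsec_in_au() * AU_KM)
print("parallax 0.5 arcsec ->", distance_pc(0.5), "pc")
print("distance 20 pc ->", parallax_arcsec(20), "arcsec")
```

The first result is slightly more than two hundred thousand astronomical units, and multiplying by the rounded astronomical unit gives roughly thirty-one trillion kilometers, in agreement with the values quoted above.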
The Cosmological Distance
Ladder is a list of methods to determine distances to astronomical
objects. Any given method can only be used over a certain range of distances, and so
we must use other methods for further distances. That new method can only be
used over its own range of further distances, and so we must use yet
another method for even further distances, and so on and so forth. The parallax method of determining distances
is the lowest rung of the Cosmological Distance Ladder, since parallax angles
are so tiny that we can only measure them for nearby stars within the so-called
solar neighborhood. Beyond distances of
a couple thousand parsecs, parallax angles become too tiny to measure even with
modern telescopes. Therefore, we cannot
measure the parallax for most of the stars of our Milky Way Galaxy, and
measuring parallaxes beyond our Milky Way Galaxy is hopeless. We will spend the rest of this course adding
higher and higher rungs to the Cosmological Distance Ladder until we have a list
of methods that will enable us to determine distances from nearby stars in the
solar neighborhood all the way to the edge of the observable universe. Nearby stars are within the so-called solar
neighborhood, nearby galaxies slightly beyond our Milky Way Galaxy are within
the so-called galactic neighborhood, and the edge of the observable universe is
called the cosmic horizon. Although we
cannot use parallax to determine distances beyond the solar neighborhood,
astrophysicists nevertheless continue to use the parsec as the unit of distance
even for astronomical objects whose distances are determined using non-parallax
methods. One thousand parsecs is called
one kiloparsec (abbreviated kpc),
since the prefix kilo- always means thousand. For example, there are one thousand meters in
one kilometer, and there are one thousand grams in one kilogram. One million parsecs is called one megaparsec (abbreviated Mpc),
since the prefix mega- always means million. One billion parsecs is called one gigaparsec (abbreviated Gpc),
since the prefix giga-
always means billion. Theoretically, one
trillion parsecs would be called one teraparsec
(abbreviated Tpc), since the prefix
tera- always means trillion. However, the entire observable universe is
only a few gigaparsecs across. As we will discuss toward the end of this
course, the universe is expanding, and hence the observable universe is
continuously growing in size. In many
billions of years, the observable universe will eventually expand to become teraparsecs in size.
However, the observable universe is presently only a few gigaparsecs across.
Therefore, the teraparsec is not yet a
physically meaningful unit of distance.
Caution: current cosmological models suggest that the entire universe
beyond the observable universe is actually infinite in size. It is the observable
universe that is only a few gigaparsecs
across, not the entire universe. We will
make clear the distinction between the observable universe and the entire
universe toward the end of the course.
Until we discuss higher rungs
of the Cosmological Distance Ladder, for now our discussion can only focus upon
the parallax method to measure distances to stars within a couple thousand
parsecs (within the solar neighborhood).
Nevertheless, there are still millions of stars within this
distance. Therefore, we can discuss the
determination of the luminosities of these nearby stars within the solar
neighborhood. As we discussed, we can
calculate the luminosity of any object from the intensity of its light I and its distance from us r using the equation I = ℒ / (4πr²), where ℒ is the luminosity of the object. The intensity of a star’s light is often expressed as an apparent magnitude, while the
luminosity of the star is often expressed as an absolute magnitude. This magnitude scale was
formulated by the ancient Greek mathematician and astronomer Hipparchus of
Nicaea. Hipparchus called the
brightest stars we can see in the night sky first-magnitude stars. Bright stars that were not as bright as
first-magnitude stars were called second-magnitude
stars. Stars of intermediate brightness
in the night sky were called third-magnitude
and fourth-magnitude stars. Dim stars were
called fifth-magnitude stars, and the dimmest stars visible to the
human eye were called sixth-magnitude stars.
This magnitude scale is rather illogical, since dimmer stars are
assigned higher magnitude numbers, while brighter stars are assigned lower
magnitude numbers. Nevertheless, modern
astrophysicists not only continue to use this magnitude scale but
have even quantified it. Firstly, there are decimal magnitudes. For example, a 4.3-magnitude star is brighter
than a 4.7-magnitude star. As another
example, a 2.5-magnitude star is dimmer than a 2.1-magnitude star. Secondly, the invention of the telescope
enables us to observe stars much dimmer than even the dimmest stars that the
naked eye is able to see. A
seventh-magnitude star is even dimmer than a sixth-magnitude star, and an
eighth-magnitude star is dimmer still.
The Hubble Space Telescope has imaged stars all the way down to roughly
thirtieth-magnitude! Thirdly, the
magnitude scale is also quantified in the other
direction. A zeroth-magnitude star is
brighter than a first-magnitude star. A
star with magnitude negative-one is even brighter than a zeroth-magnitude star,
and a star with magnitude negative-two is brighter still. Our Sun has a magnitude of roughly
negative-twenty-seven! More precisely,
the modern quantified magnitude scale is a logarithmic scale. In particular, every unit on the magnitude
scale corresponds to a factor of roughly 2.5 in brightness. For example, a sixth-magnitude star is
roughly 2.5 times brighter than a seventh-magnitude star. A fifth-magnitude star is roughly 2.5 times
brighter than a sixth-magnitude star, which makes a fifth-magnitude star
roughly 6.25 times brighter than a seventh-magnitude star (since 2.5 times 2.5
is 6.25). A fourth-magnitude star is
roughly 2.5 times brighter than a fifth-magnitude star, which makes a
fourth-magnitude star roughly 6.25 times brighter than a sixth-magnitude star,
which makes a fourth-magnitude star roughly 15.625 times brighter than a
seventh-magnitude star (since 2.5 times 2.5 times 2.5 is 15.625). In brief, one magnitude of separation is
roughly a factor of 2.5 in brightness, two magnitudes of separation is roughly
a factor of 6.25 in brightness, and three magnitudes of separation is roughly a
factor of 15.625 in brightness. Four
magnitudes of separation is nearly a factor of 40 in brightness, and five
magnitudes of separation is nearly a factor of 100 in brightness! (In the modern definition, five magnitudes of separation is exactly a factor of 100, which makes one magnitude a factor of about 2.512.) This reveals that lower magnitude stars are
much brighter than higher magnitude stars, since we must multiply by a string
of factors to calculate their relative brightnesses. Stated the other way around, higher magnitude
stars are much dimmer than lower magnitude stars, since we must divide by a
string of factors to calculate their relative brightnesses. The apparent magnitude of a star is how
bright the star appears, depending upon its distance. The absolute magnitude of a star expresses
its luminosity or its intrinsic brightness.
More precisely, astronomers define the absolute magnitude of a star as
the apparent magnitude the star would have if it were ten parsecs distant. It is easy to prove that this precise
definition of absolute magnitude relates directly to luminosity or intrinsic brightness. Therefore, we will casually regard all three
of these variables (luminosity, absolute magnitude, and intrinsic brightness)
as essentially the same quantity. If a
star has a relatively constant luminosity or intrinsic brightness as most stars
do, then its absolute magnitude is a fixed number. However, the star will appear dimmer from
further away, and the star will appear brighter when closer. This is precisely the same as the appearance
of a lightbulb. Most lightbulbs have a
fixed luminosity (power output), but a lightbulb will still appear dimmer from
further away, and the lightbulb will still appear brighter when closer. Because of the illogical magnitude scale, the
apparent magnitude of a star will be a higher number (since the star appears
dimmer) when further from the star, and the apparent magnitude of a star will
be a lower number (since the star appears brighter) when closer to the
star. Again, Hipparchus of Nicaea
assigned lower magnitude numbers to brighter stars, and Hipparchus of Nicaea
assigned higher magnitude numbers to dimmer stars.
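The arithmetic in this discussion can be sketched in a few lines of Python. This is only an illustration under stated assumptions. The factor of 2.5 per magnitude is a rounding of the exact modern value, the fifth root of 100. The formula used for absolute magnitude follows from the inverse-square law: moving a star from a distance of d parsecs to ten parsecs changes its apparent brightness by a factor of (d/10)², which corresponds to 5 log(d/10) magnitudes. The function names are our own, and the numbers used for the Sun (an intensity of roughly 1361 watts per square meter at a distance of roughly 1.496 × 10¹¹ meters) are standard approximate values not quoted in these notes.

```python
import math


def brightness_ratio(dimmer_mag: float, brighter_mag: float) -> float:
    """How many times brighter the lower-magnitude star appears.

    Uses the exact modern definition: 5 magnitudes = a factor of 100.
    """
    return 100.0 ** ((dimmer_mag - brighter_mag) / 5.0)


def absolute_magnitude(apparent_mag: float, distance_pc: float) -> float:
    """Apparent magnitude the star would have if it were 10 parsecs distant."""
    return apparent_mag - 5.0 * math.log10(distance_pc / 10.0)


def luminosity_from_intensity(intensity: float, distance: float) -> float:
    """Solve I = L / (4 pi r^2) for L (W/m^2 and meters give watts)."""
    return 4.0 * math.pi * distance**2 * intensity


# One magnitude and five magnitudes of separation.
print(round(brightness_ratio(7.0, 6.0), 3))
print(round(brightness_ratio(7.0, 2.0), 1))

# Hypothetical star: apparent magnitude 9.0 at a distance of 100 parsecs.
print(round(absolute_magnitude(9.0, 100.0), 2))

# The Sun, using the assumed intensity and distance stated above.
print(f"{luminosity_from_intensity(1361.0, 1.496e11):.2e} W")
```

Notice that the hypothetical star is further than ten parsecs, so its absolute magnitude is a lower number than its apparent magnitude: the star would appear brighter if it were moved closer. The last line yields a solar luminosity of roughly 3.8 × 10²⁶ watts.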
As we discussed,
astrophysicists use two methods to determine the surface temperature of our
Sun. Perhaps we can apply these same two
methods to determine the surface temperatures of other stars. One of these methods uses the Wien
displacement law (essentially using the color of the star), and the other
method uses the Stefan-Boltzmann law ℒ = σ(4πR²)T⁴, where T is
the surface temperature of the star, ℒ is the luminosity (or absolute magnitude or intrinsic
brightness) of the star, and σ
is the Stefan-Boltzmann constant.
Warning: we use lowercase r
for the distance from the star, and we use uppercase (capital) R for the actual radius (physical size)
of the star. Let us first consider the
Stefan-Boltzmann law. To use this
equation to calculate the surface temperature of the star, we need the
luminosity and the actual radius (physical size) of the star. Although we just discussed the determination
of the luminosities of nearby stars in the solar neighborhood, our telescopes
are not powerful enough to magnify even these nearby stars enough to actually see their physical radii (their physical
sizes). Stars appear to be twinkling
points of light to the naked eye, and most stars still appear to be twinkling
points of light through even our most powerful telescopes. If we cannot measure the actual physical
radii of stars (their physical sizes), then we cannot use the Stefan-Boltzmann
law to calculate their surface temperatures.
We are now forced to consider the Wien
displacement law. Unfortunately, even
nearby stars in the solar neighborhood are very dim, and so we receive
insufficient light from them to graph their continuous blackbody spectra to
find the primary wavelength of their light, which we require to calculate the
surface temperature using the Wien displacement law. All seems lost, but roughly
a century ago astronomers formulated an ingenious method to construct the
continuous blackbody spectrum of a star in a coarse but effective way. We place a red filter on our telescope that
permits only red light to enter the telescope.
Thus, we measure the brightness of a star in red light only. This is called the star’s red magnitude with
the symbol mR.
After removing the red filter, we then place a blue filter on the
telescope that permits only blue light to enter the telescope. Thus, we measure the brightness of the same
star in blue light only. This is called the star’s blue magnitude with the symbol mB.
After removing the blue filter, we then place a yellow-green filter on
the telescope that permits only yellow-green light to enter the telescope. Thus, we measure the brightness of the same
star in yellow-green light only. This is called the star’s visual magnitude with the symbol mV. (Astronomers use the word visual since
yellow-green corresponds with the primary wavelength of light emitted by our
own Sun.) After measuring the brightness
of the star at these different wavelengths, we then subtract these color
magnitudes. The difference between two
color magnitudes of the same star is called a color
index. The three possible color indices
we may calculate using these three filters are mB–mV (blue minus visual), mV–mR
(visual minus red), and mB–mR
(blue minus red). These color indices
yield estimates for the surface temperature of the star. For example, if the star radiates more blue
light than any other wavelength, its surface temperature must be hotter than
the surface temperature of our own Sun.
If the star radiates more red light than any other wavelength, its
surface temperature must be cooler than the surface temperature of our own
Sun. If the star radiates more
yellow-green (visual) light than any other wavelength, its surface temperature
must be roughly the same as the surface temperature of our own Sun. Because of the illogical magnitude scale,
both mB–mV and mV–mR will be negative numbers for hot,
blue stars. Also because of this
illogical magnitude scale, both mB–mV and mV–mR will be
positive numbers for cool, red stars. Moreover because of this illogical magnitude scale, mB–mV will be a positive number and mV–mR will be a
negative number for intermediate-temperature, yellow-green stars like our
Sun. In summary, we can estimate the
surface temperature of a star by measuring its color magnitudes (brightnesses at different wavelengths) and calculating
color indices (differences of color magnitudes). By using many more filters and carefully
measuring the brightness of the star at many different wavelengths (colors), we
can calculate many color indices (perform many subtractions) to coarsely but
effectively pinpoint the peak wavelength of a star’s continuous blackbody
spectrum, enabling us to fairly accurately calculate
its surface temperature using the Wien displacement law. Now that we have calculated the surface
temperature of the star, we can then use the Stefan-Boltzmann law to calculate
the actual radius (physical size) of the star, since the actual radius
(physical size) of the star is the only unknown remaining in that
equation. This is remarkable. Even though our most powerful telescopes
cannot magnify most stars to actually see their physical radii (their physical
sizes), astronomers have nevertheless succeeded in calculating the physical
radii (physical sizes) of stars using this procedure. As the decades have passed, astronomers have
constructed larger and larger and hence more and more
powerful telescopes. For stars that are close enough and large enough, astronomers have
eventually been able to magnify them sufficiently to actually see their
physical radii (their physical sizes) through these very powerful
telescopes. The actual radii (physical
sizes) of stars that astronomers have directly measured through these very
powerful telescopes are consistent with calculations from decades earlier using
the distances, the luminosities, and the surface temperatures of stars.
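The procedure just described can be sketched in Python as follows. This is a minimal illustration, not a real photometric pipeline. The function names are our own, the magnitudes in the first example are hypothetical, and the physical constants and solar values (a luminosity of roughly 3.828 × 10²⁶ watts and a surface temperature of roughly 5772 kelvins) are standard values not quoted in these notes.

```python
import math

SIGMA = 5.670374419e-8   # Stefan-Boltzmann constant, W m^-2 K^-4
WIEN_B = 2.897771955e-3  # Wien displacement constant, m K


def color_index(mag_shorter: float, mag_longer: float) -> float:
    """Difference of two color magnitudes, for example blue minus visual."""
    return mag_shorter - mag_longer


def wien_temperature(peak_wavelength_m: float) -> float:
    """Surface temperature from the peak wavelength (Wien displacement law)."""
    return WIEN_B / peak_wavelength_m


def stefan_boltzmann_radius(luminosity_w: float, temperature_k: float) -> float:
    """Solve L = sigma (4 pi R^2) T^4 for the radius R, in meters."""
    return math.sqrt(luminosity_w / (4.0 * math.pi * SIGMA * temperature_k**4))


# Hypothetical hot star: brighter in blue (lower number) than in visual.
print(round(color_index(4.2, 4.5), 2))  # negative, so hot and blue

# A peak wavelength of 500 nanometers (yellow-green light).
print(round(wien_temperature(500e-9)))

# The Sun, using the assumed luminosity and temperature stated above.
radius_km = stefan_boltzmann_radius(3.828e26, 5772.0) / 1000.0
print(f"{radius_km:.0f} km")
```

The last line yields roughly seven hundred thousand kilometers, consistent with the solar radius quoted at the beginning of these notes.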
As we discussed earlier in
the course, the only reliable method to calculate the mass of any object in the
universe is to use Kepler’s third law.
Fortunately, most stars are members of binary star systems: two stars
orbiting each other, as we will discuss shortly. Therefore, we may use the orbital parameters
of the two stars (the orbital period and the semi-major axes of the orbits) to
calculate the masses of the stars. In
summary, astrophysicists have determined the composition of stars, the distance
to stars, the luminosity or the absolute magnitude or the intrinsic brightness
of stars, the surface temperature of stars, the physical radius (physical size)
of stars, and the mass of stars.
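As a sketch of how the orbital parameters yield the masses, the following Python fragment assumes the Newtonian form of Kepler's third law in convenient units: with the orbital period P in years and the sum of the two semi-major axes a in astronomical units, the total mass in solar masses is a³/P². It further assumes that each star's semi-major axis about the center of mass is inversely proportional to its mass. These specific forms and the function name are our assumptions for illustration, and the binary system in the example is hypothetical.

```python
def binary_masses(period_yr: float, a1_au: float, a2_au: float) -> tuple[float, float]:
    """Return (mass of star 1, mass of star 2) in solar masses.

    a1_au and a2_au are the semi-major axes of each star's orbit
    about the common center of mass, in astronomical units.
    """
    a_total = a1_au + a2_au
    # Kepler's third law (Newtonian form): M1 + M2 = a^3 / P^2.
    total_mass = a_total**3 / period_yr**2
    # Center of mass: M1 * a1 = M2 * a2, so the star with the
    # smaller orbit is the more massive star.
    mass_1 = total_mass * a2_au / a_total
    mass_2 = total_mass * a1_au / a_total
    return mass_1, mass_2


# Hypothetical binary: period of 4 years, semi-major axes of 1 AU and 3 AU.
print(binary_masses(4.0, 1.0, 3.0))
```

In this hypothetical system, the total mass is four solar masses, and the star with the smaller orbit carries three of them.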
At first, astronomers
classified stars based on the strength of their hydrogen lines in their
absorption spectra, since stars are composed mostly of hydrogen. Stars with the strongest hydrogen absorption
lines were called A-type stars. Stars with strong hydrogen absorption lines
but not as strong as A-type stars were called B-type
stars. Stars with strong hydrogen
absorption lines but not as strong as A-type stars or B-type stars were called C-type stars, and so on and so forth. In brief, stars with strong hydrogen
absorption lines have a spectral type near the beginning of the English
alphabet, while stars with weak hydrogen absorption lines have a spectral type
near the end of the English alphabet.
When astronomers determined the surface temperatures of stars using
color magnitudes and color indices, they realized that stars should
be classified based on their temperatures, not based on the strength of
their hydrogen absorption lines.
Therefore, astronomers reordered the stellar spectral types based on
surface temperature. Astronomers
discovered that the hottest, bluest stars are O-type stars. Stars that are hot and blue, but not as hot
and not as blue as O-type stars, are the B-type stars. Next come A-type stars, which are white-hot
stars, but not as hot as O-type or B-type stars. After A-type stars come F-type stars which
are also white-hot stars, but not as white-hot as A-type stars. Next come G-type stars which are yellow-hot
stars, like our own Sun. In fact, our
Sun is considered a G-type star. Even cooler than G-type stars are K-type
stars, which are orange in color.
Finally, the coolest, reddest stars are M-type stars. In summary, the spectral types of stars in the
correct order starting with the hottest stars are O, B, A, F, G, K, and finally
M for the coolest stars. For several
decades, all astronomers memorized this temperature sequence using the
mnemonic, “Oh be a fine guy/gal, kiss me!”
Astronomers have also quantified this spectral sequence. In particular, each of these spectral types is subdivided into ten subclasses running from zero through
nine. The hottest, bluest stars have a
spectral type O0 followed by O1, O2, O3, O4, O5, O6, O7, O8, and O9. After O9 would come
B0, B1, B2, B3, B4, B5, B6, B7, B8, and B9. After B9 would come A0 through A9, then F0 through F9, G0 through G9, K0 through K9, and finally M0 through M9, the spectral type of the coolest, reddest stars. As a simple exercise, a K4-star
is hotter than a K7-star. As another simple exercise, a B6-star is cooler than a B3-star. Using this quantified temperature sequence,
our Sun is more precisely classified as a G2-star. We will
discuss shortly that stars also have a luminosity type in addition to the
spectral type. The luminosity type of a
star is labeled with a Roman numeral, such as I, II, III, IV,
and V. We will discuss the
meaning of each of these luminosity types shortly. Our Sun’s luminosity type is Roman numeral V,
as we will discuss. Therefore, our Sun’s
full spectral-luminosity type is G2V. Again, G2 is our
Sun’s spectral type, which indicates that our Sun is a yellow star. The Roman numeral V is our Sun’s luminosity
type, as we will discuss shortly.
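The quantified spectral sequence lends itself to a simple ordering rule, sketched below in Python. The helper names are hypothetical, and the sketch handles only labels of the form used in these notes: one letter, one digit, and an optional Roman numeral from I through V.

```python
SPECTRAL_ORDER = "OBAFGKM"  # hottest to coolest
LUMINOSITY_TYPES = ("I", "II", "III", "IV", "V")


def parse_type(label: str) -> tuple[str, int, str]:
    """Split a label such as 'G2V' into ('G', 2, 'V').

    The luminosity type is returned as '' when the label omits it.
    """
    letter, subclass, luminosity = label[0], int(label[1]), label[2:]
    if letter not in SPECTRAL_ORDER:
        raise ValueError(f"unknown spectral type: {label}")
    if luminosity and luminosity not in LUMINOSITY_TYPES:
        raise ValueError(f"unknown luminosity type: {label}")
    return letter, subclass, luminosity


def temperature_rank(label: str) -> int:
    """Position in the sequence O0, O1, ..., M9; lower means hotter."""
    letter, subclass, _ = parse_type(label)
    return SPECTRAL_ORDER.index(letter) * 10 + subclass


def hotter(label_a: str, label_b: str) -> str:
    """Return whichever of the two labels denotes the hotter star."""
    return label_a if temperature_rank(label_a) < temperature_rank(label_b) else label_b


print(hotter("K4", "K7"))  # the two exercises from the text
print(hotter("B6", "B3"))
print(parse_type("G2V"))   # our Sun
```

Both exercises from the text are reproduced: the K4-star is hotter than the K7-star, and the B3-star is hotter than the B6-star.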
The Hertzsprung-Russell Diagram
The Hertzsprung-Russell
diagram (or the H-R diagram for short) is the single most important diagram in
all of astrophysics. This diagram is named for Danish astronomer Ejnar
Hertzsprung and American astronomer Henry Norris
Russell, the two astronomers who first constructed this diagram. The vertical axis of the Hertzsprung-Russell
diagram is luminosity or absolute magnitude or intrinsic brightness. More luminous
(intrinsically brighter) stars are toward the top of the Hertzsprung-Russell diagram, while less luminous
(intrinsically dimmer) stars are toward the bottom of the Hertzsprung-Russell
diagram. The horizontal axis of the Hertzsprung-Russell diagram is temperature or spectral type
or color. Hotter, bluer stars are toward
the left on the Hertzsprung-Russell diagram, while
cooler, redder stars are toward the right on the Hertzsprung-Russell
diagram. Since the horizontal axis of
the Hertzsprung-Russell diagram is temperature or
spectral type or color, the horizontal axis can be labeled with the spectral
types O, B, A, F, G, K, and M. Again,
notice that the hotter, bluer stars are toward the left, while the cooler,
redder stars are toward the right. We
emphasize that the vertical axis of the Hertzsprung-Russell
diagram is the absolute magnitude, not the apparent magnitude. Therefore, we must measure the distance to a
star to calculate its absolute magnitude (or luminosity or intrinsic
brightness) before we can plot the star on the Hertzsprung-Russell
diagram. Thus far
in this course, we have only discussed the measurement of distances to nearby
stars within the solar neighborhood, within a couple thousand parsecs. Until we discuss higher rungs of the
Cosmological Distance Ladder, we can only discuss the construction of the Hertzsprung-Russell diagram for nearby stars, within the
solar neighborhood. Fortunately, there
are millions of stars within the solar neighborhood. Assuming that there is nothing particularly
unusual with the stars in the solar neighborhood as compared
with all other stars throughout the universe, we should be able to
determine the fundamental properties of all the stars in the entire universe by
constructing the Hertzsprung-Russell diagram for the
stars within the solar neighborhood.
The first thing we notice
when we construct the Hertzsprung-Russell diagram for
the solar neighborhood is that the vast majority of the stars on the diagram
are along a band from the upper left corner of the diagram to the lower right
corner of the diagram. The astronomers Hertzsprung and Russell called this band the main part of
the diagram. Hence, this band on the Hertzsprung-Russell diagram was
eventually named the main sequence.
We will clearly define what we mean by a main sequence star
shortly. For now, hotter main sequence
stars are more luminous (intrinsically brighter), while cooler main sequence
stars are less luminous (intrinsically dimmer).
Therefore, we may naïvely consider main sequence stars to be normal
stars, since we simplistically expect hotter stars to be more luminous and
cooler stars to be less luminous. Also notice that the vast majority of stars on the Hertzsprung-Russell diagram are main sequence stars, again
persuading us to naïvely consider these main sequence stars to be normal
stars. The entire main sequence is assigned the luminosity type Roman numeral V. Our Sun is a main sequence star, as are the
vast majority of all stars. Hence, our
Sun’s luminosity type is Roman numeral V.
Thus, our Sun’s spectral-luminosity type is G2V,
where G2 is the spectral type (meaning that our Sun
is yellow-hot) and Roman numeral V is the luminosity type (meaning that our Sun
is a main sequence star).
Although the vast majority of
stars on the Hertzsprung-Russell diagram are along
the main sequence, there is a collection of stars on the upper right corner of
the diagram and another collection of stars on the lower left corner of the
diagram. The stars in the
upper right corner of the Hertzsprung-Russell diagram
are intrinsically bright (since they are toward the top of the diagram) and
cool (since they are toward the right on the diagram). How is it possible for a cool star to be
intrinsically bright? Some students
argue that these stars are only apparently bright, since they are closer to us,
but this argument is incorrect. Again,
the vertical axis of the Hertzsprung-Russell diagram
is the absolute magnitude, not the apparent magnitude. Stars that are toward the top of the Hertzsprung-Russell diagram are not apparently bright
because they happen to be close to us; stars that are toward the top of the Hertzsprung-Russell diagram are intrinsically bright. Thus, the stars on the upper right corner of
the Hertzsprung-Russell diagram are truly
intrinsically bright even though they are cool.
How can this be the case? The
Stefan-Boltzmann law ℒ = σ(4πR²)T⁴ reveals the answer.
The luminosity is determined by two variables:
radius (size) and temperature. The
temperature is the more important variable, since it is
raised to the fourth power in the Stefan-Boltzmann law. The radius (size) is the less important
variable, since it is raised to only the second power
in the Stefan-Boltzmann law. However,
imagine a star with a radius (a size) so enormous that squaring its radius
overpowers its cool temperature to the fourth power, resulting in a large
luminosity. Thus, the stars on the upper
right corner of the Hertzsprung-Russell diagram have
high luminosities (intrinsically bright) because they are giant (since they are
enormous) even though they are red (since they are cool). This is precisely why these stars are called red giants.
This collection of stars on the upper right corner of the Hertzsprung-Russell diagram is more properly subdivided
into red supergiants (the largest stars since they
are the most luminous), the red bright giants, the red ordinary giants, and the
red subgiants (the smallest red giants since they are the least luminous). The red supergiants
have luminosity type Roman numeral I, the red bright giants have luminosity
type Roman numeral II, the red ordinary giants have luminosity type Roman
numeral III, and the red subgiants have luminosity type Roman numeral IV. Red supergiants are
the largest stars in the entire universe; they have a radius comparable to the
radius of the Earth’s orbit around the Sun!
If we could replace our Sun with a red supergiant star, it would engulf
the entire inner Solar System! We will
often casually refer to the entire collection of stars on the upper right
corner of the Hertzsprung-Russell diagram as simply
red giants.
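To see how an enormous radius can overpower a cool temperature, it helps to divide the Stefan-Boltzmann law for a star by the same law for our Sun, which gives the luminosity in solar luminosities as the radius in solar radii squared times the temperature ratio to the fourth power. The Python sketch below uses this form. The solar surface temperature of 5772 kelvins is a standard value not quoted in these notes, the function name is our own, and the red giant in the example is hypothetical.

```python
T_SUN = 5772.0  # assumed surface temperature of our Sun, in kelvins


def luminosity_in_solar_units(radius_solar: float, temperature_k: float) -> float:
    """Stefan-Boltzmann law in solar units: L = R^2 (T / T_sun)^4."""
    return radius_solar**2 * (temperature_k / T_SUN) ** 4


# A cool star the same size as our Sun is intrinsically dim.
print(round(luminosity_in_solar_units(1.0, 3500.0), 3))

# A hypothetical red giant: equally cool, but 100 solar radii in size.
print(round(luminosity_in_solar_units(100.0, 3500.0)))
```

At 3500 kelvins, the fourth power of the temperature ratio reduces the luminosity to roughly one-seventh of the solar luminosity, but squaring a radius of one hundred solar radii multiplies this by ten thousand, yielding a luminosity of more than one thousand solar luminosities.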
The stars in the lower left
corner of the Hertzsprung-Russell diagram are
intrinsically dim (since they are toward the bottom of the diagram) and hot
(since they are toward the left on the diagram). How is it possible for a hot star to be
intrinsically dim? Some students argue
that these stars are only apparently dim, since they are further from us, but
this argument is incorrect. Again, the
vertical axis of the Hertzsprung-Russell diagram is
the absolute magnitude, not the apparent magnitude. Stars that are toward the bottom of the Hertzsprung-Russell diagram are not apparently dim because
they happen to be far from us; stars that are toward the bottom of the Hertzsprung-Russell diagram are intrinsically dim. Thus, the stars on the lower left corner of
the Hertzsprung-Russell diagram are truly
intrinsically dim even though they are hot.
How can this be the case? The
Stefan-Boltzmann law ℒ = σ(4πR²)T⁴ again reveals the answer. The luminosity is
determined by two variables: radius (size) and temperature. The temperature is the more important
variable, since it is raised to the fourth power in
the Stefan-Boltzmann law. The radius
(size) is the less important variable, since it is raised
to only the second power in the Stefan-Boltzmann law. However, imagine a star with a radius (a
size) so small that squaring its radius overpowers its hot temperature to the
fourth power, resulting in a small luminosity.
Thus, the stars on the lower left corner of the Hertzsprung-Russell
diagram have low luminosities (intrinsically dim) because they are dwarfs
(since they are small) even though they are white hot. This is precisely why these stars are called white dwarfs.
Besides neutron stars and black holes, both of which we will discuss
shortly, white dwarfs are the smallest stars in the entire universe; they have
a radius roughly equal to the radius of planet Earth! In summary, the vast majority of stars are
main sequence stars, where the main sequence runs from the upper left corner of
the Hertzsprung-Russell diagram to the lower right
corner of the Hertzsprung-Russell diagram. Some stars are red giants, which are toward
the upper right corner of the Hertzsprung-Russell
diagram, and some stars are white dwarfs, which are toward the lower left
corner of the Hertzsprung-Russell diagram. Red giants are intrinsically bright because
they are so large even though they are cool, hence their name red giants. White dwarfs are intrinsically dim because
they are so small even though they are hot, hence their name white dwarfs.
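The same reasoning can be run in reverse: given a star's position on the Hertzsprung-Russell diagram (its luminosity and its temperature), the Stefan-Boltzmann law yields its radius. The Python sketch below solves the law in solar units for the radius. The solar temperature of 5772 kelvins, the solar radius of 695,700 kilometers, and the Earth radius of 6371 kilometers are standard values not quoted in these notes, the function name is our own, and both example stars are hypothetical.

```python
import math

T_SUN = 5772.0                           # assumed solar surface temperature, K
EARTH_RADIUS_SOLAR = 6371.0 / 695700.0   # Earth's radius in solar radii


def radius_in_solar_units(luminosity_solar: float, temperature_k: float) -> float:
    """Solve L = R^2 (T / T_sun)^4 for R, in solar radii."""
    return math.sqrt(luminosity_solar) / (temperature_k / T_SUN) ** 2


# Hypothetical star in the lower left corner: hot but intrinsically dim.
dwarf = radius_in_solar_units(0.01, 20000.0)
print(round(dwarf / EARTH_RADIUS_SOLAR, 2), "Earth radii")

# Hypothetical star in the upper right corner: cool but intrinsically bright.
giant = radius_in_solar_units(1350.0, 3500.0)
print(round(giant), "solar radii")
```

The hot, dim star comes out roughly the size of planet Earth, while the cool, bright star comes out roughly one hundred times the size of our Sun.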
The main sequence is both a
temperature sequence and a luminosity sequence.
In particular, given any two stars on the main sequence, the hotter star
will be more luminous (intrinsically brighter), while the cooler star will be
less luminous (intrinsically dimmer).
Warning: this is only true on the main sequence. Is it possible for a hotter star to be less
luminous? Yes, white dwarfs are hot but
are intrinsically dim. Is it possible
for a cooler star to be more luminous?
Yes, red giants are cool but are intrinsically bright. However, given two stars on the main
sequence, the hotter star is indeed more luminous, and the cooler star is
indeed less luminous. For example,
suppose the spectral types of two stars are A9 and F2. Although the A9 star is certainly hotter since it has an earlier
spectral type and the F2 star is certainly cooler
since it has a later spectral type (recall OBAFGKM),
we cannot draw any conclusion about the luminosities of these two stars. If, however, in addition to the spectral types
of the two stars we are also told that both stars are on the main sequence,
only then may we draw the conclusion that the A9V
star (Roman numeral V for main sequence) is more luminous, while the F2V star (Roman numeral V for main sequence) is less
luminous.
In addition to being a
temperature sequence and a luminosity sequence, the main sequence is also a
radius (size) sequence. In particular,
given any two stars on the main sequence, the hotter, more luminous star will
be larger, while the cooler, less luminous star will be smaller. Warning: this is only true on the main
sequence. Is it possible for a hotter
star to be smaller? Yes, white dwarfs
are hot but are small. Is it possible
for a cooler star to be larger? Yes, red
giants are cool but are large. However,
given two stars on the main sequence, the hotter, more luminous star is indeed
larger, and the cooler, less luminous star is indeed smaller. For example, suppose the spectral types of
two stars are B2 and M8. Although the B2
star is certainly hotter since it has an earlier spectral type and the M8 star is certainly cooler since it has a later spectral
type (recall OBAFGKM), we cannot draw any conclusion
about the luminosities or the sizes of these two stars. If, however, in addition to
the spectral types of the two stars we are also told that both stars are on the
main sequence, only then may we draw the conclusion that the B2V star (Roman numeral V for main sequence) is more
luminous and larger, while the M8V star (Roman numeral
V for main sequence) is less luminous and smaller.
In addition to being a
temperature sequence, a luminosity sequence, and a radius (size) sequence, the
main sequence is also a mass sequence.
In particular, given any two stars on the main sequence, the hotter,
more luminous, larger star will be more massive, while the cooler, less
luminous, smaller star will be less massive.
Warning: this is only true on the main sequence. For example, suppose the spectral types of
two stars are G7 and K5. Although the G7
star is certainly hotter since it has an earlier spectral type and the K5 star is certainly cooler since it has a later spectral
type (recall OBAFGKM), we cannot draw any conclusion
about the luminosities, the radii (sizes), or the masses of these two
stars. If, however, in
addition to the spectral types of the two stars we are also told that both
stars are on the main sequence, only then may we draw the conclusion that the G7V star (Roman numeral V for main sequence) is more
luminous, larger, and more massive, while the K5V
star (Roman numeral V for main sequence) is less luminous, smaller, and less
massive.
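The logic of these warnings can be captured in a short Python sketch. The spectral type alone always tells us which star is hotter, but we may draw conclusions about luminosity, radius (size), and mass only when both stars carry the luminosity type Roman numeral V. The function names are hypothetical, and the sketch handles only labels of the form used in these notes.

```python
SPECTRAL_ORDER = "OBAFGKM"  # hottest to coolest


def temperature_rank(label: str) -> int:
    """Position in the sequence O0, O1, ..., M9; lower means hotter."""
    return SPECTRAL_ORDER.index(label[0]) * 10 + int(label[1])


def compare_stars(label_a: str, label_b: str) -> dict:
    """Compare two stars given labels such as 'A9', 'F2V', or 'K5III'.

    The entry 'more_luminous_larger_more_massive' is None unless
    both stars are on the main sequence (luminosity type V).
    """
    hotter_star = min(label_a, label_b, key=temperature_rank)
    both_main_sequence = label_a[2:] == "V" and label_b[2:] == "V"
    return {
        "hotter": hotter_star,
        "more_luminous_larger_more_massive": hotter_star if both_main_sequence else None,
    }


print(compare_stars("A9", "F2"))    # spectral types only: no conclusion
print(compare_stars("A9V", "F2V"))  # both on the main sequence
print(compare_stars("G7V", "K5V"))
```

With only the spectral types A9 and F2, the second entry is None, meaning that we cannot draw any conclusion. Once both stars are known to be on the main sequence, the hotter star is also reported as the more luminous, larger, and more massive star.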
In nearly every way
imaginable, our Sun is an ordinary star.
Firstly, our Sun is a main sequence star, just as the vast majority of
stars are main sequence stars. Recall
that the spectral-luminosity type of our Sun is G2V,
and notice that its spectral type G2 places it
roughly in the middle of the main sequence.
Our Sun is not toward the beginning of the main sequence such as an
O-type or a B-type main sequence star, nor is our Sun toward the end of the
main sequence such as a K-type or an M-type main sequence star. Therefore, our Sun is not particularly hot,
nor is our Sun particularly cool; our Sun is intermediate in temperature. Our Sun is not particularly intrinsically
bright, nor is our Sun particularly intrinsically dim; our Sun is intermediate
in luminosity. Our Sun is not
particularly large, nor is our Sun particularly small; our Sun is intermediate
in size. Our Sun is not particularly
high mass, nor is our Sun particularly low mass; our Sun is intermediate in
mass. Recall that our Sun has been
fusing hydrogen into helium in its core for roughly five billion years, and our
Sun will continue to fuse hydrogen into helium in its core for another roughly
five billion years. Therefore, our Sun
is not particularly young, nor is our Sun particularly old; our Sun is
intermediate in age. In nearly every way
imaginable, our Sun is an ordinary star.
The main sequence is a
temperature sequence, a luminosity sequence, a radius (size) sequence, a mass
sequence, and two more types of sequences that we will discuss shortly. We are compelled to ask the following
question: is there any type of sequence that the main sequence is not? When the astronomers Hertzsprung
and Russell first constructed the Hertzsprung-Russell
diagram, they believed that the main sequence was an evolutionary
sequence. In other words, they believed
that stars are born as hot, bright O-type stars, that stars
cool as they shine, becoming B-type followed by A-type then F-type then G-type
then K-type, and that they finally die as cool, dim M-type stars. Today, we realize that this is completely
incorrect. Stars do not evolve along the
main sequence. Unfortunately, the
astronomers Hertzsprung and Russell believed so
strongly that the main sequence was an evolutionary sequence that they called
the main sequence stars toward the upper left corner of the Hertzsprung-Russell
diagram early-type stars, and they called the main sequence stars toward the
lower right corner of the Hertzsprung-Russell diagram
late-type stars. Most unfortunately,
this incorrect nomenclature persists among astronomers and astrophysicists to
the present day. For example, an astronomer
or astrophysicist may refer to a K3V star as being
earlier than a K5V star. As another example, an astronomer or
astrophysicist may refer to an O9V star as being
later than an O3V star. Since this incorrect nomenclature persists to
the present day, we will also use this incorrect nomenclature in this
course. To summarize, the main sequence
is a temperature sequence, a luminosity sequence, a radius (size) sequence, a
mass sequence, and two more types of sequences that we will discuss
shortly. By these sequences, we mean
that given any two stars on the main sequence, the star earlier in the sequence
OBAFGKM will be hotter, more luminous, larger, and
more massive, while the star later in the sequence OBAFGKM
will be cooler, less luminous, smaller, and less massive. However, the main sequence is not an
evolutionary sequence, even though we will refer to main sequence stars toward
the left of the OBAFGKM sequence as being early-type
and main sequence stars toward the right of the OBAFGKM
sequence as being late-type. We
emphasize this again: the main sequence is not an evolutionary sequence. If stars do not evolve along the main
sequence, then how do stars actually evolve?
How are stars actually born? How
do stars actually live? How do stars
actually die? This is the next major
topic of this course, and our entire discussion of stellar evolution will be in
the context of the Hertzsprung-Russell diagram.
Stellar Evolution: Birth, Life, and Death
Stars are born from a diffuse
nebula, a giant cloud of gas many light-years across composed primarily of
hydrogen and helium. The
gases within a diffuse nebula are pushed by many different forces, including
thermal pressures, gravitational forces, magnetic pressures, and even cosmic
rays (ultra high-energy particles). All these different forces are comparable in
strength with each other in interstellar space (the space between star
systems). Thus, the gases within a
diffuse nebula are pushed in seemingly random directions, causing some regions
within the diffuse nebula to be more dense than average and other regions
within the diffuse nebula to be less dense (or more tenuous) than average. Small regions within a diffuse nebula may
become dense enough that gravity dominates over all other forces. Thus, those small regions of the diffuse
nebula will collapse from their self-gravity (under their own weight). We can gain insight into how stars are born
by considering only gravitational forces and thermal pressures. Note that this simplified argument ignores
other forces, such as magnetic pressures and cosmic rays for example. Consider a self-gravitating cloud of gas with
thermal pressures resulting from its own temperature. If this cloud of gas is more massive than a
certain critical mass, then its self-gravity will dominate over its own thermal
pressures, and the cloud will contract.
If the cloud of gas is less massive than that critical mass, then its
own thermal pressures will dominate over its self-gravity, and the cloud will
expand. If the cloud of gas is equal in
mass to this critical mass, then its self-gravity will balance its own thermal
pressures, and the cloud will remain in equilibrium. This critical mass is
called the Jeans limit (also known as the Jeans mass), named for the British physicist James Jeans who
first performed this simplified calculation.
Even in this simplified analysis, note that the Jeans limit is not a
particular amount of mass, since the Jeans limit itself depends upon the
temperature as well as the density of the gas.
In other words, the Jeans limit is actually a range of masses that
depends upon the temperature and the density of the gas. As a result, there is a range of masses that
a star can be born with, as we will discuss shortly. We again emphasize that this is a simplified
analysis. A cloud of gas more massive
than the Jeans limit may still not contract if magnetic pressures for example
are sufficiently strong. Astrophysicists
can measure the magnetic fields within a diffuse nebula from the polarization
of starlight that passes through the nebula, and astrophysicists have
discovered magnetic fields within regions of diffuse nebulae that are
sufficiently strong to prevent the contraction of gas within those regions of
the diffuse nebula. Nevertheless, if a
small region of a diffuse nebula is dense enough for gravity to dominate over
all other forces, then that small region of the diffuse nebula will contract,
collapsing from its self-gravity (under its own weight). At first, the collapse does not significantly
change the temperature of the gas, since the gas is so tenuous (low density)
that its constituent particles are so far from one another that they almost
never collide with one another. However,
as the cloud continues to collapse, it becomes more and more dense and hence
more and more opaque (less and less transparent). Eventually, the cloud becomes so dense that
if it continues to collapse, its constituent particles begin to collide with
one another more and more frequently, thus causing the collapsing cloud to
become warmer. The collapsing cloud has
now become sufficiently dense that it is able to convert gravitational energy
into heat, which is Kelvin-Helmholtz (gravitational) contraction as we
discussed. Although this collapsing
cloud is not yet a star, we now call it a protostar
beginning with this transition in density and hence increase in opacity
(decrease in transparency). As a protostar continues to collapse, it becomes hotter and
hotter due to Kelvin-Helmholtz (gravitational) contraction. These hotter temperatures cause greater
thermal pressures, which push against the self-gravity of the protostar. Hence,
the collapse of the protostar slows. This imbalance between gravitational forces
and thermal pressures may cause pulsations within the protostar,
causing its size to oscillate from large to small and back again. As a result, the luminosity of the protostar oscillates from bright to dim and back
again. These protostars
are called T Tauri variable
stars, which we will discuss later in the course. For now, if the protostar
is sufficiently massive for its self-gravity to continue to dominate over all
other forces, then it will continue to collapse, becoming hotter and
hotter. Eventually, the protostar has collapsed to such a small size that its core
temperature reaches millions of kelvins, and hydrogen begins fusing into
helium. These nuclear fusion reactions
provide an outward pressure to balance inward self-gravity. When the protostar
attains gravitational equilibrium, we say that a star is born.
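The dependence of the Jeans limit upon temperature and density can be sketched numerically. The following is only a rough illustration in Python, not a calculation from these notes: the formula is one common textbook form of the Jeans mass for a uniform cloud, and the mean particle mass (mu = 2.33) and the sample temperature and density are assumed values chosen for illustration.

```python
import math

# Physical constants in SI units.
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
K_B = 1.381e-23    # Boltzmann constant, J/K
M_H = 1.674e-27    # mass of a hydrogen atom, kg
M_SUN = 1.989e30   # one solar mass, kg


def jeans_mass(temperature_k, density_kg_m3, mu=2.33):
    """Critical (Jeans) mass in kg for a uniform cloud of gas.

    Assumed textbook form (not taken from these notes):
        M_J = (5 k T / (G mu m_H))**1.5 * (3 / (4 pi rho))**0.5
    mu is the assumed mean mass per particle in units of m_H.
    """
    thermal = (5.0 * K_B * temperature_k / (G * mu * M_H)) ** 1.5
    geometric = math.sqrt(3.0 / (4.0 * math.pi * density_kg_m3))
    return thermal * geometric


# Assumed sample conditions: 10 K gas with 1e10 particles per cubic meter.
rho = 2.33 * M_H * 1.0e10
cold_dense = jeans_mass(10.0, rho) / M_SUN
warm_dense = jeans_mass(20.0, rho) / M_SUN          # same density, hotter
cold_thin = jeans_mass(10.0, rho / 100.0) / M_SUN   # same temperature, thinner

print(f"10 K, dense gas:   {cold_dense:.1f} solar masses")
print(f"20 K, dense gas:   {warm_dense:.1f} solar masses")
print(f"10 K, tenuous gas: {cold_thin:.1f} solar masses")
```

In this sketch, hotter gas or more tenuous gas raises the critical mass, which is why the Jeans limit is a range of masses rather than a single number.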
All stars are born main sequence
stars. If all stars are born main
sequence stars, then where do red giants and white dwarfs come from? These stars come from stellar death, as we
will discuss shortly. For now, all stars
are born main sequence stars, but where along the main sequence are stars
born? With which spectral type, O, B, A,
F, G, K, or M, is a star born? As we
discussed, the Jeans limit is actually a range of masses. Hence, there is a range of many different
masses a star can be born with, and it is the mass that a star
is born with that determines the spectral type of the star. In fact, the mass of a star is the single
most important physical quantity of a star.
The mass of a star determines how it will be born, how it will live, and
how it will die. We will discuss stellar
life and stellar death shortly. For now,
if a star happens to be born with high mass because it had to overcome a large
Jeans limit, then it will be born early on the main sequence, perhaps O-type or
B-type. If a star happens to be born
with low mass because it had to overcome a small Jeans limit, then it will be
born late on the main sequence, perhaps K-type or M-type. If a star happens to be born with
intermediate mass because it had to overcome an intermediate Jeans limit, then
it will be born roughly in the middle of the main sequence, perhaps A-type,
F-type, or G-type. In brief, the mass a
star is born with determines its spectral type on the main sequence. Our argument is as follows. If a star happens to be born with high mass,
it will have strong self-gravity.
Therefore, a strong outward pressure is necessary to balance that strong
inward self-gravity, and there will be a correspondingly hot temperature
associated with that strong pressure.
Hence, the star will be born hot and bright. If a star happens to be born with low mass,
it will have weak self-gravity.
Therefore, only a weak outward pressure is necessary to balance that weak
inward self-gravity, and there will be a correspondingly cool temperature
associated with that weak pressure.
Hence, the star will be born cool and dim. This explains why the main sequence is a
temperature sequence, a luminosity sequence, and a mass sequence. High-mass stars must be born hot and bright
to provide the strong outward pressure necessary to balance the strong inward
self-gravity created by their high mass, while low-mass stars must be born cool
and dim to provide the weak outward pressure necessary to balance the weak
inward self-gravity created by their low mass.
There is an upper limit of
mass that a star is permitted to be born with. This limit is called
the Eddington limit, named for the British physicist Arthur Eddington who first
calculated this upper mass limit. The
Eddington limit is roughly equal to 100M☉ (one hundred solar masses or one hundred times the
mass of our Sun). If a protostar happens to have a mass greater than this
Eddington limit, then the outward radiation pressure generated by its
incredible luminosity will not just balance its inward self-gravity; that
enormous outward radiation pressure will overpower its inward
self-gravity. The protostar
collapses at first as usual, but the enormous outward radiation pressure
eventually halts the collapse and actually forces the protostar
to expand. Essentially, the protostar blows itself apart before it could ever be born a
main sequence star. Indeed, astronomers
have never discovered a star with a mass significantly greater than roughly 100M☉ (one
hundred solar masses or one hundred times the mass of our Sun). This Eddington limit defines the beginning of
the main sequence. The earliest main
sequence star has spectral-luminosity type O0V, and
these stars have a mass roughly equal to the Eddington limit of roughly 100M☉ (one
hundred solar masses or one hundred times the mass of our Sun). If a protostar
happens to have a mass greater than the Eddington limit, it will blow itself
apart before it can even be born a main sequence star. If a protostar
happens to have a mass less than the Eddington limit, it will be born a main
sequence star, fusing hydrogen into helium in its core.
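The competition between outward radiation pressure and inward self-gravity can be sketched with the Eddington luminosity, the luminosity at which the two balance. This is a hedged illustration in Python only: the formula below is the standard form for fully ionized hydrogen with Thomson scattering, which these notes do not state, and the function names and the sample luminosity are hypothetical choices made for this example.

```python
import math

# Physical constants in SI units.
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
M_P = 1.6726e-27     # proton mass, kg
SIGMA_T = 6.652e-29  # Thomson scattering cross section, m^2
M_SUN = 1.989e30     # one solar mass, kg
L_SUN = 3.828e26     # one solar luminosity, W


def eddington_luminosity_watts(mass_kg):
    """Luminosity at which radiation pressure balances self-gravity.

    Assumed standard form for ionized hydrogen (not from these notes):
        L_Edd = 4 pi G M m_p c / sigma_T
    """
    return 4.0 * math.pi * G * mass_kg * M_P * C / SIGMA_T


def radiation_overpowers_gravity(mass_solar, luminosity_solar):
    """True if the given luminosity exceeds the Eddington luminosity."""
    limit = eddington_luminosity_watts(mass_solar * M_SUN)
    return luminosity_solar * L_SUN > limit


per_solar_mass = eddington_luminosity_watts(M_SUN) / L_SUN
print(f"Eddington luminosity per solar mass: {per_solar_mass:.2e} solar luminosities")
print(radiation_overpowers_gravity(1.0, 1.0))      # our Sun is far below the limit
print(radiation_overpowers_gravity(100.0, 1.0e7))  # assumed, extremely luminous object
```

This sketch does not by itself yield the upper mass limit of roughly one hundred solar masses; that also requires knowing how luminosity rises with mass, which is beyond this simple illustration.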
There is a lower limit of
mass that a star is permitted to be born with, roughly
equal to 0.08M☉ (0.08
solar masses or eight percent the mass of our Sun). If a protostar
happens to have a mass less than this lower limit,
then its self-gravity will be so weak that the outward pressure necessary to
balance its weak inward self-gravity is also extraordinarily weak. The corresponding temperature is so cool that
nuclear fusion is never ignited in the core. The protostar does
eventually stop collapsing and attains gravitational equilibrium with outward
pressure balancing inward self-gravity, but the outward pressure is not provided by the nuclear fusion of hydrogen into
helium. The outward pressure is provided by electron degeneracy pressure, which we will
discuss in detail shortly. In this
course, we strictly define a main sequence star as a star that fuses hydrogen
into helium in its core. Therefore, main
sequence stars are also called hydrogen-burning stars. The use of the word burning is technically
incorrect, since the word burning implies chemical reactions instead of nuclear
reactions. Nevertheless, astronomers and
astrophysicists use this word burning not just for hydrogen fusing into helium but for any nuclear reaction. Again, the strict definition of a main
sequence star is a hydrogen-burning star, a star that fuses hydrogen into
helium in its core. If a protostar happens to have a mass less than 0.08M☉ (0.08
solar masses or eight percent the mass of our Sun), then it will not be born a
main sequence star. The protostar becomes a very low mass sphere of mostly hydrogen
and helium that is not hot enough to fuse hydrogen into helium in its
core. These are called
brown dwarf stars, although they are not strictly stars. The simple term brown dwarf instead of the
term brown dwarf star would be more correct.
This lower limit of 0.08M☉ (0.08 solar masses or eight percent the mass of our
Sun) defines the end of the main sequence.
The latest main sequence star has spectral-luminosity type M9V, and these stars have a mass roughly equal to 0.08M☉ (0.08
solar masses or eight percent the mass of our Sun). We can actually plot brown dwarfs on the Hertzsprung-Russell diagram. Since brown dwarfs are less massive and
cooler and dimmer and smaller than even M9V stars at
the end of the main sequence, brown dwarfs would be further to the right (since
they are cooler) and further down (since they are dimmer) than the end of the
main sequence. These brown dwarfs even
have their own spectral type; brown dwarfs are classified
as L-type stars, even cooler and dimmer than M-type main sequence stars. Therefore, a more complete listing of
spectral types in the correct order from hottest to coolest is OBAFGKML. These
L-type stars (brown dwarfs) are sufficiently cool that they radiate more
infrared light and less visible light as compared with main sequence
(hydrogen-burning) stars. If a protostar happens to have a mass greater than 0.08M☉ (0.08
solar masses or eight percent the mass of our Sun), then the protostar will be born a main sequence star, fusing
hydrogen into helium in its core. If a protostar happens to have a mass less than 0.08M☉ (0.08
solar masses or eight percent the mass of our Sun), then the protostar will be born a brown dwarf, a very low mass
sphere of mostly hydrogen and helium that is not hot enough to fuse hydrogen
into helium in its core. These brown
dwarf stars should sound familiar. A
gas-giant planet is a sphere of mostly hydrogen and helium that is much smaller
and much less massive than a star and is not hot enough to fuse hydrogen into
helium in its core, as we discussed earlier in the course. We suspect that the term brown dwarf star is
simply another name for gas-giant planet.
Indeed, there is virtually no difference between a brown dwarf star and
a gas-giant planet. The only difference
is the circumstances of their formation (their birth). If the object formed from a collapsing cloud
of gas within a diffuse nebula, then we name it a brown dwarf star. If the object formed within the
protoplanetary disk orbiting a true main sequence star, then we name it a
gas-giant planet. Other than their
formation (how they are born), there is virtually no difference between a brown
dwarf star and a gas-giant planet.
However, many students then conclude that Jupiter is a failed star. These students argue that if Jupiter had been
just a little more massive, then it would have become a true main sequence star,
resulting in us living in a binary star system. (We will discuss binary star systems
shortly.) This conclusion is false. The minimum mass necessary to become a true
main sequence star is roughly 0.08M☉ (0.08 solar masses or eight percent the mass of our
Sun), but Jupiter has a mass of only 0.001M☉ (one-thousandth
of a solar mass), as we discussed earlier in the course. The ratio between 0.08 and 0.001 is
eighty. Hence, the minimum mass
necessary to become a true main sequence star is roughly eighty jovian masses. In other words, Jupiter only has a very small
fraction (one-eightieth) of the mass necessary to become a true main sequence
star. Thus, the mass of Jupiter is not
close to the minimum mass necessary to become a true main sequence star. Therefore, Jupiter should
certainly be regarded as a gas-giant planet, not as a failed star. On the other hand, Jupiter might
be incorrectly regarded as a brown dwarf star by intelligent alien
lifeforms living billions of years from now, as we will discuss.
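The two mass limits and the Jupiter arithmetic above can be summarized in a short Python sketch. The limits of 100 and 0.08 solar masses and the jovian mass of 0.001 solar masses are the rough values quoted in these notes; the function name and the returned labels are hypothetical, chosen only for this illustration.

```python
# Rough values quoted in these notes, all in solar masses.
EDDINGTON_LIMIT = 100.0         # upper mass limit for a main sequence star
HYDROGEN_BURNING_LIMIT = 0.08   # lower mass limit for a main sequence star
JUPITER_MASS = 0.001            # mass of Jupiter


def classify_protostar(mass_solar):
    """Return the fate of a collapsing protostar of the given mass."""
    if mass_solar > EDDINGTON_LIMIT:
        return "blows itself apart"
    if mass_solar < HYDROGEN_BURNING_LIMIT:
        return "brown dwarf"
    return "main sequence star"


# How many jovian masses are needed to reach the hydrogen-burning limit?
jovian_masses_needed = HYDROGEN_BURNING_LIMIT / JUPITER_MASS

print(classify_protostar(150.0))
print(classify_protostar(1.0))
print(classify_protostar(JUPITER_MASS))
print(f"Minimum mass for hydrogen fusion: about {jovian_masses_needed:.0f} jovian masses")
```

An object exactly at either limit is treated here as a main sequence star; the real boundaries are approximate, so this choice is only a convention of the sketch.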
As a protostar
collapses, it spins faster and faster in accordance with the Law of
Conservation of Angular Momentum. The
amount by which a protostar collapses is so
tremendous that we can easily calculate that incredibly strong centrifugal
forces should rip apart all protostars during their
collapse, thus preventing any stars from ever forming. Since stars obviously are born, protostars must lose angular momentum as they
collapse. Firstly, a protoplanetary disk
forms around the protostar from which planets will
eventually form, as we discussed earlier in the course. Most of the angular momentum of the
collapsing gas resides in the material orbiting around the protostar,
not in the protostar itself. In the case of our own Solar System for
example, although the mass of the Sun accounts for roughly 99.9 percent of the
total mass of the entire Solar System as we discussed earlier in the course,
the rotational angular momentum of the Sun accounts for less than four percent
of the total angular momentum of the entire Solar System. In other words, the orbiting planets account
for more than ninety-six percent of the total angular momentum of the entire
Solar System. Secondly, the magnetic
field of a protostar strengthens as it
collapses. This strengthening magnetic
field ejects ionized gases at fast speeds, and these ejected ionized gases
carry angular momentum away from the protostar. These ionized gases are
often ejected as narrow columns or jets near the angular momentum axis
of the forming star system, and these jets illuminate surrounding gases as the
fast-moving jets collide with the surrounding gases. These illuminated gases, together with the
colliding ionized gases ejected from the young star system, are
called Herbig-Haro objects or HH objects for short, named for the American astronomer
George Herbig and the Mexican astronomer Guillermo Haro who discovered them.
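The reason that collapse threatens to rip a protostar apart can be sketched with the Law of Conservation of Angular Momentum. This Python illustration assumes a uniform sphere of fixed mass that loses no angular momentum, an idealization rather than a model of a real protostar, and the collapse factor of one million is an assumed sample value.

```python
def spin_up_factor(initial_radius, final_radius):
    """Factor by which the rotation rate increases during collapse.

    Assumes a uniform sphere of constant mass with conserved angular
    momentum: moment of inertia scales as R**2, so the rotation rate
    scales as 1 / R**2.
    """
    return (initial_radius / final_radius) ** 2


def centrifugal_growth_factor(initial_radius, final_radius):
    """Growth of the equatorial ratio of centrifugal force to gravity.

    Centrifugal acceleration scales as (rotation rate)**2 * R, which is
    R**-3 here, while gravity scales as R**-2, so their ratio scales
    as 1 / R.
    """
    return initial_radius / final_radius


# Assumed sample collapse: the radius shrinks by a factor of one million.
print(f"Rotation rate increases by a factor of {spin_up_factor(1.0e6, 1.0):.0e}")
print(f"Centrifugal-to-gravity ratio grows by a factor of {centrifugal_growth_factor(1.0e6, 1.0):.0e}")
```

Under these assumptions the centrifugal force gains on gravity in direct proportion to the amount of collapse, which is why a protostar must shed angular momentum to its disk and jets in order to form at all.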
Although protostars lose most of their angular
momentum through these mechanisms, they nevertheless collapse by such
tremendous amounts that they do rotate faster as they collapse. A protostar
eventually rotates so fast that the centrifugal force becomes sufficiently
strong that the protostar usually rips itself apart
into two protostars.
These two protostars remain orbiting each
other, and both protostars are eventually born as two
main sequence stars orbiting each other.
This process is called fragmentation, and it
results in a binary star system: two stars orbiting each other with possibly
planets orbiting both of the stars. Most
star systems are binary star systems, since fragmentation usually occurs from
the strong centrifugal forces from the fast rotation due to the tremendous
collapse in size of the protostar. Fragmentation can be even more severe,
resulting in three protostars that are eventually
born as a trinary star system: three stars orbiting each other with possibly
planets orbiting all three stars.
Usually, the two more massive stars orbit each other on a tighter orbit
while the least massive third star orbits those two stars along a larger
orbit. Planets would then orbit all
three stars on even larger orbits. The
closest star system to our Solar System, the α
(alpha) Centauri star system, happens to be a trinary star system. This star system is slightly more than one
parsec distant from our Solar System, and the three stars are
named α (alpha) Centauri A, α (alpha) Centauri B, and α
(alpha) Centauri C. The star α
(alpha) Centauri A happens to be a G2V star, just
like our own Sun. The star α
(alpha) Centauri C happens to be closer to our Solar System than the other two stars. Hence, α (alpha) Centauri C is the
closest star to us, besides the Sun of course.
For this reason, the star α (alpha) Centauri C is
also called Proxima Centauri, since the Latin word
proxima means nearest. Other
nearby star systems include the Barnard's Star system nearly two parsecs distant,
the Luhman 16 binary star system roughly two parsecs
distant, the Wolf 359 star system roughly 2.4 parsecs distant, and the Sirius
binary star system nearly three parsecs distant. We will discuss the Sirius binary star system
in more detail shortly. Fragmentation
may result in a quadruplet star system: four stars orbiting each other with
possibly planets orbiting all four stars.
Often, two of the stars orbit each other on one tight orbit, the other
two stars orbit each other on another tight orbit, and both pairs of stars
orbit each other on a larger orbit.
Essentially, a quadruplet star system is often a double binary star
system. Planets would then orbit all
four stars on even larger orbits.
Fragmentation may result in quintuplet star systems (five stars orbiting
each other with possibly planets orbiting all five stars), sextuplet star
systems (six stars orbiting each other with possibly planets orbiting all six
stars), and so on and so forth. All such
star systems are rare. Again, most star
systems are binary star systems, with a fair number of star systems as
single-star star systems, such as our own Solar System. We discussed that our Sun is an ordinary star
in several respects. However, there is one
thing unusual about our Sun: it did not suffer from fragmentation while it was
being born as a protostar, since it is the only star
in our Solar System. This is unusual,
since protostars usually suffer from fragmentation,
as we discussed. Since fragmentation
usually occurs, a high mass protostar will usually become fragmented into lower mass protostars. Hence,
high mass main sequence stars are less abundant (more rare), while low mass
main sequence stars are more abundant (more common). We conclude that the main sequence is also a
population-abundance sequence, meaning earlier main sequence stars (hotter,
more luminous, larger, more massive main sequence stars) are less abundant
(more rare), while later main sequence stars (cooler, less luminous, smaller,
less massive main sequence stars) are more abundant (more common). Therefore, most stars are born M-type main
sequence stars. Many stars are also born
K-type main sequence stars, but not as commonly as M-type main sequence
stars. A fair number of stars are born
G-type main sequence stars. Few stars
are born F-type main sequence stars, and even fewer stars are born A-type main
sequence stars. A
small fraction of all stars are born B-type main sequence stars, and
only a tiny fraction of stars are born O-type main sequence stars. To summarize, the main sequence is a
temperature sequence, a luminosity sequence, a radius (size) sequence, a mass
sequence, a population-abundance sequence, and one more type of sequence that
we will discuss shortly. By these
sequences, we mean that given any two stars on the main sequence, the star
earlier in the sequence OBAFGKM will be hotter, more
luminous, larger, more massive, and less abundant (more rare), while the star
later in the sequence OBAFGKM will be cooler, less
luminous, smaller, less massive, and more abundant (more common). Caution: the main sequence is not an
evolutionary sequence!
After a main sequence star is
born, it spends its life fusing hydrogen into helium in its core. The duration of time that a main sequence
star spends fusing hydrogen into helium in its core depends upon its mass. Again, we see that the mass of a star is the
single most important physical quantity of a star. The mass of a star determines how it will be
born, how it will live, and how it will die.
We have already discussed stellar birth, and we will discuss stellar
death shortly. For now, the mass of a
star determines the duration of its main sequence (hydrogen-burning)
lifetime. Many students argue that high
mass stars should live longer lives, since they have more mass and therefore
more hydrogen to use as fuel for the nuclear fusion reactions in the core. These students also argue that low mass stars
should live shorter lives, since they have less mass and therefore less
hydrogen to use as fuel for the nuclear fusion reactions in the core. Although this argument seems reasonable, it
is completely wrong. In fact, the
opposite is true: high mass main sequence stars have shorter lifetimes, while
low mass main sequence stars have longer lifetimes. Firstly, simple calculations using the basic
properties of stars reveal that this must be the case. Early-type main sequence stars may have more
mass and therefore more hydrogen to use as fuel for the nuclear fusion
reactions in the core, but early-type main sequence stars are also much more
luminous. This luminosity is ultimately
coming from the nuclear reactions in the core, and so we conclude that the
nuclear reactions are proceeding at a faster rate. Late-type main sequence stars may have less
mass and therefore less hydrogen to use as fuel for the nuclear fusion
reactions in the core, but late-type main sequence stars are also much less
luminous. Again, this luminosity is
ultimately coming from the nuclear reactions in the core, and so we conclude
that the nuclear reactions are proceeding at a slower rate. These conclusions we
have drawn from simple calculations are consistent with more complex
calculations. Nuclear reactions should
proceed at faster rates at hotter temperatures, and nuclear reactions should
proceed at slower rates at cooler temperatures.
(The same is also true for chemical reactions.) High mass main sequence stars are so hot that
the nuclear fusion reactions proceed so quickly that these high mass stars burn
through all their hydrogen in an extremely brief amount of time, even though
they have much more hydrogen to burn than low mass stars. Low mass main sequence stars are so cool that
the nuclear fusion reactions proceed so slowly that it takes these low mass
stars an extremely long amount of time to burn through their hydrogen, even
though they have much less hydrogen to burn than high mass stars. Many students argue that the lifetime of a
high mass star may be relatively shorter but must in fact be actually longer
since it has much more hydrogen to burn.
These students also argue that the lifetime of a low mass star may be
relatively longer but must in fact be actually shorter since it has much less
hydrogen to burn. This argument is again
false. High mass main sequence stars are
so hot with such tremendous luminosity that the nuclear fusion reactions
proceed so quickly that their actual lifetime is truly shorter,
even though these high mass stars have much more hydrogen to burn. Low mass main sequence stars are so cool with
such little luminosity that the nuclear fusion reactions proceed so slowly that
their actual lifetime is truly longer, even though these low mass stars have
much less hydrogen to burn. The
following analogy is helpful. Most students
believe that a rich person who earns millions of dollars per year will be able
to survive much longer than a poor slob who only earns a few thousand dollars
per year, but in fact the opposite is true. A rich person who earns millions of dollars
per year almost always spends their money at a furious pace. The rich spend their money so fast that they
burn through their money in a short amount of time, even though they have more
money to burn. A poor slob who earns
only a few thousand dollars per year hardly has any money to spend. Hence, the poor spend their money at such a
slow pace that they can survive their entire lifetimes on their miserable
salaries. Indeed, this is actually the
case; the highest bankruptcy rates are among the rich, not among the poor. Similarly, high mass main sequence stars burn
through their hydrogen more quickly even though they have more hydrogen to
burn, while low mass main sequence stars burn through their hydrogen more
slowly even though they have less hydrogen to burn. O-type main sequence stars are so hot and so
luminous that their nuclear fusion reactions proceed so quickly that they burn
through their hydrogen in an incredibly short amount of time even though they
have much more hydrogen to burn; the main-sequence lifetime of an O-type star
is roughly one million years, incredibly short by astronomical terms. B-type main sequence stars are also hot
enough and luminous enough that their nuclear fusion reactions proceed so quickly
that they burn through their hydrogen in a very short amount of time, although
they are not as hot and not as luminous as O-type stars. Therefore, B-type stars live somewhat longer
than O-type stars; the main-sequence lifetime of a B-type star is roughly ten
million years, still short by astronomical terms. A-type stars are also hot and luminous, but
not as hot and not as luminous as O-type or B-type stars. Therefore, their nuclear fusion reactions
proceed somewhat more slowly, giving A-type stars a somewhat longer
main-sequence lifetime of roughly one hundred million years. F-type stars are not as hot and not as
luminous as A-type stars; therefore, their nuclear fusion reactions proceed
somewhat more slowly, giving F-type stars a somewhat longer main-sequence
lifetime of roughly one billion years.
G-type stars have an even longer main-sequence lifetime of roughly ten
billion years. Recall that our Sun is a G2V star. Also
recall that our Sun has been fusing hydrogen into helium in its core for
roughly five billion years, and also recall that our
Sun will continue fusing hydrogen into helium in its core for the next roughly
five billion years. Therefore, the
entire main-sequence lifetime of our Sun is roughly ten billion years, as it
should be for a G-type main sequence star.
K-type stars are so cool and so dim that their nuclear fusion reactions
proceed so slowly that the main-sequence lifetime of a K-type star is roughly
one hundred billion years. This is
longer than the current age of the universe, which is only roughly fourteen
billion years. Therefore, every K-type star that has ever been born has not died
yet. We must wait roughly another
eighty-six billion years before K-type stars begin to die. Finally, M-type stars are so cool and so dim
that their nuclear fusion reactions proceed so slowly that the main-sequence
lifetime of an M-type star is roughly one trillion years, much, much longer than
the current age of the universe.
Therefore, every M-type star that has ever been born
has not died yet. We must wait
countless billions of years before any M-type stars begin to die. In brief, the main sequence is a lifetime
sequence. Given any two stars on the
main sequence, the star earlier in the sequence OBAFGKM
will have a shorter main-sequence lifetime, while the star later in the
sequence OBAFGKM will have a longer main-sequence
lifetime. Brown dwarf stars are so cool
that they do not fuse hydrogen into helium in their cores. Consequently, brown dwarf stars do not expend
their hydrogen, and so we may regard brown dwarf stars as living indefinitely.
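The lifetimes quoted above, and the fuel-versus-burn-rate reasoning behind them, can be collected in a small Python sketch. The lifetimes by spectral type and the fourteen-billion-year age of the universe are the rough values from these notes; the function names are hypothetical, and the sample star of ten solar masses and ten thousand solar luminosities is an assumed illustration, not a value from the notes.

```python
# Rough main-sequence lifetimes in years, as quoted in these notes.
MAIN_SEQUENCE_LIFETIME_YEARS = {
    "O": 1e6, "B": 1e7, "A": 1e8, "F": 1e9,
    "G": 1e10, "K": 1e11, "M": 1e12,
}
AGE_OF_UNIVERSE_YEARS = 14e9


def years_until_first_deaths(spectral_type):
    """Years we must still wait before stars of this type begin to die.

    Returns 0.0 if the lifetime is shorter than the age of the universe,
    meaning some stars of this type may already have died.
    """
    lifetime = MAIN_SEQUENCE_LIFETIME_YEARS[spectral_type]
    return max(0.0, lifetime - AGE_OF_UNIVERSE_YEARS)


def lifetime_from_fuel_and_burn_rate(mass_solar, luminosity_solar,
                                     sun_lifetime_years=1e10):
    """Scale the Sun's lifetime by (fuel available) / (rate of burning).

    Mass stands in for the fuel and luminosity for the burn rate, both
    in solar units; this is a rough proportionality, not a stellar model.
    """
    return sun_lifetime_years * mass_solar / luminosity_solar


print(f"K-type: wait {years_until_first_deaths('K'):.2e} more years")
print(f"O-type: wait {years_until_first_deaths('O'):.2e} more years")
# Assumed sample star: ten times the Sun's fuel, burned ten thousand times faster.
print(f"Sample high-mass star: {lifetime_from_fuel_and_burn_rate(10.0, 1.0e4):.0e} years")
```

The sample star has ten times more hydrogen than the Sun but a lifetime one thousand times shorter, since its luminosity outpaces its fuel supply.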
We will discuss how the mass
of a star determines its death shortly.
For now, we briefly mention that the mass of a star not only determines
its main-sequence lifetime but the duration of its death as well. The main-sequence lifetime of low-mass stars
is in the billions of years, while the main-sequence lifetime of high-mass
stars is only in the millions of years.
The processes involved with stellar death are shorter in duration as
compared with a star’s main-sequence lifetime, but these shorter durations are
in approximate proportion with the corresponding main-sequence lifetimes. In particular, the death of a low-mass star
is millions of years in duration, while the death of a high-mass star is only
thousands of years in duration. The mass
of a star even determines its protostar-lifetime. In particular, the collapse of a low-mass protostar is millions of years in duration, while the
collapse of a high-mass protostar is only thousands
of years in duration. Note that in all
cases, the main-sequence lifetime of a star is overwhelmingly longer than the
duration of its birth as a collapsing protostar and
overwhelmingly longer than the duration of its death. The main-sequence lifetime of a star is so
overwhelmingly longer than its birth and its death that, to an excellent
approximation, the total lifetime of a star may be regarded as its
main-sequence lifetime. Note that
the total lifetime of a high mass star is in the millions of years, but it
takes that long just for a low-mass protostar to
collapse. In other words, a high-mass
star could be born, could live its entire main-sequence lifetime, and could die
all in a time shorter than the time it takes a low-mass protostar
to collapse, meaning that a high-mass star is born, lives, and dies even before
a low-mass star can even be born!
Strictly, there are gradual
changes in the luminosity, the temperature, and even the radius (the size) of a
star over its main-sequence lifetime.
Nevertheless, these main-sequence changes are small as compared with the
changes in these quantities during stellar birth, when the changes are much
more severe. Also,
main-sequence changes are small as compared with the changes in these
quantities during stellar death, when the changes are also much more severe, as
we will discuss shortly. Therefore, we
will regard the luminosity, the temperature, and the radius (the size) of a
star as approximately constant (or fixed) over its main-sequence lifetime as a
satisfactory approximation. Since the
main-sequence lifetime of a star is overwhelmingly longer than its birth and
its death, we conclude that a star remains at approximately the same location
on the main sequence on the Hertzsprung-Russell
diagram during most of its life. This
validates our comparison of temperatures, luminosities, radii (sizes), masses,
population abundances, and lifetimes of main sequence stars as physically
meaningful. It is therefore appropriate
to summarize the sequences of physical quantities by spectral-type along the main
sequence. The main sequence is a
temperature sequence, a luminosity sequence, a radius (size) sequence, a mass
sequence, a population-abundance sequence, and a lifetime sequence. By these sequences, we mean
that given any two stars on the main sequence, the main sequence star earlier
in the sequence OBAFGKM will be hotter, more
luminous, larger, more massive, less abundant (more rare), with a shorter
main-sequence lifetime, while the main sequence star later in the sequence OBAFGKM will be cooler, less luminous, smaller, less
massive, more abundant (more common), with a longer main-sequence lifetime. Warning: all of these conclusions can only be drawn if both stars are on the main
sequence. If even one of the two stars
is not on the main sequence, we cannot easily make any comparisons between the
two stars. Finally, the main sequence is
not an evolutionary sequence, as we are currently discussing. In actuality, a star can be born anywhere
along the main sequence depending on its mass, and a star will remain at its
particular location on the main sequence on the Hertzsprung-Russell
diagram throughout its main-sequence lifetime, fusing hydrogen into helium in
its core. When stars die, they actually
evolve off of the main sequence, as we now discuss.
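The sequences summarized above can be expressed as a small lookup. This is a minimal sketch, assuming both stars are on the main sequence as the warning above requires; the function name and the returned labels are hypothetical and chosen only for illustration.

```python
SPECTRAL_SEQUENCE = "OBAFGKM"  # earlier letters come first


def compare_main_sequence(type_a, type_b):
    """Compare two MAIN SEQUENCE stars by spectral-type letter.

    Returns a dictionary naming which spectral type wins each comparison,
    or None if both stars have the same spectral-type letter.
    Only valid if both stars are on the main sequence.
    """
    rank_a = SPECTRAL_SEQUENCE.index(type_a)
    rank_b = SPECTRAL_SEQUENCE.index(type_b)
    if rank_a == rank_b:
        return None
    earlier, later = (type_a, type_b) if rank_a < rank_b else (type_b, type_a)
    return {
        "hotter": earlier,
        "more luminous": earlier,
        "larger": earlier,
        "more massive": earlier,
        "more abundant": later,
        "longer-lived": later,
    }


print(compare_main_sequence("M", "G"))
```

For example, comparing a G-type and an M-type main sequence star reports the G-type star as hotter, more luminous, larger, and more massive, and the M-type star as more abundant and longer-lived.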
Stellar death is defined to begin when a main sequence star has exhausted
the hydrogen in its core, having fused the hydrogen into helium. Without hydrogen in its core to fuse into
helium, the star’s main-sequence (hydrogen-burning) lifetime has ended, and
stellar death begins. For the purposes
of stellar death, we divide all main sequence stars into two categories: low
mass main sequence stars and high mass main sequence stars. A low mass main sequence star has a mass less
than 7M☉, 8M☉, or 9M☉ (seven,
eight, or nine solar masses or seven, eight, or nine times the mass of our
Sun). A high mass main sequence star has
a mass greater than 7M☉, 8M☉, or 9M☉ (seven,
eight, or nine solar masses or seven, eight, or nine times the mass of our
Sun). Note that our Sun is a low mass
main sequence star as far as stellar death is concerned, since the mass of our
Sun is 1M☉ (one
solar mass), and one is less than seven, eight, or nine! In terms of spectral types, low mass main sequence
stars have spectral types A, F, G, K, or M, while high mass main sequence stars
have spectral types O or B. Again, our
Sun is a G2V star, which falls into the low mass main
sequence category. The vast majority of
all main sequence stars are low mass; only a very small fraction of all main
sequence stars are high mass. We divide
all main sequence stars into these two categories because low mass death and
high mass death are sufficiently different that we must discuss them
separately. Actually, low mass death and
high mass death are somewhat similar to each other. High mass death is simply more violent as
compared with low mass death. In other
words, low mass death is more gentle as compared with
high mass death. Since the vast majority
of all main sequence stars are low mass, most stars die gently. Since only a very small fraction of all main
sequence stars are high mass, few stars die violently. Even though high mass death is rare, we must
devote a thorough discussion to high mass death, since we owe our very
existence to violent high mass death, as we will discuss shortly. Nevertheless, we begin our discussion with
low mass death, since the vast majority of all stars die gently, including our
own Sun.
Low mass stars have long
main-sequence lifetimes. After
exhausting the hydrogen in the core, the nuclear fusion reactions end. Thus, there is no outward pressure to balance
the inward self-gravity of the helium core.
Hence, the helium core begins to collapse under its self-gravity. As the helium core collapses, it becomes
hotter, since it is converting gravitational energy into heat. A layer of hydrogen around that collapsing
helium core becomes hot enough to itself fuse into
helium. This fusion layer around the
collapsing helium core provides pressure that pushes the outer layers of the
star further outward. If the outer
layers expand, then they must become cooler.
The core of the star and the outer layers of the star are doing two
opposite things at the same time! The
core collapses and becomes hotter, while the outer layers of the star expand
and become cooler! We can only observe
the outer layers of a star; the inner layers of a star are
hidden beneath its outer layers.
Hence, we observe the outer layers of the star become larger and
cooler. Cooler temperatures correspond
to redder colors. Therefore, the star
becomes larger and redder. In other
words, the star has become a red giant.
As we discussed, all stars are born main sequence stars, while red
giants are essentially dying stars. More
correctly, the outer layers of the star gradually expand and cool over millions
of years, turning the star from a main sequence star to an orange subgiant star
to a red giant star. Although this
gradual expansion over millions of years seems long as compared with human
timescales, this expansion is relatively short as compared with the billions of
years the star spent as a main sequence star.
The imbalance between gravitational forces and thermal pressures during
the expansion from a main sequence star to an orange subgiant star to a red
giant star may cause pulsations within the star, causing its size to oscillate
from large to small and back again. As a
result, the luminosity of the star oscillates from bright to dim and back
again. These stars are
called Cepheid variable stars, which we will discuss later in the
course. The helium core continues to
collapse, becoming hotter. Eventually,
the helium core becomes so hot that helium nuclei begin fusing into heavier
nuclei, in particular carbon nuclei.
This is called helium burning, although again
the use of the word burning is incorrect nomenclature. The moment when helium begins fusing into
carbon is called the helium flash. The
nuclear fusion of helium into carbon is more properly written 3 ⁴He → energy + ¹²C. This nuclear reaction is
called the triple-alpha process, since three helium nuclei (three alpha
particles) fuse into a carbon nucleus.
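The energy released by the triple-alpha process can be estimated from the difference in mass between three helium-4 atoms and one carbon-12 atom. The following is a minimal sketch, not part of the original derivation; it assumes standard rounded atomic masses in atomic mass units (u) and the standard conversion of roughly 931.494 MeV per u, and the function name is hypothetical.

```python
HELIUM_4_MASS_U = 4.002602   # atomic mass of helium-4 in u (rounded)
CARBON_12_MASS_U = 12.0      # atomic mass of carbon-12 in u (exact by definition)
MEV_PER_U = 931.494          # energy equivalent of 1 u in MeV (rounded)


def triple_alpha_energy_mev():
    """Energy released when three helium-4 nuclei fuse into carbon-12.

    Atomic masses are used on both sides; the six electrons on each side
    cancel, so the difference equals the nuclear mass difference.
    """
    mass_defect_u = 3 * HELIUM_4_MASS_U - CARBON_12_MASS_U
    return mass_defect_u * MEV_PER_U


print(round(triple_alpha_energy_mev(), 2), "MeV per carbon nucleus formed")
```

The result is a little over seven MeV per carbon nucleus formed, since the three helium nuclei together are slightly more massive than the carbon nucleus they become.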
Note that the electromagnetic repulsion between electrical charges is
directly proportional to the product of the charges. Hence, the temperature necessary to overpower
the electromagnetic repulsion between two helium nuclei (two alpha particles)
each having two positive protons is hotter than the temperature necessary to
overpower the electromagnetic repulsion between two hydrogen nuclei (two
protons), each having one positive proton.
In the case of helium-helium fusion, the electromagnetic repulsion is
proportional to two times two, which is four.
In the case of hydrogen-hydrogen fusion, the electromagnetic repulsion
is proportional to one times one, which is one.
Four is significantly greater than one, meaning more electromagnetic
repulsion. Hence, a hotter temperature
is required for helium-helium fusion (the basis of the triple-alpha process) as
compared with hydrogen-hydrogen fusion (the basis of the proton-proton
cycle). The helium flash causes a small
expansion of the core and hence a slight decrease in the core temperature. This in turn causes the outer layers of the
star to contract and warm. This
imbalance between gravitational forces and thermal pressures may cause
pulsations within the star, causing its size to oscillate from large to small
and back again. As a result, the
luminosity of the star oscillates from bright to dim and back again. These stars are called
RR Lyrae variable stars, which we will discuss later in
the course. Eventually, the entire star
attains a new gravitational equilibrium as a helium-burning star, although note
that there is a layer of hydrogen fusing into helium around the core where
helium fuses into carbon. The
helium-burning lifetime of the star is much shorter than its hydrogen-burning
(main-sequence) lifetime, since helium fusing into carbon occurs at much hotter
temperatures than hydrogen fusing into helium.
The star spends millions of years as a helium-burning star. Although this seems long as compared with
human timescales, these millions of years as a helium-burning star are
relatively short as compared with the billions of years the star spent as a
hydrogen-burning (main sequence) star.
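The charge-product argument made above for helium-helium fusion and hydrogen-hydrogen fusion can be sketched in a few lines. The proton counts are standard; the dictionary and function names are hypothetical. The product only measures the relative strength of the electromagnetic repulsion, not the actual fusion temperature.

```python
# Number of positive protons in each nucleus.
PROTONS = {"hydrogen": 1, "helium": 2, "carbon": 6, "oxygen": 8}


def charge_product(nucleus_a, nucleus_b):
    """Product of the charges, to which the electromagnetic repulsion
    between the two nuclei is directly proportional."""
    return PROTONS[nucleus_a] * PROTONS[nucleus_b]


print("hydrogen-hydrogen:", charge_product("hydrogen", "hydrogen"))
print("helium-helium:", charge_product("helium", "helium"))
```

The same function applies to any pair of nuclei listed in the table, so heavier pairs can be compared in the same way.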
Eventually, the star exhausts the helium in its core, ending the
triple-alpha process. Again, there is no
outward pressure to balance the inward self-gravity of the carbon core. Hence, the core again collapses, becoming
hotter. A layer of helium around that
collapsing carbon core becomes hot enough to fuse into carbon, and a layer of
hydrogen around that helium-burning layer becomes hot enough to fuse into
helium. These two fusion layers around
the collapsing carbon core provide pressure that again pushes the outer layers
of the star further outward, causing the outer layers to become cooler and
hence redder. The star has become a red
giant a second time! The imbalance
between gravitational forces and thermal pressures during the expansion from a
helium-burning star to a red giant star may cause pulsations within the star,
causing its size to oscillate from large to small and back again. As a result, the luminosity of the star
oscillates from bright to dim and back again.
These stars are called Mira variable stars,
which we will discuss later in the course.
Since the star is low mass, its self-gravity is
too weak to compress the carbon core sufficiently to ignite the nuclear fusion
of carbon nuclei into even heavier nuclei.
In other words, a carbon flash does not occur. Hence, the outer layers of the star continue
to expand until they become divorced from the very small, very hot carbon
core. The outer layers have become a
slowly expanding shell of gas. This is called a planetary nebula, which is a truly incorrect
term since a planetary nebula has nothing to do with planets! The planetary nebula exposes the very small,
very hot carbon core. This naked core is
very small and very hot since it has collapsed twice. We might suspect that this naked core is
intrinsically bright, since it is so hot.
However, this naked core is very small; it is roughly the size of the
Earth! According to the Stefan-Boltzmann
law, such a small size results in a low luminosity, even though the temperature
is hot. Therefore, this naked core is
small, hot, and intrinsically dim. The
naked core has become a white dwarf. As
we discussed, all stars are born main sequence stars, while red giants and
white dwarfs result from stellar death.
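The Stefan-Boltzmann argument above can be made concrete. The luminosity of a star is proportional to the square of its radius multiplied by the fourth power of its surface temperature, so a ratio relative to our Sun needs no physical constants. This is a minimal sketch; the temperature ratio of four is an assumed illustrative value, not a figure from these notes, and the function name is hypothetical.

```python
def luminosity_ratio(radius_ratio, temperature_ratio):
    """Luminosity relative to our Sun from the Stefan-Boltzmann law.

    radius_ratio      -- radius in solar radii
    temperature_ratio -- surface temperature relative to our Sun's
    """
    return radius_ratio**2 * temperature_ratio**4


# Assumed example: Earth-sized (0.01 solar radii) and four times hotter
# than the surface of our Sun (the factor of four is illustrative only).
print(luminosity_ratio(0.01, 4.0))
```

Even with a surface four times hotter than our Sun's, the small radius keeps the luminosity at only a few percent of the luminosity of our Sun, which is why a white dwarf is hot yet intrinsically dim.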
In summary, a low-mass main sequence (hydrogen-burning) star dies by
first becoming a red giant, enters a helium-burning phase, becomes a red giant
a second time, and finally dies as a slowly expanding planetary nebula
surrounding a white dwarf. White dwarfs
have incredible densities, since they have roughly the mass of our Sun squeezed
into roughly the size of the Earth. The
radius of the Earth, and therefore the radius of a white dwarf, is roughly 0.01R☉
(one-hundredth of a solar radius or one-hundredth the radius of our Sun). Therefore, the volume of the Earth, and
therefore the volume of a white dwarf, is roughly one-millionth the volume of
our Sun. With roughly the mass of the
Sun squeezed into roughly one-millionth the volume of the Sun, white dwarfs
therefore have densities roughly one million times normal densities! White dwarfs also have sufficiently hot
surface temperatures to radiate a fair amount of ultraviolet light. The gases of the surrounding planetary nebula
absorb some of these ultraviolet photons radiated by the hot white dwarf,
bringing the electrons within these gases to higher energy quantum states. The electrons then transition back down to
lower energy quantum states, emitting visible light photons. As a result, a planetary nebula surrounding a
white dwarf often displays a variety of beautiful colors. The planetary nebula continues to expand,
becoming cooler and cooler and more and more diffuse (less and less dense). Eventually, the gases of the planetary nebula
return to the interstellar medium. The
interstellar medium (which astrophysicists always abbreviate ISM) is the very
diffuse gas that fills the Milky Way Galaxy.
In fact, a nebula is actually a part of the interstellar medium where
densities are greater than the average densities of most of the gas of the
interstellar medium. As we discussed,
stars are born from within a diffuse nebula; therefore, stars are born from the
interstellar medium. Low mass stars live
their lives fusing hydrogen into helium, begin dying by fusing helium into
carbon, and finally die by returning the gas of their outer layers back to the
interstellar medium. These gases may
someday form a new diffuse nebula from which new stars will be born. Hence, stellar evolution is actually a cycle,
since stellar death ultimately leads to stellar birth again. Beautiful examples of planetary nebulae
include the Ring Nebula in the constellation Lyra (the harp), the Little Ghost
Nebula in the constellation Ophiuchus (the serpent bearer), and the Helix
Nebula in the constellation Aquarius (the water bearer). The white dwarf at the center of a planetary
nebula spends billions of years becoming cooler and cooler and hence dimmer and
dimmer. After many more billions of
years, a white dwarf becomes so cool and so dim that it is renamed a black
dwarf.
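The white dwarf density estimate earlier in this paragraph follows from the fact that the volume of a sphere is proportional to the cube of its radius. A minimal sketch of that arithmetic, with hypothetical function names, is:

```python
def volume_ratio(radius_ratio):
    """Volume relative to our Sun; volume scales as the cube of the radius."""
    return radius_ratio**3


def density_ratio(mass_ratio, radius_ratio):
    """Average density relative to our Sun: mass divided by volume."""
    return mass_ratio / volume_ratio(radius_ratio)


# Roughly one solar mass squeezed into roughly 0.01 solar radii.
print(volume_ratio(0.01))
print(density_ratio(1.0, 0.01))
```

The volume comes out to roughly one-millionth of the volume of our Sun, and the density to roughly one million times the average density of our Sun, in agreement with the estimate above.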
We subdivide low mass main
sequence stars into two subcategories: ordinary low mass stars and very low
mass stars. Ordinary low mass main
sequence stars have masses from 7M☉, 8M☉, or 9M☉ (seven, eight, or nine solar masses) down to roughly 0.5M☉
(one-half of one solar mass). Very low
mass main sequence stars have masses from roughly 0.5M☉
(one-half of one solar mass) all the way down to the lower limit of all main
sequence stars of roughly 0.08M☉ (0.08 solar masses).
Note that our Sun is an ordinary low mass star, since the mass of our
Sun is 1M☉ (one
solar mass), and one is between one-half and seven, eight, or nine! In terms of spectral types, ordinary low mass
main sequence stars have spectral types of A, F, or G, while very low mass main
sequence stars have spectral types of K or M.
Recall that our Sun is a G2V star, again
placing our Sun into the ordinary low mass subcategory. The stellar death we have discussed thus far
strictly applies to ordinary low mass main sequence stars, like our Sun. Very low mass main sequence stars die
somewhat differently. A very low mass
star spends countless billions of years fusing hydrogen into helium in its
core. After exhausting the hydrogen in
its core, the nuclear fusion reactions end.
Thus, there is no outward pressure to balance the inward self-gravity of
the helium core. Hence, the helium core
begins to collapse under its self-gravity.
As the helium core collapses, it becomes hotter, since it is converting
gravitational energy into heat. A layer
of hydrogen around that collapsing helium core becomes hot enough to fuse into
helium. This fusion layer around the
collapsing helium core provides pressure that pushes the outer layers of the
star further outward. If the outer
layers expand, then they must become cooler.
Again, the core of the star and the outer layers of the star are doing
two opposite things at the same time.
The core collapses and becomes hotter, while the outer layers of the
star expand and become cooler. Again, we
can only observe the outer layers of a star; the inner layers of a star are hidden beneath its outer layers. Hence, we observe the outer layers of the
star become larger and cooler. Cooler
temperatures correspond to redder colors.
Therefore, the star becomes larger and redder. In other words, the outer layers of the star
gradually expand and cool over millions of years, turning the star from a main
sequence star to a subgiant star to a giant star. The death of very low mass stars seems
identical with the death of ordinary low mass stars, but now the differences
begin. Very low mass stars have such
weak self-gravity that they cannot compress their cores to reach the threshold
temperatures at which helium fuses into carbon.
In other words, the helium flash never occurs, and the star only becomes
a red giant once instead of twice. The
outer layers of the star continue to expand, eventually becoming a planetary
nebula surrounding a helium white dwarf instead of a carbon white dwarf. As we discussed, every
K-type or M-type main sequence star that has ever been born has not died
yet. Hence, there are no helium white
dwarfs in the entire universe as of yet.
We must wait at least roughly eighty-six billion more years
before K-type stars begin to die. There
are more K-type stars than A-type stars, F-type stars, or G-type stars, since
the main sequence is a population-abundance sequence. Hence, when K-type stars begin to die, helium
white dwarfs will become the majority of the white dwarfs in the universe,
turning the carbon white dwarfs into a minority of the white dwarfs in the
universe. Countless billions of years
after that, M-type main sequence stars will begin to die, and there are even
more M-type main sequence stars than K-type main sequence stars, since again
the main sequence is a population-abundance sequence. Hence, when M-type stars begin to die, helium
white dwarfs will become the overwhelming majority of all white dwarfs in the
universe, while carbon white dwarfs will become a very small minority of all
white dwarfs in the universe.
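The mass categories and their end states described so far can be summarized in a short sketch. These notes quote the boundary between low mass and high mass as seven, eight, or nine solar masses; the value of eight used below is an assumption made only so that the code has a single threshold, and the function name is hypothetical.

```python
HIGH_MASS_THRESHOLD = 8.0      # solar masses; ASSUMED (notes quote 7, 8, or 9)
VERY_LOW_MASS_THRESHOLD = 0.5  # solar masses
MAIN_SEQUENCE_MINIMUM = 0.08   # solar masses; lower limit of the main sequence


def classify_main_sequence_star(mass_solar):
    """Return (category, end state) for a main sequence star of given mass."""
    if mass_solar < MAIN_SEQUENCE_MINIMUM:
        raise ValueError("below the lower mass limit of main sequence stars")
    if mass_solar < VERY_LOW_MASS_THRESHOLD:
        return ("very low mass", "helium white dwarf")
    if mass_solar < HIGH_MASS_THRESHOLD:
        return ("ordinary low mass", "carbon white dwarf")
    return ("high mass", "violent death, discussed separately")


print(classify_main_sequence_star(1.0))   # our Sun
print(classify_main_sequence_star(0.2))
```

Our Sun, at one solar mass, falls into the ordinary low mass subcategory and ends as a carbon white dwarf.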
Ordinary low mass stars do
not have sufficient self-gravity to compress their carbon cores to sufficient
temperatures for the carbon flash to occur.
Hence, an ordinary low mass star dies as a non-burning carbon white
dwarf surrounded by a slowly expanding planetary nebula. Very low mass stars have such weak
self-gravity that not even the helium flash occurs. Hence, a very low mass star dies as a
non-burning helium white dwarf surrounded by a slowly expanding planetary
nebula. In either case, low mass stars
die as a slowly expanding planetary nebula surrounding a non-burning white
dwarf. If there are no nuclear reactions
occurring in a white dwarf, what is providing the outward pressure to balance
the inward self-gravity to keep a white dwarf in gravitational
equilibrium? As we discussed, white
dwarfs have densities roughly one million times normal densities. At such incredible densities, electrons are squeezed close to each other. However, electrons obey the Pauli Exclusion
Principle, named for the Austrian physicist Wolfgang Pauli who first formulated
this fundamental statement of Quantum Mechanics. According to the Pauli Exclusion Principle,
certain quantum-mechanical particles are forbidden
from occupying the same quantum state at the same time. Thus, any attempt to squeeze such particles
into the same quantum state will result in a pressure against this
compression. This pressure is called degeneracy pressure. Electrons are one type of quantum-mechanical
particle that obeys the Pauli Exclusion Principle. In other words, electrons are
forbidden from occupying the same quantum state at the same time. It is because of this exclusion that
electrons within atoms must occupy higher energy states when lower energy
states happen to be already filled with
electrons. It is the
electrons in the higher energy quantum states of atoms that participate in
chemical reactions and chemical bonding.
Therefore, all of chemistry, including all of the
biochemistry essential for all life, would not occur if electrons did
not obey the Pauli Exclusion Principle. Also, since electrons obey the Pauli Exclusion Principle, it
is electron degeneracy pressure that provides the outward pressure to balance
the inward self-gravity of a white dwarf.
Electron degeneracy pressure also provides the outward pressure to
balance the inward self-gravity of brown dwarfs. Many students argue that this electron
degeneracy pressure must come from the electromagnetic repulsion of the
electrons. As we discussed earlier in
the course, like charges repel, and unlike charges attract. Since electrons are negatively charged, they
must repel each other electromagnetically, and students argue that this is the
source of the electron degeneracy pressure.
Although this argument seems reasonable, it is nevertheless wrong. Electron degeneracy pressure has nothing to
do with electromagnetic repulsion. Of
course, the electromagnetic repulsion of the electrons provides some extra
pressure in addition to the electron degeneracy pressure. However, electron degeneracy pressure has
nothing to do with the charge of electrons.
The source of electron degeneracy pressure is the spin of the
electrons. The spin of any
quantum-mechanical particle is its intrinsic angular momentum. As a crude picture, we can imagine that the
electron is spinning or turning around an axis.
According to Quantum Mechanics, it is this spinning of
the electron around an axis that is the source of the electron degeneracy
pressure. We will discuss another
type of degeneracy pressure shortly that will beautifully emphasize how
degeneracy pressure has nothing to do with electromagnetic repulsion. To summarize, white dwarfs (as well as brown
dwarfs) remain in gravitational equilibrium not due to nuclear reactions but
due to electron degeneracy pressure, which arises because the Pauli Exclusion
Principle prevents electrons (and certain other quantum-mechanical particles)
from occupying the same quantum state at the same time. Degeneracy pressure has nothing to do with
electromagnetic repulsion; degeneracy pressure arises from the intrinsic
angular momentum (the spin) of certain quantum-mechanical particles.
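The point that degeneracy pressure has nothing to do with electrical charge can be illustrated with the standard textbook expression for the pressure of a cold, non-relativistic electron gas, which is proportional to the electron number density raised to the five-thirds power. The sketch below is not derived in these notes; it assumes zero temperature, non-relativistic electrons, two nucleons per electron (as for carbon), and an illustrative density of one million times the average density of our Sun. The function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34            # reduced Planck constant, J s
ELECTRON_MASS = 9.1093837015e-31  # kg
PROTON_MASS = 1.67262192369e-27   # kg


def electron_number_density(mass_density, nucleons_per_electron=2.0):
    """Electrons per cubic meter for a given mass density in kg/m^3.

    ASSUMPTION: two nucleons per electron, appropriate for carbon.
    """
    return mass_density / (nucleons_per_electron * PROTON_MASS)


def electron_degeneracy_pressure(n_e):
    """Pressure in pascals of a cold, non-relativistic electron gas.

    Note that only the Planck constant and the electron mass appear;
    the electric charge of the electron does not enter at all.
    """
    prefactor = (3 * math.pi**2) ** (2 / 3) * HBAR**2 / (5 * ELECTRON_MASS)
    return prefactor * n_e ** (5 / 3)


# ASSUMED illustrative density: one million times the average density
# of our Sun (roughly 1400 kg/m^3).
density = 1.4e9
pressure = electron_degeneracy_pressure(electron_number_density(density))
print(f"{pressure:.2e} Pa")
```

Because the pressure depends on density alone and not on temperature, it continues to support the white dwarf even as the white dwarf cools.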
It is instructive to discuss
how our particular Solar System will die.
Our Sun is an ordinary low mass star with a main sequence
(hydrogen-burning) lifetime of roughly ten billion years. Our Sun has spent roughly five billion years
fusing hydrogen into helium in its core, and our Sun will spend an additional
roughly five billion years fusing hydrogen into helium in its core. After exhausting the hydrogen in its core,
our Sun will begin to die. Gradually
over millions of years (which is brief as compared with its ten-billion-year
main-sequence lifetime), our Sun’s helium core will collapse and become hotter
while its outer layers expand and become cooler, turning our Sun from a yellow
main sequence star to an orange subgiant star to a red giant star. The helium flash will then occur, and our Sun
will become a helium-burning star. Our
Sun’s helium-burning lifetime will last millions of years, which is again brief
as compared with its ten-billion-year main-sequence lifetime. After exhausting the helium in its core, our
Sun’s carbon core will collapse and become hotter, while its outer layers
expand and become cooler. Our Sun will
become a red giant a second time. When
the outer layers of our Sun expand to become a red giant the second time, its
outer layers will consume the inner planets (Mercury, Venus, Earth, and
Mars). However, the outer layers of a
red giant are cool, only one or two thousand kelvins in temperature. Although this temperature is hot by human
standards, it is not hot enough to melt most rocks, and it is certainly not hot
enough to melt most metals. Hence, the
inner planets will not immediately be destroyed when
our Sun’s second red giant phase consumes them.
In fact, the inner planets will at first continue
to orbit the red giant Sun while being inside the red giant Sun! This will not continue for long, however, since
the outer layers of the red giant Sun will cause drag as the inner planets
orbit within these outer layers of gas.
This drag will cause the inner planets to spiral inward toward the red
giant Sun’s core, which is certainly hot enough to melt metal and rock. This is how the inner planets will be destroyed.
The outer layers of the red giant Sun will continue to expand and
cool. By the time these gases reach the
outer planets, they will be so tenuous (low density) that they will have a
negligible effect on the outer planets.
These outer gas layers will pass the outer planets, continuing to become
cooler and cooler and more and more diffuse (less and
less dense). These outer gas layers will
eventually become a planetary nebula, returning these gases to the surrounding
interstellar medium. Now the only
gravitational attraction the outer planets will feel is from the carbon white
dwarf, the naked core of the former red giant Sun. However, the Sun has lost most of its mass,
since it ejected its outer gas layers, which became an
expanding planetary nebula. The carbon
white dwarf was once the Sun’s core, which is only a small fraction of the
Sun’s original mass. With significantly
less mass, the carbon white dwarf will not have sufficient gravitational
attraction to hold the outer planets in orbit.
Hence, the outer planets will leave their orbits, becoming rogue planets
(or orphan planets). A rogue (or orphan)
planet does not orbit any particular star but instead moves along its own
trajectory through our Milky Way Galaxy.
Finally, all that will remain of our Solar System will be a carbon white
dwarf, which was once our Sun’s core.
All of these processes will begin in roughly five billion years, and
they will take many millions of years to occur.
If we could return to our Solar System roughly six billion years from
now, all of these processes would be complete, and a carbon white dwarf would
be all that remains of our Solar System.
Billions of years from now, intelligent life may evolve on another
planet orbiting another star. These
intelligent lifeforms may even build telescopes and discover the rogue (or
orphan) planet Jupiter moving through the Milky Way Galaxy. However, these intelligent lifeforms will
have no direct evidence that Jupiter once orbited our Sun, since our Sun will
have long since died. Hence, these
intelligent lifeforms will probably mistakenly believe
that Jupiter is a brown dwarf star.
Perhaps some of the brown dwarf stars we observe today were once
gas-giant planets that were once orbiting an ancient star that has long since
died. In other words, perhaps some of
the brown dwarf stars we observe today are not brown dwarf stars at all but are
actually rogue (or orphan) planets.
High mass main sequence stars
have masses greater than 7M☉, 8M☉, or 9M☉ (seven, eight, or nine solar masses). In terms of spectral types, high mass main
sequence stars are either O-type stars or B-type stars. High mass death is somewhat similar to low
mass death but more violent. Since high
mass stars are rare, the vast majority of main sequence stars die gently, while
only a small fraction of main sequence stars die violently. Nevertheless, we must discuss high mass
death, since we owe our very existence to violent high mass death, as we will
discuss shortly.
A high mass main sequence
star spends a short amount of time fusing hydrogen into helium in its core,
only several million years. After
exhausting the hydrogen in its core, the helium center collapses and becomes
hotter, while a new layer of hydrogen fusion causes the outer layers of the
star to expand further outward and become cooler. The core is compressed
until the triple-alpha process 3 ⁴He → energy + ¹²C begins, and the star becomes a helium-burning star, having a core where helium fuses into carbon
surrounded by a layer where hydrogen fuses into helium. The helium-burning lifetime of the star is
hundreds of thousands of years, shorter than the star’s hydrogen-burning
(main-sequence) lifetime, since helium fusion occurs at hotter temperatures
than hydrogen fusion. Eventually, the
central helium is exhausted, and the carbon center collapses and becomes hotter,
while two surrounding fusion layers cause the outer layers of the star to
expand further outward and become cooler.
Thus far, high mass death seems nearly identical with low mass death,
but now the differences begin. High mass
stars have such strong self-gravity that their cores are
compressed until they attain the threshold temperature where carbon
nuclei fuse into even heavier nuclei, in particular oxygen nuclei. More strictly, carbon nuclei fuse with helium
nuclei (alpha particles) to yield oxygen nuclei. This nuclear reaction is more properly
written ¹²C + ⁴He → energy + ¹⁶O. Note that the electromagnetic repulsion
between electrical charges is directly proportional to the product of the
charges. Hence, the temperature
necessary to overpower the electromagnetic repulsion between a carbon nucleus
with six positive protons and a helium nucleus (an alpha particle) with two
positive protons is not as hot as the temperature necessary to overpower the
electromagnetic repulsion between two carbon nuclei, each having six positive
protons. In the case of carbon-helium
fusion, the electromagnetic repulsion is proportional to six times two, which
is twelve. In the case of carbon-carbon
fusion, the electromagnetic repulsion is proportional to six times six, which
is thirty-six. Twelve is significantly
less than thirty-six, meaning less electromagnetic repulsion and hence a less
hot temperature is required for carbon-helium fusion as compared with
carbon-carbon fusion. Although there is
an even weaker electromagnetic repulsion between a hydrogen nucleus (a proton)
and a carbon nucleus, the nuclear fusion of a hydrogen nucleus with any other
nucleus is slow, since it involves the weak nuclear force. The star is now a carbon-burning star, having
a core where carbon fuses into oxygen surrounded by two less hot fusion
layers. The carbon-burning lifetime of
the star is tens of thousands of years, even shorter than its helium-burning
lifetime, since carbon burning occurs at even hotter temperatures than helium
burning: the electromagnetic repulsion in carbon-helium fusion is proportional to twelve
(six times two), a larger number than the four (two times two) of helium-helium
fusion. Eventually, the central carbon is exhausted,
and the oxygen center collapses and becomes hotter, while three surrounding fusion
layers cause the outer layers of the star to expand further outward and become
cooler. These high mass stars have such
strong self-gravity that their cores are compressed
until they attain the threshold temperature where oxygen nuclei fuse into even
heavier nuclei, in particular neon nuclei.
More strictly, oxygen nuclei fuse with helium nuclei (alpha particles)
to yield neon nuclei. This nuclear
reaction is more properly written ¹⁶O + ⁴He → energy + ²⁰Ne. Again, the electromagnetic repulsion between
electrical charges is directly proportional to the product of the charges. Hence, the temperature necessary to overpower
the electromagnetic repulsion between an oxygen nucleus with eight positive
protons and a helium nucleus (an alpha particle) with two positive protons is
not as hot as the temperature necessary to overpower the electromagnetic
repulsion between two oxygen nuclei, each having eight positive protons. In the case of oxygen-helium fusion, the
electromagnetic repulsion is proportional to eight times two, which is
sixteen. In the case of oxygen-oxygen
fusion, the electromagnetic repulsion is proportional to eight times eight,
which is sixty-four. Sixteen is
significantly less than sixty-four, meaning less electromagnetic repulsion and
hence a lower temperature is required for oxygen-helium fusion as compared
with oxygen-oxygen fusion. Although
there is an even weaker electromagnetic repulsion between a hydrogen nucleus (a
proton) and an oxygen nucleus, the nuclear fusion of a hydrogen nucleus with
any other nucleus is again slow, since it involves the weak nuclear force. The star is now an oxygen-burning star,
having a core where oxygen fuses into neon surrounded by three less hot fusion
layers. The oxygen-burning lifetime of
the star is several thousand years, even shorter than its carbon-burning
lifetime, because oxygen burning occurs at even hotter temperatures than carbon
burning: oxygen-helium fusion temperatures are proportional to sixteen
(eight times two), a larger number as compared with carbon-helium fusion
temperatures which are proportional to twelve (six times two). Eventually, the central oxygen is exhausted,
the neon center collapses and becomes hotter, while four surrounding fusion
layers cause the outer layers of the star to expand further outward and become
cooler. These high mass stars have such
strong self-gravity that their cores are compressed
until they attain the threshold temperature where neon nuclei fuse into even
heavier nuclei, in particular magnesium nuclei.
More strictly, neon nuclei fuse with helium nuclei (alpha particles) to
yield magnesium nuclei. This nuclear
reaction is more properly written ${}^{20}_{10}\mathrm{Ne} + {}^{4}_{2}\mathrm{He} \rightarrow \text{energy} + {}^{24}_{12}\mathrm{Mg}$. The star is now a neon-burning star, having a
core where neon fuses into magnesium surrounded by
four less hot fusion layers. The
neon-burning lifetime of the star is several hundred years, even shorter than
its oxygen-burning lifetime, because neon burning occurs at even hotter
temperatures than oxygen burning: neon-helium fusion temperatures are
proportional to twenty (ten times two), a larger number as compared with
oxygen-helium fusion temperatures which are proportional to sixteen (eight
times two). Eventually, the central neon
is exhausted, the magnesium center collapses and becomes hotter, while five
surrounding fusion layers cause the outer layers of the star to expand further
outward and become cooler. These high
mass stars have such strong self-gravity that their cores are
compressed until they attain the threshold temperature where magnesium
nuclei fuse into even heavier nuclei, in particular silicon nuclei. More strictly, magnesium nuclei fuse with
helium nuclei (alpha particles) to yield silicon nuclei. This nuclear reaction is more properly
written ${}^{24}_{12}\mathrm{Mg} + {}^{4}_{2}\mathrm{He} \rightarrow \text{energy} + {}^{28}_{14}\mathrm{Si}$. The star is now a magnesium-burning star,
having a core where magnesium fuses into silicon surrounded by five less hot
fusion layers. The magnesium-burning
lifetime of the star is several decades, even shorter than its neon-burning
lifetime, because magnesium burning occurs at even hotter temperatures than neon
burning: magnesium-helium fusion temperatures are proportional to
twenty-four (twelve times two), a larger number as compared with neon-helium
fusion temperatures which are proportional to twenty (ten times two). Eventually, the central magnesium is
exhausted, the silicon center collapses and becomes hotter, while six
surrounding fusion layers cause the outer layers of the star to expand further
outward and become cooler. These high
mass stars have such strong self-gravity that their cores are
compressed until they attain the threshold temperature where silicon
nuclei fuse into even heavier nuclei, in particular sulfur nuclei. More strictly, silicon nuclei fuse with
helium nuclei (alpha particles) to yield sulfur nuclei. This nuclear reaction is more properly
written ${}^{28}_{14}\mathrm{Si} + {}^{4}_{2}\mathrm{He} \rightarrow \text{energy} + {}^{32}_{16}\mathrm{S}$. The star is now a silicon-burning star,
having a core where silicon fuses into sulfur surrounded by six less hot fusion
layers. The silicon-burning lifetime of
the star is several years, even shorter than its magnesium-burning lifetime,
because silicon burning occurs at even hotter temperatures than magnesium
burning: silicon-helium fusion temperatures are proportional to twenty-eight
(fourteen times two), a larger number as compared with magnesium-helium fusion
temperatures which are proportional to twenty-four (twelve times two). Eventually, the central silicon is exhausted,
the sulfur center collapses and becomes hotter, while seven surrounding fusion
layers cause the outer layers of the star to expand further outward and become
cooler. These high mass stars have such
strong self-gravity that their cores are compressed
until they attain the threshold temperature where sulfur nuclei fuse into even
heavier nuclei, in particular argon nuclei.
More strictly, sulfur nuclei fuse with helium nuclei (alpha particles)
to yield argon nuclei. This nuclear reaction
is more properly written ${}^{32}_{16}\mathrm{S} + {}^{4}_{2}\mathrm{He} \rightarrow \text{energy} + {}^{36}_{18}\mathrm{Ar}$. The star is now a sulfur-burning star, having
a core where sulfur fuses into argon surrounded by seven less hot fusion
layers. The sulfur-burning lifetime of
the star is several months, even shorter than its silicon-burning lifetime,
because sulfur burning occurs at even hotter temperatures than silicon burning:
sulfur-helium fusion temperatures are proportional to thirty-two (sixteen
times two), a larger number as compared with silicon-helium fusion temperatures
which are proportional to twenty-eight (fourteen times two). Eventually, the central sulfur is exhausted,
the argon center collapses and becomes hotter, while eight surrounding fusion
layers cause the outer layers of the star to expand further outward and become
cooler. These high mass stars have such
strong self-gravity that their cores are compressed
until they attain the threshold temperature where argon nuclei fuse into even
heavier nuclei, in particular calcium nuclei.
More strictly, argon nuclei fuse with helium nuclei (alpha particles) to
yield calcium nuclei. This nuclear
reaction is more properly written ${}^{36}_{18}\mathrm{Ar} + {}^{4}_{2}\mathrm{He} \rightarrow \text{energy} + {}^{40}_{20}\mathrm{Ca}$. The star is now an argon-burning star, having
a core where argon fuses into calcium surrounded by eight less hot fusion
layers. The argon-burning lifetime of the
star is several days, even shorter than its sulfur-burning lifetime, because
argon burning occurs at even hotter temperatures than sulfur burning:
argon-helium fusion temperatures are proportional to thirty-six (eighteen times
two), a larger number as compared with sulfur-helium fusion temperatures which
are proportional to thirty-two (sixteen times two). Eventually, the central argon is exhausted,
the calcium center collapses and becomes hotter, while nine surrounding fusion
layers cause the outer layers of the star to expand further outward and become
cooler. These high mass stars have such
strong self-gravity that their cores are compressed
until they attain the threshold temperature where calcium nuclei fuse into even
heavier nuclei, in particular titanium nuclei.
More strictly, calcium nuclei fuse with helium nuclei (alpha particles)
to yield titanium nuclei. This nuclear
reaction is more properly written ${}^{40}_{20}\mathrm{Ca} + {}^{4}_{2}\mathrm{He} \rightarrow \text{energy} + {}^{44}_{22}\mathrm{Ti}$. The star is now a calcium-burning star,
having a core where calcium fuses into titanium surrounded by nine less hot
fusion layers. The calcium-burning
lifetime of the star is even shorter than its argon-burning lifetime, because
calcium burning occurs at even hotter temperatures than argon burning:
calcium-helium fusion temperatures are proportional to forty (twenty times
two), a larger number as compared with argon-helium fusion temperatures which
are proportional to thirty-six (eighteen times two). Eventually, the central calcium is exhausted,
the titanium center collapses and becomes hotter, while ten surrounding fusion
layers cause the outer layers of the star to expand further outward and become
cooler. These high mass stars have such
strong self-gravity that their cores are compressed
until they attain the threshold temperature where titanium nuclei fuse into
even heavier nuclei, in particular chromium nuclei. More strictly, titanium nuclei fuse with
helium nuclei (alpha particles) to yield chromium nuclei. This nuclear reaction is more properly written
${}^{44}_{22}\mathrm{Ti} + {}^{4}_{2}\mathrm{He} \rightarrow \text{energy} + {}^{48}_{24}\mathrm{Cr}$. The star is now a titanium-burning star,
having a core where titanium fuses into chromium surrounded by ten less hot
fusion layers. The titanium-burning
lifetime of the star is even shorter than its calcium-burning lifetime, because titanium burning occurs at even hotter
temperatures than calcium burning: titanium-helium fusion temperatures
are proportional to forty-four (twenty-two times two), a larger number as
compared with calcium-helium fusion temperatures which are proportional to
forty (twenty times two). Eventually,
the central titanium is exhausted, the chromium center collapses and becomes
hotter, while eleven surrounding fusion layers cause the outer layers of the
star to expand further outward and become cooler. These high mass stars have such strong
self-gravity that their cores are compressed until
they attain the threshold temperature where chromium nuclei fuse into even
heavier nuclei, in particular iron nuclei and nickel nuclei. More strictly, chromium nuclei fuse with
helium nuclei (alpha particles) to yield iron nuclei, and iron nuclei fuse with
helium nuclei (alpha particles) to yield nickel nuclei. These nuclear reactions are more properly
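written ${}^{48}_{24}\mathrm{Cr} + {}^{4}_{2}\mathrm{He} \rightarrow \text{energy} + {}^{52}_{26}\mathrm{Fe}$ and ${}^{52}_{26}\mathrm{Fe} + {}^{4}_{2}\mathrm{He} \rightarrow \text{energy} + {}^{56}_{28}\mathrm{Ni}$. The star is now a chromium-burning star,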
having a core where chromium fuses into iron and nickel surrounded by eleven
less hot fusion layers. The
chromium-burning lifetime of the star is even shorter than its titanium-burning
lifetime, because chromium burning occurs at even hotter temperatures than
titanium burning: chromium-helium fusion temperatures are proportional to
forty-eight (twenty-four times two), a larger number as compared with
titanium-helium fusion temperatures which are proportional to forty-four
(twenty-two times two). Eventually, the
central chromium is exhausted, the iron-nickel center collapses and becomes
hotter, while twelve surrounding fusion layers cause the outer layers of the
star to expand further outward and become cooler. In brief, each successive nuclear reaction
occurs at hotter and hotter temperatures.
The first hydrogen-burning stage occurs at tens of millions of kelvins. Helium burning, carbon burning, and oxygen
burning each occur at hundreds of millions of kelvins. All the remaining burning (fusion) stages
occur at a few billion kelvins! Also, each successive lifetime of the star is shorter and
shorter, again since each successive nuclear reaction occurs at hotter and
hotter temperatures. The first
hydrogen-burning stage (the main-sequence lifetime) is itself relatively short
for these high-mass stars, lasting only millions of years. Helium burning, carbon burning, and oxygen
burning each last only thousands of years, neon burning lasts only centuries,
and magnesium burning lasts only decades.
Silicon burning lasts only years, sulfur burning only months, and argon
burning only days! Calcium burning and
titanium burning last only hours, and chromium burning lasts only minutes!
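The chain of helium captures described above can be tabulated in a few lines. The following is only a minimal sketch: the element list and the quoted lifetimes are transcribed from these notes, the mass numbers assume the most common alpha-ladder isotopes, and the function names are our own hypothetical choices. The "charge product" is the same rough stand-in for the electromagnetic repulsion that we used in the discussion, not a true fusion temperature.

```python
# Alpha-ladder sketch: every stage adds one helium nucleus (Z = 2, A = 4).
# The charge product Z_fuel * 2 is only the rough proxy for electromagnetic
# repulsion used in the notes, not a calculated fusion temperature.

ALPHA_Z, ALPHA_A = 2, 4

# (fuel element, atomic number Z, assumed mass number A, lifetime quoted in the notes)
STAGES = [
    ("carbon", 6, 12, "tens of thousands of years"),
    ("oxygen", 8, 16, "several thousand years"),
    ("neon", 10, 20, "several hundred years"),
    ("magnesium", 12, 24, "several decades"),
    ("silicon", 14, 28, "several years"),
    ("sulfur", 16, 32, "several months"),
    ("argon", 18, 36, "several days"),
    ("calcium", 20, 40, "hours"),
    ("titanium", 22, 44, "hours"),
    ("chromium", 24, 48, "minutes"),
]


def alpha_capture(z, a):
    """Return (Z, A) of the nucleus formed when (Z, A) captures a helium nucleus."""
    return z + ALPHA_Z, a + ALPHA_A


def charge_product(z_fuel):
    """Product of the two nuclear charges for fuel-plus-helium fusion."""
    return z_fuel * ALPHA_Z


def ladder_table():
    """Build one row per burning stage: fuel, product, charge product, lifetime."""
    rows = []
    for name, z, a, lifetime in STAGES:
        z_out, a_out = alpha_capture(z, a)
        rows.append((name, z, a, z_out, a_out, charge_product(z), lifetime))
    return rows


if __name__ == "__main__":
    for name, z, a, z_out, a_out, product, lifetime in ladder_table():
        print(f"{name:>9} (Z={z:2d}, A={a:2d}) + He -> (Z={z_out:2d}, A={a_out:2d})  "
              f"charge product {product:2d}  lifetime: {lifetime}")
```

Reading down the printed table, the charge product climbs steadily from twelve to forty-eight while the quoted lifetimes shrink from tens of thousands of years to minutes, which is the trend summarized above.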
Many students now conclude
that successively hotter and hotter nuclear reactions continue to occur,
synthesizing heavier and heavier nuclei all the way to the end of the Periodic
Table of Elements. However, this nuclear
reaction chain actually ends at iron and nickel, which is roughly halfway
through the Periodic Table of Elements.
As we discussed, nuclear fission is the splitting of
more massive (heavier) nuclei into less massive (lighter) nuclei, while
nuclear fusion is the merging or fusing of less massive (lighter) nuclei into
more massive (heavier) nuclei. Both of
these types of nuclear reactions generate energy because atoms of intermediate
mass (roughly halfway through the Periodic Table of Elements) have the most stable
nuclei among all atoms. The most massive
(heaviest) nuclei attain greater stability by splitting into less massive
(lighter) nuclei, hence releasing energy.
The least massive (lightest) nuclei attain greater stability by merging
or fusing into more massive (heavier) nuclei, again releasing energy. Hence, attempting to merge or fuse nuclei of
intermediate mass (roughly halfway through the Periodic Table of Elements) into
more massive (heavier) nuclei would result in those more massive (heavier)
nuclei spontaneously splitting back into the intermediate nuclei. Similarly, attempting to split nuclei of
intermediate mass (roughly halfway through the Periodic Table of Elements) into
less massive (lighter) nuclei would result in those less massive (lighter)
nuclei spontaneously merging or fusing back into the intermediate nuclei. Iron and nickel are intermediate-mass atoms,
roughly halfway through the Periodic Table of Elements. In fact, iron and nickel nuclei are among the
most stable of all the nuclei in the universe.
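One way to see this stability peak numerically is the semi-empirical mass formula from nuclear physics, which estimates the binding energy of a nucleus from its number of protons and neutrons. The sketch below is an illustration only: the formula is not derived in these notes, the coefficients are one common textbook set (other references quote slightly different values), and the function names are hypothetical.

```python
import math

# Semi-empirical mass formula coefficients in MeV. This is one common
# textbook set and is an assumption of this sketch, not part of the notes.
A_VOL, A_SURF, A_COUL, A_ASYM, A_PAIR = 15.75, 17.8, 0.711, 23.7, 11.18


def binding_energy(z, a):
    """Approximate total binding energy in MeV of a nucleus with Z protons and A nucleons."""
    n = a - z
    if z % 2 == 0 and n % 2 == 0:
        pairing = A_PAIR / math.sqrt(a)      # even-even nuclei are more tightly bound
    elif z % 2 == 1 and n % 2 == 1:
        pairing = -A_PAIR / math.sqrt(a)     # odd-odd nuclei are less tightly bound
    else:
        pairing = 0.0
    return (A_VOL * a
            - A_SURF * a ** (2 / 3)
            - A_COUL * z * (z - 1) / a ** (1 / 3)
            - A_ASYM * (a - 2 * z) ** 2 / a
            + pairing)


def best_binding_per_nucleon(a):
    """Largest binding energy per nucleon over all proton numbers for mass number A."""
    return max(binding_energy(z, a) / a for z in range(1, a))


def most_stable_mass_number(a_min=4, a_max=240):
    """Mass number with the highest binding energy per nucleon in the given range."""
    return max(range(a_min, a_max + 1), key=best_binding_per_nucleon)


if __name__ == "__main__":
    for a in (4, 12, 16, 28, 56, 120, 238):
        print(f"A = {a:3d}: about {best_binding_per_nucleon(a):.2f} MeV per nucleon")
    print("peak near A =", most_stable_mass_number())
```

In this approximation the binding energy per nucleon rises from the light nuclei, peaks in the neighborhood of iron and nickel, and declines toward the heaviest nuclei, so both fusion of light nuclei and fission of heavy nuclei move toward the peak and release energy.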
Thus, the nuclear reaction chain at the center of a high mass star ends
at iron and nickel. Caution: the
physical strength of iron has nothing to do with its nuclear stability; the
physical strength of iron arises from interactions among its electrons that
reside in atomic states around the nucleus, not nuclear states within the
nucleus. The core of the high mass star
now has several layers, rather like the layers of an onion. Starting at the center of
the many-layered core, we have non-burning iron and nickel surrounded by a layer
of chromium burning (fusing) into iron and nickel surrounded by a layer of
titanium burning (fusing) into chromium surrounded by a layer of calcium
burning (fusing) into titanium surrounded by a layer of argon burning (fusing)
into calcium surrounded by a layer of sulfur burning (fusing) into argon
surrounded by a layer of silicon burning (fusing) into sulfur surrounded by a
layer of magnesium burning (fusing) into silicon surrounded by a layer of neon
burning (fusing) into magnesium surrounded by a layer of oxygen burning
(fusing) into neon surrounded by a layer of carbon burning (fusing) into oxygen
surrounded by a layer of helium burning (fusing) into carbon surrounded by a
layer of hydrogen burning (fusing) into helium. Surrounding this many-layered core is the
rest of the star, which is not hot enough for any nuclear fusion reactions to
occur. Hence, most of the star is
composed of roughly seventy-five percent (three-quarters) hydrogen and roughly
twenty-five percent (one-quarter) helium.
With each core collapse, these outer layers of the star have expanded
further outward, becoming cooler and therefore redder. Since the outer layers of the star have
expanded many times with each of the many collapses of the core, the star has
become enormous. The star has become a
red supergiant. While low mass stars
begin to die by becoming red giants, high mass stars begin to die by becoming
red supergiants.
Since non-burning iron and nickel constitute the center of the
many-layered core of this supergiant star, the non-burning iron and nickel
center must be supported by electron degeneracy
pressure. Note that the center of the
core has compressed many times, squeezing the electrons closer and closer to
each other. As we discussed, white dwarfs
are supported by electron degeneracy pressure. Therefore, we may regard the center of the
many-layered core of this supergiant star as an iron-nickel white dwarf. This iron-nickel white dwarf core has
collapsed many times, making it small and hot.
In brief, at this stage of the life of a high mass star, it has become a
supergiant star with a many-layered core, and the center of that many-layered
core of the supergiant star is an iron-nickel white dwarf supported by electron
degeneracy pressure.
The iron-nickel white dwarf
that forms the center of the many-layered core of a red supergiant star is
under such tremendous pressure that exotic nuclear reactions can occur. One such exotic nuclear reaction is called electron capture, where a proton devours an
electron thus transmuting itself into a neutron and emitting a neutrino. This nuclear reaction is more properly
written ${}^{1}_{1}\mathrm{H} + e^{-} \rightarrow {}^{1}_{0}\mathrm{n} + \nu_{e}$.
Caution: in nuclear physics, the symbol ${}^{1}_{1}\mathrm{H}$ is used for the
hydrogen-1 nucleus, which is simply a proton.
Also note that ${}^{1}_{0}\mathrm{n}$ is the symbol of the neutron in nuclear
physics, as we discussed. Also as we discussed, $e^{-}$
is the symbol of the (ordinary) electron, and $\nu_{e}$ is the symbol
of the neutrino. Neutrinos are extremely
weakly interacting particles, as we also discussed. Hence, the neutrinos generated by this
nuclear reaction simply fly out of the center of the many-layered core, passing
through all the other layers of the core, flying through the outer layers of
the red supergiant, and propagating into the surrounding outer space at nearly
the speed of light. The iron-nickel
white dwarf center was being supported by electron
degeneracy pressure. If protons are
devouring electrons, then the electron degeneracy pressure that was supporting
the center of the many-layered core vanishes.
The neutrons that were synthesized by this
nuclear reaction go into free fall, since there is no pressure to balance
self-gravity. According to Quantum
Mechanics, neutrons obey the Pauli Exclusion Principle, just as electrons obey
the Pauli Exclusion Principle. In other
words, no two neutrons can occupy the same quantum state at the same time, and
thus attempting to squeeze neutrons together results in a repulsion called
neutron degeneracy pressure. This
beautifully illustrates that degeneracy pressure has nothing to do with
electromagnetic repulsion. Neutrons are
neutral; they do not attract or repel each other electromagnetically. However, neutrons do repel each other through
neutron degeneracy pressure if they are squeezed too
close to each other. The neutrons
therefore stop collapsing when neutron degeneracy pressure halts their
collapse. It is not difficult to
calculate that neutron degeneracy pressure halts the collapse when the
iron-nickel white dwarf has collapsed from the size of the Earth (the white
dwarf size scale) down to a radius of roughly ten kilometers. This is roughly the size of a city! This incredibly small and dense sphere of
neutrons supported by neutron degeneracy pressure is called
a neutron star. The existence of white
dwarfs is already difficult to comprehend, since they have compressed roughly
the mass of our Sun into roughly the size of the Earth, with
a density roughly one million times normal densities. Now imagine compressing roughly the mass of
our Sun into roughly the size of a city!
The resulting neutron star is hundreds of millions of times
denser than even a white dwarf, making a neutron star hundreds of trillions
of times denser than normal matter! These densities are fantastic, far beyond
human comprehension. As
a result of these fantastic densities, the gravity near a neutron star
significantly warps the fabric of space and time around it, as we will discuss
shortly in the context of Einstein’s theory of gravity, the General Theory of
Relativity. Although the density of a
neutron star is far beyond human imagination, its density is actually roughly
equal to the density of every nucleus of every atom composing everything in the
universe, including our own bodies.
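These density comparisons can be checked with simple arithmetic. The sketch below assumes exactly one solar mass for both objects, the Earth's radius for the white dwarf, ten kilometers for the neutron star, and a nucleon radius of about 1.2 femtometers for the nuclear density; these round numbers and all of the names are illustrative assumptions rather than measured values for any particular star.

```python
import math

M_SUN = 1.989e30           # kilograms
R_EARTH = 6.371e6          # meters, assumed white dwarf radius
R_NEUTRON_STAR = 1.0e4     # meters, the ten kilometers quoted in the notes
NUCLEON_MASS = 1.675e-27   # kilograms
NUCLEON_RADIUS = 1.2e-15   # meters, assumed size scale of a single nucleon


def mean_density(mass, radius):
    """Mean density in kg/m^3 of a uniform sphere."""
    return mass / (4.0 / 3.0 * math.pi * radius ** 3)


def density_summary():
    """Return white dwarf, neutron star, and nuclear densities in kg/m^3."""
    white_dwarf = mean_density(M_SUN, R_EARTH)
    neutron_star = mean_density(M_SUN, R_NEUTRON_STAR)
    nuclear = mean_density(NUCLEON_MASS, NUCLEON_RADIUS)
    return white_dwarf, neutron_star, nuclear


if __name__ == "__main__":
    wd, ns, nuc = density_summary()
    print(f"white dwarf : {wd:.2e} kg/m^3")
    print(f"neutron star: {ns:.2e} kg/m^3")
    print(f"nucleus     : {nuc:.2e} kg/m^3")
    print(f"neutron star / white dwarf: {ns / wd:.2e}")
    print(f"neutron star / nucleus    : {ns / nuc:.2f}")
```

Because both objects are given the same mass, the density ratio is simply the cube of the ratio of the radii, which is why shrinking from the size of the Earth to the size of a city raises the density by hundreds of millions.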
Therefore, we may regard a neutron star as an enormous atomic
nucleus! The most massive atoms at the
end of the Periodic Table of Elements have atomic masses of nearly three
hundred, but far far far beyond those atoms are neutron stars with atomic
masses of roughly one octillion nonillion or one septillion decillion! It is not difficult to calculate that the
free-fall collapse of the iron-nickel white dwarf from roughly the size of the
Earth to a neutron star roughly the size of a city occurs in roughly one
second!
It is also not difficult to calculate the amount of energy liberated
when roughly one solar mass collapses from roughly the size of the Earth to
roughly the size of a city. The energy
liberated is comparable to the total energy radiated by our Sun over its entire
ten billion year lifetime! The resulting
luminosity of this high mass star is in the billions of solar
luminosities! This is roughly the total
power output of an entire galaxy of stars!
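The energy comparison can be sketched as an order-of-magnitude estimate. We assume here that the energy released is roughly G M^2 (1/R_final - 1/R_initial) for one solar mass, ignoring numerical factors of order one that depend on how the mass is distributed, and we take the Sun's present luminosity as constant over ten billion years. The function names are hypothetical.

```python
G = 6.674e-11              # gravitational constant, SI units
M_SUN = 1.989e30           # kilograms
L_SUN = 3.828e26           # watts
SECONDS_PER_YEAR = 3.156e7
R_EARTH = 6.371e6          # meters, assumed initial radius
R_NEUTRON_STAR = 1.0e4     # meters, assumed final radius


def gravitational_energy_released(mass, r_initial, r_final):
    """Rough gravitational energy in joules released when a mass shrinks in radius."""
    return G * mass ** 2 * (1.0 / r_final - 1.0 / r_initial)


def solar_lifetime_output(lifetime_years=1.0e10):
    """Energy in joules radiated by the Sun at constant luminosity over its lifetime."""
    return L_SUN * lifetime_years * SECONDS_PER_YEAR


if __name__ == "__main__":
    collapse = gravitational_energy_released(M_SUN, R_EARTH, R_NEUTRON_STAR)
    sun = solar_lifetime_output()
    print(f"collapse energy     : {collapse:.2e} J")
    print(f"solar lifetime total: {sun:.2e} J")
    print(f"ratio               : {collapse / sun:.0f}")
```

With these round inputs the collapse energy comes out at or above the Sun's entire lifetime output, so the comparison in the text is, if anything, conservative for this crude model. Most of this energy is carried away by the neutrinos rather than by light.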
The liberation of such fantastic quantities of energy in such an incredibly
short amount of time is obviously a cataclysmic explosion. This is how high mass stars die; they
obliterate themselves in a spectacularly violent explosion called a supernova. Strictly, this is a Type II (Roman numeral)
supernova. We will discuss Type I (Roman
numeral) supernovae later in the course.
To summarize, high mass stars live short main-sequence
lifetimes, swell to become red supergiants, and
explode as Type II supernovae. The
violence of this explosion throws the outer layers of the star away from the
explosion at very high speeds and heats these gases to millions of kelvins of
temperature. This rapidly expanding, hot
gas is called a supernova remnant, which
astrophysicists abbreviate as SNR.
Beautiful examples of supernova remnants include the Crab Nebula in the
constellation Taurus (the bull), the Tycho Nebula in
the constellation Cassiopeia (the queen of Aethiopia),
and the Kepler Nebula in the constellation Ophiuchus (the serpent bearer).
The Type II supernova of a
high mass star is so violent that all the nuclei across the
entire Periodic Table of Elements are synthesized by this cataclysmic explosion. The nuclear reactions do not end at iron and
nickel, roughly halfway through the Periodic Table of Elements. The nuclear reactions actually proceed all
the way to the end of the Periodic Table of Elements, synthesizing even the
most massive (heaviest) of all nuclei, such as uranium and plutonium. As we will discuss toward the end of the course,
the universe was essentially pure hydrogen and helium when it was born in the
fires of the Big Bang. If the universe
was born pure hydrogen and helium, where did all the other atoms of the
Periodic Table of Elements come from?
Most stars are born low mass, and these low mass stars fuse hydrogen
into helium. At best, they can fuse
helium into carbon. However, high mass
stars synthesize all the elements up to iron and nickel within their cores, and
then synthesize all the elements through to the end of the Periodic Table of
Elements within their violent Type II supernovae. The rapidly expanding supernova remnant
throws all these nuclei into the surrounding outer space. As the supernova remnant expands, it becomes
cooler and cooler and more and more diffuse (less and less dense). Eventually, the gases of the supernova
remnant return to the interstellar medium, enriching or polluting the
interstellar medium with these new nuclei.
These enriched or polluted gases may someday form a new diffuse nebula
from which new stars will be born, but now this diffuse nebula has been enriched or polluted with new nuclei. We now realize why we owe our very existence
to high mass death. Our bodies are
composed of these atoms, such as the iron in our blood, the sodium and
potassium in our nerves, the calcium in our bones, and the oxygen that composes
the water that makes up most of our bodies.
All the terrestrial planets, including our own planet Earth, are also
composed of these atoms, such as iron and nickel and silicon and oxygen, as we
discussed earlier in the course. Without
high mass stellar death, there would be no terrestrial planets and no
life. If the universe was essentially
pure hydrogen and helium when it was born in the fires of the Big Bang, then the
first generation of stars born in the universe were essentially pure hydrogen
and helium, and therefore they could not have had terrestrial planets orbiting
them. At best, they had jovian, gas-giant planets orbiting
them. The deaths of these first generation
stars were essential to creating future generations of stars that could have
terrestrial planets orbiting them and therefore the potential for life on some
of these terrestrial planets, in particular our planet Earth.
We subdivide high mass main sequence
stars into two subcategories: ordinary high mass stars and very high mass
stars. Ordinary high mass stars have
masses from 7M☉, 8M☉, or 9M☉ (seven,
eight, or nine solar masses) up to 20M☉ to 25M☉ (twenty to twenty-five solar masses). Very high mass stars have masses from 20M☉ to 25M☉ (twenty
to twenty-five solar masses) all the way up to the Eddington limit of roughly 100M☉ (one
hundred solar masses). In terms of
spectral types, we will regard ordinary high mass stars as B-type stars and
very high mass stars as O-type stars.
The stellar death we have discussed strictly applies to ordinary high
mass stars. A very high mass star also
lives a very short main-sequence lifetime and also
swells to become a red supergiant with an iron-nickel white dwarf center
surrounded by a many-layered core. The
red supergiant also explodes with a Type II supernova, again initiated by
electron capture in the core that emits neutrinos, which again throws outward a
very hot and rapidly expanding supernova remnant. Very high mass death seems identical to
ordinary high mass death, but the difference is as follows. Very high mass stars have such tremendous
self-gravity that not even neutron degeneracy pressure can halt the core collapse. If neutron degeneracy pressure cannot halt
the collapse of the core, then nothing can halt the collapse of the core. The core continues collapsing all the way
down to a mathematical point. This is
the ultimate triumph of gravity. This
mathematical point is called a black hole, which we
will discuss in more detail shortly in the context of Einstein’s theory of
gravity, the General Theory of Relativity.
Recall that the main sequence is a population-abundance sequence, with
higher mass main sequence stars being less abundant (more rare) than lower mass
main sequence stars, which are more abundant (more common). Thus, very high mass stars are more rare than ordinary high mass stars. Therefore, most Type II supernovae leave
behind a hot supernova remnant rapidly expanding away from a neutron star. On rare occasions, a Type II supernova will
leave behind a hot supernova remnant rapidly expanding away from a black
hole. To summarize high mass death, the
star spends a short time as a hydrogen-burning (main sequence) star, swells to
become a red supergiant with an iron-nickel white dwarf center surrounded by a
many-layered core, and explodes as a Type II supernova. The Type II supernova is
triggered by electron capture in the core that emits neutrinos,
collapses the core, and throws outward a hot supernova remnant that rapidly
expands away from either a neutron star or a black hole. Notice that high mass death is similar to low
mass death, just more violent. As we
discussed, a low mass star spends a long time as a hydrogen-burning (main sequence)
star, swells to become a red giant, and finally dies as a planetary nebula
slowly expanding away from a white dwarf.
The supernova remnant for high mass death is analogous to the planetary
nebula for low mass death, and the neutron star or the black hole for high mass
death is analogous to the white dwarf for low mass death. We can turn this logic completely around and
claim that low mass death is similar to high mass death, just more gentle. The planetary nebula for low mass death is
analogous to the supernova remnant for high mass death, and the white dwarf for
low mass death is analogous to the neutron star or the black hole for high mass
death.
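The parallel between low mass death and high mass death can be summarized as a small lookup. This is a sketch only: the notes give ranges for the boundaries (seven to nine solar masses, and twenty to twenty-five solar masses), so the single threshold values below are illustrative assumptions, and the function name is hypothetical.

```python
# Mass boundaries in solar masses. The notes quote ranges (7 to 9 and 20 to 25),
# so these single values are assumptions chosen for illustration only.
HIGH_MASS_THRESHOLD = 8.0
VERY_HIGH_MASS_THRESHOLD = 25.0
EDDINGTON_LIMIT = 100.0


def stellar_fate(mass_in_solar_masses):
    """Return (giant phase, expanding gas, compact remnant) for a main sequence star."""
    if mass_in_solar_masses <= 0 or mass_in_solar_masses > EDDINGTON_LIMIT:
        raise ValueError("mass outside the range covered by this sketch")
    if mass_in_solar_masses < HIGH_MASS_THRESHOLD:
        # Low mass death: gentle.
        return ("red giant", "planetary nebula", "white dwarf")
    if mass_in_solar_masses < VERY_HIGH_MASS_THRESHOLD:
        # Ordinary high mass death: Type II supernova halted by neutron degeneracy pressure.
        return ("red supergiant", "supernova remnant", "neutron star")
    # Very high mass death: not even neutron degeneracy pressure halts the collapse.
    return ("red supergiant", "supernova remnant", "black hole")


if __name__ == "__main__":
    for mass in (1.0, 15.0, 40.0):
        print(mass, "solar masses:", stellar_fate(mass))
```

Each tuple lines up position by position, which is the analogy drawn above: the planetary nebula corresponds to the supernova remnant, and the white dwarf corresponds to the neutron star or the black hole.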
Supernovae are rare, since
only a tiny fraction of all stars are high mass stars that
die in Type II supernova explosions.
In a typical galaxy like our Milky Way Galaxy that is composed of
roughly one hundred billion stars, there is only one supernova per century on
average. If a supernova occurs in a
typical galaxy roughly once every century (once every one hundred years), then
if astronomers continuously observe one hundred galaxies, we should observe
roughly one supernova every year on average.
If astronomers continuously observe one thousand galaxies, we should
observe roughly ten supernovae every year on average; this would be roughly
once a month. If astronomers
continuously observe ten thousand galaxies, we should observe roughly one
hundred supernovae every year on average; this would be roughly twice a week. Over the past few decades, astronomers have
used telescopes to continuously observe tens of
thousands of galaxies. Thus, we observe
several hundred supernovae every year; this is roughly once every day. However, these supernovae are in distant
galaxies, far beyond our own Milky Way Galaxy.
These supernovae cannot be observed with the
naked eye; they can only be observed with very powerful telescopes. The procedure for discovering a supernova in
a distant galaxy is as follows. We use a
powerful telescope to photograph a galaxy night after night after night. One night, we see a point of light in the
galaxy that is as bright as the entire galaxy.
We conclude that one of the stars in that galaxy has suffered a supernova
explosion. The point of light remains
bright for a couple of weeks, and then gradually fades away over
the next several months.
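The arithmetic behind these discovery rates is a simple scaling, sketched below. It assumes every monitored galaxy hosts one supernova per century on average, as stated above; the function names and the sample galaxy counts are illustrative.

```python
YEARS_BETWEEN_SUPERNOVAE = 100.0   # one per typical galaxy per century, as stated in the notes
DAYS_PER_YEAR = 365.25


def expected_supernovae_per_year(n_galaxies):
    """Average number of supernovae per year among n continuously observed galaxies."""
    return n_galaxies / YEARS_BETWEEN_SUPERNOVAE


def mean_days_between_supernovae(n_galaxies):
    """Average waiting time in days between supernovae among n observed galaxies."""
    return DAYS_PER_YEAR / expected_supernovae_per_year(n_galaxies)


if __name__ == "__main__":
    for n in (100, 1_000, 10_000, 30_000):
        rate = expected_supernovae_per_year(n)
        wait = mean_days_between_supernovae(n)
        print(f"{n:6d} galaxies: {rate:6.0f} per year, one every {wait:7.1f} days")
```

The waiting time falls in direct proportion to the number of galaxies monitored, which is why surveys of tens of thousands of galaxies find a supernova roughly every day.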
Our Sun will never suffer
from a supernova, since our Sun is a low mass star. This is fortunate, since if our Sun were to
suffer from a supernova, the explosion would obliterate our entire Solar
System! There are no
nearby high mass stars that may suffer a supernova that could harm us in any
way, which stands to reason since high mass stars are rare. There are however some high mass stars close
enough to be visible with the naked eye that have already entered the red
supergiant phase, such as Betelgeuse in the constellation Orion (the hunter)
and Antares in the constellation Scorpius (the scorpion). How would Betelgeuse or Antares appear in the
sky if they were to suffer a supernova? A supernova has a luminosity of billions of
solar luminosities. Hence, the star
would appear to become billions of times brighter. This would be so bright that we could see the
star in the daytime! The star would
remain this bright for a couple of weeks. Over the next several months, the star would
remain fairly bright but would gradually fade in
intensity. Within roughly one year, the
star would vanish from our sky, forever changing the appearance of the
constellation Orion (the hunter) or Scorpius (the scorpion), since a bright
star in the constellation has now been forever erased from our sky! Again, this sequence of events would be
visible to the naked eye, making nearby supernovae within our own Milky Way
Galaxy exciting to observe. Over the
past millennium (one thousand years), we have observed roughly one supernova
per century within our own Milky Way Galaxy.
Warning: astronomers have observed supernovae roughly once every day
over the past few decades, but these are supernovae in distant galaxies that can only be observed with very powerful telescopes. Only a handful of naked-eye supernovae have been observed over
the past millennium, including those of
April 1006, July 1054 creating the Crab Nebula, August 1181, November 1572
creating the Tycho Nebula, and October 1604 creating
the Kepler Nebula. Note that the last
supernova on this list, the most recent naked-eye supernova from within our
Milky Way Galaxy, occurred more than four hundred years ago. If a supernova occurs in a typical galaxy
roughly once per century, then a gap of more than four hundred years is unusually long. At that average rate,
there is a good chance that we will observe at least one naked-eye supernova from within our Milky Way
Galaxy within our lifetimes.
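We can put a rough number on the chance of seeing a supernova from within our Milky Way Galaxy during a human lifetime. The sketch below assumes, purely for illustration, that supernovae occur independently at a constant average rate of one per century (a Poisson process). This assumption and the function name prob_at_least_one are ours and are not part of the discussion above, and the estimate ignores whether a given supernova would actually be bright enough to see with the naked eye.

```python
import math


def prob_at_least_one(rate_per_century: float, years: float) -> float:
    """Probability of at least one event within `years`.

    Assumes events follow a Poisson process with a constant average
    rate (an illustrative assumption, not a claim from the notes).
    """
    expected_events = rate_per_century * years / 100.0
    return 1.0 - math.exp(-expected_events)


if __name__ == "__main__":
    # Waiting periods in years; the values are arbitrary examples.
    for span in (50, 80, 100):
        p = prob_at_least_one(1.0, span)
        print(f"{span} years: P(at least one supernova) = {p:.2f}")
```

Under this assumption, the time elapsed since the last supernova does not enter the calculation; only the average rate and the length of the waiting period matter.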
Frustratingly, the last naked-eye supernova from within our Milky Way
Galaxy occurred before Galileo Galilei made his historic observations of the
sky with his primitive telescope in the year 1609, as we discussed earlier in
the course. Thus, the model we have
presented of a Type II supernova being triggered by
the core collapse and subsequent explosion of a high mass star remained an
untested theoretical model for many years.
This all changed in the historic year 1987. As we discussed, working at a neutrino
detector is the most boring job in the world, since a neutrino detector only
detects a single neutrino per day.
However, on Monday, February 23, 1987, at 07:35:35 universal time,
neutrino detectors around the world detected twenty-five neutrinos within a
time span of less than thirteen seconds!
Physicists all around the world had no idea how to explain this
incredible burst of neutrinos. The
source of these neutrinos was revealed a couple of hours
later, when astronomers witnessed the star named CPD-69
402 (also named GSC 09162-00821) violently explode,
becoming extraordinarily more luminous.
This star was not within our own Milky Way Galaxy however; this star was
within a nearby galaxy called the Large Magellanic
Cloud, nearly two hundred thousand light-years from our Solar System. It suddenly became clear what caused the
neutrino burst. Nearly two hundred
thousand years ago, the high mass star CPD-69 402 (GSC 09162-00821) swelled to become a supergiant star until
electron capture was initiated in its iron-nickel
white dwarf center, triggering a supernova explosion. Neutrinos flew out of the core, with the
light from the explosion following right behind the neutrinos. Over the next nearly two hundred thousand
years, the neutrinos propagated spherically outward, with the light from the
explosion also propagating spherically outward.
On the 23rd day of February in the historic year 1987, the neutrinos
from this supernova passed through planet Earth, and neutrino detectors around
the world detected twenty-five of them.
A couple of hours later, the light from the supernova arrived at the Earth,
and this light was not only observed by astronomers
through telescopes but was actually witnessed by humans (in the southern
hemisphere) with the naked eye. This is
the closest supernova to occur in roughly four hundred years. The name of this supernova is SN1987A, since the name of a supernova always begins with
the letters SN (for supernova) followed by the year astronomers first observed
the supernova followed by the letter of the English alphabet indicating
numerically which observed supernova it was in that year. For example, the first supernova astronomers
observed in the year 2017 was named SN2017A, the
second supernova astronomers observed that same year was named SN2017B,
the third supernova astronomers observed that same year was named SN2017C, and so on and so forth. There are only twenty-six letters in the
English alphabet. Therefore, the
twenty-seventh supernova astronomers observed in the year 2017 was named SN2017aa, the
twenty-eighth supernova astronomers observed in that same year was named SN2017ab, the twenty-ninth supernova astronomers observed
in that same year was named SN2017ac, and so on and
so forth. Again, astronomers observe
hundreds of supernovae every year from distant galaxies. However, SN1987A
was the closest supernova observed in roughly four hundred years. This supernova was close enough and hence
bright enough to be visible to the naked eye (but only from the southern
hemisphere). This supernova provided
strong evidence that our theoretical model of supernova explosions is
correct. In summary, a Type II supernova is caused by a dying high mass star that swells
to become a supergiant star. The
nuclear reaction electron capture in the core triggers the Type II
supernova. Neutrinos fly out of the
core, the core collapses, and the energy of the collapse is
liberated in a cataclysmic explosion with a brightness in the billions
of solar luminosities. The final result of a Type II supernova is a very hot supernova
remnant rapidly expanding away from either a neutron star or a black hole. The next time neutrino detectors around the
world detect a burst of neutrinos, every astronomical telescope in the world
will immediately point to supergiant stars such as Betelgeuse or Antares to
witness the actual explosion of the supergiant star. Over the past few decades since SN1987A, astronomers have witnessed the formation of the
supernova remnant that resulted from that supernova. Astrophysicists will continue to study SN1987A for many centuries, just as astrophysicists
still continue to study the Crab Nebula for example, which resulted from
a supernova observed in July 1054, almost one thousand years ago. By making many observations over several
decades, astronomers have measured the growing size of several supernova
remnants. We can calculate the speed
with which the supernova remnant expands from these observations, and we can
then extrapolate backwards to calculate how long ago the supernova
occurred. In the cases where astronomers
from previous centuries actually witnessed the supernova occur,
our extrapolated date of the supernova is always roughly equal to the date that
was recorded by astronomers centuries ago.
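The backward extrapolation described above can be sketched in a few lines. The sketch assumes the remnant has expanded at a constant speed since the explosion, which is a simplification, and the measurements in the example are made-up numbers chosen only to illustrate the arithmetic. The function name explosion_year is hypothetical.

```python
def explosion_year(radius_1: float, year_1: float,
                   radius_2: float, year_2: float) -> float:
    """Estimate the year a supernova occurred from two size measurements.

    radius_1 and radius_2 are the remnant's radius (any consistent unit)
    measured in year_1 and year_2. Assumes constant expansion speed.
    """
    if year_2 <= year_1:
        raise ValueError("the second observation must be later than the first")
    speed = (radius_2 - radius_1) / (year_2 - year_1)  # radius units per year
    if speed <= 0:
        raise ValueError("the remnant must be growing between observations")
    age_at_year_2 = radius_2 / speed  # years needed to grow from zero size
    return year_2 - age_at_year_2


if __name__ == "__main__":
    # Made-up measurements: 4.0 light-years in 1900, 5.0 light-years in 2000.
    print(explosion_year(4.0, 1900.0, 5.0, 2000.0))
```

Comparing the returned year with a historically recorded date is the consistency check the paragraph above describes.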
When an ordinary high mass
star suffers a supernova explosion, the iron-nickel white dwarf core collapses
to a neutron star, as we discussed. By
the Law of Conservation of Angular Momentum, the collapsing core must spin faster. Since the iron-nickel white dwarf roughly the
size of the Earth collapses to a neutron star roughly the size of a city, the
rotational speed of the neutron star after the collapse is hundreds of
thousands of times faster than the rotational speed of the iron-nickel white
dwarf from which it collapsed! If the
iron-nickel white dwarf rotated once a day for example, the neutron star that
formed from it must rotate once in less than one second! Furthermore, the magnetic field lines of the
star are pulled with the collapsing core. Hence, the magnetic field lines tighten,
increasing the strength of the magnetic field by a tremendous amount. The magnetic field at the surface of a
neutron star can be trillions of times stronger than the Earth’s magnetic field! It would be improbable for the magnetic poles
of the neutron star to precisely coincide with its
rotational poles, just as the Earth’s magnetic poles do not coincide with its
own rotational poles, as we discussed earlier in the course. Hence, as the neutron star rotates, its
magnetic axis precesses around its rotational
axis. The incredibly strong magnetic
field that precesses at the incredibly fast
rotational speed causes electromagnetic waves to be radiated away from the
neutron star, and these radiated electromagnetic waves also
rotate with the neutron star. If the precessing magnetic axis of the neutron star happens to
direct these emissions in our general direction, we will observe regular pulses
of electromagnetic waves as the neutron star rotates, rather like the rotating
light from a lighthouse. These neutron
stars are called pulsars. Among the first pulsars ever discovered were the
Crab Pulsar at the center of the Crab Supernova Remnant in the constellation
Taurus (the bull) and the Vela Pulsar at the center of the Vela Supernova Remnant
in the constellation Vela (the sails).
The discovery of these and other pulsars at the center of supernova
remnants provides further strong evidence that our theories of supernova
explosions are correct. Presumably, all
neutron stars are born as pulsars, but the continuous emission of
electromagnetic waves carries energy and angular momentum away from the pulsar,
thus slowing the rotation of the neutron star.
Eventually, the neutron star would no longer emit pulses. Astronomers have measured the gradual
rotational slowing of several pulsars to be a few microseconds per year, and
several non-pulsar neutron stars have been discovered. Note however that some of these non-pulsar
neutron stars may in fact be pulsars. If
the rotation of a neutron star sweeps its magnetic axis around
in such a way that its pulses are never
emitted in our general direction, then we would not observe its
pulses. Hence, we would incorrectly
conclude that the pulsar neutron star is instead a non-pulsar neutron star, and
it would be virtually impossible for us to discover that this particular
neutron star is in fact a pulsar.
Neutron stars can also have their rotational speed increased. As we will discuss shortly, gases may fall
toward a neutron star, and these gases may add angular momentum to the neutron
star, speeding up its rotation. These are called millisecond pulsars, since they rotate once in
only a few milliseconds! These
millisecond pulsars are also called recycled pulsars,
since they were at first rotationally slowing from a pulsar neutron star to a
non-pulsar neutron star, but the additional angular momentum gave the neutron
star a second life as a pulsar.
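The spin-up of the collapsing core described earlier in this discussion can be checked with a short calculation. The sketch below assumes the core keeps all of its mass and angular momentum and remains a uniform sphere, so that the rotation period scales as the square of the radius. The radii used (roughly 6400 kilometers for an Earth-sized core and roughly 10 kilometers for a city-sized neutron star) are round illustrative values, and the function names are hypothetical.

```python
def spin_up_factor(initial_radius_km: float, final_radius_km: float) -> float:
    """Factor by which the rotation rate increases during collapse.

    For a sphere of fixed mass, angular momentum is proportional to
    R**2 times the rotation rate, so the rate scales as 1 / R**2.
    """
    return (initial_radius_km / final_radius_km) ** 2


def collapsed_period_s(initial_period_s: float,
                       initial_radius_km: float,
                       final_radius_km: float) -> float:
    """Rotation period after collapse, in seconds."""
    return initial_period_s / spin_up_factor(initial_radius_km, final_radius_km)


if __name__ == "__main__":
    ONE_DAY_S = 86400.0
    factor = spin_up_factor(6400.0, 10.0)
    period = collapsed_period_s(ONE_DAY_S, 6400.0, 10.0)
    print(f"spin-up factor: {factor:.0f}")
    print(f"new rotation period: {period:.3f} seconds")
```

With these round numbers, the spin-up factor is in the hundreds of thousands and a core that rotated once a day ends up rotating several times per second, consistent with the estimate above.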
As we discussed, the cutoff
between low mass main sequence stars and high mass main sequence stars is 7M☉, 8M☉, or 9M☉ (seven,
eight, or nine solar masses). Many
students demand to know the exact cutoff: is it 7M☉ (seven
solar masses), 8M☉ (eight
solar masses), or 9M☉ (nine
solar masses)? Unfortunately, we cannot
specify this cutoff more precisely, since there is uncertainty in our
theoretical calculations. The Type II
supernova of a high mass star is triggered by the
failure of electron degeneracy pressure to support the white dwarf core of the
supergiant. Therefore, we might be able to
specify an exact cutoff between a low mass star and a high mass star by
calculating the maximum mass electron degeneracy pressure is able to
support. Caution: this would be the
cutoff mass for only the core of the star, not the cutoff mass for the entire
star. The maximum mass that electron
degeneracy pressure is able to support is called the
Chandrasekhar limit, named for the Indian astrophysicist Subrahmanyan
Chandrasekhar who first performed this calculation. The Chandra observatory, NASA’s great X-ray
space telescope as we discussed earlier in the course, is
also named for this astrophysicist.
The Chandrasekhar limit is equal to 1.4M☉ (1.4
solar masses or 1.4 times the mass of our Sun).
This is the maximum mass that electron degeneracy pressure is able to
support. Therefore, this is the
core-mass cutoff between a low mass star and a high mass star. Again, this is the cutoff mass for the core
only; the cutoff mass for the entire star is 7M☉, 8M☉, or 9M☉ (seven,
eight, or nine solar masses). In other
words, a star with a total mass of 7M☉, 8M☉, or 9M☉ (seven, eight, or nine solar masses) has a core mass
equal to 1.4M☉ (1.4
solar masses). If the mass of the entire
star is less than 7M☉, 8M☉, or 9M☉ (seven,
eight, or nine solar masses), then the mass of its core is less than 1.4M☉ (1.4
solar masses). In this case, electron
degeneracy pressure will be able to support the core. Therefore, the star must be a low mass star,
and it will die gently as a slowly expanding planetary nebula surrounding a
white dwarf. If the mass of the entire
star is greater than 7M☉, 8M☉, or 9M☉ (seven,
eight, or nine solar masses), then the mass of its core is greater than 1.4M☉ (1.4
solar masses). In this case, electron
degeneracy pressure will not be able to support the core. Therefore, the star must be a high mass star,
and it will die violently as a Type II supernova resulting in a very hot
supernova remnant rapidly expanding away from a neutron star that is supported by neutron degeneracy pressure. Since the Chandrasekhar limit is the maximum
mass that electron degeneracy pressure is able to support, it is not only the
core-mass cutoff between low mass stars and high mass stars. The Chandrasekhar limit is also the maximum possible
mass of a white dwarf. This has been verified by observations. No white dwarf has ever
been discovered with a mass greater than the Chandrasekhar limit of 1.4M☉ (1.4
solar masses). The brightest star in the
constellation Canis Major (the big dog) is Sirius the
dog star, as we discussed earlier in the course. Sirius is actually a binary star, and the two
stars are named Sirius A and Sirius B.
Whereas Sirius A is a main sequence star, Sirius B is a white dwarf, the
closest white dwarf to our Solar System and one of the first white dwarfs ever
discovered. The mass of the Sirius B
white dwarf is roughly 1.0M☉ (1.0 solar masses), less than the 1.4M☉ (1.4
solar mass) Chandrasekhar limit.
As we discussed, the cutoff
between ordinary high mass main sequence stars and very high mass main sequence
stars is 20M☉ to 25M☉ (twenty
to twenty-five solar masses). Many
students demand to know the exact cutoff: is it 20M☉, 21M☉, 22M☉, 23M☉, 24M☉, or 25M☉? Unfortunately, we cannot specify this cutoff
more precisely, since there is uncertainty in our theoretical
calculations. The formation of a black
hole results from the failure of neutron degeneracy pressure to support the
core. Therefore, we might be able to
specify an exact cutoff between an ordinary high mass star and a very high mass
star by calculating the maximum mass neutron degeneracy pressure is able to
support. Caution: this would be the
cutoff mass for only the core of the star, not the cutoff mass for the entire
star. The maximum mass that neutron
degeneracy pressure is able to support is called the Tolman-Oppenheimer-Volkoff limit
or the TOV limit for short, named for the three physicists who first attempted
this calculation: American physicist Richard Tolman,
American physicist J. Robert Oppenheimer, and Russian-born Canadian physicist George Volkoff. The
American physicist J. Robert Oppenheimer is most famous for being the father of
the nuclear bomb, since he was the head physicist of the secret Manhattan
Project during the Second World War. We
are not certain of the precise value of the Tolman-Oppenheimer-Volkoff limit.
Although these three physicists (and other physicists) have attempted
this calculation, neutron stars have such fantastically high densities that the
precise properties of the state of matter within neutron stars are unknown. Presumably, the outer layers of the neutron
star are composed of neutrons; this is called the
crust of the neutron star. However, the
interior of a neutron star is under such incredible pressures that the quarks
and gluons (which compose both protons and neutrons) are
squeezed out of the neutrons.
Hence, we no longer have individual neutrons toward the center of a
neutron star. The core of a neutron star
is actually composed of a fantastically dense soup of quarks and gluons all
colliding with each other. This new
state of matter at the core of a neutron star is called
a quark-gluon plasma, about which we know very little. Therefore, calculating the exact value of the
Tolman-Oppenheimer-Volkoff
limit remains elusive. Nevertheless,
theoretical estimates have revealed its approximate value. The Tolman-Oppenheimer-Volkoff limit is very roughly equal to 2.4M☉ (2.4
solar masses), and it is definitely less than 3M☉ (three solar masses).
Although we do not know the precise value of the Tolman-Oppenheimer-Volkoff limit, for the purposes of this discussion we will
use roughly 2.4M☉ (2.4
solar masses). This is the maximum mass
that neutron degeneracy pressure is able to support. Therefore, this is the core-mass cutoff
between an ordinary high mass star and a very high mass star. Again, this is the cutoff mass for the core
only; the cutoff mass for the entire star is 20M☉ to 25M☉ (twenty
to twenty-five solar masses). In other
words, a star with a total mass of 20M☉ to 25M☉ (twenty to twenty-five solar masses) has a core mass
equal to roughly 2.4M☉ (2.4
solar masses). To summarize, if the mass
of the entire star is less than 7M☉, 8M☉, or 9M☉ (seven, eight, or nine solar masses) and of course
greater than the lower limit of 0.08M☉ (0.08 solar masses) of all main sequence stars, then
the mass of its core is less than 1.4M☉ (1.4 solar masses).
In this case, electron degeneracy pressure will be able to support the
core. Therefore, the star must be a low
mass star, and it will die gently as a slowly expanding planetary nebula
surrounding a white dwarf. If the mass
of the entire star is greater than 7M☉, 8M☉, or 9M☉ (seven, eight, or nine solar masses) but less than 20M☉ to 25M☉ (twenty
to twenty-five solar masses), then the mass of its core is greater than 1.4M☉ (1.4
solar masses) but less than roughly 2.4M☉ (2.4 solar masses).
In this case, electron degeneracy pressure will not be able to support
the core, but neutron degeneracy pressure will be able to support the
core. Therefore, the star must be an
ordinary high mass star, and it will die violently as a Type II supernova
resulting in a very hot supernova remnant rapidly expanding away from a neutron
star that is supported by neutron degeneracy
pressure. If the mass
of the entire star is greater than 20M☉ to 25M☉ (twenty to twenty-five solar masses)
and of course less than the Eddington limit of roughly 100M☉
(one hundred solar masses) of all main sequence stars, then the mass of its
core is greater than roughly 2.4M☉ (2.4 solar masses)
and probably less than roughly 10M☉ (ten solar
masses), the approximate core mass of the most massive stars at the Eddington
limit. In this case, not even
neutron degeneracy pressure is able to support the core. Therefore, the star must be a very high mass
star, and it will die violently as a Type II supernova resulting in a very hot
supernova remnant rapidly expanding away from a black hole. As we discussed, since the Chandrasekhar
limit is the maximum mass that electron degeneracy pressure is able to support,
it is not only the core-mass cutoff between low mass stars and ordinary high
mass stars; the Chandrasekhar limit is the maximum possible mass of a white
dwarf. We now realize that the
Chandrasekhar limit is also the minimum mass of a neutron star. Since the Tolman-Oppenheimer-Volkoff limit is the maximum mass that neutron degeneracy
pressure is able to support, it is not only the core-mass cutoff between
ordinary high mass stars and very high mass stars; the Tolman-Oppenheimer-Volkoff limit is the maximum possible mass of a neutron
star. It is also the minimum mass of a
black hole. In
conclusion, the mass of a white dwarf must be less than the 1.4M☉
Chandrasekhar limit, the mass of a neutron star must be greater than the 1.4M☉ Chandrasekhar limit but less than the
roughly 2.4M☉ Tolman-Oppenheimer-Volkoff limit, and the mass of a black hole must be greater
than the roughly 2.4M☉ Tolman-Oppenheimer-Volkoff limit but less than the 10M☉
rough estimate for the core mass of the most massive stars at the Eddington
limit. Caution: we will discuss
shortly that since nothing can escape from a black hole, a black hole will gain
more and more mass over time. After
billions of years, a black hole may attain a mass of millions and even billions
of solar masses. These are called supermassive black holes, which we will discuss
later in the course. We will use the
term stellar black holes for black holes recently born from the Type II
supernova of a very high mass star, and some stellar black holes grow over
billions of years to become supermassive black holes. We will also discuss toward the end of the
course that there may be microscopic black holes in our universe. These microscopic black holes are also called primordial black holes, since they were born
in the fires of the Big Bang, the creation of the entire universe.
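The mass cutoffs summarized above can be collected into a small lookup. This is only a sketch of the classification as presented in this discussion: the 2.4 solar mass value for the Tolman-Oppenheimer-Volkoff limit is the rough working figure adopted above, the total-mass cutoffs are treated as uncertain bands rather than sharp lines, and the function and constant names are hypothetical.

```python
CHANDRASEKHAR_LIMIT = 1.4  # solar masses
TOV_LIMIT = 2.4            # solar masses, rough working value only
MIN_STAR_MASS = 0.08       # lower limit for main sequence stars
MAX_STAR_MASS = 100.0      # rough upper limit for main sequence stars


def remnant_from_core_mass(core_mass: float) -> str:
    """Stellar remnant expected for a given core mass in solar masses."""
    if core_mass <= 0:
        raise ValueError("core mass must be positive")
    if core_mass < CHANDRASEKHAR_LIMIT:
        return "white dwarf"    # electron degeneracy pressure holds
    if core_mass < TOV_LIMIT:
        return "neutron star"   # neutron degeneracy pressure holds
    return "black hole"         # no degeneracy pressure can hold


def fate_from_total_mass(total_mass: float) -> str:
    """Fate of a main sequence star given its total mass in solar masses."""
    if total_mass < MIN_STAR_MASS or total_mass > MAX_STAR_MASS:
        return "outside the main sequence mass range"
    if total_mass < 7.0:
        return "low mass star: planetary nebula and white dwarf"
    if total_mass <= 9.0:
        return "near the uncertain cutoff: white dwarf or neutron star"
    if total_mass < 20.0:
        return "ordinary high mass star: Type II supernova and neutron star"
    if total_mass <= 25.0:
        return "near the uncertain cutoff: neutron star or black hole"
    return "very high mass star: Type II supernova and black hole"


if __name__ == "__main__":
    print(remnant_from_core_mass(1.0))  # a core like the Sirius B white dwarf
    print(fate_from_total_mass(1.0))    # a star like our Sun
    print(fate_from_total_mass(15.0))
    print(fate_from_total_mass(40.0))
```

Returning an explicit "uncertain" answer inside the 7 to 9 and 20 to 25 solar mass bands reflects the fact that these cutoffs cannot presently be specified more precisely.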
At normal densities, solids
and liquids are highly incompressible, resulting in solids and liquids having
roughly constant densities. In other
words, at normal densities the volume of solids and liquids is directly
proportional to their mass, meaning more mass will occupy a proportionally
larger volume. For example, twice as
much metal or twice as much rock or twice as much
liquid water will all occupy twice as much volume. However, white dwarfs and neutron stars are both supported by degeneracy pressure. Therefore, more massive white dwarfs and more massive neutron stars must in fact have smaller volumes
to provide greater pressure to balance the significantly stronger self-gravity
from their higher mass. A particular
white dwarf or a particular neutron star will actually shrink in volume (shrink
in size) if it happens to gain mass, as we will discuss shortly.
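The contrast between ordinary matter and degenerate matter can be illustrated numerically. The sketch below assumes the standard simplified scaling for an object supported by non-relativistic degeneracy pressure, in which the radius is proportional to the mass raised to the power of negative one-third. This scaling is not derived in these notes, and it breaks down for masses approaching the limiting mass, so the numbers are only indicative. The function names are hypothetical.

```python
def degenerate_radius_ratio(mass_ratio: float) -> float:
    """Radius change when a degenerate object's mass changes by mass_ratio.

    Assumes R is proportional to M**(-1/3) (simplified, non-relativistic).
    """
    if mass_ratio <= 0:
        raise ValueError("mass ratio must be positive")
    return mass_ratio ** (-1.0 / 3.0)


def degenerate_volume_ratio(mass_ratio: float) -> float:
    """Volume change for the same mass change (volume scales as R**3)."""
    return degenerate_radius_ratio(mass_ratio) ** 3


def ordinary_volume_ratio(mass_ratio: float) -> float:
    """Volume change for an incompressible solid or liquid."""
    return mass_ratio


if __name__ == "__main__":
    doubled = 2.0
    print("ordinary matter volume ratio:", ordinary_volume_ratio(doubled))
    print("degenerate matter volume ratio:", degenerate_volume_ratio(doubled))
    print("degenerate matter radius ratio:", degenerate_radius_ratio(doubled))
```

Under this scaling, doubling the mass of ordinary rock doubles its volume, whereas doubling the mass of a degenerate object roughly halves its volume.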
Stars are born from within a
diffuse nebula, a giant cloud of gas many light-years across composed primarily
of hydrogen and helium. Since a diffuse
nebula is so enormous, many stars are born within a diffuse nebula
simultaneously. Therefore, stars are
born in clusters. However, most stars do
not remain in clusters indefinitely.
After a star cluster is born from a diffuse nebula, the individual stars
move apart from one another, moving along their own trajectories through our
Milky Way Galaxy. Therefore, most stars
are not members of star clusters. For
example, our Sun is not presently a member of a star cluster, although it was
presumably born a member of an ancient star cluster that has long since
dispersed. Although most stars are not
members of star clusters, it is the study of star clusters
that has truly revealed that our models of stellar evolution are
correct. In our discussion of star
clusters, we will see the power of the Hertzsprung-Russell
diagram in explaining and predicting stellar properties and stellar evolution.
When we construct the Hertzsprung-Russell diagram for a star cluster, we can see
the main sequence, the red giants, and the white dwarfs on the diagram. Astronomers often abbreviate the main
sequence MS. The red giants appear along
the asymptotic giant branch, which astronomers often abbreviate AGB. The asymptotic
giant branch connects the main sequence with another collection of stars called
the horizontal branch, which astronomers often abbreviate HB. The horizontal branch connects the asymptotic
giant branch with another grouping of stars called the clump. In almost every Hertzsprung-Russell
diagram for almost every star cluster, the early part of the main sequence is
missing. This confirms that high mass
main sequence stars live shorter lifetimes than low mass main sequence stars,
which live longer lifetimes. The star
cluster is sufficiently old that the stars from the missing early part of the
main sequence have already died, since they live short lifetimes. However, the star cluster is not sufficiently
old for the stars in the late part of the main sequence to have died as of
yet. These stars are still
hydrogen-burning main sequence stars, since they have longer lifetimes. The earliest main sequence star in the Hertzsprung-Russell diagram for a star cluster is called the main-sequence turnoff, since it connects the
main sequence to the asymptotic giant branch.
In other words, the hottest, most luminous, largest, and most massive
(earliest) main sequence star in the Hertzsprung-Russell
diagram for a star cluster is at the main-sequence turnoff. The main-sequence turnoff reveals the age of
the star cluster. If the main-sequence
turnoff is early, then the star cluster must be young, since there are still
short-lifetime main sequence stars that have not yet evolved into red
giants. If the main-sequence turnoff is
late, then the star cluster must be old, since there are only long-lifetime
stars remaining on the main sequence.
For example, if the main-sequence turnoff is in the spectral type B,
then the star cluster must be roughly ten million years old, since the
main-sequence lifetime of a B-type star is roughly ten million years. As another example, if the main-sequence
turnoff is in the spectral type F, then the star cluster must be roughly one
billion years old, since the main-sequence lifetime of an F-type star is
roughly one billion years. The
main-sequence lifetime of a G-type star like our Sun is roughly ten billion
years. No star cluster has ever been discovered with a main-sequence turnoff later
than the G spectral type. This is one way we know the age of the entire universe. The universe cannot be much older than ten
billion years since we have never discovered a star cluster with a
main-sequence turnoff later than spectral type G. At the very end of this course, we will
discuss that the age of the universe is more precisely fourteen billion years,
which we have determined from the expansion of the entire universe. Notice these two different methods of
determining the age of the universe are fairly consistent
with each other. Since the asymptotic
giant branch connects with the main sequence at the main-sequence turnoff, the
red giants along the asymptotic giant branch must be expanding to become red
giants after ending their main-sequence lifetimes. The stars near the beginning of the
asymptotic giant branch are orange subgiant stars; they have only recently left
the main sequence. The stars suffering
from the helium flash are at the end of the asymptotic giant branch, where the
asymptotic giant branch connects with the horizontal branch. We also find Cepheid variable stars along the
asymptotic giant branch, since Cepheid variable stars suffer from the
instability of transitioning from a main sequence star to a red giant
star. We will discuss Cepheid variable
stars later in the course. The stars
along the horizontal branch are in the process of attaining gravitational
equilibrium from the new pressure provided by the helium fusion in the stellar
core. We also find RR Lyrae
variable stars along the horizontal branch, since RR Lyrae
variable stars suffer from the instability of transitioning from a red giant
star to a helium-burning star. We will
discuss RR Lyrae variable stars later in the
course. The clump is the collection of
helium-burning stars that have attained gravitational equilibrium. There is often a second asymptotic giant
branch connected to the clump. The stars
along this second asymptotic giant branch have exhausted the helium in their
cores. The carbon core collapses, while
the outer layers of the star expand.
Hence, these stars have become red giants for the second time. We find Mira variable stars along the second
asymptotic giant branch, since Mira variable stars suffer from the instability
of transitioning from a helium-burning star to a red giant star. We will discuss Mira variable stars later in
the course. Astronomers informally refer
to the upper-middle part of the Hertzsprung-Russell
diagram as the instability strip, since we find Cepheid variable stars, RR Lyrae variable stars, Mira variable stars, and even T Tauri variable stars on that part of the diagram. Eventually, the slowly expanding outer layers
of the star will divorce themselves from the core. The slowly expanding outer layers will become
a planetary nebula, while the naked core will become a white dwarf. Indeed, we see white dwarfs in the Hertzsprung-Russell diagram for star clusters. If we plot the stars of a newly born star
cluster on the Hertzsprung-Russell diagram, we would
see the entire main sequence with no red giants and no white dwarfs, since a
newly born cluster has not had time for any main sequence stars to die. If we could wait millions of years as the
stars within this newly born star cluster evolve and if we could plot these
stars accordingly on the Hertzsprung-Russell diagram,
we would see the early-type main sequence stars become red supergiant stars and
then disappear from the Hertzsprung-Russell diagram
as they live their short lifetimes and explode as Type II supernovae. As a result, the main-sequence turnoff would
appear to advance from spectral type O to spectral type B, thus shrinking the
appearance of the main sequence on this Hertzsprung-Russell
diagram. As the main-sequence turnoff
advances to spectral type A, we would see these main sequence stars evolve into
orange subgiant stars and then into red giant stars, forming the first
asymptotic giant branch within the instability strip. When these stars eventually suffer from the
helium flash, we would then see the formation of the horizontal branch within
the instability strip, ultimately forming the clump on the Hertzsprung-Russell
diagram. When these stars exhaust the
helium in their cores, we would then see the formation of the second asymptotic
giant branch within the instability strip.
We would then see white dwarfs begin to appear on the Hertzsprung-Russell diagram. If we could wait billions
more years and if we could continue to plot these stars accordingly on the Hertzsprung-Russell diagram, we would see the main-sequence
turnoff continue to advance from spectral type A to spectral type F to spectral
type G as more and more main sequence stars begin the process of stellar death,
thus further shrinking the main sequence on the Hertzsprung-Russell
diagram. We would continue to see
stars move from the main sequence toward and along the first asymptotic giant
branch within the instability strip, along the horizontal branch within the
instability strip, through the clump, and along the second asymptotic giant
branch within the instability strip. We
would also see more and more white dwarfs appear on this Hertzsprung-Russell
diagram as these stars reach the very end of their evolution.
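The way the main-sequence turnoff dates a star cluster can be written as a small lookup. The sketch below uses only the three rough main-sequence lifetimes quoted in this discussion (spectral types B, F, and G) and the usual spectral sequence from early to late types. It deliberately returns no age for spectral types whose lifetimes are not quoted here, and the function and table names are hypothetical.

```python
from typing import Optional

# Spectral types from earliest (hottest) to latest (coolest).
SPECTRAL_ORDER = "OBAFGKM"

# Rough main-sequence lifetimes in years, as quoted in the discussion above.
TURNOFF_AGE_YEARS = {"B": 1e7, "F": 1e9, "G": 1e10}


def cluster_age_years(turnoff_type: str) -> Optional[float]:
    """Rough cluster age implied by the spectral type of the turnoff.

    Returns None when no lifetime is quoted for that spectral type.
    """
    spectral_type = turnoff_type.upper()
    if spectral_type not in SPECTRAL_ORDER:
        raise ValueError(f"unknown spectral type: {turnoff_type!r}")
    return TURNOFF_AGE_YEARS.get(spectral_type)


def later_turnoff(type_a: str, type_b: str) -> str:
    """Return whichever turnoff is later, which marks the older cluster."""
    a, b = type_a.upper(), type_b.upper()
    return a if SPECTRAL_ORDER.index(a) > SPECTRAL_ORDER.index(b) else b


if __name__ == "__main__":
    for spectral_type in ("B", "F", "G"):
        print(spectral_type, cluster_age_years(spectral_type), "years")
    print("older cluster has turnoff:", later_turnoff("B", "F"))
```

A later turnoff always corresponds to an older cluster, since only progressively longer-lived stars remain on the main sequence.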
The Hertzsprung-Russell
diagram for a star cluster can be used to determine
the distance to the cluster. Suppose a
star cluster is significantly beyond the solar neighborhood. Therefore, the star cluster is too distant
for parallax to be used to determine its
distance. Hence, we need another method
to determine the distance to this star cluster.
The procedure to determine the distance to this cluster is as
follows. First, we construct the Hertzsprung-Russell diagram for the star cluster. At this suggestion, we should all
protest. The vertical axis of the Hertzsprung-Russell diagram is the luminosity
or the absolute magnitude or the intrinsic brightness, and we must have
the distances to stars to determine their luminosities or absolute magnitudes
or intrinsic brightnesses. Suppose we use the apparent magnitude
as the vertical axis instead of the absolute magnitude. At this suggestion, we should all protest
even more strongly. The apparent
magnitude or the apparent brightness of a star depends upon its distance from
us; the apparent magnitude is not an intrinsic property of a star! Here is the crux of the argument. The star cluster is distant enough for all of
the stars within the cluster to be roughly the same distance from us;
therefore, all of their apparent brightnesses are directly related to their intrinsic brightnesses. A concrete example will make this clear. Suppose we observe two stars in the sky. We will name these two stars Star Alpha and
Star Beta. Suppose Star Alpha appears to
be brighter than Star Beta; that is, suppose Star Beta appears to be dimmer
than Star Alpha. We cannot draw any
conclusion about the intrinsic brightness or the luminosity of these two stars
without knowing the distance of each star from us. Star Alpha could be intrinsically brighter
than Star Beta, but Star Beta might in fact be intrinsically brighter than Star
Alpha. In this case, Star Beta only
appears dimmer since it is further from us, and Star Alpha only appears
brighter since it is closer to us.
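This ambiguity can be made concrete with the inverse-square relation I = ℒ / (4πr²) between apparent brightness I, luminosity ℒ, and distance r. The luminosities and distances in the sketch below are made-up values in arbitrary units, chosen only so that the intrinsically dimmer star appears brighter, and the function name apparent_brightness is hypothetical.

```python
import math


def apparent_brightness(luminosity: float, distance: float) -> float:
    """Apparent brightness I = L / (4 * pi * r**2), in arbitrary units."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return luminosity / (4.0 * math.pi * distance ** 2)


if __name__ == "__main__":
    # Made-up values: Star Beta is four times more luminous than
    # Star Alpha but also four times further away.
    alpha = apparent_brightness(luminosity=1.0, distance=1.0)
    beta = apparent_brightness(luminosity=4.0, distance=4.0)
    print("Alpha appears brighter:", alpha > beta)

    # At the same distance, the apparent ordering matches the
    # intrinsic ordering.
    alpha_same = apparent_brightness(luminosity=1.0, distance=2.0)
    beta_same = apparent_brightness(luminosity=4.0, distance=2.0)
    print("Beta appears brighter at equal distance:", beta_same > alpha_same)
```

The second comparison shows why equal distances matter: once the distance is the same for both stars, the factor 4πr² is common to both and the comparison of apparent brightnesses becomes a comparison of luminosities.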
However, now suppose Star Alpha appears to be brighter than Star Beta,
and in addition suppose we have determined by some independent method
that both stars are the same distance from us. We can now be certain that Star Alpha is
indeed intrinsically brighter than Star Beta; that is, we can be certain that
Star Beta is intrinsically dimmer than Star Alpha. Again, without knowing distances, we cannot
draw any conclusions, but if we happen to know that two stars are the same distance from us, then the apparently brighter
star is indeed intrinsically brighter and the apparently dimmer star is indeed
intrinsically dimmer. If a star cluster
is distant enough, which we are certain is the case if parallax angles are too
small to measure, then all the stars within the cluster are roughly the same
distance from us. Of course, the stars
in front of the cluster are somewhat closer to us; of course, the stars in the
back of the cluster are somewhat further from us. Nevertheless, these are small variations if
the entire star cluster is distant enough from us. If all of the stars in the star cluster are
roughly the same distance from us, then the stars that appear to be brighter
truly are more luminous, and the stars that appear to be dimmer truly are less
luminous. Therefore, we can construct
the Hertzsprung-Russell diagram for a distant star
cluster using the apparent magnitude instead of the absolute magnitude as the
vertical axis. After constructing the Hertzsprung-Russell diagram, we should see the main
sequence on the diagram, among other features such as the asymptotic giant
branch, the horizontal branch, and the clump.
We already know the absolute magnitudes of main sequence stars as a
function of their spectral types from studying nearby stars within the solar
neighborhood. Thus, we assign these
absolute magnitudes to the corresponding main sequence stars we see on the Hertzsprung-Russell diagram for the star cluster. Essentially, we are sliding the star
cluster’s entire Hertzsprung-Russell diagram
vertically (up and down) until all main sequence stars on the diagram attain
their appropriate absolute magnitudes. Now
that we have both the absolute magnitudes and the apparent magnitudes of the
main sequence stars in the cluster, the only unknown remaining in the equation I = ℒ / (4πr²) is the distance r, meaning that we have successfully determined the distance to the
star cluster. This procedure is called the main sequence fitting method, since we are
determining the distance to the cluster by combining the apparent magnitudes of
the main sequence stars with their established absolute magnitudes from nearby
main sequence stars in the solar neighborhood.
The main sequence fitting method is the next major rung of the
Cosmological Distance Ladder above the parallax method, which is the lowest
rung of the Cosmological Distance Ladder.
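As a rough numerical sketch of the main sequence fitting method, the vertical slide of the diagram can be treated as a single offset between apparent magnitude and absolute magnitude. The standard distance modulus relation, m − M = 5 log10(d / 10 parsecs), which follows from the inverse-square law I = ℒ / (4πr²), then converts that offset into a distance. The function names and the sample magnitudes below are hypothetical, chosen only for illustration; real fits must also correct for effects such as interstellar dust.

```python
from statistics import median

def distance_from_modulus(apparent_mag, absolute_mag):
    """Distance in parsecs from the distance modulus m - M = 5 log10(d / 10 pc)."""
    return 10.0 ** ((apparent_mag - absolute_mag + 5.0) / 5.0)

def fit_main_sequence(apparent_mags, absolute_mags):
    """Estimate the vertical shift (m - M) shared by the cluster's main sequence stars.

    Each apparent magnitude is paired with the absolute magnitude already known
    for that spectral type from nearby stars; the median offset is the 'slide'.
    """
    offsets = [m - M for m, M in zip(apparent_mags, absolute_mags)]
    return median(offsets)

# Hypothetical cluster: three main sequence stars whose apparent magnitudes
# are each about 10 magnitudes fainter than their known absolute magnitudes.
apparent = [11.0, 14.8, 17.5]
absolute = [1.0, 4.8, 7.5]

shift = fit_main_sequence(apparent, absolute)
cluster_distance_pc = distance_from_modulus(shift, 0.0)
print(f"shift = {shift:.2f} magnitudes, distance = {cluster_distance_pc:.0f} parsecs")
```

In this made-up case the shift is ten magnitudes, which corresponds to a distance of roughly one thousand parsecs. Using the median of the offsets is only one simple design choice; it keeps a few stars that do not belong to the main sequence from dominating the fit.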
As we discussed, most star
systems are binary star systems. This is
reason enough to devote some discussion to binary star systems. Most binary star systems are detached
binaries, meaning that the two stars orbit each other sufficiently far from one
another that they do not significantly affect each other’s evolution. Whichever star is more massive will live a
shorter main-sequence lifetime. That star will then swell to become a red
giant star. The helium flash will occur,
and that star will then become a helium-burning star. After the star’s helium-burning lifetime, the
star will swell to become a red giant star a second time, eventually ejecting a
slowly expanding planetary nebula and leaving behind a white dwarf. Eventually, the other star will experience
the same sequence of stages of stellar death.
If one of the stars in a binary star system is high mass, it will live
an extremely short main-sequence lifetime. That star will then swell to become a
supergiant star and suffer from a Type II supernova. Extraordinarily, the other star survives this
supernova explosion. The supernova
ejects a hot supernova remnant rapidly expanding away from either a neutron
star or a black hole. The other star is
usually a low mass star that will eventually experience its own appropriate
sequence of stages of stellar death. In
conclusion, since most binary star systems are detached binaries where the two
stars orbit each other sufficiently far from one another that they do not
significantly affect each other’s evolution, all of the stellar evolution we
have discussed applies to most binary star systems. However, if the two stars in a binary star
system are orbiting sufficiently close to each other, they can affect each
other’s evolution. Therefore, much of
the stellar evolution we have discussed requires modifications. Such binary star systems are
called close binaries. Caution:
the term close binary does not mean the binary star system is close to our
Solar System; the term close binary means the two stars in the binary star
system orbit close to each other. In a
close binary, whichever main sequence star is more massive will live a shorter main-sequence lifetime.
That star will then swell to become a giant star. However, the two stars orbit sufficiently
close to each other that the outer layers of the giant star approach the other
less massive star. These outer layers
will then feel more gravitational attraction from the less massive star. Hence, the outer layers of the giant star
will fall toward the less massive star.
The gas does not fall directly toward the second star, since the gas has
angular momentum from the orbital motion of both stars around each other. Therefore, these gases settle into an orbit
around the less massive star, forming a flat disk. Gases within the disk that are closer to the
star orbit faster while gases within the disk that are further from the star
orbit slower, in accordance with Kepler’s laws.
As a result, neighboring gases within the disk move at different speeds;
hence, the gases within the disk rub against each other, resulting in friction
that heats the disk. This increase in
thermal energy (heat energy) must come at the expense of gravitational orbital
energy, since energy must be conserved. Therefore, the gas within the disk migrates
inward, toward the less massive star.
Eventually, the gas collides onto the surface of the star. The less massive star therefore gains mass
through these collisions. The gaining of
mass from collisions is called accretion, as we
discussed earlier in the course. Hence,
the flat disk around the less massive star is called
an accretion disk. In summary, there is
a mass transfer from the giant star to the less massive star through an
accretion disk around the less massive star.
Eventually, the less massive star may gain so much mass that it becomes
more massive than the giant star, which has now lost so much mass that the
giant star is less massive than the second star! This explains why we sometimes discover
binary star systems with a giant star that is less massive than the other
star. At first glance, such a system seems paradoxical. More massive
stars live shorter main-sequence lifetimes; therefore, the giant star
should be the more massive star. Indeed,
the giant star was formerly the more massive star, but it lost much of its mass
through a mass transfer to the other star.
A giant star in a close binary may lose so much mass from its outer
layers through this mass transfer that it may become an exotic subgiant star
with a disproportionately large core.
Eventually, the other star may gain so much mass that it begins to
evolve off of the main sequence prematurely. It swells to become a giant star, and its
outer layers approach the first star.
These gases will then feel more gravitational attraction to the first
star, eventually falling toward the first star.
Hence, there is a second mass transfer from the second star back to the
first star!
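The statement that gas closer to the star orbits faster than gas further from the star can be illustrated with the speed of a circular orbit, v = √(GM / r), which follows from Kepler's laws together with Newtonian gravity. The sketch below assumes circular orbits around a star of one solar mass; the radii are hypothetical values chosen only to show the trend.

```python
import math

G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
SOLAR_MASS = 1.989e30  # kilograms
AU = 1.496e11          # astronomical unit, meters

def circular_orbit_speed(central_mass_kg, radius_m):
    """Speed of a circular Keplerian orbit: v = sqrt(G M / r), in m/s."""
    return math.sqrt(G * central_mass_kg / radius_m)

# Hypothetical accretion disk around a one-solar-mass star, sampled at three radii.
for radius_au in (0.01, 0.1, 1.0):
    speed_km_s = circular_orbit_speed(SOLAR_MASS, radius_au * AU) / 1000.0
    print(f"r = {radius_au} AU  ->  v = {speed_km_s:.1f} km/s")
```

Because the speed falls off as one over the square root of the radius, neighboring rings of gas always slide past one another, which is the origin of the friction that heats the disk.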
Usually, both stars in a
close binary are low mass stars.
Eventually, one of the stars will reach the very end of its life,
ejecting a planetary nebula and leaving behind a white dwarf. The star has lost most of its mass when it ejects the planetary nebula, and hence the gravitational
attraction between the white dwarf and the second star is
greatly weakened. As a result,
the center of mass of the close binary is displaced to
be much closer to the second star, and the trajectories of both stars around
that new center of mass are greatly altered.
Often, the gravitational attraction of the two stars is
sufficiently weakened that both stars subsequently move on unbound
trajectories; the stars leave each other, and the binary system ends. The two stars may however continue to orbit
each other, and the two stars may even continue to orbit close to each other,
maintaining the close binary system.
Eventually, the second star ends its main-sequence lifetime and expands
to become a giant star. In the case
where the two stars continue to orbit close to each other, the outer layers of
the giant star approach the white dwarf.
These gases will then feel more gravitational attraction from the white
dwarf. Hence, the outer layers of the
giant star will fall toward the white dwarf.
Again, the gas does not fall directly toward the white dwarf, since the
gas has angular momentum from the orbital motion of both stars around each
other. Therefore, these gases settle
into an orbit around the white dwarf, forming an accretion disk. Again, the gases within the disk rub against
each other, resulting in friction that heats the disk. This increase in thermal energy (heat energy)
must come at the expense of gravitational orbital energy, since energy must be conserved.
Therefore, the gas within the disk migrates inward, toward the white
dwarf. Eventually, the gas collides onto
the surface of the white dwarf, causing the white dwarf to gain mass. In summary, there is a mass transfer from the
giant star to the white dwarf through an accretion disk around the white
dwarf. However, a white dwarf is small,
roughly the size of the Earth, as we discussed.
Hence, the gravitational well of a white dwarf is sufficiently deep that
the gas that collides onto the surface of the white dwarf is strongly
compressed and significantly heated.
This gas is predominantly hydrogen, since it came from the outer layers
of the giant star. As gas continues to
fall onto the white dwarf, the hydrogen on its surface may
eventually be heated to millions of kelvins, causing it to fuse into
helium. This causes the white dwarf to suddenly increase in brightness to thousands of solar
luminosities for a few weeks. This is called a nova. The
sudden increase in luminosity ejects material from the surface of the white
dwarf, resulting in an expanding shell of hot gas away from the close binary
system. This is called
a nova remnant. The nova and the ejected
nova remnant do not stop the mass transfer from the giant star to the white
dwarf from continuing. Eventually,
another nova may occur, ejecting another nova remnant. In other words, novae and nova remnants from
a white dwarf in a close binary are recurrent, occurring again and again. Novae and nova remnants from a white dwarf in
a close binary may occur once every few decades, once every few centuries, or
once every few millennia. To summarize,
there are several important differences between a nova and a supernova. Firstly, novae from a white dwarf in a close
binary occur repeatedly, while the Type II supernova of a high mass star occurs
only once. Secondly, novae last a few
weeks, while a supernova lasts a few months.
Thirdly, novae have luminosities of thousands of solar luminosities,
while supernovae have luminosities of billions of solar luminosities. Note however that observationally a nova and
a supernova may appear identical, at least at first glance. A nova that occurs sufficiently close to us
may appear just as bright (same apparent magnitude) as a supernova that
occurred much further from us. We can
discriminate between a nova and a supernova by determining the distance to the
event and then using that distance to calculate the luminosity (the absolute
magnitude or the intrinsic brightness) of the event. If the luminosity is thousands of
solar luminosities, the event was a nova, not a supernova. If instead the luminosity is billions
of solar luminosities, the event was a supernova, not a nova. We may also discriminate between a nova and a
supernova from the duration of time of the event. If the increase in brightness lasts for a few
weeks, we may conclude that the event was a nova, not a supernova. If instead the increase in brightness lasts
for a few months, we may conclude that the event was a supernova, not a
nova. We may also discriminate between a
nova and a supernova by observing the space surrounding the event. If we observe a large slowly expanding
planetary nebula around the event, we may conclude that the event was a nova,
not a supernova. In this case, the
surrounding large planetary nebula was ejected when
the white dwarf first formed. If instead
we observe a small very hot (in the millions of kelvins) supernova remnant
rapidly expanding away from the event, we may conclude that the event was a
supernova, not a nova. In this case, the
small very hot rapidly expanding supernova remnant was just
ejected when the supernova occurred.
Note that the word nova is derived from a Latin
word meaning new. Observationally, a
nova simply appears to be a new star. A
supernova also appears to be a new star, but with much greater luminosity or
absolute magnitude or intrinsic brightness.
Caution: a white dwarf in a close binary may suffer from its own unique
type of supernova, as we will discuss later in the course.
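The luminosity test for discriminating between a nova and a supernova can be sketched numerically by inverting the inverse-square law I = ℒ / (4πr²) to obtain ℒ = 4πr²I. In the sketch below, the function names, the two sample events, and the dividing line of one million solar luminosities are all assumptions made for illustration; the notes only state that novae reach thousands of solar luminosities and supernovae reach billions.

```python
import math

SOLAR_LUMINOSITY = 3.828e26  # watts
PARSEC = 3.086e16            # meters

def luminosity_from_brightness(apparent_brightness, distance_m):
    """Invert I = L / (4 pi r^2) to obtain the luminosity L in watts."""
    return 4.0 * math.pi * distance_m ** 2 * apparent_brightness

def classify_event(apparent_brightness, distance_m, threshold_solar=1.0e6):
    """Label an event 'nova' or 'supernova' from its luminosity in solar units.

    threshold_solar is an assumed dividing line lying between 'thousands' and
    'billions' of solar luminosities; it is not a value taken from the notes.
    """
    luminosity = luminosity_from_brightness(apparent_brightness, distance_m)
    solar_units = luminosity / SOLAR_LUMINOSITY
    return "nova" if solar_units < threshold_solar else "supernova"

# Two hypothetical events with the same apparent brightness (watts per square
# meter): one is 1,000 parsecs away, the other 1,000,000 parsecs away.
same_brightness = 1.0e-9
print(classify_event(same_brightness, 1.0e3 * PARSEC))
print(classify_event(same_brightness, 1.0e6 * PARSEC))
```

The two events look identical in the sky, yet the more distant one must be a million times more luminous, since luminosity scales as the square of the distance for a fixed apparent brightness.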
If one of the stars in a
close binary is a high mass star, it will live a short main-sequence
lifetime. It then swells to become a
supergiant star and explodes as a Type II supernova, throwing out a hot
supernova remnant rapidly expanding away from a neutron star or a black
hole. Extraordinarily, the other star
survives the supernova, even though the two stars orbit close to each other. We now have a neutron star or a black hole,
called the compact object, orbiting a main sequence star, called the primary
object. The primary object will
eventually end its main-sequence lifetime and swell to become a giant
star. The outer layers of the giant star
approach the compact object and hence feel stronger gravitational attraction
from the compact object. Thus, the outer
layers of the giant star fall toward the compact object. Again, the gas does not fall directly toward
the compact object, since the gas has angular momentum from the orbital motion
of both stars around each other.
Therefore, these gases settle into an orbit around the compact object,
forming an accretion disk where friction heats the disk causing the gas to
migrate inward toward the compact object.
However, the gravitational well of a neutron star or a black hole is so
incredibly deep that the gas is heated to temperatures of millions of
kelvins as it falls toward the compact object. At these very high temperatures, the accretion
disk radiates X-rays. These binary star
systems are called X-ray binaries, which
astrophysicists often abbreviate XRBs. The incredibly deep gravitational well of the
compact object also accelerates the falling gas to nearly the speed of
light. Some of this gas may be ejected as narrow columns or jets near the rotational
angular momentum axis of the accretion disk around the compact object. For all of these reasons, some types of X-ray
binaries are often called microquasars. We will discuss quasars later in the
course. For now, we simply mention that
the accretion disk of an X-ray binary together with the high-speed jets of gas
ejected along the rotational angular momentum axis of the accretion disk around
the compact object makes these X-ray binaries similar to quasars, but on a much
smaller size scale than quasars. This is
why some types of X-ray binaries are often called microquasars.
The compact object of an
X-ray binary is either a neutron star or a black hole. A neutron star has a solid surface. Therefore, very hot gas falling toward a
neutron star that has been accelerated to nearly the
speed of light will eventually collide onto the surface of the neutron star,
causing sudden and intense X-ray bursts.
These X-ray bursts can have luminosities of many thousands of solar
luminosities, entirely in X-rays! Black
holes however do not have a solid surface, as we will discuss shortly. Therefore, very hot gas falling toward a
black hole that has been accelerated to nearly the
speed of light will not collide with a solid surface; the gas rather quietly
disappears from the observable universe as it falls into the black hole. Therefore, there are no sudden and intense
X-ray bursts from a black hole. This is one way astrophysicists determine whether the compact
object in an X-ray binary is a neutron star or a black hole. If we detect sudden and intense X-ray bursts,
then the compact object is a neutron star.
If we do not detect sudden and intense X-ray bursts, then the compact
object is probably a black hole, although the absence of bursts alone is not conclusive. Another way
astrophysicists make this determination is by calculating the mass of the
compact object using Kepler’s third law.
If the mass of the compact object is greater than the Tolman-Oppenheimer-Volkoff limit,
then the compact object must be a black hole.
If the mass of the compact object is less than the Tolman-Oppenheimer-Volkoff limit but greater than the Chandrasekhar limit,
then the compact object must be a neutron star.
The first black hole ever discovered was the compact object in an X-ray
binary in the constellation Cygnus (the swan).
This X-ray binary was named Cygnus X-1. The primary object in this binary star system
is a supergiant star. The compact object
in this binary star system was calculated to have a
mass significantly greater than the Tolman-Oppenheimer-Volkoff limit, revealing that it is indeed a black
hole. Yet another way
astrophysicists determine whether the compact object in an X-ray binary
is a neutron star or a black hole is the observation of pulses from the compact
object. As we discussed, a pulsar is a
neutron star, and the observation of electromagnetic pulses from the X-ray
binary would reveal that the compact object is a neutron star. The mass transferred from the primary object
to the neutron star through the accretion disk may add angular momentum to the
neutron star, thus speeding up its rotation.
The result is a millisecond pulsar, since it rotates once in only a few
milliseconds. These millisecond pulsars are also called recycled pulsars, since each one was at first
slowing its rotation from a pulsar neutron star toward a non-pulsar neutron star
through the loss of angular momentum carried away by its pulses, but the
additional angular momentum from the accreting gases gave it a second life as
a pulsar.
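The mass test for the compact object in an X-ray binary can be sketched as follows. In solar units, Kepler's third law gives the total mass of a binary as a³ / P², with the separation a in astronomical units and the orbital period P in years; the mass of the compact object is then what remains after subtracting an estimate of the mass of the primary object. The value used for the Tolman-Oppenheimer-Volkoff limit below is an assumed round number, since the true limit is uncertain, and all function names and sample values are hypothetical.

```python
CHANDRASEKHAR_LIMIT = 1.4  # solar masses
TOV_LIMIT = 3.0            # solar masses; assumed round value, the true limit is uncertain

def total_mass_solar(separation_au, period_years):
    """Kepler's third law in solar units: M_total = a^3 / P^2."""
    return separation_au ** 3 / period_years ** 2

def classify_compact_object(mass_solar, shows_pulses=False):
    """Apply the mass test, treating observed pulses as decisive for a neutron star."""
    if shows_pulses:
        return "neutron star"
    if mass_solar > TOV_LIMIT:
        return "black hole"
    if mass_solar > CHANDRASEKHAR_LIMIT:
        return "neutron star"
    return "undetermined by mass alone"

# Hypothetical X-ray binary: separation 0.2 AU, period 0.02 years, and a
# primary object assumed to have 10 solar masses.
total = total_mass_solar(0.2, 0.02)
compact_mass = total - 10.0
print(f"compact object: {compact_mass:.1f} solar masses ->",
      classify_compact_object(compact_mass))
```

In practice the orbit is measured through Doppler shifts of the primary object's spectrum, so the mass estimate also depends on the inclination of the orbit, which this sketch ignores.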
The Theories of Relativity: Galilean-Newtonian Relativity, Einsteinian Special Relativity, and Einsteinian General Relativity
Galilean-Newtonian Relativity
theory was formulated between three hundred and four
hundred years ago. This relativity
theory may also be called common-sense relativity
theory, since many of us understand this relativity theory intuitively from our
daily experiences. Fundamental to
Galilean-Newtonian Relativity theory is the Galilean-Newtonian velocity
addition law, which states that the velocity of Object A relative to Object B (written v_AB)
plus the velocity of Object B relative to Object C (written v_BC)
is equal to the velocity of Object A relative to Object C (written v_AC). This law is more properly written v_AC = v_AB + v_BC. This Galilean-Newtonian velocity addition law
may seem intimidating at first, but in fact many of us already understand this law intuitively from
our daily experiences, even if we cannot state this law mathematically. Let us discuss several examples to illustrate
that this law is indeed consistent with our common sense. As our first example, suppose a train is
moving at ten miles per hour to the right relative to the ground, and suppose
someone on the train fires a bullet moving at one hundred miles per hour to the
right relative to the train. Then, the
velocity of the bullet relative to the ground is one hundred and ten miles per
hour to the right. As our second
example, suppose that a train is moving at ten miles per hour
to the right relative to the ground, and suppose someone on the ground
fires a bullet moving at one hundred miles per hour to the right relative to
the ground. Then, the velocity of the
bullet relative to the train is ninety miles per hour to the right. As our third example, suppose a car is moving
at seventy miles per hour on one side of a highway relative to the ground, and
suppose another car is moving at fifty miles per hour on the same side of the
highway and hence is moving in the same direction relative to the ground. Then, the seventy-car is moving at twenty
miles per hour relative to the fifty-car.
Also, the fifty-car is moving at twenty miles
per hour backwards relative to the seventy-car.
As our fourth example, suppose a car is moving at seventy miles per hour
on one side of a highway relative to the ground, and suppose another car is
moving at fifty miles per hour on the opposite side of the highway and hence is
moving in the opposite direction relative to the ground. Then, either car is moving at one hundred and
twenty miles per hour relative to the other car. Why did we subtract seventy miles per hour
and fifty miles per hour to obtain twenty miles per hour in our third
example? Why did we add seventy miles
per hour and fifty miles per hour to obtain one hundred and twenty miles per hour
in our fourth example? The simple, common-sense
arguments are as follows. If we are
driving at fifty miles per hour on a highway and if a car on the same side of
the highway comes up from behind us at seventy miles per hour and collides with
us, the collision will be mild, since the relative speed between the two cars
is only twenty miles per hour. This
collision is exactly as if our car were parked and we were
hit by a car moving at twenty miles per hour. However, if we are driving at fifty miles per
hour on a highway and if a car moving in the opposite direction at seventy
miles per hour collides with us (a head-on collision),
we would be dead, since the relative speed between the two cars is one hundred
and twenty miles per hour. This
collision is exactly as if our car were parked and we were
hit by a car moving at one hundred and twenty miles per hour. As a fifth example, suppose a car is moving
at sixty miles per hour on one side of a highway relative to the ground, and
suppose another car is moving at sixty miles per hour on the same side of the
highway and hence is moving in the same direction relative to the ground. Then neither car is moving (each is at rest)
relative to the other car.
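The five examples above can be checked with a few lines of arithmetic. The sketch below treats velocities along the highway or the railroad track as signed numbers, with positive meaning to the right; the function names are hypothetical. The only rule used is the Galilean-Newtonian velocity addition law, together with the fact that the velocity of the ground relative to an object is the reverse of the velocity of that object relative to the ground.

```python
def galilean_add(v_a_rel_b, v_b_rel_c):
    """Velocity of A relative to C, given A relative to B and B relative to C."""
    return v_a_rel_b + v_b_rel_c

def relative_velocity(v_a_ground, v_b_ground):
    """Velocity of A relative to B when both are measured relative to the ground."""
    # Reversing B's ground velocity gives the ground's velocity relative to B.
    return galilean_add(v_a_ground, -v_b_ground)

# Velocities in miles per hour; positive means "to the right".
print(galilean_add(100, 10))       # first example: bullet fired from the train
print(relative_velocity(100, 10))  # second example: bullet seen from the train
print(relative_velocity(70, 50))   # third example: cars moving in the same direction
print(relative_velocity(70, -50))  # fourth example: cars moving in opposite directions
print(relative_velocity(60, 60))   # fifth example: cars with equal velocities
```

The printed values are 110, 90, 20, 120, and 0, matching the five examples. The subtraction in the third example and the addition in the fourth example are the same rule applied to velocities of the same sign and of opposite sign.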
After centuries of physicists believing that Galilean-Newtonian Relativity
theory (common-sense relativity theory) is correct, new physics was discovered
that began to reveal that these laws, although seemingly indisputable, are
nevertheless incorrect. In the 1860s, the brilliant Scottish physicist James Clerk Maxwell
formulated classical electromagnetic theory with four equations, later named
the Maxwell equations in his honor.
These four Maxwell equations are mathematically beautiful. These four Maxwell equations completely
summarize classical electromagnetism.
These four Maxwell equations even revealed that light is an
electromagnetic wave, and indeed the entire wave theory of light can be derived from these four Maxwell equations, including
all the laws of classical optics.
However, these four Maxwell equations also stated that the vacuum speed
of light is always the same number, written c
and equal to roughly three hundred thousand kilometers per second or roughly
one hundred and eighty-six thousand miles per second. This cannot be true, can it? According to common sense, the vacuum speed
of light cannot always be the same number, as the following few examples will
illustrate. Suppose a train is moving at
velocity V to the right relative to
the ground, and suppose someone on the train with a flashlight sends a light
beam moving at velocity c to the
right relative to the train. Then, the
velocity of the light beam relative to the ground is c plus V, isn’t it? Please
review our first example from Galilean-Newtonian Relativity theory for help
with this example, since they are in fact identical examples. Suppose a train is moving at velocity V to the right relative to the ground,
and suppose someone on the ground with a flashlight sends a light beam moving
at velocity c to the right relative
to the ground. Then, the velocity of the
light beam relative to the train is c
minus V, isn’t
it? Please review our second example
from Galilean-Newtonian Relativity theory for help with this example, since
they are in fact identical examples.
Suppose a train is moving at velocity c to the right relative to the ground, and suppose someone on the
ground with a flashlight sends a light beam moving at velocity c to the right relative to the
ground. Then, the light beam is not
moving (is at rest) relative to the train, isn’t
it? Please review our fifth example from
Galilean-Newtonian Relativity theory for help with this example, since they are
in fact identical examples. Suppose a
train is moving at velocity V
relative to the ground, and suppose someone on the ground with a flashlight
sends a light beam moving at velocity c relative to the ground at right angles to the train’s velocity. Then, the light beam is moving at a speed √(c² + V²) relative to the train, isn’t
it? All of these examples persuade us
that according to the common sense of our daily
experiences, the vacuum speed of light should depend upon which direction we
are moving, how fast we are moving, and in which direction the light itself is
moving. Our examples, using the common
sense of our daily experience, tell us that sometimes the vacuum speed of light
might be c, but other times it might
be c plus V, sometimes it could be c
minus V, sometimes it could be zero,
sometimes it could be √(c² + V²),
and so on and so forth. The two American
physicists Albert A. Michelson and Edward W. Morley set out to show that this
is the case in the 1880s with what was
later called the Michelson-Morley experiment in their honor. However, their experiment showed that the
vacuum speed of light does not depend upon which direction we are moving or how
fast we are moving or even in which direction the light itself is moving! Their measurements showed that the vacuum
speed of light is always the same number, always c! Our common sense tells us
that this cannot be true, and indeed many physicists believed that Michelson
and Morley performed their experiment incorrectly. Some physicists did believe the result, but
they could not explain how this can possibly be true.
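For comparison with the experimental result, the common-sense predictions in the flashlight examples above can be computed by treating velocities as two-dimensional vectors and subtracting the train's velocity from the light beam's velocity. The sketch below is only the Galilean-Newtonian expectation that the Michelson-Morley experiment contradicted; the train speed is a hypothetical value chosen for illustration, and the function name is an assumption.

```python
import math

C = 299_792.458  # vacuum speed of light in kilometers per second

def galilean_light_speed(train_velocity, light_velocity):
    """Speed of a light beam relative to the train, as common sense predicts.

    Both arguments are (x, y) velocity vectors in km/s measured relative to
    the ground; the prediction is the length of their vector difference.
    """
    dx = light_velocity[0] - train_velocity[0]
    dy = light_velocity[1] - train_velocity[1]
    return math.hypot(dx, dy)

V = 30.0  # hypothetical train speed in km/s, chosen only for illustration

print(C + V)                                       # flashlight on the train, seen from the ground
print(galilean_light_speed((V, 0.0), (C, 0.0)))    # beam chasing the train: c minus V
print(galilean_light_speed((C, 0.0), (C, 0.0)))    # train keeping pace with the beam: zero
print(galilean_light_speed((V, 0.0), (0.0, C)))    # beam at right angles: sqrt(c^2 + V^2)
```

Every one of these predicted values differs from c, whereas the measured vacuum speed of light is c in every case.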
This brings us to the person
who would explain all of these mysteries.
Albert Einstein began his career as an obscure physicist without a university position.
According to a popular anecdote, one of his
schoolteachers told young Albert’s father, “Nothing good will
ever come from this boy!” In the year
1905, Albert Einstein worked at a patent office in Switzerland. Although many physicists would feel
humiliated working in such a position, this job gave Einstein plenty of time to
think about fundamental physics.
Einstein was so enraptured by the beauty of the
Maxwell equations that he became convinced that they must be true. Most physicists would have responded that the
Maxwell equations cannot be completely true, since
they state that the vacuum speed of light is always c, which common sense says is impossible. Only Einstein would dare assert the
following. The Maxwell equations are so
beautiful that they must be true.
Therefore, if they state that the vacuum speed of light is always c, then the vacuum speed of light is
always c! This is Special Relativity theory in one
sentence. Einstein’s Special Relativity
theory states that the vacuum speed of light does not depend upon which
direction we are moving or how fast we are moving or even in which direction
light happens to be moving. Einstein’s
Special Relativity theory states that the vacuum speed of light is always the
same number, written c and equal to
roughly three hundred thousand kilometers per second or roughly one hundred and
eighty-six thousand miles per second. In
other words, Einstein’s Special Relativity theory states that the vacuum speed
of light is an invariant.
Einstein’s Special Relativity
theory is simple to state, but this theory confounds our common sense. How can the vacuum speed of light possibly be
an invariant? The basic argument is as
follows. If the vacuum speed of light is
always the same number c, then space
and time must change to ensure that the vacuum speed of light c does not change. For example, Einstein made the following
incredible deduction from his new theory.
Time slows down when we move; moving clocks actually run slow! This is called time
dilation. Consider any clock whatsoever,
such as a mechanical clock or the electronic clock within our mobile
telephones. According to Einstein’s
Special Relativity theory, a clock must run slower if it moves. Suppose all of us had identical mobile
telephones, and suppose we synchronized their clocks. If one of us walks with our mobile telephone,
our mobile telephone runs slower than everyone else’s mobile telephones! As a result, our time runs behind everyone
else’s time! Is Einstein actually
claiming that whenever we walk or ride a bicycle or drive a car or ride a train
or ride an airplane that our time slows down?
Yes! But then
why do we never notice in our daily experience that our time slows down? The time dilation effect is very tiny for
objects moving at speeds very slow compared with c, and everything in our daily experiences does indeed move very
slowly compared with the vacuum speed of light c. We would only notice
these temporal changes if we moved incredibly fast, close to the vacuum speed
of light c. The implications of this time dilation effect
are staggering. For example, consider
two identical twins who have lived together in the same house their entire
lives. Hence, they are the same
age. However, if one of them walks down
the street, that twin will age a tiny amount slower, since their time is now
running slower. When that twin returns
home, that twin will be a tiny amount younger than the twin who remained at
home! Time dilation was considered
outrageous a century ago, but this effect has actually been observed in recent
decades. For example, suppose we
synchronize two extraordinarily accurate atomic clocks. Then, suppose we place one of these atomic
clocks on an airplane. After the flight,
physicists have actually experimentally measured that
the airplane’s atomic clock is behind the ground’s atomic clock by a tiny
amount! As another example, consider an
unstable subatomic particle that decays after a short lifetime. If this particle is
accelerated close to the vacuum speed of light c, it is observed to live much longer before decaying than an identical particle at rest. When the particle moves, its
time slows down, permitting it to live a longer lifetime before decaying. Not only is time dilation real, but in fact the global positioning system (GPS)
that our mobile telephones rely upon would not
function correctly without taking into account the fact that the
clocks aboard its moving satellites run at a different rate than clocks on the ground!
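To see how tiny time dilation is at everyday speeds, we can evaluate the standard Lorentz factor γ = 1 / √(1 − v²/c²), the factor by which a moving clock runs slow. This formula is not written out in the passage above, and the speeds and the particle lifetime below are hypothetical values chosen for illustration. Since γ differs from one by less than ordinary floating-point precision at walking speed, the sketch also computes γ − 1 in a numerically stable way.

```python
import math

C = 299_792_458.0  # vacuum speed of light in meters per second

def lorentz_factor(speed_m_s):
    """gamma = 1 / sqrt(1 - v^2 / c^2)."""
    return 1.0 / math.sqrt(1.0 - (speed_m_s / C) ** 2)

def gamma_minus_one(speed_m_s):
    """gamma - 1, computed stably so that everyday speeds do not round to zero."""
    beta_squared = (speed_m_s / C) ** 2
    # gamma = exp(-0.5 * ln(1 - beta^2)); expm1 and log1p keep the tiny difference.
    return math.expm1(-0.5 * math.log1p(-beta_squared))

def dilated_lifetime(rest_lifetime_s, speed_m_s):
    """Lifetime of a moving unstable particle as measured in the laboratory."""
    return lorentz_factor(speed_m_s) * rest_lifetime_s

for label, speed in [("walking", 1.5), ("airliner", 250.0), ("0.99 c", 0.99 * C)]:
    print(f"{label:>8}: gamma - 1 = {gamma_minus_one(speed):.3e}")

# Hypothetical unstable particle with a rest lifetime of 2.2 microseconds.
print(f"lifetime at 0.99 c: {dilated_lifetime(2.2e-6, 0.99 * C):.2e} s")
```

For the walker and the airliner the fractional slowing is far below one part in a billion, which is why only extraordinarily accurate atomic clocks can reveal it, while at 0.99 c the particle's lifetime is stretched several times over.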
Einstein drew another incredible
conclusion from his new theory: space contracts when we move; moving objects
actually contract! This is called length contraction. Consider any object whatsoever. According to Einstein’s Special Relativity
theory, the object must contract in the direction it is moving. While we are walking, we are skinnier than
usual, and not because we are getting exercise!
When we stop moving, our shape returns to normal. Is Einstein actually claiming that moving
cars and moving trains are shorter than normal?
Yes! But then
why do we never notice in our daily experience that moving cars and moving
trains are shorter than normal? The
length contraction effect is very tiny for objects moving at speeds very slow
compared with c, and everything in
our daily experiences does indeed move very slowly compared with the vacuum
speed of light c. We would only notice these spatial changes if
we moved incredibly fast, close to the vacuum speed of light c.
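Length contraction can be put into numbers in the same spirit, using the standard relation L = L₀ √(1 − v²/c²) for the length of an object along its direction of motion, where L₀ is its length at rest. The formula is not written out in the passage above, and the train length and speeds below are hypothetical values chosen for illustration.

```python
import math

C = 299_792_458.0  # vacuum speed of light in meters per second

def contracted_length(rest_length_m, speed_m_s):
    """Length along the direction of motion: L = L0 * sqrt(1 - v^2 / c^2)."""
    return rest_length_m * math.sqrt(1.0 - (speed_m_s / C) ** 2)

# Hypothetical train that is 100 meters long at rest, at three speeds.
for fraction in (0.001, 0.5, 0.99):
    length = contracted_length(100.0, fraction * C)
    print(f"v = {fraction} c  ->  length = {length:.4f} m")
```

Even at one thousandth of the vacuum speed of light, which is far faster than any train, the contraction is well under a millimeter; only near c does the train become noticeably shorter.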
We now begin to have some
understanding how it could possibly be true that the vacuum speed of light is
an invariant, always equal to the same number c. Speed is equal to
distance divided by time. Distance is measured with graduated rods such as meter sticks, and
time is measured with clocks. However,
moving objects contract and moving clocks run slow! To ensure that the vacuum
speed of light is always equal to the same number c, every graduated rod in the universe contracts by just the right
amount and every clock in the universe slows down by just the right amount to
ensure that the distance traveled by light divided by the time for light to
travel always equals the same number c. Space and time must change to ensure that c does not change!
When Einstein deduced length
contraction and time dilation from his new theory, he realized that the
Galilean-Newtonian velocity addition law could no longer be correct. Physicists believed that the
Galilean-Newtonian velocity addition law was correct for centuries, and even today the common sense of our daily experience tells us that
it seems to be true. We must keep in
mind that time dilation and length contraction are very tiny effects for
objects moving at speeds very slow compared with c, such as walking people, driving cars, moving trains, and flying
airplanes. This makes the
Galilean-Newtonian velocity addition law almost correct, but still not exactly correct, at these slow speeds. At very fast speeds approaching c, we would actually notice that this
law is severely wrong. Einstein deduced
the correct velocity addition law by taking time dilation and length
contraction into account. This new law is called the Lorentz-Einstein velocity addition law, named
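for both Albert Einstein and the Dutch physicist Hendrik Lorentz. The Lorentz-Einstein velocity addition law
states that the combined velocity is equal to the sum of the two velocities divided by one plus
the product of the two velocities divided by the square of c. This new velocity addition law correctly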
ensures that the vacuum speed of light is an invariant, always equal to the
same number c.
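A short numerical comparison may help make the two velocity addition laws concrete. This is a sketch for motion along a single line, using the standard form of the Lorentz-Einstein law; the function names and the sample speeds are chosen here only for illustration.

```python
C = 299_792_458.0  # vacuum speed of light in meters per second


def galilean_sum(u, v):
    """Galilean-Newtonian velocity addition: simply add the two velocities."""
    return u + v


def lorentz_einstein_sum(u, v):
    """Lorentz-Einstein velocity addition for two velocities along the same line."""
    return (u + v) / (1.0 + u * v / C**2)


# Slow speeds (assumed): a person walking at 1.5 m/s inside a train moving at 30 m/s.
print(galilean_sum(1.5, 30.0), lorentz_einstein_sum(1.5, 30.0))

# Fast speeds: two velocities that are each 90% of c.
print(galilean_sum(0.9 * C, 0.9 * C) / C, lorentz_einstein_sum(0.9 * C, 0.9 * C) / C)

# Light itself: combining c with any other velocity still gives c.
print(lorentz_einstein_sum(C, 0.5 * C) / C)
```

At slow speeds the two laws agree to within the precision of the calculation. At ninety percent of c the Galilean-Newtonian law gives a result faster than c, while the Lorentz-Einstein law gives a result slower than c, and combining c with any other velocity returns c again.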
Einstein drew yet another
incredible conclusion from his new theory: the mass of an object increases as
it moves faster. For centuries,
physicists believed that the mass of an object is fixed, and even today the common sense of our daily experience tells us that
the mass of an object is fixed. We must
realize that the additional mass is very tiny at speeds very slow compared with
c, such as the speeds of walking
people, driving cars, moving trains, and flying airplanes. We would need to move at very fast speeds
approaching c to
actually notice this extra mass.
If we stand still, we have a certain amount of mass, but while we are walking we have more mass!
The next time someone urges us to go jogging to lose some weight, we should respond that we will gain mass if we
jog! The extra mass is tiny at such a
slow speed, but it is nevertheless real!
The equation for this extra mass (which we will not present in this
course) reveals yet another outrageous consequence of this theory: the vacuum
speed of light c is the speed limit
of the universe. An object gains mass
when it moves faster, but this means that we would then require more force to
speed it up further. If a force does
speed the object up further, then the object would gain even more mass, and
thus we would require even more force to speed it up further still. According to the extra-mass equation, the
mass of an object approaches infinity as its speed approaches c.
This means that we would need an infinite force to speed the object up
to c, but it is impossible to exert
an infinite force. Not only does the
universe forbid anything from moving faster than c, the universe forbids any object to even reach
c!
We could accelerate a spaceship faster and faster, making it move closer
and closer to c, but the spaceship
can never reach c. The only things permitted to actually move at
c are things already moving at c, such as light or any electromagnetic
wave (composed of photons) from across the entire Electromagnetic
Spectrum. We will discuss shortly that
according to Einstein’s General Relativity theory, gravity also moves at c.
Any object moving slower than c
is forever constrained to move slower than c. Such an object can move faster and faster
approaching c, but actually reaching c is forbidden. Moving faster than c is out of the question.
These conclusions were considered outrageous a century ago, but they
have been experimentally confirmed in recent decades. In
particle accelerators, we can speed up subatomic particles, and physicists have
experimentally verified that these particles do indeed gain more and more mass as they move faster and faster, in precise
accordance with the extra-mass equation that Einstein discovered. Moreover, physicists have experimentally
verified that the vacuum speed of light c
acts as a bottleneck, precisely as predicted by the extra-mass equation that
Einstein discovered. We can accelerate
particles very close to c, but we
cannot accelerate particles to actually reach c.
Speeds faster than c are out
of the question. Modern particle
accelerators can accelerate protons to speeds faster than 99.9999% of c, but nevertheless
still slower than c itself. In summary, Einstein’s Special Relativity
theory states that our universe has a speed limit, the vacuum speed of light c!
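Although these notes do not present the extra-mass equation, a brief sketch can illustrate how the mass grows without limit as the speed approaches c. The sketch assumes the standard textbook form, in which the moving mass equals the mass at rest multiplied by 1 divided by the square root of 1 minus v squared over c squared; that formula and the sample speeds are assumptions added here, not taken from these notes.

```python
import math


def mass_factor(fraction_of_c):
    """Factor by which mass increases at a given fraction of c (assumed textbook formula)."""
    return 1.0 / math.sqrt(1.0 - fraction_of_c ** 2)


# The factor stays near 1 at modest speeds and climbs steeply near c.
for fraction in (0.1, 0.9, 0.99, 0.9999, 0.999999):
    print(f"speed = {fraction} c   mass factor = {mass_factor(fraction):.3f}")
```

Under this assumed formula, a proton moving at 99.9999% of c has roughly seven hundred times its mass at rest, and the factor keeps growing as the speed creeps closer to c.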
Einstein also deduced his
famous mass-energy relation from his new theory, which states that energy is
equal to mass multiplied by the square of the vacuum speed of light. This law is most commonly written E = mc². The consequences of this equation are
staggering. For example, consider
nuclear reactions. An exothermic nuclear
reaction liberates energy, while an endothermic nuclear reaction absorbs
energy. These two terms exothermic and
endothermic are used to describe chemical reactions as
well. If an exothermic nuclear reaction
liberates energy, then the reaction must liberate mass as well. Thus, the products of an exothermic reaction
have less mass than the reactants! If an
endothermic nuclear reaction absorbs energy, then the reaction must absorb mass
as well. Thus, the products of an
endothermic reaction have more mass than the reactants! Einstein stated this mass-energy relation in
the year 1905. It would be a few years
later before physicists even discovered that an atom has a nucleus, and it
would be forty years later before the first nuclear weapons were built.
Nevertheless, Einstein actually stated in the year 1905 that his
mass-energy relation could be tested by studying
radioactive materials. It would be years
before physicists even realized that radioactivity is a type of nuclear
reaction! Let us spend a moment
reflecting upon Einstein’s genius.
Forty years before nuclear weapons were built
and even a few years before the nucleus of an atom was discovered, Einstein
discovered the mass-energy equation and applied it to radioactivity, a type of
nuclear reaction!
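The mass-energy relation is simple enough to evaluate directly. The following minimal sketch shows how much energy corresponds to a small mass and, conversely, how much mass corresponds to a given energy; the one-gram and one-megajoule figures are illustrative assumptions.

```python
C = 299_792_458.0  # vacuum speed of light in meters per second


def energy_from_mass(mass_kg):
    """Energy in joules equivalent to a mass in kilograms: E = m c^2."""
    return mass_kg * C**2


def mass_from_energy(energy_joules):
    """Mass in kilograms equivalent to an energy in joules: m = E / c^2."""
    return energy_joules / C**2


print(f"one gram of mass: {energy_from_mass(1.0e-3):.3e} joules")
print(f"one megajoule of energy: {mass_from_energy(1.0e6):.3e} kilograms")
```

Because c squared is such an enormous number, a single gram of mass corresponds to nearly one hundred trillion joules, while the mass lost in an ordinary energy release is far too small to notice.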
The mass-energy relation may
be the last of Einstein’s contributions to Special Relativity theory, but one
of his former math teachers, the German-Polish-Russian mathematician Hermann Minkowski, realized what this new theory is really trying
to tell us. According to Special
Relativity theory, we live in a four-dimensional universe. According to the common sense of our daily
experience, we live in a three-dimensional universe. These three dimensions are length, width, and
height, mathematically written as x, y, and z. However, time is the
fourth dimension according to relativity theory. Time is usually written
as t, but time is written as ct in relativity
theory. In other words, we live in a
four-dimensional universe: three spatial dimensions (x, y, and z) and one temporal dimension (ct). Moreover, these four dimensions mix into one
another, and the mixing of the temporal dimension with the three spatial
dimensions is the fundamental cause of time dilation, length contraction, the
invariance of c, the universal speed
limit of c, and even the mass-energy
relation. Minkowski
invented a new word to describe our four-dimensional universe. Minkowski took the
word space and the word time, and he put them together to form a new word: spacetime. Notice
that there is no space or even a hyphen between the two words used to construct
this new word. To summarize Einstein’s
Special Relativity theory, we live in a four-dimensional spacetime
with three spatial dimensions (x, y, and z) and one temporal dimension (ct) that all mix into one another
thus causing time dilation, length contraction, the invariance of c, the universal speed limit of c, and the mass-energy relation.
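The mixing of the temporal dimension with a spatial dimension can be sketched with the standard Lorentz transformation, which these notes have not written out and which is therefore an assumption added here. The sketch uses only one spatial dimension x together with ct, and the sample event and speed are illustrative. It shows that ct and x each change when we switch to a moving frame of reference, yet the combination (ct)² − x² does not change.

```python
import math


def boost(ct, x, beta):
    """Coordinates of an event as seen from a frame moving at speed beta * c along x.

    Assumed standard Lorentz transformation: each new coordinate is a
    mixture of the old ct and the old x.
    """
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    ct_new = gamma * (ct - beta * x)
    x_new = gamma * (x - beta * ct)
    return ct_new, x_new


def interval_squared(ct, x):
    """The combination (ct)^2 - x^2, which every inertial frame agrees upon."""
    return ct ** 2 - x ** 2


event = (5.0, 3.0)                      # (ct, x) in arbitrary length units (assumed)
moved = boost(event[0], event[1], 0.6)  # the same event seen from a frame moving at 60% of c

print(event, interval_squared(*event))
print(moved, interval_squared(*moved))
```

The individual time and space coordinates differ between the two frames of reference, but the spacetime combination is the same in both.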
Fictitious forces or pseudoforces are forces that do not actually exist; they
only seem to exist in certain frames of reference. For example, suppose we are in a stationary
car waiting at a red traffic light. When
the red traffic light turns green, we place our foot upon the car’s accelerator
pedal. As the car accelerates forward,
everyone and everything in the car feels a backward force. We actually feel ourselves pulled backward
into the backrest of our chair. Anything
hanging from the rearview mirror also swings backward. This backward force is a fictitious force or
a pseudoforce.
It does not exist; it only seems to exist within the car as the car
accelerates forward. Although everyone
and everything within the car feels this backward force, it nevertheless does
not actually exist. In actuality,
everyone and everything within the car remains stationary for a moment as the
car and its chairs accelerate forward, and hence the backrests of the chairs
accelerate forward and collide with our own backs. This is amusing: within the car we feel pulled backward into the backrests of the
chairs, but in actuality we remain stationary while the backrests of the chairs
accelerate forward into our backs!
Although we feel a backward force within the car, we nevertheless
conclude that this backward force is a fictitious force or a pseudoforce. It does
not actually exist; it only seems to exist within the car as the car
accelerates forward. As another example,
suppose we are in a moving car when we see a green traffic light turn yellow,
and so we place our foot upon the car’s brake pedal. As the car slows down, everyone and
everything in the car feels a forward force.
We actually feel ourselves pulled forward off of
the backrest of our chair. Anything
hanging from the rearview mirror also swings forward. In extreme cases, we may feel pulled forward
so strongly that our heads may collide with the windshield. This forward force is a fictitious force or a
pseudoforce.
It does not exist; it only seems to exist within the car as the car
slows down. Although everyone and
everything within the car feels this forward force, it nevertheless does not
actually exist. In actuality, everyone
and everything within the car remains in motion for a moment as the car and its
chairs and its windshield slow down, and hence the
backrests of the chairs move away from our own backs while the windshield moves
toward our heads. This is amusing:
within the car we feel pulled forward off of the
backrests of the chairs and toward the windshield, but in actuality the
backrests of the chairs move away from our backs and the windshield moves
toward our heads! Although we feel a
forward force within the car, we nevertheless conclude that this forward force
is a fictitious force or a pseudoforce. It does not actually exist; it only seems to
exist within the car as the car slows down.
As yet another example, suppose we are in a moving car when we see that
the highway ramp ahead curves to the left, and so we turn the steering wheel to
the left so that the car will remain on the highway ramp. As the car turns left, everyone and
everything in the car feels a rightward force.
We actually feel ourselves pulled rightward away from the driver’s side
of the car and toward the passenger’s side of the car. Anything hanging from the rearview mirror
also swings rightward and continues to remain suspended rightward in apparent
defiance of the Earth’s downward gravity as the car turns left! This rightward force is a fictitious force or
a pseudoforce.
It does not exist; it only seems to exist within the car
as the car turns left. Although everyone
and everything within the car feels this rightward force, it nevertheless does
not actually exist. In actuality,
everyone and everything within the car remains in forward motion as the car
turns left, and hence the driver’s side of the car turns away from us while the
passenger’s side of the car turns toward us.
This is amusing: within the car we feel pulled
rightward toward the passenger’s side of the car, but in actuality we remain in
forward motion while the passenger’s side of the car turns leftward toward
us! Although we feel a rightward force
within the car, we nevertheless conclude that this rightward force is a
fictitious force or a pseudoforce. It does not actually exist; it only seems to
exist within the car as the car turns left. As a fourth example, projectiles will appear
to suffer from deflections within a rotating frame of reference. This deflecting force is a fictitious force
or a pseudoforce.
It does not exist; it only seems to exist within the rotating frame of
reference. In actuality, the projectiles
are not deflected; the projectiles in fact continue
moving along straight paths. The frame
of reference is rotating, and the rotation of the entire frame of reference
seems to cause projectiles to deviate from straight trajectories. This particular fictitious force or pseudoforce is called the Coriolis
force, named for the French physicist Gaspard-Gustave de Coriolis who first
derived the mathematical equations describing this particular fictitious force
or pseudoforce.
The Coriolis force appears to cause rightward deflections in frames of
reference rotating counterclockwise, and the Coriolis force appears to cause
leftward deflections in frames of reference rotating clockwise. The Coriolis force appears to cause stronger
deflections if the frame of reference is rotating faster and appears to cause
weaker deflections if the frame of reference is rotating slower. The Coriolis force appears to vanish if the
frame of reference stops rotating. The
Coriolis force only appears to cause deflections; it does not cause projectiles
to speed up or slow down.
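The apparent deflection of a projectile in a rotating frame of reference can be sketched numerically. In this minimal sketch, a projectile launched from the rotation axis moves along a perfectly straight line at constant speed in an inertial frame of reference, and we simply ask where it appears to be according to a frame of reference that rotates; the speed, time, and rotation rates are illustrative assumptions. (In general the apparent path in a rotating frame also includes a centrifugal effect; launching from the rotation axis keeps the sideways deflection dominated by the Coriolis effect at early times.)

```python
import math


def position_in_rotating_frame(t, speed, omega):
    """Apparent (x, y) position of a projectile as seen from a rotating frame.

    In the inertial frame the projectile moves straight along +x from the origin.
    omega is the rotation rate of the frame in radians per second:
    positive for counterclockwise, negative for clockwise.
    """
    x, y = speed * t, 0.0
    angle = omega * t
    x_rot = x * math.cos(angle) + y * math.sin(angle)
    y_rot = -x * math.sin(angle) + y * math.cos(angle)
    return x_rot, y_rot


# A projectile at 10 m/s observed after 1 second (assumed values).
for omega in (0.0, 0.1, 0.2, -0.1):
    x_rot, y_rot = position_in_rotating_frame(1.0, 10.0, omega)
    print(f"omega = {omega:+.1f} rad/s   apparent position = ({x_rot:.3f}, {y_rot:+.3f})")
```

For a projectile heading along +x, a negative y value is a deflection to the right. The sketch gives a rightward deflection for counterclockwise rotation, a leftward deflection for clockwise rotation, a larger deflection for faster rotation, and no deflection at all when the frame of reference does not rotate.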
As we discussed earlier in
the course, physicists use the word acceleration for the rate at which an
object’s motion changes, where the object could be suffering from any change in
motion whatsoever. An object that is
speeding up is said to be accelerating, but an object
that is slowing down is also said to be accelerating. (In colloquial English, we would use the word
decelerating instead.) Moreover, an
object that is neither speeding up nor slowing down but only changing the
direction that it moves is also said to be
accelerating. In all four of our
examples of fictitious forces or pseudoforces, notice
that the frame of reference is accelerating.
In the first example, the car was accelerating forward. In the second example, the car was slowing
down, which again is a form of acceleration.
In the third example, the car was changing the direction that it was moving,
which again is a form of acceleration.
In our fourth example, the entire frame of reference was rotating, which
is also a form of acceleration. A frame
of reference where there are no fictitious forces or pseudoforces
is called an inertial frame of reference, while a
frame of reference where fictitious forces or pseudoforces
appear to exist is called a non-inertial frame of reference. It is not difficult to prove mathematically
that all inertial frames of reference (where there are no
fictitious forces or pseudoforces) are not
accelerating relative to one another. It
is also not difficult to prove mathematically that all non-inertial frames of
reference (where fictitious forces or pseudoforces
appear to exist) are accelerating relative to all inertial frames of reference,
as our four examples illustrate. Since
fictitious forces or pseudoforces appear to exist
within non-inertial (accelerating) frames of reference, the laws of physics
require exotic modifications when used within non-inertial frames of
reference. Since there are no fictitious
forces or pseudoforces within inertial
(non-accelerating) frames of reference, the laws of physics do not require
these exotic modifications when used within inertial frames of reference. More plainly, the laws of physics apply
naturally from within inertial (non-accelerating) frames of reference, but the
laws of physics do not naturally apply from within non-inertial (accelerating)
frames of reference. All of the laws of
physics we have discussed thus far in this course apply naturally from within
inertial (non-accelerating) frames of reference. In particular, Galilean-Newtonian Relativity
theory, Newton’s laws of motion, Newton’s theory of gravitation, Maxwell’s
electromagnetic theory, Quantum Mechanics, and even Einstein’s Special
Relativity theory all apply naturally from within inertial (non-accelerating)
frames of reference. None
of the laws of physics we have discussed thus far in this course
apply naturally from within non-inertial (accelerating) frames of
reference. Note that this is why
Einstein’s Special Relativity theory is called Special
Relativity. This theory only applies
naturally from within special frames of reference, inertial (non-accelerating)
frames of reference, just as all the laws of physics we have discussed thus far
in this course apply naturally from within inertial (non-accelerating) frames
of reference.
Einstein was
extremely bothered by this restriction upon the laws of physics, in
particular upon his Special Relativity theory.
If the laws of physics are the mathematical equations that describe the
universe, then we should feel free to apply them from within any frame of
reference whatsoever. Consequently,
Einstein realized that he must generalize his Special Relativity theory to a
new theory of physics that could be applied from within any frame of reference
whatsoever, whether inertial (non-accelerating) or non-inertial
(accelerating). This new more general
theory Einstein called General Relativity theory, since it is more general than
his Special Relativity theory and indeed more general than all other laws of
physics. Einstein insisted that this new
General Relativity theory must apply naturally from within not only inertial
(non-accelerating) frames of reference but from within
non-inertial (accelerating) frames of reference as well. Fictitious forces or pseudoforces
appear to act upon all objects from within non-inertial (accelerating) frames
of reference. Einstein then realized that there is another force that acts upon all objects:
gravitation. Einstein began to imagine
that fictitious forces or pseudoforces must act like
gravitational forces, and therefore his General Relativity theory must
ultimately be a theory of gravity. To
illustrate how fictitious forces or pseudoforces act
like gravitational forces, consider a spaceship far from all stars and planets
or any other large gravitating objects.
The astronauts within this spaceship would feel weightless as long as
the spaceship were not accelerating.
However, suppose the spaceship had sufficient fuel to thrust the
spaceship, causing an acceleration.
While the spaceship accelerates, everyone and everything within the
spaceship would feel fictitious forces or pseudoforces,
and hence these fictitious forces or pseudoforces
would feel like gravitational forces, even though the spaceship is far from all
stars and planets or any other large gravitating objects. In fact, if the spaceship had sufficient fuel
to thrust the spaceship with an acceleration of 9.8 meters per second per
second, then the astronauts would feel the same gravity within the spaceship
that they would feel if they were standing on the surface of the Earth. As long as the spaceship has sufficient fuel
to continue to accelerate the spaceship, everyone and everything within the
spaceship would feel gravity as if they were standing on Earth instead of in a
spaceship in outer space! This example
persuades us that we can turn gravity on within non-inertial (accelerating)
frames of reference. We can also turn
gravity off within non-inertial (accelerating) frames of reference. For example, suppose we are standing within
an elevator on planet Earth. Now suppose
the elevator cable breaks, causing the elevator to fall. We present two arguments to persuade us that
everyone and everything within this falling elevator would feel weightless
while falling. Firstly, everything falls
toward the Earth with the same acceleration ignoring non-gravitational forces
such as air resistance, as we discussed earlier in the course. Hence, everyone and everything within the
elevator accelerates downward together.
Consequently, if we were to take our keys out of our pocket for example
and let go, our keys would not appear to fall down but would instead appear to simply float in front of us, since we ourselves and our
keys and everything within the elevator are accelerating downward along with
the elevator with the same acceleration.
Secondly, since the elevator is accelerating downward, it is a
non-inertial frame of reference.
Therefore, everyone and everything within the elevator should feel a
fictitious force or a pseudoforce upward that would
exactly cancel the Earth’s downward gravity.
These two arguments persuade us that everyone and everything within the
falling elevator feels weightless. More
generally, gravity is always turned off within all
freely falling frames of reference.
Caution: just as physicists use the word acceleration for any change in
motion whatsoever, physicists use the term freely falling for any frame of
reference moving only under the influence of gravity. Someone who is falling downward is said to be freely falling, but someone who is shot upward
out of a cannon is also said to be freely falling even while they are moving
upward. Someone who is shot out of a cannon at an angle is also said to be freely falling
even though they are moving along a trajectory that at first takes them upward
and then later takes them downward.
Moons orbiting planets are freely falling even if the moon and the planet
are not actually approaching each other.
Planets orbiting stars are also freely falling even if the planet and
the star are not actually approaching each other. In all such cases, gravity is
turned off within freely falling frames of reference. For example, astronauts feel weightless while
orbiting the Earth even though astronauts almost always
orbit close enough to the Earth that its gravity is nearly as strong as
(roughly ninety percent of) the gravity on the surface of the Earth.
As a counterintuitive example of this principle, consider a spaceship
falling toward a planet. Most students
believe that the astronauts within the spaceship would feel stronger and
stronger gravity as their spaceship approaches the planet, but this is false. In actuality, the astronauts feel weightless
during their entire journey falling toward the planet, since they are in a
freely falling frame of reference.
Assuming the planet has no atmosphere that would slow the spaceship down
or burn the spaceship up, the astronauts within the spaceship would feel
weightless during their entire journey, right up to the moment just before they
crash upon the planet. Other astronauts
standing upon the planet right next to the crash site feel the planet’s
gravity, but the astronauts within the spaceship feel weightless even
immediately before crashing, when they are right next to the
astronauts who do feel the planet’s gravity! Einstein struggled for roughly ten years to
mathematically express all of these ideas, and in the year 1915
he finally formulated his General Relativity theory. Firstly, this new General Relativity theory
states that we live in a four-dimensional spacetime
with three spatial dimensions and one temporal dimension that all mix into one
another. Although this is precisely what
Special Relativity theory already asserts, this new theory in addition states
that gravity is the curvature of our four-dimensional spacetime. According to Special Relativity theory, our
four-dimensional spacetime has a flat (uncurved) geometry because Special Relativity does not
include the effects of gravity.
According to General Relativity theory, our four-dimensional spacetime has a curved geometry, since this new theory
states that gravity is the curvature of our four-dimensional spacetime. To the
present day, Einstein’s General Relativity is the only theory in all of physics
that places all frames of reference, both inertial (non-accelerating) and
non-inertial (accelerating), on equal footing.
Einstein’s General Relativity theory may be applied
from within any frame of reference whatsoever, whether or not there appear to
be fictitious forces or pseudoforces from within the
frame of reference.
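The accelerating spaceship and the falling elevator discussed above can be summarized with one small calculation. This is only a sketch in the everyday Newtonian approximation: the weight we feel (the force a floor scale would read) is taken to be our mass multiplied by the sum of the local gravitational acceleration and the upward acceleration of our frame of reference. The function name and the 70-kilogram mass are assumptions made for illustration.

```python
G_EARTH = 9.8  # gravitational acceleration at the Earth's surface, meters per second per second


def felt_weight(mass_kg, local_gravity, frame_acceleration_up):
    """Force in newtons that a floor scale reads inside a frame of reference.

    local_gravity: downward gravitational acceleration where the frame is located.
    frame_acceleration_up: acceleration of the frame toward our heads
    (negative if the frame accelerates toward our feet).
    """
    return mass_kg * (local_gravity + frame_acceleration_up)


mass = 70.0  # kilograms (assumed)

standing_on_earth = felt_weight(mass, G_EARTH, 0.0)
thrusting_spaceship = felt_weight(mass, 0.0, G_EARTH)    # far from all stars and planets
coasting_spaceship = felt_weight(mass, 0.0, 0.0)         # engines off, far from everything
falling_elevator = felt_weight(mass, G_EARTH, -G_EARTH)  # cable broken, freely falling

print(standing_on_earth, thrusting_spaceship, coasting_spaceship, falling_elevator)
```

The thrusting spaceship reproduces exactly the weight felt on the Earth, gravity turned on by acceleration, while the freely falling elevator gives zero, gravity turned off by acceleration.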
All of the outrageous
conclusions that Einstein deduced from Special Relativity are still true in General Relativity, but these conclusions are even more
outrageous in this new theory. For
example, does time dilation still occur?
Does a clock still run slow when it moves according to General
Relativity theory? Yes, but this effect
is now even worse. According to General
Relativity theory, a clock does not even need to be moving for it to run slow
because gravity itself slows down time!
In particular, stronger gravity will slow down time more, while weaker
gravity will slow down time less. Time
dilation that is caused by motion is called kinematic
time dilation, which is predicted by both Special Relativity theory and General
Relativity theory. However, the slowing
of time by gravity is called gravitational time dilation,
which is predicted only by General Relativity theory. This gravitational time dilation was considered outrageous a century ago, but this effect has
actually been observed in recent decades.
For example, suppose we place one atomic clock on the ground floor of a
building, and suppose we place another atomic clock on the roof of that
building. Even after synchronizing these
two atomic clocks, they do not remain synchronized! The atomic clock on the ground floor is
closer to the Earth and thus feels stronger gravity than the atomic clock on
the roof, which is further from the Earth and thus feels weaker gravity. Therefore, the atomic clock on the ground
floor will run slower and will lag further and further behind the atomic clock
on the roof! Is Einstein actually
claiming that whenever we walk upstairs or downstairs, our clocks are not
synchronized with everyone else’s clocks?
Yes! But then
why do we never notice in our daily experience that all of our clocks read
different times? The Earth’s gravity is
sufficiently weak that this gravitational time dilation is so tiny that we do
not notice it in our daily experience.
Even the Sun’s gravity causes only tiny amounts of this gravitational
time dilation. We would only notice
these temporal changes if we were subjected to
incredibly strong gravity, such as near a neutron star or a black hole. We will discuss black holes in detail
shortly. The implications of this
gravitational time dilation effect are staggering. For example, consider two identical twins who
have lived together on the second floor of a building their entire lives. Hence, they are the same age. However, if one of these twins walks
downstairs to the ground floor, that twin will age a tiny amount slower, since
that twin’s time is now running slower.
After walking back upstairs, that twin will now be a tiny amount younger
than the twin who remained on the second floor!
Our feet are younger than our head, since our feet are a little closer
to the Earth than our head, thus causing our feet to age more slowly! As we discussed, the satellites orbiting the
Earth all move at different speeds, resulting in kinematic time dilation. Moreover, all of the satellites orbiting the
Earth are at various distances from the Earth.
Hence, the satellites orbiting the Earth are subjected
to varying gravitational field strengths from the Earth. Our own mobile telephones are with us on the
surface of the Earth and therefore feel a stronger gravitational field strength
than all satellites in orbit. As a
result, all satellites as well as all of our mobile telephones suffer from
gravitational time dilation. Indeed, the
global positioning system (GPS) would not function correctly without taking
into account both kinematic time dilation and gravitational time dilation.
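The sizes of these effects can be estimated with a short calculation. The following is a rough sketch using standard weak-gravity approximations that these notes have not written out: the fractional difference in clock rates between two heights is taken to be g multiplied by the height difference divided by c squared, and the kinematic effect for a slowly moving clock is taken to be one half of v squared divided by c squared. The Earth constants, the building height, and the GPS orbital radius are assumed values, and the motion of the receiver on the rotating Earth is ignored.

```python
import math

C = 299_792_458.0          # vacuum speed of light, meters per second
GM_EARTH = 3.986004418e14  # Earth's gravitational parameter, m^3/s^2 (assumed value)
R_EARTH = 6.371e6          # mean radius of the Earth, meters (assumed value)
G_SURFACE = 9.8            # surface gravitational acceleration, m/s^2
SECONDS_PER_DAY = 86_400


def building_rate_difference(height_m):
    """Fraction by which a roof clock runs faster than a ground clock (weak-gravity estimate)."""
    return G_SURFACE * height_m / C**2


def gps_offsets_per_day(orbit_radius_m=2.656e7):
    """Seconds per day by which a satellite clock drifts relative to a ground clock.

    Returns (kinematic, gravitational): negative means the satellite clock
    runs slow, positive means it runs fast. Assumes a circular orbit.
    """
    orbital_speed = math.sqrt(GM_EARTH / orbit_radius_m)
    kinematic = -0.5 * (orbital_speed / C) ** 2 * SECONDS_PER_DAY
    gravitational = (GM_EARTH / C**2) * (1.0 / R_EARTH - 1.0 / orbit_radius_m) * SECONDS_PER_DAY
    return kinematic, gravitational


kinematic, gravitational = gps_offsets_per_day()
print(f"10-meter building: fractional rate difference {building_rate_difference(10.0):.2e}")
print(f"GPS kinematic drift: {kinematic * 1e6:+.1f} microseconds per day")
print(f"GPS gravitational drift: {gravitational * 1e6:+.1f} microseconds per day")
print(f"GPS net drift: {(kinematic + gravitational) * 1e6:+.1f} microseconds per day")
```

Under these assumptions the two clocks in the building disagree by only about one part in a million billion, which is why we never notice the effect. For a GPS satellite the gravitational effect is several times larger than the kinematic effect and has the opposite sign, so both must be included.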
Just as the vacuum speed of
light c is an invariant according to
Special Relativity theory, the vacuum speed of light c is still an invariant according to General Relativity
theory. Just as we deduced kinematic time
dilation from the invariance of the vacuum speed of light c in Special Relativity theory, we may deduce kinematic time
dilation from the invariance of the vacuum speed of light c in General Relativity theory as well.
Length contraction caused by motion is called
kinematic length contraction, in correspondence with kinematic time
dilation. Just as we deduced kinematic length
contraction from the invariance of the vacuum speed of light c in Special Relativity theory, we may
deduce kinematic length contraction from the invariance of the vacuum speed of
light c in General Relativity theory as
well. However, this effect is now even
worse. According to General Relativity
theory, an object does not even need to be moving for it to contract because
gravity itself causes length contraction!
In particular, stronger gravity will contract objects more, while weaker
gravity will contract objects less. Just
as the slowing of time by gravity is called
gravitational time dilation, the contraction of space by gravity is called
gravitational length contraction, which is predicted only by General Relativity
theory.
Consider light that is emitted from the roof of a building that propagates to
its ground floor. Because of
gravitational length contraction, the wavelength of the light must contract as
it approaches the ground floor, since the lower floors are closer to the Earth
where gravity is stronger. However, the
light must continue to propagate at the same speed, the vacuum speed of light c.
The speed of any wave with wavelength λ and frequency f is determined by the equation v = f
λ, where v is the speed (the
velocity) of propagation of the wave, as we discussed earlier in the
course. If the speed remains fixed and
if the wavelength is contracted, then the frequency must increase by an
appropriate amount to keep the product of the larger frequency f with the shorter wavelength λ
equal to a fixed speed v
(specifically c in this case). We may interpret this increased frequency as
a blueshift.
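The blueshift argument above, and the corresponding redshift argument for light climbing upward, can be illustrated with a short calculation based on v = f λ with v equal to c. This is only a sketch: it assumes the standard weak-gravity estimate that the wavelength changes by the fraction g multiplied by the height divided by c squared, which these notes have not written out, and the wavelength and building height are illustrative values.

```python
C = 299_792_458.0  # vacuum speed of light, meters per second
G_SURFACE = 9.8    # surface gravitational acceleration, m/s^2


def frequency_from_wavelength(wavelength_m):
    """Frequency of light from its wavelength, using c = f * wavelength."""
    return C / wavelength_m


def wavelength_after_descent(wavelength_m, height_fallen_m):
    """Wavelength after light descends by height_fallen_m (weak-gravity estimate).

    A positive height means the light moved downward (contracted wavelength);
    a negative height means the light climbed upward (stretched wavelength).
    """
    return wavelength_m * (1.0 - G_SURFACE * height_fallen_m / C**2)


emitted = 500e-9  # meters; green light (assumed)
down = wavelength_after_descent(emitted, 100.0)   # roof to ground floor, 100 meters
up = wavelength_after_descent(emitted, -100.0)    # ground floor to roof, 100 meters

print(f"fractional wavelength change going down: {(down - emitted) / emitted:.2e}")
print(f"frequency going down rises by {frequency_from_wavelength(down) - frequency_from_wavelength(emitted):.2f} hertz")
print(f"frequency going up falls by {frequency_from_wavelength(emitted) - frequency_from_wavelength(up):.2f} hertz")
```

In both cases the product of the frequency and the wavelength remains equal to c; only the individual values shift, and on the Earth the shift is roughly one part in a hundred trillion for a tall building.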
Conversely, consider light that is emitted from
the ground floor of a building that propagates to its roof. Because of gravitational length contraction,
the wavelength of the light must become less contracted (hence stretched) as it
approaches the roof, since the higher floors are further from the Earth where
gravity is weaker. However, the light
must continue to propagate at the same speed, the vacuum speed of light c.
Again, the speed of any wave with wavelength λ and frequency f is determined by the equation v = f
λ, where v is the speed (the
velocity) of propagation of the wave. If
the speed remains fixed and if the wavelength is less contracted (hence
stretched), then the frequency must decrease by an appropriate amount to keep
the product of the smaller frequency f
with the longer wavelength λ equal to a fixed speed v (specifically c in this
case). We may interpret this decreased
frequency as a redshift. As we discussed
earlier in the course, motion causes the Doppler-Fizeau
shift of any wave. We now
rename this Doppler-Fizeau shift the kinematic
redshift (as well as the kinematic blueshift). The kinematic redshift (and blueshift) is predicted by both
Special Relativity theory and General Relativity theory. However, we have just presented an argument
for the gravitational redshift (as
well as the gravitational blueshift), which is predicted only by General Relativity
theory. More precisely, light that
propagates from stronger gravitational fields toward weaker gravitational
fields suffers from a gravitational redshift, while light that propagates from
weaker gravitational fields toward stronger gravitational fields suffers from a
gravitational blueshift. This gravitational redshift (and
gravitational blueshift) has
actually been observed. When an
electron in an atom undergoes a transition from a higher-energy quantum state
to a lower-energy quantum state, it must emit a photon with a specific
frequency and a specific wavelength in accordance with the spectrum of the
atom, as we discussed earlier in the course.
If an atom on the ground floor of a building emits a
photon that propagates toward the roof of the building, anyone on the roof will
measure the frequency of that photon to be lower (or its wavelength to be
longer) than the photon emitted from the same transition in an identical atom
that happens to be located at the roof instead! Conversely, if an atom on
the roof of a building emits a photon that propagates toward the ground floor
of the building, anyone on the ground floor will measure the frequency of that
photon to be higher (or its wavelength to be shorter) than the photon emitted
from the same transition in an identical atom that happens to be located at the
ground floor instead! Of course,
the Earth’s gravity is sufficiently weak to cause only tiny amounts of this
gravitational redshift (and gravitational blueshift). Even the Sun’s gravity causes only tiny
amounts of this gravitational redshift (and gravitational blueshift). This gravitational redshift (and
gravitational blueshift) only becomes severe with
incredibly steep changes in gravity, such as near a neutron star or a black
hole. Again, we will discuss black holes
in detail shortly. Einstein’s General
Relativity theory actually predicts a third type of redshift caused by the
expansion of the universe called cosmological redshift, as we will discuss
toward the end of the course. In
summary, Einstein’s General Relativity theory predicts three different types of
redshift: kinematic redshift caused by motion, gravitational redshift caused by
the curvature of spacetime, and cosmological redshift
caused by the expansion of the universe.
All three of these redshifts have been observed
for several decades, providing further evidence of the validity of Einstein’s
General Theory of Relativity.
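As a rough numerical illustration of how tiny the gravitational redshift (and gravitational blueshift) is for a building on the Earth, the following sketch uses the standard weak-field approximation that the fractional frequency shift is roughly g h / c², where g is the Earth's surface gravitational acceleration and h is the height difference. This approximation is not derived in these notes and is stated here as an assumption. The building height and the emitted frequency are hypothetical values chosen only for illustration.

```python
# Weak-field estimate of the gravitational frequency shift of light that
# travels between the ground floor and the roof of a building.
# ASSUMPTION (not derived in these notes): delta_f / f is roughly g * h / c**2.

C = 299_792_458.0   # vacuum speed of light, meters per second
G_SURFACE = 9.81    # gravitational acceleration at Earth's surface, m/s^2


def fractional_shift(height_m: float) -> float:
    """Approximate fractional frequency shift over a height difference."""
    return G_SURFACE * height_m / C**2


def received_frequency(emitted_hz: float, height_m: float, upward: bool) -> float:
    """Frequency measured after light climbs (upward) or falls (downward)."""
    shift = fractional_shift(height_m)
    if upward:
        # Climbing toward weaker gravity: redshift, lower frequency.
        return emitted_hz * (1.0 - shift)
    # Falling toward stronger gravity: blueshift, higher frequency.
    return emitted_hz * (1.0 + shift)


def wavelength_m(frequency_hz: float) -> float:
    """Wavelength from v = f * lambda with the speed fixed at c."""
    return C / frequency_hz


if __name__ == "__main__":
    height = 100.0    # hypothetical building height in meters
    f_emit = 5.0e14   # hypothetical visible-light frequency in hertz
    f_roof = received_frequency(f_emit, height, upward=True)
    f_ground = received_frequency(f_emit, height, upward=False)
    print(f"fractional shift over {height} m: {fractional_shift(height):.2e}")
    print(f"frequency change at roof:   {f_roof - f_emit:+.2f} Hz")
    print(f"frequency change at ground: {f_ground - f_emit:+.2f} Hz")
```

For a building of this hypothetical height, the fractional shift is on the order of one part in one hundred trillion, which is why the Earth's gravity causes only tiny amounts of gravitational redshift.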
Just as the vacuum speed of
light c is the speed limit of the
universe according to Special Relativity theory, the vacuum speed of light c is still the speed limit of the
universe according to General Relativity theory. If c
is still the speed limit of the universe, then nothing can move faster than that
speed. We now realize that not even
gravity can move faster than c! In fact, Einstein’s General Relativity theory
states that gravity itself moves at the speed c, just as light moves at the speed c. The implications of this
are shocking. For example, suppose that
the Sun were abruptly removed from the Solar System
right now at this very moment. Since it
takes light roughly eight minutes to propagate from the Sun to the Earth, we
would continue to see the Sun shining in the sky for roughly eight minutes
after its removal. Then, we would see
the Sun removed. Furthermore, since
gravity also propagates at the same vacuum speed of light c, the Earth would continue moving along its elliptical orbit as if
the Sun still attracted it for roughly eight minutes after the Sun’s
removal! Then, the Earth would
gravitationally feel that the Sun has been removed and would finally move off of its elliptical orbit!
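The figure of roughly eight minutes can be checked with a short calculation. The following sketch simply divides the rough Sun-Earth distance quoted in these notes (one hundred and fifty million kilometers) by the vacuum speed of light. The function name is hypothetical and chosen only for illustration.

```python
# Travel time for light (and, according to General Relativity theory, for
# changes in gravity) to propagate from the Sun to the Earth.

C_KM_PER_S = 299_792.458        # vacuum speed of light, kilometers per second
SUN_EARTH_KM = 150_000_000.0    # rough Sun-Earth distance used in these notes


def travel_time_minutes(distance_km: float) -> float:
    """Time in minutes for a signal moving at c to cover distance_km."""
    seconds = distance_km / C_KM_PER_S
    return seconds / 60.0


if __name__ == "__main__":
    minutes = travel_time_minutes(SUN_EARTH_KM)
    print(f"Sun-to-Earth travel time at c: {minutes:.1f} minutes")
```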
As we discussed earlier in
the course, light is electromagnetic radiation or electromagnetic waves. More precisely, light is a propagating
disturbance through an electromagnetic field.
If gravity also moves at the same speed c, then there must be gravitational waves that are propagating
disturbances through a gravitational field.
According to Einstein’s General Relativity theory, gravity is actually
the curvature of our four-dimensional spacetime, and
hence this new theory predicts the existence of gravitational waves that are
propagating disturbances through the curvature of our four-dimensional spacetime. In the
year 1974, the American astrophysicists Russell Alan Hulse
and Joseph Hooton Taylor discovered a binary neutron star system. These two neutron stars are orbiting
sufficiently close to each other and orbiting sufficiently fast that they
should be radiating significant amounts of gravitational waves. As these two neutron stars radiate
gravitational waves, they must lose orbital energy, since energy must be conserved.
Therefore, these two neutron stars must approach each other. Indeed, Hulse and
Taylor measured that these two neutron stars are approaching each other by the
precise amount that Einstein’s General Relativity theory predicts. Hulse and Taylor
received the Nobel Prize for their achievement, and this binary neutron star
system was named the Hulse-Taylor
system in their honor. Nevertheless,
this is not a direct detection of the gravitational wave itself. A direct detection of gravitational waves
would require extraordinarily sensitive measurements of varying time dilation
and varying length contraction as the crests and the troughs of the
gravitational waves pass through the detector.
The technology to make such measurements was not
achieved until the year 2015, the one hundredth anniversary of
Einstein’s General Relativity theory!
Ever since that historic year, astronomers have directly detected
several gravitational waves passing through the Earth. Most of these gravitational waves that
astrophysicists have directly detected since the year 2015 were
radiated from the collision and merger of binary black holes into single
black holes in distant galaxies. This is
a splendid manifestation of Einstein’s genius.
His theory predicted the existence of gravitational waves, but it took
one century for technologies to be developed that could directly detect their
existence! Just as there is an entire
Electromagnetic Spectrum of different wavelengths or frequencies of
electromagnetic waves, there is an entire Gravitational Spectrum of different
wavelengths or frequencies of gravitational waves. Although astrophysicists have spent decades
observing the universe using electromagnetic waves from different bands across
the Electromagnetic Spectrum to gain a more complete understanding of our
universe, astrophysicists have just barely begun to observe the universe using
gravitational waves from different bands across the Gravitational
Spectrum. A completely new window has now been opened for astrophysicists to explore to gain
an even more complete understanding of our universe.
If a certain amount of mass
were concentrated into a single mathematical point of zero volume, then this
point-mass would have infinite density.
According to General Relativity theory, the gravity near this point-mass
would be incredibly strong, since the curvature of the four-dimensional spacetime near this point-mass would be incredibly
severe. The gravity would be so strong
because of the severe spacetime curvature that an
object too close to this point-mass would need to move faster than c to escape its gravity, but moving
faster than c is
forbidden. Mathematically, there
is a sphere surrounding this point-mass that marks the boundary of no
return. An object outside of this
mathematical sphere may have hope of escaping the gravity of the point-mass,
but an object that crosses inside of this mathematical sphere would have no
hope of escaping the incredibly strong gravity of the point-mass. The object’s light would not even be able to
escape from within the mathematical sphere.
Thus, it would appear as if the object disappeared from our universe, as
if it fell into a hole. This hole would
appear black, since light cannot escape from within the mathematical
sphere. So,
objects falling toward the infinite-density point-mass would appear as if they
are falling into a hole that is black.
For several decades, these fantastic objects have been
called black holes. The center of
the black hole is its singularity, the point-mass of infinite density. The mathematical sphere surrounding the
singularity is the event horizon, the boundary of no return. The radius of the event horizon is sometimes called the black hole radius but is more often
called the Schwarzschild radius, named for the German physicist Karl
Schwarzschild who mathematically derived the simplest black-hole solution to
Einstein’s General Relativity theory. Karl Schwarzschild derived the following equation for the
Schwarzschild radius (black hole radius) of the event horizon of a black hole: $r_s = 2GM/c^2$,
where $r_s$ is the Schwarzschild radius (black hole radius) of the event horizon, G is Newton’s gravitational constant of
the universe, M is the mass of the
black hole, and c is as usual the
vacuum speed of light. Using this
equation, we can easily calculate that the Schwarzschild radius (black hole
radius) of a typical stellar black hole born from the Type II supernova of a
very high mass star is very roughly eight kilometers. So, any object further than very roughly
eight kilometers from the singularity of such a black hole may have hope of
escaping its gravity, but any object closer than this distance from the
singularity of such a black hole has crossed the event horizon and has no hope
of escaping its gravity.
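The Schwarzschild radius equation above can be evaluated directly. The following sketch is a minimal illustration, assuming a hypothetical stellar black hole of three solar masses, since these notes do not state the mass of a typical stellar black hole at this point. The sketch also inverts the equation to find the mass that would give a Schwarzschild radius of exactly eight kilometers. The numerical constants are standard rounded values, and the function names are hypothetical.

```python
# Schwarzschild radius r_s = 2 G M / c**2 and its inverse.

G = 6.674e-11              # Newton's gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0          # vacuum speed of light, meters per second
SOLAR_MASS_KG = 1.989e30   # one solar mass in kilograms


def schwarzschild_radius_m(mass_kg: float) -> float:
    """Radius of the event horizon in meters for a given mass."""
    return 2.0 * G * mass_kg / C**2


def mass_for_radius_kg(radius_m: float) -> float:
    """Mass in kilograms whose Schwarzschild radius equals radius_m."""
    return radius_m * C**2 / (2.0 * G)


if __name__ == "__main__":
    assumed_mass = 3.0 * SOLAR_MASS_KG   # hypothetical stellar black hole
    radius_km = schwarzschild_radius_m(assumed_mass) / 1000.0
    print(f"r_s for 3 solar masses: {radius_km:.1f} km")

    solar_masses = mass_for_radius_kg(8.0e3) / SOLAR_MASS_KG
    print(f"mass for r_s = 8 km: {solar_masses:.2f} solar masses")
```

Since the Schwarzschild radius is directly proportional to the mass, doubling the assumed mass doubles the radius of the event horizon.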
Nothing can escape from
within the event horizon of a black hole, not even light. The event horizon of a black hole therefore
appears black, hence the name black hole.
Many students believe that outer space is also black, thus preventing us
from ever imaging the event horizon of a black hole against the surrounding
space. Although this would indeed be the
case for a completely isolated black hole, in actuality diffuse gas fills the
entire universe. Hence, it should be
possible to see the black event horizon of a black hole against the gases of
the surrounding space. Some black holes
have accretion disks around them, such as in X-ray binaries, and it should be
possible to see the black event horizon of such a black hole against the
surrounding accretion disk.
Unfortunately, the Schwarzschild radius of a typical stellar black hole
is very roughly eight kilometers, and no telescope is large enough to provide
sufficient angular resolution to image such a small size, even if the
black hole resided as near as within the solar neighborhood. However, there are supermassive black holes
in our universe, as we will discuss later in the course. The Schwarzschild radius of a typical
supermassive black hole is at least one million kilometers! As we discussed earlier in the course, two
telescopes on opposite sides of planet Earth used together as a single
interferometer would in principle have the same
resolving power as a single telescope the size of planet Earth. Using many radio telescopes working together
as a single interferometer, astronomers succeeded in the year 2019 in imaging
the event horizon of a supermassive black hole at the center of a distant
galaxy. Although this galaxy is roughly
sixteen megaparsecs (roughly fifty million
light-years) distant, the supermassive black hole at its center has a
Schwarzschild radius of roughly fifteen billion kilometers. This is roughly the size of our Solar System,
from the Sun all the way out to the Kuiper belt just beyond the orbit of
Neptune! In the radio images of this
supermassive black hole, we actually see its black event horizon against the
gases of the accretion disk that surrounds the supermassive black hole. In the year 2022, astronomers published a
radio image of the event horizon of the supermassive black hole at the center
of our own Milky Way Galaxy. Again, this image was produced by many radio telescopes working together
as a single interferometer.
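To see why a stellar black hole cannot be imaged while a supermassive black hole can, we may compare angular sizes. The following sketch is a rough estimate under several assumptions that are not stated in these notes: the resolving power of an interferometer is approximated by the observing wavelength divided by the baseline, the observing wavelength is taken to be 1.3 millimeters, the baseline is taken to be the diameter of the Earth, and the stellar black hole is placed at a hypothetical distance of ten parsecs. The Schwarzschild radii and the distance of sixteen megaparsecs are the rough values quoted above.

```python
import math

# Angular size of an event horizon compared with the approximate resolving
# power of an Earth-sized interferometer (small-angle approximation).

KM_PER_PARSEC = 3.0857e13
MICROARCSEC_PER_RADIAN = math.degrees(1.0) * 3600.0 * 1.0e6


def angular_diameter_uas(radius_km: float, distance_pc: float) -> float:
    """Angular diameter in microarcseconds of a sphere of given radius."""
    distance_km = distance_pc * KM_PER_PARSEC
    return 2.0 * radius_km / distance_km * MICROARCSEC_PER_RADIAN


def resolving_power_uas(wavelength_m: float, baseline_m: float) -> float:
    """ASSUMED approximation: smallest resolvable angle ~ wavelength / baseline."""
    return wavelength_m / baseline_m * MICROARCSEC_PER_RADIAN


if __name__ == "__main__":
    # Supermassive black hole: r_s ~ 15 billion km at ~16 megaparsecs.
    smbh = angular_diameter_uas(1.5e10, 16.0e6)
    # Stellar black hole: r_s ~ 8 km at a hypothetical 10 parsecs.
    stellar = angular_diameter_uas(8.0, 10.0)
    # Assumed 1.3 mm wavelength and Earth-diameter baseline (12,742 km).
    limit = resolving_power_uas(1.3e-3, 1.2742e7)

    print(f"supermassive horizon: {smbh:.1f} microarcseconds")
    print(f"stellar horizon:      {stellar:.4f} microarcseconds")
    print(f"Earth-sized baseline: {limit:.1f} microarcseconds")
```

Under these assumptions, the supermassive event horizon comes out within the same order of magnitude as the resolving power of an Earth-sized interferometer, while the stellar event horizon is smaller by several orders of magnitude.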
Although astronomers have
been certain for decades that black holes actually exist, the first black hole
ever discovered, the compact object in the X-ray binary Cygnus X-1, was not discovered until after Einstein died. In fact, Einstein himself did not believe
that these strange objects actually existed in our universe. Nevertheless, even while Einstein was alive,
physicists recognized an important application of Karl Schwarzschild’s
mathematical discovery of this point-mass solution (black hole solution) to
Einstein’s General Relativity theory.
The spacetime curvature (the gravity) outside of an isotropic (spherically
symmetric) distribution of mass is exactly the same as
if all of its mass were concentrated at its center. More plainly, the spacetime
curvature (the gravity) outside of a
spherical distribution of mass (beginning at its surface and extending outward)
would be the same spacetime curvature (the same
gravity) as a black hole of the same total mass placed at the center of the
spherical distribution. This statement is called the Birkhoff theorem,
named for the American mathematician George David Birkhoff
who first mathematically proved this important result. As an application of the Birkhoff
theorem, the Earth is a spherical distribution of mass to an excellent
approximation. Therefore, the gravity outside of the Earth (beginning at its
surface and extending outward) is nearly exactly equal to the gravity of a
black hole with the same mass as the Earth placed at the center of the
Earth. Students often ask for a
description of how the gravity of a black hole would feel if we were near the
black hole but still outside its event horizon.
According to the Birkhoff theorem, every moment
of our lives we feel nearly precisely the same gravity from the Earth as we
would feel from a black hole equal in mass to the Earth and placed at the
center of the Earth, roughly 6400 kilometers beneath our feet! As another application of the Birkhoff theorem, the Sun is also a spherical distribution
of mass to an excellent approximation.
Therefore, the gravity outside
of the Sun (beginning at its surface and extending outward) is nearly exactly
equal to the gravity of a black hole with the same mass as the Sun placed at
the center of the Sun. More plainly, the
gravity with which the Sun attracts the planets and everything else in the
Solar System is nearly exactly the same gravity as a black hole with the same
mass as the Sun placed at the center of the Sun. Although the Sun is a low mass star and will
never suffer from a supernova, imagine that the entire Sun were to suddenly
collapse into a black hole with no change in mass. Most students believe that its gravity would
now become so strong that it would begin to suck in the planets one by one,
beginning with Mercury then Venus then Earth and so on and so forth, but this
is false. According to the Birkhoff theorem, the Sun already creates nearly the same
gravity as a black hole of the same mass placed at its center. Therefore, virtually nothing would happen to
the orbits of the planets if the Sun were to collapse into a black hole. The planets would continue to orbit that black
hole with almost precisely the same orbits that they have always enjoyed! Of course, all of the planets would also
begin to cool, since they would no longer receive any light from this
hypothetically collapsed Sun. All of us
on Earth would eventually freeze to death, although we would die long before
then, since nearly all life on Earth depends entirely upon sunlight, as we
discussed earlier in the course.
Nevertheless, the orbits of the planets as well as the orbits of
everything else in the Solar System would remain almost
exactly the same. Note that the Birkhoff theorem is also true in Newton’s theory of
gravitation: the gravity outside of a
spherical distribution of mass (beginning at its surface and extending outward)
would be the same gravity as a point-mass of the same total mass placed at the
center of the spherical distribution.
For example, if we were to calculate the gravitational force between the
Earth and the Moon according to Newton’s theory of gravitation, we would use
Newton’s law of universal gravitation: $F = G m_1 m_2 / r^2$,
where $m_1$ and $m_2$ are the masses of the Earth and the Moon and r is the distance between the
Earth and the Moon, as we discussed earlier in the course. However, what value do we use for this
distance? After all, different parts of
the Earth are different distances from different parts of the Moon. However, both the Earth and the Moon are
spherical distributions of mass to an excellent approximation. Therefore, the gravity outside of each of them (beginning at their surfaces and extending
outward) is nearly exactly equal to the gravity of point-masses placed at their
centers, with the same masses as the Earth and the Moon of course. Hence, the Birkhoff
theorem reveals that we must always use the center-to-center distance whenever
calculating the gravitational force between the Earth and the Moon or between
almost any pair of objects in the entire universe.
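As a brief illustration of why the choice of distance matters, the following sketch evaluates Newton's law of universal gravitation for the Earth and the Moon twice: once with the center-to-center distance, and once with the surface-to-surface distance that a student might mistakenly use. The masses, radii, and average distance are standard rounded values that do not appear in these notes, and the function name is hypothetical.

```python
# Newton's law of universal gravitation, F = G * m1 * m2 / r**2,
# applied to the Earth and the Moon.

G = 6.674e-11               # Newton's gravitational constant, m^3 kg^-1 s^-2
EARTH_MASS_KG = 5.972e24
MOON_MASS_KG = 7.342e22
EARTH_RADIUS_M = 6.371e6
MOON_RADIUS_M = 1.737e6
CENTER_TO_CENTER_M = 3.844e8   # average Earth-Moon distance, center to center


def gravitational_force_n(m1_kg: float, m2_kg: float, r_m: float) -> float:
    """Gravitational force in newtons between two spherical masses."""
    return G * m1_kg * m2_kg / r_m**2


if __name__ == "__main__":
    correct = gravitational_force_n(EARTH_MASS_KG, MOON_MASS_KG, CENTER_TO_CENTER_M)

    # Incorrect choice: the distance between the two surfaces.
    surface_gap = CENTER_TO_CENTER_M - EARTH_RADIUS_M - MOON_RADIUS_M
    wrong = gravitational_force_n(EARTH_MASS_KG, MOON_MASS_KG, surface_gap)

    print(f"center-to-center:   {correct:.3e} N")
    print(f"surface-to-surface: {wrong:.3e} N")
    print(f"overestimate: {100.0 * (wrong / correct - 1.0):.1f} percent")
```

The surface-to-surface distance overestimates the force by a few percent, which is why the center-to-center distance must be used.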
Today, we know that black
holes do actually exist in our universe, and we also
know that black holes form when the core of a very high mass star is able to
overcome neutron degeneracy pressure, since its mass is greater than the Tolman-Oppenheimer-Volkoff
limit. If neutron degeneracy pressure
cannot halt the collapse of the core, then nothing can halt the collapse of the
core. The core continues collapsing all
the way down to a mathematical point, a black hole. This is the ultimate triumph of gravity. This reveals another interpretation of the
black hole radius (the Schwarzschild radius).
As we discussed, the equation for the black hole radius (the Schwarzschild
radius) is $r_s = 2GM/c^2$. Notice that the only variable that determines
the black hole radius (the Schwarzschild radius) according to this equation is
the mass M, since G (Newton’s gravitational constant of
the universe) and c (the vacuum speed
of light) and certainly 2 are all fixed numbers. Therefore, there is nothing stopping us from
calculating the Schwarzschild radius of any object in the universe, not just
black holes. For example, we can easily
calculate that the Schwarzschild radius of our Sun is roughly three kilometers. Many students protest this calculation, since
the Sun is not a black hole and will in fact never become a black hole, since
the Sun is a low mass star.
Nevertheless, there is nothing stopping us from using the mass of the
Sun in this Schwarzschild radius equation.
If the Sun is not a black hole and will in fact never become a black
hole, then how do we interpret this three-kilometer Schwarzschild radius for
the Sun? Firstly, we note that the
actual radius of the Sun (roughly seven hundred thousand kilometers) is much,
much larger than its Schwarzschild radius (roughly three kilometers). As a result, the Sun’s gravity is weak as compared to what the Sun’s
gravity would be if we could collapse it into a black
hole. Furthermore, this three-kilometer
Schwarzschild radius for the Sun would be the size we would need to crush the
Sun down into before its own self-gravity became strong enough to crush itself
all the way down into a black hole. In
other words, if we were to crush the entire mass of the Sun down to a radius of
less than three kilometers, the Sun would not be able to escape from its own
self-gravity, and the Sun would crush itself all the way down to a black
hole. The Earth’s Schwarzschild radius
is roughly nine millimeters. The actual
radius of the Earth is nearly 6400 kilometers, which is much, much larger than
nine millimeters. Hence, the Earth’s
gravity is weak as compared to what
the Earth’s gravity would be if we could collapse it into a black
hole. Furthermore, this nine-millimeter
Schwarzschild radius for the Earth would be the size we would need to crush the
Earth down into before its own self-gravity became strong enough to crush
itself all the way down into a black hole.
In other words, if we were to crush the entire mass of the Earth down to
a radius of less than nine millimeters, the Earth would not be able to escape
from its own self-gravity, and the Earth would crush itself all the way down to
a black hole. Note that nine millimeters
is almost ten millimeters, which is equal to one centimeter. This would be a Schwarzschild diameter of roughly two centimeters,
which is roughly one inch since one inch is exactly 2.54 centimeters. Therefore, we would have to crush the entire
mass of the Earth down to a size of roughly one inch to turn it into a black
hole! The Schwarzschild radius of a
typical human is ten billion times smaller than the nucleus of an atom! In other words, we would have to crush our
bodies down to this size before our own self-gravity becomes strong enough to
crush our bodies all the way down into a black hole!
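The comparisons in this paragraph can be reproduced with a few lines of arithmetic. The following sketch computes the Schwarzschild radius of the Sun, the Earth, and a human, and then the factor by which each would need to be compressed. The masses are standard rounded values, while the human mass of seventy kilograms and the nuclear size of one femtometer are hypothetical round numbers assumed only for illustration.

```python
# How far must an object be crushed before it becomes a black hole?

G = 6.674e-11          # Newton's gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0      # vacuum speed of light, meters per second


def schwarzschild_radius_m(mass_kg: float) -> float:
    """Schwarzschild radius r_s = 2 G M / c**2 in meters."""
    return 2.0 * G * mass_kg / C**2


def compression_factor(mass_kg: float, actual_radius_m: float) -> float:
    """Ratio of an object's actual radius to its Schwarzschild radius."""
    return actual_radius_m / schwarzschild_radius_m(mass_kg)


# (name, mass in kilograms, actual radius in meters)
BODIES = [
    ("Sun", 1.989e30, 7.0e8),
    ("Earth", 5.972e24, 6.4e6),
]

HUMAN_MASS_KG = 70.0       # assumed mass of a typical human
NUCLEUS_SIZE_M = 1.0e-15   # assumed rough size of an atomic nucleus

if __name__ == "__main__":
    for name, mass, radius in BODIES:
        r_s = schwarzschild_radius_m(mass)
        factor = compression_factor(mass, radius)
        print(f"{name}: r_s = {r_s:.3e} m, compression factor = {factor:.1e}")

    r_human = schwarzschild_radius_m(HUMAN_MASS_KG)
    print(f"human: r_s = {r_human:.2e} m")
    print(f"nucleus size / human r_s = {NUCLEUS_SIZE_M / r_human:.1e}")
```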
Einstein was
ridiculed for Special Relativity theory, and he was even more harshly
ridiculed for General Relativity theory, but it would only be a few years
after he proposed General Relativity theory that his theories would be
successfully tested. For example,
although the orbits of the planets around the Sun are ellipses, those
elliptical orbits do not remain fixed. A
planet’s elliptical orbit around the Sun actually suffers a very slow orbital
precession. This orbital precession is caused mostly by the gravitational tugs of the other
planets, primarily Jupiter as we discussed earlier in the course. However, Mercury’s orbit has a very tiny
anomalous orbital precession that could not be explained
using Newton’s theory of gravity. The
amount of Mercury’s anomalous orbital precession is roughly forty-three arcseconds per century.
This is a fantastically tiny orbital shift. According to General Relativity theory, the
curvature of spacetime caused by the Sun’s gravity
causes a planet’s orbit to suffer a tiny orbital precession in addition to the
orbital precession caused by the gravitational tugs from the other planets,
primarily Jupiter. This tiny extra
precession is called general-relativistic orbital
precession. When Einstein calculated the
amount of this general-relativistic orbital precession for Mercury’s orbit
caused by the spacetime curvature of the Sun’s
gravity, he obtained exactly forty-three arcseconds
per century! Although this superb
achievement convinced Einstein that his General Relativity theory was superior
to Newton’s theory of gravity, most physicists were still not convinced, and so
Einstein proposed the following experiment.
According to his General Relativity theory, everything in the universe
feels gravity, including light itself.
More correctly, the curvature (the gravity) of our four-dimensional spacetime deflects the trajectory of everything, including
light itself. More plainly, light falls
in gravity just as everything else falls in gravity. In recent decades, astronomers have observed
that the gravity of galactic clusters bends the light from distant galaxies,
thus distorting the image of the distant galaxies. Since this is rather like the glass of a lens
bending light, the gravity of a galactic cluster is called
a gravitational lens. If the distant
galaxy, the gravitational lens, and our own Milky Way Galaxy happen to form a
nearly straight line, the light of the distant galaxy bends into the shape of a
ring around the galactic cluster. This is called an Einstein ring.
If the gravitational lens happens to be slightly displaced from the line
connecting the distant galaxy and our own Milky Way Galaxy, the light of the
distant galaxy bends into two duplicate images in the shape of arcs around the
galactic cluster. These are called Einstein arcs.
If the gravitational lens happens to be even more displaced from the
line connecting the distant galaxy and our own Milky Way Galaxy, the light of
the distant galaxy bends into four duplicate images around the galactic
cluster. This is
called an Einstein cross. The
amount by which light is deflected by the Earth’s weak
gravity is fantastically tiny. This is
why we do not notice light falling downward in our daily experience. The Sun’s gravity is stronger than the
Earth’s gravity, but even the Sun’s gravity is so weak that no one ever noticed
the deflection of light around the Sun before Einstein. Using his General Relativity theory, Einstein
calculated that light should be deflected by roughly
1.75″ (1.75 arcseconds or 1.75 seconds of arc)
around the surface of the Sun. Although
this is an incredibly small angle, it was measurable a century ago. However, we cannot see any stars in the daytime,
besides the Sun of course! So, measuring the deflection of starlight around the surface
of the Sun seemed hopeless. However,
during the totality of a total solar eclipse, the sky becomes sufficiently dark
that stars become visible, as we discussed earlier in the course. A total solar eclipse was scheduled to occur
in the year 1919, and many physicists gathered for this eclipse for the purpose of proving Einstein wrong. When totality occurred, starlight was indeed
deflected around the surface of the Sun, and astronomers measured a
deflection consistent with 1.75″ (1.75 arcseconds or 1.75
seconds of arc), in agreement with Einstein’s General Relativity
theory to within the measurement uncertainties! Practically overnight, Einstein
went from being ridiculed to being considered one of
the most brilliant men, if not the most brilliant man, who ever lived. A mediocre physicist who struggled with
mathematics had discovered correct theories of our universe using only the
power of his own genius.
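Both of the numbers quoted above, roughly forty-three arcseconds per century for Mercury and roughly 1.75 arcseconds for starlight grazing the Sun, can be reproduced numerically. The following sketch uses two standard results of General Relativity theory that are not derived in these notes and are therefore stated here as assumptions: the extra precession per orbit is 6πGM / (c² a (1 − e²)), and the deflection of light grazing a mass is 4GM / (c² R). The orbital elements of Mercury and the solar radius are standard rounded values, and the function names are hypothetical.

```python
import math

# Two classic tests of General Relativity theory, using ASSUMED standard
# formulas that are not derived in these notes.

G = 6.674e-11              # Newton's gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0          # vacuum speed of light, meters per second
SOLAR_MASS_KG = 1.989e30
SOLAR_RADIUS_M = 6.957e8
ARCSEC_PER_RADIAN = math.degrees(1.0) * 3600.0
DAYS_PER_CENTURY = 36525.0


def precession_arcsec_per_century(
    semi_major_axis_m: float, eccentricity: float, period_days: float
) -> float:
    """General-relativistic orbital precession around the Sun."""
    per_orbit_rad = (
        6.0 * math.pi * G * SOLAR_MASS_KG
        / (C**2 * semi_major_axis_m * (1.0 - eccentricity**2))
    )
    orbits_per_century = DAYS_PER_CENTURY / period_days
    return per_orbit_rad * orbits_per_century * ARCSEC_PER_RADIAN


def grazing_deflection_arcsec(mass_kg: float, radius_m: float) -> float:
    """Deflection of light that just grazes the surface of a spherical mass."""
    return 4.0 * G * mass_kg / (C**2 * radius_m) * ARCSEC_PER_RADIAN


if __name__ == "__main__":
    # Mercury: semi-major axis, eccentricity, orbital period in days.
    mercury = precession_arcsec_per_century(5.791e10, 0.2056, 87.969)
    sun = grazing_deflection_arcsec(SOLAR_MASS_KG, SOLAR_RADIUS_M)
    print(f"Mercury precession: {mercury:.1f} arcseconds per century")
    print(f"deflection at the Sun's surface: {sun:.2f} arcseconds")
```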
Einstein once claimed that
when he studied physics, “I want to know how God created this world … I want to know
His thoughts.” In other words, Einstein
believed that God not only created the universe, but God also authored the
mathematical equations that describe the universe, the laws of physics. Moreover, Einstein believed that God authored
a single ultimate mathematical equation that completely describes the
universe. If the universe is described by a single ultimate mathematical equation,
Einstein believed that that equation should be deducible from pure logic, from
pure mathematics. Einstein once claimed
that when he studied physics he wanted to know “whether God had any choice
in the creation of the world.” In other
words, Einstein believed that if the ultimate mathematical equation that
describes the universe is deducible from pure logic, from pure mathematics,
then the laws of the universe could not possibly be different from what they
actually are. Einstein also believed
that this ultimate equation must be mathematically beautiful, just as he
believed his own General Relativity theory was mathematically beautiful. In fact, Einstein claimed that if the
deflection of starlight around the Sun was not as his
theory predicted, then he would have “pitied the Lord because it would have
proven that He did not create the universe correctly.” Although this quotation seems to imply that
Einstein was so arrogant that he believed himself to be more intelligent than
God, this quotation actually reveals that Einstein was
humbled by the mathematical beauty of the universe that God created. This is summarized by another quotation by
Einstein, “Subtle is the Lord, but malicious He is not.” In other words, the laws of physics that
describe the universe may not be obvious and thus may require a genius to
discover them, but God is not evil and hence God would not create a universe
that was so complicated that humans would not be able to discover the
mathematical equations that describe it.
On the other hand, Einstein also once said, “The most incomprehensible
thing about the universe is that it is comprehensible.” In other words, why did God decide to create
the universe governed by beautiful mathematical equations? We could ask this question the other way
around. What is it about
the human mind that it is able to not only study and to understand the universe
but beyond this to actually discover the mathematical equations that describe
the universe? What is it about a genius
such as Albert Einstein that he is able to capture the mind of God, to actually discover the mathematical equations that God
authored when He created the universe?
Einstein spent the last few decades of his life living in New Jersey
trying to discover the ultimate theory of the universe that he believed God authored
when He created the universe. Today, we
would call such a theory a Super Unification Theory or a Theory of Everything,
as we will discuss toward the end of the course. Einstein did not succeed in his quest, but
other physicists in the decades after Einstein have succeeded in bringing us
much closer to this ultimate theory than Einstein could have ever dreamt, as we
will also discuss toward the end of the course.
Nevertheless, many physicists agree that no other person single-handedly
advanced our understanding of the universe more than Albert Einstein.
This webpage was most recently modified on Wednesday, the twentieth day of November, anno Domini MMXXIV, at 03:45 ante meridiem EST.