This is one of the webpages of Libarid A. Maljian at the Department of Physics at CSLA at NJIT.
New Jersey Institute of Technology
College of Science and Liberal Arts
Department of Physics
Introductory Astronomy and Cosmology
Spring 2023
Third Examination lecture notes
Our Star, the Sun
The Sun is a star. We know more about the Sun than any other
star, since it is by far the closest star to the Earth. The Sun is roughly one hundred and fifty
million kilometers from the Earth. This
seems distant by human standards, but in fact this is
extremely close by astronomical standards.
The nearest stars besides the Sun are more than two hundred thousand
times farther from the Earth than the Sun is. Therefore, the Sun is indeed extremely close
to the Earth by astronomical standards, enabling astrophysicists to learn much
more about the Sun than about any other star in the universe.
Astrophysicists use the
symbol R☉ to
denote the radius of the Sun, as we discussed earlier in the course. In fact, the radius of the Sun R☉ is such a fundamental unit in stellar astrophysics
that it is called a solar radius. The Sun is enormous; one solar radius R☉ is roughly equal to seven hundred thousand
kilometers. This is roughly one hundred
times the Earth’s radius. In other
words, one solar radius R☉ is
roughly equal to 100R⊕, where
astrophysicists use the symbol R⊕ to
denote the radius of the Earth, as we discussed earlier in the course. Because one solar radius R☉ is roughly equal to 100R⊕, the Sun
has a volume roughly one million times the Earth’s volume: the volume of
a sphere is directly proportional to the cube of its radius, and one hundred
cubed is one million. In other words, we
could fit roughly one million Earths inside the Sun! Astrophysicists use the symbol M☉ to denote the mass of the Sun, as we discussed
earlier in the course. In fact, the mass
of the Sun M☉ is such
a fundamental unit in stellar astrophysics that it is called
a solar mass. The mass of the Sun is
tremendous; one solar mass M☉ is
roughly one thousand times the mass of Jupiter, which is itself more massive
than everything else orbiting the Sun combined. Therefore, one solar mass M☉ is many hundreds of times the mass of the rest of
the Solar System combined. More
precisely, one solar mass M☉ is
roughly two nonillion kilograms. (Please
refer to the following multiplication table, where each number is one thousand
times the previous number: one, one thousand, one million, one billion, one
trillion, one quadrillion, one quintillion, one sextillion, one septillion, one
octillion, one nonillion, one decillion.
Caution: this multiplication table is only correct in American
English. These same words are used for different numbers in British English.) Astrophysicists have determined the mass of
the Sun using Kepler’s third law. There
are eight planets and millions of asteroids and millions of comets orbiting the
Sun, and astrophysicists use their orbital parameters together with Kepler’s
third law to determine the mass of the Sun.
Even though all these different objects have completely different
orbits, Kepler’s third law always yields the same result for the mass of the
Sun. We can combine the distance to the
Sun with the intensity of sunlight we receive from the Sun to calculate the
luminosity of the Sun. The luminosity of
any object is the total amount of energy it radiates every second, commonly
known as the power output. The
luminosity or the power output of any object is measured
in watts. Astrophysicists use cursive
(script) ℒ for luminosity. For any object
with luminosity ℒ, the intensity of the light I
at a distance r from the object is
given by the equation I = ℒ / (4πr²). This equation is true because the object
radiates energy isotropically (equally in all
directions). Thus, the total energy
radiated by the object cuts through a sphere centered on the object, and the
surface area of a sphere of radius r
is 4πr². This equation also reveals why a lightbulb,
for example, looks brighter when closer and dimmer when farther away. Doesn’t the
lightbulb radiate a constant luminosity (constant power output) regardless of
distance? Indeed it does, but that same
luminosity has spread out over a large sphere if we are far from the lightbulb. Hence, that constant luminosity is diluted over the large sphere, and thus a smaller
fraction of that luminosity enters our eye.
Conversely, that same luminosity is concentrated over a small sphere if
we are close to the lightbulb, and thus a larger fraction of that luminosity
enters our eye. We know our distance
from the Sun, and we know the intensity of sunlight at our distance from the
Sun. Thus, the only unknown remaining in
the equation I = ℒ / (4πr²) is
the luminosity of the Sun.
Astrophysicists use the symbol ℒ☉ to denote the luminosity of the Sun. In fact, the luminosity of the Sun ℒ☉
is such a fundamental unit in stellar astrophysics that it is
called a solar luminosity. The
luminosity of the Sun is enormous; one solar luminosity ℒ☉
is roughly four hundred septillion watts.
(Again, please refer to the above multiplication table.) The Sun has been radiating roughly four
hundred septillion joules of energy every second for roughly five billion years, and it
will continue to do so for roughly the next five billion
years! This raises the following question:
what is the source of this incredible luminosity? More plainly, why does the Sun shine? We will reveal the answer to this question
shortly.
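The two calculations described above (the Sun's mass from Kepler's third law and the Sun's luminosity from the intensity of sunlight) can be sketched in a few lines of Python. This is only an illustrative sketch: the gravitational constant, the intensity of sunlight at the Earth (about 1361 watts per square meter), and Jupiter's orbital figures are assumed standard values that these notes do not state, and the function names are hypothetical.

```python
import math

# Assumed standard values (not stated in these notes).
G = 6.674e-11            # gravitational constant, N m^2 / kg^2
SOLAR_CONSTANT = 1361.0  # intensity of sunlight at the Earth, W / m^2

# Rounded values for the Earth's orbit.
AU = 1.496e11            # Earth-Sun distance in meters (about 150 million km)
YEAR = 3.156e7           # one year in seconds


def solar_mass_from_orbit(semi_major_axis_m, period_s):
    """Newton's form of Kepler's third law: M = 4 pi^2 a^3 / (G P^2)."""
    return 4.0 * math.pi**2 * semi_major_axis_m**3 / (G * period_s**2)


def luminosity_from_intensity(intensity, distance_m):
    """Invert I = L / (4 pi r^2) to obtain L = 4 pi r^2 I."""
    return 4.0 * math.pi * distance_m**2 * intensity


# Two completely different orbits should give the same mass for the Sun.
mass_from_earth = solar_mass_from_orbit(AU, YEAR)
# Jupiter (assumed figures): about 5.204 times farther out, period about 11.86 years.
mass_from_jupiter = solar_mass_from_orbit(5.204 * AU, 11.86 * YEAR)

solar_luminosity = luminosity_from_intensity(SOLAR_CONSTANT, AU)

print(f"Solar mass from the Earth's orbit:  {mass_from_earth:.2e} kg")
print(f"Solar mass from Jupiter's orbit:    {mass_from_jupiter:.2e} kg")
print(f"Solar luminosity:                   {solar_luminosity:.2e} W")
```

Both orbits give close to two nonillion kilograms, and the luminosity comes out close to four hundred septillion watts, in agreement with the rounded figures quoted above.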
The surface temperature of
the Sun is roughly six thousand kelvins.
Astrophysicists have determined the Sun’s surface temperature using two
different methods. Firstly, we can graph
the amount of light from the Sun as a function of the wavelength of the
light. The resulting graph is a continuous
blackbody spectrum, although there is an absorption spectrum superimposed upon
that continuous blackbody spectrum as we will discuss
shortly. From the peak of this
continuous blackbody spectrum, we can calculate the surface temperature of the
Sun. Essentially, we are calculating the
surface temperature of the Sun from its color.
As we discussed earlier in the course, the amount of energy radiated
from a hot, dense object often follows the blackbody spectrum, which is a
continuous spectrum with its peak radiation within a band of the
Electromagnetic Spectrum determined by the temperature of the object. In particular, hotter temperatures correspond
to higher photon energies (which are also at higher frequencies and shorter
wavelengths), while cooler temperatures correspond to lower photon energies
(which are also at lower frequencies and longer wavelengths). In other words, a hot, dense object’s primary
radiation is displaced as its temperature
changes. This is the Wien displacement
law. More precisely, the Wien
displacement law states that the wavelength of a hot, dense object’s primary
radiation is inversely proportional to its temperature, assuming we measure
temperature on an absolute scale, in units such as kelvins or degrees Rankine. At one or two thousand kelvins, objects
radiate primarily red visible light. At
three or four thousand kelvins, objects radiate primarily orange visible
light. At five or six thousand kelvins,
objects radiate primarily yellow visible light.
At roughly ten thousand kelvins, objects radiate primarily blue visible
light. Notice how hotter temperatures
displace the primary radiation to higher and higher photon energies (which are
also higher and higher frequencies and shorter and shorter wavelengths), while
cooler temperatures displace the primary radiation to
lower and lower photon energies (which are also lower and lower frequencies and
longer and longer wavelengths). It is commonly known that the Sun is a yellow star. For example, every young child will use a
yellow crayon when asked to draw the Sun.
From that yellow color, we can use the Wien displacement law to
calculate that the surface temperature of the Sun is roughly six thousand
kelvins. We can also calculate the
surface temperature of any hot, dense object using the Stefan-Boltzmann law,
which states that the luminosity of any hot, dense object is directly
proportional to both its surface area and the fourth power of its surface
temperature. Since the shape of the Sun
is very nearly a sphere and the surface area of a sphere of radius R is 4πR², the
Stefan-Boltzmann law for the Sun states ℒ = σ(4πR²)T⁴, where T is
the surface temperature. Also, σ (the
lowercase Greek letter sigma) is a fixed number called the Stefan-Boltzmann
constant. Warning: we use lowercase r for the distance from the hot object,
and we use uppercase (capital) R for
the actual radius of the hot object. In
particular for the Sun, r is roughly
one hundred and fifty million kilometers (our distance from the Sun), while R is roughly seven hundred thousand
kilometers (the actual radius of the Sun).
We already determined the luminosity of the Sun, and we certainly know
the radius of the Sun. Therefore, the
only unknown remaining in the Stefan-Boltzmann law ℒ = σ(4πR²)T⁴ is the surface temperature of the Sun, which we again
calculate to be roughly six thousand kelvins, consistent with the
Wien-displacement method.
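As a rough check of the Stefan-Boltzmann method described above, we can solve the law for T using the rounded luminosity and radius from these notes. The numerical values of the Stefan-Boltzmann constant and the Wien displacement constant are assumed standard values that these notes do not state, and the function names are hypothetical; treat this as a sketch rather than a precise determination.

```python
import math

# Assumed standard constants (not stated in these notes).
SIGMA = 5.670e-8    # Stefan-Boltzmann constant, W / (m^2 K^4)
WIEN_B = 2.898e-3   # Wien displacement constant, m K

# Rounded solar values from these notes.
L_SUN = 4.0e26      # luminosity in watts
R_SUN = 7.0e8       # radius in meters (seven hundred thousand kilometers)


def surface_temperature(luminosity, radius):
    """Solve L = sigma * (4 pi R^2) * T^4 for T."""
    surface_area = 4.0 * math.pi * radius**2
    return (luminosity / (SIGMA * surface_area)) ** 0.25


def peak_wavelength(temperature):
    """Wien displacement law: the peak wavelength is inversely proportional to T."""
    return WIEN_B / temperature


t_surface = surface_temperature(L_SUN, R_SUN)
peak = peak_wavelength(t_surface)

print(f"Surface temperature: {t_surface:.0f} K")
print(f"Peak wavelength:     {peak * 1e9:.0f} nm")
```

The temperature comes out a little under six thousand kelvins, and the corresponding peak wavelength of roughly five hundred nanometers falls within the visible band, so the two methods are consistent with each other.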
From the absorption spectral
lines (the Fraunhofer lines, as we discussed earlier
in the course) superimposed upon the Sun’s continuous blackbody spectrum, we
can determine the composition of the Sun.
We discover that the Sun is composed of all the atoms on the Periodic
Table of Elements, but not in equal amounts.
Only two atoms account for close to one hundred percent of the Sun’s
mass; all the other atoms on the Periodic Table of Elements account for only a
tiny fraction (tiny percentage) of the Sun’s mass. Roughly seventy-five percent (three-quarters)
of the Sun’s mass is hydrogen, and roughly twenty-five percent (one-quarter) of
the Sun’s mass is helium. Again, all the
other atoms on the Periodic Table of Elements make up a tiny fraction (tiny
percentage) of the Sun’s mass.
The Sun radiates roughly four
hundred septillion joules of energy every second (four hundred septillion watts).
The Sun has been radiating this tremendous luminosity for roughly five
billion years, and it will continue to do so for the next roughly five billion
years. What is the source of this
incredible luminosity? More plainly, why
does the Sun shine? This question was
one of the great scientific debates of the 1800s (the
nineteenth century). Chemical reactions
provide nowhere near enough energy to account for the Sun’s luminosity over
its long lifetime. It is not difficult
to calculate that the Sun would consume all of its mass in only several
thousand years if it derived its luminosity from chemical reactions, but the
Sun has been shining for roughly five billion years. Gravitational contraction is not the Sun’s
energy source either. Although
gravitational contraction does convert gravitational energy into heat and
light, it is not difficult to calculate that the Sun would need to collapse in
several million years to account for its incredible luminosity. Although this several-million-year lifespan
is an improvement over the several-thousand-year lifespan chemical reactions
could provide, it is nevertheless still nowhere near the Sun’s actual lifetime,
which is in the billions of years.
Gravitational contraction is also called
Kelvin-Helmholtz contraction, named for the British physicist William Thomson
(Lord Kelvin) and the German physicist Hermann von Helmholtz, the two physicists
who developed the mathematical details of gravitational contraction. Note that while the Sun was being born as a
collapsing cloud of gas from within a diffuse nebula, it did derive its energy
from Kelvin-Helmholtz (gravitational) contraction as we will discuss shortly,
and indeed the Sun collapsed in only several million years while it was being
born. However, the Sun eventually attained
gravitational equilibrium, meaning outward pressure balances inward
self-gravity. The Sun has been in
gravitational equilibrium for roughly five billion years, and so
Kelvin-Helmholtz (gravitational) contraction does not explain why the Sun has
been shining for most of its lifetime.
The 1800s (the nineteenth century) ended with
this fundamental question unanswered.
Why does the Sun shine? At the
beginning of the 1900s (the twentieth century), the
atomic theory of matter became firmly established. Moreover, physicists discovered that atoms
are composed of even smaller particles: the nucleus at the center of the atom
and electrons around the nucleus.
Physicists discovered that chemical reactions involve the electrons
around the nucleus, but physicists also discovered nuclear reactions, which
involve the nuclei themselves. These
nuclear reactions can generate thousands, even millions, of times more energy
than chemical reactions. Perhaps the Sun
derives its energy from nuclear reactions.
Before we explore this idea further, we must discuss some fundamental
physics.
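The two lifetime estimates quoted above, for chemical reactions and for gravitational contraction, can be sketched as follows. The chemical energy per kilogram (comparable to burning coal) and the factor of 3/10 in the contraction energy (a uniform-density sphere radiating half of its gravitational energy) are assumptions made only for this illustration; they are not taken from these notes, and different reasonable choices change the answers by a factor of a few.

```python
# Rounded solar values from these notes.
M_SUN = 2.0e30            # mass in kilograms
R_SUN = 7.0e8             # radius in meters
L_SUN = 4.0e26            # luminosity in watts

# Assumed values (not stated in these notes).
G = 6.674e-11                   # gravitational constant, N m^2 / kg^2
SECONDS_PER_YEAR = 3.156e7
CHEMICAL_ENERGY_PER_KG = 3.0e7  # joules per kilogram, comparable to burning coal


def chemical_lifetime_years():
    """Lifetime if the entire solar mass were chemical fuel."""
    total_energy = M_SUN * CHEMICAL_ENERGY_PER_KG
    return total_energy / L_SUN / SECONDS_PER_YEAR


def contraction_lifetime_years():
    """Kelvin-Helmholtz estimate: radiated energy taken as (3/10) G M^2 / R."""
    total_energy = 0.3 * G * M_SUN**2 / R_SUN
    return total_energy / L_SUN / SECONDS_PER_YEAR


print(f"Chemical lifetime:    {chemical_lifetime_years():.1e} years")
print(f"Contraction lifetime: {contraction_lifetime_years():.1e} years")
```

Under these assumptions the chemical lifetime is several thousand years and the contraction lifetime is of the order of ten million years; both fall far short of the billions of years the Sun has actually been shining.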
There are four fundamental
forces in the universe. Starting with
the strongest force in the universe, we have the strong nuclear force, the
electromagnetic force, the weak nuclear force, and finally the gravitational
force is the weakest force in the universe.
Actually, the gravitational force is by far the
weakest force in the entire universe.
The gravitational force is many orders of magnitude weaker than the other
three forces. All of us have some
familiarity with gravity. As we
discussed earlier in the course, gravity causes everything in the universe to
attract everything else in the universe.
All of us also have some familiarity with the electromagnetic
force. As we discussed earlier in the
course, there are both positive and negative electrical charges in our
universe. Positive and positive repel,
negative and negative repel, and positive and negative attract. In other words, like charges repel, and
unlike charges attract. Protons are positively charged, while electrons are negatively
charged. Since unlike charges attract,
the positive protons within the atomic nucleus attract the negative electrons
around the atomic nucleus. This is what
holds the atom together, the attraction between the positive protons within the
nucleus and the negative electrons around the nucleus. However, what holds the nucleus of an atom
together? The atomic nucleus is composed
of protons and neutrons. The neutrons
are neutral; this is why they are called
neutrons! Since neutrons are neutral,
they are not attracted to or repelled from anything electromagnetically. More importantly, the protons are
positive. Hence, they repel each other
electromagnetically. What holds the
atomic nucleus together if the neutrons feel no electromagnetic attraction and
all the protons feel electromagnetic repulsion from each other? We deduce that there must
be another force in the nucleus that is stronger than the electromagnetic force
so that it can overpower the electromagnetic repulsion of the protons,
thus holding the nucleus together. This
force is the strongest force in the entire universe, and it is
called the strong nuclear force.
The strong nuclear force must be stronger than the electromagnetic
force, since the strong nuclear force must overpower the electromagnetic
repulsion among the protons to hold the atomic nucleus together. The strong nuclear force attracts protons and
protons together, the strong nuclear force attracts neutrons and neutrons
together, and the strong nuclear force even attracts protons and neutrons
together. Protons and neutrons are
composed of even smaller particles called quarks, and the strong nuclear force is also responsible for holding quarks together to build
protons and neutrons. If the strong nuclear
force is this powerful, why don’t all the protons and
neutrons in the universe attract each other to form one giant nucleus? This does not occur because the strong
nuclear force has a limited range. The
gravitational force does not have a limited range.
Regardless of how close or how distant two objects are from one another,
they will attract each other gravitationally with a strength that depends upon
their masses and the distance between them.
The electromagnetic force also does not have a limited range. Regardless of how close or how distant two
charged objects are from one another, they will attract or
repel each other electromagnetically with a strength that depends upon their
charges and the distance between them.
However, the strong nuclear force does have a limited range. Protons and neutrons will not feel the strong
nuclear force if they are farther apart than a certain
limited range. The range of the strong
nuclear force is roughly the size of the nucleus of an atom, which is a few
quadrillionths of a meter or a few trillionths of a millimeter or a few
billionths of a micrometer. Hence, the
strong nuclear force does overpower the electromagnetic force within the
nucleus of an atom, but the strong nuclear force vanishes outside of the
nucleus of an atom. Hence, the limited
range of the strong nuclear force prevents all the protons and neutrons in the
universe from attracting each other to form a giant nucleus. The limited range of the strong nuclear force
only permits this force to hold quarks together within protons and neutrons and
to hold protons and neutrons together within the nucleus of an atom. The weak nuclear force is responsible for
certain weak nuclear reactions, hence its name.
It also has a limited range, like the strong nuclear force.
The incredible strength of
the strong nuclear force reveals why nuclear reactions generate so much more
energy than chemical reactions. We will
focus on two particular types of nuclear reactions: nuclear fission reactions
and nuclear fusion reactions. A nuclear
fission reaction is the splitting of a larger, more massive (or heavier)
nucleus into smaller, less massive (or lighter) nuclei. In fact, to fission anything means to split
it in colloquial English. A nuclear
fusion reaction is the merging of two smaller, less massive (or lighter) nuclei
into a larger, more massive (or heavier) nucleus. In fact, to fuse things means to merge them
together in colloquial English. Nuclear
fission reactions generate millions of times more energy per kilogram of fuel than chemical
reactions, and nuclear fusion reactions generate several times more energy per kilogram
than nuclear fission reactions, meaning that nuclear fusion reactions also generate
millions of times more energy than chemical reactions. To initiate a nuclear fission reaction, we
must fire a particle that will collide with a more massive (heavier) nucleus;
the collision causes the nucleus to split.
We cannot use a proton as the projectile, since protons are positively
charged, and the target nucleus is itself positively charged. Hence, the proton and the target nucleus will
repel each other electromagnetically. We
cannot use an electron as the projectile either. Although electrons are
negatively charged and would be attracted to the positively charged nucleus
that we are trying to split, an electron is almost two thousand times
less massive (lighter) than a proton or a neutron. Hence, the electron has too little mass to
split the target nucleus. If we wanted
to demolish a condemned building, firing a bullet at the building would be
fruitless. Regardless how fast the
bullet may be moving, it has such little mass that it will not have sufficient
momentum to demolish the condemned building.
However, a wrecking ball is so massive that it carries sufficient
momentum to demolish the condemned building, even if the wrecking ball is not
moving particularly fast. If a proton is repelled by the target nucleus and if the electron has
insufficient mass and thus insufficient momentum to split the target nucleus,
the only particle left to try is a neutron.
Although neutrons are neutral and thus will not be attracted to the
target nucleus, they will not be repelled either. More importantly, the mass of the neutron is
comparable to the mass of a proton. In
fact, the mass of the neutron is a little bit more than the mass of a
proton. Therefore, a neutron need not be
moving particularly fast to carry enough momentum to split the target
nucleus. Examples of massive (heavy)
nuclei commonly used in nuclear fission reactions include uranium and
plutonium. A particular example of a
nuclear fission reaction is a neutron splitting a uranium nucleus into a
krypton nucleus, a barium nucleus, and three neutrons. This nuclear fission reaction is more
properly written ¹n + ²³⁵U → ⁹²Kr + ¹⁴¹Ba + 3 ¹n. Note that ¹n
is the symbol of the neutron in nuclear
physics. Also
note that in nuclear physics, we use the same symbol for the nucleus of an atom
as a chemist would use for the entire atom.
For example, the symbol ¹⁴¹Ba is used for the
barium-141 atom in a chemistry course, while the same
symbol ¹⁴¹Ba is used for the barium-141 nucleus in a
nuclear physics course. Note that this
reaction releases three neutrons, which can be used to
split further nuclei. The result is a
chain reaction. A chain reaction can be controlled, as is the case
in nuclear power plants. A chain
reaction can also be uncontrolled, as is the case in a nuclear fission
bomb. In a nuclear power plant, control
rods made of neutron-absorbing materials (such as boron or cadmium) are used to control the reaction rate. If the chain reaction is proceeding too
quickly, control rods are inserted into the reactor
core; these control rods absorb some of the neutrons to reduce the splitting
of the nuclei, thus slowing down the reaction.
If the chain reaction is proceeding too slowly, control rods are pulled out of the reactor core, leaving more
neutrons to split more nuclei, thus speeding up the reaction.
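The fission reaction described above can be checked for balance, and the growth of an uncontrolled chain reaction can be sketched, as follows. The specific isotopes uranium-235 and krypton-92 are assumptions chosen to be consistent with the barium-141 nucleus and the three neutrons named above, and the helper names are hypothetical. The chain-reaction function also assumes the idealized case in which every released neutron goes on to split another nucleus.

```python
# Each particle is written as (mass number, atomic number).
NEUTRON = (1, 0)
URANIUM_235 = (235, 92)   # assumed isotope
KRYPTON_92 = (92, 36)     # assumed isotope
BARIUM_141 = (141, 56)


def totals(particles):
    """Return the total mass number and total atomic number of a list of particles."""
    total_mass_number = sum(a for a, z in particles)
    total_atomic_number = sum(z for a, z in particles)
    return total_mass_number, total_atomic_number


def is_balanced(reactants, products):
    """A nuclear reaction must conserve both totals."""
    return totals(reactants) == totals(products)


def neutrons_after(generations, neutrons_per_fission=3):
    """Idealized uncontrolled chain: every released neutron splits another nucleus."""
    return neutrons_per_fission ** generations


reactants = [NEUTRON, URANIUM_235]
products = [KRYPTON_92, BARIUM_141, NEUTRON, NEUTRON, NEUTRON]

print("Reaction balanced:", is_balanced(reactants, products))
print("Neutrons after 10 generations:", neutrons_after(10))
```

Absorbing some of the neutrons in each generation, as the rods in a power plant do, amounts to lowering the effective number of neutrons per fission in this sketch.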
A nuclear fusion reaction is
the merging of two less massive (lighter) nuclei into a more massive (heavier)
nucleus. However, all nuclei are
positively charged. Therefore, all
nuclei repel one another electromagnetically, which should prevent a fusion (a
merging) of nuclei from ever occurring.
As we discussed earlier in the course, temperature is a measure of the
average energy of individual particles.
In this course, we may assume that the average energy of the particles
corresponds to their average speed. In
other words, particles move relatively faster at hotter temperatures, while
particles move relatively slower at cooler temperatures. Imagine incredibly hot temperatures when
nuclei are moving so fast that although they repel electromagnetically as they
approach each other, their tremendous energies at these incredibly hot
temperatures bring them within a few quadrillionths of a meter of one another
despite their electromagnetic repulsion.
It is within this range that the strong nuclear force operates. Hence, the strong nuclear force will
overpower the electromagnetic repulsion, and the nuclei will fuse
together. Hydrogen is the least massive
(the lightest) atom in the entire universe, and helium is the second least
massive (the second lightest) atom in the entire universe. So, a particular
example of a nuclear fusion reaction is hydrogen nuclei fusing into a helium
nucleus. The threshold temperature at
which hydrogen fuses into helium is several million kelvins. This is incredibly hot by human standards,
but this threshold temperature would have been even hotter if it were not for
Quantum Mechanics, the correct theory of molecules, atoms, and subatomic
particles. At the foundation of Quantum
Mechanics is the Heisenberg Uncertainty Principle, named for the German
physicist Werner Heisenberg who not only formulated this fundamental principle
but was also one of the physicists who formulated Quantum Mechanics
itself. The Heisenberg Uncertainty
Principle states that it is impossible for a subatomic particle to have a
definite position (location) and a definite velocity (speed) at the same time. Because of this Heisenberg Uncertainty
Principle, there is a fair probability for subatomic particles to overcome
energy barriers even when they have insufficient energy to overcome the
barrier. This is
called quantum-mechanical tunneling.
At first glance, this quantum-mechanical tunneling seems like
unscientific nonsense, but quantum-mechanical tunneling has
been proven for many subatomic particles, including electrons for
example. In fact, modern electronic
devices such as mobile telephones and computers would not function correctly
without the quantum-mechanical tunneling of electrons. At several million kelvins of temperature,
most hydrogen nuclei are still not moving sufficiently fast to
quantum-mechanically tunnel through the electromagnetic repulsion between them,
but temperature is a statistical measure of average speed. In other words, at any given temperature,
whereas many particles move at a certain average speed, a small number of
particles move much slower than the average speed, and a small number of
particles move much faster than the average speed. At several million kelvins of temperature, a
small fraction of hydrogen nuclei do move sufficiently fast that they are able
to quantum-mechanically tunnel through the electromagnetic repulsion between
them, enabling the strong nuclear force to fuse them together. Humans have achieved uncontrolled nuclear
fusion reactions using nuclear fusion bombs.
Humans have not yet succeeded in controlled nuclear fusion reactions. Since we do not yet have the technology to
control nuclear fusion reactions at these incredible temperatures, all nuclear
power plants today use nuclear fission reactions, not nuclear fusion reactions.
The energy yield of both
nuclear fission bombs and nuclear fusion bombs is measured
in units of tons of trinitrotoluene (TNT), a chemical explosive. One ton of TNT has an explosive yield of
roughly four billion joules of energy.
One kiloton of TNT has an explosive yield of one thousand tons of TNT,
since the prefix kilo- always means thousand.
For example, there are one thousand meters in a kilometer, and there are
one thousand grams in a kilogram. Since
one ton of TNT has an explosive yield of roughly four billion joules of energy,
one kiloton of TNT has an explosive yield of roughly four trillion
joules of energy. One megaton of TNT has
an explosive yield of one million tons of TNT, since the prefix mega- always
means million. Hence, one
megaton of TNT has an explosive yield of roughly four quadrillion joules of
energy. The typical yield of a nuclear
fission bomb is a few kilotons of TNT, and the typical yield of a nuclear
fusion bomb is a few megatons of TNT.
These incredible yields help us appreciate the extraordinary amount of
energy released from nuclear reactions.
We can also appreciate the vast quantities of energy released from
nuclear reactions by discussing the activation energy required to detonate
these nuclear weapons. We require a
powerful chemical explosive to drive the uranium or plutonium together into a
mass dense enough for an uncontrolled chain reaction of nuclear
fission. Hence, the detonator of a
nuclear fission bomb is a chemical explosive, such as TNT. We require a fission bomb to heat hydrogen to
millions of kelvins of temperature so that the hydrogen nuclei can move
sufficiently fast to fuse into helium nuclei.
Hence, the detonator of a nuclear fusion bomb is a nuclear fission
bomb! These activation energies also
give us a comparative scale. Comparing a
chemical explosion to a nuclear fission explosion is rather like comparing a
nuclear fission explosion to a nuclear fusion explosion!
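The unit conversions in this discussion of TNT yields can be written out as a short sketch. It uses the rounded figure of four billion joules per ton of TNT from these notes; the function and table names are hypothetical.

```python
# Rounded figure from these notes: one ton of TNT is roughly four billion joules.
JOULES_PER_TON_TNT = 4_000_000_000

# The prefix kilo- means thousand and the prefix mega- means million.
TONS_PER_UNIT = {
    "ton": 1,
    "kiloton": 1_000,
    "megaton": 1_000_000,
}


def yield_in_joules(amount, unit):
    """Convert an explosive yield in tons, kilotons, or megatons of TNT to joules."""
    return amount * TONS_PER_UNIT[unit] * JOULES_PER_TON_TNT


print(yield_in_joules(1, "kiloton"))   # four trillion joules
print(yield_in_joules(1, "megaton"))   # four quadrillion joules
```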
As we discussed, the Sun is
roughly three-quarters hydrogen and roughly one-quarter helium. We now suspect that the Sun derives its
energy from the nuclear fusion of hydrogen into helium. Unfortunately, the surface temperature of the
Sun is only six thousand kelvins, as we discussed. This is nowhere near hot enough to fuse
hydrogen into helium. However, the
interior of the Sun is much hotter than six thousand kelvins. Theoretical calculations reveal that the core
temperature of the Sun is roughly fifteen million kelvins of temperature. Even at this incredibly hot
temperature, it is only a small fraction of hydrogen nuclei that move
sufficiently fast to quantum-mechanically tunnel through the electromagnetic
repulsion between them. However,
the Sun is also incredibly massive.
Although these incredibly hot temperatures are only
attained in the Sun’s core, the solar core is massive enough that a
small fraction of the enormous number of hydrogen nuclei that compose the solar
core is an appreciable number. In other
words, the Sun’s core is composed of such an incredible number of hydrogen
nuclei that a fair amount of nuclear fusion occurs, even though nuclear fusion
is somewhat improbable even at several million kelvins of temperature. In conclusion, the Sun shines because of the
nuclear fusion of hydrogen into helium in its core at roughly fifteen million
kelvins of temperature. Warning: most of
the Sun is not hot enough for any nuclear fusion to occur. Only the Sun’s core is hot enough to fuse
some hydrogen into helium. Therefore,
the Sun’s core is slowly but progressively becoming less and less hydrogen and
more and more helium, while the rest of the Sun remains roughly three-quarters
hydrogen and roughly one-quarter helium.
In roughly five billion years, the solar core will exhaust its hydrogen,
becoming nearly entirely helium. At that
point, the Sun will begin to die, as we will discuss shortly. We emphasize again that the entire Sun will
never become pure helium. Most of the
Sun will remain roughly three-quarters hydrogen and roughly one-quarter helium,
since the nuclear fusion of hydrogen into helium only occurs in the solar core.
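To see why only a small fraction of hydrogen nuclei can fuse even in the solar core, we can compare the average energy of a nucleus at fifteen million kelvins with the electromagnetic energy barrier between two protons. This is a sketch under stated assumptions: the average kinetic energy is taken as (3/2)kT, the barrier is evaluated at an assumed separation of three quadrillionths of a meter, and the Boltzmann constant, Coulomb constant, and proton charge are standard values not stated in these notes.

```python
# Assumed standard constants (not stated in these notes).
K_BOLTZMANN = 1.381e-23   # joules per kelvin
K_COULOMB = 8.988e9       # N m^2 / C^2
E_CHARGE = 1.602e-19      # charge of a proton, coulombs

CORE_TEMPERATURE = 1.5e7  # kelvins, from these notes
SEPARATION = 3.0e-15      # meters; assumed value for "a few quadrillionths of a meter"


def average_thermal_energy(temperature):
    """Average kinetic energy of a particle at a given temperature: (3/2) k T."""
    return 1.5 * K_BOLTZMANN * temperature


def coulomb_barrier(separation_m):
    """Electromagnetic energy needed to bring two protons to a given separation."""
    return K_COULOMB * E_CHARGE**2 / separation_m


thermal = average_thermal_energy(CORE_TEMPERATURE)
barrier = coulomb_barrier(SEPARATION)
shortfall = barrier / thermal

print(f"Average thermal energy: {thermal:.2e} J")
print(f"Barrier energy:         {barrier:.2e} J")
print(f"Barrier / thermal:      {shortfall:.0f}")
```

Under these assumptions the barrier is a few hundred times larger than the average thermal energy, so a typical nucleus falls far short of it. Only the rare, much faster nuclei, helped by quantum-mechanical tunneling, manage to fuse.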
The first step of the nuclear
fusion reactions occurring in the Sun’s core is the fusion of two protons into
a deuteron. This reaction is more
properly written ¹H + ¹H → ²H + e⁺ + νₑ. Again, in nuclear physics we use the same
symbol for the nucleus of an atom as a chemist would use for the entire
atom. The symbol ¹H is used for the
hydrogen-1 atom (the protium atom) in a chemistry
course, while the same symbol ¹H is used for the hydrogen-1 nucleus (simply a
proton) in a nuclear physics course. Also, the symbol ²H is used for the hydrogen-2 atom (the deuterium
atom) in a chemistry course, while the same symbol ²H is used for the hydrogen-2 nucleus (a
deuteron) in a nuclear physics course.
The symbol νₑ
(the lowercase Greek letter nu) stands for a neutrino, which we will discuss
shortly. The symbol e⁺ stands for the antielectron, commonly known as the
positron. For every particle in the
universe, there is a corresponding antimatter particle. A particle of antimatter has the same
mass as its corresponding particle of ordinary matter, but the antimatter
particle has the opposite electric charge. Other properties are opposite as well. We have discussed that the proton is
positively charged, but there is another particle with identical mass as the
proton called the antiproton, which is negatively charged
instead of positively charged. We have
discussed that the electron is negatively charged, but there is another
particle with identical mass as the electron called the antielectron, which is positively charged instead of negatively charged. This is why the antielectron is commonly known as the positron. Notice that the symbol of the antielectron
(positron) is e⁺, since we
may regard this antimatter particle as a positive electron. Indeed, the symbol of the ordinary electron
is e⁻, since it is
negatively charged. We emphasize that
antimatter is not science fiction; antimatter is proven science fact. Physicists have synthesized antimatter
particles for many decades. Antiquarks
compose antiprotons and antineutrons.
Antiprotons and antineutrons can join to form antinuclei,
and antielectrons (positrons) can be attracted by
these antinuclei to form antiatoms. Antiatoms can even chemically bond with each
other to form antimolecules. Antimatter is extraordinarily rare in our
universe, but this is fortunate actually.
When a matter particle and its corresponding antimatter particle meet,
they completely annihilate each other, becoming pure energy. This is the complete conversion of matter
into energy. The overwhelming majority
of particles of the universe are ordinary matter particles; antimatter
particles are extraordinarily rare. All
the stars, planets, moons, asteroids, and comets in the entire universe are
composed of matter, not antimatter.
Hence, the Sun is composed of ordinary matter. Thus, when the antielectron (positron) is generated in this first step of the nuclear fusion in the
Sun’s core, the antielectron (positron) immediately annihilates with an
ordinary electron, generating energy.
The next step of the nuclear fusion reactions occurring in the Sun’s
core is the fusion of a proton and a deuteron into a helium-3 nucleus. This reaction is more properly written
¹H + ²H → ³He. The third and final step of the nuclear
fusion reactions occurring in the Sun’s core is the fusion of two helium-3
nuclei into a helium-4 nucleus (an alpha particle) plus two protons. This reaction is more properly written
³He + ³He → ⁴He + ¹H + ¹H. The two protons resulting from this final
step can then fuse, bringing us back to the first
step of this nuclear reaction chain.
Hence, the overall reaction of all of these nuclear fusion reactions is
called the proton-proton cycle, since the fusion of two protons begins the
reaction chain and two protons result from the end of the reaction chain, which can begin the entire reaction chain over
again. This may lead us to
suspect that this nuclear reaction chain continues indefinitely, but this is
false. If we construct the overall
reaction, we discover that four protons fuse into a helium-4 nucleus (an alpha
particle) plus energy plus two neutrinos.
This overall reaction is more properly written 4 ¹H → ⁴He + energy + 2νe. Hence, hydrogen is being
converted into helium in the Sun’s core.
Therefore, the solar core is becoming less and less hydrogen and more
and more helium. Again, only the solar
core is hot enough for these nuclear fusion reactions to occur. Nuclear reactions do not occur throughout
most of the Sun. Hence, most of the Sun
remains three-quarters hydrogen and one-quarter helium. There will never come a time when the entire
Sun is pure helium. However, the solar
core will become nearly entirely helium in roughly five billion years. This will begin the death of the Sun, which
we will discuss shortly. Hence, this proton-proton
cycle will not continue indefinitely, since the solar core will eventually
exhaust its supply of hydrogen, thus ending this nuclear reaction chain. Note that the first step of this reaction
chain is governed by the weak nuclear force, which is a
slow force. This contributes to the
Sun’s long lifetime. Instead of
consuming all of the hydrogen in its core in a short amount of time, the
proton-proton cycle is slowed by the first step in the
nuclear reaction chain, stretching out the conversion of hydrogen into helium
in the solar core over a timescale of billions of years. The energy generated in the proton-proton
cycle is in the form of high-energy photons in the gamma-ray part of the
Electromagnetic Spectrum.
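The bookkeeping behind the overall reaction can be checked with a short sketch. The Python below is only an illustration, not part of the original notes: the names STEPS and net_reaction are hypothetical, and the sketch assumes that the first and second steps each occur twice, so that two helium-3 nuclei are available for the third step.

```python
from collections import Counter

# Each entry is (reactants, products, number of times the step occurs).
# Assumption of this sketch: steps 1 and 2 each occur twice to supply the
# two helium-3 nuclei consumed by step 3.
STEPS = [
    (Counter({"p": 2}), Counter({"d": 1, "e+": 1, "nu_e": 1}), 2),  # step 1
    (Counter({"p": 1, "d": 1}), Counter({"He3": 1}), 2),            # step 2
    (Counter({"He3": 2}), Counter({"He4": 1, "p": 2}), 1),          # step 3
]


def net_reaction(steps):
    """Return (net reactants, net products) after cancelling intermediates."""
    consumed, produced = Counter(), Counter()
    for reactants, products, times in steps:
        for species, count in reactants.items():
            consumed[species] += count * times
        for species, count in products.items():
            produced[species] += count * times
    # Counter subtraction keeps only positive counts, so species that appear
    # equally on both sides (deuterons and helium-3 nuclei) cancel out.
    return consumed - produced, produced - consumed


net_in, net_out = net_reaction(STEPS)
print(dict(net_in))
print(dict(net_out))
```

Six protons enter and two are returned, so the net input is four protons. The net output is one helium-4 nucleus, two neutrinos, and two positrons; the positrons annihilate with ordinary electrons, which is part of the energy in the overall reaction.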
Although hydrogen and helium
are gases at ordinary temperatures, the interior of the Sun is so hot that the
hydrogen and helium atoms are ionized.
The composition of the Sun is actually positively-charged
nuclei, negatively-charged electrons, and high-energy photons all colliding
with one another. This hot state of
matter is called a plasma. Therefore, the high-energy photons created by
the proton-proton cycle in the Sun’s core cannot easily escape the Sun. They continuously collide with positive
nuclei and negative electrons. Therefore,
the trajectory (the path) of these photons is randomized. Of course, these photons do propagate in a
straight line at the speed of light between collisions, but their overall
trajectory (path) is not a straight line; it is a random trajectory (path) resulting
from continuous collisions with nuclei and electrons. This type of trajectory is
called a random walk, since it is rather like the path a pedestrian
would take while aimlessly walking the streets of a city. Note therefore that light cannot travel
easily through the Sun. The Sun is not
transparent; the Sun is opaque. The
layer of the Sun around the core where the photons execute this random walk is called the radiation zone. It takes somewhere between one hundred
thousand years and one million years for a typical photon to escape out of the
radiation zone. It would only take
photons roughly two seconds to travel from the core of the Sun to the surface
of the Sun if they could move in straight lines at the speed of light without
suffering any collisions. However,
photons take somewhere between one hundred thousand years and one million years
to travel out of the radiation zone, due to their random walks resulting from
their continuous collisions with nuclei and electrons. The next layer of the Sun around the
radiation zone is the convection zone, where energy is
transported much faster through rising masses of more hot plasma and
sinking masses of less hot plasma. These are convection cells similar to the convection cells in the
Earth’s asthenosphere that we discussed earlier in the course, although the
convection cells in the Sun’s convection zone are much, much hotter. The final layer of the Sun around the
convection zone is the photosphere, the actual surface of the Sun that we can
see. At the photosphere, energy leaves
the Sun in the form of electromagnetic waves (photons) from across the entire
Electromagnetic Spectrum. More
precisely, electromagnetic waves (photons) radiate from the photosphere with a
continuous blackbody spectrum, primarily in visible light (peaking in yellow
visible light), in accordance with the temperature of the photosphere (roughly
six thousand kelvins) as determined by the Wien displacement law. The photons that leave the photosphere travel
out into the surrounding outer space at the speed of light. Some of these photons spend roughly eight
minutes traveling to the Earth. Each
time we feel the warmth of sunlight, we should reflect upon the journey that
sunlight endured before finally arriving upon us. First, the energy was
created in the Sun’s core through nuclear fusion reactions (the
proton-proton cycle). Then, the energy
spent between one hundred thousand years and one million years trying to escape
from the Sun’s radiation zone. Then, the
energy was transported faster by convection through
the Sun’s convection zone. Then, the
energy escaped the photosphere (the surface of the Sun), traveling through
outer space toward the Earth for roughly eight minutes before finally bathing
us with its warmth.
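The travel times quoted above can be reproduced with a few lines of arithmetic. The following Python is a rough sketch under stated assumptions: the speed of light is rounded to three hundred thousand kilometers per second, the solar radius and the Earth-Sun distance are the rounded values used in these notes, and the photon mean free path of one-tenth of a millimeter is a purely illustrative value chosen for this sketch, not a number from the notes. All function and constant names are hypothetical.

```python
SPEED_OF_LIGHT_KM_S = 3.0e5      # rounded speed of light (assumption)
SOLAR_RADIUS_KM = 7.0e5          # roughly seven hundred thousand kilometers
EARTH_SUN_DISTANCE_KM = 1.5e8    # roughly one hundred and fifty million kilometers
SECONDS_PER_YEAR = 3.156e7       # approximate length of one year in seconds


def light_travel_time_s(distance_km):
    """Time for light to cross a distance in a straight line, in seconds."""
    return distance_km / SPEED_OF_LIGHT_KM_S


def random_walk_escape_time_yr(radius_km, mean_free_path_km):
    """Rough escape time for a photon taking random steps of fixed length.

    A random walk needs about (radius / step)**2 steps to wander a net
    distance equal to the radius, and each step takes step / c seconds.
    """
    steps = (radius_km / mean_free_path_km) ** 2
    return steps * mean_free_path_km / SPEED_OF_LIGHT_KM_S / SECONDS_PER_YEAR


core_to_surface_s = light_travel_time_s(SOLAR_RADIUS_KM)
sun_to_earth_min = light_travel_time_s(EARTH_SUN_DISTANCE_KM) / 60.0
# Illustrative mean free path of 0.1 millimeter = 1e-7 km (assumption).
escape_yr = random_walk_escape_time_yr(SOLAR_RADIUS_KM, 1.0e-7)

print(f"{core_to_surface_s:.1f} s, {sun_to_earth_min:.1f} min, {escape_yr:.1e} yr")
```

The straight-line times come out to a little over two seconds and a little over eight minutes. With the assumed mean free path, the random-walk estimate falls between one hundred thousand and one million years; a different assumed step length would shift that estimate proportionally.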
Our understanding of the
interior of the Sun comes from theoretical calculations together with computer
simulations. The results of this
theoretical work can be tested through the observation
of vibrations on the photosphere. The
study of these vibrations is called helioseismology,
since we may regard these vibrations as sunquakes. It is remarkable that our understanding of
the interior of the Sun is tested through measuring sunquakes, just as our
understanding of the interior of the Earth is tested through measuring
earthquakes, as we discussed earlier in the course. Our understanding of the interior of the Sun is also confirmed through the actual appearance of the
photosphere. The surface of the Sun does
not look smooth; the surface of the Sun looks grainy or sandy. This grainy or sandy appearance of the
photosphere is called granulation. The photosphere is composed
of more bright granules and less bright granules. These granules on the photosphere reveal the
convection in the convection zone beneath the photosphere. Rising masses of more hot plasma manifest
themselves as more bright granules on the photosphere, while sinking masses of
less hot plasma manifest themselves as less bright granules on the photosphere.
The Sun creates a powerful
magnetic field. As we discussed earlier
in the course, the Earth’s magnetic field is generated
by its rotation together with circulating currents of molten metal in its outer
core. Similarly, the Sun’s magnetic
field is generated by its rotation together with
convection cells of circulating hot plasma in its convection zone. However, the dynamics of the Sun’s magnetic
field is complicated by the Sun’s differential
rotation. We use the term rigid body
rotation when every part of an object rotates together at the same rate, while
we use the term differential rotation when different parts of an object rotate
at different rates. Fluids suffer from
differential rotation. For example, the jovian, gas-giant (outer) planets
suffer from differential rotation, since their outer layers are composed
primarily of hydrogen gas and helium gas.
Solids suffer from rigid body rotation.
For example, the terrestrial (inner) planets suffer from rigid body
rotation, since they are composed primarily of metal and rock. Caution: this is actually an
oversimplification. As we discussed
earlier in the course, different parts of the Earth actually rotate at
different rates. Nevertheless, as
compared with the jovian,
gas-giant (outer) planets, we may regard the Earth and all the terrestrial (inner)
planets as suffering from rigid body rotation.
The Sun is not a solid object.
The Sun is a hot plasma, which is a type of fluid. Therefore, the Sun suffers from differential
rotation. On average, the Sun rotates
roughly once per month, but in actuality different parts of the Sun rotate at
different rates. This differential
rotation drags and stretches the Sun’s magnetic field lines. As magnetic field lines are
stretched, they increase in tension, just as strings or elastic bands
increase in tension when stretched.
Eventually, magnetic field lines may break if they have too much
tension, again just as strings or elastic bands may break if they have too much
tension. When the Sun’s magnetic field
lines break, they reconnect with complex patterns. After a magnetic break followed by a magnetic
reconnection, the Sun’s magnetic field lines often anchor themselves at two
places on the photosphere (the surface of the Sun). The magnetic field lines point out of the
photosphere at one anchor, bend above the photosphere, and point back into the
photosphere at the other anchor.
Wherever they anchor themselves on the photosphere will be regions of
very strong magnetic fields that block convection in the convection zone
beneath these anchors, causing these regions of the photosphere to be
significantly less hot than the rest of the photosphere. These less hot regions with strong magnetic
fields are called sunspots, since they appear black as
compared with the rest of the surface of the Sun (the photosphere). The temperatures of these sunspots are still
in the thousands of kelvins however; sunspots are simply not as hot as the rest
of the photosphere at six thousand kelvins.
If the temperatures of sunspots are still in the thousands of kelvins,
then these sunspots are hot enough to radiate visible light. Indeed, these sunspots are actually quite
luminous; sunspots only appear black because we are comparing them with the
rest of the surface of the Sun. Since
broken and then reconnected magnetic field lines often anchor themselves at two
places on the photosphere, sunspots often occur in pairs. One sunspot will have an outwardly directed
magnetic field, while the other sunspot will have an inwardly directed magnetic
field. Plasma eruptions on the
photosphere often follow the Sun’s magnetic field lines. As such, a plasma eruption often forms an
arch anchored at a pair of sunspots.
This arched plasma eruption is called a solar prominence. If a tremendous amount of tension in the
Sun’s magnetic field lines is finally liberated
through a magnetic break followed by a magnetic reconnection, a violent plasma
eruption will burst outward from the photosphere; this plasma eruption is
called a solar flare. Solar flares
travel outward from the Sun. Some of
these solar flares travel toward the direction of the Earth. Fortunately, the Earth’s magnetic field
shields us from most solar activities such as these solar flares. However, our artificial satellites in orbit
around the Earth are not well protected from solar activity. Our artificial satellites are continuously
bombarded, damaged, and even on occasion completely
destroyed by solar activities such as solar flares.
Astronomers have directly
observed for roughly four hundred years (since the invention of the telescope)
that the number of sunspots goes through a roughly eleven-year cycle. In one complete cycle, the number of sunspots
increases then decreases over a time period of roughly
eleven years. Furthermore, measurements
of the radioactive isotope carbon-fourteen within trees have revealed that this roughly
eleven-year solar cycle itself goes through a roughly
two-hundred-year cycle. This is the de
Vries cycle, named for the Dutch physicist Hessel de Vries, one of the pioneers
of radiocarbon dating. According to the de Vries cycle, the
Sun gradually increases in activity to what is called
a solar maximum then gradually decreases in activity to what is called a solar
minimum. Caution: the eleven-year solar
cycles continue to occur throughout each two-century de Vries
cycle. Since one complete de Vries cycle lasts for roughly two centuries, each solar
maximum and each solar minimum lasts for roughly one hundred years. Over the past twelve thousand years (since
the beginning of the current interglacial period of the Current Ice Age), there
have been roughly sixty complete de Vries cycles,
with each de Vries cycle having one solar maximum and
one solar minimum. The Modern Maximum
occurred throughout most of the twentieth century, and the Modern Minimum began
toward the beginning of the twenty-first century (the current century). The roughly eleven-year sunspot cycle and the
roughly two-century de Vries sunspot cycle both
strongly determine variations in global temperatures on planet Earth, as we
discussed earlier in the course. In
particular, the Modern Maximum that occurred throughout most of the twentieth
century contributed to the warming temperatures of that century, and the Modern
Minimum that began toward the beginning of the twenty-first century (the current
century) has already caused cooling temperatures that will continue for the
rest of the current century.
The Sun’s atmosphere is
composed primarily of hydrogen and helium.
As we leave the photosphere (the surface of the Sun)
and climb the solar atmosphere and ultimately travel into the
surrounding outer space, we expect the temperature to become cooler and cooler,
but this is not the case. As we leave
the photosphere, the temperature actually becomes hotter. The lower layer of the Sun’s atmosphere is
the chromosphere. The temperature
approaches one hundred thousand kelvins as we climb the chromosphere. Because of these hot temperatures, the
chromosphere radiates primarily ultraviolet light, in accordance with the Wien
displacement law. The upper layer of the
Sun’s atmosphere is the corona, the main part of the Sun’s atmosphere. The solar corona is even hotter, roughly one
million kelvins in temperature. Because
of these even hotter temperatures, the solar corona radiates primarily X-rays,
again in accordance with the Wien displacement law. It is only when we climb out of the corona
and travel into the surrounding outer space that the temperature finally
cools. We do not understand why the
solar atmosphere is so hot. Perhaps the
Sun’s atmosphere is heated by prominences, flares, and
other solar activities from the photosphere.
Although this sounds reasonable, this theory is nevertheless not well
developed. Since the solar atmosphere is
so hot, its composition is primarily not hydrogen gas and helium gas but
primarily ionized hydrogen (protons and electrons) and ionized helium (alpha
particles and electrons). Moreover, the
hot temperature of the solar atmosphere causes many of these particles to move
sufficiently fast that they can escape from the Sun’s gravitational
attraction. The result is the solar
wind, a stream of charged particles from the Sun composed primarily of protons
(hydrogen nuclei), electrons, and alpha particles (helium nuclei). The solar wind radiates outward from the
Sun. This solar wind is capable of
completely ionizing the Earth’s atmosphere in a fairly short
amount of time. Fortunately, the Earth’s
magnetic field is sufficiently strong to deflect most of the Sun’s solar
wind. Some of the charged particles in
the solar wind do however become trapped within the
Earth’s magnetic field. These charged
particles execute helical trajectories around the Earth’s magnetic field
lines. These regions of the Earth’s
magnetic field are called the Van Allen belts, named
for the American physicist James Van Allen who discovered them. The charged particles within the Van Allen
belts may create an aurora, either aurora borealis (or more commonly the
northern lights) near the Earth’s north magnetic pole or aurora australis (or more commonly the southern lights) near the
Earth’s south magnetic pole, as we discussed earlier in the course. If the Sun happens to be less active, its
solar wind would be weaker, the resulting aurorae would appear less spectacular,
and we would only be able to enjoy them near the Earth’s magnetic poles. If the Sun happens to be more active, its
solar wind would be stronger, the resulting aurorae would appear more
spectacular, and we would be able to enjoy them further from the Earth’s
magnetic poles.
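The Wien displacement law invoked above for the photosphere, the chromosphere, and the corona can be sketched numerically. The Python below is an illustration only: it assumes the rounded temperatures quoted in these notes, uses the standard value of the Wien displacement constant (not stated in the notes), and labels bands with coarse, conventional wavelength boundaries chosen for this sketch. Treating each layer as an ideal blackbody is itself an idealization. All names are hypothetical.

```python
WIEN_CONSTANT_M_K = 2.898e-3  # Wien displacement constant in meter-kelvins


def peak_wavelength_nm(temperature_k):
    """Wavelength in nanometers at which a blackbody spectrum peaks."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    return WIEN_CONSTANT_M_K / temperature_k * 1.0e9


def rough_band(wavelength_nm):
    """Coarse band label; the boundaries are approximate conventions."""
    if wavelength_nm < 10:
        return "X-ray"
    if wavelength_nm < 400:
        return "ultraviolet"
    if wavelength_nm <= 700:
        return "visible"
    return "infrared"


# Rounded temperatures in kelvins, as quoted in the notes.
LAYERS_K = {"photosphere": 6.0e3, "chromosphere": 1.0e5, "corona": 1.0e6}

for layer, temperature in LAYERS_K.items():
    peak = peak_wavelength_nm(temperature)
    print(f"{layer}: {peak:.1f} nm ({rough_band(peak)})")
```

With these inputs the peaks land near 480 nanometers for the photosphere, near 29 nanometers for the chromosphere, and near 3 nanometers for the corona, which is the visible, ultraviolet, and X-ray ordering described above.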
Neutrinos are extremely
weakly interacting subatomic particles.
Neutrinos refuse to participate in the strong nuclear force for
example. Neutrinos also refuse to
participate in the electromagnetic force, since they are electrically
neutral. This is why they are called neutrinos!
Of course, everything in the universe feels gravity, but the mass of a
neutrino is such a tiny number that physicists have not yet succeeded in even
measuring its value. Since the mass of a
neutrino is so extraordinarily tiny, neutrinos do not noticeably feel
gravity. Therefore, for all practical
purposes neutrinos also refuse to participate in the gravitational force. Whereas the photons that are
created in the Sun’s core spend between one hundred thousand years and
one million years trying to escape from within the Sun as we discussed,
neutrinos are so weakly interacting that they immediately escape from within
the Sun after being created in the Sun’s core.
Since neutrinos propagate almost at the speed of light, the neutrinos
created by the proton-proton cycle in the Sun’s core travel in straight lines
from the solar core to the photosphere in roughly two seconds. The neutrinos continue to travel outward from
the Sun, through its atmosphere and then into the surrounding outer space. Some of these neutrinos spend roughly eight
minutes traveling to the Earth.
Neutrinos are so weakly interacting that when these neutrinos arrive at
the Earth, they simply pass through the Earth.
Billions and billions of neutrinos from the Sun pass through our bodies
every second of every day! Neutrinos are
so weakly interacting that they do virtually nothing with the atoms that
compose our bodies. This is not just the
case during the daytime when we are on the side of the Earth facing toward the
Sun. This is also true at night when we
are on the side of the Earth facing away from the Sun. In this case, these solar neutrinos arrive at
the Earth, pass straight through the Earth, and pass straight through our bodies
on the nighttime side of the Earth.
Every second of every day of our lives, billions and billions of solar
neutrinos continuously pass through our bodies!
If we could detect these
solar neutrinos, this would provide nearly real-time information about the
solar core. The light we collect from
the Sun may have taken roughly eight minutes to travel from the photosphere to
the Earth, but those photons were actually created in
the solar core at least one hundred thousand years ago and even up to one
million years ago. If we only rely upon
the light from the Sun to understand the interior of the Sun, our knowledge
about the solar core is actually up to one million years out of date. Of course, one million years is actually
rather recent as compared to the Sun’s age of roughly five billion years. Nevertheless, it would be exciting to have
information about the solar core that is only eight minutes old. Unfortunately, neutrinos are so weakly
interacting that detecting them is virtually impossible. Although neutrinos refuse to participate in
the gravitational force (practically speaking) or the strong
nuclear force or the electromagnetic force, neutrinos do on occasion
participate in the weak nuclear force.
As we discussed, the first step of the proton-proton cycle is governed by the weak nuclear force, and note that that
nuclear reaction involves a neutrino.
Several decades ago, physicists built neutrino detectors using the
principles of neutrinos participating in the weak nuclear force. Nevertheless, neutrinos are so weakly
interacting that even though billions and billions of solar neutrinos pass
through these detectors every second of every day, a neutrino detector only
detects one neutrino per day! Working at
a neutrino detector is the most boring job in the world. On one day, we see a single blip on a
screen. The following day, we see
another single blip. The day after that,
we see one single blip again. Boring! This is also frustrating, since we know that
billions and billions of neutrinos are actually passing through the detector
every second of every day, but we only detect one neutrino per day! Over several decades, physicists have only
detected one-third of the number of neutrinos that we should have been
detecting from the Sun. This is called the solar neutrino problem. There have been many theories proposed over
the decades to resolve the solar neutrino problem. One such idea is the theory of neutrino
oscillations. There are three different
flavors (or varieties or types) of neutrinos.
According to the theory of neutrino oscillations, there is a certain
probability that a neutrino can spontaneously change its flavor from one type
to another type. Only one type of
neutrino is created by the proton-proton cycle in the
solar core. According to the theory of
neutrino oscillations, some of these neutrinos may spontaneously change their
flavor during their roughly eight-minute journey from the Sun to the
Earth. Perhaps we have only been
detecting one-third of the number of neutrinos we should be detecting because
our neutrino detectors can only detect one flavor of neutrino instead of all
three flavors of neutrinos. This theory of neutrino oscillations was ridiculed by some
physicists for decades until it was proven to be the correct theory to
resolve the solar neutrino problem.
Several years ago, physicists finally built neutrino detectors that
could detect all three flavors of neutrinos.
Not only have we detected all three flavors of neutrinos from the Sun, but totaling all three detected flavors has finally yielded
results consistent with theoretical calculations. Hence, the resolution of the solar neutrino
problem is the theory of neutrino oscillations.
Stellar Properties
Other stars besides the Sun
are at least two hundred thousand times further from the Earth as compared with
the Sun. Therefore, we know much less
about other stars as compared with our Sun.
We will attempt to determine the properties of other stars by applying
the same procedures we applied to our Sun.
Firstly, from the absorption spectral lines within a star’s light, we
can determine the composition of the star.
We discover that all stars are composed of all the atoms on the Periodic
Table of Elements, but not in equal amounts.
Only two atoms account for close to one hundred percent of the mass of
all stars; all the other atoms on the Periodic Table of Elements account for
only a tiny fraction (tiny percentage) of the mass of stars. All stars are composed of roughly
seventy-five percent (three-quarters) hydrogen and roughly twenty-five percent
(one-quarter) helium. Again, all the
other atoms on the Periodic Table of Elements make up a tiny fraction (tiny
percentage) of the mass of stars.
To determine the distance to
stars, we measure their parallax. As we
discussed earlier in the course, parallax is the apparent motion of an object,
not because it is moving but because the observer is in fact moving. The motion of the Earth around the Sun causes
the stars to appear to shift their positions in the sky by tiny amounts. By measuring the angle of this shift, we can
determine the distance to the star. As
we discussed earlier in the course, the orbit of the Earth around the Sun is an
ellipse with a semi-major axis equal to one astronomical unit (1 au), roughly
equal to one hundred and fifty million kilometers. We also discussed earlier in the course that
the eccentricity of the Earth’s orbit around the Sun is so close to zero that
its orbit is nearly a circle, and so we may regard one astronomical unit as the
radius of the Earth’s roughly circular orbit around the Sun. More plainly, we may regard one astronomical
unit as the distance between the Earth and the Sun. Astronomers define the parallax angle as the
apparent angular shift of a star over a baseline of the Earth’s orbital
radius. Further distances result in
smaller parallax angles. Even the
nearest stars besides the Sun are so distant that their parallax shifts are
much smaller than even a one-degree angle.
A one-degree angle is already small, since one degree is one full circle
divided into three hundred and sixty equal parts. The parallax shifts of even the nearest stars
besides the Sun are much smaller than even one degree! One sixtieth of a degree is called one arcminute or one minute of arc and is written 1′. Notice that minutes of arc are
indicated with a single prime.
Caution: the single prime is also used for feet
of length in the United States. One
sixtieth of one arcminute is called one arcsecond or one second of arc and is written 1″. Notice that seconds of arc are
indicated with a double prime.
Caution: the double prime is also used for
inches of length in the United States.
Since sixty multiplied by sixty is 3600, this means that one arcsecond is one degree divided into 3600 equal
angles. A one-degree angle is already
small, but now imagine dividing that small angle into 3600 equal angles! The nearest stars besides the Sun suffer
parallax shifts even smaller than one arcsecond! Since stars besides the Sun must be
incredibly distant to suffer such tiny parallax shifts, astronomers have
defined a new unit of distance to measure distances to stars besides the
Sun. The distance at which a star would
appear to suffer a parallax of 1″ (one arcsecond or one second of arc) is called a parsec,
abbreviated pc. This word parsec is
derived from the three words parallax, arc, and second. It is not difficult to calculate that one
parsec of distance is slightly more than two hundred thousand astronomical
units. If we multiply two hundred
thousand astronomical units by roughly one hundred and fifty million kilometers
for each astronomical unit, we deduce that one parsec is roughly thirty-one
trillion kilometers! This is an
incredible distance, and the nearest stars besides the Sun are further than
even this! One parsec is also equal to
3.26 light-years, where one light-year is the distance that light travels in a
time of one year, as we discussed toward the beginning of the course. If a star suffers a parallax of 1″ (one arcsecond or one second of
arc), then it is 1 pc (one parsec) distant, by the definition of the
parsec. If a star suffers an even
smaller parallax (as all stars besides the Sun do), then the star is at a
proportionally further distance. For
example, if a star suffers a parallax of one-half of one arcsecond,
then it is two parsecs distant. If a
star suffers a parallax of one-tenth of one arcsecond,
then it is ten parsecs distant. We can
also invert this argument and predict the parallax from the distance. For example, if a star is twenty parsecs
distant, then it must suffer a parallax of one-twentieth of one arcsecond. If a star
is fifty parsecs distant, then it must suffer a parallax of one-fiftieth of one
arcsecond.
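The parsec arithmetic above can be verified with a short calculation. The Python below is a sketch, not part of the original notes: the function names are hypothetical, and the values used for the astronomical unit and the light-year (a Julian year of 365.25 days at the defined speed of light) are standard values assumed here rather than taken from the notes.

```python
import math

ARCSECOND_RAD = math.radians(1.0 / 3600.0)   # one arcsecond in radians
AU_KM = 1.495978707e8                        # astronomical unit in km (assumed value)
LIGHT_YEAR_KM = 299792.458 * 365.25 * 86400  # km light travels in one Julian year


def parsec_in_au():
    """Distance, in au, at which a 1 au baseline subtends one arcsecond."""
    return 1.0 / math.tan(ARCSECOND_RAD)


def distance_from_parallax_pc(parallax_arcsec):
    """Distance in parsecs from a parallax angle in arcseconds."""
    if parallax_arcsec <= 0:
        raise ValueError("parallax must be positive")
    return 1.0 / parallax_arcsec


def parallax_from_distance_arcsec(distance_pc):
    """Parallax angle in arcseconds for a star at the given distance in parsecs."""
    if distance_pc <= 0:
        raise ValueError("distance must be positive")
    return 1.0 / distance_pc


parsec_km = parsec_in_au() * AU_KM
parsec_ly = parsec_km / LIGHT_YEAR_KM
print(f"1 pc = {parsec_in_au():.0f} au = {parsec_km:.2e} km = {parsec_ly:.2f} ly")
print(distance_from_parallax_pc(0.5), parallax_from_distance_arcsec(20.0))
```

The first line reproduces the figures in the text: slightly more than two hundred thousand astronomical units, roughly thirty-one trillion kilometers, and 3.26 light-years. The second line reproduces two of the worked examples: a parallax of one-half of one arcsecond corresponds to two parsecs, and twenty parsecs corresponds to one-twentieth of one arcsecond.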
The Cosmological Distance
Ladder is a list of methods to determine distances to astronomical
objects. Any given method can only be used over a certain range of distances. Thus, we must use another method for further
distances. That new method can only be used over its own range of further
distances. Thus, we must use yet another
method for even further distances, and so on and so forth. The parallax method of determining distances
is the lowest rung of the Cosmological Distance Ladder, since parallax angles
are so tiny that we can only measure them for nearby stars within the so-called
solar neighborhood. Beyond distances of
a couple of thousand parsecs, parallax angles become too tiny to measure even
with modern telescopes. Therefore, we
cannot measure the parallax of most of the stars of our Milky Way Galaxy, and
measuring parallaxes beyond our Milky Way Galaxy is hopeless. We will spend the rest of this course adding
higher and higher rungs to this Cosmological Distance Ladder until we have a list
of methods that will enable us to determine distances from nearby stars in the
solar neighborhood all the way to the edge of the observable universe. Nearby stars are within the so-called solar
neighborhood, nearby galaxies slightly beyond our Milky Way Galaxy are within
the so-called galactic neighborhood, and the edge of the observable universe is
called the cosmic horizon. Although we
cannot use parallax to determine distances beyond the solar neighborhood,
astrophysicists nevertheless continue to use the parsec as the unit of distance
even for astronomical objects whose distances are determined using non-parallax
methods. One thousand parsecs is called
one kiloparsec (abbreviated kpc),
since the prefix kilo- always means thousand. For example, there are one thousand meters in
one kilometer, and there are one thousand grams in one kilogram. One million parsecs is called one megaparsec (abbreviated Mpc),
since the prefix mega- always means million. One billion parsecs is called one gigaparsec (abbreviated Gpc),
since the prefix giga-
always means billion. Theoretically, one
trillion parsecs would be called one teraparsec
(abbreviated Tpc), since the prefix
tera- always means trillion. However, the entire observable universe is
only a few gigaparsecs across. Toward the end of this course, we will
discuss that the universe is expanding, and hence the observable universe is
continuously growing in size. In many
billions of years, the observable universe will eventually expand to become teraparsecs in size.
However, the observable universe is presently only a few gigaparsecs across.
Therefore, the teraparsec is not yet a
physically meaningful unit of distance.
Caution: current cosmological models suggest that the entire universe
beyond the observable universe is actually infinite in size. It is the observable
universe that is only a few gigaparsecs
across, not the entire universe. We will
make clear the distinction between the observable universe and the entire
universe toward the end of the course.
Until we discuss higher rungs
of the Cosmological Distance Ladder, we are only able to use the parallax
method to measure distances to stars within a couple of thousand parsecs
(within the solar neighborhood).
Nevertheless, there are still millions of stars within this
distance. Therefore, we can determine
the luminosities of these nearby stars within the solar neighborhood. As we discussed, we can calculate the
luminosity of any object from the intensity of its light I and its distance from us r
using the equation I = ℒ / (4πr²),
where ℒ is the luminosity of the object.
The intensity of a star’s light is often expressed
as an apparent magnitude, while the luminosity of the star is often expressed
as an absolute magnitude. This magnitude scale was formulated by the ancient Greek
mathematician and astronomer Hipparchus of Nicaea. Hipparchus called the brightest stars we can
see in the night sky first-magnitude stars.
Bright stars that were not as bright as first-magnitude stars were called second-magnitude stars. Stars of intermediate brightness in the night
sky were called third-magnitude stars. Dim stars were called
fourth-magnitude stars, and the dimmest stars visible to the human eye were
called fifth-magnitude stars. This
magnitude scale is rather illogical, since dimmer stars are assigned higher
magnitude numbers, while brighter stars are assigned lower magnitude
numbers. Nevertheless, modern
astrophysicists not only continue to use this magnitude scale, but modern
astrophysicists have even quantified this magnitude scale. Firstly, there are decimal magnitudes. For example, a 4.3-magnitude star is brighter
than a 4.7-magnitude star. As another
example, a 2.5-magnitude star is dimmer than a 2.1-magnitude star. Secondly, the invention of the telescope
enables us to observe stars much dimmer than even the dimmest stars that the
naked eye is able to see. A
sixth-magnitude star is even dimmer than a fifth-magnitude star, and a
seventh-magnitude star is dimmer still.
The Hubble Space Telescope has imaged stars all the way down to roughly
thirtieth-magnitude! Thirdly, the
magnitude scale is also quantified in the other
direction. A zeroth-magnitude star is
brighter than a first-magnitude star. A
star with magnitude negative-one is even brighter than a zeroth-magnitude star,
and a star with magnitude negative-two is brighter still. Our Sun has a magnitude of roughly
negative-twenty-seven! More precisely,
the modern quantified magnitude scale is a logarithmic scale. In particular, every unit on the magnitude
scale corresponds to a factor of roughly 2.5 in brightness. For example, a sixth-magnitude star is
roughly 2.5 times brighter than a seventh-magnitude star. A fifth-magnitude star is roughly 2.5 times
brighter than a sixth-magnitude star, which makes a fifth-magnitude star
roughly 6.25 times brighter than a seventh-magnitude star (since 2.5 times 2.5
is 6.25). A fourth-magnitude star is
roughly 2.5 times brighter than a fifth-magnitude star, which makes a
fourth-magnitude star roughly 6.25 times brighter than a sixth-magnitude star,
which makes a fourth-magnitude star roughly 15.625 times brighter than a
seventh-magnitude star (since 2.5 times 2.5 times 2.5 is 15.625). In brief, one magnitude of separation is
roughly a factor of 2.5 in brightness, two magnitudes of separation is roughly
a factor of 6.25 in brightness, and three magnitudes of separation is roughly a
factor of 15.625 in brightness. Four
magnitudes of separation is nearly a factor of 40 in brightness, and five
magnitudes of separation is nearly a factor of 100 in brightness! This reveals that lower magnitude stars are
much brighter than higher magnitude stars, since we must multiply by a string
of factors to calculate their relative brightnesses. Stated the other way around, higher magnitude
stars are much dimmer than lower magnitude stars, since we must divide by a string
of factors to calculate their relative brightnesses. The apparent magnitude of a star is how
bright the star appears, depending upon its distance. The absolute magnitude of a star expresses
its luminosity or its intrinsic brightness.
More precisely, astronomers define the absolute magnitude of a star as
the apparent magnitude the star would have if it were ten parsecs distant. It is easy to prove that this precise
definition of absolute magnitude relates directly to luminosity or intrinsic
brightness. Therefore, we will casually
regard all three of these variables (luminosity, absolute magnitude, and
intrinsic brightness) as essentially the same quantity. If a star has a relatively constant
luminosity or intrinsic brightness as most stars do, then its absolute
magnitude is a fixed number. However,
the star will appear dimmer from further away, and the star will appear
brighter when closer. This is precisely
the same as the appearance of a lightbulb.
Most lightbulbs have a fixed luminosity (power output), but a lightbulb
will still appear dimmer from further away, and the lightbulb will still appear
brighter when closer. Because of the
illogical magnitude scale, the apparent magnitude of a star will be a higher
number (since the star appears dimmer) when further from the star, and the
apparent magnitude of a star will be a lower number (since the star appears
brighter) when closer to the star.
Again, Hipparchus of Nicaea assigned lower magnitude numbers to brighter
stars, and Hipparchus of Nicaea assigned higher magnitude numbers to dimmer
stars.
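The magnitude arithmetic above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the notes round the factor per magnitude to 2.5, whereas the sketch uses the standard modern convention (not stated in these notes) that five magnitudes of separation is exactly a factor of 100, so one magnitude is the fifth root of 100, roughly 2.512. The formula inside absolute_magnitude follows from combining that convention with the inverse-square law I = ℒ / (4πr²) and the ten-parsec definition. The function names are hypothetical.

```python
import math

# Five magnitudes of separation correspond to a factor of 100 in brightness,
# so one magnitude is 100 ** (1/5), about 2.512 (rounded to 2.5 in the notes).
FACTOR_PER_MAGNITUDE = 100.0 ** 0.2


def brightness_ratio(dimmer_magnitude, brighter_magnitude):
    """Return how many times brighter the lower-magnitude star is."""
    return FACTOR_PER_MAGNITUDE ** (dimmer_magnitude - brighter_magnitude)


def absolute_magnitude(apparent_magnitude, distance_pc):
    """Apparent magnitude the star would have if it were ten parsecs distant.

    Moving a star from distance d to 10 pc multiplies its intensity by
    (d / 10) ** 2, which lowers its magnitude by 5 * log10(d / 10).
    """
    return apparent_magnitude - 5.0 * math.log10(distance_pc / 10.0)


if __name__ == "__main__":
    # One, two, and three magnitudes of separation.
    for separation in (1, 2, 3):
        print(separation, brightness_ratio(7.0, 7.0 - separation))
    # A seventh-magnitude star 100 parsecs away.
    print(absolute_magnitude(7.0, 100.0))
```

Notice that a star exactly ten parsecs away has an absolute magnitude equal to its apparent magnitude, as the definition requires, while a star further than ten parsecs has an absolute magnitude that is a lower number than its apparent magnitude.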
As we discussed,
astrophysicists use two methods to determine the surface temperature of our
Sun. Perhaps we can apply these same two
methods to determine the surface temperatures of other stars. One of these methods uses the Wien
displacement law (essentially using the color of the star), and the other
method uses the Stefan-Boltzmann law ℒ = σ(4πR²)T⁴, where T is
the surface temperature of the star, ℒ is the luminosity (or absolute magnitude or intrinsic
brightness) of the star, and σ
is the Stefan-Boltzmann constant.
Warning: we use lowercase r
for the distance from the star, and we use uppercase (capital) R for the actual radius (physical size)
of the star. Let us first consider the
Stefan-Boltzmann law. To use this
equation to calculate the surface temperature of the star, we need the
luminosity and the actual radius (physical size) of the star. Although we have determined the luminosities
of nearby stars in the solar neighborhood, our telescopes are not powerful
enough to magnify even these nearby stars enough to actually
see their physical radii (their physical sizes). Stars appear to be twinkling points of light
to the naked eye, and most stars still appear to be twinkling points of light
through even our most powerful telescopes.
If we cannot measure the actual physical radii of stars (their physical
sizes), then we cannot use the Stefan-Boltzmann law to calculate their surface
temperatures. We are
now forced to consider the Wien displacement law. Unfortunately, even nearby stars in the solar
neighborhood are very dim, and so we receive insufficient light from them to
graph their continuous blackbody spectra to find the primary wavelength of
their light, which we require to calculate the surface temperature. All seems lost, but roughly
a century ago astronomers formulated an ingenious method to construct the
continuous blackbody spectrum of a star in a coarse but effective way. We place a red filter on our telescope that
permits only red light to enter the telescope.
Thus, we measure the brightness of a star in red light only. This is called the star’s red magnitude with
the symbol mR.
After removing the red filter, we then place a blue filter on the
telescope that permits only blue light to enter the telescope. Thus, we measure the brightness of the same
star in blue light only. This is called the star’s blue magnitude with the symbol mB.
After removing the blue filter, we then place a yellow-green filter on
the telescope that permits only yellow-green light to enter the telescope. Thus, we measure the brightness of the same
star in yellow-green light only. This is called the star’s visual magnitude with the symbol mV. (Astronomers use the word visual since
yellow-green corresponds with the primary wavelength of light emitted by our
own Sun.) After measuring the brightness
of the star at these different wavelengths, we then subtract these color
magnitudes. The difference between two
color magnitudes of the same star is called a color
index. The three possible color indices
we may calculate using these three filters are mB–mV (blue minus visual), mV–mR
(visual minus red), and mB–mR
(blue minus red). These color indices
yield estimates for the surface temperature of the star. For example, if the star radiates more blue
light than any other wavelength, its surface temperature must be hotter than
the surface temperature of our own Sun.
If the star radiates more red light than any other wavelength, its
surface temperature must be cooler than the surface temperature of our own
Sun. If the star radiates more
yellow-green (visual) light than any other wavelength, its surface temperature
must be roughly the same as the surface temperature of our own Sun. Because of the illogical magnitude scale,
both mB–mV and mV–mR will be negative numbers for hot,
blue stars. Also because of this
illogical magnitude scale, both mB–mV and mV–mR will be
positive numbers for cool, red stars. Moreover because of this illogical magnitude scale, mB–mV will be a positive number and mV–mR will be a
negative number for intermediate-temperature, yellow-green stars like our
Sun. In summary, we can estimate the
surface temperature of a star by measuring its color magnitudes (brightnesses at different wavelengths) and calculating
color indices (differences of color magnitudes). By using many more filters and carefully
measuring the brightness of the star at many different wavelengths (colors), we
can calculate many color indices (perform many subtractions) to coarsely but
effectively pinpoint the primary wavelength of a star’s continuous blackbody
spectrum, enabling us to fairly accurately calculate
its surface temperature from the Wien displacement law. Now that we have calculated the surface
temperature of the star, we can then use the Stefan-Boltzmann law to calculate
the actual radius (physical size) of the star, since the actual radius
(physical size) of the star is the only unknown remaining in that equation. This is remarkable. Even though our most powerful telescopes
cannot magnify most stars to actually see their physical radii (their physical
sizes), astronomers have nevertheless succeeded in calculating the physical
radii (physical sizes) of stars using this procedure. As the decades have passed, astronomers have
constructed larger and larger and hence more and more
powerful telescopes. For
stars that are close enough and large enough, astronomers have eventually been able
to magnify them sufficiently to actually see their physical radii
(their physical sizes) through telescopes, and the actual radii (physical sizes)
of stars that astronomers have directly measured through telescopes are
consistent with calculations from decades earlier using the distances, the
luminosities, and the surface temperatures of stars.
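The two-step procedure in this paragraph (surface temperature from the Wien displacement law, then radius from the Stefan-Boltzmann law ℒ = σ(4πR²)T⁴) can be sketched as follows. The numerical constants and the solar values used in the example are standard reference values that are not quoted in these notes, and the function names are hypothetical.

```python
import math

STEFAN_BOLTZMANN = 5.670374419e-8  # sigma, in W m^-2 K^-4
WIEN_CONSTANT = 2.897771955e-3     # Wien displacement constant, in m K


def temperature_from_peak_wavelength(peak_wavelength_m):
    """Wien displacement law: surface temperature from the primary wavelength."""
    return WIEN_CONSTANT / peak_wavelength_m


def radius_from_luminosity(luminosity_w, temperature_k):
    """Solve L = sigma * (4 * pi * R**2) * T**4 for the radius R, in meters."""
    surface_flux = STEFAN_BOLTZMANN * temperature_k ** 4
    return math.sqrt(luminosity_w / (4.0 * math.pi * surface_flux))


if __name__ == "__main__":
    # Assumed solar values: luminosity 3.828e26 W, surface temperature 5772 K.
    radius_m = radius_from_luminosity(3.828e26, 5772.0)
    print(radius_m / 1000.0, "km")
    # A star whose spectrum peaks at 500 nanometers.
    print(temperature_from_peak_wavelength(500.0e-9), "K")
```

With the assumed solar values, the computed radius comes out close to seven hundred thousand kilometers, consistent with the solar radius quoted earlier in the course.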
As we discussed earlier in
the course, the only way to calculate the mass of any object in the universe is
to use Kepler’s third law. Fortunately,
most stars are members of binary star systems: two stars orbiting each other,
as we will discuss shortly. Therefore,
we may use the orbital parameters of the two stars (the orbital period and the
semi-major axes of the orbits) to calculate the masses of the stars. In summary, astrophysicists have determined
the composition of stars, the distance to stars, the luminosity or the absolute
magnitude or the intrinsic brightness of stars, the surface temperature of
stars, the physical radius (physical size) of stars, and the mass of stars.
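As a rough sketch of how the orbital parameters of a binary star system yield the masses, we may use Kepler's third law in solar units: if the orbital period is in years and the semi-major axes are in astronomical units, the sum of the two masses in solar masses is the cube of the total separation divided by the square of the period. The individual masses then follow because each star's distance from the center of mass is inversely proportional to its mass. This solar-unit form and the function name binary_masses are assumptions of the sketch, not formulas quoted from these notes.

```python
def binary_masses(period_years, a1_au, a2_au):
    """Return (mass_1, mass_2) in solar masses for a binary star system.

    a1_au and a2_au are the semi-major axes of each star's orbit around the
    common center of mass, in astronomical units.
    """
    a_total = a1_au + a2_au
    # Kepler's third law in solar units: M1 + M2 = a**3 / P**2.
    total_mass = a_total ** 3 / period_years ** 2
    # Center of mass condition M1 * a1 = M2 * a2: the star on the smaller
    # orbit is the more massive one.
    mass_1 = total_mass * a2_au / a_total
    mass_2 = total_mass * a1_au / a_total
    return mass_1, mass_2


if __name__ == "__main__":
    # Hypothetical binary: period 16 years, orbits of 2 AU and 6 AU.
    print(binary_masses(16.0, 2.0, 6.0))
```

In this hypothetical example the total mass is two solar masses, and the star on the smaller orbit carries three quarters of it.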
At first, astronomers
classified stars based on the strength of their hydrogen lines in their
absorption spectra, since stars are composed mostly of hydrogen. Stars with the strongest hydrogen absorption
lines were called A-type stars. Stars with strong hydrogen absorption lines
but not as strong as A-type stars were called B-type
stars. Stars with strong hydrogen
absorption lines but not as strong as A-type stars or B-type stars were called C-type stars, and so on and so forth. In brief, stars with strong hydrogen
absorption lines have a spectral type near the beginning of the English
alphabet, while stars with weak hydrogen absorption lines have a spectral type
near the end of the English alphabet.
When astronomers determined the surface temperatures of stars using
color magnitudes and color indices, they realized that stars should
be classified based on their temperatures, not based on the strength of
their hydrogen absorption lines.
Therefore, astronomers reordered the stellar spectral types based on
surface temperature. Astronomers discovered
that the hottest, bluest stars are O-type stars. Stars that are hot and blue, but not as hot
and not as blue as O-type stars, were the B-type stars. Next come A-type stars, which are white-hot
stars, but not as hot as O-type or B-type stars. After A-type stars come F-type stars which
are also white-hot stars, but not as white-hot as A-type stars. Next come G-type stars which are yellow-hot
stars, like our own Sun. In fact, our
Sun is considered a G-type star. Even cooler than G-type stars are K-type
stars, which are orange in color.
Finally, the coolest, reddest stars are M-type stars. In summary, the spectral types of stars in
the correct order starting with the hottest stars are O, B, A, F, G, K, and
finally M for the coolest stars. For the
past century, all astronomers have memorized this temperature sequence using
the mnemonic, “Oh be a fine guy/gal, kiss me!”
Astronomers have also quantified this spectral sequence. In particular, each of these spectral types is subdivided into ten subclasses running from zero through
nine. The hottest, bluest stars have a
spectral type O0 followed by O1,
O2, O3, O4,
O5, O6, O7,
O8, and O9. After O9 would come
B0, B1, B2,
B3, B4, B5,
B6, B7, B8,
and B9. After B9 would come A0 through A9, then F0 through F9, G0 through G9, K0 through K9, and finally M0 through M9, the spectral type of the coolest, reddest stars. As a simple exercise, a K4-star
is hotter than a K7-star. As another simple exercise, a B6-star is cooler than a B3-star. Using this quantified temperature sequence,
our Sun is more precisely classified as a G2-star. We will
discuss shortly that stars also have a luminosity type in addition to the
spectral type. The luminosity type of a
star is labeled with a Roman numeral, such as I, II, III, IV,
and V. We will discuss the
meaning of each of these luminosity types shortly. Our Sun’s luminosity type is Roman numeral V,
as we will discuss. Therefore, our Sun’s
full spectral-luminosity type is G2V. Again, G2 is our
Sun’s spectral type, which indicates that our Sun is a yellow star. The Roman numeral V is our Sun’s luminosity
type, as we will discuss shortly.
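The quantified spectral sequence is just an ordering, which we can sketch by mapping each spectral type to a single number: ten times the position of its letter in OBAFGKM plus its subclass. A smaller number then means a hotter star. The function names below are hypothetical, and the sketch assumes whole-number subclasses from zero through nine as described above.

```python
SPECTRAL_LETTERS = "OBAFGKM"  # hottest to coolest


def spectral_index(spectral_type):
    """Map a spectral type such as 'G2' to a number; smaller means hotter."""
    letter = spectral_type[0]
    subclass = int(spectral_type[1])
    return 10 * SPECTRAL_LETTERS.index(letter) + subclass


def hotter_star(type_a, type_b):
    """Return whichever of the two spectral types is hotter."""
    if spectral_index(type_a) < spectral_index(type_b):
        return type_a
    return type_b


if __name__ == "__main__":
    print(hotter_star("K4", "K7"))  # the two exercises from the text
    print(hotter_star("B6", "B3"))
    print(spectral_index("G2"))     # our Sun
```

Only the first two characters are read, so a full spectral-luminosity type such as G2V is ordered the same way as G2.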
The Hertzsprung-Russell Diagram
The Hertzsprung-Russell
diagram (or the H-R diagram for short) is the single most important diagram in all
of astrophysics. This diagram is named for the Danish astronomer Ejnar
Hertzsprung and the American astronomer Henry Norris
Russell, the two astronomers who first constructed this diagram. The vertical axis of the Hertzsprung-Russell
diagram is luminosity or absolute magnitude or intrinsic brightness. More luminous
(intrinsically brighter) stars are toward the top of the Hertzsprung-Russell diagram, while less luminous
(intrinsically dimmer) stars are toward the bottom of the Hertzsprung-Russell
diagram. The horizontal axis of the Hertzsprung-Russell diagram is temperature or spectral type
or color. Hotter, bluer stars are toward
the left side of the Hertzsprung-Russell diagram,
while cooler, redder stars are toward the right side of the Hertzsprung-Russell
diagram. Since the horizontal axis of
the Hertzsprung-Russell diagram is temperature or
spectral type or color, the horizontal axis can be labeled with the spectral
types O, B, A, F, G, K, and M. Again,
notice that the hotter, bluer stars are toward the left, while the cooler,
redder stars are toward the right. We
emphasize that the vertical axis of the Hertzsprung-Russell
diagram is the absolute magnitude, not the apparent magnitude. Therefore, we must measure the distance to a
star to calculate its absolute magnitude (or luminosity or intrinsic
brightness) before we can plot the star on the Hertzsprung-Russell
diagram. Thus far
in this course, we have only discussed the measurement of distances to nearby
stars within the solar neighborhood, within a couple of thousand parsecs. Until we discuss higher rungs of the
Cosmological Distance Ladder, we can only construct the Hertzsprung-Russell
diagram for nearby stars, within the solar neighborhood. Fortunately, there are still millions of
stars within the solar neighborhood.
Assuming that there is nothing particularly unusual with
the stars in the solar neighborhood as compared with all other stars throughout
the universe, we should be able to determine the fundamental properties
of all the stars in the entire universe by constructing the Hertzsprung-Russell
diagram for the stars within the solar neighborhood.
The first thing we notice
when we construct the Hertzsprung-Russell diagram for
the solar neighborhood is that the vast majority of the stars on the diagram
are along a band from the upper left corner of the diagram to the lower right
corner of the diagram. The astronomers Hertzsprung and Russell called this the main part of the
diagram. Hence, this band on the Hertzsprung-Russell diagram was
eventually named the main sequence.
We will clearly define what we mean by a main sequence star
shortly. For now, hotter main sequence
stars are more luminous (intrinsically brighter), while cooler main sequence
stars are less luminous (intrinsically dimmer).
Therefore, we may naively consider main sequence stars to be normal
stars, since we simplistically expect hotter stars to be more luminous and
cooler stars to be less luminous. Also notice that the vast majority of stars on the Hertzsprung-Russell diagram are main sequence stars, again
persuading us to naively consider these main sequence stars to be normal
stars. The entire main sequence is assigned the luminosity type Roman numeral V. Our Sun is a main sequence star, as are the
vast majority of all stars. Hence, our
Sun’s luminosity type is Roman numeral V.
Thus, our Sun’s spectral-luminosity type is G2V,
where G2 is the spectral type (meaning that our Sun
is yellow hot) and Roman numeral V is the luminosity type (meaning that our Sun
is a main sequence star).
Although the vast majority of
stars on the Hertzsprung-Russell diagram are along
the main sequence, there is a collection of stars on the upper right corner of
the diagram and another collection of stars on the lower left corner of the
diagram. The collection of stars in the
upper right corner of the Hertzsprung-Russell diagram
are intrinsically bright (since they are toward the top of the diagram) and
cool (since they are toward the right on the diagram). How is it possible for a cool star to be
intrinsically bright? Some students
argue that these stars are only apparently bright, since they are closer to us,
but this argument is not correct. Again,
the vertical axis of the Hertzsprung-Russell diagram
is the absolute magnitude, not the apparent magnitude. Stars that are toward the top of the Hertzsprung-Russell diagram are not apparently bright
because they happen to be close to us; stars that are toward the top of the Hertzsprung-Russell diagram are intrinsically bright. Thus, the collection of stars on the upper
right corner of the Hertzsprung-Russell diagram are
truly intrinsically bright even though they are cool. How can this be the case? The Stefan-Boltzmann law ℒ = σ(4πR²)T⁴ reveals the answer.
The luminosity is determined by two variables:
temperature and radius (size). The
temperature is the more important variable, since it is
raised to the fourth power in the Stefan-Boltzmann law. The radius (size) is the less important
variable, since it is raised to only the second power
in the Stefan-Boltzmann law. However,
imagine a star with a radius (a size) so enormous that squaring its radius
overpowers its cool temperature to the fourth power, resulting in a large
luminosity. Thus, the collection of
stars on the upper right corner of the Hertzsprung-Russell
diagram have high luminosities (intrinsically bright) because they are giant
(since they are enormous) even though they are red (since they are cool). This is precisely why these stars are called red giants.
This collection of stars on the upper right corner of the Hertzsprung-Russell diagram is more properly subdivided
into red supergiants (the largest stars since they
are the most luminous), the red bright giants, the red ordinary giants, and the
red subgiants (the smallest red giants since they are the least luminous). The red supergiants
have luminosity type Roman numeral I, the red bright giants have luminosity
type Roman numeral II, the red ordinary giants have luminosity type Roman
numeral III, and the red subgiants have luminosity type Roman numeral IV. Red supergiants are
the largest stars in the entire universe; they have a radius comparable to the
radius of the Earth’s orbit around the Sun!
If we could replace our Sun with a red supergiant star, it would engulf
the entire inner Solar System! We will
often casually refer to the entire collection of stars on the upper right
corner of the Hertzsprung-Russell diagram as simply
red giants.
The collection of stars in
the lower left corner of the Hertzsprung-Russell
diagram are intrinsically dim (since they are toward the bottom of the diagram)
and hot (since they are toward the left on the diagram). How is it possible for a hot star to be
intrinsically dim? Some students argue
that these stars are only apparently dim, since they are further from us, but
this argument is not correct. Again, the
vertical axis of the Hertzsprung-Russell diagram is
the absolute magnitude, not the apparent magnitude. Stars that are toward the bottom of the Hertzsprung-Russell diagram are not apparently dim because
they happen to be far from us; stars that are toward the bottom of the Hertzsprung-Russell diagram are intrinsically dim. Thus, the collection of stars on the lower
left corner of the Hertzsprung-Russell diagram are
truly intrinsically dim even though they are hot. How can this be the case? The Stefan-Boltzmann law ℒ = σ(4πR²)T⁴ again reveals the answer. The luminosity is
determined by two variables: temperature and radius (size). The temperature is the more important variable,
since it is raised to the fourth power in the
Stefan-Boltzmann law. The radius (size)
is the less important variable, since it is raised to
only the second power in the Stefan-Boltzmann law. However, imagine a star with a radius (a
size) so small that squaring its radius overpowers its hot temperature to the
fourth power, resulting in a small luminosity.
Thus, the collection of stars on the lower left corner of the Hertzsprung-Russell diagram have low luminosities
(intrinsically dim) because they are dwarfs (since they are small) even though
they are white hot. This is precisely
why these stars are called white dwarfs. Excluding neutron stars and black holes, both
of which we will discuss shortly, white dwarfs are the smallest stars in the
entire universe; they have a radius roughly equal to the radius of planet
Earth! In summary, the vast majority of
stars are main sequence stars, where the main sequence runs from the upper left
corner of the Hertzsprung-Russell diagram to the
lower right corner of the Hertzsprung-Russell
diagram. Some stars are red giants,
which are toward the upper right corner of the Hertzsprung-Russell
diagram. Red giants are intrinsically
bright even though they are cool because they are so large, hence their name
red giants. Some stars are white dwarfs,
which are toward the lower left corner of the Hertzsprung-Russell
diagram. White dwarfs are intrinsically
dim even though they are hot because they are so small, hence their name white
dwarfs.
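We can check this reasoning numerically by dividing the Stefan-Boltzmann law ℒ = σ(4πR²)T⁴ for a star by the same law for our Sun, so that the constants cancel and the luminosity in solar units is the radius in solar radii squared times the temperature ratio to the fourth power. The following is a sketch with illustrative numbers only: the solar surface temperature of 5772 kelvins and the two stellar temperatures are assumed values, the red giant radius is roughly the radius of the Earth's orbit, and the white dwarf radius is roughly the radius of the Earth.

```python
SOLAR_TEMPERATURE_K = 5772.0  # assumed surface temperature of our Sun


def luminosity_in_solar_units(radius_solar, temperature_k):
    """Stefan-Boltzmann law as a ratio: L / Lsun = (R / Rsun)**2 * (T / Tsun)**4."""
    return radius_solar ** 2 * (temperature_k / SOLAR_TEMPERATURE_K) ** 4


if __name__ == "__main__":
    # Cool but enormous: about 200 solar radii at 3500 K (illustrative).
    print("red giant:", luminosity_in_solar_units(200.0, 3500.0))
    # Hot but tiny: about 0.01 solar radii at 20000 K (illustrative).
    print("white dwarf:", luminosity_in_solar_units(0.01, 20000.0))
```

With these illustrative numbers, the cool giant is thousands of times more luminous than our Sun, while the hot dwarf is only about one percent as luminous as our Sun.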
The main sequence is both a
temperature sequence and a luminosity sequence.
In particular, given any two stars on the main sequence, the hotter star
will be more luminous (intrinsically brighter), while the cooler star will be
less luminous (intrinsically dimmer).
Warning: this is only true on the main sequence. Is it possible for a hotter star to be less
luminous? Yes, white dwarfs are hot but
are intrinsically dim. Is it possible
for a cooler star to be more luminous?
Yes, red giants are cool but are intrinsically bright. However, given two stars on the main sequence,
the hotter star is indeed more luminous, and the cooler star is indeed less
luminous. For example, suppose the
spectral types of two stars are A9 and F2. Although the A9 star is certainly hotter since it has an earlier
spectral type and the F2 star is certainly cooler
since it has a later spectral type (recall OBAFGKM),
we cannot draw any conclusion about the luminosities of these two stars. If, however, in addition to the spectral types
of the two stars we are also told that both stars are on the main sequence,
only then may we draw the conclusion that the A9V
star (Roman numeral V for main sequence) is more luminous, while the F2V star (Roman numeral V for main sequence) is less
luminous.
In addition to being a
temperature sequence and a luminosity sequence, the main sequence is also a
radius (size) sequence. In particular,
given any two stars on the main sequence, the hotter, more luminous star will
be larger, while the cooler, less luminous star will be smaller. Warning: this is only true on the main
sequence. Is it possible for a hotter
star to be smaller? Yes, white dwarfs
are hot but are small. Is it possible
for a cooler star to be larger? Yes, red
giants are cool but are large. However,
given two stars on the main sequence, the hotter, more luminous star is indeed
larger, and the cooler, less luminous star is indeed smaller. For example, suppose the spectral types of
two stars are B2 and M8. Although the B2
star is certainly hotter since it has an earlier spectral type and the M8 star is certainly cooler since it has a later spectral
type (recall OBAFGKM), we cannot draw any conclusion
about the luminosities or the sizes of these two stars. If, however, in addition to
the spectral types of the two stars we are also told that both stars are on the
main sequence, only then may we draw the conclusion that the B2V star (Roman numeral V for main sequence) is more
luminous and larger, while the M8V star (Roman
numeral V for main sequence) is less luminous and smaller.
In addition to being a
temperature sequence, a luminosity sequence, and a radius (size) sequence, the
main sequence is also a mass sequence.
In particular, given any two stars on the main sequence, the hotter,
more luminous, larger star will be more massive, while the cooler, less
luminous, smaller star will be less massive.
Warning: this is only true on the main sequence. For example, suppose the spectral types of
two stars are G7 and K5. Although the G7
star is certainly hotter since it has an earlier spectral type and the K5 star is certainly cooler since it has a later spectral
type (recall OBAFGKM), we cannot draw any conclusion
about the luminosities, the radii (sizes), or the masses of these two
stars. If, however, in
addition to the spectral types of the two stars we are also told that both
stars are on the main sequence, only then may we draw the conclusion that the G7V star (Roman numeral V for main sequence) is more
luminous, larger, and more massive, while the K5V
star (Roman numeral V for main sequence) is less luminous, smaller, and less
massive.
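The logic of the last three examples can be sketched as a small decision rule: the spectral types alone tell us which star is hotter, but we may only conclude anything about luminosity, radius (size), or mass when both stars carry the luminosity type Roman numeral V. The function names are hypothetical, and the sketch assumes types written as a letter, a single digit, and an optional Roman numeral, such as G7V or K5.

```python
SPECTRAL_LETTERS = "OBAFGKM"  # hottest to coolest


def parse_type(full_type):
    """Split a type such as 'G7V' into (temperature index, luminosity type).

    The luminosity type is an empty string when it is not given.
    """
    index = 10 * SPECTRAL_LETTERS.index(full_type[0]) + int(full_type[1])
    return index, full_type[2:]


def compare_stars(type_a, type_b):
    """State what may be concluded about two stars from their types."""
    index_a, class_a = parse_type(type_a)
    index_b, class_b = parse_type(type_b)
    if index_a == index_b:
        return "same spectral type; no further conclusion"
    hotter = type_a if index_a < index_b else type_b
    if class_a == "V" and class_b == "V":
        # Both stars are on the main sequence.
        return hotter + " is hotter, more luminous, larger, and more massive"
    return hotter + " is hotter; no conclusion about luminosity, radius, or mass"


if __name__ == "__main__":
    print(compare_stars("G7", "K5"))
    print(compare_stars("G7V", "K5V"))
```

The first call reports only the temperature ordering, while the second call, where both stars are on the main sequence, reports the full set of conclusions.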
In nearly every way
imaginable, our Sun is an ordinary star.
Firstly, our Sun is a main sequence star, just as the vast majority of
stars are main sequence stars. Recall
that the spectral-luminosity type of our Sun is G2V,
and notice that its spectral type G2 places it
roughly in the middle of the main sequence.
Our Sun is not toward the beginning of the main sequence such as an
O-type or a B-type main sequence star, nor is our Sun toward the end of the
main sequence such as a K-type or an M-type main sequence star. Therefore, our Sun is not particularly hot,
nor is our Sun particularly cool; our Sun is intermediate in temperature. Our Sun is not particularly intrinsically
bright, nor is our Sun particularly intrinsically dim; our Sun is intermediate
in luminosity. Our Sun is not
particularly large, nor is our Sun particularly small; our Sun is intermediate
in size. Our Sun is not particularly
high mass, nor is our Sun particularly low mass; our Sun is intermediate in
mass. Recall that our Sun has been
fusing hydrogen into helium in its core for roughly five billion years, and it
will continue to fuse hydrogen into helium in its core for another roughly five
billion years. Therefore, our Sun is not
particularly young, nor is our Sun particularly old; our Sun is intermediate in
age. In nearly every way imaginable, our
Sun is an ordinary star.
The main sequence is a
temperature sequence, a luminosity sequence, a radius (size) sequence, a mass
sequence, and two more types of sequences that we will discuss shortly. We are compelled to ask the following
question: is there any type of sequence that the main sequence is not? When the astronomers Hertzsprung
and Russell first constructed the Hertzsprung-Russell
diagram, they believed that the main sequence was an evolutionary
sequence. In other words, they believed
that supposedly stars are born hot, bright O-type stars, and supposedly stars
cool as they shine, becoming B-type followed by A-type then F-type then G-type
then K-type until finally they supposedly die cool, dim M-type stars. Today, we realize that this is completely
incorrect. Stars do not evolve along the
main sequence. Unfortunately, the
astronomers Hertzsprung and Russell believed so
strongly that the main sequence was an evolutionary sequence that they called
the main sequence stars toward the upper left corner of the Hertzsprung-Russell
diagram early-type stars, and they called the main sequence stars toward the
lower right corner of the Hertzsprung-Russell diagram
late-type stars. Most unfortunately,
this incorrect nomenclature persists among astronomers and astrophysicists to
the present day. For example, an
astronomer or astrophysicist may refer to a K3V star
as being earlier than a K5V star. As another example, an astronomer or
astrophysicist may refer to an O9V star as being
later than an O3V star. Since this incorrect nomenclature persists to
the present day, we will also use this incorrect nomenclature in this
course. To summarize, the main sequence
is a temperature sequence, a luminosity sequence, a radius (size) sequence, a
mass sequence, and two more types of sequences that we will discuss shortly. By these sequences, we mean that given any
two stars on the main sequence, the star earlier in the sequence OBAFGKM will be hotter, more luminous, larger, and more
massive, while the star later in the sequence OBAFGKM
will be cooler, less luminous, smaller, and less massive. However, the main sequence is not an
evolutionary sequence, even though we will refer to stars toward the left of
the OBAFGKM sequence as being early-type and stars
toward the right of the OBAFGKM sequence as being
late-type. We emphasize this again: the
main sequence is not an evolutionary sequence.
If stars do not evolve along the main sequence, then how do stars
actually evolve? How are stars actually
born? How do stars actually live? How do stars actually die? This is the next major topic of this course,
and our entire discussion of stellar evolution will be in the context of the Hertzsprung-Russell diagram.
Stellar Evolution: Birth, Life, and Death
Stars are born from a diffuse
nebula, a giant cloud of gas many light-years across composed primarily of
hydrogen and helium. The
gases within a diffuse nebula are pushed by many different forces, including
thermal pressures, gravitational forces, magnetic pressures, and even cosmic
rays (ultra high-energy particles). All these different forces are comparable in
strength with each other in interstellar space (the space between star
systems). Thus, the gases within a
diffuse nebula are pushed in seemingly random directions, causing some regions
within the diffuse nebula to be more dense than average and other regions
within the diffuse nebula to be less dense (or more tenuous) than average. Small regions within a diffuse nebula may
become dense enough that gravity dominates over all the other forces. Thus, those small regions of the diffuse
nebula will collapse from their self-gravity (under their own weight). We can gain insight into how stars are born
by considering only gravitational forces and thermal pressures. Note that this simplified argument ignores
other forces, such as magnetic pressures and cosmic rays. Consider a self-gravitating cloud of gas with
thermal pressures resulting from its own temperature. If this cloud of gas is more massive than a
certain critical mass, then its self-gravity will dominate over its own thermal
pressures, and the cloud will contract.
If the cloud of gas is less massive than that critical mass, then its
own thermal pressures will dominate over its self-gravity, and the cloud will
expand. If the cloud of gas is equal in
mass to this critical mass, then its self-gravity will balance its own thermal
pressures, and the cloud will remain in equilibrium. This critical mass is
called the Jeans limit, named for the British physicist James Jeans who
first performed this simplified calculation.
Even in this simplified analysis, note that the Jeans limit is not a
particular amount of mass, since the Jeans limit itself depends upon the
temperature as well as the density of the gas.
In other words, the Jeans limit is actually a range of masses that
depends upon the temperature and the density of the gas. As a result, there is a range of masses that
a star can be born with, as we will discuss shortly. We again emphasize that this is a simplified
analysis. A cloud of gas more massive
than the Jeans limit may still not contract if magnetic pressures for example
are sufficiently strong. Astrophysicists
can measure the magnetic fields within a diffuse nebula from the polarization
of starlight that passes through the nebula, and astrophysicists have discovered
magnetic fields within regions of diffuse nebulae that are sufficiently strong
to prevent the contraction of gas within those regions of the diffuse
nebula. Nevertheless, if a small region
of a diffuse nebula is dense enough for gravity to dominate over all other
forces, then that small region of the diffuse nebula will contract, collapsing
from its self-gravity (under its own weight).
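To see how the Jeans limit depends upon both temperature and density, the following Python sketch may help. It is not part of the argument above: it assumes one common textbook form of the Jeans mass for a uniform cloud of ideal gas, and the mean molecular weight mu as well as the example clump (10 kelvins, 1e10 hydrogen molecules per cubic meter) are illustrative assumptions only.

```python
import math

# Physical constants in SI units (rounded).
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
K_B = 1.381e-23    # Boltzmann constant, J/K
M_H = 1.673e-27    # mass of a hydrogen atom, kg
M_SUN = 1.989e30   # one solar mass, kg


def jeans_mass(temperature_k, density_kg_m3, mu=2.0):
    """Approximate Jeans mass in kilograms for a uniform cloud.

    Assumed textbook form:
        M_J = (5 k T / (G mu m_H))**1.5 * (3 / (4 pi rho))**0.5
    mu is the assumed mean molecular weight (2.0 for pure molecular hydrogen).
    """
    thermal = (5.0 * K_B * temperature_k / (G * mu * M_H)) ** 1.5
    geometric = (3.0 / (4.0 * math.pi * density_kg_m3)) ** 0.5
    return thermal * geometric


# Hypothetical cold, dense clump: 10 K with 1e10 hydrogen molecules per cubic meter.
clump_density = 1e10 * 2.0 * M_H
clump_jeans_solar = jeans_mass(10.0, clump_density) / M_SUN

# Warming the gas raises the critical mass; compressing the gas lowers it.
warmer = jeans_mass(20.0, clump_density)
denser = jeans_mass(10.0, 4.0 * clump_density)
```

In this sketch, doubling the temperature raises the critical mass, while quadrupling the density halves it. This is why a cold, dense region of a diffuse nebula is the region most likely to collapse.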
At first, the collapse does not significantly change the temperature of
the gas, since the gas is so tenuous (low density) that its constituent particles
are far apart and almost never collide with one
another. However, as the cloud continues
to collapse, it becomes more and more dense and hence more and more opaque
(less and less transparent). Eventually,
the cloud becomes so dense that if it continues to collapse, its constituent
particles begin to collide with one another more and more frequently, thus
causing the collapsing cloud to become warmer.
The collapsing cloud has now become sufficiently dense that it is able
to convert gravitational energy into heat, which is Kelvin-Helmholtz
(gravitational) contraction as we discussed.
Although this collapsing cloud is not yet a star, this transition in
density and hence opacity marks the beginning of an important stage of
development. Therefore, the collapsing
cloud is now called a protostar. As a protostar
collapses, it becomes hotter and hotter due to Kelvin-Helmholtz (gravitational)
contraction. These hotter temperatures
cause greater thermal pressures, which push against the self-gravity of the protostar. Hence,
the collapse of the protostar slows. This imbalance between gravitational forces
and thermal pressures may cause pulsations within the protostar,
causing its size to oscillate from large to small and back again. As a result, the luminosity of the protostar oscillates from bright to dim and back
again. These protostars
are called T Tauri variable
stars, which we will discuss later in the course. For now, if the protostar
is sufficiently massive for its self-gravity to continue to dominate over all
other forces, then it will continue to collapse, becoming hotter and
hotter. Eventually, the protostar has collapsed to such a small size that its core
temperature reaches millions of kelvins, and hydrogen begins fusing into helium. These nuclear fusion reactions provide an
outward pressure to balance inward self-gravity. When the protostar
attains gravitational equilibrium, we say that a star is born.
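As a rough check that gravitational contraction can indeed heat a core to millions of kelvins, consider the following order-of-magnitude sketch in Python. It assumes an ideal gas in which the thermal energy per particle is comparable to the gravitational energy per particle, so that T is roughly G M mu m_H / (k R). This estimate, the function name, and the mean molecular weight mu = 0.6 (ionized hydrogen and helium) are assumptions made for illustration, not results from these notes.

```python
# Physical constants in SI units (rounded).
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
K_B = 1.381e-23    # Boltzmann constant, J/K
M_H = 1.673e-27    # mass of a hydrogen atom, kg
M_SUN = 1.989e30   # one solar mass, kg
R_SUN = 6.957e8    # one solar radius, m


def core_temperature_estimate(mass_kg, radius_m, mu=0.6):
    """Order-of-magnitude interior temperature in kelvins: T ~ G M mu m_H / (k R).

    This is a scaling estimate only, not a stellar structure model.
    """
    return G * mass_kg * mu * M_H / (K_B * radius_m)


# A one-solar-mass object that has collapsed to one solar radius.
sun_estimate = core_temperature_estimate(M_SUN, R_SUN)

# The same mass spread over one hundred solar radii (an earlier, larger protostar).
early_protostar_estimate = core_temperature_estimate(M_SUN, 100.0 * R_SUN)
```

With these assumptions, the estimate for one solar mass at one solar radius is on the order of ten million kelvins, while the same mass at one hundred times the radius is one hundred times cooler. The smaller the protostar becomes, the hotter its interior becomes.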
All stars are born main
sequence stars. If all stars are born
main sequence stars, then where do red giants and white dwarfs come from? These stars come from stellar death, as we
will discuss shortly. For now, all stars
are born main sequence stars, but where along the main sequence are stars
born? With which spectral type, O, B, A,
F, G, K, or M, is a star born? As we
discussed, the Jeans limit is actually a range of masses. Hence, there is a range of many different
masses a star can be born with, and it is the mass that a
star is born with that determines the spectral type of the star. In fact, the mass of a star is the single
most important physical quantity of a star.
The mass of a star determines how it will be born, how it will live, and
how it will die. We will discuss stellar
life and stellar death shortly. For now,
if a star happens to be born with high mass because it had to overcome a large
Jeans limit, then it will be born early on the main sequence, perhaps O-type or
B-type. If a star happens to be born
with low mass because it had to overcome a small Jeans limit, then it will be
born late on the main sequence, perhaps K-type or M-type. If a star happens to be born with
intermediate mass because it had to overcome an intermediate Jeans limit, then
it will be born roughly in the middle of the main sequence, perhaps A-type,
F-type, or G-type. In brief, the mass a
star is born with determines its spectral type on the main sequence. Our argument is as follows. If a star happens to be born with high mass,
it will have strong self-gravity.
Therefore, a strong outward pressure is necessary to balance that strong
inward self-gravity, and there will be a correspondingly hot temperature
associated with that strong pressure.
Hence, the star will be born hot and bright. If a star happens to be born with low mass,
it will have weak self-gravity.
Therefore, only a weak outward pressure is necessary to balance that weak
inward self-gravity, and there will be a correspondingly cool temperature
associated with that weak pressure.
Hence, the star will be born cool and dim. This explains why the main sequence is a
temperature sequence, a luminosity sequence, and a mass sequence. High-mass stars must be born hot and bright
to provide the strong outward pressure to balance the strong inward self-gravity
created by their high mass, while low-mass stars must be born cool and dim to
provide the weak outward pressure to balance the weak inward self-gravity
created by their low mass.
There is an upper limit of
mass that a star is permitted to be born with. This limit is called
the Eddington limit, named for the British physicist Arthur Eddington who first
calculated this upper mass limit. The
Eddington limit is roughly equal to 100M☉ (one hundred solar masses or one hundred times the
mass of our Sun). If a protostar happens to have a mass greater than this
Eddington limit, then the outward radiation pressure generated by its
incredible luminosity will not just balance its inward self-gravity; that
enormous outward radiation pressure will overpower its inward self-gravity. The protostar collapses
at first as usual, but the enormous outward radiation pressure eventually halts
the collapse and actually begins to expand the protostar. Essentially, the protostar
blows itself apart before it could ever be born a main sequence star. Indeed, astronomers have never discovered a
star with a mass significantly greater than roughly 100M☉ (one
hundred solar masses or one hundred times the mass of our Sun). This Eddington limit defines the beginning of
the main sequence. The earliest main
sequence star has spectral-luminosity type O0V, and
these stars have a mass roughly equal to the Eddington limit of roughly 100M☉ (one
hundred solar masses or one hundred times the mass of our Sun). If a protostar
happens to have a mass greater than the Eddington limit, it will blow itself
apart before it can even be born a main sequence star. If a protostar
happens to have a mass less than the Eddington limit, it will be born a main
sequence star, fusing hydrogen into helium in its core.
There is a lower limit of
mass that a star is permitted to be born with, roughly
equal to 0.08M☉ (0.08
solar masses or eight percent the mass of our Sun). If a protostar
happens to have a mass less than this lower limit,
then its self-gravity will be so weak that the outward pressure necessary to
balance its weak inward self-gravity is also extraordinarily weak. The corresponding temperature is so cool that
nuclear fusion is never ignited in the core. The protostar does
eventually stop collapsing and attains gravitational equilibrium with outward
pressure balancing inward self-gravity, but the outward pressure is not provided by the nuclear fusion of hydrogen into
helium. The outward pressure is provided by electron degeneracy pressure, which we will
discuss in detail shortly. In this
course, we strictly define a main sequence star as a star that fuses hydrogen
into helium in its core. Therefore, main
sequence stars are also called hydrogen-burning stars. The use of the word burning is strictly
incorrect, since the word burning implies chemical reactions instead of nuclear
reactions. Nevertheless, astronomers and
astrophysicists use this word burning not just for hydrogen fusing into helium but for any nuclear reaction. Again, the strict definition of a main
sequence star is a hydrogen-burning star, a star that fuses hydrogen into
helium in its core. If a protostar happens to have a mass less than 0.08M☉ (0.08
solar masses or eight percent the mass of our Sun), then it will not be born a
main sequence star. The protostar becomes a very low mass sphere of mostly hydrogen
and helium that is not hot enough to fuse hydrogen into helium in its
core. These are called
brown dwarf stars, although they are not strictly stars. The simple term brown dwarf instead of the
term brown dwarf star would be more correct.
This lower limit of 0.08M☉ (0.08 solar masses or eight percent the mass of our
Sun) defines the end of the main sequence.
The latest main sequence star has spectral-luminosity type M9V, and these stars have a mass roughly equal to 0.08M☉ (0.08
solar masses or eight percent the mass of our Sun). We can actually plot brown dwarfs on the Hertzsprung-Russell diagram. Since brown dwarfs are less massive and
cooler and dimmer and smaller than even M9V stars at
the end of the main sequence, brown dwarfs would be further to the right (since
they are cooler) and further down (since they are dimmer) than the end of the
main sequence. These brown dwarfs even
have their own spectral type; brown dwarfs are classified
as L-type stars, even cooler and dimmer than M-type main sequence stars. Therefore, a more complete listing of
spectral types in the correct order from hottest to coolest is OBAFGKML. These
L-type stars (brown dwarfs) are sufficiently cool that they radiate mostly
infrared light and very little visible light.
If a protostar happens to have a mass greater
than 0.08M☉ (0.08
solar masses or eight percent the mass of our Sun), then the protostar will be born a main sequence (hydrogen-burning)
star, fusing hydrogen into helium in its core.
If a protostar happens to have a mass less
than 0.08M☉ (0.08
solar masses or eight percent the mass of our Sun), then the protostar will be born a brown dwarf, a very low mass
sphere of mostly hydrogen and helium that is not hot enough to fuse hydrogen
into helium in its core. These brown
dwarf stars should sound familiar. A
gas-giant planet is a sphere of mostly hydrogen and helium that is much smaller
and much less massive than a star and is not hot enough to fuse hydrogen into
helium in its core, as we discussed earlier in the course. Indeed, there is virtually no difference
between a brown dwarf star and a gas giant planet. The only difference is the circumstances of their
formation (their birth). If the object
formed from a collapsing cloud of gas within a diffuse nebula, then we name it
a brown dwarf star. If the object formed
within the protoplanetary disk orbiting a true main sequence star, then we name
it a gas giant planet. Other than their
formation (how they are born), there is virtually no difference between a brown
dwarf star and a gas giant planet.
However, many students then conclude that Jupiter is a failed star. These students argue that if Jupiter had been
just a little more massive, it would have become a true main sequence star,
resulting in us living in a binary star system.
(We will discuss binary star systems shortly.) This conclusion is false. The minimum mass necessary to become a true
main sequence star is roughly 0.08M☉ (0.08 solar masses or eight percent the mass of our
Sun), but Jupiter has a mass of only 0.001M☉
(one-thousandth of a solar mass), as we discussed earlier in the course. The ratio between 0.08 and 0.001 is
eighty. Hence, the minimum mass
necessary to become a true main sequence star is roughly eighty jovian masses. In other words, Jupiter only has a very small
fraction (one-eightieth) of the mass necessary to become a true main sequence
star. Thus, the mass of Jupiter is not
close to the minimum mass necessary to become a true main sequence star. Therefore, Jupiter should
certainly be regarded as a gas giant planet, not as a failed star. On the other hand, Jupiter might
be incorrectly regarded as a brown dwarf star by intelligent alien
lifeforms living billions of years from now, as we will discuss.
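The two mass limits and the Jupiter comparison above can be summarized in a short Python sketch. The numerical limits are the rounded values quoted in these notes; the function name and the outcome labels are hypothetical, chosen only for illustration.

```python
# Rounded values quoted in these notes, in solar masses.
EDDINGTON_LIMIT = 100.0          # upper mass limit for a main sequence star
HYDROGEN_BURNING_LIMIT = 0.08    # lower mass limit for a main sequence star
JUPITER_MASS = 0.001             # mass of Jupiter


def protostar_outcome(mass_solar):
    """Return the fate of a collapsing protostar of the given mass (solar masses)."""
    if mass_solar > EDDINGTON_LIMIT:
        return "blows itself apart"
    if mass_solar < HYDROGEN_BURNING_LIMIT:
        return "brown dwarf"
    return "main sequence star"


# How many jovian masses are needed to reach the hydrogen-burning limit (roughly eighty).
jovian_masses_needed = HYDROGEN_BURNING_LIMIT / JUPITER_MASS
```

For example, protostar_outcome(1.0) returns "main sequence star", while protostar_outcome(0.05) returns "brown dwarf". Note that this sketch classifies by mass alone; it cannot distinguish a brown dwarf from a gas giant planet, since that distinction depends upon how the object formed.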
As a protostar
collapses, it spins faster and faster in accordance with the Law of
Conservation of Angular Momentum. The
amount by which a protostar collapses is so
tremendous that we can easily calculate that incredibly strong centrifugal
forces should rip apart all protostars during their
collapse, thus preventing any stars from ever forming. Fortunately, protostars
actually lose angular momentum as they collapse. Firstly, a protoplanetary disk forms around
the protostar that will eventually form planets, as
we discussed earlier in the course. Most
of the angular momentum of the collapsing gas resides in the material orbiting
around the protostar, not in the protostar
itself. In the case of our own Solar
System for example, although the mass of the Sun accounts for roughly 99.9
percent of the total mass of the entire Solar System as we discussed earlier in
the course, the rotational angular momentum of the Sun accounts for less than
four percent of the total angular momentum of the entire Solar System. In other words, the orbiting planets account
for more than ninety-six percent of the total angular momentum of the entire
Solar System. Secondly, the magnetic
field of a protostar strengthens as it
collapses. This strengthening magnetic
field ejects ionized gases at fast speeds, and these ejected ionized gases
carry angular momentum away from the protostar. These ionized gases are
often ejected as narrow columns or jets near the angular momentum axis
of the forming star system, and these jets illuminate surrounding gases as the
fast-moving jets collide with the surrounding gases. These illuminated gases, together with the
colliding ionized gases ejected from the young star system, are
called Herbig-Haro objects or HH objects for short, named for the American astronomer
George Herbig and the Mexican astronomer Guillermo Haro who discovered them.
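The angular momentum budget of our Solar System quoted above can be checked with a rough Python sketch. All input values are rounded and are assumptions of this sketch rather than figures from these notes: the planetary orbits are treated as circles, only the four giant planets are included, and the moment of inertia of the Sun is taken to be roughly 0.07 M R² because the Sun is centrally concentrated.

```python
import math

# Rounded solar values (assumptions of this sketch).
M_SUN = 1.989e30                      # mass of the Sun, kg
R_SUN = 6.957e8                       # radius of the Sun, m
SUN_ROTATION_PERIOD_S = 25.4 * 86400  # approximate rotation period, s
SUN_INERTIA_FACTOR = 0.07             # assumed: I is roughly 0.07 M R^2

# Rounded planetary values: (mass in kg, orbital radius in m, orbital speed in m/s).
GIANT_PLANETS = {
    "Jupiter": (1.898e27, 7.785e11, 1.307e4),
    "Saturn": (5.683e26, 1.4335e12, 9.68e3),
    "Uranus": (8.681e25, 2.8725e12, 6.80e3),
    "Neptune": (1.024e26, 4.495e12, 5.43e3),
}

# Rotational angular momentum of the Sun: L = I * omega.
sun_spin = (SUN_INERTIA_FACTOR * M_SUN * R_SUN ** 2
            * 2.0 * math.pi / SUN_ROTATION_PERIOD_S)

# Orbital angular momentum of each planet on a circular orbit: L = m * r * v.
planet_orbital = sum(m * r * v for m, r, v in GIANT_PLANETS.values())

# Fraction of the total angular momentum that resides in the Sun's rotation.
sun_fraction = sun_spin / (sun_spin + planet_orbital)
```

Under these assumptions, the Sun's share comes out well below four percent, consistent with the statement above that the orbiting planets carry nearly all of the angular momentum of the Solar System even though the Sun carries nearly all of the mass.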
Although protostars lose most of their angular
momentum through these mechanisms, they nevertheless collapse by such
tremendous amounts that they do rotate faster as they collapse. A protostar
eventually rotates so fast that the centrifugal force becomes sufficiently
strong that the protostar usually rips itself apart,
but not completely. A protostar usually rips apart into two protostars;
these two protostars remain orbiting each other, and
both protostars are eventually born as two main
sequence stars orbiting each other. This
process is called fragmentation, and it results in a
binary star system: two stars orbiting each other with possibly planets orbiting
both of the stars. Most star systems are
binary star systems, since fragmentation usually occurs from
the strong centrifugal forces from the increase in rotational speed due to the
tremendous collapse in size of the protostar. Fragmentation can be even more severe,
resulting in three protostars that eventually are
born as a trinary star system: three stars orbiting each other with possibly
planets orbiting all three stars.
Usually, the two more massive stars orbit each other on a tighter orbit
while the least massive third star orbits those two stars on a somewhat larger
orbit. Planets would then orbit all
three stars on even larger orbits. The
closest star system to our Solar System, the α
(alpha) Centauri star system, happens to be a trinary star system. This star system is slightly more than one
parsec distant from our Solar System, and the three stars are
named α (alpha) Centauri A, α (alpha) Centauri B, and α
(alpha) Centauri C. The star α
(alpha) Centauri A happens to be a G2V star, just like
our own Sun. The star α (alpha)
Centauri C happens to be closer to our Solar System than the other two
stars. Hence, α (alpha) Centauri C
is the closest star to us, besides the Sun of course. For this reason, the star α (alpha)
Centauri C is also called Proxima
Centauri, since the Latin word proxima means nearest (compare the English word proximity). Other nearby star systems include the Barnard
star system nearly two parsecs distant, the Luhman 16
binary star system roughly two parsecs distant, the Wolf 359 star system
roughly 2.4 parsecs distant, and the Sirius binary star system nearly three
parsecs distant. We will discuss the
Sirius binary star system in more detail shortly. Fragmentation may result in a quadruplet star
system: four stars orbiting each other with possibly planets orbiting all four
stars. Often, two of the stars orbit
each other on one tight orbit, the other two stars orbit each other on another
tight orbit, and both pairs of stars orbit each other on a somewhat larger orbit. Essentially, a quadruplet star system is
often a double binary star system.
Planets would then orbit all four stars on even larger orbits. Fragmentation may result in quintuplet star
systems (five stars orbiting each other with possibly planets orbiting all five
stars), sextuplet star systems (six stars orbiting each other with possibly
planets orbiting all six stars), and so on and so forth. All such star systems are rare. Again, most star systems are binary star
systems, with a fair number of single-star systems, such
as our own Solar System. We discussed
that our Sun is an ordinary star in several respects. However, there is one thing unusual about our
Sun: it did not suffer from fragmentation while it was being born as a protostar, since it is the only star in our Solar
System. This is unusual, since protostars usually suffer from fragmentation, as we
discussed. Since fragmentation usually
occurs, a high mass protostar will
usually become fragmented into lower mass protostars. Hence, high mass main sequence stars are less
abundant (more rare), while low mass main sequence stars are more abundant
(more common). We conclude that the main
sequence is also a population abundance sequence, meaning earlier main sequence
stars (hotter, more luminous, larger, more massive main sequence stars) are
less abundant (more rare), while later main sequence stars (cooler, less
luminous, smaller, less massive main sequence stars) are more abundant (more
common). Therefore, most stars are born
M-type main sequence stars. Many stars
are also born K-type main sequence stars, but not as commonly as M-type main
sequence stars. A fair number of stars
are born G-type main sequence stars. Few
stars are born F-type main sequence stars, and even fewer stars are born A-type
main sequence stars. A
small fraction of all stars are born B-type main sequence stars, and
only a tiny fraction of stars are born O-type main sequence stars. To summarize, the main sequence is a
temperature sequence, a luminosity sequence, a radius (size) sequence, a mass
sequence, a population abundance sequence, and one more type of sequence that
we will discuss shortly. By these
sequences, we mean that given any two stars on the main sequence, the star
earlier in the sequence OBAFGKM will be hotter, more
luminous, larger, more massive, and less abundant (more rare), while the star
later in the sequence OBAFGKM will be cooler, less
luminous, smaller, less massive, and more abundant (more common). Caution: the main sequence is not an
evolutionary sequence!
After a main sequence star
(hydrogen-burning star) is born, it spends its life fusing hydrogen into helium
in its core. The duration of time that a
star spends fusing hydrogen into helium in its core depends upon its mass. Again, we see that the mass of a star is the
single most important physical quantity of a star. The mass of a star determines how it will be
born, how it will live, and how it will die.
We have already discussed stellar birth, and we will discuss stellar
death shortly. For now, the mass of a
star determines the duration of its main sequence (hydrogen-burning)
lifetime. Many students argue that high
mass stars should live longer lives, since they have more mass and therefore
more hydrogen to use as fuel for the nuclear fusion reactions in the core. These students also argue that low mass stars
should live shorter lives, since they have less mass and therefore less
hydrogen to use as fuel for the nuclear fusion reactions in the core. Although this argument seems reasonable, it
is completely wrong. In fact, the
opposite is true: high mass main sequence stars have shorter lifetimes, while
low mass main sequence stars have longer lifetimes. Firstly, simple calculations using the basic
properties of stars reveal that this must be the case. High mass stars may have more mass and therefore more hydrogen to use as fuel for the
nuclear fusion reactions in the core, but high mass stars are also much more
luminous. This luminosity is ultimately
coming from the nuclear reactions in the core, and therefore we conclude that
the nuclear reactions are proceeding at a faster rate. Low mass stars may have less mass and
therefore less hydrogen to use as fuel for the nuclear fusion reactions in the
core, but low mass stars are also much less luminous. Again, this luminosity is ultimately coming
from the nuclear reactions in the core, and therefore we conclude that the
nuclear reactions are proceeding at a slower rate. These conclusions drawn from simple
calculations are consistent with more complex calculations. Nuclear reactions should proceed at faster
rates at hotter temperatures, and nuclear reactions should proceed at slower
rates at cooler temperatures. (The same
is also true for chemical reactions.)
High mass main sequence stars are so hot that the nuclear fusion
reactions proceed so quickly that these high mass stars burn through all their
hydrogen in an extremely brief amount of time, even though they have much more
hydrogen to burn than low mass stars. Low
mass main sequence stars are so cool that the nuclear fusion reactions proceed
so slowly that it takes these low mass stars an extremely long amount of time
to burn through their hydrogen, even though they have much less hydrogen to
burn than high mass stars. Many students
argue that the lifetime of a high mass star may be relatively shorter but must
in fact be actually longer since it has much more hydrogen to burn. These students also argue that the lifetime
of a low mass star may be relatively longer but must in fact be actually
shorter since it has much less hydrogen to burn. This argument is again false. High mass main sequence stars are so hot with
such tremendous luminosity that the nuclear fusion reactions must proceed so
quickly that their actual lifetime is truly shorter,
even though these high mass stars have much more hydrogen to burn. Low mass main sequence stars are so cool with
such little luminosity that the nuclear fusion reactions must proceed so slowly
that their actual lifetime is truly longer, even though these low mass stars
have much less hydrogen to burn. The
following analogy is helpful. Most
students believe that a rich person who earns millions of dollars per year will
be able to survive much longer than a poor slob who only earns a few thousand
dollars per year, but the opposite is true.
A rich person who earns millions of dollars per year almost always
spends their money at a furious pace.
The rich spend their money so fast that they burn through their money in
a short amount of time, even though they have more money to burn. A poor slob who earns only a few thousand
dollars per year hardly has any money to spend.
Hence, the poor spend their money at such a slow pace that they can survive
their entire lifetimes on their miserable salaries. Indeed, this is actually the case; the
highest bankruptcy rates are among the rich, not among the poor. Similarly, high mass main sequence stars burn
through their hydrogen more quickly even though they have more hydrogen to
burn, while low mass main sequence stars burn through their hydrogen more
slowly even though they have less hydrogen to burn. O-type main sequence stars are so hot and so
luminous that their nuclear fusion reactions proceed so quickly that they burn
through their hydrogen in an incredibly short amount of time even though they
have much more hydrogen to burn; the main sequence lifetime of an O-type star
is roughly one million years, incredibly short by astronomical terms. B-type main sequence stars are also hot enough
and luminous enough that their nuclear fusion reactions proceed so quickly that
they burn through their hydrogen in a very short amount of time, although they
are not as hot and not as luminous as O-type stars. Therefore, B-type stars live somewhat longer
than O-type stars; the main sequence lifetime of a B-type star is roughly ten
million years. A-type stars are also hot
and luminous, but not as hot and not as luminous as O-type or B-type
stars. Therefore, their nuclear fusion
reactions proceed somewhat more slowly, giving them a somewhat longer main
sequence lifetime of roughly one hundred million years. F-type stars are not as hot and not as
luminous as A-type stars; therefore, their nuclear fusion reactions proceed
somewhat more slowly, giving them a somewhat longer main sequence lifetime of
roughly one billion years. G-type stars
have an even longer main sequence lifetime of roughly ten billion years. Recall that our Sun is a G2V
star. Also recall that our Sun has been
fusing hydrogen into helium in its core for the past roughly five billion
years, and also recall that our Sun will continue
fusing hydrogen into helium in its core for the next roughly five billion
years. Therefore, the entire main
sequence lifetime of our Sun is roughly ten billion years, as it should be for
a G-type main sequence star. K-type
stars are so cool and so dim that their nuclear fusion reactions proceed so
slowly that the main sequence lifetime of a K-type star is roughly one hundred
billion years. This is longer than the
current age of the universe, which is only roughly fourteen billion years. Therefore, every K-type
star that has ever been born has not died yet. We must wait at least an additional roughly
eighty-six billion years before K-type stars begin to die. Finally, M-type stars are so cool and so dim
that their nuclear fusion reactions proceed so slowly that the main sequence
lifetime of an M-type star is roughly one trillion years, much, much longer than
the current age of the universe.
Therefore, every M-type star that has ever been born
has not died yet. We must wait
countless billions of years before any M-type stars begin to die. In summary, the main sequence is a lifetime
sequence. Given any two stars on the
main sequence, the star earlier in the sequence OBAFGKM
will have a shorter main sequence lifetime, while the star later in the
sequence OBAFGKM will have a longer main sequence
lifetime. Brown dwarf stars are so cool
that they do not fuse hydrogen into helium in their cores. Consequently, brown dwarf stars do not expend
their hydrogen, and so we may regard brown dwarf stars as living indefinitely.
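The lifetime argument above amounts to dividing the available fuel by the rate at which it is burned. The following Python sketch tabulates the rough lifetimes quoted in these notes and compares them with the current age of the universe. The second function is an assumption added for illustration: it supposes that luminosity grows as roughly the 3.5 power of mass, a commonly quoted approximation for main sequence stars that is not derived in these notes, and it normalizes the lifetime to ten billion years for one solar mass.

```python
# Rough values quoted in these notes, in years.
AGE_OF_UNIVERSE_YEARS = 14e9
MAIN_SEQUENCE_LIFETIME_YEARS = {
    "O": 1e6, "B": 1e7, "A": 1e8, "F": 1e9,
    "G": 1e10, "K": 1e11, "M": 1e12,
}


def years_until_first_deaths(spectral_type):
    """Years we must still wait before stars of this type can begin to die.

    Returns 0.0 if the universe is already older than the main sequence lifetime.
    """
    lifetime = MAIN_SEQUENCE_LIFETIME_YEARS[spectral_type]
    return max(0.0, lifetime - AGE_OF_UNIVERSE_YEARS)


def scaled_lifetime_years(mass_solar, luminosity_exponent=3.5):
    """Lifetime ~ fuel / burn rate ~ M / L, assuming L is proportional to M**exponent.

    Normalized so that one solar mass gives ten billion years.
    """
    return 1e10 * mass_solar / mass_solar ** luminosity_exponent
```

For example, years_until_first_deaths("K") gives the roughly eighty-six billion years mentioned above, while years_until_first_deaths("G") gives zero, since the universe is already older than the lifetime of a G-type star. In the scaling sketch, a star of ten solar masses has ten times the fuel of our Sun but burns it thousands of times faster, and so its lifetime is much shorter.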
We will discuss how the mass
of a star determines its death shortly.
For now, we briefly mention that the mass of a star not only determines
its main sequence lifetime but the duration of its death as well. The main-sequence lifetime of low-mass stars
is in the billions of years, while the main-sequence lifetime of high-mass
stars is only in the millions of years.
The processes involved with stellar death are shorter in duration as compared
with a star’s main-sequence lifetime, but these shorter durations are in
approximate proportion to the main sequence lifetimes. In particular, the death of a low-mass star
is millions of years in duration, while the death of a high-mass star is only
thousands of years in duration. The mass
of a star even determines its protostar-lifetime. In particular, the collapse of a low-mass protostar is millions of years in duration, while the
collapse of a high-mass protostar is only thousands
of years in duration. Note that in all
cases, the main-sequence lifetime of a star is overwhelmingly longer than the
duration of its birth as a collapsing protostar and
overwhelmingly longer than the duration of its death. The main-sequence lifetime of a star is so overwhelmingly
longer than its birth and its death that the total lifetime of a star may be
regarded as its main-sequence lifetime as an excellent
approximation. Note that the total
lifetime of a high mass star is in the millions of years, but it takes that
long just for a low-mass protostar to collapse. In other words, a high-mass star could be
born, could live its entire main-sequence lifetime, and could die all in a time
shorter than the time it takes a low-mass protostar
to collapse, meaning that a high-mass star is born, lives, and dies even before
a low-mass star can even be born!
Strictly, there are gradual
changes in the luminosity, the temperature, and even the radius (the size) of a
star over its main sequence lifetime.
Nevertheless, these main-sequence changes are small as compared with the
changes in these quantities during stellar birth, when the changes are much
more severe. Also,
main-sequence changes are small as compared with the changes in these
quantities during stellar death, when the changes are also much more severe, as
we will discuss shortly. Therefore, we
will regard the luminosity, the temperature, and the radius (the size) of a
star as approximately constant (or fixed) over its main sequence lifetime as an
excellent approximation. Since the
main-sequence lifetime of a star is overwhelmingly longer than its birth and
its death, a star remains at approximately the same location on the main
sequence on the Hertzsprung-Russell diagram during
most of its life. This validates our comparison
of temperatures, luminosities, radii (sizes), masses, population abundances,
and lifetimes of main sequence stars as physically meaningful. It is therefore appropriate to summarize the
main sequence. The main sequence is a
temperature sequence, a luminosity sequence, a radius (size) sequence, a mass
sequence, a population abundance sequence, and a lifetime sequence. By these sequences, we mean
that given any two stars on the main sequence, the main sequence star earlier
in the sequence OBAFGKM will be hotter, more
luminous, larger, more massive, less abundant (more rare), with a shorter main
sequence lifetime, while the main sequence star later in the sequence OBAFGKM will be cooler, less luminous, smaller, less
massive, more abundant (more common), with a longer main sequence lifetime. Warning: all of these conclusions can only be drawn if both stars are on the main
sequence. If even one of the two stars
is not on the main sequence, we cannot easily make any comparisons between the
two stars. Finally, the main sequence is
not an evolutionary sequence, as we are currently discussing. In particular, a star can be born anywhere
along the main sequence depending on its mass, and a star will remain at its
particular location on the main sequence throughout its main sequence lifetime,
fusing hydrogen into helium in its core.
When stars die, they actually evolve off the
main sequence, as we now discuss.
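The main sequence comparisons summarized above can be sketched in a few lines of Python. This is only an illustration: the function name compare_main_sequence is hypothetical, and the sketch assumes, as the warning above requires, that both stars are on the main sequence.

```python
# Spectral classes of the main sequence, ordered from earliest (O) to latest (M).
SEQUENCE = "OBAFGKM"


def compare_main_sequence(type_a: str, type_b: str) -> str:
    """Compare two spectral classes, ASSUMING both stars are on the main sequence.

    The comparison is meaningless if either star is a giant, a supergiant,
    or a white dwarf.
    """
    a_class, b_class = type_a.strip().upper(), type_b.strip().upper()
    for spectral_class in (a_class, b_class):
        if len(spectral_class) != 1 or spectral_class not in SEQUENCE:
            raise ValueError(f"unknown spectral class: {spectral_class!r}")

    a_rank, b_rank = SEQUENCE.index(a_class), SEQUENCE.index(b_class)
    if a_rank == b_rank:
        return f"{a_class} and {b_class} are the same class; the letter alone gives no ordering"

    # The class earlier in OBAFGKM wins every comparison except lifetime and abundance.
    earlier, later = (a_class, b_class) if a_rank < b_rank else (b_class, a_class)
    return (
        f"{earlier} is hotter, more luminous, larger, more massive, "
        f"less abundant, and shorter-lived than {later}"
    )


if __name__ == "__main__":
    print(compare_main_sequence("G", "M"))
    print(compare_main_sequence("K", "B"))
```

For example, comparing our Sun's class G with class M reports that the G-type star is hotter, more luminous, larger, more massive, less abundant, and shorter-lived.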
Stellar death is defined to begin when a main sequence star has exhausted
the hydrogen in its core, having fused the hydrogen into helium. Without hydrogen in its core to fuse into
helium, the star’s main sequence (hydrogen-burning) lifetime has ended, and
stellar death begins. For the purposes
of stellar death, we divide all main sequence stars into two categories: low
mass main sequence stars and high mass main sequence stars. A low mass main sequence star has a mass less
than 7M☉, 8M☉, or 9M☉ (seven,
eight, or nine solar masses or seven, eight, or nine times the mass of our
Sun). A high mass main sequence star has
a mass greater than 7M☉, 8M☉, or 9M☉ (seven,
eight, or nine solar masses or seven, eight, or nine times the mass of our
Sun). Note that our Sun is a low mass
main sequence star as far as stellar death is concerned, since the mass of our
Sun is 1M☉ (one
solar mass), and one is less than seven, eight, or nine! In terms of spectral types, low mass main
sequence stars have spectral types A, F, G, K, or M, while high mass main
sequence stars have spectral types O or B.
Again, our Sun is a G2V star, which falls into
the low mass main sequence category. The
vast majority of all main sequence stars are low mass; only a very small
fraction of all main sequence stars are high mass. We divide all main sequence stars into these
two categories because low mass death and high mass death are sufficiently
different that we must discuss them separately.
Actually, low mass death and high mass death are somewhat similar to
each other. High mass death is simply
more violent as compared with low mass death.
In other words, low mass death is more gentle
as compared with high mass death. Since
the vast majority of all main sequence stars are low mass, most stars die
gently. Since only a very small fraction
of all main sequence stars are high mass, few stars die violently. Even though high mass death is rare, we must
devote a thorough discussion to high mass death, since we owe our very
existence to violent high mass death, as we will discuss shortly. Nevertheless, we begin our discussion with
low mass death, since the vast majority of all stars die gently, including our
own Sun.
Low mass stars have long main
sequence lifetimes. After exhausting the
hydrogen in the core, the nuclear fusion reactions end. Thus, there is no outward pressure to balance
the inward self-gravity of the helium core.
Hence, the helium core begins to collapse under its self-gravity. As the helium core collapses, it becomes
hotter, since it is converting gravitational energy into heat. A layer of hydrogen around that collapsing
helium core becomes hot enough to itself fuse into
helium. This fusion layer around the
collapsing helium core provides pressure that pushes the outer layers of the
star further outward. If the outer
layers expand, then they must become cooler.
The core of the star and the outer layers of the star are doing two
opposite things at the same time! The
core collapses and becomes hotter, while the rest of the star expands and
becomes cooler! We can only observe the
outer layers of a star; the inner layers of a star are hidden
beneath its outer layers. Hence, we
observe the outer layers of the star become larger and cooler. Cooler temperatures correspond to redder
colors. Therefore, the star becomes
larger and redder. In other words, the
star has become a red giant. As we
discussed, all stars are born main sequence stars, while red giants are
essentially dying stars. More correctly,
the outer layers of the star gradually expand and cool over millions of years,
turning the star from a main sequence star to an orange subgiant star to a red
giant star. Although this gradual
expansion over millions of years seems long as compared with human timescales,
this expansion is relatively short as compared with the billions of years the
star spent as a main sequence star. The
imbalance between gravitational forces and thermal pressures during the
expansion from a main sequence star to an orange subgiant star to a red giant
star may cause pulsations within the star, causing its size to oscillate from
large to small and back again. As a
result, the luminosity of the star oscillates from bright to dim and back
again. These stars are
called Cepheid variable stars, which we will discuss later in the
course. The helium core continues to
collapse, becoming hotter. Eventually,
the helium core becomes so hot that helium nuclei begin fusing into heavier
nuclei, in particular carbon nuclei.
This is called helium burning, although again
the use of the word burning is incorrect nomenclature. The moment when helium begins fusing into
carbon is called the helium flash. The
nuclear fusion of helium into carbon is more properly written
3 ⁴He → energy + ¹²C. This nuclear reaction is
called the triple-alpha process, since three helium nuclei (three alpha
particles) fuse into a carbon nucleus.
Note that the electromagnetic repulsion between electrical charges is
directly proportional to the product of the charges. Hence, the temperature necessary to overpower
the electromagnetic repulsion between two helium nuclei (two alpha particles)
each having two positive protons is hotter than the temperature necessary to
overpower the electromagnetic repulsion between two hydrogen nuclei (two
protons), each having one positive proton.
In the case of helium-helium fusion, the electromagnetic repulsion is
proportional to two times two, which is four.
In the case of hydrogen-hydrogen fusion, the electromagnetic repulsion
is proportional to one times one, which is one.
Four is significantly greater than one, meaning more electromagnetic
repulsion and hence a hotter temperature is required for helium-helium fusion
(the basis of the triple-alpha process) as compared with hydrogen-hydrogen
fusion (the basis of the proton-proton cycle).
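As a rough check of this arithmetic, the following Python sketch computes the product of the charges for each pair of nuclei. The function name charge_product is hypothetical, and the sketch only compares the strength of the repulsion; it does not compute actual fusion temperatures.

```python
# Number of positive protons in each nucleus discussed above.
PROTONS = {"hydrogen": 1, "helium": 2}


def charge_product(first: str, second: str) -> int:
    """Return the product of the charges, to which the repulsion is proportional."""
    return PROTONS[first] * PROTONS[second]


hydrogen_hydrogen = charge_product("hydrogen", "hydrogen")  # one times one
helium_helium = charge_product("helium", "helium")          # two times two

if __name__ == "__main__":
    print("hydrogen-hydrogen repulsion is proportional to", hydrogen_hydrogen)
    print("helium-helium repulsion is proportional to", helium_helium)
    print("ratio:", helium_helium // hydrogen_hydrogen)
```

The ratio of four to one is the reason helium-helium fusion requires a hotter temperature than hydrogen-hydrogen fusion.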
This helium flash causes a small expansion of the core and hence a
slight decrease in the core temperature.
This in turn causes the outer layers of the star to contract and
warm. This imbalance between gravitational
forces and thermal pressures may cause pulsations within the star, causing its
size to oscillate from large to small and back again. As a result, the luminosity of the star
oscillates from bright to dim and back again.
These stars are called RR Lyrae
variable stars, which we will discuss later in the course. Eventually, the entire star attains a new
gravitational equilibrium as a helium-burning star, although note that there is
a layer of hydrogen fusing into helium around the core where helium fuses into
carbon. The helium-burning lifetime of
the star is much shorter than its hydrogen-burning (main sequence) lifetime,
since helium fusing into carbon occurs at much hotter temperatures than
hydrogen fusing into helium. The star
spends millions of years as a helium-burning star. Although this seems long as compared with
human timescales, these millions of years as a helium-burning star are
relatively short as compared with the billions of years the star spent as a
hydrogen-burning (main sequence) star.
Eventually, the star exhausts the helium in its core, and even this
triple-alpha process ends. Again, there
is no outward pressure to balance the inward self-gravity of the carbon
core. Hence, the carbon core again
collapses, becoming hotter. A layer of
helium around that collapsing carbon core becomes hot enough to fuse into carbon,
and a layer of hydrogen around that helium-burning layer becomes hot enough to
fuse into helium. These two fusion
layers around the collapsing carbon core provide pressure that again pushes the
outer layers of the star further outward, causing the outer layers to become
cooler and hence redder. The star has
become a red giant a second time! Since
the star is low mass, its self-gravity is too weak to
compress the carbon core sufficiently to ignite the nuclear fusion of carbon
nuclei into even heavier nuclei. In
other words, a carbon flash does not occur.
Hence, the outer layers of the star continue to expand until they become
divorced from the very small, very hot carbon core. The outer layers have become a slowly
expanding shell of gas. This is called a planetary nebula, which is a truly incorrect
term since a planetary nebula has nothing to do with planets! The planetary nebula exposes the very small,
very hot carbon core. This naked core is
very small and very hot since it has collapsed twice. We might suspect that this naked core is
intrinsically bright, since it is so hot.
However, this naked core is very small; it is roughly the size of the
Earth! According to the Stefan-Boltzmann
law, such a small size results in a low luminosity, even though the temperature
is hot. Therefore, this naked core is
small, hot, and intrinsically dim. The
naked core has become a white dwarf. As
we discussed, all stars are born main sequence stars, while red giants and
white dwarfs result from stellar death. Hence, a low-mass main sequence
(hydrogen-burning) star dies by first becoming a red giant, entering a
helium-burning phase, becoming a red giant a second time, and finally dying as
a slowly expanding planetary nebula surrounding a white dwarf. White dwarfs have incredible densities, since
they have roughly the mass of our Sun squeezed into roughly the size of the
Earth. The radius of the Earth, and
therefore the radius of a white dwarf, is roughly 0.01R☉
(one-hundredth of a solar radius or one-hundredth the radius of our Sun). Therefore, the volume of the Earth, and
therefore the volume of a white dwarf, is roughly one-millionth of the volume
of our Sun. With roughly the mass of the
Sun squeezed into roughly one-millionth the volume of the Sun, white dwarfs
therefore have densities roughly one million times normal densities! White dwarfs also have sufficiently hot
surface temperatures to radiate a fair amount of ultraviolet light. The gases of the surrounding planetary nebula
absorb some of these ultraviolet photons radiated by the hot white dwarf,
bringing the electrons within these gases to higher energy quantum states. The electrons then transition back down to
lower energy quantum states, emitting visible light photons. As a result, a planetary nebula surrounding a
white dwarf often displays a variety of beautiful colors. The planetary nebula continues to expand,
becoming cooler and cooler and more and more diffuse (less and less dense). Eventually, the gases of the planetary nebula
return to the interstellar medium. The
interstellar medium (which astrophysicists always abbreviate ISM) is the very
diffuse gas that fills the Milky Way Galaxy.
In fact, a nebula is actually a part of the interstellar medium where
densities are greater than the average densities of most of the gas of the
interstellar medium. As we discussed,
stars are born from within a diffuse nebula; therefore, stars are born from the
interstellar medium. Low mass stars live
their lives fusing hydrogen into helium, begin dying by fusing helium into
carbon, and finally die by returning the gas of their outer layers back to the
interstellar medium. These gases may
someday form a new diffuse nebula from which new stars will be born. Hence, stellar evolution is actually a cycle,
since stellar death ultimately leads to stellar birth again. Beautiful examples of planetary nebulae
include the Ring Nebula in the constellation Lyra (the harp), the Little Ghost
Nebula in the constellation Ophiuchus (the serpent bearer), and the Helix Nebula
in the constellation Aquarius (the water bearer). The white dwarf at the center of a planetary
nebula spends billions of years becoming cooler and cooler and hence dimmer and
dimmer. In countless billions of years,
a white dwarf becomes so cool and so dim that it is renamed a black dwarf.
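The white dwarf estimates made earlier in this discussion (a density roughly one million times normal, and a low luminosity despite a hot temperature) can be checked with a short Python sketch that works entirely in solar units. The function names are hypothetical. The temperature ratio of two used in the example is purely an illustrative assumption, since these notes do not quote a white dwarf temperature. The luminosity scaling is the Stefan-Boltzmann law, in which the luminosity is proportional to the radius squared times the temperature to the fourth power.

```python
import math


def volume_ratio(radius_ratio: float) -> float:
    """Volume in solar units; the volume of a sphere scales as the radius cubed."""
    return radius_ratio ** 3


def density_ratio(mass_ratio: float, radius_ratio: float) -> float:
    """Average density in solar units: mass divided by volume."""
    return mass_ratio / volume_ratio(radius_ratio)


def luminosity_ratio(radius_ratio: float, temperature_ratio: float) -> float:
    """Stefan-Boltzmann scaling in solar units: radius squared times temperature to the fourth."""
    return radius_ratio ** 2 * temperature_ratio ** 4


if __name__ == "__main__":
    wd_radius = 0.01  # roughly the radius of the Earth, in solar radii (from the notes)
    wd_mass = 1.0     # roughly one solar mass (from the notes)
    wd_temp = 2.0     # ASSUMPTION for illustration only: twice the Sun's surface temperature

    print("volume / solar volume:", volume_ratio(wd_radius))
    print("density / solar density:", density_ratio(wd_mass, wd_radius))
    print("luminosity / solar luminosity:", luminosity_ratio(wd_radius, wd_temp))
    assert math.isclose(density_ratio(wd_mass, wd_radius), 1.0e6, rel_tol=1e-9)
```

Even with the assumed temperature twice that of the Sun, the very small radius keeps the luminosity to a small fraction of one solar luminosity, consistent with the description of a white dwarf as small, hot, and intrinsically dim.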
We subdivide low mass main
sequence stars into two subcategories: ordinary low mass stars and very low
mass stars. Ordinary low mass main
sequence stars have masses from 7M☉, 8M☉, or 9M☉ (seven, eight, or nine solar masses) down to roughly 0.5M☉
(one-half of one solar mass). Very low
mass main sequence stars have masses from roughly 0.5M☉
(one-half of one solar mass) all the way down to the lower limit of all main
sequence stars of roughly 0.08M☉ (0.08 solar masses).
Note that our Sun is an ordinary low mass star, since the mass of our
Sun is 1M☉ (one
solar mass), and one is between one-half and seven, eight, or nine! In terms of spectral types, ordinary low mass
main sequence stars have spectral types of A, F, or G, while very low mass main
sequence stars have spectral types of K or M.
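These mass categories can be sketched as a simple Python function. The function name and the exact boundary handling are hypothetical, since the boundaries themselves are only rough. In particular, the sketch assumes eight solar masses for the high mass threshold, which these notes quote only as seven, eight, or nine solar masses.

```python
def classify_main_sequence_mass(mass_solar: float, high_mass_threshold: float = 8.0) -> str:
    """Classify a main sequence star by its mass in solar masses.

    high_mass_threshold is an ASSUMED value; the notes give it as 7, 8, or 9.
    All boundaries are rough, so stars near a boundary are ambiguous.
    """
    if mass_solar < 0.08:
        return "below the lower limit of all main sequence stars"
    if mass_solar < 0.5:
        return "very low mass"
    if mass_solar < high_mass_threshold:
        return "ordinary low mass"
    return "high mass"


if __name__ == "__main__":
    for mass in (0.2, 1.0, 5.0, 20.0):
        print(f"{mass} solar masses: {classify_main_sequence_mass(mass)}")
```

With these assumptions, our Sun at one solar mass is classified as an ordinary low mass star, in agreement with the discussion above.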
Recall that our Sun is a G2V star, again
placing our Sun into the ordinary low mass subcategory. The stellar death we have discussed thus far
strictly applies to ordinary low mass main sequence stars, like our Sun. Very low mass main sequence stars die
somewhat differently. A very low mass
star spends countless billions of years fusing hydrogen into helium in its
core. After exhausting the hydrogen in
its core, the nuclear fusion reactions end.
Thus, there is no outward pressure to balance the inward self-gravity of
the helium core. Hence, the helium core
begins to collapse under its self-gravity.
As the helium core collapses, it becomes hotter, since it is converting
gravitational energy into heat. A layer
of hydrogen around that collapsing helium core becomes hot enough to fuse into
helium. This fusion layer around the
collapsing helium core provides pressure that pushes the outer layers of the
star further outward. If the outer
layers expand, then they must become cooler.
Again, the core of the star and the outer layers of the star are doing
two opposite things at the same time.
The core collapses and becomes hotter, while the rest of the star
expands and becomes cooler. Again, we
can only observe the outer layers of a star; the inner layers of a star are hidden beneath its outer layers. Hence, we observe the outer layers of the
star become larger and cooler. Cooler
temperatures correspond to redder colors.
Therefore, the star becomes larger and redder. In other words, the outer layers of the star
gradually expand and cool over millions of years, turning the star from a main
sequence star to a subgiant star to a giant star. The death of very low mass stars seems identical
with the death of ordinary low mass stars, but now the differences begin. Very low mass stars have such weak
self-gravity that they cannot compress their cores to reach the threshold
temperatures at which helium fuses into carbon.
In other words, the helium flash never occurs, and the star only becomes
a red giant once instead of twice. The
outer layers of the star continue to expand, eventually becoming a planetary
nebula surrounding a helium white dwarf instead of a carbon white dwarf. As we discussed, every
K-type or M-type main sequence star that has ever been born has not died
yet. Hence, there are no helium white
dwarfs in the entire universe as of yet.
We must wait at least an additional roughly eighty-six billion years
before K-type stars begin to die. There
are more K-type stars than A-type stars, F-type stars, or G-type stars, since
the main sequence is a population abundance sequence. Hence, when K-type stars begin to die, helium
white dwarfs will become the majority of the white dwarfs in the universe,
turning the carbon white dwarfs into a minority of the white dwarfs in the
universe. Countless billions of years
after that, M-type main sequence stars will begin to die, and there are even
more M-type main sequence stars than K-type main sequence stars. Hence, when M-type stars begin to die, helium
white dwarfs will become the overwhelming majority of all white dwarfs in the
universe, while carbon white dwarfs will become a tiny minority of
all white dwarfs in the universe.
Ordinary low mass stars do
not have sufficient self-gravity to compress their carbon cores to sufficient
temperatures for the carbon flash to occur.
Hence, ordinary low mass stars die as a non-burning carbon white dwarf
surrounded by a slowly expanding planetary nebula. Very low mass stars have such weak
self-gravity that not even the helium flash occurs. Hence, very low mass stars die as a
non-burning helium white dwarf surrounded by a slowly expanding planetary
nebula. In either case, low mass stars
die as a slowly expanding planetary nebula surrounding a non-burning white
dwarf. If there are no nuclear reactions
occurring in a white dwarf, what is providing the outward pressure to balance
the inward self-gravity to keep a white dwarf in gravitational equilibrium? As we discussed, white dwarfs have densities
roughly one million times normal densities.
At such incredible densities, electrons are squeezed
toward each other. However, electrons
obey the Pauli Exclusion Principle, named for the Austrian physicist Wolfgang
Pauli who first formulated this fundamental statement of Quantum
Mechanics. According to the Pauli
Exclusion Principle, certain quantum-mechanical particles are
forbidden from occupying the same quantum state at the same time. Thus, any attempt to compress such particles
into the same quantum state will result in a pressure against this
compression. This pressure is called degeneracy pressure. Electrons are one type of quantum-mechanical
particle that obeys the Pauli Exclusion Principle. In other words, electrons are
forbidden from occupying the same quantum state at the same time. It is because of this exclusion that
electrons within atoms must occupy higher energy states when lower energy
states happen to be already filled with electrons. It is the electrons in the
higher energy quantum states of one atom that chemically react and chemically
bond with the electrons in the higher energy quantum states of other atoms. Therefore, all of
chemistry, including all of the biochemistry essential for all life, would not
occur if electrons did not obey the Pauli Exclusion Principle. Also, since electrons
obey the Pauli Exclusion Principle, it is electron degeneracy pressure that
provides the outward pressure to balance the inward self-gravity of a white
dwarf. Electron degeneracy pressure also
provides the outward pressure to balance the inward self-gravity of brown
dwarfs. Many students argue that this
electron degeneracy pressure must come from the electromagnetic repulsion of
the electrons. As we discussed earlier
in the course, like charges repel, and unlike charges attract. Since electrons are negatively charged, they
must repel each other electromagnetically, and students argue that this is the
source of the electron degeneracy pressure.
Although this argument seems reasonable, it is nevertheless wrong. Electron degeneracy pressure has nothing to
do with electromagnetic repulsion. Of
course, the electromagnetic repulsion of the electrons provides some extra
pressure in addition to the electron degeneracy pressure. However, electron degeneracy pressure has
nothing to do with the charge of electrons.
The source of electron degeneracy pressure is the spin of the
electrons. The spin of any
quantum-mechanical particle is its intrinsic angular momentum. As a crude picture, we can imagine that the
electron is spinning around an axis. According to Quantum Mechanics, it is this spinning of the electron
around an axis that is the source of the electron degeneracy pressure. We will discuss another type of degeneracy
pressure shortly that will beautifully emphasize how degeneracy pressure has
nothing to do with electromagnetic repulsion.
To summarize, white dwarfs (as well as brown dwarfs) remain in gravitational
equilibrium not due to nuclear reactions but due to electron degeneracy
pressure, which arises because the Pauli Exclusion Principle prevents electrons
(and certain other quantum-mechanical particles) from occupying the same
quantum state at the same time.
Degeneracy pressure has nothing to do with electromagnetic repulsion;
degeneracy pressure arises from the intrinsic angular momentum (the spin) of
certain quantum-mechanical particles.
It is instructive to discuss
how our particular Solar System will die.
Our Sun is an ordinary low mass star with a main sequence
(hydrogen-burning) lifetime of roughly ten billion years. Our Sun has spent roughly five billion years
fusing hydrogen into helium in its core, and our Sun will spend an additional roughly
five billion years fusing hydrogen into helium in its core. After exhausting the hydrogen in its core,
our Sun will begin to die. Gradually
over millions of years (which is brief as compared with its ten-billion-year
main sequence lifetime), our Sun’s helium core will collapse and become hotter
while its outer layers expand and become cooler, turning our Sun from a yellow
main sequence star to an orange subgiant star to a red giant star. The helium flash will then occur, and our Sun
will become a helium-burning star. Our
Sun’s helium-burning lifetime will last millions of years, which is again brief
as compared with its ten-billion-year main sequence lifetime. After exhausting the helium in its core, our
Sun’s carbon core will collapse and become hotter, while its outer layers
expand and become cooler. Our Sun will
become a red giant a second time. When
the outer layers of our Sun expand to become a red giant the second time, its
outer layers will consume the inner planets (Mercury, Venus, Earth, and
Mars). However, the outer layers of a
red giant are cool, only one or two thousand kelvins in temperature. Although this temperature is hot by human
standards, it is not hot enough to melt most rocks, and it is certainly not hot
enough to melt most metals. Hence, the
inner planets will not immediately be destroyed when
our Sun’s second red giant phase consumes them.
In fact, the inner planets will at first continue
to orbit the red giant Sun while being inside the red giant Sun! This will not continue for long, however, since
the outer layers of the red giant Sun will cause drag as the inner planets
orbit within these outer layers of gas.
This drag will cause the inner planets to spiral inward toward the red
giant Sun’s core, which is certainly hot enough to melt metal and rock. This is how the inner planets will be destroyed.
The outer layers of the red giant Sun will continue to expand and
cool. By the time these gases reach the
outer planets, they will be so tenuous (low density) that they will have a
negligible effect on the outer planets.
These outer gas layers will pass the outer planets, continuing to become
cooler and cooler and more and more diffuse (less and
less dense). These outer gas layers will
eventually become a planetary nebula, returning these gases to the surrounding interstellar
medium. Now the only gravitational
attraction the outer planets will feel is from the carbon white dwarf, the
naked core of the former red giant Sun.
However, the Sun has lost most of its mass, since it ejected its outer
gas layers, which became an expanding planetary nebula. The carbon white dwarf was once the Sun’s
core, which is only a small fraction of the Sun’s original mass. With significantly less mass, the carbon
white dwarf will not have sufficient gravitational attraction to hold the outer
planets in orbit. Hence, the outer
planets will leave their orbits, becoming rogue planets (or orphan
planets). A rogue (or orphan) planet is
a planet that is not orbiting any particular star but instead moves along its
own trajectory through our Milky Way Galaxy.
Finally, all that will remain of our Solar System will be a carbon white
dwarf, which was once our Sun’s core.
All of these processes will begin in roughly five billion years, and
they will take many millions of years to occur.
If we could return to our Solar System roughly six billion years from
now, all of these processes would be complete, and a carbon white dwarf would
be all that remains of our Solar System.
Billions of years from now, intelligent life may evolve on another planet
orbiting another star. These intelligent
lifeforms may even build telescopes and discover the rogue (or orphan) planet
Jupiter moving through the Milky Way Galaxy.
However, these intelligent lifeforms will have no direct evidence that
Jupiter once orbited our Sun, since our Sun will have long since died. Hence, these intelligent lifeforms will probably mistakenly believe that Jupiter is a brown
dwarf star. Perhaps some of the brown
dwarf stars we observe today were once gas giant planets that were once
orbiting an ancient star that has long since died. In other words, perhaps some of the brown
dwarf stars we observe today are not brown dwarf stars at all but are actually
rogue (or orphan) planets.
High mass main sequence stars
have masses greater than 7M☉, 8M☉, or 9M☉ (seven, eight, or nine solar masses). In terms of spectral types, high mass main
sequence stars are either O-type stars or B-type stars. High mass death is somewhat similar to low
mass death but more violent. Since high
mass stars are rare, the vast majority of main sequence stars die gently, while
only a small fraction of main sequence stars die violently. Nevertheless, we must discuss high mass
death, since we owe our very existence to high mass death, as we will discuss
shortly.
A high mass main sequence
star spends a short amount of time fusing hydrogen into helium in its core,
only several million years. After
exhausting the hydrogen in its core, the helium center collapses and becomes hotter,
while a new layer of hydrogen fusion causes the outer layers of the star to
expand further outward and become cooler.
The core is compressed until the triple-alpha
process 3 ⁴He → energy + ¹²C
begins, and the star becomes a helium-burning
star, having a core where helium fuses into carbon
surrounded by a layer where hydrogen fuses into helium. The helium-burning lifetime of the star is
hundreds of thousands of years, shorter than the star’s hydrogen-burning (main
sequence) lifetime, since helium fusion occurs at hotter temperatures than
hydrogen fusion. Eventually, the central
helium is exhausted, the carbon center collapses and becomes hotter, while two
surrounding fusion layers cause the outer layers of the star to expand further
outward and become cooler. Thus far,
high mass death seems nearly identical with low mass death, but now the
differences begin. High mass stars have
such strong self-gravity that their cores are compressed
until they attain the threshold temperature where carbon nuclei fuse into even
heavier nuclei, in particular oxygen nuclei.
More strictly, carbon nuclei fuse with helium nuclei (alpha particles)
to yield oxygen nuclei. This nuclear
reaction is more properly written
¹²C + ⁴He → energy + ¹⁶O. Note that the electromagnetic repulsion between
electrical charges is directly proportional to the product of the charges. Hence, the temperature necessary to overpower
the electromagnetic repulsion between a carbon nucleus with six positive
protons and a helium nucleus (an alpha particle) with two positive protons is
not as hot as the temperature necessary to overpower the electromagnetic
repulsion between two carbon nuclei, each having six positive protons. In the case of carbon-helium fusion, the
electromagnetic repulsion is proportional to six times two, which is
twelve. In the case of carbon-carbon
fusion, the electromagnetic repulsion is proportional to six times six, which
is thirty-six. Twelve is significantly less
than thirty-six, meaning less electromagnetic repulsion and hence a less hot
temperature is required for carbon-helium fusion as compared with carbon-carbon
fusion. Although there is an even weaker
electromagnetic repulsion between a hydrogen nucleus (a proton) and a carbon
nucleus, the nuclear fusion of a hydrogen nucleus with any other nucleus is
slow, since it involves the weak nuclear force.
The star is now a carbon-burning star, having a core where carbon fuses
into oxygen surrounded by two less hot fusion layers. The carbon-burning lifetime of the star is
tens of thousands of years, even shorter than its helium-burning lifetime,
since carbon burning occurs at even hotter temperatures than helium burning,
since carbon-helium fusion temperatures are proportional to twelve (six times
two), a larger number as compared with helium-helium fusion temperatures which
are proportional to four (two times two).
Eventually, the central carbon is exhausted, the oxygen center collapses
and becomes hotter, while three surrounding fusion layers cause the outer
layers of the star to expand further outward and become cooler. These high mass stars have such strong
self-gravity that their cores are compressed until
they attain the threshold temperature where oxygen nuclei fuse into even
heavier nuclei, in particular neon nuclei.
More strictly, oxygen nuclei fuse with helium nuclei (alpha particles)
to yield neon nuclei. This nuclear
reaction is more properly written
¹⁶O + ⁴He → energy + ²⁰Ne. Again, the electromagnetic repulsion between
electrical charges is directly proportional to the product of the charges. Hence, the temperature necessary to overpower
the electromagnetic repulsion between an oxygen nucleus with eight positive
protons and a helium nucleus (an alpha particle) with two positive protons is
not as hot as the temperature necessary to overpower the electromagnetic
repulsion between two oxygen nuclei, each having eight positive protons. In the case of oxygen-helium fusion, the
electromagnetic repulsion is proportional to eight times two, which is
sixteen. In the case of oxygen-oxygen
fusion, the electromagnetic repulsion is proportional to eight times eight,
which is sixty-four. Sixteen is
significantly less than sixty-four, meaning less electromagnetic repulsion and
hence a less hot temperature is required for oxygen-helium fusion as compared
with oxygen-oxygen fusion. Although
there is an even weaker electromagnetic repulsion between a hydrogen nucleus (a
proton) and an oxygen nucleus, the nuclear fusion of a hydrogen nucleus with any
other nucleus is again slow, since it involves the weak nuclear force. The star is now an oxygen-burning star,
having a core where oxygen fuses into neon surrounded by three less hot fusion
layers. The oxygen-burning lifetime of
the star is several thousand years, even shorter than its carbon-burning
lifetime, since oxygen burning occurs at even hotter temperatures than carbon
burning, since oxygen-helium fusion temperatures are proportional to sixteen
(eight times two), a larger number as compared with carbon-helium fusion
temperatures which are proportional to twelve (six times two). Eventually, the central oxygen is exhausted,
the neon center collapses and becomes hotter, while four surrounding fusion
layers cause the outer layers of the star to expand further outward and become
cooler. These high mass stars have such
strong self-gravity that their cores are compressed
until they attain the threshold temperature where neon nuclei fuse into even
heavier nuclei, in particular magnesium nuclei.
More strictly, neon nuclei fuse with helium nuclei (alpha particles) to
yield magnesium nuclei. This nuclear
reaction is more properly written
${}^{20}_{10}\mathrm{Ne} + {}^{4}_{2}\mathrm{He} \rightarrow \text{energy} + {}^{24}_{12}\mathrm{Mg}$.
The star is now a neon-burning star, having a
core where neon fuses into magnesium surrounded by four less hot fusion layers. The neon-burning lifetime of the star is
several hundred years, even shorter than its oxygen-burning lifetime, since
neon burning occurs at even hotter temperatures than oxygen burning, since
neon-helium fusion temperatures are proportional to twenty (ten times two), a
larger number as compared with oxygen-helium fusion temperatures which are
proportional to sixteen (eight times two).
Eventually, the central neon is exhausted, the magnesium center
collapses and becomes hotter, while five surrounding fusion layers cause the
outer layers of the star to expand further outward and become cooler. These high mass stars have such strong
self-gravity that their cores are compressed until
they attain the threshold temperature where magnesium nuclei fuse into even
heavier nuclei, in particular silicon nuclei.
More strictly, magnesium nuclei fuse with helium nuclei (alpha
particles) to yield silicon nuclei. This
nuclear reaction is more properly written
${}^{24}_{12}\mathrm{Mg} + {}^{4}_{2}\mathrm{He} \rightarrow \text{energy} + {}^{28}_{14}\mathrm{Si}$.
The star is now a magnesium-burning star,
having a core where magnesium fuses into silicon surrounded by five less hot
fusion layers. The magnesium-burning
lifetime of the star is several decades, even shorter than its neon-burning
lifetime, since magnesium burning occurs at even hotter temperatures than neon
burning, since magnesium-helium fusion temperatures are proportional to
twenty-four (twelve times two), a larger number as compared with neon-helium
fusion temperatures which are proportional to twenty (ten times two). Eventually, the central magnesium is
exhausted, the silicon center collapses and becomes hotter, while six
surrounding fusion layers cause the outer layers of the star to expand further
outward and become cooler. These high
mass stars have such strong self-gravity that their cores are
compressed until they attain the threshold temperature where silicon
nuclei fuse into even heavier nuclei, in particular sulfur nuclei. More strictly, silicon nuclei fuse with
helium nuclei (alpha particles) to yield sulfur nuclei. This nuclear reaction is more properly
written
${}^{28}_{14}\mathrm{Si} + {}^{4}_{2}\mathrm{He} \rightarrow \text{energy} + {}^{32}_{16}\mathrm{S}$.
The star is now a silicon-burning star,
having a core where silicon fuses into sulfur surrounded by six less hot fusion
layers. The silicon-burning lifetime of
the star is several years, even shorter than its magnesium-burning lifetime,
since silicon burning occurs at even hotter temperatures than magnesium
burning, since silicon-helium fusion temperatures are proportional to
twenty-eight (fourteen times two), a larger number as compared with
magnesium-helium fusion temperatures which are proportional to twenty-four
(twelve times two). Eventually, the
central silicon is exhausted, the sulfur center collapses and becomes hotter,
while seven surrounding fusion layers cause the outer layers of the star to
expand further outward and become cooler.
These high mass stars have such strong self-gravity that their cores are compressed until they attain the threshold temperature
where sulfur nuclei fuse into even heavier nuclei, in particular argon
nuclei. More strictly, sulfur nuclei
fuse with helium nuclei (alpha particles) to yield argon nuclei. This nuclear reaction is more properly
written
${}^{32}_{16}\mathrm{S} + {}^{4}_{2}\mathrm{He} \rightarrow \text{energy} + {}^{36}_{18}\mathrm{Ar}$.
The star is now a sulfur-burning star, having
a core where sulfur fuses into argon surrounded by seven less hot fusion
layers. The sulfur-burning lifetime of
the star is several months, even shorter than its silicon-burning lifetime,
since sulfur burning occurs at even hotter temperatures than silicon burning,
since sulfur-helium fusion temperatures are proportional to thirty-two (sixteen
times two), a larger number as compared with silicon-helium fusion temperatures
which are proportional to twenty-eight (fourteen times two). Eventually, the central sulfur is exhausted,
the argon center collapses and becomes hotter, while eight surrounding fusion
layers cause the outer layers of the star to expand further outward and become
cooler. These high mass stars have such
strong self-gravity that their cores are compressed
until they attain the threshold temperature where argon nuclei fuse into even
heavier nuclei, in particular calcium nuclei.
More strictly, argon nuclei fuse with helium nuclei (alpha particles) to
yield calcium nuclei. This nuclear
reaction is more properly written
${}^{36}_{18}\mathrm{Ar} + {}^{4}_{2}\mathrm{He} \rightarrow \text{energy} + {}^{40}_{20}\mathrm{Ca}$.
The star is now an argon-burning star, having
a core where argon fuses into calcium surrounded by eight less hot fusion
layers. The argon-burning lifetime of
the star is several days, even shorter than its sulfur-burning lifetime, since
argon burning occurs at even hotter temperatures than sulfur burning, since
argon-helium fusion temperatures are proportional to thirty-six (eighteen times
two), a larger number as compared with sulfur-helium fusion temperatures which
are proportional to thirty-two (sixteen times two). Eventually, the central argon is exhausted,
the calcium center collapses and becomes hotter, while nine surrounding fusion
layers cause the outer layers of the star to expand further outward and become
cooler. These high mass stars have such
strong self-gravity that their cores are compressed
until they attain the threshold temperature where calcium nuclei fuse into even
heavier nuclei, in particular titanium nuclei.
More strictly, calcium nuclei fuse with helium nuclei (alpha particles)
to yield titanium nuclei. This nuclear
reaction is more properly written
${}^{40}_{20}\mathrm{Ca} + {}^{4}_{2}\mathrm{He} \rightarrow \text{energy} + {}^{44}_{22}\mathrm{Ti}$.
The star is now a calcium-burning star,
having a core where calcium fuses into titanium surrounded
by nine less hot fusion layers. The
calcium-burning lifetime of the star is even shorter than its argon-burning
lifetime, since calcium burning occurs at even hotter temperatures than argon
burning, since calcium-helium fusion temperatures are proportional to forty
(twenty times two), a larger number as compared with argon-helium fusion
temperatures which are proportional to thirty-six (eighteen times two). Eventually, the central calcium is exhausted,
the titanium center collapses and becomes hotter, while ten surrounding fusion
layers cause the outer layers of the star to expand further outward and become
cooler. These high mass stars have such
strong self-gravity that their cores are compressed
until they attain the threshold temperature where titanium nuclei fuse into
even heavier nuclei, in particular chromium nuclei. More strictly, titanium nuclei fuse with
helium nuclei (alpha particles) to yield chromium nuclei. This nuclear reaction is more properly
written
${}^{44}_{22}\mathrm{Ti} + {}^{4}_{2}\mathrm{He} \rightarrow \text{energy} + {}^{48}_{24}\mathrm{Cr}$.
The star is now a titanium-burning star,
having a core where titanium fuses into chromium surrounded by ten less hot
fusion layers. The titanium-burning
lifetime of the star is even shorter than its calcium-burning lifetime, since
titanium burning occurs at even hotter temperatures
than calcium burning, since titanium-helium fusion temperatures are
proportional to forty-four (twenty-two times two), a larger number as compared
with calcium-helium fusion temperatures which are proportional to forty (twenty
times two). Eventually, the central
titanium is exhausted, the chromium center collapses and becomes hotter, while
eleven surrounding fusion layers cause the outer layers of the star to expand
further outward and become cooler. These
high mass stars have such strong self-gravity that their cores are compressed until they attain the threshold temperature
where chromium nuclei fuse into even heavier nuclei, in particular iron nuclei
and nickel nuclei. More strictly,
chromium nuclei fuse with helium nuclei (alpha particles) to yield iron nuclei,
and iron nuclei fuse with helium nuclei (alpha particles) to yield nickel
nuclei. These nuclear reactions are more
properly written
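${}^{48}_{24}\mathrm{Cr} + {}^{4}_{2}\mathrm{He} \rightarrow \text{energy} + {}^{52}_{26}\mathrm{Fe}$
and
${}^{52}_{26}\mathrm{Fe} + {}^{4}_{2}\mathrm{He} \rightarrow \text{energy} + {}^{56}_{28}\mathrm{Ni}$.
The star is now a chromium-burning star,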
having a core where chromium fuses into iron and nickel surrounded by eleven
less hot fusion layers. The
chromium-burning lifetime of the star is even shorter than its titanium-burning
lifetime, since chromium burning occurs at even hotter temperatures than
titanium burning, since chromium-helium fusion temperatures are proportional to
forty-eight (twenty-four times two), a larger number as compared with
titanium-helium fusion temperatures which are proportional to forty-four (twenty-two
times two). Eventually, the central
chromium is exhausted, the iron-nickel center collapses and becomes hotter,
while twelve surrounding fusion layers cause the outer layers of the star to
expand further outward and become cooler.
In brief, each successive nuclear reaction occurs at hotter and hotter
temperatures. The first hydrogen-burning
stage occurs at tens of millions of kelvins.
Helium burning, carbon burning, and oxygen burning each occur at
hundreds of millions of kelvins. All the
remaining burning (fusion) stages occur at a few billion kelvins! Also, each
successive lifetime of the star is shorter and shorter, again since each
successive nuclear reaction occurs at hotter and hotter temperatures. The first hydrogen-burning stage (the
main-sequence lifetime) is itself relatively short for these high-mass stars,
lasting only millions of years. Helium
burning, carbon burning, and oxygen burning each last only thousands of years,
neon burning lasts only centuries, and magnesium burning lasts only
decades. Silicon burning lasts only
years, sulfur burning only months, and argon burning only days! Calcium burning and titanium burning last
only hours, and chromium burning lasts only minutes!
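The charge-product arithmetic used throughout the burning stages above can be tabulated in a few lines. The following is a minimal Python sketch, not part of the original notes: the function names are hypothetical, and the mass numbers are assumed to be the standard alpha-chain isotopes (carbon-12, oxygen-16, and so on up to iron-52 and nickel-56), since the notes give only the element names.

```python
# Alpha-ladder stages named in the notes: (element, atomic number Z, mass number A).
# ASSUMPTION: the mass numbers are the standard alpha-chain isotopes.
ALPHA_LADDER = [
    ("carbon", 6, 12), ("oxygen", 8, 16), ("neon", 10, 20),
    ("magnesium", 12, 24), ("silicon", 14, 28), ("sulfur", 16, 32),
    ("argon", 18, 36), ("calcium", 20, 40), ("titanium", 22, 44),
    ("chromium", 24, 48), ("iron", 26, 52), ("nickel", 28, 56),
]

HELIUM_Z, HELIUM_A = 2, 4  # a helium nucleus (alpha particle)


def coulomb_product(z_fuel, z_projectile=HELIUM_Z):
    """Product of the two nuclear charges; the repulsion scales with this number."""
    return z_fuel * z_projectile


def alpha_capture(z, a):
    """Return (Z, A) of the nucleus formed by absorbing one helium nucleus."""
    return z + HELIUM_Z, a + HELIUM_A


def ladder_table():
    """List (fuel, product, charge product) for each consecutive stage."""
    rows = []
    for (fuel, z, a), (product, z_next, a_next) in zip(ALPHA_LADDER, ALPHA_LADDER[1:]):
        # Each stage must conserve charge and nucleon number.
        assert alpha_capture(z, a) == (z_next, a_next)
        rows.append((fuel, product, coulomb_product(z)))
    return rows


if __name__ == "__main__":
    for fuel, product, charge_product in ladder_table():
        print(f"{fuel:>9} + helium -> {product:<9} charge product = {charge_product}")
    print("oxygen + oxygen charge product =", coulomb_product(8, 8))
```

The table reproduces the numbers quoted in the text: twelve for carbon-helium fusion, sixteen for oxygen-helium fusion, and so on up to forty-eight for chromium-helium fusion, compared with sixty-four for oxygen-oxygen fusion.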
Many students now conclude
that successively hotter and hotter nuclear reactions continue to occur,
synthesizing heavier and heavier nuclei all the way to the end of the Periodic
Table of Elements. However, this nuclear
reaction chain actually ends at iron and nickel, which is roughly halfway
through the Periodic Table of Elements.
As we discussed, nuclear fission is the splitting of
more massive (heavier) nuclei into less massive (lighter) nuclei, while
nuclear fusion is the merging or fusing of less massive (lighter) nuclei into
more massive (heavier) nuclei. Both of
these types of nuclear reactions generate energy because atoms of intermediate
mass (roughly halfway through the Periodic Table of Elements) have the most
stable nuclei among all atoms. More massive (heavier) nuclei attain greater stability by
splitting into less massive (lighter) nuclei, hence releasing energy. Less massive (lighter) nuclei attain greater
stability by merging or fusing into more massive (heavier) nuclei, again
releasing energy. Hence, nuclei of
intermediate mass (roughly halfway through the Periodic Table of Elements) cannot be merged or fused into more massive (heavier) nuclei, since
those more massive (heavier) nuclei would spontaneously split back into the
starting nuclei. Similarly,
nuclei of intermediate mass (roughly halfway through the Periodic Table of
Elements) cannot be split into less massive (lighter) nuclei, since those less
massive (lighter) nuclei would spontaneously merge or fuse back into the
starting nuclei. Iron and nickel are
intermediate-mass atoms, roughly halfway through the Periodic Table of
Elements. Hence, the iron nucleus and
the nickel nucleus are among the most stable nuclei of all the nuclei in the
universe. Thus, the nuclear reaction
chain at the center of a high mass star ends at iron and nickel. Caution: the physical strength of iron has
nothing to do with its nuclear stability; the physical strength of iron arises
from interactions among its electrons that reside in atomic states around the nucleus,
not nuclear states within the nucleus.
The core of the high mass star now has several layers, rather like the
layers of an onion. Starting
at the center of the many-layered core, we have non-burning iron and nickel
surrounded by a layer of chromium burning (fusing) into iron and nickel
surrounded by a layer of titanium burning (fusing) into chromium surrounded by
a layer of calcium burning (fusing) into titanium surrounded by a layer of
argon burning (fusing) into calcium surrounded by a layer of sulfur burning
(fusing) into argon surrounded by a layer of silicon burning (fusing) into
sulfur surrounded by a layer of magnesium burning (fusing) into silicon
surrounded by a layer of neon burning (fusing) into magnesium surrounded by a
layer of oxygen burning (fusing) into neon surrounded by a layer of carbon
burning (fusing) into oxygen surrounded by a layer of helium burning (fusing)
into carbon surrounded by a layer of hydrogen burning (fusing) into helium. Surrounding this many-layered core is the
rest of the star, which is not hot enough for any nuclear fusion reactions to
occur. Hence, most of the star is
composed of roughly seventy-five percent (three-quarters) hydrogen and roughly
twenty-five percent (one-quarter) helium.
With each core collapse, these outer layers of the star have expanded
further outward, becoming cooler and therefore redder. Since the outer layers of the star have
expanded many times with each of the many collapses of the core, the star has
become enormous. The star has become a
red supergiant. While low mass stars
begin to die by becoming red giants, high mass stars begin to die by becoming
red supergiants.
Since non-burning iron and nickel constitute the center of the
many-layered core of this supergiant star, the non-burning iron and nickel
center must be supported by electron degeneracy
pressure. Note that the center of the
core has compressed many times, squeezing the electrons closer and closer to
each other. As we discussed, white dwarfs
are supported by electron degeneracy pressure. Therefore, we may regard the center of the
many-layered core of this supergiant star as an iron-nickel white dwarf. This iron-nickel white dwarf core has
collapsed many times, making it small and hot.
In brief, at this stage of the life of a high mass star, it has become a
supergiant star with a many-layered core, and the center of that many-layered
core of the supergiant star is an iron-nickel white dwarf supported by electron
degeneracy pressure.
The iron-nickel white dwarf
that comprises the center of the many-layered core of a red supergiant star is
under such tremendous pressure that exotic nuclear reactions can occur. One such exotic nuclear reaction is called electron capture, where a proton devours an
electron thus transmuting itself into a neutron and emitting a neutrino. This nuclear reaction is more properly
written ${}^{1}_{1}\mathrm{H} + e^{-} \rightarrow {}^{1}_{0}\mathrm{n} + \nu_e$.
Caution: in nuclear physics, the symbol ${}^{1}_{1}\mathrm{H}$ is used for the
hydrogen-1 nucleus, which is simply a proton.
Also note that ${}^{1}_{0}\mathrm{n}$ is the symbol of the neutron in nuclear
physics, as we discussed. Also as we discussed, $e^{-}$
is the symbol of the (ordinary) electron, and $\nu_e$ is the symbol
of the neutrino. Neutrinos are extremely
weakly interacting particles, as we also discussed. Hence, the neutrinos generated by this
nuclear reaction simply fly out of the center of the many-layered core, passing
through all the other layers of the core, flying through the outer layers of
the red supergiant, and propagating into the surrounding outer space at nearly
the speed of light. The iron-nickel
white dwarf center was being supported by electron
degeneracy pressure. If protons are
devouring electrons, then the electron degeneracy pressure that was supporting
the center of the many-layered core vanishes.
The neutrons that were synthesized by this
nuclear reaction go into free fall, since there is no pressure to balance
self-gravity. According to Quantum
Mechanics, neutrons obey the Pauli Exclusion Principle, just as electrons obey
the Pauli Exclusion Principle. In other
words, no two neutrons can occupy the same quantum state at the same time, and
thus attempting to squeeze neutrons together results in a repulsion called
neutron degeneracy pressure. This
beautifully illustrates that degeneracy pressure has nothing to do with
electromagnetic repulsion. Neutrons are
neutral; they do not attract or repel each other electromagnetically. However, neutrons do repel each other through
neutron degeneracy pressure if they are squeezed too
close to each other. The neutrons
therefore stop collapsing when neutron degeneracy pressure halts their
collapse. It is not difficult to
calculate that neutron degeneracy pressure halts the collapse when the
iron-nickel white dwarf has collapsed from the size of the Earth (the white
dwarf size scale) down to a radius of roughly ten kilometers. This is roughly the size of a city! This incredibly small and dense sphere of
neutrons supported by neutron degeneracy pressure is called
a neutron star. The existence of white
dwarfs is already difficult to comprehend, since they have compressed roughly
the mass of our Sun into roughly the size of the Earth, with
a density roughly one million times normal densities. Now imagine compressing roughly the mass of
our Sun into roughly the size of a city!
The resulting density of a neutron star is hundreds of millions of times
more dense than even a white dwarf, making a neutron star hundreds of trillions
of times more dense than normal densities! These densities are fantastic, far beyond
human comprehension. As
a result of these fantastic densities, the gravity near a neutron star
significantly warps the fabric of space and time around it, as we will discuss
shortly in the context of Einstein’s theory of gravity, the General Theory of
Relativity. Although the density of a
neutron star is far beyond human imagination, its density is actually roughly
equal to the density of every nucleus of every atom composing everything in the
universe, including our own bodies.
Therefore, we may regard a neutron star as an enormous atomic
nucleus! The most massive atoms at the
end of the Periodic Table of Elements have atomic masses of nearly three
hundred, but far far far beyond those atoms are neutron stars with atomic
masses of roughly $10^{57}$ (one octillion nonillion, or one septillion decillion)! It is not difficult to calculate that the
free-fall collapse of the iron-nickel white dwarf from roughly the size of the
Earth to a neutron star roughly the size of a city occurs in roughly one
millisecond, one-thousandth of one second!
It is also not difficult to calculate the amount of energy liberated
when roughly one solar mass collapses from roughly the size of the Earth to
roughly the size of a city. The energy
liberated is comparable to the total energy radiated by our Sun over its entire
ten billion year lifetime! The resulting
luminosity of this high mass star is in the billions of solar
luminosities! This is roughly the total
power output of an entire galaxy of stars!
Such fantastic quantities of energy liberated in such an incredibly short
amount of time cause a cataclysmic explosion.
This is how high mass stars die; they obliterate themselves in a
spectacularly violent explosion called a supernova. Strictly, this is a Type II (Roman numeral)
supernova. We will discuss Type I (Roman
numeral) supernovae later in the course.
To summarize, high mass stars live short main sequence lifetimes, swell
to become red supergiants, and explode as Type II
supernovae. The violence of this
explosion throws the outer layers of the star away from the explosion at very
high speeds and heats these gases to millions of kelvins of temperature. This rapidly expanding, hot gas is called a supernova remnant, which astrophysicists always
abbreviate SNR. Beautiful examples of
supernova remnants include the Crab Nebula in the constellation Taurus (the
bull), the Tycho Nebula in the constellation
Cassiopeia (the queen of Aethiopia), and the Kepler
Nebula in the constellation Ophiuchus (the serpent bearer).
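The density figures quoted in this discussion can be checked with a short order-of-magnitude calculation. The following Python sketch is illustrative only and rests on stated assumptions: "normal densities" is taken to mean the density of water, the white dwarf and neutron star are both taken to hold exactly one solar mass, and the constants are standard rounded values. The function names are hypothetical.

```python
import math

# Standard rounded constants (assumed values, not taken from the notes).
M_SUN_KG = 1.989e30          # one solar mass
R_EARTH_M = 6.371e6          # white dwarf size scale (radius of the Earth)
R_NEUTRON_STAR_M = 1.0e4     # ten kilometers, as quoted in the notes
WATER_DENSITY = 1000.0       # kg per cubic meter, standing in for "normal" density
NUCLEON_MASS_KG = 1.67e-27   # approximate mass of a proton or neutron


def mean_density(mass_kg, radius_m):
    """Mean density of a uniform sphere in kilograms per cubic meter."""
    volume = (4.0 / 3.0) * math.pi * radius_m ** 3
    return mass_kg / volume


def density_summary():
    """Compare white dwarf and neutron star densities with water."""
    white_dwarf = mean_density(M_SUN_KG, R_EARTH_M)
    neutron_star = mean_density(M_SUN_KG, R_NEUTRON_STAR_M)
    return {
        "white_dwarf_vs_water": white_dwarf / WATER_DENSITY,
        "neutron_star_vs_white_dwarf": neutron_star / white_dwarf,
        "neutron_star_vs_water": neutron_star / WATER_DENSITY,
        # Number of nucleons in one solar mass: the "atomic mass" of a neutron star.
        "nucleon_count": M_SUN_KG / NUCLEON_MASS_KG,
    }


if __name__ == "__main__":
    for name, value in density_summary().items():
        print(f"{name}: {value:.2e}")
```

Under these assumptions the white dwarf comes out at a little under two million times the density of water, the neutron star at a few hundred million times the density of the white dwarf (hence a few hundred trillion times the density of water), and the nucleon count at roughly $10^{57}$, in line with the figures in the text.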
The Type II supernova of a
high mass star is so violent that all the nuclei across the entire Periodic
Table of Elements are synthesized. The nuclear reactions do not end at iron and
nickel, roughly halfway through the Periodic Table of Elements. The nuclear reactions actually proceed all
the way to the end of the Periodic Table of Elements, synthesizing even the
most massive (heaviest) of all nuclei, such as uranium and plutonium. As we will discuss toward the end of the
course, the universe was essentially pure hydrogen and helium when it was born
in the fires of the Big Bang. If the
universe was born pure hydrogen and helium, where did all the other atoms of
the Periodic Table of Elements come from?
Most stars are born low mass, and these low mass stars fuse hydrogen
into helium. At best, they can fuse
helium into carbon. However, high mass
stars synthesize all the elements up to iron and nickel within their cores, and
then synthesize all the elements through to the end of the Periodic Table of
Elements within their violent Type II supernovae. The rapidly expanding supernova remnant
carries all these nuclei into the surrounding outer space. As the supernova remnant expands, it becomes
cooler and cooler and more and more diffuse (less and less dense). Eventually, the gases of the supernova
remnant return to the interstellar medium, enriching or polluting the
interstellar medium with these new nuclei.
These enriched or polluted gases may someday form a new diffuse nebula
from which new stars will be born, but now this diffuse nebula has been enriched or polluted with new nuclei. We now realize why we owe our very existence
to high mass stellar death. Our bodies are
composed of these atoms, such as the iron in our blood, the sodium and
potassium in our nerves, the calcium in our bones, and the oxygen that composes
the water that makes up most of our bodies.
All the terrestrial planets, including our own planet Earth, are also
composed of these atoms, such as iron and nickel and silicon and oxygen, as we
discussed earlier in the course. Without
high mass stellar death, there would be no terrestrial planets and no
life. If the universe was essentially
pure hydrogen and helium when it was born in the fires of the Big Bang, then
the first generation of stars born in the universe were essentially pure
hydrogen and helium, and therefore they could not have had terrestrial planets
orbiting them. At best, they had jovian, gas-giant planets orbiting
them. The deaths of these first
generation stars were essential to creating future generations of stars that
could have terrestrial planets orbiting them and therefore the potential for
life on some of these terrestrial planets, in particular our planet Earth.
We subdivide high mass main
sequence stars into two subcategories: ordinary high mass stars and very high
mass stars. Ordinary high mass stars
have masses from 7M☉, 8M☉, or 9M☉ (seven,
eight, or nine solar masses) up to 20M☉ to 25M☉ (twenty to twenty-five solar masses). Very high mass stars have masses from 20M☉ to 25M☉ (twenty
to twenty-five solar masses) all the way up to the Eddington limit of roughly 100M☉ (one
hundred solar masses). In terms of
spectral types, we will regard ordinary high mass stars as B-type stars and
very high mass stars as O-type stars.
The stellar death we have discussed strictly applies to ordinary high
mass stars. A very high mass star also
lives a very short main sequence lifetime and also
swells to become a red supergiant with an iron-nickel white dwarf center
surrounded by a many-layered core. The
red supergiant also explodes with a Type II supernova, again initiated by
electron capture in the core that emits neutrinos, which again throws outward a
very hot and rapidly expanding supernova remnant. Very high mass death seems identical to ordinary
high mass death, but the difference is as follows. Very high mass stars have such tremendous
self-gravity that not even neutron degeneracy pressure can halt the core
collapse. If neutron degeneracy pressure
cannot halt the collapse of the core, then nothing can halt the collapse of the
core. The core continues collapsing all
the way down to a mathematical point.
This is the ultimate triumph of gravity.
This mathematical point is called a black hole,
which we will discuss in more detail shortly in the context of Einstein’s
theory of gravity, the General Theory of Relativity. Recall that the main sequence is a population
abundance sequence, with higher mass main sequence stars being less abundant
(more rare) than lower mass main sequence stars, which are more abundant (more
common). Thus, very high mass stars are more rare than ordinary high mass stars. Therefore, most Type II supernovae leave
behind a hot supernova remnant rapidly expanding away from a neutron star. On rare occasions, a Type II supernova will
leave behind a hot supernova remnant rapidly expanding away from a black
hole. To summarize high mass death, the
star spends a short time as a hydrogen-burning (main sequence) star, swells to
become a red supergiant with an iron-nickel white dwarf center surrounded by a
many-layered core, and explodes as a Type II supernova. The Type II supernova is
triggered by electron capture in the core that emits neutrinos,
collapses the core, and throws outward a hot supernova remnant that rapidly
expands away from either a neutron star or a black hole. Notice that high mass death is similar to low
mass death, just more violent. As we
discussed, a low mass star spends a long time as a hydrogen-burning (main sequence)
star, swells to become a red giant, and finally dies as a planetary nebula
slowly expanding away from a white dwarf.
The supernova remnant for high mass death is analogous to the planetary
nebula for low mass death, and the neutron star or the black hole for high mass
death is analogous to the white dwarf for low mass death. We can turn this logic completely around and
claim that low mass death is similar to high mass death, just more gentle. The planetary nebula for low mass death is
analogous to the supernova remnant for high mass death, and the white dwarf for
low mass death is analogous to the neutron star or the black hole for high mass
death.
Supernovae are rare, since
only a tiny fraction of all stars are high mass stars that
die in Type II supernova explosions.
In a typical galaxy like our Milky Way Galaxy that is composed of
roughly one hundred billion stars, there is only one supernova per century on
average. If a supernova occurs in a
typical galaxy roughly once every century (once every one hundred years), then
if astronomers continuously observe one hundred galaxies, we should observe
roughly one supernova every year on average.
If astronomers continuously observe one thousand galaxies, we should
observe roughly ten supernovae every year on average; this would be roughly
once a month. If astronomers
continuously observe ten thousand galaxies, we should observe roughly one
hundred supernovae every year on average; this would be roughly twice a
week. Over the past few decades,
astronomers have used telescopes to continuously observe
tens of thousands of galaxies. Thus, we
observe several hundred supernovae every year; this is roughly once every
day. However, these supernovae are in
distant galaxies, far beyond our own Milky Way Galaxy. These supernovae cannot be
observed with the naked eye; they can only be observed with very
powerful telescopes. The procedure for
discovering a supernova in a distant galaxy is as follows. We use a powerful telescope to photograph a
galaxy night after night after night.
One night, we see a point of light in the galaxy that is as bright as
the entire galaxy. We conclude that one
of the stars in that galaxy has suffered a supernova explosion. The point of light remains bright for a
couple of weeks, and the point of light eventually fades away over the next
several months.
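The rate arithmetic in this paragraph is a simple proportion, sketched below in Python. The only input taken from the text is one supernova per galaxy per century; the function names and the choice of 365.25 days per year are assumptions made for illustration.

```python
# Rate quoted in the notes: one supernova per typical galaxy per century.
SUPERNOVAE_PER_GALAXY_PER_CENTURY = 1.0
DAYS_PER_YEAR = 365.25  # assumed average length of a year


def supernovae_per_year(galaxies_monitored,
                        rate_per_century=SUPERNOVAE_PER_GALAXY_PER_CENTURY):
    """Expected number of supernovae seen per year across all monitored galaxies."""
    return galaxies_monitored * rate_per_century / 100.0


def mean_days_between(galaxies_monitored):
    """Average waiting time in days between successive supernovae."""
    return DAYS_PER_YEAR / supernovae_per_year(galaxies_monitored)


if __name__ == "__main__":
    for galaxies in (100, 1_000, 10_000, 30_000):
        per_year = supernovae_per_year(galaxies)
        gap = mean_days_between(galaxies)
        print(f"{galaxies:>6} galaxies: {per_year:6.1f} per year, "
              f"one every {gap:6.1f} days")
```

With ten thousand galaxies the average gap is about three and a half days, which is the "roughly twice a week" of the text; with a few tens of thousands of galaxies the gap approaches one day.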
Our Sun will never suffer
from a supernova, since our Sun is a low mass star. This is fortunate, since if our Sun were to
suffer from a supernova, the explosion would obliterate our entire Solar
System! There are no
nearby high mass stars that may suffer a supernova that could harm us in any
way, which stands to reason since high mass stars are rare. There are however some high mass stars close
enough to be visible with the naked eye that have already entered the red
supergiant phase, such as Betelgeuse in the constellation Orion (the hunter)
and Antares in the constellation Scorpius (the scorpion). How would Betelgeuse or Antares appear in the
sky if they were to suffer a supernova? A
supernova has a luminosity of billions of solar luminosities. Hence, the star would appear to become
billions of times brighter. This would
be so bright that we could see the star in the daytime! The star would remain this bright for a
couple of weeks. Over the next several
months, the star would remain fairly bright but would
gradually fade in intensity. Within
roughly one year, the star would vanish from our sky, forever changing the
appearance of the constellation Orion (the hunter) or Scorpius (the scorpion),
since a bright star in the constellation has now been forever erased from our
sky! Again, this sequence of events
would all be visible to the naked eye, making nearby supernovae within our own
Milky Way Galaxy exciting to observe.
Over the past millennium (one thousand years), we have observed roughly
one supernova per century within our own Milky Way Galaxy. Warning: astronomers have observed supernovae
roughly once every day over the past few decades, but these are supernovae in
distant galaxies that can only be observed with very
powerful telescopes. Only a handful of
naked-eye supernovae over the past millennium have been
observed, including those of April 1006, July 1054 (which created the Crab Nebula),
August 1181, November 1572 (which created the Tycho Nebula),
and October 1604 (which created the Kepler Nebula).
Note that the last supernova on this list, the most recent naked-eye
supernova from within our Milky Way Galaxy, occurred more than four hundred
years ago. If supernovae occur in a
typical galaxy roughly once per century, then we are long overdue for a
naked-eye supernova from within our Milky Way Galaxy. We could almost guarantee that we will
observe one or perhaps two or perhaps even three
naked-eye supernovae from within our Milky Way Galaxy within our
lifetimes. Frustratingly, the last
naked-eye supernova from within our Milky Way Galaxy occurred before Galileo
Galilei made his historic observations of the sky with his primitive telescope
in the year 1609, as we discussed earlier in the course. Thus, the model we have presented of a Type
II supernova being triggered by the core collapse and
subsequent explosion of a high mass star remained an untested theoretical model
for many years. This all changed in the
historic year 1987. As we discussed,
working at a neutrino detector is the most boring job in the world, since a neutrino
detector only detects a single neutrino per day. However, on Monday, February 23, 1987, at
07:35:35 universal time, neutrino detectors around the world detected
twenty-five neutrinos within a time span of less than thirteen seconds! Physicists all around the world had no idea
how to explain this incredible burst of neutrinos. The source of these neutrinos was revealed a couple of hours later, when astronomers
witnessed the star named CPD-69 402 (also named GSC 09162-00821) violently explode, becoming
extraordinarily more luminous. This star
was not within our own Milky Way Galaxy however; this star was within a nearby
galaxy called the Large Magellanic Cloud, nearly two
hundred thousand light-years from our Solar System. It suddenly became clear what caused the
neutrino burst. Nearly two hundred
thousand years ago, the high mass star CPD-69 402 (GSC 09162-00821) swelled to become a supergiant star until
electron capture was initiated in its iron-nickel
white dwarf center, triggering a supernova explosion. Neutrinos flew out of the core, with the
light from the explosion following right behind the neutrinos. Over the next nearly two hundred thousand
years, the neutrinos propagated spherically outward, with the light from the
explosion also propagating spherically outward.
On the 23rd day of February in the historic year 1987, the neutrinos
from this supernova passed through planet Earth, and neutrino detectors around
the world detected twenty-five of them.
A couple of hours later, the light from the supernova arrived at the
Earth, and this light was not only observed by
astronomers through telescopes but was actually witnessed by all humans (in the
southern hemisphere) with the naked eye.
This is the closest supernova to occur in roughly four hundred
years. The name of this supernova is SN1987A, since the name of a supernova always begins with
the letters SN (for supernova) followed by the year astronomers first observed
the supernova followed by the letter of the English alphabet indicating numerically
which observed supernova it was for that year.
For example, the first supernova astronomers observed in the year 2017
was named SN2017A, the second supernova astronomers
observed that same year was named SN2017B, the third supernova astronomers observed that same year
was named SN2017C, and so on and so forth. There are only twenty-six letters in the
English alphabet. Therefore, the
twenty-seventh supernova astronomers observed in the year 2017 was named SN2017aa, the
twenty-eighth supernova astronomers observed in that same year was named SN2017ab, the twenty-ninth supernova astronomers observed
in that same year was named SN2017ac, and so on and
so forth. Again, astronomers observe
hundreds of supernovae every year from distant galaxies. However, SN1987A
was the closest supernova observed in roughly four hundred years. This supernova was close enough and hence
bright enough to be visible to the naked eye (but only from the southern
hemisphere). This supernova provided
strong evidence that our theoretical model of supernova explosions is
correct. In summary, a Type II supernova is caused by a dying high mass star that swells
to become a supergiant star. The
nuclear reaction electron capture in the core triggers the Type II
supernova. Neutrinos fly out of the
core, the core collapses, and the energy of the collapse is
liberated in a cataclysmic explosion with a brightness in the billions
of solar luminosities. The final result of the Type II supernova is a very hot
supernova remnant rapidly expanding away from either a neutron star or a black
hole. The next time neutrino detectors
around the world detect a burst of neutrinos, every astronomical telescope in
the world will immediately point to supergiant stars such as Betelgeuse or
Antares to witness the actual explosion of the supergiant star. Over the past few decades since SN1987A, astronomers have witnessed the formation of the
supernova remnant that resulted from that supernova. Astrophysicists will continue to study SN1987A for many centuries, just as astrophysicists
still continue to study the Crab Nebula for example, which resulted from
a supernova observed in July 1054, almost one thousand years ago. Over several decades, astronomers have
measured the growing size of several supernova remnants. We can calculate the speed with which the
supernova remnant expands from these observations, and we can then extrapolate
backwards to calculate how long ago the supernova occurred. In the cases where astronomers from previous
centuries actually witnessed the supernova occur, our
extrapolated date of the supernova is always roughly equal to the date that was
recorded by astronomers centuries ago.
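The backward extrapolation just described amounts to dividing the present radius of the supernova remnant by its expansion speed. The following is a minimal sketch in Python, assuming the expansion speed has been constant since the explosion (real remnants slow down over time, so this is only a rough estimate). The function name and the sample numbers are hypothetical and chosen purely for illustration; they are not measurements of any particular remnant.

```python
SECONDS_PER_YEAR = 3.156e7      # approximate number of seconds in one year
KM_PER_LIGHT_YEAR = 9.461e12    # approximate number of kilometers in one light-year


def remnant_age_years(radius_ly, speed_km_s):
    """Estimate the time since the explosion, assuming a constant expansion speed."""
    radius_km = radius_ly * KM_PER_LIGHT_YEAR
    age_seconds = radius_km / speed_km_s
    return age_seconds / SECONDS_PER_YEAR


# Hypothetical remnant: radius of 5 light-years, expanding at 1500 km/s.
print(round(remnant_age_years(5.0, 1500.0)))
```

With these made-up values, the estimate comes out close to one thousand years. Doubling the assumed expansion speed would halve the estimated age.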
When an ordinary high mass
star suffers a supernova explosion, the iron-nickel white dwarf core collapses
to a neutron star, as we discussed. By
the Law of Conservation of Angular Momentum, the collapsing core must spin
faster. Since the iron-nickel white
dwarf roughly the size of the Earth collapses to a neutron star roughly the
size of a city, the rotational speed of the neutron star after the collapse is
hundreds of thousands of times faster than the rotational speed of the
iron-nickel white dwarf from which it collapsed! If the iron-nickel white dwarf rotated once a
day for example, the neutron star that formed from it must rotate once in less
than one second! Furthermore, the
magnetic field lines of the star are pulled with the
collapsing core. Hence, the magnetic
field lines tighten, increasing the strength of the magnetic field by a tremendous
amount. The magnetic field at the
surface of a neutron star can be trillions of times stronger than the Earth’s
magnetic field! It would be improbable
for the magnetic poles of the neutron star to precisely
coincide with its rotational poles, just as the Earth’s magnetic poles
do not coincide with its own rotational poles, as we discussed earlier in the
course. Hence, as the neutron star
rotates, its magnetic axis precesses around its
rotational axis. The incredibly strong
magnetic field that precesses at the incredibly fast
rotational speed causes electromagnetic waves to be radiated away from the
neutron star, and these radiated electromagnetic waves also
rotate with the neutron star. If the precessing magnetic axis of the neutron star happens to
direct these emissions in our general direction, we will observe regular pulses
of electromagnetic waves as the neutron star rotates, rather like the rotating
light from a lighthouse. These neutron
stars are called pulsars. Among the first pulsars ever discovered were the
Crab Pulsar at the center of the Crab Supernova Remnant in the constellation
Taurus (the bull) and the Vela Pulsar at the center of the Vela Supernova
Remnant in the constellation Vela (the sails).
The discovery of these and other pulsars at the center of supernova
remnants provides further strong evidence that our theories of supernova
explosions are correct. Presumably, all
neutron stars are born as pulsars, but the continuous emission of electromagnetic
waves carries energy and angular momentum away from the pulsar, thus slowing
the rotation of the neutron star.
Eventually, the neutron star would no longer emit pulses. Astronomers have measured the gradual rotational
slowing of several pulsars to be a few microseconds per year, and several
non-pulsar neutron stars have been discovered. Note however that some of these non-pulsar
neutron stars may in fact be pulsars. If
a neutron star happens to rotate in such a way that its precessing
magnetic axis never directs its pulses
in our general direction, then we would not observe its
pulses. Hence, we would incorrectly
conclude that the pulsar neutron star is instead a non-pulsar neutron star, and
it would be virtually impossible for us to discover that this particular
neutron star is in fact a pulsar.
Neutron stars can also have their rotational speed increased. As we will discuss shortly, gases may fall
toward a neutron star, and these gases may add angular momentum to the neutron
star, speeding up its rotation. These are called millisecond pulsars, since they rotate once in
only a couple of milliseconds! These
millisecond pulsars are also called recycled pulsars,
since they were at first rotationally slowing from a pulsar neutron star to a
non-pulsar neutron star, but the additional angular momentum gave the neutron
star a second life as a pulsar.
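The spin-up estimate given earlier in this paragraph can be checked with a short calculation. The sketch below assumes the collapsing core keeps all of its mass and angular momentum and behaves like a uniform sphere, so that its moment of inertia is proportional to the square of its radius and its rotation period therefore shrinks as the radius squared. The radii (6400 kilometers for an Earth-sized core, 10 kilometers for a city-sized neutron star) are round illustrative values, not figures quoted in these notes.

```python
def period_after_collapse(initial_period_s, initial_radius_km, final_radius_km):
    """Rotation period after collapse, assuming mass and angular momentum are conserved.

    For a uniform sphere the moment of inertia is proportional to M * R**2,
    so with fixed mass the period scales as the square of the radius.
    """
    return initial_period_s * (final_radius_km / initial_radius_km) ** 2


EARTH_RADIUS_KM = 6400.0        # approximate radius of an Earth-sized core
NEUTRON_STAR_RADIUS_KM = 10.0   # assumed city-sized radius, for illustration only
ONE_DAY_S = 86400.0             # one rotation per day before the collapse

new_period = period_after_collapse(ONE_DAY_S, EARTH_RADIUS_KM, NEUTRON_STAR_RADIUS_KM)
speedup = ONE_DAY_S / new_period
print(f"speed-up factor: {speedup:.0f}, new period: {new_period:.2f} s")
```

Under these assumptions the speed-up factor is roughly four hundred thousand and the new rotation period is roughly one fifth of a second, consistent with the statements above.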
As we discussed, the cutoff
between low mass main sequence stars and high mass main sequence stars is 7M☉, 8M☉, or 9M☉ (seven,
eight, or nine solar masses). Many students
demand to know the exact cutoff: is it 7M☉ (seven solar masses), 8M☉ (eight
solar masses), or 9M☉ (nine
solar masses)? Unfortunately, we cannot
specify this cutoff more precisely, since there is uncertainty in our
theoretical calculations. The Type II
supernova of a high mass star is triggered by the
failure of electron degeneracy pressure to support the white dwarf core of the
red supergiant. Therefore, we might be
able to specify an exact cutoff between a low mass star and a high mass star by
calculating the maximum mass electron degeneracy pressure is able to
support. Caution: this would be the
cutoff mass for only the core of the star, not the cutoff mass for the entire
star. The maximum mass that electron
degeneracy pressure is able to support is called the
Chandrasekhar limit, named for the Indian astrophysicist Subrahmanyan
Chandrasekhar who first performed this calculation. The Chandra observatory, NASA’s great X-ray
space telescope as we discussed earlier in the course, is
also named for this astrophysicist.
The Chandrasekhar limit is equal to 1.4M☉ (1.4
solar masses or 1.4 times the mass of our Sun).
This is the maximum mass that electron degeneracy pressure is able to
support. Therefore, this is the
core-mass cutoff between a low mass star and a high mass star. Again, this is the cutoff mass for the core
only; the cutoff mass for the entire star is 7M☉, 8M☉, or 9M☉ (seven,
eight, or nine solar masses). In other
words, a star with a total mass of 7M☉, 8M☉, or 9M☉ (seven, eight, or nine solar masses) has a core mass
equal to 1.4M☉ (1.4
solar masses). If the mass of the entire
star is less than 7M☉, 8M☉, or 9M☉ (seven,
eight, or nine solar masses), then the mass of its core is less than 1.4M☉ (1.4
solar masses). In this case, electron degeneracy
pressure will be able to support the core.
Therefore, the star must be a low mass star, and it will die gently as a
slowly expanding planetary nebula surrounding a white dwarf. If the mass of the entire star is greater
than 7M☉, 8M☉, or 9M☉ (seven,
eight, or nine solar masses), then the mass of its core is greater than 1.4M☉ (1.4
solar masses). In this case, electron
degeneracy pressure will not be able to support the core. Therefore, the star must be a high mass star,
and it will die violently as a Type II supernova resulting in a very hot
supernova remnant rapidly expanding away from a neutron star that is supported by neutron degeneracy pressure. Since the Chandrasekhar limit is the maximum
mass that electron degeneracy pressure is able to support, it is not only the
core-mass cutoff between low mass stars and high mass stars. The Chandrasekhar limit is also the maximum
permitted mass of a white dwarf. This has been verified by observations. No white dwarf has ever
been discovered with a mass greater than the Chandrasekhar limit of 1.4M☉ (1.4
solar masses). The brightest star in the
constellation Canis Major (the big dog) is Sirius the
dog star, as we discussed earlier in the course. Sirius is actually a binary star, and the two
stars are named Sirius A and Sirius B.
Whereas Sirius A is a main sequence star, Sirius B is a white dwarf, the
closest white dwarf to our Solar System and one of the first white dwarfs ever
discovered. The mass of the Sirius B
white dwarf is roughly 1.0M☉ (1.0 solar masses), less than the 1.4M☉ (1.4
solar mass) Chandrasekhar limit.
As we discussed, the cutoff
between ordinary high mass main sequence stars and very high mass main sequence
stars is 20M☉ to 25M☉ (twenty
to twenty-five solar masses). Many students
demand to know the exact cutoff: is it 20M☉, 21M☉, 22M☉, 23M☉, 24M☉, or 25M☉? Unfortunately, we cannot specify this cutoff
more precisely, since there is uncertainty in our theoretical
calculations. The formation of a black
hole results from the failure of neutron degeneracy pressure to support the
core. Therefore, we might be able to
specify an exact cutoff between an ordinary high mass star and a very high mass
star by calculating the maximum mass neutron degeneracy pressure is able to
support. Caution: this would be the
cutoff mass for only the core of the star, not the cutoff mass for the entire
star. The maximum mass that neutron
degeneracy pressure is able to support is called the Tolman-Oppenheimer-Volkoff limit
or the TOV limit for short, named for the three physicists who first attempted
this calculation: the American physicist Richard Tolman,
the American physicist J. Robert Oppenheimer, and the Russian-born Canadian physicist George Volkoff. The
American physicist J. Robert Oppenheimer is most famous for being the father of
the nuclear bomb, since he was the head physicist of the secret Manhattan
Project during the Second World War. We
are not certain of the precise value of the Tolman-Oppenheimer-Volkoff limit.
Although these three physicists (and other physicists) have attempted
this calculation, neutron stars have such fantastically high densities that the
precise properties of the state of matter within neutron stars are unknown. Presumably, the outer layers of the neutron
star are composed of neutrons; this is called the
crust of the neutron star. However, the
interior of a neutron star is under such incredible pressures that the quarks
(which compose both protons and neutrons) are squeezed
out of the neutrons. Hence, we no longer
have individual neutrons toward the center of a neutron star. The core of a neutron star is actually
composed of a fantastically dense soup of quarks and gluons all colliding with
each other. Just as light is composed of photons and hence photons are ultimately responsible for
the electromagnetic force, gluons are quantum-mechanical particles that are
ultimately responsible for the strong nuclear force. This new state of matter at the core of a
neutron star is called a quark-gluon plasma, about
which we know very little. Therefore,
calculating the exact value of the Tolman-Oppenheimer-Volkoff limit remains elusive. Nevertheless, theoretical estimates have
revealed its approximate value. The Tolman-Oppenheimer-Volkoff limit
is very roughly equal to 2.4M☉ (2.4 solar masses), and it is definitely less than 3M☉ (three solar masses).
Although we do not know the precise value of the Tolman-Oppenheimer-Volkoff limit, for the purposes of this discussion we will
use roughly 2.4M☉ (2.4
solar masses). This is the maximum mass that
neutron degeneracy pressure is able to support.
Therefore, this is the core-mass cutoff between an ordinary high mass
star and a very high mass star. Again,
this is the cutoff mass for the core only; the cutoff mass for the entire star
is 20M☉ to 25M☉ (twenty
to twenty-five solar masses). In other
words, a star with a total mass of 20M☉ to 25M☉ (twenty to twenty-five solar masses) has a core mass
equal to roughly 2.4M☉ (2.4
solar masses). To summarize, if the mass
of the entire star is less than 7M☉, 8M☉, or 9M☉ (seven, eight, or nine solar masses) and of course
greater than the lower limit of 0.08M☉ (0.08 solar masses) of all main sequence stars, then
the mass of its core is less than 1.4M☉ (1.4 solar masses).
In this case, electron degeneracy pressure will be able to support the
core. Therefore, the star must be a low
mass star, and it will die gently as a slowly expanding planetary nebula
surrounding a white dwarf. If the mass
of the entire star is greater than 7M☉, 8M☉, or 9M☉ (seven, eight, or nine solar masses) but less than 20M☉ to 25M☉ (twenty
to twenty-five solar masses), then the mass of its core is greater than 1.4M☉ (1.4
solar masses) but less than roughly 2.4M☉ (2.4 solar masses).
In this case, electron degeneracy pressure will not be able to support
the core, but neutron degeneracy pressure will be able to support the
core. Therefore, the star must be an
ordinary high mass star, and it will die violently as a Type II supernova resulting
in a very hot supernova remnant rapidly expanding away from a neutron star that
is supported by neutron degeneracy pressure. If the mass of the entire
star is greater than 20M☉ to 25M☉
(twenty to twenty-five solar masses) and of course less than the Eddington
limit of roughly 100M☉ (one hundred solar masses) of all main
sequence stars, then the mass of its core is greater than roughly 2.4M☉ (2.4 solar masses) and probably less
than roughly 10M☉ (ten solar masses), the approximate
core mass of the most massive stars at the Eddington limit. In this case, not even neutron degeneracy
pressure is able to support the core.
Therefore, the star must be a very high mass star, and it will die
violently as a Type II supernova resulting in a very hot supernova remnant
rapidly expanding away from a black hole.
As we discussed, since the Chandrasekhar limit is the maximum mass that
electron degeneracy pressure is able to support, it is not only the core-mass
cutoff between low mass stars and ordinary high mass stars; the Chandrasekhar
limit is the maximum permitted mass of a white dwarf. We now realize that the Chandrasekhar
limit is also the minimum mass of a neutron star. Since the Tolman-Oppenheimer-Volkoff limit is the maximum mass that neutron degeneracy
pressure is able to support, it is not only the core-mass cutoff between
ordinary high mass stars and very high mass stars; the Tolman-Oppenheimer-Volkoff limit is the maximum permitted mass of a neutron
star. It is also the minimum mass of a
black hole. In
conclusion, the mass of a white dwarf must be less than the 1.4M☉
Chandrasekhar limit, the mass of a neutron star must be greater than the 1.4M☉ Chandrasekhar limit but less than the
roughly 2.4M☉ Tolman-Oppenheimer-Volkoff limit, and the mass of a black hole must be greater
than the roughly 2.4M☉ Tolman-Oppenheimer-Volkoff limit but less than the 10M☉
rough estimate for the core mass of the most massive stars at the Eddington
limit. Caution: we will discuss
shortly that since nothing can escape from a black hole, a black hole will gain
more and more mass over time. After
billions of years, a black hole may attain a mass of millions and even billions
of solar masses. These are called supermassive black holes, which we will discuss
later in the course. We will use the
term stellar black holes for black holes recently born from the Type II
supernova of a very high mass star, and some stellar black holes grow over
billions of years to become supermassive black holes. We will also discuss toward the end of the
course that there may be microscopic black holes in our universe. These microscopic black holes are also called primordial black holes, since they were born
in the fires of the Big Bang, the creation of the entire universe.
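The mass ranges summarized in this paragraph can be written as a simple decision rule. The sketch below is only an illustration of that summary: it uses the 1.4 solar mass Chandrasekhar limit and the rough 2.4 solar mass Tolman-Oppenheimer-Volkoff value adopted above, and it applies to a newly formed remnant only, since a black hole may gain mass afterward. The function name and the way the exact boundary values are handled are assumptions of this sketch.

```python
CHANDRASEKHAR_LIMIT = 1.4   # solar masses, as stated in the text
TOV_LIMIT = 2.4             # solar masses, the rough value adopted in the text


def classify_remnant(core_mass):
    """Classify a newly formed stellar remnant by its core mass in solar masses."""
    if core_mass <= 0:
        raise ValueError("core mass must be positive")
    if core_mass < CHANDRASEKHAR_LIMIT:
        return "white dwarf"      # supported by electron degeneracy pressure
    if core_mass < TOV_LIMIT:
        return "neutron star"     # supported by neutron degeneracy pressure
    return "black hole"           # no degeneracy pressure can support the core


for mass in (1.0, 2.0, 5.0):
    print(mass, classify_remnant(mass))
```

For example, a core of 1.0 solar masses (like Sirius B) is classified as a white dwarf, a core of 2.0 solar masses as a neutron star, and a core of 5.0 solar masses as a black hole.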
At normal densities, solids
and liquids are highly incompressible, resulting in solids and liquids having
roughly constant densities. In other
words, at normal densities the volume of solids and liquids is directly
proportional to their mass, meaning more mass will occupy a proportionally
larger volume. For example, twice as
much metal or twice as much rock or twice as much
liquid water will all occupy twice as much volume. However, white dwarfs and neutron stars are both supported by degeneracy pressure. Therefore, more massive white dwarfs and more massive neutron stars must in fact have smaller volumes
to provide greater pressure to balance the significantly stronger self-gravity
from their higher mass. A particular white
dwarf or a particular neutron star will actually shrink in volume (shrink in
size) if it happens to gain mass, as we will discuss shortly.
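The contrast between ordinary matter and degenerate matter can be made concrete with a short calculation. The sketch below assumes the standard textbook scaling for a white dwarf supported by non-relativistic electron degeneracy pressure, in which the radius is proportional to the mass raised to the power of negative one third. This scaling is not derived in these notes, it breaks down for masses close to the Chandrasekhar limit, and it describes neutron stars only qualitatively, so the numbers should be read as an illustration rather than a precise prediction.

```python
def ordinary_volume_ratio(mass_ratio):
    """Volume ratio for ordinary incompressible matter: volume proportional to mass."""
    return mass_ratio


def degenerate_volume_ratio(mass_ratio):
    """Volume ratio for degenerate matter, assuming radius proportional to mass**(-1/3)."""
    radius_ratio = mass_ratio ** (-1.0 / 3.0)
    return radius_ratio ** 3   # volume is proportional to the cube of the radius


# Compare what happens when the mass is doubled.
print(ordinary_volume_ratio(2.0))
print(degenerate_volume_ratio(2.0))
```

Under this assumed scaling, doubling the mass of a rock doubles its volume, whereas doubling the mass of a white dwarf roughly halves its volume.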
Stars are born from within a
diffuse nebula, a giant cloud of gas many light-years across composed primarily
of hydrogen and helium. Since a diffuse
nebula is so enormous, many stars are born within a diffuse nebula
simultaneously. Therefore, stars are
born in clusters. However, most stars do
not remain in clusters indefinitely.
After a star cluster is born from a diffuse nebula, the individual stars
move apart from one another, moving along their own trajectories through our
Milky Way Galaxy. Therefore, most stars
are not members of star clusters. For
example, our Sun is not presently a member of a star cluster, although it was
presumably born a member of an ancient star cluster that has long since
dispersed. Although most stars are not
members of star clusters, it is the study of star clusters
that has truly revealed that our models of stellar evolution are
correct. In our discussion of star
clusters, we will see the power of the Hertzsprung-Russell
diagram in explaining and predicting stellar properties and stellar evolution.
When we construct the Hertzsprung-Russell diagram for a star cluster, we can see
the main sequence, the red giants, and the white dwarfs on the diagram. Astronomers often abbreviate the main
sequence MS. The red giants appear along
the asymptotic giant branch, which astronomers often abbreviate AGB. The asymptotic
giant branch connects the main sequence with another collection of stars called
the horizontal branch, which astronomers often abbreviate HB. The horizontal branch ends at another
grouping of stars called the clump. In
almost every Hertzsprung-Russell diagram for almost
every star cluster, we notice that the early part of the main sequence is
missing. This confirms that high mass
main sequence stars live shorter lifetimes than low mass main sequence stars,
which live longer lifetimes. The star cluster
is sufficiently old that the stars from the missing early part of the main
sequence have already died, since they live short lifetimes. However, the star cluster is not sufficiently
old for the stars in the late part of the main sequence to have died yet. These stars are still hydrogen-burning main
sequence stars, since they have longer lifetimes. The earliest main sequence star in the Hertzsprung-Russell diagram for a star cluster is called the main sequence turnoff, since it connects the
main sequence to the asymptotic giant branch.
In other words, the hottest, most luminous, largest, and most massive
(earliest) main sequence star in the Hertzsprung-Russell
diagram for a star cluster is at the main sequence turnoff. The main sequence turnoff reveals the age of
the star cluster. If the main sequence
turnoff is early, then the star cluster must be young, since there are still
short-lifetime main sequence stars that have not yet evolved into red
giants. If the main sequence turnoff is
late, then the star cluster must be old, since there are only long-lifetime
stars remaining on the main sequence.
For example, if the main sequence turnoff is in the spectral type B,
then the star cluster must be roughly ten million years old, since the main
sequence lifetime of a B-type star is roughly ten million years. As another example, if the main sequence
turnoff is in the spectral type F, then the star cluster must be roughly one
billion years old, since the main sequence lifetime of an F-type star is
roughly one billion years. The main sequence
lifetime of a G-type star like our Sun is roughly ten billion years. No star cluster has ever
been discovered with a main sequence turnoff later than the G spectral
type. This is one way
we know the age of the entire universe.
The universe cannot be much older than ten billion years since we have
never discovered a star cluster with a main sequence turnoff later than
spectral type G. At the very end of this
course, we will discuss that the age of the universe is more precisely fourteen
billion years, which we have determined from the expansion of the entire
universe. Notice these two different
methods of determining the age of the universe are fairly
consistent with each other. Since
the asymptotic giant branch connects with the main sequence at the main
sequence turnoff, the red giants along the asymptotic giant branch must be
expanding to become red giants after ending their main sequence lifetimes. The stars near the beginning of the
asymptotic giant branch are orange subgiant stars; they have only recently left
the main sequence. The stars suffering
from the helium flash are at the end of the asymptotic giant branch, where the
asymptotic giant branch connects to the horizontal branch. We also find Cepheid variable stars along the
asymptotic giant branch, since Cepheid variable stars suffer from the
instability of transitioning from a main sequence star to a red giant
star. We will discuss Cepheid variable
stars later in the course. The stars
along the horizontal branch are in the process of attaining gravitational
equilibrium from the new pressure provided by the helium fusion in the stellar
core. We also find RR Lyrae
variable stars along the horizontal branch, since RR Lyrae
variable stars suffer from the instability of transitioning from a red giant
star to a helium-burning star. We will
discuss RR Lyrae variable stars later in the
course. The clump is the collection of
helium-burning stars that have attained gravitational equilibrium. There is often a second asymptotic giant branch
connected to the clump. These are stars that have exhausted the helium in their
cores. The carbon core collapses, while
the outer layers of the star expand.
These are stars that have become red giants for
the second time. Eventually, the slowly
expanding outer layers of the star will divorce themselves from the core. The slowly expanding outer layers will become
a planetary nebula, while the naked core will become a white dwarf. Indeed, we see white dwarfs in the Hertzsprung-Russell diagram for star clusters. If we plot the stars of a newly born star
cluster on the Hertzsprung-Russell diagram, we would
see the entire main sequence with no red giants and no white dwarfs, since a
newly born cluster has not had time for any main sequence stars to die. If we could wait millions of years as the
stars within this newly born star cluster evolve and if we could plot these
stars accordingly on the Hertzsprung-Russell diagram,
we would see the early-type main sequence stars become red supergiant stars and
then disappear from the Hertzsprung-Russell diagram
as they live their short lifetimes and explode as Type II supernovae. As a result, the main sequence turnoff would
appear to advance from spectral type O to spectral type B, thus shrinking the
appearance of the main sequence on this Hertzsprung-Russell
diagram. As the main sequence turnoff
advances to spectral type A, we would see these main sequence stars evolve into
orange subgiant stars and then into red giant stars, forming the asymptotic
giant branch. When these stars
eventually suffer from the helium flash, we would then see the formation of the
horizontal branch and the clump. When
these stars exhaust the helium in their cores, we would then see the formation
of the second asymptotic giant branch.
We would then see white dwarfs begin to appear on this Hertzsprung-Russell diagram. If we could wait billions
more years and if we could continue to plot these stars accordingly on the Hertzsprung-Russell diagram, we would see the main sequence
turnoff continue to advance from spectral type A to spectral type F to spectral
type G as more and more main sequence stars begin the process of stellar death, thus
further shrinking the appearance of the main sequence on the Hertzsprung-Russell diagram. We would continue to see stars move from the
main sequence toward and along the asymptotic giant branch, along the
horizontal branch, through the clump, and along the second asymptotic giant
branch. We would also see more and more
white dwarfs appear on this Hertzsprung-Russell
diagram as these stars reach the very end of their evolution.
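The use of the main sequence turnoff as a clock, described earlier in this paragraph, can be sketched as a simple lookup. The table below contains only the three approximate main sequence lifetimes quoted above (spectral types B, F, and G); the function name is hypothetical, and a real age estimate would use the precise turnoff position rather than a single spectral letter.

```python
# Approximate main sequence lifetimes in years, as quoted in the text above.
MAIN_SEQUENCE_LIFETIME_YEARS = {
    "B": 1e7,    # roughly ten million years
    "F": 1e9,    # roughly one billion years
    "G": 1e10,   # roughly ten billion years
}


def cluster_age_years(turnoff_spectral_type):
    """Estimate a cluster's age from the spectral type of its main sequence turnoff."""
    key = turnoff_spectral_type.strip().upper()
    if key not in MAIN_SEQUENCE_LIFETIME_YEARS:
        raise KeyError(f"no lifetime quoted in the text for spectral type {key!r}")
    return MAIN_SEQUENCE_LIFETIME_YEARS[key]


for spectral_type in ("B", "F", "G"):
    print(spectral_type, f"{cluster_age_years(spectral_type):.0e} years")
```

The later the turnoff, the longer the lifetime of the stars still on the main sequence, and hence the older the cluster.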
The Hertzsprung-Russell
diagram for a star cluster can be used to determine
the distance to the cluster. Suppose a
star cluster is significantly beyond the solar neighborhood. Therefore, the star cluster is too distant
for parallax to be used to determine its
distance. Hence, we need another method
to determine the distance to this star cluster.
The procedure to determine the distance to this cluster is as follows. First, we construct the Hertzsprung-Russell diagram for the star cluster. At this suggestion, we should all
protest. The vertical axis of the Hertzsprung-Russell diagram is the luminosity
or the absolute magnitude or the intrinsic brightness, and we must have
the distances to stars to determine their luminosities or absolute magnitudes
or intrinsic brightnesses. Suppose we use the apparent magnitude
as the vertical axis instead of the absolute magnitude. At this suggestion, we should all even more
strongly protest. The apparent magnitude
or the apparent brightness of a star depends upon its distance from us; the
apparent magnitude is not an intrinsic property of a star! Here is the crux of the argument. The star cluster is distant enough that all
of the stars within the cluster are roughly the same distance from us;
therefore, all of their apparent brightnesses are
directly proportional to their intrinsic brightnesses. A concrete example will make this clear. Suppose we observe two stars in the sky; we will name these two stars Star Alpha and Star Beta. Suppose Star Alpha appears to be brighter
than Star Beta. We cannot draw any
conclusion about the intrinsic brightness or the luminosity of these two stars
without knowing the distance of each star from us. Star Alpha could be intrinsically brighter
than Star Beta, but Star Beta might in fact be intrinsically brighter than Star
Alpha. In this case, Star Beta only
appears dimmer since it is further from us, and Star Alpha only appears
brighter since it is closer to us.
However, now suppose Star Alpha appears to be brighter than Star Beta,
and in addition suppose we have determined using whatever method
that both stars are the same distance from us. We can now be certain that Star Alpha is
indeed intrinsically brighter than Star Beta, and we can be certain that Star
Beta is intrinsically dimmer than Star Alpha.
Again, without knowing distance, we cannot draw any conclusions, but if
we happen to know that two stars are the same distance
from us, then the apparently brighter star is in fact intrinsically brighter
and the apparently dimmer star is in fact intrinsically dimmer. If a star cluster is distant enough, which we
know is the case if the parallax angles are too small to measure, then all the
stars within the cluster are roughly the same distance from us. Of course, the stars in front of the cluster
are somewhat closer to us; of course, the stars in the back of the cluster are
somewhat further from us. Nevertheless,
these are small variations if the entire star cluster is distant enough from
us. If all of the stars in the star
cluster are roughly the same distance from us, then the stars that appear to be
brighter truly are more luminous, and the stars that appear to be dimmer truly
are less luminous. Therefore, we can
construct the Hertzsprung-Russell diagram for a
distant star cluster using the apparent magnitude instead of the absolute
magnitude as the vertical axis. After
constructing the Hertzsprung-Russell diagram, we
should see the main sequence on the diagram, among other features such as the
asymptotic giant branch, the horizontal branch, and the clump. We already know the absolute magnitudes of
main sequence stars as a function of their spectral types from studying nearby
stars within the solar neighborhood.
Thus, we assign these absolute magnitudes to the corresponding main
sequence stars we see on the Hertzsprung-Russell
diagram for the star cluster.
Essentially, we are sliding the star cluster’s entire Hertzsprung-Russell diagram vertically (up and down) until
all main sequence stars on the diagram attain their appropriate absolute
magnitudes. Now that we have both the
absolute magnitudes and the apparent magnitudes of the main sequence stars in
the cluster, the only unknown remaining in the equation I = ℒ / (4πr²)
is the distance r, meaning that we
have successfully determined the distance to the star cluster. This method is called
the main sequence fitting method, since we are determining the distance to the
cluster by combining the apparent magnitudes of the main sequence stars with
their established absolute magnitudes from nearby main sequence stars in the
solar neighborhood. The main sequence
fitting method is the next major rung of the Cosmological Distance Ladder above
the parallax method, which is the lowest rung of the Cosmological Distance
Ladder.
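The last step of the main sequence fitting method can be sketched numerically. When the inverse-square law is rewritten in terms of magnitudes, it becomes the standard distance modulus relation m − M = 5 log10(r / 10 parsecs), which these notes do not state explicitly. The function name and the sample numbers below are hypothetical, chosen only for illustration.

```python
def distance_from_magnitudes(apparent_mag, absolute_mag):
    """Return the distance in parsecs implied by the distance modulus m - M."""
    # m - M = 5 * log10(r / 10 pc)  =>  r = 10 pc * 10 ** ((m - M) / 5)
    return 10.0 * 10.0 ** ((apparent_mag - absolute_mag) / 5.0)

# Hypothetical cluster: the main sequence had to be slid vertically by
# m - M = 10 magnitudes to match the nearby calibration stars.
print(distance_from_magnitudes(10.0, 0.0))  # 1000.0 parsecs
```

The vertical shift of the diagram is the only measured input; every main sequence star in the cluster should give roughly the same shift, and hence the same distance.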
As we discussed, most star
systems are binary star systems. This is
reason enough to devote some discussion to binary star systems. Most binary star systems are detached binaries,
meaning that the two stars orbit each other sufficiently far from one another
that they do not significantly affect each other’s evolution. Whichever star is more massive will live a
shorter main sequence lifetime. That
star will then swell to become a red giant star. The helium flash will occur, and that star
will then become a helium-burning star.
After the star’s helium-burning lifetime, the star will swell to become
a red giant star a second time, eventually ejecting a slowly expanding
planetary nebula and leaving behind a white dwarf. Eventually, the other star will experience
the same sequence of stages of stellar death.
If one of the stars in a binary star system is high mass, it will live
an extremely short main sequence lifetime.
That star will then swell to become a supergiant star and suffer from a
Type II supernova. Extraordinarily, the
other star survives this supernova explosion.
The supernova ejects a hot supernova remnant rapidly expanding away from
either a neutron star or a black hole.
The other star is probably a low mass star and will therefore eventually
experience its own appropriate sequence of stages of stellar death. In conclusion, since most binary star systems
are detached binaries where the two stars orbit each other sufficiently far
from one another that they do not significantly affect each other’s evolution,
all of the stellar evolution we have discussed applies to most binary star
systems. However, if the two stars in a
binary star system are orbiting sufficiently close to each other, they can
affect each other’s evolution.
Therefore, much of the stellar evolution we have discussed requires
modifications. Such binary star systems are called close binaries.
Caution: the term close binary does not mean the binary star system is
close to our Solar System; the term close binary means the two stars in the
binary star system orbit close to each other.
In a close binary, whichever main sequence star is more massive will
live a shorter main sequence lifetime.
That star will then swell to become a giant star. However, the two stars orbit sufficiently
close to each other that the outer layers of the giant star approach the other
less massive star. These gases will then
feel more gravitational attraction from the less massive star. Hence, the outer layers of the giant star
will fall toward the less massive star.
The gas does not fall directly to the second star, since the gas has
angular momentum from the orbital motion of both stars around each other. Therefore, these gases settle into an orbit
around the less massive star, forming a flat disk. Gases within the disk that are closer to the
star orbit faster while gases within the disk that are further from the star
orbit slower, in accordance with Kepler’s laws.
As a result, neighboring gases within the disk move at different speeds;
hence, the gases within the disk rub against each other, resulting in friction
that heats the disk. This increase in
thermal energy (heat energy) must come at the expense of gravitational orbital
energy, since energy must be conserved. Therefore, the gas within the disk migrates
inward, toward the less massive star.
Eventually, the gas collides onto the surface of the star. The less massive star therefore gains mass
through these collisions, which is accretion as we discussed earlier in the
course. It is for this reason that the
flat disk around the less massive star is called an
accretion disk. In summary, there is a
mass transfer from the giant star to the less massive star through an accretion
disk around the less massive star.
Eventually, the less massive star may gain so much mass that it becomes
more massive than the giant star, which has now lost so much mass that the
giant star is less massive than the second star! This explains why we sometimes discover
binary star systems with a giant star that is less massive than the other
star. At first this seems paradoxical: more massive
stars live shorter main sequence lifetimes; therefore, the giant star
should be the more massive star. Indeed,
the giant star was formerly the more massive star, but it lost much of its mass
through a mass transfer to the other star.
A giant star in a close binary may lose so much mass from its outer
layers through this mass transfer that it may become an exotic subgiant star
with a disproportionately large core.
Eventually, the other star may gain so much mass that it begins to
evolve off of the main sequence prematurely. It swells to become a giant star, and its
outer layers approach the first star.
These gases will then feel more gravitational attraction to the first
star, eventually falling toward the first star.
Hence, there is a second mass transfer from the second star back to the
first star!
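The statement that gas closer to the star orbits faster can be illustrated with a short sketch. It assumes circular orbits, for which Kepler's laws together with Newtonian gravity give the speed v = sqrt(GM / r). The disk radii below are hypothetical, and the function name is only for illustration.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # one solar mass, kg

def circular_orbit_speed(star_mass_kg, radius_m):
    """Speed of gas on a circular orbit of the given radius around the star."""
    return math.sqrt(G * star_mass_kg / radius_m)

# Hypothetical inner and outer disk radii around a one-solar-mass star
inner = circular_orbit_speed(M_SUN, 1.0e9)
outer = circular_orbit_speed(M_SUN, 4.0e9)
print(inner > outer)   # True: the inner gas orbits faster
print(inner / outer)   # quadrupling the radius halves the orbital speed
```

Because neighboring rings of gas move at different speeds, they shear past one another, which is the source of the friction described above.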
Usually, both stars in a
close binary are low mass stars.
Eventually, one of the stars will reach the very end of its life,
ejecting a planetary nebula and leaving behind a white dwarf. The star has lost most of its mass when it ejects the planetary nebula, and hence the gravitational
attraction between the white dwarf and the second star is
greatly reduced. As a result, the
center of mass of the close binary is displaced to be
much closer to the second star, and the trajectories of both stars around that
new center of mass are greatly altered.
Often, the gravitational attraction of the two stars is
sufficiently weakened that both stars subsequently move on unbound
trajectories, leaving each other and ending the binary system. The two stars may however continue to orbit
each other, and the two stars may even continue to orbit close to each other,
maintaining the close binary system.
Eventually, the second star ends its main sequence lifetime and expands
to become a giant star. In the case
where the two stars continue to orbit close to each other, the outer layers of
the giant star approach the white dwarf.
These gases will then feel more gravitational attraction from the white
dwarf. Hence, the outer layers of the
giant star will fall toward the white dwarf.
Again, the gas does not fall directly to the white dwarf, since the gas
has angular momentum from the orbital motion of both stars around each
other. Therefore, these gases settle
into an orbit around the white dwarf, forming an accretion disk. Again, the gases within the disk rub against
each other, resulting in friction that heats the disk. This increase in thermal energy (heat energy)
must come at the expense of gravitational orbital energy, since energy must be conserved.
Therefore, the gas within the disk migrates inward, toward the white
dwarf. Eventually, the gas collides onto
the surface of the white dwarf, causing the white dwarf to gain mass. In summary, there is a mass transfer from the
giant star to the white dwarf through an accretion disk around the white dwarf. However, a white dwarf is small, roughly the
size of the Earth, as we discussed.
Hence, the gravitational well of a white dwarf is sufficiently deep that
the gas that collides onto the surface of the white dwarf is strongly
compressed and significantly heated.
This gas is predominantly hydrogen, since it came from the outer layers
of the giant star. As gas continues to
fall onto the white dwarf, the hydrogen on its surface may
eventually be heated to millions of kelvins, causing it to fuse into
helium. This causes the white dwarf to suddenly increase in brightness to thousands of solar
luminosities for a few weeks. This is called a nova. The
sudden increase in luminosity ejects material from the surface of the white
dwarf, resulting in an expanding shell of hot gas away from the close binary
system. This is called
a nova remnant. The nova and the ejected
nova remnant do not stop the mass transfer from the giant star to the white
dwarf from continuing. Eventually,
another nova may occur, ejecting another nova remnant. In other words, novae and nova remnants from
a white dwarf in a close binary are periodic, occurring regularly. Novae and nova remnants from a white dwarf in
a close binary may occur once every few decades, once every few centuries, or
once every few millennia. To summarize,
there are several important differences between a nova and a supernova. Firstly, novae from a white dwarf in a close
binary occur regularly, while the Type II supernova of a high mass star occurs
only once. Secondly, novae last a few
weeks, while a supernova lasts a few months.
Thirdly, novae have luminosities of thousands of solar luminosities,
while supernovae have luminosities of billions of solar luminosities. Note however that observationally a nova and
a supernova may appear identical, at least at first glance. A nova that occurs sufficiently close to us
may appear just as bright (same apparent magnitude) as a supernova that
occurred much further from us. We can
discriminate between a nova and a supernova by determining the distance to the
event and then using that distance to calculate the luminosity (the absolute
magnitude or the intrinsic brightness) of the event. If the luminosity is thousands of
solar luminosities, the event was a nova, not a supernova. If instead the luminosity is billions
of solar luminosities, the event was a supernova, not a nova. We may also discriminate between a nova and a
supernova by noting the duration of time of the event. If the increase in brightness lasts for a few
weeks, we may conclude that the event was a nova, not a supernova. If instead the increase in brightness lasts
for a few months, we may conclude that the event was a supernova, not a
nova. We may also discriminate between a
nova and a supernova by observing the space surrounding the event. If we observe a large slowly expanding
planetary nebula around the event, we may conclude that the event was a nova,
not a supernova. In this case, the
surrounding large planetary nebula was ejected when
the white dwarf first formed. If instead
we observe a small very hot (in the millions of kelvins) supernova remnant
rapidly expanding away from the event, we may conclude that the event was a
supernova, not a nova. In this case, the
small very hot rapidly expanding supernova remnant was just
ejected when the supernova occurred.
Note that the word nova is derived from a Latin
word meaning new. Observationally, a
nova simply appears to be a new star. A
supernova also appears to be a new star, but with much greater luminosity or
absolute magnitude or intrinsic brightness.
Caution: a white dwarf in a close binary may suffer from its own unique
type of supernova, as we will discuss later in the course.
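As a rough sketch of the luminosity test, the inverse-square law I = ℒ / (4πr²) can be inverted to recover the luminosity from the measured intensity and the distance. The two events below are hypothetical, and the dividing line between "thousands" and "billions" of solar luminosities is an arbitrary value assumed only for this illustration.

```python
import math

L_SUN = 3.828e26  # solar luminosity, watts

def luminosity_in_suns(intensity, distance_m):
    """Invert I = L / (4 pi r^2) and express the luminosity in solar units."""
    return 4.0 * math.pi * distance_m ** 2 * intensity / L_SUN

def classify(intensity, distance_m, dividing_line=1.0e6):
    # dividing_line is an assumed, illustrative cutoff in solar luminosities
    if luminosity_in_suns(intensity, distance_m) < dividing_line:
        return "nova"
    return "supernova"

# Two hypothetical events with the same measured intensity (W/m^2);
# the second is one thousand times further away than the first.
intensity = 1.0e-9
near, far = 1.0e19, 1.0e22  # distances in meters
print(classify(intensity, near))  # nova
print(classify(intensity, far))   # supernova
```

The two events look identical in the sky, yet a factor of one thousand in distance corresponds to a factor of one million in luminosity.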
If one of the stars in a
close binary is a high mass star, it will live a short main sequence
lifetime. It then swells to become a
supergiant star, and explodes as a Type II supernova, throwing out a hot
supernova remnant rapidly expanding away from a neutron star or a black
hole. Extraordinarily, the other star
survives the supernova, even though the two stars orbit close to each
other. We now have a neutron star or a
black hole, called the compact object, orbiting a main sequence star, called
the primary object. The primary object
will eventually end its main sequence lifetime and swell to become a giant star. The outer layers of the giant star approach
the compact object and hence feel stronger gravitational attraction from the
compact object. Thus, the outer layers
of the giant star fall toward the compact object. Again, the gas does not fall directly to the
compact object, since the gas has angular momentum from the orbital motion of
both stars around each other. Therefore,
these gases settle into an orbit around the compact object, forming an
accretion disk where friction heats the disk causing the gas to migrate inward
toward the compact object. However, the
gravitational well of a neutron star or a black hole is so incredibly deep that
the gas is heated to temperatures of millions of kelvins
as it falls toward the compact object. At these very high temperatures, the accretion
disk radiates X-rays. These binary star
systems are called X-ray binaries, which
astrophysicists often abbreviate XRBs. The incredibly deep gravitational well of the
compact object also accelerates the falling gas to nearly the speed of
light. Some of this infalling
gas may be ejected as narrow columns or jets near the
rotational angular momentum axis of the accretion disk around the compact
object. For all of these reasons, some
types of X-ray binaries are often called microquasars. We will discuss quasars later in the
course. For now, we simply mention that
the accretion disk of an X-ray binary together with the high-speed jets of gas
ejected along the rotational angular momentum axis of the accretion disk around
the compact object makes X-ray binaries similar to quasars, but on a much
smaller size scale than quasars. This is
why some types of X-ray binaries are often called microquasars.
The compact object of an
X-ray binary is either a neutron star or a black hole. A neutron star has a solid surface. Therefore, very hot gas falling toward a
neutron star that has been accelerated to nearly the
speed of light will eventually collide onto the surface of the neutron star,
causing sudden and intense X-ray bursts.
These X-ray bursts can have luminosities of many thousands of solar
luminosities, entirely in X-rays! Black
holes however do not have a solid surface, as we will discuss shortly. Therefore, very hot gas falling toward a
black hole that has been accelerated to nearly the
speed of light will not collide with a solid surface; the gas rather quietly
disappears from the observable universe as it falls into the black hole. Therefore, there are no sudden and intense
X-ray bursts from a black hole. This is one way astrophysicists determine whether the compact
object in an X-ray binary is a neutron star or a black hole. If we detect sudden and intense X-ray bursts,
then the compact object is a neutron star.
If we do not detect sudden and intense X-ray bursts, then the compact
object is a black hole. Another way
astrophysicists make this determination is by calculating the mass of the
compact object using Kepler’s third law.
If the mass of the compact object is greater than the Tolman-Oppenheimer-Volkoff limit,
then the compact object must be a black hole.
If the mass of the compact object is less than the Tolman-Oppenheimer-Volkoff limit but greater than the Chandrasekhar limit,
then the compact object must be a neutron star.
The first black hole ever discovered was the compact object in an X-ray
binary in the constellation Cygnus (the swan).
This X-ray binary was named Cygnus X-1. The primary object in this star system is a
supergiant star. The compact object was calculated to have a mass significantly greater than the
Tolman-Oppenheimer-Volkoff
limit, revealing that it is indeed a black hole. Yet another way
astrophysicists determine whether the compact object in an X-ray binary
is a neutron star or a black hole is the observation of pulses from the compact
object. As we discussed, a pulsar is a
neutron star, and the observation of electromagnetic pulses from the X-ray
binary would reveal that the compact object is a neutron star. The mass transferred from the primary object
to the neutron star through the accretion disk may add angular momentum to the
neutron star, thus speeding up its rotation.
The result is a millisecond pulsar, since it rotates once in only a
couple of milliseconds. These
millisecond pulsars are also called recycled pulsars,
since each was at first rotationally slowing from a pulsar neutron star to a
non-pulsar neutron star through the loss of angular momentum carried away by
its pulses, but the additional angular momentum from the accreting gases gave
it a second life as a pulsar.
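The mass test for the compact object can be sketched as follows. In solar units (separation in astronomical units, period in years), Kepler's third law gives the total mass of the binary as a³ / P². The orbital numbers and the mass of the primary object below are invented for illustration, and the value used for the Tolman-Oppenheimer-Volkoff limit is an assumed round number, since the true limit is uncertain.

```python
CHANDRASEKHAR_LIMIT = 1.4  # solar masses, approximate
TOV_LIMIT = 3.0            # solar masses; assumed round value for illustration

def total_mass_in_suns(separation_au, period_years):
    """Kepler's third law in solar units: M1 + M2 = a^3 / P^2."""
    return separation_au ** 3 / period_years ** 2

def classify_compact_object(mass_in_suns):
    if mass_in_suns > TOV_LIMIT:
        return "black hole"
    if mass_in_suns > CHANDRASEKHAR_LIMIT:
        return "neutron star"
    return "undetermined"

# Hypothetical X-ray binary: separation 1 AU, period one quarter of a year
total = total_mass_in_suns(separation_au=1.0, period_years=0.25)
compact = total - 6.0  # assumed mass of the primary object, in solar masses
print(total, compact, classify_compact_object(compact))
```

In practice the mass of the primary object would have to be estimated separately, for example from its spectral type.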
The Theories of Relativity: Galilean-Newtonian Relativity, Einsteinian Special Relativity, and Einsteinian General Relativity
Galilean-Newtonian Relativity
theory was formulated between three hundred and four
hundred years ago. This relativity
theory may also be called common-sense relativity
theory, since many of us understand this relativity theory intuitively from our
daily experiences. Fundamental to
Galilean-Newtonian Relativity theory is the Galilean-Newtonian velocity addition
law, which states that the velocity of Object A relative to Object B (written $\vec{v}_{AB}$)
plus the velocity of Object B relative to Object C (written $\vec{v}_{BC}$)
is equal to the velocity of Object A relative to Object C (written $\vec{v}_{AC}$). This law is more properly written
$\vec{v}_{AC} = \vec{v}_{AB} + \vec{v}_{BC}$. This Galilean-Newtonian velocity addition law
may seem intimidating at first, but in fact many of us
already understand this law intuitively from our daily experiences, even if we
cannot state this law mathematically.
Let us discuss several examples to illustrate that this law is indeed
consistent with our common sense. As our
first example, suppose a train is moving at ten miles per hour to the right
relative to the ground, and suppose someone on the train fires a bullet moving
at one hundred miles per hour to the right relative to the train. Then, the velocity of the bullet relative to
the ground is one hundred and ten miles per hour to the right. As our second example, suppose that a train
is moving at ten miles per hour to the right relative to the
ground, and suppose someone on the ground fires a bullet moving at one
hundred miles per hour to the right relative to the ground. Then, the velocity of the bullet relative to
the train is ninety miles per hour to the right. As our third example, suppose a car is moving
at seventy miles per hour on one side of a highway relative to the ground, and
suppose another car is moving at fifty miles per hour on the same side of the
highway and hence is moving in the same direction relative to the ground. Then, the faster car is moving at twenty
miles per hour forwards relative to the slower car.
Also, the slower car is moving at twenty miles
per hour backwards relative to the faster car.
As our fourth example, suppose a car is moving at seventy miles per hour
on one side of a highway relative to the ground, and suppose another car is
moving at fifty miles per hour on the opposite side of the highway and hence is
moving in the opposite direction relative to the ground. Then, either car is moving at one hundred and
twenty miles per hour relative to the other car. Why did we subtract seventy miles per hour
and fifty miles per hour to obtain twenty miles per hour in our third
example? Why did we add seventy miles
per hour and fifty miles per hour to obtain one hundred and twenty miles per
hour in our fourth example? The simple,
common-sense explanations are as follows.
If we are driving at fifty miles per hour on a highway and if a car on
the same side of the highway comes up from behind us at seventy miles per hour
and collides with us, the collision will be mild, since the relative speed
between the two cars is only twenty miles per hour. This collision is exactly as if our car were
parked and we were hit by a car moving at twenty miles per
hour. However, if we are driving at
fifty miles per hour on a highway and if a car moving in the opposite direction
at seventy miles per hour collides with us (a head-on collision), we would be dead, since the relative speed between the two
cars is one hundred and twenty miles per hour.
This collision is exactly as if our car were parked and we were hit by a car moving at one hundred and twenty miles per
hour. As a fifth example, suppose
a car is moving at sixty miles per hour on one side of a highway relative to
the ground, and suppose another car is moving at sixty miles per hour on the
same side of the highway and hence is moving in the same direction relative to
the ground. Then either car is not
moving (is at rest) relative to the other car.
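The five examples above can be checked with a few lines of code. This is only a sketch for motion along a single line, where velocities to the right are positive and velocities to the left are negative. It also uses the fact that the velocity of B relative to A is the negative of the velocity of A relative to B. The function name is hypothetical.

```python
def galilean_add(v_ab, v_bc):
    """Velocity of A relative to C, given A relative to B and B relative to C."""
    return v_ab + v_bc

# All velocities in miles per hour.
# First example: bullet relative to train, plus train relative to ground.
print(galilean_add(100, 10))   # 110
# Second example: bullet relative to ground, plus ground relative to train.
print(galilean_add(100, -10))  # 90
# Third example: faster car relative to ground, plus ground relative to slower car.
print(galilean_add(70, -50))   # 20
# Fourth example: the second car moves at -50 relative to the ground,
# so the ground moves at +50 relative to the second car.
print(galilean_add(70, 50))    # 120
# Fifth example: two cars at the same speed in the same direction.
print(galilean_add(60, -60))   # 0
```

Whether we "add" or "subtract" the speeds is decided entirely by the signs of the velocities; the law itself is always a sum.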
After centuries of physicists believing that Galilean-Newtonian Relativity
theory (common-sense relativity theory) is correct, new physics was discovered
that began to reveal that these laws, although seemingly indisputable, are in
fact not correct. In the 1860s, the brilliant Scottish physicist James Clerk Maxwell
formulated classical electromagnetic theory with four equations, later named
the Maxwell equations in his honor.
These four Maxwell equations are mathematically beautiful. These four Maxwell equations completely
summarize classical electromagnetism.
These four Maxwell equations even revealed that light is an
electromagnetic wave, and the entire wave theory of light can
be derived from these four Maxwell equations, including all the laws of
classical optics. However, these four
Maxwell equations also stated that the vacuum speed of light is always the same
number, written c and equal to
roughly three hundred thousand kilometers per second or roughly one hundred and
eighty-six thousand miles per second.
This cannot be true, can it? According
to common sense, the vacuum speed of light cannot always be the same number, as
the following few examples will illustrate.
Suppose a train is moving at velocity V to the right relative to the ground, and suppose someone on the
train with a flashlight sends a light beam moving at velocity c to the right relative to the
train. Then, the velocity of the light
beam relative to the ground is c plus
V, isn’t
it? Please review our first example from
Galilean-Newtonian Relativity theory for help with this example, since they are
in fact identical examples. Suppose a
train is moving at velocity V to the
right relative to the ground, and suppose someone on the ground with a
flashlight sends a light beam moving at velocity c to the right relative to the ground. Then, the velocity of the light beam relative
to the train is c minus V, isn’t
it? Please review our second example
from Galilean-Newtonian Relativity theory for help with this example, since they
are in fact identical examples. Suppose
a train is moving at velocity c to
the right relative to the ground, and suppose someone on the ground with a
flashlight sends a light beam moving at velocity c to the right relative to the ground. Then, the light beam is not moving (is at
rest) relative to the train, isn’t it? Please review our fifth example from
Galilean-Newtonian Relativity theory for help with this example, since they are
in fact identical examples. Suppose a
train is moving at velocity V relative
to the ground, and suppose someone on the ground with a flashlight sends a
light beam moving at velocity c relative to the ground at right angles to the train’s velocity. Then, the light beam is moving at a speed $\sqrt{c^2 + V^2}$ relative to the train, isn’t
it? All of these examples persuade us to
conclude that according to the common sense of our daily experiences, the
vacuum speed of light should depend upon which direction we are moving, how
fast we are moving, and in which direction the light itself is moving. Our particular examples, using the common
sense of our daily experience, tell us that sometimes the vacuum speed of light
might be c, but other times it might
be c plus V, sometimes it could be c
minus V, sometimes it could be zero,
sometimes it could be $\sqrt{c^2 + V^2}$,
and so on and so forth. The two American
physicists Albert A. Michelson and Edward W. Morley set out to show that this
is the case in the 1880s with what was later called
the Michelson-Morley experiment in their honor.
However, their experiment showed that the
vacuum speed of light does not depend upon which direction we are moving or how
fast we are moving or even in which direction the light is itself moving! Their measurements showed that the vacuum speed
of light is always the same number, always c! Our common sense tells us that this cannot be
true, and indeed many physicists believed that Michelson and Morley performed
their experiment incorrectly. Some
physicists did believe the result, but they could not explain how this can possibly
be true.
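The common-sense predictions that the experiment contradicted can be sketched numerically. The code below subtracts the train's velocity from the light beam's velocity component by component, exactly as Galilean-Newtonian Relativity theory prescribes. The train speed is hypothetical, and the function name is only for illustration. Note that these are the predictions that turned out to be wrong.

```python
import math

C = 299_792.458  # vacuum speed of light, kilometers per second

def galilean_light_speed(v_train, angle_deg):
    """Galilean prediction for the speed of a light beam relative to a train.

    The train moves at v_train to the right relative to the ground; the beam
    moves at C relative to the ground, at angle_deg from the train's direction.
    """
    angle = math.radians(angle_deg)
    vx = C * math.cos(angle) - v_train  # component along the train's motion
    vy = C * math.sin(angle)            # component at right angles to it
    return math.hypot(vx, vy)

V = 30.0  # hypothetical train speed, kilometers per second
print(galilean_light_speed(V, 0))    # c - V: beam moving along with the train
print(galilean_light_speed(V, 180))  # c + V: beam moving opposite to the train
print(galilean_light_speed(V, 90))   # sqrt(c^2 + V^2): beam at right angles
```

The measurements showed none of these variations; the speed was c in every case.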
This brings us to the person
who would explain all of these mysteries.
Albert Einstein was at the time an unknown physicist without an academic
position. According to an often-repeated story, one of his
schoolteachers told young Albert’s father, “Nothing good will
ever come from this boy!” In the year
1905, Albert Einstein worked at a patent office in Switzerland. Although many physicists would feel
humiliated working in such a position, this job gave Einstein plenty of time to
think about fundamental physics. Einstein
was so enraptured by the beauty of the Maxwell
equations that he became convinced that they must be true. Most physicists would have responded that the
Maxwell equations cannot be completely true, since
they state that the vacuum speed of light is always c, which common sense says is impossible. Only Einstein would dare assert the
following. The Maxwell equations are so
beautiful that they must be true.
Therefore, if they state that the vacuum speed of light is always c, then the vacuum speed of light is
always c! This is Special Relativity theory in one
sentence. Einstein’s Special Relativity
theory states that the vacuum speed of light does not depend upon which
direction we are moving or how fast we are moving or even in which direction
light happens to be moving. Einstein’s
Special Relativity theory states that the vacuum speed of light is always the
same number, written c and equal to
roughly three hundred thousand kilometers per second or roughly one hundred and
eighty-six thousand miles per second. In
other words, Einstein’s Special Relativity theory states that the vacuum speed
of light is an invariant.
Einstein’s Special Relativity
theory is simple to state, but this theory confounds our common sense. How can the vacuum speed of light possibly be
an invariant? The basic argument is as
follows. If the vacuum speed of light is
always the same number c, then space
and time must change to ensure that the vacuum speed of light c does not change. For example, Einstein made the following incredible
deduction from his new theory. Time
slows down when we move; moving clocks actually run slow! This is called time
dilation. Consider any clock whatsoever,
such as a mechanical clock or the electronic clock within our mobile
telephones. According to Einstein’s
Special Relativity theory, a clock must run slower if it moves. Suppose all of us had identical mobile
telephones, and suppose we synchronized their clocks. If one of us walks with our mobile telephone,
our mobile telephone runs slower than everyone else’s mobile telephones! As a result, our time falls behind everyone
else’s time! Is Einstein actually
claiming that whenever we walk or ride a bicycle or drive a car or ride a train
or ride an airplane that our time slows down?
Yes! But then
why do we never notice in our daily experience that our time slows down? The time dilation effect is tiny under most
circumstances. We would only notice
these temporal changes if we moved incredibly fast, close to the vacuum speed
of light c. The implications of this time dilation effect
are staggering. For example, consider
two identical twins who have lived together in the same house their entire
lives. Hence, they are the same
age. However, if one of them walks down
the street, that twin will age a tiny amount slower, since their time is now
running slower. When that twin returns
home, that twin will be a tiny amount younger than the twin who remained at
home! Time dilation was considered
outrageous a century ago, but this effect has actually been observed in recent
decades. For example, suppose we
synchronize two extraordinarily accurate atomic clocks. Then, suppose we place one of these atomic
clocks on an airplane. After the flight,
physicists have actually experimentally measured that
the airplane’s atomic clock is behind the ground’s atomic clock by a tiny
amount! As another example, consider an
unstable subatomic particle that decays after a short lifetime. If this particle is
accelerated close to the vacuum speed of light c, it is observed to live much longer before
decaying. When the particle moves, its
time slows down, permitting it to live a longer lifetime before decaying. The Global
Positioning System (GPS), on which many computers and mobile telephones rely, would not function correctly without taking into
account the fact that the clocks aboard its orbiting satellites run at a different rate from clocks on the ground, in part because
the satellites move at high speed!
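To see just how tiny time dilation is at everyday speeds, we can evaluate the standard time dilation factor γ = 1 / sqrt(1 − v²/c²), which these notes do not write out. A moving clock runs slow by this factor, so γ − 1 is the fractional amount by which it falls behind. The sample speeds are hypothetical. Because γ differs from 1 by less than ordinary floating-point rounding at walking speed, this sketch computes γ − 1 through logarithms rather than directly.

```python
import math

C = 299_792_458.0  # vacuum speed of light, meters per second

def gamma_minus_one(speed):
    """Return gamma - 1, where gamma = 1 / sqrt(1 - v^2/c^2).

    gamma = exp(-0.5 * ln(1 - v^2/c^2)), so log1p and expm1 give an accurate
    answer even when gamma is extremely close to 1.
    """
    beta_sq = (speed / C) ** 2
    return math.expm1(-0.5 * math.log1p(-beta_sq))

for label, speed in [("walking", 1.4), ("airplane", 250.0), ("0.99 c", 0.99 * C)]:
    print(label, gamma_minus_one(speed))
```

For the walker the result is of order one part in one hundred million billion, while at ninety-nine percent of c the moving clock runs slow by a factor of roughly seven.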
Einstein drew another
incredible conclusion from his new theory: space contracts when we move; moving
objects actually contract! This is called length contraction. Consider any object whatsoever. According to Einstein’s Special Relativity
theory, the object must contract in the direction it is moving. While we are walking, we are skinnier than
usual, and not because we are getting exercise!
When we stop moving, our shape returns to normal. Is Einstein actually claiming that moving
cars and moving trains are shorter than normal?
Yes! But then
why do we never notice in our daily experience that moving cars and moving
trains are shorter than normal? The
length contraction effect is tiny under most circumstances. We would only notice this spatial change if
we moved incredibly fast, close to the vacuum speed of light c.
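Length contraction can be sketched the same way. The standard formula, not written out in these notes, is L = L₀ sqrt(1 − v²/c²), where L₀ is the length of the object at rest and L is its length along the direction of motion. The train length and speeds below are hypothetical.

```python
import math

C = 299_792_458.0  # vacuum speed of light, meters per second

def contracted_length(rest_length, speed):
    """Length along the direction of motion: L = L0 * sqrt(1 - v^2/c^2)."""
    return rest_length * math.sqrt(1.0 - (speed / C) ** 2)

train = 200.0  # hypothetical rest length of a train, meters
print(contracted_length(train, 50.0))     # everyday speed: essentially 200 m
print(contracted_length(train, 0.6 * C))  # about 160 m
print(contracted_length(train, 0.8 * C))  # about 120 m
```

At fifty meters per second the train is shorter by only a few trillionths of a meter, which is why we never notice the effect.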
We now begin to have some
understanding how it could possibly be true that the vacuum speed of light is
an invariant, always equal to the same number c. Speed is equal to
distance divided by time. Distance is measured with graduated rods such as meter sticks, and
time is measured with clocks. However,
moving objects contract and moving clocks run slow! To ensure that the vacuum
speed of light is always equal to the same number c, every graduated rod in the universe contracts by just the right
amount and every clock in the universe slows down by just the right amount to
ensure that the distance traveled by light divided by the time for light to
travel always equals the same number c. Space and time must change to ensure that c does not change!
When Einstein deduced length
contraction and time dilation from his new theory, he realized that the
Galilean-Newtonian velocity addition law could no longer be true. Physicists believed that the
Galilean-Newtonian velocity addition law was true for centuries, and even today the common sense of our daily experience tells us that
it is true. We must realize that time
dilation and length contraction are very tiny effects for objects moving at
speeds very slow compared with c,
such as walking people, driving cars, moving trains, and flying airplanes. This makes the Galilean-Newtonian velocity
addition law almost correct, but still not exactly correct,
at these slow speeds. At very fast
speeds approaching c, we would
actually notice that this law is severely wrong. Einstein deduced the correct velocity addition
law by taking time dilation and length contraction into account. This new law is called
the Lorentz-Einstein velocity addition law, named for both Albert Einstein and
the Dutch physicist Hendrik Lorentz. The
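Lorentz-Einstein velocity addition law states that, for motion along a single line, $v_{AC} = (v_{AB} + v_{BC}) / (1 + v_{AB} v_{BC} / c^2)$, where $v_{AB}$ is the velocity of Object A relative to Object B, and similarly for the other subscripts. This new velocity addition law correctly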
ensures that the vacuum speed of light is an invariant, always equal to the
same number c.
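A short sketch shows how the Lorentz-Einstein velocity addition law behaves for motion along a single line, using the standard form (v₁ + v₂) / (1 + v₁v₂ / c²). The function name and the sample speeds are hypothetical.

```python
C = 299_792.458  # vacuum speed of light, kilometers per second

def einstein_add(v_ab, v_bc):
    """Velocity of A relative to C for motion along a single line."""
    return (v_ab + v_bc) / (1.0 + v_ab * v_bc / C ** 2)

# Light sent forward from a train moving at 30 km/s: the result is still c.
print(einstein_add(C, 30.0) / C)
# Two speeds of ninety percent of c combine to less than c.
print(einstein_add(0.9 * C, 0.9 * C) / C)
# At everyday speeds the result is practically the Galilean-Newtonian sum.
print(einstein_add(0.03, 0.003))
```

Combining c with any other speed returns c, and combining two speeds below c always gives a result below c.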
Einstein drew yet another
incredible conclusion from his new theory: the mass of an object increases as
it moves faster. For centuries,
physicists believed that the mass of an object is fixed, and even today the common sense of our daily experience tells us that
the mass of an object is fixed. We must
realize that the additional mass is very tiny at speeds very slow compared with
c, such as the speeds of walking
people, driving cars, moving trains, and flying airplanes. We would need to move at very fast speeds
approaching c to
actually notice this extra mass. If
we stand still, we have a certain amount of mass, but as we walk
we have more mass! The next time someone
urges us to go jogging to lose some weight, we should
respond that we will gain mass if we jog!
The extra mass is tiny at such a slow speed, but it is nevertheless
real! The equation for this extra mass
(which we will not present in this course) reveals yet another outrageous
consequence of this theory: the vacuum speed of light c is the speed limit of the universe. An object gains mass when it moves faster,
but this means that we would then require more force to speed it up
further. If a force does speed the
object up further, then the object would gain even more mass, and thus we would
require even more force to speed it up further still. According to the extra-mass equation, the
mass of an object approaches infinity as its speed approaches c.
This means that we would need an infinite force to speed the object up
to c, but it is impossible to exert
an infinite force. Not only does the
universe forbid anything from moving faster than c, the universe forbids any object to even reach
c!
We could accelerate a spaceship faster and faster, making it move closer
and closer to c, but the spaceship
can never reach c. The only things permitted to actually move at
c are things already moving at c, such as light or any electromagnetic
wave (composed of photons) from across the entire Electromagnetic
Spectrum. We will discuss shortly that
according to Einstein’s General Relativity theory, gravity also moves at c.
Any object moving slower than c
is forever constrained to move slower than c. Such an object can move faster and faster
approaching c, but actually reaching c is forbidden. Moving faster than c is out of the question.
These conclusions were considered outrageous a century ago, but they
have been confirmed experimentally in recent decades. In
particle accelerators, we can speed up subatomic particles, but physicists have
experimentally verified that these particles do indeed gain more and more mass as they move faster and faster, in precise
obedience to the extra-mass equation that Einstein discovered. Moreover, physicists have experimentally
verified that the vacuum speed of light c
acts as a bottleneck, precisely as predicted by the extra-mass equation that Einstein
discovered. We can accelerate particles
very close to c, but we cannot
accelerate particles to actually reach c.
Speeds faster than c are out
of the question. Modern particle
accelerators can accelerate protons to speeds faster than 99.9999% of c, but nevertheless
still slower than c itself. In summary, Einstein’s Special Relativity
theory states that our universe has a speed limit, the vacuum speed of light c!
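Although we do not present the extra-mass equation in this course, a short numerical sketch may help to show why c acts as a speed limit. The code below assumes the standard textbook form, in which the mass of a moving object equals its mass at rest divided by the square root of 1 − v²/c². The function names and the sample speeds are our own illustrative choices.

```python
import math

C = 299_792_458.0  # vacuum speed of light in meters per second


def lorentz_factor(speed):
    """Factor by which mass grows at the given speed (assumed textbook formula)."""
    if not 0.0 <= speed < C:
        raise ValueError("speed must be at least zero and strictly less than c")
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)


def extra_mass_fraction(speed):
    """Fractional extra mass, written in a form that stays accurate at tiny speeds."""
    beta_squared = (speed / C) ** 2
    root = math.sqrt(1.0 - beta_squared)
    return beta_squared / (root * (1.0 + root))


jogging = extra_mass_fraction(3.0)           # jogging at 3 meters per second
proton = lorentz_factor(0.999999 * C)        # a proton at 99.9999% of c

print("fractional extra mass while jogging:", jogging)
print("mass growth factor at 99.9999% of c:", proton)
for fraction in (0.9, 0.99, 0.999, 0.9999):
    print(fraction, "of c gives a factor of", lorentz_factor(fraction * C))
```

The jogging result is roughly five parts in one hundred million billion, which is why we never notice it, while the factor at 99.9999% of c is roughly seven hundred and keeps growing without bound as the speed approaches c.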
Einstein also deduced his
famous mass-energy relation from his new theory, which states that energy is
equal to mass multiplied by the square of the vacuum speed of light. This law is most commonly written E = mc². The consequences of this equation are
staggering. For example, consider
nuclear reactions. An exothermic nuclear
reaction liberates energy, while an endothermic nuclear reaction absorbs
energy. These two terms exothermic and
endothermic are used to describe chemical reactions as
well. If an exothermic nuclear reaction
liberates energy, then the reaction liberated mass as well. Thus, the products of an exothermic reaction
have less mass than the reactants! If an
endothermic nuclear reaction absorbs energy, then the reaction absorbed mass as
well. Thus, the products of an
endothermic reaction have more mass than the reactants! Einstein stated this mass-energy relation in
the year 1905. It would be a few years
later before physicists even discovered that an atom has a nucleus, and it
would be almost forty years later before the first nuclear weapons were built.
Nevertheless, Einstein actually stated in the year 1905 that his
mass-energy relation could be tested by studying
radioactive materials. It would be years
before physicists even realized that radioactivity is a type of nuclear
reaction! Let us spend a moment reflecting
upon Einstein’s genius. Almost forty
years before nuclear weapons were built and even a few
years before the nucleus of an atom was discovered, Einstein formulated the
mass-energy equation and applied it to radioactivity, a type of nuclear
reaction!
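To get a feeling for the size of the numbers in the mass-energy relation, the following sketch converts between mass and energy using E = mc². The one-gram mass and the energy of the hypothetical reaction are illustrative values that we have chosen ourselves; they do not describe any particular reaction from the lecture.

```python
C = 299_792_458.0  # vacuum speed of light in meters per second


def energy_from_mass(mass_kg):
    """Energy in joules equivalent to a mass in kilograms, from E = m c^2."""
    return mass_kg * C**2


def mass_from_energy(energy_joules):
    """Mass in kilograms equivalent to an energy in joules, from m = E / c^2."""
    return energy_joules / C**2


# Energy equivalent of a single gram of mass (illustrative value).
one_gram_energy = energy_from_mass(0.001)

# Mass carried away by a hypothetical exothermic reaction that liberates
# 1.0e14 joules: its products have this much less mass than its reactants.
mass_lost = mass_from_energy(1.0e14)

print("energy equivalent of one gram, in joules:", one_gram_energy)
print("mass lost when 1.0e14 joules are liberated, in grams:", mass_lost * 1000.0)
```

One gram of mass corresponds to roughly 9 × 10¹³ joules, because the square of c is such an enormous number. Conversely, even a very energetic reaction changes the mass of its ingredients by only a small amount.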
The mass-energy relation may
be the last of Einstein’s contributions to Special Relativity theory, but one
of his former math teachers, the German-Polish-Russian mathematician Hermann Minkowski, realized what this new theory is really trying
to tell us. According to Special
Relativity theory, we live in a four-dimensional universe. According to the common sense of our daily
experience, we live in a three-dimensional universe. These three dimensions are length, width, and
height, mathematically written as x, y, and z. However, time is the
fourth dimension according to relativity theory. Time is usually written
as t, but time is written as ct in relativity
theory. In other words, we live in a
four-dimensional universe: three spatial dimensions (x, y, and z) and one temporal dimension (ct). Moreover, these four dimensions mix into one
another, and the mixing of the temporal dimension with the three spatial
dimensions is the fundamental cause of time dilation, length contraction, the
invariance of c, the universal speed
limit of c, and even the mass-energy
relation. Minkowski
invented a new word to describe our four-dimensional universe. Minkowski took the
word space and the word time, and he put them together to form a new word: spacetime. Notice
that there is no space or even a hyphen between the two words used to construct
this new word. To summarize Einstein’s
Special Relativity theory, we live in a four-dimensional spacetime
with three spatial dimensions (x, y, and z) and one temporal dimension (ct) that all mix into one another
thus causing time dilation, length contraction, the invariance of c, the universal speed limit of c, and the mass-energy relation.
Fictitious forces or pseudoforces are forces that do not actually exist; they
only seem to exist in certain frames of reference. For example, suppose we are in a stationary
car waiting at a red traffic light. When
the red traffic light turns green, we place our foot upon the car’s accelerator
pedal. As the car accelerates forward,
everyone and everything in the car feels a backward force. We actually feel ourselves pulled backward
into the backrest of our chair. Anything
hanging from the rearview mirror also swings backward. This backward force is a fictitious force or
a pseudoforce.
It does not exist; it only seems to exist within the car as the car
accelerates forward. Although everyone
and everything within the car feels this backward force, it nevertheless does
not actually exist. In actuality,
everyone and everything within the car remains stationary for a moment as the
car and its chairs accelerate forward, and hence the backrests of the chairs
accelerate forward and collide with our own backs. This is amusing: within the car we feel pulled backward into the backrests of the
chairs, but in actuality we remain stationary while the backrests of the chairs
accelerate forward into our backs!
Although we feel this backward force within the car, we nevertheless
conclude that this backward force is a fictitious force or a pseudoforce. It does
not actually exist; it only seems to exist within the car as the car
accelerates forward. As another example,
suppose we are in a moving car when we see a green traffic light turn yellow,
and so we place our foot upon the car’s brake pedal. As the car slows down, everyone and
everything in the car feels a forward force.
We actually feel ourselves pulled forward off of
the backrest of our chair. Anything
hanging from the rearview mirror also swings forward. In extreme cases, we may feel pulled forward
so strongly that our heads may collide with the windshield. This forward force is a fictitious force or a
pseudoforce.
It does not exist; it only seems to exist within the car as the car
slows down. Although everyone and
everything within the car feels this forward force, it nevertheless does not
actually exist. In actuality, everyone
and everything within the car remains in motion for a moment as the car and its
chairs and its windshield slow down, and hence the
backrests of the chairs move away from our own backs while the windshield moves
toward our heads. This is amusing:
within the car we feel pulled forward off of the
backrests of the chairs and toward the windshield, but in actuality the
backrests of the chairs move away from our backs and the windshield moves
toward our heads! Although we feel this
forward force within the car, we nevertheless conclude that this forward force
is a fictitious force or a pseudoforce. It does not actually exist; it only seems to
exist within the car as the car slows down.
As yet another example, suppose we are in a moving car when we see that
the highway ramp ahead curves to the left, and so we turn the steering wheel to
the left so that the car will remain on the highway ramp. As the car turns left, everyone and
everything in the car feels a rightward force.
We actually feel ourselves pulled rightward away from the driver’s side
of the car and toward the passenger’s side of the car. Anything hanging from the rearview mirror
also swings rightward and continues to remain suspended rightward in apparent
defiance of the Earth’s downward gravity as the car turns left! This rightward force is a fictitious force or
a pseudoforce.
It does not exist; it only seems to exist within the car
as the car turns left. Although everyone
and everything within the car feels this rightward force, it nevertheless does
not actually exist. In actuality,
everyone and everything within the car remains in forward motion as the car
turns left, and hence the driver’s side of the car turns away from us while the
passenger’s side of the car turns toward us.
This is amusing: within the car we feel pulled
rightward toward the passenger’s side of the car, but in actuality we remain in
forward motion while the passenger’s side of the car turns leftward toward
us! Although we feel this rightward
force within the car, we nevertheless conclude that this rightward force is a
fictitious force or a pseudoforce. It does not actually exist; it only seems to
exist within the car as the car turns left. As a fourth example, projectiles will appear
to suffer from deflections within a rotating frame of reference. This deflecting force is a fictitious force
or a pseudoforce.
It does not exist; it only seems to exist within the rotating frame of
reference. In actuality, the projectiles
are not deflected; the projectiles in fact continue
moving along straight paths. The frame
of reference is rotating, and the rotation of the entire frame of reference
seems to cause projectiles to deviate from straight trajectories. This particular fictitious force or pseudoforce is called the Coriolis
force, named for the French physicist Gaspard-Gustave de Coriolis who first
derived the mathematical equations describing this particular fictitious force
or pseudoforce.
The Coriolis force appears to cause rightward deflections in frames of
reference rotating counterclockwise, and the Coriolis force appears to cause
leftward deflections in frames of reference rotating clockwise. The Coriolis force appears to cause stronger
deflections if the frame of reference is rotating faster and appears to cause
weaker deflections if the frame of reference is rotating slower. The Coriolis force appears to vanish if the
frame of reference stops rotating. The
Coriolis force only appears to cause deflections; it does not cause projectiles
to speed up or slow down.
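The apparent deflection can be sketched numerically. In the code below, a projectile moves along a perfectly straight line at constant speed in an inertial frame of reference, and we simply re-express its position in coordinates that rotate with the frame. The function names and sample numbers are our own assumptions. Note also that the apparent path in a rotating frame reflects all of the fictitious forces of that frame together, although shortly after launch the sideways drift is dominated by the Coriolis force.

```python
import math


def apparent_position(speed, omega, t):
    """Position of the projectile as seen from the rotating frame.

    In the inertial frame the projectile leaves the origin along the +x axis
    at constant speed and is never deflected. The frame rotates at angular
    speed omega in radians per second (positive means counterclockwise).
    """
    x, y = speed * t, 0.0                      # straight line in the inertial frame
    angle = omega * t                          # how far the frame has turned
    x_rot = x * math.cos(angle) + y * math.sin(angle)
    y_rot = -x * math.sin(angle) + y * math.cos(angle)
    return x_rot, y_rot


def sideways_drift(speed, omega, t):
    """Sideways displacement: negative is rightward, positive is leftward."""
    return apparent_position(speed, omega, t)[1]


print("counterclockwise frame:", sideways_drift(10.0, 0.1, 1.0))
print("clockwise frame:       ", sideways_drift(10.0, -0.1, 1.0))
print("faster rotation:       ", sideways_drift(10.0, 0.2, 1.0))
print("no rotation:           ", sideways_drift(10.0, 0.0, 1.0))
```

For a projectile launched along the +x axis, a negative sideways value lies to its right. The sketch gives a rightward drift in the counterclockwise frame, a leftward drift in the clockwise frame, a larger drift for faster rotation, and no drift at all when the frame stops rotating.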
As we discussed earlier in
the course, physicists use the word acceleration for the rate at which an
object’s motion changes, where the object could be suffering from any change in
motion whatsoever. An object that is
speeding up is said to be accelerating, but an object
that is slowing down is also said to be accelerating. (In colloquial English, we would use the word
decelerating instead.) Moreover, an
object that is neither speeding up nor slowing down but only changing the
direction that it moves is also said to be
accelerating. In all four of our
examples of fictitious forces or pseudoforces, notice
that the frame of reference is accelerating.
In the first example, the car was accelerating forward. In the second example, the car was slowing
down, which again is a form of acceleration.
In the third example, the car was changing the direction that it was
moving, which again is a form of acceleration.
In our fourth example, the entire frame of reference was rotating, which
is also a form of acceleration. A frame
of reference where there are no fictitious forces or pseudoforces
is called an inertial frame of reference, while a
frame of reference where fictitious forces or pseudoforces
appear to exist is called a non-inertial frame of reference. It is not difficult to prove mathematically
that all inertial frames of reference (where there are no
fictitious forces or pseudoforces) are not
accelerating relative to one another. It
is also not difficult to prove mathematically that all non-inertial frames of
reference (where fictitious forces or pseudoforces
appear to exist) are accelerating relative to all inertial frames of reference,
as our four examples illustrate. Since
fictitious forces or pseudoforces appear to exist
within non-inertial (accelerating) frames of reference, the laws of physics
require particular modifications when used within non-inertial frames of
reference. Since there are no fictitious
forces or pseudoforces within inertial
(non-accelerating) frames of reference, the laws of physics do not require
these particular modifications when used within inertial frames of
reference. More plainly, the laws of
physics apply naturally from within inertial (non-accelerating) frames of
reference, but the laws of physics do not naturally apply from within
non-inertial (accelerating) frames of reference. All of the laws of physics we have discussed
thus far in this course apply naturally from within inertial (non-accelerating)
frames of reference. In particular,
Galilean-Newtonian Relativity theory, Newton’s laws of motion, Newton’s theory
of gravitation, Maxwell’s electromagnetic theory, Quantum Mechanics, and even
Einstein’s Special Relativity theory all apply naturally from within inertial
(non-accelerating) frames of reference. None of the laws of physics we have discussed thus far in this
course apply naturally from within non-inertial (accelerating)
frames of reference. Note that this is
why Einstein’s Special Relativity theory is called
Special Relativity. This theory only
applies naturally from within special frames of reference, inertial
(non-accelerating) frames of reference, just as all the laws of physics we have
discussed thus far in this course apply naturally from within inertial
(non-accelerating) frames of reference.
Einstein was
extremely bothered by this restriction upon the laws of physics, in
particular upon his Special Relativity theory.
If the laws of physics are the mathematical equations that describe the
universe, then we should feel free to apply them from within any frame of
reference whatsoever. Consequently,
Einstein realized that he must generalize his Special Relativity theory to a new
theory of physics that could be applied from within any frame of reference
whatsoever, whether inertial (non-accelerating) or non-inertial
(accelerating). This new more general
theory Einstein called General Relativity theory, since it is more general than
his Special Relativity theory and indeed more general than all other laws of
physics. Einstein knew that this new
General Relativity theory must be applied from within not only inertial
(non-accelerating) frames of reference but from within
non-inertial (accelerating) frames of reference as well. Fictitious forces or pseudoforces
appear to act upon all objects from within non-inertial (accelerating) frames
of reference. Einstein then realized that there is another force that acts upon all objects:
gravitation. Einstein began to imagine
that fictitious forces or pseudoforces must act like
gravitational forces, and therefore his General Relativity theory must
ultimately be a theory of gravity. To
illustrate how fictitious forces or pseudoforces act
like gravitational forces, consider a spaceship far from all stars and planets
or any other large gravitating objects.
The astronauts within this spaceship would feel weightless as long as
the spaceship were not accelerating.
However, suppose the spaceship had sufficient fuel to thrust the spaceship,
causing an acceleration. While the
spaceship accelerates, everyone and everything within the spaceship would feel
fictitious forces or pseudoforces, and hence these
fictitious forces or pseudoforces would feel like
gravitational forces, even though the spaceship is far from all stars and
planets or any other large gravitating objects.
In fact, if the spaceship had sufficient fuel to thrust the spaceship
with an acceleration of 9.8 meters per second per second, then the astronauts
would feel the same gravity within the spaceship that they would feel if they
were standing on the surface of the Earth.
As long as the spaceship continues to accelerate, everyone and
everything within the spaceship would feel gravity as if they were standing on
Earth instead of in a spaceship in outer space!
This example persuades us that we can turn gravity on within
non-inertial (accelerating) frames of reference. We can also turn gravity off within
non-inertial (accelerating) frames of reference. For example, suppose we are standing within
an elevator on planet Earth. Now suppose
the elevator cable breaks, causing the elevator to fall. We present two arguments to persuade us that
everyone and everything within this falling elevator would now feel
weightless. Firstly, everything falls
toward the Earth with the same acceleration ignoring non-gravitational forces
such as air resistance, as we discussed earlier in the course. Hence, everyone and everything within the
elevator accelerates downward together.
Consequently, if we were to take our keys out of our pocket for example
and let go, our keys would not appear to fall down but would instead appear to simply float in front of us, since we ourselves and our
keys and everything within the elevator are accelerating downward together with
the elevator with the same acceleration.
Secondly, since the elevator is accelerating downward, it is a
non-inertial frame of reference.
Therefore, everyone and everything within the elevator should feel a
fictitious force or a pseudoforce upward that would
exactly cancel the Earth’s downward gravity.
In conclusion, everyone and everything within the falling elevator feels
weightless. More generally, gravity is always turned off within all freely falling frames of
reference. Caution: just as physicists
use the word acceleration for any change in motion whatsoever, physicists use
the term freely falling for any frame of reference moving only under the
influence of gravity. Someone who is
falling downward is said to be freely falling, but
someone who is shot upward out of a cannon is also said to be freely falling
even while they are moving upward.
Someone who is shot out of a cannon at an angle
is also said to be freely falling even though they are moving along a
trajectory that at first takes them upward and then later takes them
downward. Moons orbiting planets are
freely falling even if the moon and the planet are not actually approaching
each other. Planets orbiting stars are
also freely falling even if the planet and the star are not actually
approaching each other. In all such
cases, gravity is turned off within freely falling
frames of reference. For example,
astronauts feel weightless while orbiting the Earth even though astronauts almost always orbit close enough to the Earth that its
gravity is almost as strong as the gravity on the surface of the Earth. As a counterintuitive example of this
principle, consider a spaceship falling toward a planet. Most students believe that the astronauts
within the spaceship would feel stronger and stronger gravity as their
spaceship approaches the planet, but this is false. In actuality, the astronauts feel weightless
during their entire journey falling toward the planet, since they are in a
freely falling frame of reference.
Assuming the planet has no atmosphere that would slow the spaceship down
or burn the spaceship up, the astronauts within the spaceship would feel
weightless during their entire journey, right up to the moment just before they
crash upon the planet. Other astronauts
right next to the crash site who are standing upon the planet feel the planet’s
gravity, but the astronauts within the spaceship feel weightless, even
immediately before crashing even though they are right next to the other
astronauts standing upon the planet who do feel the planet’s gravity! Einstein struggled for roughly ten years to
mathematically express all of these ideas, and in the year 1915
he finally formulated his General Relativity theory. This new General Relativity theory states
that we live in a four-dimensional spacetime with
three spatial dimensions and one temporal dimension that all mix into one
another, but this is precisely what Special Relativity theory already
asserts. However, this new theory in
addition states that gravity is the curvature of our four-dimensional spacetime. Special
Relativity theory describes four-dimensional spacetime
with a flat (uncurved) geometry because Special
Relativity does not include the effects of gravity, while General Relativity
describes four-dimensional spacetime with a curved
geometry, since this new theory states that gravity is the curvature of our
four-dimensional spacetime. To the present day, Einstein’s General
Relativity is the only theory in all of physics that places all frames of
reference, both inertial (non-accelerating) and non-inertial (accelerating), on
equal footing. Einstein’s General
Relativity theory may be applied from within any frame
of reference whatsoever, whether or not there appear to be fictitious forces or
pseudoforces from within the frame of reference.
All of the outrageous
conclusions that Einstein deduced from Special Relativity are still true in General Relativity, but these outrageous conclusions are
even more outrageous. For example, does
time dilation still occur? Does a moving clock
still run slow according to General Relativity theory? Yes, but this effect is now even worse. According to General Relativity theory, a
clock does not even need to be moving for it to run slow because gravity itself
slows down time! In particular, stronger
gravity will slow down time more, while weaker gravity will slow down time
less. Time dilation that is caused by motion is called kinematic time dilation, which
is predicted by both Special Relativity theory and General Relativity
theory. However, the slowing of time by
gravity is called gravitational time dilation, which
is predicted only by General Relativity theory.
This gravitational time dilation was considered
outrageous a century ago, but this effect has actually been observed in recent
decades. For example, suppose we place
one atomic clock on the ground floor of a building, and suppose we place
another atomic clock on the roof of that building. Even after synchronizing these two atomic
clocks, they do not remain synchronized!
The atomic clock on the ground floor is closer to the Earth and thus
feels stronger gravity than the atomic clock on the roof, which is further from
the Earth and thus feels weaker gravity.
Therefore, the atomic clock on the ground floor will run slower and will
lag further and further behind the atomic clock on the roof! Is Einstein actually claiming that whenever
we walk upstairs or downstairs our clocks are no longer synchronized with
everyone else’s clocks? Yes! But then why do we never
notice in our daily experience that all of our clocks read different
times? The Earth’s gravity causes this
gravitational time dilation to be so tiny that we do not notice it. Even the Sun’s gravity causes only tiny
amounts of this gravitational time dilation.
We would only notice these temporal changes if we were subject to
incredibly strong gravity, such as near a neutron star or a black hole. We will discuss black holes in detail
shortly. The implications of this
gravitational time dilation effect are staggering. For example, consider two identical twins who
have lived together on the second floor of a building their entire lives. Hence, they are the same age. However, if one of these twins walks
downstairs to the ground floor, that twin will age a tiny amount slower, since
that twin’s time is now running slower.
After walking back upstairs, that twin will now be a tiny amount younger
than the twin who remained on the second floor!
Our feet are younger than our head, since our feet are a little closer
to the Earth than our head, thus causing our feet to age more slowly! As we discussed, the satellites orbiting the
Earth all move at different speeds, resulting in kinematic time dilation. Moreover, all of the satellites orbiting the
Earth are at various distances from the Earth.
Hence, the satellites orbiting the Earth are subject to varying
gravitational field strengths from the Earth.
Our own mobile telephones are with us on the surface of the Earth and
therefore feel a stronger gravitational field strength than all satellites in
orbit. As a result, all satellites as
well as all of our mobile telephones suffer from gravitational time
dilation. In conclusion, the global
positioning system (GPS) would not function correctly without taking into
account both kinematic time dilation and gravitational time dilation.
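The sizes of these two effects can be estimated with a rough sketch. The code below uses the standard weak-gravity approximations: two clocks at different distances from the center of the Earth drift apart at a fractional rate of (GM/c²)(1/r_low − 1/r_high), and a clock moving at speed v runs slow at a fractional rate of v²/(2c²). The orbital radius of a GPS satellite (about 26,560 kilometers from the center of the Earth) and the ten-meter building are assumed round values, and the rotation of the Earth is ignored.

```python
C = 299_792_458.0        # vacuum speed of light, meters per second
GM_EARTH = 3.986004e14   # gravitational parameter of the Earth, m^3 / s^2
R_EARTH = 6.371e6        # mean radius of the Earth, meters
R_ORBIT = 2.656e7        # assumed GPS orbital radius, meters
SECONDS_PER_DAY = 86_400.0


def gravitational_gain_per_day(r_low, r_high):
    """Seconds per day by which the higher clock runs ahead of the lower clock."""
    return GM_EARTH / C**2 * (1.0 / r_low - 1.0 / r_high) * SECONDS_PER_DAY


def kinematic_loss_per_day(speed):
    """Seconds per day by which a clock moving at this speed falls behind."""
    return 0.5 * (speed / C) ** 2 * SECONDS_PER_DAY


# Roof clock versus ground-floor clock in an assumed ten-meter building.
building = gravitational_gain_per_day(R_EARTH, R_EARTH + 10.0)

# Satellite clock versus a clock on the surface of the Earth.
orbital_speed = (GM_EARTH / R_ORBIT) ** 0.5   # speed of a circular orbit
grav = gravitational_gain_per_day(R_EARTH, R_ORBIT)
kin = kinematic_loss_per_day(orbital_speed)
net = grav - kin

print("roof clock gains per day, in seconds:", building)
print("satellite gravitational gain per day, in microseconds:", grav * 1e6)
print("satellite kinematic loss per day, in microseconds:", kin * 1e6)
print("net satellite gain per day, in microseconds:", net * 1e6)
```

Under these assumptions the roof clock gains less than a ten-billionth of a second per day, which is why we never notice it. The satellite clock, by contrast, gains roughly 46 microseconds per day from weaker gravity and loses roughly 7 microseconds per day from its motion, and light travels more than ten kilometers in the net difference of roughly 38 microseconds.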
Just as the vacuum speed of
light c is an invariant according to
Special Relativity theory, the vacuum speed of light c is still an invariant according to General Relativity theory. Since we deduced kinematic time dilation from
the invariance of the vacuum speed of light c
in Special Relativity theory, we may deduce kinematic time dilation from the
invariance of the vacuum speed of light c
in General Relativity as well. Length contraction
caused by motion is called kinematic length
contraction, in analogy with kinematic time dilation. Since we deduced kinematic length contraction
from the invariance of the vacuum speed of light c in Special Relativity theory, we may deduce kinematic length
contraction from the invariance of the vacuum speed of light c in General Relativity as well. However, this effect is now even worse. According to General Relativity theory, an
object does not even need to be moving for it to contract because gravity itself
causes length contraction! In
particular, stronger gravity will contract objects more, while weaker gravity
will contract objects less. Just as the
slowing of time by gravity is called gravitational
time dilation, the contraction of space by gravity is called gravitational
length contraction, which is predicted only by General Relativity theory.
Consider light that is emitted from the roof of a building that propagates to
its ground floor. Because of
gravitational length contraction, the wavelength of the light must contract as
it approaches the ground floor, since the lower floors are closer to the Earth
where gravity is stronger. However, the
light must continue to propagate at the same speed, the vacuum speed of light c.
The speed of any wave with wavelength λ and frequency f is determined by the equation v = f
λ, where v is the speed (the
velocity) of propagation of the wave, as we discussed earlier in the
course. If the speed remains fixed and
if the wavelength is contracted, then the frequency must increase by an
appropriate amount to keep the product of the larger frequency f with the contracted wavelength λ
equal to a fixed speed v (more
specifically c). We may interpret this increased frequency as
a blueshift.
Conversely, consider light that is emitted from
the ground floor of a building that propagates to its roof. Because of gravitational length contraction,
the wavelength of the light must become less contracted (hence expanded) as it
approaches the roof, since the higher floors are further from the Earth where
gravity is weaker. However, the light
must continue to propagate at the same speed, the vacuum speed of light c.
Again, the speed of any wave with wavelength λ and frequency f is determined by the equation v = f
λ, where v is the speed (the
velocity) of propagation of the wave. If
the speed remains fixed and if the wavelength is less contracted (hence
expanded), then the frequency must decrease by an appropriate amount to keep
the product of the smaller frequency f
with the expanded wavelength λ equal to a fixed speed v (more specifically c). We may interpret this decreased frequency as
a redshift. As we discussed earlier in
the course, motion causes the Doppler-Fizeau shift
of any wave. We now rename
this Doppler-Fizeau shift the kinematic redshift (as
well as the kinematic blueshift). The kinematic redshift (and blueshift) is predicted by both
Special Relativity theory and General Relativity theory. However, we have just presented an argument
for the gravitational redshift (as
well as the gravitational blueshift), which is predicted only by General Relativity
theory. More precisely, light that
propagates from stronger gravitational fields toward weaker gravitational
fields suffers from a gravitational redshift, while light that propagates from
weaker gravitational fields toward stronger gravitational fields suffers from a
gravitational blueshift. This gravitational redshift (and
gravitational blueshift) has
actually been observed. When an
electron in an atom undergoes a transition from a higher-energy quantum state
to a lower-energy quantum state, it must emit a photon with a specific
frequency and a specific wavelength in accordance with the spectrum of the
atom, as we discussed earlier in the course.
If an atom on the ground floor of a building emits a
photon that propagates toward the roof of the building, anyone on the roof will
measure the frequency of that photon to be lower (or its wavelength to be
longer) than the photon emitted from the same transition in an identical atom
that happens to be located at the roof instead! Conversely, if an atom on
the roof of a building emits a photon that propagates toward the ground floor
of the building, anyone on the ground floor will measure the frequency of that
photon to be higher (or its wavelength to be shorter) than the photon emitted
from the same transition in an identical atom that happens to be located at the
ground floor instead! Of course,
the Earth’s gravity causes only tiny amounts of this gravitational redshift
(and gravitational blueshift). Even the Sun’s gravity causes only tiny
amounts of this gravitational redshift (and gravitational blueshift). This gravitational redshift (and
gravitational blueshift) only becomes severe with
incredibly steep changes in gravity, such as near a neutron star or a black
hole. Again, we will discuss black holes
in detail shortly. Einstein’s General
Relativity theory actually predicts a third type of redshift caused by the
expansion of the universe called cosmological redshift, as we will discuss
toward the end of the course. In
summary, Einstein’s General Relativity theory predicts three different types of
redshift: kinematic redshift caused by motion, gravitational redshift caused by
the curvature of spacetime, and cosmological redshift
caused by the expansion of the universe.
All three of these redshifts have been observed
for several decades, providing further evidence of the validity of Einstein’s
General Theory of Relativity.
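As a rough sketch of how tiny the gravitational redshift and blueshift are within a building, the code below uses the weak-gravity approximation that the frequency changes by the fraction gh/c² when light climbs or descends a height h, together with the equation v = f λ with v equal to c. The frequency of the light, the twenty-meter height, and the function names are illustrative assumptions of our own.

```python
C = 299_792_458.0   # vacuum speed of light, meters per second
G_SURFACE = 9.8     # gravitational acceleration at the Earth's surface, m/s^2


def fractional_shift(height):
    """Approximate fractional change in frequency over a height in meters."""
    return G_SURFACE * height / C**2


def frequency_after_climb(frequency, height):
    """Frequency measured after the light climbs the given height.

    A positive height means the light moves upward (redshift), and a
    negative height means the light moves downward (blueshift).
    """
    return frequency * (1.0 - fractional_shift(height))


def wavelength(frequency):
    """Wavelength from v = f * lambda with v fixed at c."""
    return C / frequency


f_emitted = 5.0e14        # assumed frequency of visible light, in hertz
building_height = 20.0    # assumed height of the building, in meters

f_roof = frequency_after_climb(f_emitted, building_height)      # ground to roof
f_ground = frequency_after_climb(f_emitted, -building_height)   # roof to ground

print("fractional shift:", fractional_shift(building_height))
print("hertz lost while climbing:", f_emitted - f_roof)
print("hertz gained while descending:", f_ground - f_emitted)
print("wavelength grows while climbing:", wavelength(f_roof) > wavelength(f_emitted))
```

The fractional shift is only about two parts in one thousand trillion, so light with a frequency of five hundred trillion hertz changes by roughly one hertz. Because the speed stays fixed at c, the lower frequency at the roof corresponds to a longer wavelength, and the higher frequency at the ground floor corresponds to a shorter wavelength.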
Just as the vacuum speed of
light c is the speed limit of the
universe according to Special Relativity theory, the vacuum speed of light c is still the speed limit of the
universe according to General Relativity theory. If c
is still the speed limit of the universe, then nothing can move faster than
that speed. We now realize that not even
gravity can move faster than c! In fact, Einstein’s General Relativity theory
states that gravity itself moves at the speed c, just as light moves at the speed c. The implications of this
are outrageous. For example, suppose
that the Sun were removed from the Solar System right
now at this very moment. Since it takes
light roughly eight minutes to propagate from the Sun to the Earth, we would
continue to see the Sun shining in the sky for roughly eight minutes after its
removal. Then, we would see it
removed. However, since gravity also
propagates at the same vacuum speed of light c, the Earth would continue moving along its elliptical orbit as if
the Sun still attracted it for roughly eight minutes after the Sun’s
removal! Then, the Earth would
gravitationally feel that the Sun has been removed and would finally move off of its elliptical orbit!
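The eight-minute delay follows from dividing the distance between the Sun and the Earth by the speed c. The following Python sketch performs this arithmetic using the rough distance of one hundred and fifty million kilometers quoted earlier; the value of c in kilometers per second is the standard one, and the variable names are illustrative.

```python
# Travel time of light (and of gravity) from the Sun to the Earth.
SUN_EARTH_DISTANCE_KM = 1.5e8      # roughly 150 million kilometers
SPEED_OF_LIGHT_KM_S = 299_792.458  # vacuum speed of light in km/s

travel_time_seconds = SUN_EARTH_DISTANCE_KM / SPEED_OF_LIGHT_KM_S
travel_time_minutes = travel_time_seconds / 60.0

print(f"{travel_time_seconds:.0f} seconds, or about {travel_time_minutes:.1f} minutes")
```

The result is roughly five hundred seconds, which is a little more than eight minutes.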
As we discussed earlier in
the course, light is electromagnetic radiation or electromagnetic waves. More precisely, light is a propagating
disturbance through an electromagnetic field.
If gravity also moves at the same speed c, then there must be gravitational waves that are propagating
disturbances through a gravitational field.
According to Einstein’s General Relativity theory, gravity is actually
the curvature of our four-dimensional spacetime. Therefore, gravitational waves are
propagating disturbances through the curvature of our four-dimensional spacetime. In the
year 1974, the American astrophysicists Russell Alan Hulse
and Joseph Hooton Taylor discovered a binary neutron star system. These two neutron stars are orbiting
sufficiently close to each other and orbiting sufficiently fast that they
should be radiating significant amounts of gravitational waves. As these two neutron stars radiate
gravitational waves, they must lose orbital energy, since energy must be conserved.
Therefore, these two neutron stars must approach each other. Indeed, Hulse and
Taylor measured that these two neutron stars are approaching each other by the
precise amount that Einstein’s General Relativity theory predicts. Hulse and Taylor
received the Nobel Prize for their achievement, and this binary neutron star
system was named the Hulse-Taylor
system in their honor. Nevertheless,
this is not a direct detection of the gravitational wave itself. A direct detection of gravitational waves
would require extraordinarily sensitive measurements of varying time dilation
and varying length contraction as the crests and the troughs of the
gravitational waves pass through the detector.
The technology to make such measurements was not
achieved until the year 2015, the one hundredth anniversary of
Einstein’s General Relativity theory!
Ever since that historic year, astronomers have directly detected
several gravitational waves passing through the Earth. Most of these gravitational waves that
astrophysicists have directly detected since the year 2015 were
radiated from the collision and merger of binary black holes into single
black holes in distant galaxies. This is
a splendid manifestation of Einstein’s genius.
His theory predicted that gravitational waves exist, but it took one
century for technologies to be developed that could directly detect their
existence! Just as there is an entire
Electromagnetic Spectrum of different wavelengths or frequencies of
electromagnetic waves, there is an entire Gravitational Spectrum of different
wavelengths or frequencies of gravitational waves. Although astrophysicists have spent decades
observing the universe using electromagnetic waves from different bands of the
Electromagnetic Spectrum to form a more complete picture of the universe,
astrophysicists have just barely begun to observe the universe using
gravitational waves from different bands of the Gravitational Spectrum. A completely new window has
now been opened for astrophysicists to explore, allowing them to form an even more
complete picture of the universe.
If a certain amount of mass
were concentrated into a single mathematical point of zero volume, then this
point-mass would have infinite density.
According to General Relativity theory, the gravity near this point-mass
would be incredibly strong, since the curvature of the four-dimensional spacetime near this point-mass would be incredibly
severe. The gravity would be so strong
because of this severe spacetime
curvature that an object too close to this point-mass would need to move faster
than c to escape its gravity, but
moving faster than c is
forbidden. Mathematically, there is a
sphere surrounding this point-mass that marks the boundary of no return. An object outside of this mathematical sphere
may have hope of escaping the gravity of the point-mass, but an object that
crosses inside of this mathematical sphere would have no hope of escaping the
incredibly strong gravity of the point-mass.
The object’s light would not even be able to escape from within the
mathematical sphere. Thus, it would
appear as if the object disappeared from our universe, as if it fell into a
hole. This hole would appear black,
since light cannot escape from within the mathematical sphere. So, objects falling
toward the infinite-density point-mass would appear as if they are falling into
a hole that is black. For several
decades, these fantastic objects have been called
black holes. The center of the black
hole is its singularity, the point-mass of infinite density. The mathematical sphere surrounding the
singularity is the event horizon, the boundary of no return. The radius of the event horizon is sometimes called the black hole radius but is more often
called the Schwarzschild radius, named for the German physicist Karl
Schwarzschild who mathematically derived the simplest black-hole solution to
Einstein’s General Relativity theory. Karl Schwarzschild derived the following equation for the
Schwarzschild radius (black hole radius) of the event horizon of a black hole: $r_s = 2GM/c^2$,
where $r_s$
is the Schwarzschild radius (black hole radius) of the event horizon, G is Newton’s gravitational constant of
the universe, M is the mass of the
black hole, and c is as usual the
vacuum speed of light. Using this
equation, we can easily calculate that the Schwarzschild radius (black hole
radius) of a typical stellar black hole born from the Type II supernova of a
very high mass star is very roughly eight kilometers. So, any object further than very roughly
eight kilometers from the singularity of such a black hole may have hope of
escaping its gravity, but any object closer than this distance from the
singularity of such a black hole has crossed the event horizon and has no hope
of escaping its gravity.
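As a heuristic check on the Schwarzschild radius equation, we can ask at what radius the Newtonian escape speed from a mass M would equal c. This is not how Karl Schwarzschild derived the result, which requires General Relativity theory, but the Newtonian argument happens to give the same expression. The numerical example assumes a black hole of three solar masses, a value chosen here only for illustration.

```latex
\begin{align}
\tfrac{1}{2} m v_{\mathrm{esc}}^{2} = \frac{G M m}{r}
  \quad &\Longrightarrow \quad v_{\mathrm{esc}} = \sqrt{\frac{2 G M}{r}}, \\
v_{\mathrm{esc}} = c
  \quad &\Longrightarrow \quad r_s = \frac{2 G M}{c^{2}}, \\
M = 3 M_{\odot}
  \quad &\Longrightarrow \quad
  r_s \approx \frac{2\,(6.67\times10^{-11})\,(3 \times 1.99\times10^{30})}{(3.00\times10^{8})^{2}}\ \mathrm{m}
  \approx 8.8\times10^{3}\ \mathrm{m} \approx 9\ \mathrm{km}.
\end{align}
```

This is consistent with the very rough figure of eight kilometers for a typical stellar black hole.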
Nothing can escape from
within the event horizon of a black hole, not even light. Therefore, the event horizon of a black hole
appears black, hence the name black hole.
Many students believe that outer space is also black, thus preventing us
from ever imaging the event horizon of a black hole against the surrounding
space. Although this would indeed be the
case for a completely isolated black hole, in actuality diffuse gas fills the
entire universe. Hence, it should be
possible to see the black event horizon of a black hole against the gases of
the surrounding space. Some black holes
have accretion disks around them, as in X-ray binaries, and it should be
possible to see the black event horizon of such a black hole against the
surrounding accretion disk.
Unfortunately, the Schwarzschild radius of a typical stellar black hole
is very roughly eight kilometers, and no telescope is large enough to provide
sufficient angular resolution to image such a small size, even if the
black hole resided as near as within the solar neighborhood. However, there are supermassive black holes
in our universe, as we will discuss later in the course. The Schwarzschild radius of a typical
supermassive black hole is at least one million kilometers! As we discussed earlier in the course, two
telescopes on opposite sides of planet Earth used together as a single
interferometer would in principle have the same
resolving power as a single telescope the size of planet Earth. Using many radio telescopes working together
as a single interferometer, astronomers succeeded in the year 2019 in imaging
the event horizon of a supermassive black hole at the center of a distant
galaxy. Although this galaxy is roughly
sixteen megaparsecs (roughly fifty million
light-years) distant, the supermassive black hole at its center has a
Schwarzschild radius of roughly fifteen billion kilometers. This is roughly the size of our Solar System,
from the Sun all the way out to the Kuiper belt just beyond the orbit of
Neptune! In the radio images of this
supermassive black hole, we actually see its black event horizon against the
gases of the accretion disk that surrounds the supermassive black hole. In the year 2022, astronomers published a
radio image of the event horizon of the supermassive black hole at the center
of our own Milky Way Galaxy. Again, this image was produced by many radio telescopes working together
as a single interferometer.
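The following Python sketch compares the angular sizes involved. It uses the figures quoted above (a Schwarzschild radius of roughly fifteen billion kilometers at roughly sixteen megaparsecs, and roughly eight kilometers for a stellar black hole) together with several assumptions that do not appear above: a distance of ten parsecs for a hypothetical nearby stellar black hole, an observing wavelength of 1.3 millimeters, an Earth diameter of 12,742 kilometers, and the simple diffraction estimate of wavelength divided by aperture. It is an order-of-magnitude illustration only.

```python
import math

KM_PER_PARSEC = 3.0857e13                                   # kilometers in one parsec
RAD_TO_MICROARCSEC = math.degrees(1.0) * 3600.0 * 1.0e6     # radians to microarcseconds


def angular_diameter_uas(radius_km, distance_pc):
    """Small-angle angular diameter, in microarcseconds, of a sphere of given radius."""
    distance_km = distance_pc * KM_PER_PARSEC
    return 2.0 * radius_km / distance_km * RAD_TO_MICROARCSEC


def diffraction_limit_uas(wavelength_m, aperture_m):
    """Rough resolving power (wavelength / aperture), in microarcseconds."""
    return wavelength_m / aperture_m * RAD_TO_MICROARCSEC


# Supermassive black hole: r_s ~ 15 billion km at ~16 megaparsecs (figures from the text).
supermassive_uas = angular_diameter_uas(15.0e9, 16.0e6)

# Stellar black hole: r_s ~ 8 km at an assumed distance of 10 parsecs.
stellar_uas = angular_diameter_uas(8.0, 10.0)

# Earth-sized interferometer at an assumed wavelength of 1.3 mm.
earth_sized_uas = diffraction_limit_uas(1.3e-3, 1.2742e7)

print(f"supermassive black hole: {supermassive_uas:.1f} microarcseconds")
print(f"stellar black hole:      {stellar_uas:.4f} microarcseconds")
print(f"Earth-sized aperture:    {earth_sized_uas:.1f} microarcseconds")
```

Under these assumptions, the supermassive black hole and the Earth-sized interferometer both come out at tens of microarcseconds, whereas the nearby stellar black hole is roughly one thousand times smaller. The sketch ignores the bending of light near the black hole, which makes the dark region in the actual image somewhat larger than the event horizon itself.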
Although astronomers have
been certain for decades that black holes actually exist, the first black hole
ever discovered, the compact object in the X-ray binary Cygnus X-1, was not discovered until after Einstein died. In fact, Einstein himself did not believe that
these strange objects actually existed in our universe. Nevertheless, even while Einstein was alive,
physicists recognized an important application of Karl Schwarzschild’s
mathematical discovery of this point-mass solution (black hole solution) to
Einstein’s General Relativity theory.
The spacetime curvature (the gravity) outside of an isotropic (spherically
symmetric) distribution of mass is exactly the same as
if all of its mass were concentrated at its center. More plainly, the spacetime
curvature (the gravity) outside of a
spherical distribution of mass (beginning at its surface and extending outward)
would be the same spacetime curvature (the same
gravity) as a black hole of the same total mass placed at the center of the
spherical distribution. This statement is called the Birkhoff theorem,
named for the American mathematician George David Birkhoff
who first mathematically proved this important result. As an application of the Birkhoff
theorem, the Earth is a spherical distribution of mass to an excellent
approximation. Therefore, the gravity outside of the Earth (beginning at its
surface and extending outward) is nearly exactly equal to the gravity of a
black hole with the same mass as the Earth placed at the center of the
Earth. Students often ask for a
description of how the gravity of a black hole would feel if we were near the
black hole but still outside its event horizon.
According to the Birkhoff theorem, every
moment of our lives we feel nearly precisely the same gravity from the Earth as
we would feel from a black hole equal in mass to the Earth and placed at the
center of the Earth, roughly 6400 kilometers beneath our feet! As another application of the Birkhoff theorem, the Sun is also a spherical distribution
of mass to an excellent approximation.
Therefore, the gravity outside
of the Sun (beginning at its surface and extending outward) is nearly exactly
equal to the gravity of a black hole with the same mass as the Sun placed at
the center of the Sun. More plainly, the
gravity with which the Sun attracts the planets and everything else in the
Solar System is nearly exactly the same gravity as a black hole with the same
mass as the Sun placed at the center of the Sun. Although the Sun is a low mass star and will
never suffer from a supernova, imagine that the entire Sun were to suddenly
collapse into a black hole with no change in mass. Most students believe that its gravity would
now become so strong that it would begin to suck in the planets one by one,
beginning with Mercury then Venus then Earth and so on and so forth, but this
is false. According to the Birkhoff theorem, the Sun already creates nearly the same
gravity as a black hole of the same mass placed at its center. Therefore, virtually nothing would happen to
the orbits of the planets if the Sun were to suddenly
collapse into a black hole. The planets
would continue to orbit that black hole with almost precisely the same orbits
that they have always enjoyed! Of
course, all of the planets would also begin to cool, since they would no longer
receive any light from this hypothetically collapsed Sun. All of us on Earth would eventually freeze to
death, although we would die long before then, since nearly all life on Earth
depends entirely upon sunlight, as we discussed earlier in the course. Nevertheless, the orbits of the planets and
everything else in the Solar System would remain almost
exactly the same. Note that an analogue of the Birkhoff theorem also holds in Newton’s theory of
gravitation, where it is called the shell theorem: the gravity outside of a
spherical distribution of mass (beginning at its surface and extending outward)
would be the same gravity as a point-mass of the same total mass placed at the
center of the spherical distribution.
For example, if we were to calculate the gravitational force between the
Earth and the Moon according to Newton’s theory of gravitation, we would use
Newton’s law of universal gravitation: $F = G m_1 m_2 / r^2$,
where $m_1$ and $m_2$ are the masses of the Earth and the Moon and r is the distance between the
Earth and the Moon, as we discussed earlier in the course. However, what value do we use for this
distance? After all, different parts of
the Earth are different distances from different parts of the Moon. However, both the Earth and the Moon are
spherical distributions of mass to an excellent approximation. Therefore, the gravity outside of each of them (beginning at their surfaces and extending
outward) is nearly exactly equal to the gravity of point-masses placed at their
centers, with the same masses as the Earth and the Moon of course. Hence, the Birkhoff
theorem reveals that we must always use the center-to-center distance whenever
calculating the gravitational force between the Earth and the Moon or between
almost any pair of objects in the entire universe.
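As a minimal sketch of this calculation, the following Python code evaluates Newton’s law of universal gravitation for the Earth and the Moon using the center-to-center distance. For comparison, it also evaluates the same law using the surface-to-surface distance, which is the wrong choice. All of the numerical values (masses, radii, and the average distance) are standard approximate figures assumed here for illustration.

```python
G = 6.674e-11                   # Newton's gravitational constant in m^3 kg^-1 s^-2
EARTH_MASS = 5.972e24           # kg (assumed value)
MOON_MASS = 7.348e22            # kg (assumed value)
EARTH_RADIUS = 6.371e6          # m (assumed value)
MOON_RADIUS = 1.737e6           # m (assumed value)
CENTER_TO_CENTER = 3.844e8      # m, average Earth-Moon distance (assumed value)


def gravitational_force(mass_1, mass_2, distance):
    """Newton's law of universal gravitation, F = G m1 m2 / r^2, in newtons."""
    return G * mass_1 * mass_2 / distance**2


# Correct: treat each sphere as a point-mass at its center.
force_correct = gravitational_force(EARTH_MASS, MOON_MASS, CENTER_TO_CENTER)

# Incorrect: using the gap between the two surfaces overestimates the force.
surface_gap = CENTER_TO_CENTER - EARTH_RADIUS - MOON_RADIUS
force_wrong = gravitational_force(EARTH_MASS, MOON_MASS, surface_gap)

print(f"center-to-center:   {force_correct:.3e} N")
print(f"surface-to-surface: {force_wrong:.3e} N (overestimate)")
```

With these assumed values, the center-to-center force is roughly two hundred billion billion newtons, and the surface-to-surface distance would overestimate it by a few percent.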
Today, we know that black
holes do actually exist in our universe, and we also
know that black holes form when the core of a very high mass star is able to
overcome neutron degeneracy pressure, since its mass is greater than the Tolman-Oppenheimer-Volkoff
limit. If neutron degeneracy pressure
cannot halt the collapse of the core, then nothing can halt the collapse of the
core. The core continues collapsing all
the way down to a mathematical point, a black hole. This is the ultimate triumph of gravity. This reveals another interpretation of the
black hole radius (the Schwarzschild radius).
As we discussed, the equation for the black hole radius (the
Schwarzschild radius) is $r_s = 2GM/c^2$. Notice that the only variable that determines
the black hole radius (the Schwarzschild radius) according to this equation is
the mass M, since G (Newton’s gravitational constant of
the universe) and c (the vacuum speed
of light) are both fixed numbers.
Therefore, there is nothing stopping us from calculating the
Schwarzschild radius of any object in the universe, not just black holes. For example, we can easily calculate that the
Schwarzschild radius of our Sun is roughly three kilometers. Many students protest this calculation, since
the Sun is not a black hole and will in fact never become a black hole, since
the Sun is a low mass star.
Nevertheless, there is nothing stopping us from using the mass of the
Sun in this Schwarzschild radius equation.
If the Sun is not a black hole and will in fact never become a black
hole, then how do we interpret this three-kilometer Schwarzschild radius for
the Sun? Firstly, we note that the
actual radius of the Sun (roughly seven hundred thousand kilometers) is much
much larger than its Schwarzschild radius (roughly three kilometers). As a result, the Sun’s gravity is weak as compared to what the Sun’s
gravity would be if we could collapse it into a black
hole. Furthermore, this three-kilometer
Schwarzschild radius for the Sun would be the size we would need to crush the
Sun down into before its own self-gravity became strong enough to crush itself
all the way down into a black hole. In
other words, if we were to crush the entire mass of the Sun down to a radius of
less than three kilometers, the Sun would not be able to escape from its own
self-gravity, and the Sun would crush itself all the way down to a black
hole. The Earth’s Schwarzschild radius
is roughly nine millimeters. The actual
radius of the Earth is nearly 6400 kilometers, which is much much larger than
nine millimeters. Hence, the Earth’s
gravity is weak as compared to what
the Earth’s gravity would be if we could collapse it into a black
hole. Furthermore, this nine-millimeter
Schwarzschild radius for the Earth would be the size we would need to crush the
Earth down into before its own self-gravity became strong enough to crush
itself all the way down into a black hole.
In other words, if we were to crush the entire mass of the Earth down to
a radius of less than nine millimeters, the Earth would not be able to escape from
its own self-gravity, and the Earth would crush itself all the way down to a
black hole. Note that nine millimeters
is almost ten millimeters, which is equal to one centimeter. This would be a Schwarzschild diameter of roughly two centimeters,
which is roughly one inch since one inch is exactly equal to 2.54
centimeters. Therefore, we would have to
crush the entire mass of the Earth down to a size of roughly one inch to turn
it into a black hole! The Schwarzschild
radius of a typical human is ten billion times smaller than the nucleus of an
atom! In other words, we would have to
crush our bodies down to this size before our own self-gravity becomes strong
enough to crush our bodies all the way down into a black hole!
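The Schwarzschild radii quoted above can be checked with a short Python sketch of the Schwarzschild radius equation. The masses of the Sun, the Earth, and a seventy-kilogram human, as well as the rough nuclear radius of one femtometer, are assumed values chosen for illustration.

```python
G = 6.674e-11   # Newton's gravitational constant in m^3 kg^-1 s^-2
C = 2.998e8     # vacuum speed of light in m/s


def schwarzschild_radius(mass_kg):
    """Schwarzschild radius r_s = 2 G M / c^2, in meters."""
    return 2.0 * G * mass_kg / C**2


rs_sun = schwarzschild_radius(1.989e30)    # assumed solar mass in kg
rs_earth = schwarzschild_radius(5.972e24)  # assumed Earth mass in kg
rs_human = schwarzschild_radius(70.0)      # assumed mass of a typical human in kg

NUCLEUS_RADIUS = 1.0e-15                   # m, rough size of an atomic nucleus (assumed)

print(f"Sun:   {rs_sun / 1.0e3:.2f} km")
print(f"Earth: {rs_earth * 1.0e3:.2f} mm")
print(f"Human: {rs_human:.2e} m, about {NUCLEUS_RADIUS / rs_human:.1e} times smaller than a nucleus")
```

Under these assumptions, the results are roughly three kilometers for the Sun, roughly nine millimeters for the Earth, and a radius roughly ten billion times smaller than an atomic nucleus for a human.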
Einstein was
ridiculed for Special Relativity theory, and he was even more harshly
ridiculed for General Relativity theory, but it would only be a few years
after he proposed General Relativity theory that his theories would be
successfully tested. For example,
although the orbits of the planets around the Sun are ellipses, those
elliptical orbits do not remain fixed. A
planet’s elliptical orbit around the Sun actually suffers a very slow orbital
precession. This orbital precession is caused mostly by the gravitational tugs of the other
planets, primarily Jupiter as we discussed earlier in the course. However, Mercury’s orbit has a very tiny
anomalous orbital precession that could not be explained
using Newton’s theory of gravity. The
amount of Mercury’s anomalous orbital precession is roughly forty-three arcseconds per century.
This is a fantastically tiny orbital shift. According to General Relativity theory, the
curvature of spacetime caused by the Sun’s gravity
causes a planet’s orbit to suffer a tiny orbital precession in addition to the
orbital precession caused by the gravitational tugs from the other planets,
primarily Jupiter. This tiny extra
precession is called general-relativistic orbital
precession. When Einstein calculated the
amount of this general-relativistic orbital precession for Mercury’s orbit caused
by the spacetime curvature of the Sun’s gravity, he
obtained exactly forty-three arcseconds per
century! Although this outstanding
achievement convinced Einstein that his General Relativity theory was superior
to Newton’s theory of gravity, most physicists were still not convinced. So, Einstein
proposed the following experiment.
According to his General Relativity theory, everything in the universe
feels gravity, including light itself.
More correctly, the curvature (the gravity) of our four-dimensional spacetime deflects the trajectory of anything, including
light beams. More plainly, light falls
in gravity just as everything else falls in gravity. In recent decades, astronomers have observed
that the gravity of galactic clusters bends the light from distant galaxies,
thus distorting the image of the distant galaxies. Since this is rather like the glass of a lens
bending light, the gravity of a galactic cluster is called
a gravitational lens. When the distant
galaxy, the gravitational lens, and our own Milky Way Galaxy happen to form a
nearly straight line, the light of the distant galaxy bends into the shape of a
ring around the galactic cluster. This is called an Einstein ring.
When the gravitational lens happens to be slightly
displaced from the line connecting the distant galaxy and our own Milky
Way Galaxy, the light of the distant galaxy bends into two duplicate images in
the shape of arcs around the galactic cluster.
These are called Einstein arcs. When the gravitational lens happens to be even more displaced from the line connecting the distant
galaxy and our own Milky Way Galaxy, the light of the distant galaxy bends into
four duplicate images around the galactic cluster. This is called an
Einstein cross. The amount by which
light beams are deflected by the Earth’s weak gravity
is fantastically tiny. This is why we do
not notice light falling downward in our daily experience. The Sun’s gravity is stronger than the
Earth’s gravity, but even the Sun’s gravity is so weak that no one ever noticed
the deflection of light around the Sun before Einstein. Using his General Relativity theory, Einstein
calculated that light should be deflected by roughly
1.75″ (1.75 arcseconds or 1.75 seconds of arc)
around the surface of the Sun. Although
this is an incredibly small angle, it was measurable a century ago. However, we cannot see any stars in the
daytime, besides the Sun of course! So, measuring the deflection of starlight around the surface
of the Sun seemed hopeless. However,
during the totality of a total solar eclipse, the sky becomes sufficiently dark
that stars become visible, as we discussed earlier in the course. A total solar eclipse was scheduled to occur
in the year 1919, and many physicists gathered for this eclipse for the purpose of proving Einstein wrong. When totality occurred, starlight was indeed
deflected around the surface of the Sun, and astronomers measured the
deflection to be 1.75″ (1.75 arcseconds or 1.75
seconds of arc), in precise agreement with Einstein’s General Relativity
theory! Practically overnight, Einstein
went from being ridiculed to being considered one of
the most brilliant men, if not the most brilliant man, who ever lived. A mediocre physicist who struggled with
mathematics had discovered correct theories of our universe using only the
power of his own genius.
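Both of these classical tests can be reproduced numerically. The following Python sketch uses the standard general-relativistic results for the orbital precession per orbit, 6πGM / (c² a (1 − e²)), and for the deflection of light grazing the surface of a sphere, 4GM / (c² R). These two formulas are not stated above, and the orbital elements of Mercury and the solar mass and radius are assumed approximate values, so this is only an illustrative check.

```python
import math

G = 6.674e-11          # Newton's gravitational constant in m^3 kg^-1 s^-2
C = 2.998e8            # vacuum speed of light in m/s
SUN_MASS = 1.989e30    # kg (assumed value)
SUN_RADIUS = 6.96e8    # m (assumed value)
RAD_TO_ARCSEC = math.degrees(1.0) * 3600.0


def precession_per_orbit(semi_major_axis, eccentricity):
    """General-relativistic orbital precession per orbit, in radians."""
    return 6.0 * math.pi * G * SUN_MASS / (
        C**2 * semi_major_axis * (1.0 - eccentricity**2)
    )


def light_deflection(impact_radius):
    """Deflection angle, in radians, of light passing at `impact_radius` from the Sun's center."""
    return 4.0 * G * SUN_MASS / (C**2 * impact_radius)


# Mercury: assumed semi-major axis 5.791e10 m, eccentricity 0.2056, period 87.969 days.
orbits_per_century = 36525.0 / 87.969
mercury_arcsec_per_century = (
    precession_per_orbit(5.791e10, 0.2056) * orbits_per_century * RAD_TO_ARCSEC
)

# Starlight grazing the surface of the Sun.
deflection_arcsec = light_deflection(SUN_RADIUS) * RAD_TO_ARCSEC

print(f"Mercury precession: {mercury_arcsec_per_century:.1f} arcseconds per century")
print(f"Light deflection:   {deflection_arcsec:.2f} arcseconds")
```

With these assumed values, the sketch gives roughly forty-three arcseconds per century for Mercury and roughly 1.75 arcseconds for starlight grazing the Sun.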
Einstein once claimed that
when he studied physics he wanted “to know how God thinks.” In other words, Einstein believed that God
not only created the universe, but God also authored the mathematical equations
that describe the universe, the laws of physics. Moreover, Einstein believed that God authored
a single ultimate mathematical equation that completely describes the universe. If the universe is
described by a single ultimate mathematical equation, Einstein believed
that that equation should be deducible from pure logic, from pure
mathematics. Einstein once claimed that
when he studied physics he wanted “to know whether or not God had any choice
in how He created the universe.” In
other words, Einstein believed that if the ultimate mathematical equation that
describes the universe is deducible from pure logic, from pure mathematics,
then the laws of the universe could not possibly be different from what they
actually are. Einstein also believed
that this ultimate equation must be mathematically beautiful, just as he
believed his own General Relativity theory was mathematically beautiful. In fact, Einstein claimed that if the
deflection of starlight around the Sun had not been 1.75″
as his theory predicted, then he would have “pitied the Lord because it would
have proven that He did not create the universe correctly.” Although this quotation seems to imply that
Einstein was so arrogant that he believed himself to be more intelligent than
God, this quotation actually reveals that Einstein was
humbled by the mathematical beauty of the universe that God created. This is summarized by another quotation by
Einstein, “Subtle is the Lord, but malicious He is not.” In other words, the laws of physics that
describe the universe may not be obvious and thus may require a genius to
discover them, but God is not evil and hence God would not create a universe
that was so complicated that humans would not be able to discover the
mathematical equations that describe it.
On the other hand, Einstein also once said, “The most incomprehensible
thing about the universe is that it is comprehensible.” In other words, why did God decide to create
the universe governed by beautiful mathematical equations? We could ask this question another way
around. What is it about
the human mind that it is able to not only study and to understand the universe
but beyond this to actually discover the mathematical equations that describe
the universe? Einstein spent the last
few decades of his life while living in New Jersey trying to discover the
ultimate theory of the universe that he believed God authored when He created
the universe. Today, we would call such
a theory a Super Unification Theory or a Theory of Everything, as we will
discuss toward the end of the course.
Einstein did not succeed in his quest, but other physicists in the
decades after Einstein have succeeded in bringing us much closer to this
ultimate theory than Einstein could have ever dreamt, as we will also discuss
toward the end of the course.
Nevertheless, many physicists agree that no other person single-handedly
advanced our understanding of the universe more than Albert Einstein.
This webpage was most recently modified on Monday, the tenth day of April, anno Domini MMXXIII, at 04:15 ante meridiem EDT.