1. Eight Trends in East Bay Demography
2. A Synopsis of Published Analyses
3. Analysis Using the datascience Language
4. Raw Data
5. Sources
Below, eight of the most important demographic changes in the East Bay population since 1900 are noted. Accompanying these trends and graphs is code using UC Berkeley's datascience module for Python, which shows the methods used to do these types of analyses and to produce the graphics shown. See the section entitled Analysis Using the datascience Language for more information on how to run the code yourself. The trends use data from the US Census Bureau for Alameda city, Albany city, Berkeley city, El Cerrito city, Emeryville city, Oakland city, and Richmond city.
During WWII, the population of the East Bay skyrocketed due to the influx of jobs manufacturing goods to aid the fight on the Pacific Front (more on that later). Once the war ended, the population of the East Bay declined until the 1990s, when it began to rise again. However, as shown in the graphic to the right, that growth was tapering off at around 730,000, a slowdown that was most pronounced after the 2008 financial crisis. Based on the 2016 US Census Bureau estimates, though, the population of the East Bay has begun to grow more rapidly, perhaps a sign of the effect of the burgeoning tech bubble from Silicon Valley.
Note that this analysis, unlike many of the others on this page, includes the 2016 population estimates from the US Census Bureau; this is in order to provide a clearer trend, since the actual census data ends in 2010. In order to do said analysis, you can use this code:
pop_change = census.where('CITY', 'AGGREGATE').where('CATEGORY', 'TOTAL POPULATION')
pop_change = pop_change.relabeled('DATA POP.', 'East Bay Shoreline Population')
pop_change.plot(0, 4)
This is fairly simple code. The first line selects the aggregate data (data for the whole East Bay shoreline) and then selects the total population data within that set. The second line simply relabels the DATA POP. column to East Bay Shoreline Population so that the y-axis of the graph is better labeled. The third line is the command to plot the data, with column 0 (YEAR) on the x-axis and column 4 (the newly relabeled East Bay Shoreline Population) on the y-axis. Note that column indices begin at 0; this means that the second column of a table has index 1.
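To make the filter-and-relabel steps concrete without needing datahub, here is a minimal plain-Python sketch of what the two where calls and the relabel do. The rows and population figures are invented placeholders, not census values; select_rows and relabel are hypothetical helpers, not part of the datascience module; and the sketch assumes the category labels live in a column named DATA, as in the later examples on this page.

```python
# Placeholder rows standing in for census.csv; the numbers are invented.
rows = [
    {'YEAR': 2000, 'CITY': 'AGGREGATE', 'DATA': 'TOTAL POPULATION', 'DATA POP.': 100},
    {'YEAR': 2000, 'CITY': 'Oakland city', 'DATA': 'TOTAL POPULATION', 'DATA POP.': 60},
    {'YEAR': 2010, 'CITY': 'AGGREGATE', 'DATA': 'TOTAL POPULATION', 'DATA POP.': 105},
    {'YEAR': 2010, 'CITY': 'AGGREGATE', 'DATA': 'Non-White', 'DATA POP.': 55},
]

def select_rows(table, label, value):
    """Keep only the rows whose `label` entry equals `value` (hypothetical helper)."""
    return [row for row in table if row[label] == value]

def relabel(table, old, new):
    """Return copies of the rows with the key `old` renamed to `new` (hypothetical helper)."""
    return [{(new if key == old else key): val for key, val in row.items()} for row in table]

# Same shape as the three-line example: filter twice, then relabel.
pop_change = select_rows(select_rows(rows, 'CITY', 'AGGREGATE'), 'DATA', 'TOTAL POPULATION')
pop_change = relabel(pop_change, 'DATA POP.', 'East Bay Shoreline Population')

# Column indices start at 0, so labels[0] is the first column and labels[1] the second.
labels = list(pop_change[0].keys())
```

Only the two aggregate total-population rows survive the filters, and the relabeled column keeps its position in the table.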
As described above, the population of the East Bay exploded during WWII as the Bay Area became a hub of defense manufacturing to help fight the war on the Pacific Front. Many shipbuilding plants opened throughout the bay, which brought many unskilled workers to the area as the shipbuilding profession was made less artisanal by technological advances. At the peak of manufacturing, it took only four days for a ship to be built from start to finish. As the graph shows, the 1940s brought a surge of almost 80,000 in the population of Richmond alone. Richmond was the location of the Kaiser Shipyards, which were opened by the renowned industrialist Henry J. Kaiser (his firm was one of the primary contractors in the building of the Hoover Dam).
For the analysis above, you can use this code:
richmond_pop = census.where('CITY', 'Richmond city').where('DATA', 'TOTAL POPULATION')
richmond_pop_wwii = richmond_pop.where('YEAR', are.between(1915, 1965))
richmond_pop_wwii = richmond_pop_wwii.relabeled('DATA POP.', 'Richmond Population')
richmond_pop_wwii.plot(0, 4)
This is also fairly simple code. The first line selects the data for Richmond city and then selects the total population data within that set. The second line selects the years surrounding WWII; because are.between includes its lower bound and excludes its upper bound, the bounds 1915 and 1965 keep the censuses from 1920 to 1960, which allows for comparison before and after wartime. The third line simply relabels the DATA POP. column to Richmond Population so that the y-axis of the graph is better labeled. The fourth line is the command to plot the data, as in the previous example.
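A quick way to check which census years a pair of bounds selects is to apply the same half-open test in plain Python. In this sketch, between is a hypothetical stand-in for the are.between predicate (at least the lower bound, strictly less than the upper bound), not the library's implementation:

```python
def between(lower, upper):
    """Build a predicate that matches lower <= value < upper."""
    return lambda value: lower <= value < upper

# Decennial census years from 1900 through 2010.
census_years = list(range(1900, 2011, 10))

in_window = between(1915, 1965)
selected = [year for year in census_years if in_window(year)]
print(selected)
```

Because the upper bound is excluded, passing 1920 and 1960 directly would drop the 1960 census, which is one reason to pick bounds that fall between census years.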
In 1976, the Mexican peso was devalued by 45%. In 1982, it went down again by 30%. These devaluations and their disastrous effect on the Mexican economy led to an influx of Mexican immigrants to the US. While immigration had always been occurring, it skyrocketed as a result of the massive unemployment that followed the peso devaluations. In border towns like El Paso, TX, people who lived in Juarez, Mexico (just on the other side of the border), would commute across the Rio Grande every morning to work in the US. While the Bay was somewhat insulated from these effects by the buffer of Southern California, it did experience a rapid rise in the Hispanic population as a result of Mexico's changing economy in the 1980s.
The code for this analysis is very similar to the previous code:
hispanic = census.where('DATA', 'Hispanic or Latino (of any race)').where('CITY', 'AGGREGATE')
hispanic = hispanic.relabeled('DATA POP.', 'Hispanic Population')
hispanic.plot(0, 4)
As before, the first line selects the data that we want, the Hispanic and Latino population in the aggregate data, and the second line relabels the DATA POP. column to Hispanic Population in order to fix the y-axis label of the graph. The third line is the command to plot the graph using the columns with indices 0 and 4 (the YEAR and Hispanic Population columns).
The 2010 census was the first since the Census Bureau began recording the racial breakdown of the population in which the non-white population of the East Bay fell from the previous census. This alarming trend is likely the result of the gentrification that is occurring in the East Bay, with the decline of racial minorities in Oakland the likely cause. Although other cities along the East Bay shoreline and flats have experienced growth in non-white populations, Oakland's staggering 21,000-person drop in minority populations has resulted in a downward trend for the East Bay.
As you've probably noticed, the code for these graphs follows the same pattern, so this bit should require no explanation. The code is:
nonwhite = census.where('DATA', 'Non-White').where('CITY', 'AGGREGATE')
nonwhite = nonwhite.relabeled('DATA POP.', 'Non-White Population')
nonwhite.plot(0, 4)
Following the population boom of the 1940s, the size of the retirement demographic in the Bay grew rapidly, especially as workers who were in their 20s and 30s during WWII reached retirement age in the 1970s and 1980s. During the decade following the 2000 census, the population of retirement-aged people in the East Bay dropped drastically, which may be part of the reason for the steady increase in liberal politics of the region. In any case, the decline of the retirement population in the East Bay is a notable trend because of the impact that their generation had on the Bay from WWII.
As before, the code for this example is once again of the generic form of the previous trends:
ret = census.where('DATA', '65+ Years Old').where('CITY', 'AGGREGATE')
ret = ret.relabeled('DATA POP.', 'Retirement Population')
ret.plot(0, 4)
The data on the population of people below the age of 18 shows a trend that follows the generations of people in the East Bay. The baby boomers, born during the late 1940s and early 1950s, caused a high population of minors in the 1960 census, which then declined as they reached working age but had not yet started families. Once the children of the boomers, the so-called "generation X," began to settle down, beginning in the late 1980s, the population of minors began to rise again, before the millennials began to cross the 18-year boundary.
Following the same vein as earlier, the code for this trend goes like this:
minors = census.where('DATA', '< 18 Years Old').where('CITY', 'AGGREGATE')
minors = minors.relabeled('DATA POP.', 'Minor Population')
minors.plot(0, 4)
As a result of the large-scale entrance of women into the workforce during WWII, the percentage of women and men in the East Bay began to change. Women flocked to the Bay Area to work in the manufacturing that was taking place in order to aid the fight on the Pacific Front; the reason that there was no corresponding migration of men to the Bay was that they went off to fight in the war. After the war ended, many of the people who had come here for work, regardless of whether or not they continued in the workforce, stayed and built families. The decline that is shown in the graph beginning after 1970 can be attributed to the proportions of women and men gradually evening out again.
With this example, the code finally gets more involved:
females_with_1980 = census.where('DATA', 'Percent Male').where('CITY', 'AGGREGATE')
females_no_percentage = females_with_1980.take(np.arange(3))
females_no_percentage = females_no_percentage.with_row(
    females_with_1980.where('YEAR', are.below(1980))
)
females_no_percentage = females_no_percentage.relabeled('% (AS DECIMAL)', '% MALE')
females = Table().with_column('YEAR', females_no_percentage.column('YEAR'))
females = females.with_column(
    'Percent Female', ((1 - females_no_percentage.column('% MALE')) * 100)
)
females = females.with_column(
    'Percent Male', (females_no_percentage.column('% MALE') * 100)
)
females.plot(0, 1)
First, the code defines a table, females_with_1980, by selecting the data for percent male in the aggregate. Then, because there is not a complete data set on gender percentage for 1980 in the census.csv file, the next part defines a new table, females_no_percentage, by taking the top three rows of females_with_1980 and then adding the rows where the year is below 1980 using the are.below() predicate. Finally, it creates a whole new table, females, that allows you to directly compare the percentages of males and females. It copies the year column from females_no_percentage and then adds two columns: Percent Female, computed as 1 - % MALE, and Percent Male, each multiplied by 100 to convert the decimal to a percentage. The last line plots the year against the Percent Female column of the females table.
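The arithmetic behind the two percentage columns can be checked on its own. The following is a minimal plain-Python sketch, assuming the male share is stored as a decimal between 0 and 1 as in the % (AS DECIMAL) column; percent_columns is a hypothetical helper and the shares are placeholders, not census values:

```python
def percent_columns(male_decimals):
    """Turn decimal male shares (0 to 1) into (percent_female, percent_male) lists."""
    percent_male = [share * 100 for share in male_decimals]
    percent_female = [(1 - share) * 100 for share in male_decimals]
    return percent_female, percent_male

# Placeholder shares, not census values.
example_shares = [0.5, 0.25]
percent_female, percent_male = percent_columns(example_shares)
```

For every row the two percentages should sum to 100, which is a convenient sanity check after building the table.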
One of the most obvious trends in demography is the shift to urban centers during times of economic prosperity. As evidenced by the rapid rise in East Bay population during the Roaring '20s and the 1940s, both times during which the US was coming out of a depression, this trend is certainly verifiable. Even more to this point is the fact that the growth in population between 1930 and 1940 is very nearly flat, as shown to the right; one of the worst depressions in US history, the Great Depression, occurred during the 1930s. This is again shown by nearly flat growth between 2000 and 2010, as a result of the 2008 financial crisis. After the crisis, however, the economy stabilized, and the 2016 population estimates (showing a sharper rise than that which occurred between 2000 and 2010) are consistent with this phenomenon.
The code for this example is fairly similar to that which was done already. In fact, it's basically the same as that from item (1):
econ_pop = census.where('CITY', 'AGGREGATE').where('CATEGORY', 'TOTAL POPULATION')
econ_pop = econ_pop.relabeled('DATA POP.', 'Population')
econ_pop.plot(0, 4)
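The comparison between boom decades and flat decades in this trend can be made concrete by computing the percentage change from one census to the next. The sketch below is plain Python with a hypothetical helper, decade_growth, and invented figures chosen only to show a flat decade next to two growing ones; they are not census values:

```python
def decade_growth(years, populations):
    """Return {end_year: percent change from the previous census} for each decade."""
    growth = {}
    for i in range(1, len(years)):
        previous, current = populations[i - 1], populations[i]
        growth[years[i]] = (current - previous) / previous * 100
    return growth

# Invented figures, not census values.
years = [1920, 1930, 1940, 1950]
populations = [100_000, 125_000, 125_000, 150_000]
growth = decade_growth(years, populations)
```

With the real data, the years and populations could be taken from the YEAR and Population columns of econ_pop, and a value near zero would mark a flat decade.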
The largest documented shift in the East Bay population occurred as a result of World War II. Lotchin (1994) attributes this influx of people to the massive manufacturing complex that arose in the East Bay in order to aid the fight on the Pacific Front. Shipbuilding, perhaps the biggest war industry of the Bay, led to the opening of a slew of shipyards, such as the Kaiser Shipyards in Richmond, which began to crank out ships so fast that the Axis powers couldn't keep sinking them. At the peak of manufacturing, the Kaiser Shipyards built a ship from start to finish in just four days. The Bay experienced an increase of 62,000 jobs during WWII. Lotchin does, however, argue that the population growth of the 1940s was nothing unprecedented, citing the similar rates of the 1920s, the economic boom after World War I.
Sohoni (1999) presents an interesting case: the rise of the Asian-American population. He notes that although Asian-Americans make up a mere 3% of the total US population, they are the fastest growing group; between 1980 and 1990, their population grew by 107.8%. Sohoni's analysis found that the growth of Asian populations was highly correlated with areas where there was already a substantial population of Asian immigrants (this idea is called "network theory").
Continuing in the vein of East Bay gentrification, Grigoryeva and Ruef (2015) provide a theory for the demographic history that underlies segregation. Although their paper considers the entirety of the US, its analysis of northern urban centers can be applied to the Bay Area with great accuracy. In their article, they demonstrate that although the South is famous for its racial intolerance, it experienced the least residential segregation. Northern cities, on the other hand, became segregated through racialized neighborhoods, which is what is happening now with the trend of gentrification.
If you're a UCB student, you can create a Python notebook to do this analysis at datahub.berkeley.edu using the datascience module. In order to run the code, you will also need the .csv file with the census data, which you can extract from the Google Sheet in the Raw Data section below. The code, as written below, will read the table only if it is named census.csv, so either change the file name or the code. For more information about the datascience module and coding ideas, visit data8.org.
Here is some code that you can use to do your own analysis. It uses the datascience module taught in UC Berkeley's Data 8 course. Before you run the code, though, make sure you paste this into a cell and run it:
from datascience import *
import numpy as np
%matplotlib inline
census = Table.read_table('census.csv')
In the last line, change census.csv to whatever you titled the table when you uploaded it. Now, onto the code. If you wanted to, for example, create a table that would allow you to directly compare the percentages of males and females from the 1970 census across the different regions of the East Bay, you would do this:
census_1970_males = census.where('YEAR', 1970).where('DATA', 'Percent Male').take(np.arange(7))
census_1970_females = census_1970_males.with_column('PERCENT FEMALE', 1 - census_1970_males.column(6))
compared_percentages = Table().with_column('YEAR', census_1970_females.column(0))
compared_percentages = compared_percentages.with_column(
    'CITY', census_1970_females.column(1)
)
compared_percentages = compared_percentages.with_column(
    'PERCENT MALE', census_1970_females.column('% (AS DECIMAL)')
)
compared_percentages = compared_percentages.with_column(
    'PERCENT FEMALE', census_1970_females.column('PERCENT FEMALE')
)
compared_percentages
In the first part of the code above, you're telling Python to take the census table (read from census.csv), select the year 1970 from the YEAR column, select the Percent Male data, and keep the first seven rows, which drops the last row (the aggregate data); all of this goes into a new table called census_1970_males. You're then creating a new table called census_1970_females that has an added column, PERCENT FEMALE, which subtracts the percentage of males from 1 to give the percentage of females. The last part of this code creates a new table from scratch, compared_percentages, that imports the YEAR, CITY, PERCENT MALE, and PERCENT FEMALE columns to allow you to directly compare the data. The table looks like this:
For more resources on demography, visit the Resources subpage of this section.