Project Abstract
This project investigates the factors behind the emergence and decline of Chinatown movie theaters in Los Angeles, and their financial health, from 1940 to 2000. Using demographic census data and other datasets, we sought to identify factors that could be linked to this phenomenon, including the density of Asian immigrants and the industry distribution at the theaters’ primary addresses and in their adjacent neighborhoods. I mapped the theaters in LA with an interactive ArcGIS Online map and found two main clusters, a pattern that differs from the distribution of Chinese-language movie theaters in other counties and states. This prompted a further question: whether location and clustering affect the theaters’ survival. Using ordinal regression and multilevel logistic regression in R, we found that marriage rate, business industry, Chinese population, and other factors could contribute to the operating condition of theaters. Location, however, was not a significant factor in the theaters’ performance. This research could be extended to Chinatown movie theaters across North America and contribute to an understanding of their general history and development.
Introduction
This research is part of a larger project on Chinatown movie theaters in North America. Theaters in North America screened Chinese-language movies as early as the 1920s, and such theaters peaked from the 1960s through the 1990s, as Hong Kong distributors circulated Cantonese and Mandarin films through dozens of dedicated theaters in various cities. This summer, we focus mainly on the Chinese movie theaters in LA, which differ from those in other cities in that they are concentrated in two clusters.
Like other Chinese-language theaters, which peaked in the 1970s and 1980s and declined in the 1990s, the LA theaters’ financial health changed in step with the demographic shift created by a new wave of immigrants from Asia and Central America, primarily fueled by the Immigration and Nationality Act of 1965, which abolished the national-origins quota system. The development of the downtown Chinatown also reflects this change, as it became more diversified in the 1960s. Unlike other cities, however, LA experienced another immigration trend in the 1980s. The post-1980 globalization trend profoundly shaped Asian immigration, assimilation, and development in a new ethnoburb, the San Gabriel Valley. These two Chinese population centers match the clusters on the map where Chinese-language movie theaters are located. We therefore address two research questions. The first is which demographic factors contribute to the opening or closing of Chinese-language theaters and to their financial health. The second is whether location (downtown or suburban) is a significant factor in the survival of Chinese-language theaters.
Data Source
I mainly use the demographic census dataset retrieved from Social Explorer, since the data in this database are tract-based (Source). As it lacks the 1950 Decennial Census, I manually entered that information from the tables in the 1950 United States Census file (Source).
For location, because we want detailed demographic information, we use census tracts as the individual units: we identify the addresses of the theaters and of their neighboring commercial areas and assign each to its corresponding tract. Since tract boundaries changed from decade to decade between 1940 and 2000, I used the Census Bureau’s Geocoder to work out the tracts for earlier decades from the 2010 geographies. Since Social Explorer provides census data on 2010 tracts from 1970 onward, I mainly needed to identify the tracts to use for 1940, 1950, and 1960. Here are the tracts in different decades after geocoding and the theaters’ basic information, including their current location and years of survival.
Basic Information About Theaters
I use GIS to plot all theaters with their basic information by importing the dataset as attribute tables. The map has several layers. The base layer is the Dark Grey Canvas. The second layer shows the counties in California, and the third shows the tracts I chose as observations, including the tracts where theaters are located and their adjacent tracts; these appear as polygons on the map. The fourth layer contains the point observations, which can be clustered as components of each theater. I ran a clustering analysis on these points and found two main clusters. One is in downtown LA with 63 features, and the other is around Monterey Park, San Gabriel, Alhambra, and Rosemead with 65 features. As the map shows, the cluster density downtown is higher than in the suburbs, which means that downtown theaters are more concentrated than suburban ones. The second-level clusters show the same pattern: there are only two clusters downtown but four in the suburbs, which represent the four main suburban theaters. This is because the downtown Chinese movie theaters are mostly located around Chinatown, which makes them denser, while the suburban area spans more cities and its population is more scattered.
There are seven main theaters.
LA Downtown Theaters
King Hing Theatre
Kim Sing Theatre
Pagoda Cinema
Suburb Theaters
Bard’s Garfield Egyptian Theatre
Kuo Hwa 2 Cinema
Kuo Hwa Theatre
Monterey Theatre
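The two-cluster pattern described above could also be checked outside ArcGIS. The following is a minimal sketch in base R, not the clustering method ArcGIS Online uses: it swaps in hierarchical clustering on plain longitude/latitude distance, and the coordinates are made-up points near downtown LA and the San Gabriel Valley, not the project’s theater data.

```r
# Made-up longitude/latitude points for illustration only (not project data):
# three near downtown LA, three in the San Gabriel Valley.
pts <- data.frame(
  lon = c(-118.24, -118.23, -118.24, -118.12, -118.10, -118.08),
  lat = c(34.06, 34.06, 34.07, 34.06, 34.08, 34.07)
)

# Hierarchical clustering on Euclidean distance between coordinates.
# Treating degrees as planar units is an approximation that is only
# reasonable over a small area such as one metropolitan region.
tree <- hclust(dist(pts), method = "complete")

# Cut the tree into two groups and attach the labels to the points.
pts$cluster <- cutree(tree, k = 2)
table(pts$cluster)
```

With real data, the `lon` and `lat` columns would come from the geocoded theater and commercial-area addresses, and the number of groups `k` would be a choice to justify rather than a given.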
Variable Selection
After checking all the variables available in the demographic census datasets for the different decades, we choose twenty independent variables that could potentially relate to the theaters’ health, and we define categorical dependent variables that represent the theaters’ opening status and financial health. We recode each theater’s years of operation within a decade into three categories, and we recode the theaters’ tax filing records, retrieved from California FTB data, into categories that reflect their financial condition.
Dependent Variables:
Opening Status
0 - Closed for the entire decade
1 - Open for part of the decade
2 - Open for the entire decade
Financial Health
0 - Unknown or closed
1 - Suspended or terminated while open
2 - At least one penalty
3 - At least 1 SI
4 - Very healthy: filed taxes for the existing year
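The opening-status coding above can be sketched as a small R function. This is only an illustration under stated assumptions: the function name `opening_status` and its arguments are hypothetical, a decade is taken to run from `decade_start` to `decade_start + 9`, and the example years are made up rather than taken from the theater data.

```r
# Hypothetical helper: code a theater's opening status for one decade.
# open_year / close_year are assumed to be the first and last calendar
# years in which the theater operated.
opening_status <- function(open_year, close_year, decade_start) {
  decade_end <- decade_start + 9
  # Number of years within the decade during which the theater was open
  years_open <- pmax(0, pmin(close_year, decade_end) -
                        pmax(open_year, decade_start) + 1)
  # 0 = closed all decade, 2 = open all decade, 1 = open part of it
  ifelse(years_open == 0, 0L, ifelse(years_open == 10, 2L, 1L))
}

# Made-up example: a theater open from 1962 through 1985
status <- opening_status(open_year = 1962, close_year = 1985,
                         decade_start = c(1960, 1970, 1990))

# Ordered factor, the form an ordinal regression expects for its response
status_f <- factor(status, levels = 0:2, ordered = TRUE)
```

Storing the result as an ordered factor keeps the 0 < 1 < 2 ranking explicit when the variable is later used as the response in an ordinal model.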
Independent Variables:
Here is an overview of the variables we chose for the first draft. We add one variable, Real Estate Values, taken from the percentage change in the assessed total value of each location’s parcel in each decade (Source). However, many values are missing for several decades.
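As a rough illustration of how a percentage change per decade can be computed while leaving missing values missing, here is a minimal base R sketch. The function name `pct_change` and the dollar figures are made up for the example; how the assessed values were actually processed is not stated in the source.

```r
# Hypothetical helper: percentage change between two assessed values.
# Returns NA when either value is missing or the starting value is zero.
pct_change <- function(start_value, end_value) {
  ifelse(is.na(start_value) | is.na(end_value) | start_value == 0,
         NA_real_,
         100 * (end_value - start_value) / start_value)
}

# Made-up assessed values at the start and end of a decade
start_vals <- c(100000, NA, 80000)
end_vals   <- c(150000, 120000, 60000)
change <- pct_change(start_vals, end_vals)
```

Keeping missing decades as `NA` rather than zero avoids treating an unknown value change as no change when the variable enters a model.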
Data Management
The next step is to download those datasets from Social Explorer and clean each one.
First, we import all datasets. Since the 1960 dataset can only be downloaded with all tracts included, I import the full 1960 dataset first and then select the tracts we want. For the 1950 dataset, I import the version I compiled manually in this sheet.
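Much of the code that follows adds up several columns after converting each with `as.numeric()`. A small helper could shorten those sums. This is a sketch only: `sum_numeric` is a hypothetical name, the toy data frame is made up, and the helper assumes the columns were read in as character or numeric (if they were read as factors, `as.numeric()` would return level codes rather than the printed values, which applies equally to the original sums).

```r
# Hypothetical helper: convert the named columns to numeric and add them
# element-wise, one total per row (tract).
sum_numeric <- function(df, cols) {
  Reduce(`+`, lapply(df[cols], as.numeric))
}

# Made-up toy data standing in for census columns read as text
toy <- data.frame(
  under5  = c("10.5", "8"),
  from5to9 = c("9.5", "7"),
  stringsAsFactors = FALSE
)
toy$Age1 <- sum_numeric(toy, c("under5", "from5to9"))
```

With such a helper, a line like the `Age1` sum below could be written as one call listing the relevant column names.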
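library(dplyr)  # provides %>%, rename() and select() used below
# 1940: collapse the detailed age columns into four broad age groups
x1940T$Age1<-as.numeric(x1940T$X..Under.5.Years)+as.numeric(x1940T$X..5.to.9.Years)+as.numeric(x1940T$X..10.to.14.Years)+as.numeric(x1940T$X..15.to.19.Years)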
x1940T$Age2<-as.numeric(x1940T$X..20.to.24.Years)+as.numeric(x1940T$X..25.to.29.Years)+as.numeric(x1940T$X..30.to.34.Years)
x1940T$Age3<-as.numeric(x1940T$X..35.to.39.Years)+as.numeric(x1940T$X..40.to.44.Years)+as.numeric(x1940T$X..45.to.49.Years)+as.numeric(x1940T$X..50.to.54.Years)+as.numeric(x1940T$X..55.to.59.Years)+as.numeric(x1940T$X..60.to.64.Years)
x1940T$Age4<-as.numeric(x1940T$X..65.to.69.Years)+as.numeric(x1940T$X..70.to.74.Years)+as.numeric(x1940T$X..75.Years.and.over)
x1940T<-x1940T%>%
rename(FemaleP=X..Female, MaleP=X..Male, WhiteP=X..White, BlackP=X..Black, OtherRaceP=X..Other, WhiteNativeBorn=White.Population..Native.Born,WhiteForeignBorn=White.Population..Foreign.Born)
x1940T$Educ1 <- as.numeric(x1940T$X..Population.Age.25.and.Over..Less.Than.High.School)
x1940T$Educ2 <- as.numeric(x1940T$X..Population.Age.25.and.Over..Some.High.School.Or.More)
x1940T$Educ3 <- as.numeric(x1940T$X..Population.Age.25.and.Over..Some.College.Or.More)
x1940T$Household1 <- as.numeric(x1940T$X..Housing.Units.Reporting.Value.of.Home..Under..500..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1940T$X..Housing.Units.Reporting.Value.of.Home...500.to..699..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1940T$X..Housing.Units.Reporting.Value.of.Home...700.to..999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1940T$X..Housing.Units.Reporting.Value.of.Home...1.000.to..1.499..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1940T$X..Housing.Units.Reporting.Value.of.Home...1.500.to..1.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1940T$X..Housing.Units.Reporting.Value.of.Home...2.000.to..2.499..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1940T$X..Housing.Units.Reporting.Value.of.Home...2.500.to..2.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1940T$X..Housing.Units.Reporting.Value.of.Home...3.000.to..3.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1940T$X..Housing.Units.Reporting.Value.of.Home...4.000.to..4.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1940T$X..Housing.Units.Reporting.Value.of.Home...5.000.to..5.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1940T$X..Housing.Units.Reporting.Value.of.Home...6.000.to..7.499..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1940T$X..Housing.Units.Reporting.Value.of.Home...7.500.to..9.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1940T$Household2 <- as.numeric(x1940T$X..Housing.Units.Reporting.Value.of.Home...10.000.to..14.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1940T$X..Housing.Units.Reporting.Value.of.Home...15.000.to..19.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1940T<-x1940T%>%
rename(NativeForeignP = X..White.Population..Native.Born,
ForeignBornP = X..White.Population..Foreign.Born,
LaborP =X..Population.Age.14.and.Over..In.Labor.Force,
NoLaberP=X..Population.Age.14.and.Over..Not.In.Labor.Force,
CivilianLaborP = X..Population.Age.14.and.Over..In.Labor.Force..In.Civilian.Labor.Force,
EmployedWorker = X..Population.Age.14.and.Over..In.Labor.Force..In.Civilian.Labor.Force..Employed,
Unemployedworker = X..Population.Age.14.and.Over..In.Labor.Force..In.Civilian.Labor.Force..Unemployed..Seeking.Work.,
Occupation.ProfessionalWorkerP = X..Employed.Civilian.Population.Age.14.and.Over..Professional.Workers,
Occupation.ManagersP = X..Employed.Civilian.Population.Age.14.and.Over..Proprietors.Managers.Officials,
Occupation.ClericalP = X..Employed.Civilian.Population.Age.14.and.Over..Clerical.Sales.Kindred.Workers,
Occupation.CraftmanP = X..Employed.Civilian.Population.Age.14.and.Over..Craftmen.Foremen.Kindred.Workers,
Occupation.DomesticServiceP = X..Employed.Civilian.Population.Age.14.and.Over..Domestic.Service.Workers,
Household3= X..Housing.Units.Reporting.Value.of.Home...20.000.and.Over..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
OtherRace=Other,
Population=Total.Population.1,
PopulationDensity=Population.Density.per.sq..mile)
x1940T<-x1940T%>%
select(Tract, Population, PopulationDensity, Male,MaleP,Female,FemaleP,White, Black, OtherRace, WhiteP, BlackP, OtherRaceP,
Age1, Age2, Age3, Age4, NativeForeignP, ForeignBornP,
Educ1, Educ2, Educ3,LaborP, NoLaberP, CivilianLaborP, EmployedWorker, Unemployedworker, Occupation.ProfessionalWorkerP,Occupation.ManagersP,
Occupation.ClericalP, Occupation.CraftmanP, Occupation.DomesticServiceP,Household1, Household2, Household3)
# 1950: manually compiled sheet, so only a few columns need renaming
x1950T<-x1950T%>%
rename(Population=Total.Population,
PopulationDensity="Population. Density.per.sq..mile",
LaborP=LaberP,
NoLaberP=NolaberP)
# 1960: rename sex, race and foreign-stock columns to the common names
x1960T<-x1960T%>%
rename(Female=Total.Population..Female, Male=Total.Population..Male, FemaleP=X..Total.Population..Female, MaleP=X..Total.Population..Male, WhiteP=X..Total.Population..White,
White=Total.Population..White, Black=Total.Population..Black, OtherRace=Total.Population..Other.Race, BlackP=X..Total.Population..Black, OtherRaceP=X..Total.Population..Other.Race, ForeignBornP=X..Foreign.Stock.Population..Foreign.Born,ForeignBorn=Foreign.Stock.Population..Foreign.Born,
NativeForeign=Foreign.Stock.Population..Native.of.foreign.or.mixed.parentage,NativeForeignP=X..Foreign.Stock.Population..Native.of.foreign.or.mixed.parentage)
x1960T$Age1<-as.numeric(x1960T$X..Total.Population..Under.5.Years)+as.numeric(x1960T$X..Total.Population..5.to.9.Years)+as.numeric(x1960T$X..Total.Population..10.to.14.Years)+as.numeric(x1960T$X..Total.Population..15.to.19.Years)
x1960T$Age2<-as.numeric(x1960T$X..Total.Population..20.to.24.Years)+as.numeric(x1960T$X..Total.Population..25.to.29.Years)+as.numeric(x1960T$X..Total.Population..30.to.34.Years)
x1960T$Age3<-as.numeric(x1960T$X..Total.Population..35.to.39.Years)+as.numeric(x1960T$X..Total.Population..40.to.44.Years)+as.numeric(x1960T$X..Total.Population..45.to.49.Years)+as.numeric(x1960T$X..Total.Population..50.to.54.Years)+as.numeric(x1960T$X..Total.Population..55.to.59.Years)+as.numeric(x1960T$X..Total.Population..60.to.64.Years)
x1960T$Age4<-as.numeric(x1960T$X..Total.Population..65.to.69.Years)+as.numeric(x1960T$X..Total.Population..70.to.74.Years)+as.numeric(x1960T$X..Total.Population..75.Years.and.Over)
x1960T$Educ1 <- as.numeric(x1960T$X..Population.Age.25...No.school.years.completed)+as.numeric(x1960T$X..Population.Age.25...Elementary.or.more)-as.numeric(x1960T$X..Population.Age.25...High.school.or.more)
x1960T$Educ2 <- as.numeric(x1960T$X..Population.Age.25...High.school.or.more)
x1960T$Educ3 <- as.numeric(x1960T$X..Population.Age.25...College.or.more)
x1960T<-x1960T%>%
rename(Single = X..Population.14.years.and.over..Single,
Married = X..Population.14.years.and.over..Married..not.separated,
Separated=X..Population.14.years.and.over..Separated,
Widowed=X..Population.14.years.and.over..Widowed,
Divorced=X..Population.14.years.and.over..Divorced)
x1960T$Income1 <- as.numeric(x1960T$X..Households..Less.than..1.000..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+as.numeric(x1960T$X..Households...1.000....1.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1960T$X..Households...2.000....2.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+as.numeric(x1960T$X..Households...3.000....3.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1960T$X..Households...4.000....4.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+as.numeric(x1960T$X..Households...5.000....5.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1960T$X..Households...6.000....6.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+as.numeric(x1960T$X..Households...7.000....7.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1960T$X..Households...8.000....8.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+as.numeric(x1960T$X..Households...9.000....9.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1960T$Income2 <-as.numeric(x1960T$X..Households...10.000....14.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1960T$Income3 <-as.numeric(x1960T$X..Households...15.000....24.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1960T$Income4 <-as.numeric(x1960T$X..Households...25.000.and.over..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1960T<-x1960T%>%
rename(LaborP =X..Total.Population.Age.14...In.Labor.Force,
NoLaberP=X..Total.Population.Age.14...Not.In.Labor.Force,
CivilianLaborP = X..Total.Population.Age.14...In.Labor.Force..In.Civilian.Labor.Force,
EmployedWorker = X..Total.Population.Age.14...In.Labor.Force..In.Civilian.Labor.Force..Employed,
Unemployedworker = X..Total.Population.Age.14...In.Labor.Force..In.Civilian.Labor.Force..Unemployed,
Occupation.ProfessionalWorkerP = X..Employed.Civilians.14...Professional..technical..and.kindred.workers,
Occupation.ManagersP = X..Employed.Civilians.14...Managers..officials..and.proprietors,
Occupation.ClericalP = X..Employed.Civilians.14...Clerical.and.kindred.workers,
Occupation.CraftmanP = X..Employed.Civilians.14...Craftsmen..foremen..and.kindred.workers,
Occupation.DomesticServiceP = X..Employed.Civilians.14...Private.household.workers,
IndustryManufactory=X..Employed.Civilians.Age.14...Machinery,
IndustryFood=X..Employed.Civilians.Age.14...Food.and.kindred.industries,
IndustryTextile=X..Employed.Civilians.Age.14...Textile.and.apparel,
IndustryPublishing=X..Employed.Civilians.Age.14...Printing..publishing..and.allied,
IndustryCommunication=X..Employed.Civilians.Age.14...Communications..utilities..sanitary.services,
IndustrySale=X..Employed.Civilians.Age.14...Wholesale.trade,
Hospitality=X..Employed.Civilians.Age.14...Eating.and.drinking.places,
IndustryRetail=X..Employed.Civilians.Age.14...Other.retail,
IndustryBusiness=X..Employed.Civilians.Age.14...Business.and.repair.services,
Population=Total.Population,
PopulationDensity=Population.Density.per.sq..mile)
x1960T$Household1 <- as.numeric(x1960T$X..Owner.Occupied.Units.Reporting.Value..Under..5.000..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1960T$X..Owner.Occupied.Units.Reporting.Value...5.000.to..9.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1960T$Household2 <- as.numeric(x1960T$X..Owner.Occupied.Units.Reporting.Value...10.000.to..14.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1960T$X..Owner.Occupied.Units.Reporting.Value...15.000.to..19.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1960T$Household3 <- as.numeric(x1960T$X..Owner.Occupied.Units.Reporting.Value...20.000.to..24.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1960T$X..Owner.Occupied.Units.Reporting.Value...25.000.or.more..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1960T<-x1960T%>%
select(Tract, Population, PopulationDensity, Male,MaleP,Female,FemaleP,White, Black, OtherRace, WhiteP, BlackP, OtherRaceP,
Age1, Age2, Age3, Age4, Single, Married, Separated, Widowed, Divorced,Educ1, Educ2, Educ3,LaborP, NoLaberP, CivilianLaborP, EmployedWorker, Unemployedworker, Occupation.ProfessionalWorkerP,Occupation.ManagersP,
Occupation.ClericalP, Occupation.CraftmanP, Occupation.DomesticServiceP,Household1, Household2, Household3,IndustryManufactory,IndustryFood,IndustryTextile,
IndustryPublishing,IndustryCommunication,IndustrySale,Hospitality,IndustryRetail,IndustryBusiness,ForeignBorn,NativeForeign,ForeignBornP,NativeForeignP,Income1,Income2,Income3,Income4)
# 1970: rename sex, race and foreign-stock columns, including Asian origins
x1970T<-x1970T%>%
rename(Female=Total.Population..Female, Male=Total.Population..Male, FemaleP=X..Total.Population..Female, MaleP=X..Total.Population..Male, WhiteP=X..White,
OtherRace=Other, BlackP=X..Black, OtherRaceP=X..Other, ForeignBornP=X..Count.of.Persons..Foreign.Born,ForeignBorn=Count.of.Persons..Foreign.Born,
NativeForeign=Count.of.Persons.of.Foreign.Stock..Native..of.Foreign.or.Mixed.Parentage.,NativeForeignP=X..Count.of.Persons.of.Foreign.Stock..Native..of.Foreign.or.Mixed.Parentage.,
NativeSouthwestAsiaP=X..Count.of.Persons.of.Foreign.Stock..Native..of.Foreign.or.Mixed.Parentage...Southwest.Asia,
NativeForeignJapanP=X..Count.of.Persons.of.Foreign.Stock..Native..of.Foreign.or.Mixed.Parentage...Japan,
NativeForeignChinaP=X..Count.of.Persons.of.Foreign.Stock..Native..of.Foreign.or.Mixed.Parentage...China,
NativeForeignOtherAsiaP=X..Count.of.Persons.of.Foreign.Stock..Native..of.Foreign.or.Mixed.Parentage...Other.Asia,
ForeignSouthwestAsiaP=X..Count.of.Persons.of.Foreign.Stock..Foreign.Born..Southwest.Asia,
ForeignChinaP=X..Count.of.Persons.of.Foreign.Stock..Foreign.Born..China,
ForeignJapanP=X..Count.of.Persons.of.Foreign.Stock..Foreign.Born..Japan,
ForeignOtherAsiaP=X..Count.of.Persons.of.Foreign.Stock..Foreign.Born..Other.Asia)
x1970T$Age1<-as.numeric(x1970T$X..Total.Population..Under.5.Years)+as.numeric(x1970T$X..Total.Population..5.to.9.Years)+as.numeric(x1970T$X..Total.Population..10.to.14.Years)+as.numeric(x1970T$X..Total.Population..15.to.17.Years)
x1970T$Age2<-as.numeric(x1970T$X..Total.Population..18.to.24.Years)+as.numeric(x1970T$X..Total.Population..25.to.34.Years)
x1970T$Age3<-as.numeric(x1970T$X..Total.Population..35.to.44.Years)+as.numeric(x1970T$X..Total.Population..45.to.54.Years)+as.numeric(x1970T$X..Total.Population..55.to.64.Years)
x1970T$Age4<-as.numeric(x1970T$X..Total.Population..65.to.74.Years)+as.numeric(x1970T$X..Total.Population..75.Years.and.over)
x1970T<-x1970T%>%
rename(Single = X..Count.of.Persons.14.Years.Old.and.over..Never.Married,
Married = X..Count.of.Persons.14.Years.Old.and.over..Married,
Separated=X..Count.of.Persons.14.Years.Old.and.over..Separated,
Widowed=X..Count.of.Persons.14.Years.Old.and.over..Widowed,
Divorced=X..Count.of.Persons.14.Years.Old.and.over..Divorced)
x1970T$Educ1 <- as.numeric(x1970T$X..Population.25.Years.Old.and.over..No.School.Years.Completed..Includes.Nursery.and.Kindergarten.)+as.numeric(x1970T$X..Population.25.Years.Old.and.over..1.8.Years.of.Elementary.Education.or.More)-
as.numeric(x1970T$X..Population.25.Years.Old.and.over..1.4.Years.of.High.School.Education.or.More)
x1970T$Educ2 <- as.numeric(x1970T$X..Population.25.Years.Old.and.over..1.4.Years.of.High.School.Education.or.More)
x1970T$Educ3 <- as.numeric(x1970T$X..Population.25.Years.Old.and.over..1.5.Years.of.College.Education.or.More)
x1970T$Household1 <- as.numeric(x1970T$X..Count.of.Units.for.Which.Value.Is.Tabulated..Less.Than..5.000..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1970T$X..Count.of.Units.for.Which.Value.Is.Tabulated...5.000....7.499..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1970T$X..Count.of.Units.for.Which.Value.Is.Tabulated...7.500....9.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1970T$Household2 <- as.numeric(x1970T$X..Count.of.Units.for.Which.Value.Is.Tabulated...10.000....12.499..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1970T$X..Count.of.Units.for.Which.Value.Is.Tabulated...12.500....14.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1970T$X..Count.of.Units.for.Which.Value.Is.Tabulated...15.000....17.499..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1970T$X..Count.of.Units.for.Which.Value.Is.Tabulated...17.500....19.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1970T$Household3 <- as.numeric(x1970T$X..Count.of.Units.for.Which.Value.Is.Tabulated...20.000....24.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1970T$X..Count.of.Units.for.Which.Value.Is.Tabulated...25.000....34.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1970T$X..Count.of.Units.for.Which.Value.Is.Tabulated...35.000....49.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1970T$X..Count.of.Units.for.Which.Value.Is.Tabulated...50.000.or.More..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1970T$Income1 <- as.numeric(x1970T$X..Count.of.Persons.14.Years.Old.and.over..Without.Income..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1970T$X..Count.of.Persons.14.Years.Old.and.over..Income..1....999.or.Loss..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1970T$X..Count.of.Persons.14.Years.Old.and.over..Income..1.000....1.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1970T$X..Count.of.Persons.14.Years.Old.and.over..Income..2.000....2.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1970T$X..Count.of.Persons.14.Years.Old.and.over..Income..3.000....3.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1970T$X..Count.of.Persons.14.Years.Old.and.over..Income..4.000....4.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1970T$X..Count.of.Persons.14.Years.Old.and.over..Income..5.000....5.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1970T$X..Count.of.Persons.14.Years.Old.and.over..Income..6.000....6.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1970T$X..Count.of.Persons.14.Years.Old.and.over..Income..7.000....7.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1970T$X..Count.of.Persons.14.Years.Old.and.over..Income..8.000....8.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1970T$X..Count.of.Persons.14.Years.Old.and.over..Income..9.000....9.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1970T$Income2 <-as.numeric(x1970T$X..Count.of.Persons.14.Years.Old.and.over..Income..10.000....14.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1970T$Income3 <-as.numeric(x1970T$X..Count.of.Persons.14.Years.Old.and.over..Income..15.000....24.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1970T$Income4 <-as.numeric(x1970T$X..Count.of.Persons.14.Years.Old.and.over..Income..25.000.and.over..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1970T$IndustryManufactory<-as.numeric(x1970T$X..Employed.Population.16...Machinery..Except.Electrical)+
as.numeric(x1970T$X..Employed.Population.16...Electrical.Machinery..Equipment..and.Supplies)
x1970T$IndustryBusiness<-as.numeric(x1970T$X..Employed.Population.16...Business.Services)+
as.numeric(x1970T$X..Employed.Population.16...Repair.Services)
x1970T$IndustryRetail<-as.numeric(x1970T$X..Employed.Population.16...General.Merchandise.Retailing)+
as.numeric(x1970T$X..Employed.Population.16...Motor.Vehicles.Retailing.and.Service.Stations)+
as.numeric(x1970T$X..Employed.Population.16...Other.Retail.Trade)
x1970T<-x1970T%>%
rename(LaborP =X..Population.16.Years.Old.and.over..in.Labor.Force,
NoLaberP=X..Population.16.Years.Old.and.over..Not.in.Labor.Force,
CivilianLaborP = X..Population.16.Years.Old.and.over..in.Labor.Force..in.Civilian.Labor.Force,
EmployedWorker = X..Population.16.Years.Old.and.over..in.Labor.Force..in.Civilian.Labor.Force..Employed,
Unemployedworker = X..Population.16.Years.Old.and.over..in.Labor.Force..in.Civilian.Labor.Force..Unemployed,
Occupation.ProfessionalWorkerP = X..Count.of.Employed.Persons.16.Years.Old.and.over..Professional..Technical..and.Kindred.Workers,
Occupation.ManagersP = X..Count.of.Employed.Persons.16.Years.Old.and.over..Managers.and.Administrators..Except.Farm,
Occupation.ClericalP = X..Count.of.Employed.Persons.16.Years.Old.and.over..Clerical.and.Kindred.Workers,
Occupation.CraftmanP = X..Count.of.Employed.Persons.16.Years.Old.and.over..Craftsmen..Foremen..and.Kindred.Workers,
Occupation.DomesticServiceP = X..Count.of.Employed.Persons.16.Years.Old.and.over..Private.Household.Workers,
IndustryFood=X..Employed.Population.16...Food.and.Kindred.Products,
IndustryTextile=X..Employed.Population.16...Textile.Mill.and.Other.Fabricated.Textile.Products,
IndustryPublishing=X..Employed.Population.16...Printing..Publishing..and.Allied.Industries,
IndustryCommunication=X..Employed.Population.16...Communications,
IndustrySale=X..Employed.Population.16...Wholesale.Trade,
Hospitality=X..Employed.Population.16...Eating.and.Drinking.Places,
Population=Total.Population,
PopulationDensity=Population.Density..per.sq..mile.)
x1970T<-x1970T%>%
select(Tract, Population, PopulationDensity, Male,MaleP,Female,FemaleP,White, Black, OtherRace, WhiteP, BlackP, OtherRaceP,
Age1, Age2, Age3, Age4, Single, Married, Separated, Widowed, Divorced,Educ1, Educ2, Educ3,LaborP, NoLaberP, CivilianLaborP, EmployedWorker, Unemployedworker, Occupation.ProfessionalWorkerP,Occupation.ManagersP,
Occupation.ClericalP, Occupation.CraftmanP, Occupation.DomesticServiceP,Household1, Household2, Household3,IndustryManufactory,IndustryFood,IndustryTextile,
IndustryPublishing,IndustryCommunication,IndustrySale,Hospitality,IndustryRetail,IndustryBusiness,ForeignBorn,NativeForeign,ForeignBornP,NativeForeignP,Income1,Income2,Income3,Income4,
ForeignChinaP,ForeignJapanP,ForeignSouthwestAsiaP,ForeignOtherAsiaP,NativeSouthwestAsiaP,NativeForeignChinaP,NativeForeignJapanP,NativeForeignOtherAsiaP)
# 1980: derive other-race and Asian counts and shares from the race columns
x1980T$OtherRace<-as.numeric(x1980T$Total.Population..Dollars.adjusted.for.inflation.to.match.value.in.2010..3)-
as.numeric(x1980T$Total.Population..White..Dollars.adjusted.for.inflation.to.match.value.in.2010.)-as.numeric(x1980T$Total.Population..Black..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1980T$OtherRaceP<-100-as.numeric(x1980T$X..Total.Population..White..Dollars.adjusted.for.inflation.to.match.value.in.2010.)-
as.numeric(x1980T$X..Total.Population..Black..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1980T$Asian<-as.numeric(x1980T$Asian.and.Pacific.Islander..Dollars.adjusted.for.inflation.to.match.value.in.2010.)-
as.numeric(x1980T$Asian.and.Pacific.Islander..Hawaiian..Dollars.adjusted.for.inflation.to.match.value.in.2010.)-
as.numeric(x1980T$Asian.and.Pacific.Islander..Guamanian..Dollars.adjusted.for.inflation.to.match.value.in.2010.)-
as.numeric(x1980T$Asian.and.Pacific.Islander..Samoan..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1980T$AsianP<-100-as.numeric(x1980T$X..Asian.and.Pacific.Islander..Hawaiian..Dollars.adjusted.for.inflation.to.match.value.in.2010.)-
as.numeric(x1980T$X..Asian.and.Pacific.Islander..Guamanian..Dollars.adjusted.for.inflation.to.match.value.in.2010.)-
as.numeric(x1980T$X..Asian.and.Pacific.Islander..Samoan..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1980T<-x1980T%>%
rename(Female=Total.Population..Female..Dollars.adjusted.for.inflation.to.match.value.in.2010., Male=Total.Population..Male..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
FemaleP=X..Total.Population..Female..Dollars.adjusted.for.inflation.to.match.value.in.2010., MaleP=X..Total.Population..Male..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
White=Total.Population..White..Dollars.adjusted.for.inflation.to.match.value.in.2010.,Black=Total.Population..Black..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
WhiteP=X..Total.Population..White..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
BlackP=X..Total.Population..Black..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
JapaneseP=X..Asian.and.Pacific.Islander..Japanese..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
ChineseP=X..Asian.and.Pacific.Islander..Chinese..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
FilipinoP=X..Asian.and.Pacific.Islander..Filipino..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
KoreanP=X..Asian.and.Pacific.Islander..Korean..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
AsianIndianP=X..Asian.and.Pacific.Islander..Asian.Indian..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
VietnameseP=X..Asian.and.Pacific.Islander..Vietnamese..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
HouseholdWhiteP=X..Households..White..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
HouseholdBlackP=X..Households..Black..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
HouseholdAsianandPacificP=X..Households..Asian.and.Pacific.Islander..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
HouseholdOthersP=X..Households..Other..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
AsianBelowPovertyLevelP=X..Asian.and.Pacific.Islander.Population.for.Whom.Poverty.Status.is.Determined..Below.Poverty.Level..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
AsianAbovePovertyLevelP=X..Asian.and.Pacific.Islander.Population.for.Whom.Poverty.Status.is.Determined..Above.Poverty.Level..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
ForeignBornP=X..Total.Population..Foreign.Born..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
ForeignBorn=Total.Population..Foreign.Born..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
Population=Total.Population,
PopulationDensity=Population.Density..per.sq..mile.)
x1980T$Age1<-as.numeric(x1980T$X..Total.Population..Under.5.Year..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+as.numeric(x1980T$X..Total.Population..5.to.9.Years..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+as.numeric(x1980T$X..Total.Population..10.to.14.Years..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+as.numeric(x1980T$X..Total.Population..15.to.17.Years..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1980T$Age2<-as.numeric(x1980T$X..Total.Population..18.to.24.Years..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+as.numeric(x1980T$X..Total.Population..25.to.34.Years..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1980T$Age3<-as.numeric(x1980T$X..Total.Population..35.to.44.Years..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+as.numeric(x1980T$X..Total.Population..45.to.54.Years..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+as.numeric(x1980T$X..Total.Population..55.to.64.Years..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1980T$Age4<-as.numeric(x1980T$X..Total.Population..65.to.74.Years..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+as.numeric(x1980T$X..Total.Population..75.to.84.Years..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+as.numeric(x1980T$X..Total.Population..85.Years.and.over..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1980T<-x1980T%>%
rename(Single = X..Persons.15.Years.and.Over..Single..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
Married = X..Persons.15.Years.and.Over..Now.Married..Except.Separated..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
Separated=X..Persons.15.Years.and.Over..Separated..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
Widowed=X..Persons.15.Years.and.Over..Widowed..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
Divorced=X..Persons.15.Years.and.Over..Divorced..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1980T<-x1980T%>%
rename(Educ1=X..Persons.25.Years.Old.and.Over..Elementary..0.to.8.Years..or.less..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
Educ2=X..Persons.25.Years.Old.and.Over..High.School.or.more..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
Educ3=X..Persons.25.Years.Old.and.Over..College.or.more..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
# Income groups for 1980: household counts in each bracket are summed and expressed as a percentage of all households
x1980T$Income1 <- 100*(as.numeric(x1980T$Households..Less.than..2.500..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1980T$Households...2.500.to..4.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1980T$Households...5.000.to..7.499..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1980T$Households...7.500.to..9.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.))/as.numeric(x1980T$Households..Dollars.adjusted.for.inflation.to.match.value.in.2010..1)
x1980T$Income2 <- 100*(as.numeric(x1980T$Households...10.000.to..12.499..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1980T$Households...12.500.to..14.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.))/as.numeric(x1980T$Households..Dollars.adjusted.for.inflation.to.match.value.in.2010..1)
x1980T$Income3 <- 100*(as.numeric(x1980T$Households...15.000.to..17.499..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1980T$Households...17.500.to..19.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1980T$Households...20.000.to..22.499..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1980T$Households...22.500.to..24.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.))/as.numeric(x1980T$Households..Dollars.adjusted.for.inflation.to.match.value.in.2010..1)
x1980T$Income4 <- 100*(as.numeric(x1980T$Households...25.000.to..27.499..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1980T$Households...27.500.to..29.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1980T$Households...30.000.to..34.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1980T$Households...35.000.to..39.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1980T$Households...40.000.to..49.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1980T$Households...50.000.to..74.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1980T$Households...75.000.or.more..Dollars.adjusted.for.inflation.to.match.value.in.2010.))/as.numeric(x1980T$Households..Dollars.adjusted.for.inflation.to.match.value.in.2010..1)
x1980T<-x1980T%>%
rename(NoLaberP=X..Persons.16.Years.and.Over..Not.in.Labor.Force..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
CivilianLaborP = X..Persons.16.Years.and.Over..Civilian.Labor.Force..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
EmployedWorker = X..Persons.16.Years.and.Over..Civilian.Labor.Force..Employed..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
Unemployedworker = X..Persons.16.Years.and.Over..Civilian.Labor.Force..Unemployed..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
Occupation.ProfessionalWorkerP = X..Employed.Persons.16.Years.and.Over..Professional.and.Related.Services..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
Occupation.ManagersP = X..Employed.Persons.16.Years.and.Over..Managerial.and.Professional.Specialty.Occupations..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
Occupation.ClericalP = X..Employed.Persons.16.Years.and.Over..Finance..Insurance..and.Real.Estate..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
Occupation.CraftmanP = X..Employed.Persons.16.Years.and.Over..Precision.Production..Craft..and.Repair.Occupations..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
Occupation.DomesticServiceP = X..Employed.Persons.16.Years.and.Over..Service.Occupations..Private.Household.Occupations..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
IndustryManufactory=X..Employed.Persons.16.Years.and.Over..Manufacturing..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
IndustryCommunication=X..Employed.Persons.16.Years.and.Over..Communications.and.Other.Public.Utilities..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
IndustrySale=X..Employed.Persons.16.Years.and.Over..Wholesale.Trade..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
Hospitality=X..Employed.Persons.16.Years.and.Over..Personal..Entertainment..and.Recreation.Services..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
IndustryRetail=X..Employed.Persons.16.Years.and.Over..Retail.Trade..Dollars.adjusted.for.inflation.to.match.value.in.2010.,
IndustryBusiness=X..Employed.Persons.16.Years.and.Over..Business.and.Repair.Services..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1980T$LaborP <-100-as.numeric(x1980T$NoLaberP)
x1980T<-x1980T%>%
select(Tract, Population, PopulationDensity, Male,MaleP,Female,FemaleP,White, Black, OtherRace, WhiteP, BlackP, OtherRaceP,
JapaneseP,ChineseP, FilipinoP,KoreanP, AsianIndianP, VietnameseP, HouseholdWhiteP,HouseholdBlackP,HouseholdAsianandPacificP,HouseholdOthersP,
AsianBelowPovertyLevelP,AsianAbovePovertyLevelP,Asian, AsianP,
Age1, Age2, Age3, Age4, Single, Married, Separated, Widowed, Divorced,Educ1, Educ2, Educ3,LaborP, NoLaberP, CivilianLaborP, EmployedWorker, Unemployedworker, Occupation.ProfessionalWorkerP,Occupation.ManagersP,
Occupation.ClericalP, Occupation.CraftmanP, Occupation.DomesticServiceP,IndustryManufactory,
IndustryCommunication,IndustrySale,Hospitality,IndustryRetail,IndustryBusiness,ForeignBorn,ForeignBornP,Income1,Income2,Income3,Income4)
# Percent Asian in 1990, on the same 0 to 100 scale as the other percentage variables
x1990T$AsianP<- 100*as.numeric(x1990T$Asian)/as.numeric(x1990T$Total.Population)
x1990T<-x1990T%>%
rename(Female=Total.Population..Female, Male=Total.Population..Male,
FemaleP=X..Total.Population..Female, MaleP=X..Total.Population..Male,
White=Persons..White,Black=Persons..Black,
WhiteP=X..Persons..White, BlackP=X..Persons..Black,
JapaneseP=X..Asian..Japanese,
ChineseP=X..Asian..Chinese,
FilipinoP=X..Asian..Filipino,
KoreanP=X..Asian..Korean,
AsianIndianP=X..Asian..Asian.Indian,
VietnameseP=X..Asian..Vietnamese,
HouseholdWhiteP=X..Households.With.a.White.Householder,
HouseholdBlackP=X..Households.With.a.Black.Householder,
HouseholdAsianandPacificP=X..Households.With.a.Asian.or.Pacific.Islander.Householder,
AsianBelowPovertyLevelP=X..Asian.or.Pacific.Islander.Persons.for.whom.poverty.status.is.determined..Income.in.1989.below.poverty.level,
AsianAbovePovertyLevelP=X..Asian.or.Pacific.Islander.Persons.for.whom.poverty.status.is.determined..Income.in.1989.above.poverty.level,
ForeignBornP=X..Total.Population..Foreign.born,
ForeignBorn=Total.Population..Foreign.born,
ForeignBornBefore1960=X..Foreign.born.persons..Before.1960,
ForeignBorn1960to1969=X..Foreign.born.persons..1960.to.1969,
ForeignBorn1970to1979=X..Foreign.born.persons..1970.to.1979,
ForeignBornafter1980=X..Foreign.born.persons..1980.to.1990)
x1990T$OtherRace<-as.numeric(x1990T$Persons.1)-as.numeric(x1990T$White)-as.numeric(x1990T$Black)
x1990T$OtherRaceP<-100-as.numeric(x1990T$WhiteP)-as.numeric(x1990T$BlackP)
x1990T$Age1<-as.numeric(x1990T$X..Persons..Under.5.year)+as.numeric(x1990T$X..Persons..5.to.9.years)+as.numeric(x1990T$X..Persons..10.to.14.years)+as.numeric(x1990T$X..Persons..15.to.17.years)
x1990T$Age2<-as.numeric(x1990T$X..Persons..18.to.24.years)+as.numeric(x1990T$X..Persons..25.to.34.years)
x1990T$Age3<-as.numeric(x1990T$X..Persons..35.to.44.years)+as.numeric(x1990T$X..Persons..45.to.54.years)+as.numeric(x1990T$X..Persons..55.to.64.years)
x1990T$Age4<-as.numeric(x1990T$X..Persons..65.to.74.years)+as.numeric(x1990T$X..Persons..75.to.84.years)+as.numeric(x1990T$X..Persons..85.years.and.over)
x1990T<-x1990T%>%
rename(Single = X..Persons.15.years.and.over..Never.married,
Married = X..Persons.15.years.and.over..Now.married..except.separated,
Separated=X..Persons.15.years.and.over..Separated,
Widowed=X..Persons.15.years.and.over..Widowed,
Divorced=X..Persons.15.years.and.over..Divorced)
x1990T<-x1990T%>%
rename(Educ1=X..Persons.25.years.and.over..Less.Than.High.School,
Educ2=X..Persons.25.years.and.over..High.school.graduate.or.more..includes.equivalency.,
Educ3=X..Persons.25.years.and.over..Some.college.or.more)
x1990T$Income1 <- as.numeric(x1990T$X..Households..Less.than..5.000..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Households...5.000.to..9.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
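# Income2 covers $10,000 to $14,999, as in 1980; the $10,000 to $12,499 column name is assumed to follow the export's naming pattern
x1990T$Income2 <- as.numeric(x1990T$X..Households...10.000.to..12.499..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Households...12.500.to..14.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)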
x1990T$Income3 <- as.numeric(x1990T$X..Households...15.000.to..17.499..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Households...17.500.to..19.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Households...20.000.to..22.499..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Households...22.500.to..24.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1990T$Income4 <- as.numeric(x1990T$X..Households...25.000.to..27.499..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Households...27.500.to..29.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Households...30.000.to..32.499..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Households...32.500.to..34.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Households...35.000.to..37.499..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Households...37.500.to..39.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Households...40.000.to..42.499..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Households...42.500.to..44.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Households...45.000.to..47.499..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Households...47.500.to..49.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Households...50.000.to..54.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Households...55.000.to..59.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Households...60.000.to..74.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Households...75.000.to..99.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Households...100.000.to..124.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Households...125.000.to..149.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Households...150.000.or.more..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
# Owner-occupied housing value groups: the single "Less than $20,000" bracket is divided equally between Household1 and Household2
x1990T$Household1 <- as.numeric(x1990T$X..Specified.owner.occupied.housing.units..Less.than..20.000..Dollars.adjusted.for.inflation.to.match.value.in.2010.)/2
x1990T$Household2<- as.numeric(x1990T$X..Specified.owner.occupied.housing.units..Less.than..20.000..Dollars.adjusted.for.inflation.to.match.value.in.2010.)/2
x1990T$Household3 <- as.numeric(x1990T$X..Specified.owner.occupied.housing.units...20.000.to..49.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Specified.owner.occupied.housing.units...50.000.to..99.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Specified.owner.occupied.housing.units...100.000.to..149.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1990T$Household4<-as.numeric(x1990T$X..Specified.owner.occupied.housing.units...150.000.to..299.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Specified.owner.occupied.housing.units...300.000.to..499.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x1990T$X..Specified.owner.occupied.housing.units...500.000.or.more..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x1990T<-x1990T%>%
rename(LaborP =X..Population.16.years.and.over..In.labor.force,
NoLaberP=X..Population.16.years.and.over..Not.in.labor.force,
CivilianLaborP = X..Population.16.years.and.over..In.labor.force..Civilian,
EmployedWorker = X..Population.16.years.and.over..In.labor.force..Civilian..Employed,
Unemployedworker = X..Population.16.years.and.over..In.labor.force..Civilian..Unemployed,
Occupation.ProfessionalWorkerP = X..Employed.persons.16.years.and.over..Professional.and.related.services,
Occupation.ManagersP = X..Employed.persons.16.years.and.over..Managerial.and.professional.specialty.occupations,
Occupation.ClericalP = X..Employed.persons.16.years.and.over..Finance..insurance..and.real.estate,
Occupation.CraftmanP = X..Employed.persons.16.years.and.over..Precision.production..craft..and.repair.occupations,
Occupation.DomesticServiceP = X..Employed.persons.16.years.and.over..Service.occupations..Private.household.occupations,
IndustryManufactory=X..Employed.persons.16.years.and.over..Manufacturing..nondurable.goods,
IndustryCommunication=X..Employed.persons.16.years.and.over..Communications.and.other.public.utilities,
IndustrySale=X..Employed.persons.16.years.and.over..Wholesale.trade,
Hospitality=X..Employed.persons.16.years.and.over..Entertainment.and.recreation.services,
IndustryRetail=X..Employed.persons.16.years.and.over..Retail.trade,
IndustryBusiness=X..Employed.persons.16.years.and.over..Business.and.repair.services,
Population=Total.Population,
PopulationDensity=Population.Density..per.sq..mile.)
x1990T<-x1990T%>%
select(Tract, Population, PopulationDensity, Male,MaleP,Female,FemaleP,White, Black, OtherRace, WhiteP, BlackP, OtherRaceP,
JapaneseP,ChineseP, FilipinoP,KoreanP, AsianIndianP, VietnameseP, HouseholdWhiteP,HouseholdBlackP,HouseholdAsianandPacificP,
AsianBelowPovertyLevelP,AsianAbovePovertyLevelP,Asian, AsianP,
Age1, Age2, Age3, Age4, Single, Married, Separated, Widowed, Divorced,Educ1, Educ2, Educ3,LaborP, NoLaberP, CivilianLaborP, EmployedWorker, Unemployedworker, Occupation.ProfessionalWorkerP,Occupation.ManagersP,
Occupation.ClericalP, Occupation.CraftmanP, Occupation.DomesticServiceP,IndustryManufactory,
IndustryCommunication,IndustrySale,Hospitality,IndustryRetail,IndustryBusiness,ForeignBorn,ForeignBornP,Income1,Income2,Income3,Income4, Household1, Household2, Household3,Household4,
ForeignBornafter1980,ForeignBorn1970to1979,ForeignBorn1960to1969,ForeignBornBefore1960)
x2000T$OtherRace<-as.numeric(x2000T$Total.Population.4)-as.numeric(x2000T$White.Alone)-as.numeric(x2000T$Black.or.African.American.Alone)
x2000T$OtherRaceP<-100-as.numeric(x2000T$X..White.Alone)-as.numeric(x2000T$X..Black.or.African.American.Alone)
x2000T$HouseholdAsianandPacificP<-as.numeric(x2000T$X..Households..with.a.Householder.Who.is.Asian.Alone)+as.numeric(x2000T$X..Households..with.a.Householder.Who.is.Native.Hawaiian.and.Other.Pacific.Islander.Alone)
x2000T<-x2000T%>%
rename(FemaleP=X..Female, MaleP=X..Male,
White=White.Alone,Black=Black.or.African.American.Alone,
WhiteP=X..White.Alone, BlackP=X..Black.or.African.American.Alone,
Asian=Asian.Alone, AsianP=X..Asian.Alone,
JapaneseP=X..Japanese,
ChineseP=X..Chinese..Except.Taiwanese,
FilipinoP=X..Filipino,
KoreanP=X..Korean,
TaiwaneseP=X..Taiwanese,
AsianIndianP=X..Asian.Indian,
VietnameseP=X..Vietnamese,
HouseholdWhiteP=X..Households..with.a.Householder.Who.is.White.Alone,
HouseholdBlackP=X..Households..with.a.Householder.Who.is.Black.or.African.American.Alone,
AsianBelowPovertyLevelP=X..Asian.Population.for.Whom.Poverty.Status.is.Determined..Income.in.1999.Below.Poverty.Level,
AsianAbovePovertyLevelP=X..Asian.Population.for.Whom.Poverty.Status.is.Determined..Income.in.1999.at.or.above.Poverty.Level,
ForeignBornP=X..Foreign.Born,
ForeignBorn=Foreign.Born,
ForeignBornBefore1960=X..Year.of.Entry.for.the.Foreign.Born.Population..Before.1965,
ForeignBorn1960to1969=X..Year.of.Entry.for.the.Foreign.Born.Population..1965.to.1969)
# The 2000 census groups year of entry as "Before 1965" and "1965 to 1969", so these two variables only approximate the pre-1960 and 1960 to 1969 groups used for 1990
x2000T$ForeignBorn1970to1979<-as.numeric(x2000T$X..Year.of.Entry.for.the.Foreign.Born.Population..1970.to.1974)+as.numeric(x2000T$X..Year.of.Entry.for.the.Foreign.Born.Population..1975.to.1979)
x2000T$ForeignBornafter1980<-as.numeric(x2000T$X..Year.of.Entry.for.the.Foreign.Born.Population..1980.to.1984)+as.numeric(x2000T$X..Year.of.Entry.for.the.Foreign.Born.Population..1985.to.1989)+as.numeric(x2000T$X..Year.of.Entry.for.the.Foreign.Born.Population..1990.to.1994)+
as.numeric(x2000T$X..Year.of.Entry.for.the.Foreign.Born.Population..1995.to.March.2000)
x2000T$Age1<-as.numeric(x2000T$X..Under.5.Years)+as.numeric(x2000T$X..5.to.9.Years)+as.numeric(x2000T$X..10.to.14.Years)+as.numeric(x2000T$X..15.to.17.Years)
x2000T$Age2<-as.numeric(x2000T$X..18.to.24.Years)+as.numeric(x2000T$X..25.to.34.Years)
x2000T$Age3<-as.numeric(x2000T$X..35.to.44.Years)+as.numeric(x2000T$X..45.to.54.Years)+as.numeric(x2000T$X..55.to.64.Years)
x2000T$Age4<-as.numeric(x2000T$X..65.to.74.Years)+as.numeric(x2000T$X..75.to.84.Years)+as.numeric(x2000T$X..85.Years.and.over)
x2000T<-x2000T%>%
rename(Single = X..Population.15.Years.and.Over..Never.Married,
Married = X..Population.15.Years.and.Over..Now.Married..not.Including.Separated.,
Separated=X..Population.15.Years.and.Over..Separated,
Widowed=X..Population.15.Years.and.Over..Widowed,
Divorced=X..Population.15.Years.and.Over..Divorced)
x2000T<-x2000T%>%
rename(Educ1=X..Population.25.Years.and.Over..Less.than.High.School,
Educ2=X..Population.25.Years.and.Over..High.School.Graduate.or.More..Includes.Equivalency.,
Educ3=X..Population.25.Years.and.Over..Some.College.or.more)
x2000T$Income1 <- as.numeric(x2000T$X..Household.Income..Less.than..10.000..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x2000T$Income2 <- as.numeric(x2000T$X..Household.Income...10.000.to..14.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x2000T$Income3 <- as.numeric(x2000T$X..Household.Income...15.000.to..19.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x2000T$X..Household.Income...20.000.to..24.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x2000T$Income4 <- as.numeric(x2000T$X..Household.Income...25.000.to..29.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x2000T$X..Household.Income...30.000.to..34.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x2000T$X..Household.Income...35.000.to..39.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x2000T$X..Household.Income...40.000.to..44.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
# The column name for the $45,000 to $49,999 bracket is assumed to follow the export's naming pattern
as.numeric(x2000T$X..Household.Income...45.000.to..49.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x2000T$X..Household.Income...50.000.to..59.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x2000T$X..Household.Income...60.000.to..74.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x2000T$X..Household.Income...75.000.to..99.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x2000T$X..Household.Income...100.000.to..124.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x2000T$X..Household.Income...125.000.to..149.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x2000T$X..Household.Income...150.000.to..199.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x2000T$X..Household.Income...200.000.or.more..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x2000T$Household1 <- as.numeric(x2000T$X..Value.for.All.Owner.Occupied.Housing.Units..Less.than..20.000..Dollars.adjusted.for.inflation.to.match.value.in.2010.)/2
x2000T$Household2<- as.numeric(x2000T$X..Value.for.All.Owner.Occupied.Housing.Units..Less.than..20.000..Dollars.adjusted.for.inflation.to.match.value.in.2010.)/2
x2000T$Household3 <- as.numeric(x2000T$X..Value.for.All.Owner.Occupied.Housing.Units...20.000.to..49.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x2000T$X..Value.for.All.Owner.Occupied.Housing.Units...50.000.to..99.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x2000T$X..Value.for.All.Owner.Occupied.Housing.Units...100.000.to..149.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x2000T$Household4<-as.numeric(x2000T$X..Value.for.All.Owner.Occupied.Housing.Units...150.000.to..299.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x2000T$X..Value.for.All.Owner.Occupied.Housing.Units...300.000.to..499.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x2000T$X..Value.for.All.Owner.Occupied.Housing.Units...500.000.to..749.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x2000T$X..Value.for.All.Owner.Occupied.Housing.Units...750.000.to..999.999..Dollars.adjusted.for.inflation.to.match.value.in.2010.)+
as.numeric(x2000T$X..Value.for.All.Owner.Occupied.Housing.Units...1.000.000.or.more..Dollars.adjusted.for.inflation.to.match.value.in.2010.)
x2000T<-x2000T%>%
rename(LaborP =X..Population.16.Years.and.Over..In.Labor.Force,
NoLaberP=X..Population.16.Years.and.Over..Not.in.Labor.Force,
CivilianLaborP = X..Population.16.Years.and.Over..In.Labor.Force..Civilian,
EmployedWorker = X..Population.16.Years.and.Over..In.Labor.Force..Civilian..Employed,
Unemployedworker = X..Population.16.Years.and.Over..In.Labor.Force..Civilian..Unemployed,
EmployedAsianLaborP=X..Asian.16.Years.Old.in.Civilian.Labor.Force..Employed,
UnemployedAsianLaborP=X..Asian.16.Years.Old.in.Civilian.Labor.Force..Unemployed,
Occupation.ProfessionalWorkerP = X..Employed.Civilian.Population.16.Years.and.Over..Professional.and.Related.Occupations,
Occupation.ManagersP = X..Employed.Civilian.Population.16.Years.and.Over..Management..Business..and.Financial.Operations.Occupations,
Occupation.ClericalP = X..Employed.Civilian.Population.16.Years.and.Over..Finance..Insurance..Real.Estate.and.Rental.and.Leasing,
Occupation.CraftmanP = X..Employed.Civilian.Population.16.Years.and.Over..Production.Occupations,
Occupation.DomesticServiceP =X..Employed.Civilian.Population.16.Years.and.Over..Personal.Care.and.Service.Occupations,
IndustryManufactory=X..Employed.Civilian.Population.16.Years.and.Over..Manufacturing,
IndustrySale=X..Employed.Civilian.Population.16.Years.and.Over..Wholesale.Trade,
IndustryPublishing=X..Employed.Civilian.Population.16.Years.and.Over..Information,
IndustryFood=X..Employed.Civilian.Population.16.Years.and.Over..Food.Preparation.and.Serving.Related.Occupations,
Hospitality=X..Employed.Civilian.Population.16.Years.and.Over..Arts..Entertainment..Recreation..Accommodation.and.Food.Services,
IndustryRetail=X..Employed.Civilian.Population.16.Years.and.Over..Retail.Trade,
IndustryBusiness=X..Employed.Civilian.Population.16.Years.and.Over..Sales.and.Related.Occupations,
Population=Total.Population,
PopulationDensity=Population.Density..per.sq..mile.)
x2000T<-x2000T%>%
select(Tract, Population, PopulationDensity, Male,MaleP,Female,FemaleP,White, Black, OtherRace, WhiteP, BlackP, OtherRaceP,
JapaneseP,ChineseP, FilipinoP,KoreanP, AsianIndianP, VietnameseP, HouseholdWhiteP,HouseholdBlackP,HouseholdAsianandPacificP,
AsianBelowPovertyLevelP,AsianAbovePovertyLevelP,Asian, AsianP,
Age1, Age2, Age3, Age4, Single, Married, Separated, Widowed, Divorced,Educ1, Educ2, Educ3,LaborP, NoLaberP, CivilianLaborP, EmployedWorker, Unemployedworker, Occupation.ProfessionalWorkerP,Occupation.ManagersP,
Occupation.ClericalP, Occupation.CraftmanP, Occupation.DomesticServiceP,IndustryManufactory,
IndustrySale,Hospitality,IndustryRetail,IndustryBusiness,ForeignBorn,ForeignBornP,Income1,Income2,Income3,Income4, Household1, Household2, Household3,Household4,
ForeignBornafter1980,ForeignBorn1970to1979,ForeignBorn1960to1969,ForeignBornBefore1960,TaiwaneseP,EmployedAsianLaborP,UnemployedAsianLaborP,IndustryPublishing,IndustryFood)
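The income and housing-value groups above are built by adding long runs of bracket columns by hand, which makes it easy to repeat or skip a bracket. Below is a minimal sketch of a helper that sums a set of bracket columns and stops if a column is listed twice or is absent. The function name `sum_brackets` and the toy column names are hypothetical and are not part of the original analysis.

```r
# Hypothetical helper: add up several bracket columns of a data frame.
# Stops with an error if a requested column is missing or listed twice.
sum_brackets <- function(data, cols) {
  if (anyDuplicated(cols) > 0) {
    stop("Bracket listed more than once: ",
         paste(unique(cols[duplicated(cols)]), collapse = ", "))
  }
  missing_cols <- setdiff(cols, names(data))
  if (length(missing_cols) > 0) {
    stop("Columns not found: ", paste(missing_cols, collapse = ", "))
  }
  # Census exports often store numbers as text, so convert before summing.
  values <- lapply(data[cols], function(x) as.numeric(as.character(x)))
  Reduce(`+`, values)
}

# Toy data with made-up column names standing in for census brackets.
toy <- data.frame(
  inc_15_20 = c("10.5", "8.0"),
  inc_20_25 = c("4.5", "2.0"),
  stringsAsFactors = FALSE
)
toy$Income3 <- sum_brackets(toy, c("inc_15_20", "inc_20_25"))
```

Listing the bracket names in a character vector keeps each group definition on a few short lines, so the definitions for different decades can be compared side by side.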
Merge dataset
x1940T$Year <- "1940"
x1950T$Year <- "1950"
x1960T$Year <- "1960"
x1970T$Year <- "1970"
x1980T$Year <- "1980"
x1990T$Year <- "1990"
x2000T$Year <- "2000"
# Convert every column to a factor so that the decade tables can be stacked even where a column's type differs between decades; values are converted back to numeric after the merge
col_names1940 <- names(x1940T)
col_names1950 <- names(x1950T)
col_names1960 <- names(x1960T)
col_names1970 <- names(x1970T)
col_names1980 <- names(x1980T)
col_names1990 <- names(x1990T)
col_names2000 <- names(x2000T)
x1940T[,col_names1940] <- lapply(x1940T[,col_names1940] , factor)
x1950T[,col_names1950] <- lapply(x1950T[,col_names1950] , factor)
x1960T[,col_names1960] <- lapply(x1960T[,col_names1960] , factor)
x1970T[,col_names1970] <- lapply(x1970T[,col_names1970] , factor)
x1980T[,col_names1980] <- lapply(x1980T[,col_names1980] , factor)
x1990T[,col_names1990] <- lapply(x1990T[,col_names1990] , factor)
x2000T[,col_names2000] <- lapply(x2000T[,col_names2000] , factor)
all_data <- bind_rows(x1940T,x1950T,x1960T,x1970T,x1980T,x1990T,x2000T)
all_data<- all_data%>%
mutate(Tract=gsub("Census Tract ","",Tract))%>%
mutate(Tract=gsub("0000","",Tract))%>%
mutate(Tract=gsub("116.0","116",Tract))%>%
mutate(Tract=gsub("117.0","117",Tract))%>%
mutate(Tract=gsub("118.0","118",Tract))%>%
mutate(Tract=gsub("119.0","119",Tract))%>%
mutate(Tract=gsub("478.0","478",Tract))%>%
mutate(Tract=gsub("479.0","479",Tract))%>%
mutate(Tract=gsub("480.0","480",Tract))%>%
mutate(Tract=gsub("481.0","481",Tract))
# Import the dataset that contains the theater, year, and tract variables.
Final<- read_csv("~/Desktop/theater/allyears/Final - Table.csv", col_types = cols(Year = col_integer(),TotalAssessedChange = col_number()))
# Merge the theater table with the census data on the matching keys Tract and Year
df = merge(x = Final, y = all_data, by = c("Tract", "Year"),all.x = TRUE)
# Columns 8 to 92 hold the census variables, stored as factors at this point; convert them back to numeric
df[, c(8:92)]<-apply(df[, c(8:92)],2,function(x) as.numeric(as.character(x)))
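The merge step depends on Tract and Year holding comparable values in both tables. The sketch below uses made-up data to show a left join that keeps every theater row and leaves the census columns as NA where a tract-year has no match. All object names and values are illustrative and are not taken from the project files.

```r
# Made-up theater table: one row per theater, tract, and year.
theaters <- data.frame(
  Tract = c("116", "117", "4800"),
  Year = c(1980L, 1990L, 2000L),
  Theaters = c("Theater A", "Theater B", "Theater C"),
  stringsAsFactors = FALSE
)

# Made-up census table; Year is stored as text, as in the stacked decade data.
census <- data.frame(
  Tract = c("116", "117"),
  Year = c("1980", "1990"),
  Population = c("3500", "5200"),
  stringsAsFactors = FALSE
)

# Put both keys on the same type before merging, so matching is explicit.
census$Year <- as.integer(census$Year)

# all.x = TRUE keeps every theater row (a left join).
merged <- merge(theaters, census, by = c("Tract", "Year"), all.x = TRUE)

# Convert the census variable from text to numeric after the merge.
merged$Population <- as.numeric(merged$Population)

# Rows without a census match show up as NA and can be counted.
n_unmatched <- sum(is.na(merged$Population))
```

Counting unmatched rows right after a merge is a quick way to catch tract labels that were not normalized to the same form in both tables.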
Recode some variables
# Make a new variable that categorizes Population into six levels
df$PopLevel[df$Population >5000]<- 6
df$PopLevel[df$Population > 4000 & 5000>=df$Population]<-5
df$PopLevel[df$Population > 3000 & 4000>=df$Population]<-4
df$PopLevel[df$Population > 2000 & 3000>=df$Population]<-3
df$PopLevel[df$Population > 1000 & 2000>=df$Population]<-2
df$PopLevel[1000>=df$Population]<-1
# Make a new variable that shows whether the theater is downtown or in the suburbs
df$Location[df$Theaters=="Kim Sing Theatre"|df$Theaters=="King Hing Theatre"|df$Theaters=="Pagoda Cinema"]<-"Downtown"
df$Location[df$Theaters=="Monterey Theatre"|df$Theaters=="Bard’s Garfield Egyptian Theatre\n"|df$Theaters=="Kuo Hwa 2 Cinema"|df$Theaters=="Kuo Hwa Theatre"]<-"Suburb"
DataDT <- split(df, df$Location)$Downtown
DataSB <- split(df, df$Location)$Suburb
# Turn some variables into factors
df$`Main Address`[df$`Main Address`== "\bYes" | df$`Main Address`=="Yes"] <-"Yes"
df$`Main Address`<-as.factor(as.character(df$`Main Address`))
df$status<-factor(df$Opening,labels=c("Closing Entire Decade","Opening Several Years", "Opening Entire Decade"))
df$Address<-factor(df$`Main Address`,labels=c("Theaters Neighborhood","Theater Main Location"))
df$Tract<-as.factor(df$Tract)
df$Theaters<-as.factor(df$Theaters)
df$Year<-as.factor(df$Year)
df$PopLevel<-as.factor(df$PopLevel)
df$Location<-as.factor(df$Location)
df$AgeYoung<-df$Age1+df$Age2
df$LowEduc<-df$Educ1
df<-df%>%
select(Tract, Year,Theaters, AgeYoung, Opening, "Financial Health", TotalAssessedChange, "Main Address", Location,PopLevel,PopulationDensity, MaleP,WhiteP,BlackP,OtherRaceP, ForeignBornP, LaborP, NoLaberP, CivilianLaborP, EmployedWorker, Unemployedworker, Occupation.ProfessionalWorkerP,Occupation.ManagersP, Occupation.ClericalP, Occupation.CraftmanP,Occupation.DomesticServiceP, Household1,Household2,Household3, Income1,Income2,Income3,Income4,Single,Married, Separated, Widowed, Divorced, IndustryManufactory,IndustryPublishing, Hospitality , IndustryRetail, IndustryBusiness,JapaneseP, ChineseP, FilipinoP, KoreanP, AsianIndianP, VietnameseP,HouseholdAsianandPacificP, AsianBelowPovertyLevelP, AsianP, ForeignBornafter1980,ForeignBorn1970to1979, ForeignBorn1960to1969,ForeignBornBefore1960,LowEduc)
df$Opening <- factor(df$Opening)
df$`Financial Health` <- factor(df$`Financial Health`)
df$Address<-factor(df$`Main Address`,labels=c("Theaters Neighborhood","Theater Main Location"))
df$status<-factor(df$Opening,labels=c("Closing Entire Decade","Opening Several Years", "Opening Entire Decade"))
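The six population levels created earlier in this section can also be produced in one call with base R's `cut()`. The sketch below uses made-up population counts, and the object names are illustrative.

```r
# Made-up tract populations covering each level and the boundary values.
population <- c(800, 1000, 1001, 2500, 3000, 4000, 4500, 5000, 5001)

# Intervals are (lower, upper], matching the rules above:
# <= 1000 is level 1, (1000, 2000] is level 2, ..., > 5000 is level 6.
pop_level <- cut(
  population,
  breaks = c(-Inf, 1000, 2000, 3000, 4000, 5000, Inf),
  labels = 1:6,
  right = TRUE
)
```

Because `cut()` returns a factor, the result needs no separate `as.factor()` step, and each boundary is written once instead of twice.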
Dataset basic summary
library(qacEDA)
contents(df)
##
## The data frame df has 147 observations and 59 variables.
##
## Overall
## pos varname type n_unique n_miss pct_miss
## 1 Tract factor 42 0 0%
## 2 Year factor 7 0 0%
## 3 Theaters factor 7 0 0%
## 4 AgeYoung numeric 88 0 0%
## 5 Opening factor 3 0 0%
## 6 Financial Health factor 5 0 0%
## 7 TotalAssessedChange numeric 40 87 59%
## 8 Main Address factor 2 0 0%
## 9 Location factor 2 0 0%
## 10 PopLevel factor 6 0 0%
## 11 PopulationDensity numeric 102 0 0%
## 12 MaleP numeric 80 0 0%
## 13 WhiteP numeric 86 0 0%
## 14 BlackP numeric 71 0 0%
## 15 OtherRaceP numeric 86 0 0%
## 16 ForeignBornP numeric 87 0 0%
## 17 LaborP numeric 85 0 0%
## 18 NoLaberP numeric 85 0 0%
## 19 CivilianLaborP numeric 86 0 0%
## 20 EmployedWorker numeric 86 0 0%
## 21 Unemployedworker numeric 80 0 0%
## 22 Occupation.ProfessionalWorkerP numeric 83 0 0%
## 23 Occupation.ManagersP numeric 84 0 0%
## 24 Occupation.ClericalP numeric 77 0 0%
## 25 Occupation.CraftmanP numeric 81 0 0%
## 26 Occupation.DomesticServiceP numeric 72 0 0%
## 27 Household1 numeric 41 46 31%
## 28 Household2 numeric 42 46 31%
## 29 Household3 numeric 50 46 31%
## 30 Income1 numeric 75 21 14%
## 31 Income2 numeric 73 21 14%
## 32 Income3 numeric 64 21 14%
## 33 Income4 numeric 69 21 14%
## 34 Single numeric 74 21 14%
## 35 Married numeric 77 21 14%
## 36 Separated numeric 58 42 29%
## 37 Widowed numeric 64 42 29%
## 38 Divorced numeric 58 42 29%
## 39 IndustryManufactory numeric 62 42 29%
## 40 IndustryPublishing numeric 34 84 57%
## 41 Hospitality numeric 61 42 29%
## 42 IndustryRetail numeric 61 42 29%
## 43 IndustryBusiness numeric 57 42 29%
## 44 JapaneseP numeric 37 84 57%
## 45 ChineseP numeric 42 84 57%
## 46 FilipinoP numeric 40 84 57%
## 47 KoreanP numeric 37 84 57%
## 48 AsianIndianP numeric 34 84 57%
## 49 VietnameseP numeric 40 84 57%
## 50 HouseholdAsianandPacificP numeric 44 84 57%
## 51 AsianBelowPovertyLevelP numeric 41 85 58%
## 52 AsianP numeric 44 84 57%
## 53 ForeignBornafter1980 numeric 29 105 71%
## 54 ForeignBorn1970to1979 numeric 30 105 71%
## 55 ForeignBorn1960to1969 numeric 28 105 71%
## 56 ForeignBornBefore1960 numeric 28 105 71%
## 57 LowEduc numeric 86 0 0%
## 58 Address factor 2 0 0%
## 59 status factor 3 0 0%
##
## Numeric Variables
## n mean sd skew min p25 median
## AgeYoung 147 53.35 9.37 0.41 25.47 47.86 52.41
## TotalAssessedChange 60 64.87 114.74 2.06 -61.06 17.43 19.51
## PopulationDensity 147 9879.44 6301.17 1.37 36.21 5734.39 8730.02
## MaleP 147 51.63 9.09 1.69 21.43 47.03 48.22
## WhiteP 147 60.38 35.39 -0.31 4.66 26.28 68.88
## BlackP 147 5.42 10.86 2.94 0.00 0.15 0.97
## OtherRaceP 147 34.20 35.27 0.56 0.00 1.66 21.75
## ForeignBornP 147 40.95 24.50 0.28 0.00 19.59 36.22
## LaborP 147 51.14 11.02 -1.84 0.17 48.93 52.69
## NoLaberP 147 48.86 11.02 1.84 35.19 41.96 47.31
## CivilianLaborP 147 50.73 11.23 -1.73 0.17 47.21 52.66
## EmployedWorker 147 46.46 11.53 -1.22 0.17 40.45 49.53
## Unemployedworker 147 4.26 2.95 1.65 0.00 2.38 3.58
## Occupation.ProfessionalWorkerP 147 10.87 6.99 0.20 0.00 6.31 9.20
## Occupation.ManagersP 147 10.36 6.94 0.95 0.00 6.36 7.53
## Occupation.ClericalP 147 11.44 8.90 0.96 0.00 5.63 9.11
## Occupation.CraftmanP 147 11.43 6.65 1.83 0.00 8.55 9.88
## Occupation.DomesticServiceP 147 1.70 1.70 1.30 0.00 0.50 1.10
## Household1 101 0.99 1.55 1.86 0.00 0.00 0.24
## Household2 101 1.94 3.84 3.01 0.00 0.00 0.32
## Household3 101 62.39 41.85 -0.40 0.00 11.55 94.71
## Income1 126 32.85 29.79 1.13 0.00 11.04 25.81
## Income2 126 7.21 4.77 0.33 0.00 3.31 7.04
## Income3 126 14.47 9.14 0.01 0.00 9.91 14.57
## Income4 126 43.50 26.44 -0.32 0.00 20.75 42.44
## Single 126 29.17 8.90 0.40 14.14 23.34 28.35
## Married 126 52.39 10.84 0.02 29.31 47.57 50.91
## Separated 105 3.54 3.27 2.24 0.59 1.83 2.54
## Widowed 105 9.17 2.94 0.00 2.98 6.84 9.24
## Divorced 105 6.46 2.86 0.80 1.90 4.71 6.49
## IndustryManufactory 105 14.74 10.98 0.53 0.00 5.79 9.93
## IndustryPublishing 63 2.20 1.51 0.85 0.00 1.60 2.08
## Hospitality 105 7.53 9.12 1.52 0.00 1.91 3.21
## IndustryRetail 105 15.48 9.26 1.30 0.00 10.00 12.40
## IndustryBusiness 105 5.78 4.01 1.37 0.00 3.22 4.64
## JapaneseP 63 6.33 8.67 1.42 0.36 0.53 2.11
## ChineseP 63 68.20 19.79 -1.07 7.27 58.04 73.16
## FilipinoP 63 4.58 5.36 1.34 0.07 0.42 2.40
## KoreanP 63 2.84 3.95 1.82 0.08 0.30 0.89
## AsianIndianP 63 0.88 1.16 2.16 0.00 0.27 0.38
## VietnameseP 63 12.32 5.66 1.07 4.29 7.48 11.67
## HouseholdAsianandPacificP 63 51.70 29.73 -0.28 0.00 22.23 55.43
## AsianBelowPovertyLevelP 62 26.68 11.30 2.13 8.00 21.60 23.21
## AsianP 63 54.83 42.86 -0.29 0.03 0.82 67.78
## ForeignBornafter1980 42 69.63 10.86 -0.33 40.06 62.81 67.28
## ForeignBorn1970to1979 42 20.13 7.74 0.73 8.75 13.03 21.20
## ForeignBorn1960to1969 42 5.09 2.67 0.25 0.00 3.37 4.56
## ForeignBornBefore1960 42 5.15 1.98 0.38 1.30 4.17 5.14
## LowEduc 147 41.91 19.35 0.43 0.00 26.69 37.23
## p75 max
## AgeYoung 58.59 79.46
## TotalAssessedChange 32.78 443.68
## PopulationDensity 14267.88 40489.24
## MaleP 51.37 91.31
## WhiteP 97.84 100.00
## BlackP 3.77 57.39
## OtherRaceP 65.87 94.03
## ForeignBornP 58.05 86.04
## LaborP 58.04 64.81
## NoLaberP 51.07 99.83
## CivilianLaborP 57.85 64.70
## EmployedWorker 54.16 62.43
## Unemployedworker 4.90 13.40
## Occupation.ProfessionalWorkerP 16.48 25.90
## Occupation.ManagersP 14.44 31.03
## Occupation.ClericalP 16.78 34.47
## Occupation.CraftmanP 14.05 34.19
## Occupation.DomesticServiceP 2.35 6.88
## Household1 1.32 8.11
## Household2 2.47 17.64
## Household3 99.88 100.00
## Income1 39.75 97.60
## Income2 10.87 18.30
## Income3 21.94 36.60
## Income4 67.64 93.25
## Single 33.63 50.42
## Married 60.27 75.84
## Separated 3.27 16.46
## Widowed 11.31 14.57
## Divorced 8.28 19.05
## IndustryManufactory 24.72 42.59
## IndustryPublishing 2.83 7.06
## Hospitality 8.53 30.46
## IndustryRetail 18.57 39.53
## IndustryBusiness 6.88 22.22
## JapaneseP 6.88 28.41
## ChineseP 83.12 88.50
## FilipinoP 6.61 21.56
## KoreanP 4.04 16.62
## AsianIndianP 1.08 5.61
## VietnameseP 16.92 32.54
## HouseholdAsianandPacificP 78.84 87.49
## AsianBelowPovertyLevelP 29.44 83.50
## AsianP 99.50 100.00
## ForeignBornafter1980 80.34 88.09
## ForeignBorn1970to1979 24.41 45.89
## ForeignBorn1960to1969 7.50 11.15
## ForeignBornBefore1960 6.65 10.84
## LowEduc 58.52 80.24
##
## Categorical Variables
## variable level n pct
## Tract 116 2 0.01
## 117 2 0.01
## 118 6 0.04
## 119 8 0.05
## 2060.10 4 0.03
## 2060.20 4 0.03
## 2061 3 0.02
## 2065 4 0.03
## 2071 1 0.01
## 2071.01 4 0.03
## (32 more levels) 109 0.74
## Year 1940 21 0.14
## 1950 21 0.14
## 1960 21 0.14
## 1970 21 0.14
## 1980 21 0.14
## 1990 21 0.14
## 2000 21 0.14
## Theaters Bard’s Garfield Egyp 21 0.14
## Kim Sing Theatre 21 0.14
## King Hing Theatre 21 0.14
## Kuo Hwa 2 Cinema 21 0.14
## Kuo Hwa Theatre 21 0.14
## Monterey Theatre 21 0.14
## Pagoda Cinema 21 0.14
## Opening 0 78 0.53
## 1 36 0.24
## 2 33 0.22
## Financial Health 0 66 0.45
## 1 39 0.27
## 2 3 0.02
## 3 3 0.02
## 4 36 0.24
## Main Address No 98 0.67
## Yes 49 0.33
## Location Downtown 63 0.43
## Suburb 84 0.57
## PopLevel 1 6 0.04
## 2 28 0.19
## 3 27 0.18
## 4 21 0.14
## 5 25 0.17
## 6 40 0.27
## Address Theaters Neighborhoo 98 0.67
## Theater Main Locatio 49 0.33
## status Closing Entire Decad 78 0.53
## Opening Several Year 36 0.24
## Opening Entire Decad 33 0.22
Dependent Variables:
Opening Status
0-Closing for The Entire Decade
1-Opening for a Period of Time in This Decade
2-Opening for the Entire Decade
Financial Health
0-unknown or closing
1-Suspend or Termination while opening
2-at least a penalty
3-at least 1 SI
4-very healthy - filing the tax for the existing year
df$Opening <- factor(df$Opening,
levels=c(0,1,2),
ordered = FALSE)
df$`Financial Health` <- factor(df$`Financial Health`,
levels=c(0,1,2,3,4),
ordered = TRUE)
Independent Variables
PopLevel
1-Population less than 1000
2-Population 1000-2000
3-Population 2000-3000
4-Population 3000-4000
5-Population 4000-5000
6-Population more than 5000
AgeYoung: Proportion of the population aged 35 and younger
TotalAssessedChange: Percentage change in the total assessed value of the location’s parcel in each decade
LowEduc: Proportion of the population that has never attended high school
Household Value
Household1 - Percent of households valued at less than 10000
Household2 - Percent of households valued between 10000 and 20000
Household3 - Percent of households valued at more than 20000
Household Income
Income1 -> Proportion of people with household income less than 10000
Income2 -> Proportion of people with household income of 10000-15000
Income3 -> Proportion of people with household income of 15000-25000
Income4 -> Proportion of people with household income more than 25000
MaleP: Proportion of males in the total population
Race
WhiteP: Proportion of the population that is White
BlackP: Proportion of the population that is Black
OtherRaceP: Proportion of the population that is of other races
AsianP: Proportion of the population that is Asian
ChineseP: Proportion of the population that is Chinese
FilipinoP: Proportion of the population that is Filipino
KoreanP: Proportion of the population that is Korean
AsianIndianP: Proportion of the population that is Asian Indian
VietnameseP: Proportion of the population that is Vietnamese
HouseholdAsianandPacificP: Proportion of households that are Asian and Pacific
ForeignBornP: Proportion of the population born in foreign countries
Labor
LaborP: Proportion of the population in the labor force
NoLaberP: Proportion of the population not in the labor force
CivilianLaborP: Proportion of the population in the civilian labor force
EmployedWorker: Proportion of the population that is in the civilian labor force and employed
Unemployedworker: Proportion of the population that is in the civilian labor force and unemployed
Occupation
Occupation.ProfessionalWorkerP: Proportion of the population (16+) in professional specialty occupations
Occupation.ManagersP: Proportion of the population (16+) in executive, administrative, and managerial occupations
Occupation.ClericalP: Proportion of the population (16+) in administrative support occupations, including clerical
Occupation.CraftmanP: Proportion of the population (16+) working as craftsmen, foremen, and kindred workers
Occupation.DomesticServiceP: Proportion of the population (16+) in private household occupations
Marriage Status
Single: Proportion of the population (16+) that is single
Married: Proportion of the population (16+) that is married
Separated: Proportion of the population (16+) that is separated
Widowed: Proportion of the population (16+) that is widowed
Divorced: Proportion of the population (16+) that is divorced
Industry
IndustryManufactory: Proportion of the population (16+) working in the manufacturing industry
IndustryPublishing: Proportion of the population (16+) working in the printing and publishing industry
Hospitality: Proportion of the population (16+) working in the food, entertainment, and recreation services industry
IndustryRetail: Proportion of the population (16+) working in the retail trade industry
IndustryBusiness: Proportion of the population (16+) working in the business industry
Foreign Born Year
ForeignBornafter1980: Proportion of the population (16+) born in foreign countries after 1980
ForeignBorn1970to1979: Proportion of the population (16+) born in foreign countries from 1970 to 1979
ForeignBorn1960to1969: Proportion of the population (16+) born in foreign countries from 1960 to 1969
ForeignBornBefore1960: Proportion of the population (16+) born in foreign countries before 1960
AsianBelowPovertyLevelP: Proportion of the Asian population below the standard poverty level
Data Visualization with Statistical Plots
library(ggplot2)
library("ggthemes")
ggplot(df, aes(Year, ForeignBornP, color=Location, shape=Address))+
geom_point(size = 4)+
theme_light()+
ylab("Percentage of Birth in Foreign Countries")+
xlab("Year")+
ggtitle("Proportion of Foreign Born in the Population from 1940 to 2000")
ggplot(data=df)+
geom_bar(aes(x=Year,
fill=status),
position="fill", width=0.5)+
scale_fill_brewer("Opening Status",palette = "BuPu")+
facet_grid(Location~.)+
ylab("Opening Status Proportion")+
xlab("Year")+
theme_bw()
As the graph shows, the proportion of the population born in foreign countries increased from the 1940s to the 1970s in both the downtown and suburban areas. During this period, the foreign-born proportion was higher downtown than in the suburbs. This could be explained by the new Chinatown built in the 1930s, which is exactly where the downtown theaters are located.
Beginning in the 1960s, Chinatown became much more diverse in the backgrounds of its Chinese residents and business people. Immigrants, whose numbers grew steadily between the 1950s and the 1990s, came from many different parts of China, as well as from Hong Kong, Taiwan, and Southeast Asia. Chinatown changed particularly fast during the 1980s, as more and more Chinese from Southeast Asia opened businesses there. This is consistent with theaters being more likely to be open from 1970 to 1990, especially downtown.
From 1970 to 2000, meanwhile, more and more foreign-born residents came to the suburbs, especially the San Gabriel Valley, where the suburban theaters are located. This could relate to the new wave of Asian immigration that began in the 1960s and peaked in 1990. This new ethnoburb mainly attracted Chinese immigrants from Hong Kong, Taiwan, and Mainland China, which could also explain why some Chinese movie theaters that specifically showed Hong Kong films survived through the 1980s and 1990s.
It is worth noting that the post-1980 globalization trend has profoundly shaped Asian immigration, assimilation, and development in the San Gabriel Valley. The San Gabriel Valley Chinese community was created under global, national, and local contexts and had stronger global connections and internal stratification.
df$FinancialStatus<- factor(df$`Financial Health`, labels = c("Unknown or Closing","Suspend or Termination While Opening","At Least a Penalty","At Least 1 SI-Minor Warn","Very Healthy-Always Filing The Tax"))
ggplot(data=df)+
geom_bar(aes(x=Year,
fill=FinancialStatus),
position="fill", width=0.5)+
scale_fill_brewer("Financial Health")+
facet_grid(Location~.)+
ylab("Financial Health")+
xlab("Year")+
theme_bw()
The distribution of financial health from 1940 to 2000 is quite similar to that of opening status, and both response variables peak around 1970. While the opening status of downtown and suburban theaters follows different paths over time, their financial health follows a similar trend. Downtown theaters performed well financially in the 1960s and 1970s but not in other decades. This may be caused by missing tax records for some decades, but it could also reflect Chinatown becoming more diversified and more financially successful as more immigrants arrived from 1960, followed by the movement of Asian residents from downtown to the suburbs from 1980.
Although the financial health of suburban theaters did not change considerably during this period, it still deteriorated from 1980, similar to the downtown trend. It is important to examine the factors that could relate to this pattern and affect the opening status of the theaters. Because our dependent variables are categorical and ordinal, I first use ordinal regression (a statistical technique that predicts an ordinal dependent variable from a set of independent variables) to see whether those variables are associated with the outcomes. Later I fit a multilevel logistic regression (a technique that estimates the probability of a yes/no outcome while accounting for dependency in the data, such as pupils nested in classrooms), treating location (Downtown or Suburb) as the higher-level group and the theaters within these two clusters as the lower-level groups.
Choosing the variables
Because we have many candidate variables and few observations, we reduce the set by testing each variable’s significance against the dependent variables separately, and keep ten variables for the final regression. A group of variables is recorded only from 1980 to 2000 and therefore has many missing values. I exclude these from the final regression and instead include them in an additional regression restricted to observations between 1980 and 2000. Here is an example of how I screened variables by their p-values. It is also important to check the correlation between predictors, since we do not want predictors to be strongly correlated.
library(ordinal)
##
## Attaching package: 'ordinal'
## The following object is masked from 'package:dplyr':
##
## slice
library(reshape2)
library(lmtest)
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
##Test significance level of variables which have few missing values
mod1<-clm(`Financial Health`~Theaters+Location+AgeYoung, data=df)
summary(mod1)
## formula: `Financial Health` ~ Theaters + Location + AgeYoung
## data: df
##
## link threshold nobs logLik AIC niter max.grad cond.H
## logit flexible 147 -135.29 292.59 7(2) 2.91e-13 4.8e+05
##
## Coefficients: (1 not defined because of singularities)
## Estimate Std. Error z value Pr(>|z|)
## TheatersKim Sing Theatre -1.93635 0.62652 -3.091 0.001997 **
## TheatersKing Hing Theatre -0.66423 0.63620 -1.044 0.296457
## TheatersKuo Hwa 2 Cinema -3.22192 0.78058 -4.128 3.67e-05 ***
## TheatersKuo Hwa Theatre 2.67054 0.76922 3.472 0.000517 ***
## TheatersMonterey Theatre -2.42591 0.66508 -3.648 0.000265 ***
## TheatersPagoda Cinema -0.87559 0.59759 -1.465 0.142864
## LocationSuburb NA NA NA NA
## AgeYoung 0.01573 0.01885 0.835 0.403957
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Threshold coefficients:
## Estimate Std. Error z value
## 0|1 -0.6144 1.0845 -0.567
## 1|2 1.2072 1.0888 1.109
## 2|3 1.3933 1.0896 1.279
## 3|4 1.5743 1.0897 1.445
#Finding correlations between independent variables
sub1<-subset(df, select=c("Opening","OtherRaceP","LowEduc","Unemployedworker","Occupation.ProfessionalWorkerP","Occupation.DomesticServiceP","Income2","Married","IndustryManufactory","IndustryBusiness"))
sub1<-data.frame(apply(sub1,2,function(x) as.numeric(as.character(x))))
cor_plot(sub1,number=TRUE)
In the correlation matrix, the red color represents the negative correlation and the purple color represents the positive correlation. The lighter the color, the weaker the correlation is; the darker the color, the stronger the correlation is.
The plot shows which pairs of variables are strongly correlated, so we can choose independent variables that are not strongly related to each other but may relate to the dependent variables. For instance, the proportion of professional workers and the proportion of people with a low education level are strongly negatively correlated: the more people with low education in an area, the fewer professional workers there are. The proportion of professional workers is also negatively correlated with Income2 (proportion of people with household income of 10000-15000). Income2 is the middle income band, so its share can be read as the proportion of the middle class, and the correlation suggests that the greater the middle-class proportion, the fewer people work in the professional sector. Thus, I rule out the professional-worker variable, since it could be represented by either LowEduc or Income2. Based on this rule, the correlation levels, and the one-to-one significance tests, I choose ten variables for the final regression model (not including those that only appear in later decades).
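The visual check above can also be done numerically. The following is a minimal sketch in base R, assuming simulated toy data in place of the census variables; the 0.7 cutoff and the object names `toy`, `cutoff`, and `high_pairs` are illustrative assumptions, not values or names used in this study.

```r
# Toy data standing in for the census predictors (simulated, not from the study).
set.seed(1)
n <- 100
low_educ <- runif(n, 0, 80)
toy <- data.frame(
  LowEduc = low_educ,
  Occupation.ProfessionalWorkerP = 25 - 0.25 * low_educ + rnorm(n, sd = 2),
  Married = rnorm(n, mean = 52, sd = 10)
)

# Pairwise correlations; pairwise.complete.obs tolerates missing values.
cm <- cor(toy, use = "pairwise.complete.obs")

# Keep each pair once (upper triangle) and flag those above a chosen cutoff.
cutoff <- 0.7  # hypothetical threshold, not one stated in the text
idx <- which(upper.tri(cm) & abs(cm) > cutoff, arr.ind = TRUE)
high_pairs <- data.frame(
  var1 = rownames(cm)[idx[, "row"]],
  var2 = colnames(cm)[idx[, "col"]],
  r    = cm[idx]
)
print(high_pairs)
```

A table like `high_pairs` makes it easier to decide which member of a strongly correlated pair to drop before fitting the final model.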
Ordinal Regression (Opening Status)
We finally chose ten variables in the final regression based on the correlation plot and their significance levels with the opening status.
require(ordinal)
modFinal<-clm(Opening~Theaters+Year+OtherRaceP+LowEduc+Unemployedworker+Occupation.DomesticServiceP+Income2+Married+IndustryManufactory+IndustryBusiness, data=df)
summary(modFinal)
## formula:
## Opening ~ Theaters + Year + OtherRaceP + LowEduc + Unemployedworker + Occupation.DomesticServiceP + Income2 + Married + IndustryManufactory + IndustryBusiness
## data: df
##
## link threshold nobs logLik AIC niter max.grad cond.H
## logit flexible 105 -50.49 140.99 7(0) 6.68e-11 8.9e+06
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## TheatersKim Sing Theatre -3.08724 2.53960 -1.216 0.22412
## TheatersKing Hing Theatre -2.15027 2.54552 -0.845 0.39826
## TheatersKuo Hwa 2 Cinema 0.33004 1.07894 0.306 0.75969
## TheatersKuo Hwa Theatre 4.51794 1.30949 3.450 0.00056 ***
## TheatersMonterey Theatre -0.93464 1.06410 -0.878 0.37976
## TheatersPagoda Cinema 1.52522 2.55507 0.597 0.55055
## Year1970 1.36166 1.23412 1.103 0.26988
## Year1980 -1.83230 3.62106 -0.506 0.61285
## Year1990 -4.80889 3.49219 -1.377 0.16850
## Year2000 -14.65039 5.51489 -2.657 0.00790 **
## OtherRaceP 0.04548 0.03233 1.407 0.15948
## LowEduc 0.07495 0.06247 1.200 0.23026
## Unemployedworker -0.38432 0.31373 -1.225 0.22058
## Occupation.DomesticServiceP 0.33000 0.47039 0.702 0.48296
## Income2 0.08883 0.19861 0.447 0.65467
## Married -0.18642 0.07903 -2.359 0.01833 *
## IndustryManufactory 0.07362 0.09901 0.744 0.45712
## IndustryBusiness 0.49258 0.28407 1.734 0.08291 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Threshold coefficients:
## Estimate Std. Error z value
## 0|1 -5.993 5.007 -1.197
## 1|2 -1.549 4.969 -0.312
## (42 observations deleted due to missingness)
exp(-0.18642)
## [1] 0.829925
exp(0.49258)
## [1] 1.636533
##Data Visualization based on the regression
new_obs1<-expand.grid(Theaters="Kim Sing Theatre",
Year=c("1960","1970","1980","1990","2000"),
OtherRaceP=mean(df$OtherRaceP,na.rm = TRUE ),
LowEduc=mean(df$LowEduc,na.rm = TRUE),
Unemployedworker=mean(df$Unemployedworker,na.rm = TRUE),
Occupation.ProfessionalWorkerP=mean(df$Occupation.ProfessionalWorkerP,na.rm = TRUE),
Occupation.DomesticServiceP=mean(df$Occupation.DomesticServiceP,na.rm = TRUE),
Income2=mean(df$Income2,na.rm = TRUE),
Married=seq(0,80,by=20),
IndustryManufactory=mean(df$IndustryManufactory,na.rm = TRUE),
IndustryBusiness=mean(df$IndustryBusiness,na.rm = TRUE))
predictions1<-predict(modFinal,new_obs1,type="p")
prediction_data1<-cbind(new_obs1,predictions1)
prediction_data1_long <-melt(prediction_data1, id=c("Married","Year","Theaters","OtherRaceP","LowEduc","Unemployedworker","Occupation.ProfessionalWorkerP","Occupation.DomesticServiceP","Income2","IndustryManufactory","IndustryBusiness"))
prediction_data1_long$status<-factor(prediction_data1_long$variable,labels=c("Closing Entire Decade","Opening Several Years", "Opening Entire Decade"))
prediction_data1_long1<-prediction_data1_long[,c("Theaters","Married","value","status","Year")]
ggplot(data=prediction_data1_long1)+
geom_area(aes(x=Married, y=value, fill=status),
stat="identity", alpha=0.5)+
facet_grid(.~Year, labeller = label_both)+
scale_fill_manual("Theater Opening Status",values=c("pink","blue","navy"))+
ylab("Predicted Cumulative Proportions")+
xlab("Proportion of Married Population")+
theme_clean()
As the model summary shows, after controlling for theater and year, two more variables stand out: the marriage rate (p = 0.018) and the business industry share (p = 0.083, significant only at the 10% level). The marriage rate has a negative relationship with opening status: the larger the married share of an area’s population in a decade, the more likely the theaters are to be closed. This suggests that married people are less likely to go to Chinese movie theaters, which ties marriage to leisure activities. The plot shows that from 1960 to 2000, the decades covered by the fitted model, a higher marriage rate lowers the predicted probability of a theater being open. However, its impact on the predicted probabilities varies over decades. In 2000, theaters were unlikely to be open even where the marriage rate was low, so in the later decades the marriage rate matters little for the status of Chinese movie theaters. This raises another question: what else contributed to their closure?
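To make the link between the model and the predicted proportions in the plot concrete, here is a minimal base R sketch of how a cumulative logit model turns a linear predictor into category probabilities. The thresholds are copied from the summary(modFinal) output; the `eta` values and the helper `category_probs` are hypothetical and serve only as an illustration.

```r
# Thresholds copied from the summary(modFinal) output above.
theta <- c("0|1" = -5.993, "1|2" = -1.549)

# Category probabilities for a given linear predictor eta = x'beta.
# clm uses logit(P(Y <= j)) = theta_j - eta.
category_probs <- function(eta, theta) {
  cum <- c(plogis(theta - eta), 1)   # P(Y <= 0), P(Y <= 1), P(Y <= 2)
  probs <- diff(c(0, cum))           # successive differences give P(Y = j)
  names(probs) <- c("closed", "open_part", "open_all")
  probs
}

# eta values are hypothetical; they only illustrate the direction of the effect.
p_low  <- category_probs(eta = -4, theta = theta)
p_high <- category_probs(eta = -1, theta = theta)
print(round(rbind(p_low, p_high), 3))
```

Because the coefficient on Married is negative, each additional unit of Married lowers `eta` and shifts probability toward the closed category, which is the pattern the area plot displays.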
boxplot(Married~Year,data=df, main="Final Data",
xlab="Year", ylab="Proportion of Married People in the Population")
Another possibility is that the marriage rate relates to immigration, as the marriage rate was unusual in 1980 and 1990, which corresponds to the time when a new wave of immigration appeared. It may also reflect new trends in people’s lifestyles and recreational activities in the 1980s and 1990s, and I am still exploring the factors that could account for this phenomenon.
More specifically, the model works on the log-odds scale: each one-unit (one percentage point) increase in the married proportion lowers the log odds of a theater being in a higher opening status by 0.186. That is, as the proportion of married people increases by 1 unit, the odds of the theater being in a higher opening status are multiplied by exp(-0.186) = 0.83, holding other variables fixed.
The other variable associated with opening status is the size of the business industry, which has a positive relationship with the dependent variable, though only marginally (p = 0.083). The more people work in the business industry, the more likely the theaters in the area are to stay open in that period. To be more specific, as the proportion of people in the business industry increases by 1 unit, the odds of the theater being in a higher opening status are multiplied by exp(0.493) = 1.64, holding other variables fixed. This is reasonable, since the more prosperous the area is, the more likely the theaters are to be open. Both the downtown area and the suburbs became clusters where Chinese residents achieved financial success. Since Chinatown’s economy mainly relies on hospitality, I also tested whether hospitality correlates strongly with the theaters’ opening status. The two are not highly related, which prompts us to consider businesses in those clusters other than the food and recreation industry.
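The odds-ratio arithmetic for these two coefficients can be sketched in base R as follows. The estimates and standard errors are copied from the summary(modFinal) output; the 95% Wald interval is an addition for illustration and is not something the analysis above reports.

```r
# Estimates and standard errors copied from the summary(modFinal) output.
coefs <- data.frame(
  term     = c("Married", "IndustryBusiness"),
  estimate = c(-0.18642, 0.49258),
  se       = c(0.07903, 0.28407)
)

z <- qnorm(0.975)  # two-sided 95% Wald interval
coefs$odds_ratio <- exp(coefs$estimate)
coefs$ci_low     <- exp(coefs$estimate - z * coefs$se)
coefs$ci_high    <- exp(coefs$estimate + z * coefs$se)
print(coefs, digits = 3)
```

The interval for Married stays below 1, while the interval for IndustryBusiness spans 1, which matches their p-values of 0.018 and 0.083.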
Ordinal Regression (Financial Health)
modFinal2<-clm(`Financial Health`~Theaters+LowEduc+Unemployedworker+Occupation.DomesticServiceP+Married+IndustryManufactory+OtherRaceP+IndustryBusiness+`Main Address`, data=df)
summary(modFinal2)
## formula:
## `Financial Health` ~ Theaters + LowEduc + Unemployedworker + Occupation.DomesticServiceP + Married + IndustryManufactory + OtherRaceP + IndustryBusiness + `Main Address`
## data: df
##
## link threshold nobs logLik AIC niter max.grad cond.H
## logit flexible 105 -62.07 160.14 9(1) 3.83e-13 3.4e+06
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## TheatersKim Sing Theatre 0.136626 1.874335 0.073 0.941891
## TheatersKing Hing Theatre 5.803995 2.035792 2.851 0.004359 **
## TheatersKuo Hwa 2 Cinema -8.105591 1.697566 -4.775 1.80e-06 ***
## TheatersKuo Hwa Theatre 6.069761 1.446116 4.197 2.70e-05 ***
## TheatersMonterey Theatre -5.098475 1.330559 -3.832 0.000127 ***
## TheatersPagoda Cinema 4.095790 1.900574 2.155 0.031160 *
## LowEduc -0.138456 0.049998 -2.769 0.005619 **
## Unemployedworker 0.134936 0.227610 0.593 0.553288
## Occupation.DomesticServiceP -0.026345 0.216971 -0.121 0.903359
## Married 0.001012 0.041547 0.024 0.980572
## IndustryManufactory -0.089403 0.038131 -2.345 0.019045 *
## OtherRaceP -0.006342 0.014044 -0.452 0.651567
## IndustryBusiness -0.563349 0.129630 -4.346 1.39e-05 ***
## `Main Address`Yes -0.326672 0.522556 -0.625 0.531877
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Threshold coefficients:
## Estimate Std. Error z value
## 0|1 -12.619 3.085 -4.091
## 1|2 -6.418 2.384 -2.692
## 2|3 -5.917 2.376 -2.490
## 3|4 -5.390 2.364 -2.280
## (42 observations deleted due to missingness)
exp(-0.56)
## [1] 0.5712091
exp(-0.089)
## [1] 0.9148456
exp(-0.138)
## [1] 0.8710987
As the regression result shows, low education, the manufacturing industry, and the business industry all have a significant negative association with the financial health of the theaters. Surprisingly, here the business industry does not contribute to the theaters’ success but is associated with worse financial health. As the proportion of people in the business industry increases by 1 unit, the odds of a theater being in a better financial health category are multiplied by exp(-0.56) = 0.57, holding other variables fixed. Therefore, the more people work in the business industry, the worse the theaters’ financial health.
The same holds for the manufacturing industry. Though the association is weaker, the more people work in the manufacturing industry, the worse the theaters’ financial health. As the proportion of people in the manufacturing industry increases by 1 unit, the odds of a theater being in a better financial health category are multiplied by exp(-0.089) = 0.91, holding other variables fixed. One possible explanation is that a larger manufacturing workforce means more working time, which conflicts with spare time for going to the theaters.
The third variable is low education level, which is negatively associated with the theaters’ financial health. As the proportion of people who have never been to high school increases by 1 unit, the odds of a theater being in a better financial health category are multiplied by exp(-0.138) = 0.87, holding other variables fixed. Education level therefore appears important for theaters’ financial health. A possible reading is that people with more education and higher literacy are more likely to spend their time watching Chinese movies or going to Chinese movie theaters.
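Since the predictors are measured in percentage points, a one-unit change is small, and it can help to see the effect of a larger change. The sketch below rescales the three significant coefficients from the summary(modFinal2) output; the five-point change and the helper `odds_ratio` are illustrative assumptions, not part of the original analysis.

```r
# Estimates copied from the summary(modFinal2) output.
beta <- c(LowEduc = -0.138456,
          IndustryManufactory = -0.089403,
          IndustryBusiness = -0.563349)

# Odds ratio for a change of `delta` percentage points in one predictor.
odds_ratio <- function(b, delta = 1) exp(b * delta)

or_table <- data.frame(
  per_1_point  = odds_ratio(beta, 1),
  per_5_points = odds_ratio(beta, 5)
)
print(round(or_table, 3))
```

Odds ratios compound multiplicatively, so the five-point column is the one-point column raised to the fifth power rather than five times it.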
1980-2000 Binary Logistic Regression
We then fit a second regression that uses the variables recorded only from the 1980s to the 2000s. It is based on the 1980-2000 observations, with rows containing missing values deleted. For this regression, I recoded opening status from three levels to two, open or closed, so we use binary logistic regression.
##recode the dependent variable
df2<-df
df2$Opening1[df2$Opening==1|df2$Opening==2]<-1
df2$Opening1[df2$Opening==0]<-0
df2$Opening1<-as.factor(df2$Opening1)
df3<-df2
df2<-subset(df2,select=c("Opening1", "Year","Location","Theaters","ChineseP","JapaneseP","AsianIndianP","KoreanP","VietnameseP","AsianBelowPovertyLevelP"))
df2<-na.omit(df2)
modadd<-glm(Opening1~Location+Theaters+JapaneseP+ChineseP+KoreanP+VietnameseP+AsianBelowPovertyLevelP, family="binomial",data=df2)
summary(modadd)
##
## Call:
## glm(formula = Opening1 ~ Location + Theaters + JapaneseP + ChineseP +
## KoreanP + VietnameseP + AsianBelowPovertyLevelP, family = "binomial",
## data = df2)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.95118 -0.01253 0.00000 0.01601 1.48163
##
## Coefficients: (1 not defined because of singularities)
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -192.8316 3645.3735 -0.053 0.9578
## LocationSuburb -9.0857 3643.3437 -0.002 0.9980
## TheatersKim Sing Theatre -36.2082 3643.5350 -0.010 0.9921
## TheatersKing Hing Theatre -26.5222 3643.3451 -0.007 0.9942
## TheatersKuo Hwa 2 Cinema -2.0398 2.7791 -0.734 0.4630
## TheatersKuo Hwa Theatre 14.3433 9.2115 1.557 0.1194
## TheatersMonterey Theatre -19.3673 11.2743 -1.718 0.0858 .
## TheatersPagoda Cinema NA NA NA NA
## JapaneseP 3.3846 2.0434 1.656 0.0977 .
## ChineseP 2.1315 1.2232 1.742 0.0814 .
## KoreanP 0.7267 0.6628 1.096 0.2729
## VietnameseP 1.3597 0.8554 1.590 0.1119
## AsianBelowPovertyLevelP 0.9429 0.5880 1.604 0.1088
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 84.33 on 61 degrees of freedom
## Residual deviance: 22.92 on 50 degrees of freedom
## AIC: 46.92
##
## Number of Fisher Scoring iterations: 19
exp(2.13)
## [1] 8.414867
exp(3.38)
## [1] 29.37077
levels(df$Theaters)
## [1] "Bard’s Garfield Egyptian Theatre\n" "Kim Sing Theatre"
## [3] "King Hing Theatre" "Kuo Hwa 2 Cinema"
## [5] "Kuo Hwa Theatre" "Monterey Theatre"
## [7] "Pagoda Cinema"
new_obs2<-expand.grid(Theaters="Pagoda Cinema",
Location=c("Downtown","Suburb"),
JapaneseP=mean(df2$JapaneseP,na.rm = TRUE ),
AsianIndianP=mean(df2$AsianIndianP,na.rm = TRUE),
VietnameseP=mean(df2$VietnameseP,na.rm = TRUE),
ChineseP=seq(55,68,by=0.01),
KoreanP=mean(df2$KoreanP,na.rm = TRUE),
AsianBelowPovertyLevelP=mean(df2$AsianBelowPovertyLevelP,na.rm = TRUE))
predictions2<-predict(modadd,new_obs2,type="response")
prediction_data2<-cbind(new_obs2,predictions2)
prediction_data2_long <-melt(prediction_data2, id=c("JapaneseP","ChineseP","AsianIndianP","KoreanP","VietnameseP","AsianBelowPovertyLevelP","Location","Theaters"))
prediction_data2_long$status<-factor(prediction_data2_long$variable, labels = c("Opening"))
ggplot(data=prediction_data2_long)+
geom_area(aes(x=ChineseP, y=value, fill="Opening"))+
scale_fill_manual("Theater Status",values = "#C3D7A4")+
geom_line(aes(x=ChineseP, y=value))+
facet_grid(Location~., labeller = label_both)+
ylab("Predicted Cumulative Proportion of Theaters in Opening Status")+
xlab("Proportion of Chinese in The Total Population")+
theme_clean()
As the regression result shows, no variable is significant at the 5% level. However, the Chinese and Japanese proportions of the population are both positively related to opening status at the 10% level. As the proportion of Chinese in the population increases by 1 unit, the odds of a theater being open are multiplied by exp(2.13) = 8.41, holding other variables fixed. As the proportion of Japanese in the population increases by 1 unit, the odds of a theater being open are multiplied by exp(3.38) = 29.37, holding other variables fixed. It is worth noting that the Chinese proportion is much higher than that of other Asian groups, and its per-unit effect is smaller than that of the Japanese proportion. Still, it contributes substantially, since the Chinese population is large both downtown and in the suburbs.
The graph for one example theater indicates that the larger the Chinese share of the area, the more likely the theater is to be open. It also suggests that a growing Chinese population contributes to theaters’ success more readily downtown than in the suburbs. If the Chinese proportion is 60%, a downtown theater appears to have about a 50% likelihood of being open, while a suburban theater is more likely to be closed.
Hierarchical Logistic Regression
Finally, we fit an additional hierarchical logistic regression to model the dataset better. The data have two levels: the first level is the theater, and the second level is the location. The lower level represents the smaller unit, and the higher level represents the larger group. Downtown and suburb are the two clusters of theaters that are obvious on the map, and each theater can itself be regarded as a small cluster, since its own tract and its adjacent tracts are all included in the dataset. Therefore, hierarchical logistic regression should help identify the significant variables more reliably.
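To make the two levels concrete, here is a minimal base-R sketch on a toy data frame. The theater names and their location assignments are hypothetical and are not taken from the dataset. In lme4 formula syntax, a nested term such as (1 | Location/Theaters) is shorthand for (1 | Location) + (1 | Location:Theaters), so the model estimates one random intercept per location and one per theater within a location.

```r
# Hypothetical toy data: theater names and assignments are invented for illustration.
toy <- data.frame(
  Location = c("Downtown", "Downtown", "Downtown", "Suburb", "Suburb", "Suburb"),
  Theaters = c("Theater A", "Theater A", "Theater B", "Theater C", "Theater C", "Theater D")
)

# Grouping factor for the upper level (location clusters).
location_groups <- factor(toy$Location)

# Grouping factor for the lower level: each theater within its location.
# drop = TRUE removes Location-by-Theater combinations that never occur.
theater_groups <- interaction(toy$Location, toy$Theaters, drop = TRUE)

n_location_groups <- nlevels(location_groups)
n_theater_groups <- nlevels(theater_groups)

# Rows per theater-within-location cluster (tract-level observations in the real data).
print(table(theater_groups))
```

In the real data the same idea gives the group counts reported by glmer, with several tract-level rows per theater.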
library(lme4)
## Loading required package: Matrix
##
## Attaching package: 'lme4'
## The following objects are masked from 'package:ordinal':
##
## ranef, VarCorr
mod2<-glmer(Opening1~(1 + 1|Location/Theaters)+LowEduc+Occupation.ProfessionalWorkerP+Married+IndustryBusiness+Location, family = "binomial",data=df3)
summary(mod2)
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula:
## Opening1 ~ (1 + 1 | Location/Theaters) + LowEduc + Occupation.ProfessionalWorkerP +
## Married + IndustryBusiness + Location
## Data: df3
##
## AIC BIC logLik deviance df.resid
## 80.7 101.9 -32.3 64.7 97
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.0367 -0.0532 0.0004 0.1568 5.2921
##
## Random effects:
## Groups Name Variance Std.Dev.
## Theaters:Location (Intercept) 1.845e+01 4.2956575
## Location (Intercept) 1.307e-08 0.0001143
## Number of obs: 105, groups: Theaters:Location, 7; Location, 2
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 50.84100 18.70888 2.717 0.00658 **
## LowEduc -0.04663 0.05657 -0.824 0.40977
## Occupation.ProfessionalWorkerP -0.73456 0.38063 -1.930 0.05363 .
## Married -0.65591 0.24630 -2.663 0.00774 **
## IndustryBusiness -1.12843 0.42237 -2.672 0.00755 **
## LocationSuburb 5.52195 5.68680 0.971 0.33154
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) LowEdc Oc.PWP Marrid IndstB
## LowEduc -0.241
## Occptn.PrWP -0.935 0.068
## Married -0.968 0.054 0.943
## IndstryBsns -0.926 0.042 0.897 0.933
## LocatinSbrb 0.542 0.296 -0.713 -0.684 -0.632
exp(-0.65591)
## [1] 0.5189696
exp(-1.12843)
## [1] 0.3235408
# Prediction grid: one downtown theater, other covariates held at their sample means,
# with the business-industry share varied from 2 to 8.
new_obs3 <- expand.grid(
  Theaters = c("King Hing Theater"),
  Location = c("Downtown"),
  LowEduc = mean(df$LowEduc, na.rm = TRUE),
  Married = mean(df$Married, na.rm = TRUE),
  Occupation.ProfessionalWorkerP = mean(df$Occupation.ProfessionalWorkerP, na.rm = TRUE),
  IndustryBusiness = seq(2, 8, by = 1))
# Predicted probability of being open; allow.new.levels permits grouping levels not seen in the fit.
predictions3 <- predict(mod2, new_obs3, allow.new.levels = TRUE, type = "response")
prediction_data3 <- cbind(new_obs3, predictions3)
# Reshape to long format for plotting.
prediction_data3_long <- melt(prediction_data3, id = c("LowEduc", "Theaters", "Occupation.ProfessionalWorkerP", "IndustryBusiness", "Married", "Location"))
prediction_data3_long$status <- factor(prediction_data3_long$variable, labels = c("Opening"))
ggplot(data = prediction_data3_long) +
  geom_area(aes(x = IndustryBusiness, y = value, fill = "Opening")) +
  scale_fill_manual("Theater Status", values = c("lightblue")) +
  geom_line(aes(x = IndustryBusiness, y = value)) +
  ylab("Predicted Cumulative Proportion of Theaters in Opening Status") +
  xlab("Proportion of People in The Business Industry") +
  theme_classic()
As the regression result and graph show, the outcome is surprisingly close to what we got from the ordinal regression: the only two significant variables are “Married” and “IndustryBusiness.” However, while the married population is still negatively related to the opening status, the business industry turns from positively to negatively related to it. We can compare the two models’ AIC to see which one fits better. In statistics, AIC is used to compare candidate models and determine which one best fits the data; normally, a lower AIC value indicates a better-fitting model. Since the logistic regression’s AIC (80.7) is lower than the ordinal one (141), we choose the logistic model, with the caveat that AIC values are strictly comparable only when the models are fitted to the same response and the same observations. Thus, as the proportion of married people in the population increases by 1 unit, the odds that a theater is open are multiplied by exp(-0.656) = 0.519, holding other variables fixed. As the proportion of people working in the business industry increases by 1 unit, the odds that a theater is open are multiplied by exp(-1.128) = 0.324, holding other variables fixed. As I assumed before, these relationships could be related to immigrants’ new entertainment habits and to businesses that could compete with the theater industry.
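The AIC printed by summary(mod2) can be reproduced by hand from the log-likelihood as AIC = 2k - 2 logLik, where k is the number of estimated parameters. The sketch below assumes k = 8 for mod2 (six fixed-effect coefficients plus two random-intercept variances) and uses the rounded log-likelihood from the printed output, so it agrees with the reported value only up to rounding. The helper name aic_from_loglik is introduced here for illustration.

```r
# AIC from a log-likelihood and a parameter count.
aic_from_loglik <- function(loglik, n_params) {
  2 * n_params - 2 * loglik
}

# Values read from the summary(mod2) output; logLik is rounded to one decimal there.
loglik_mod2 <- -32.3
# Assumed parameter count: 6 fixed effects + 2 random-intercept variances.
n_params_mod2 <- 6 + 2

aic_mod2 <- aic_from_loglik(loglik_mod2, n_params_mod2)
print(aic_mod2)
```

The result is within rounding error of the 80.7 shown in the model summary, which suggests that the parameter count assumed here is the one used in the reported AIC.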
1980-2000 Hierarchical Logistic Regression
mod3<-glmer(Opening1~(1|Location/Theaters)+Location+JapaneseP+ChineseP+KoreanP+VietnameseP+AsianBelowPovertyLevelP, family="binomial",data=df2)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.00655929 (tol = 0.002, component 1)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: large eigenvalue ratio
## - Rescale variables?
summary(mod3)
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula:
## Opening1 ~ (1 | Location/Theaters) + Location + JapaneseP + ChineseP +
## KoreanP + VietnameseP + AsianBelowPovertyLevelP
## Data: df2
##
## AIC BIC logLik deviance df.resid
## 68.2 87.3 -25.1 50.2 53
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.3933 -0.2210 0.0095 0.1767 1.4278
##
## Random effects:
## Groups Name Variance Std.Dev.
## Theaters:Location (Intercept) 1.682e+01 4.101350
## Location (Intercept) 7.654e-05 0.008749
## Number of obs: 62, groups: Theaters:Location, 7; Location, 2
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -82.0838 38.9055 -2.110 0.0349 *
## LocationSuburb 4.0305 4.5227 0.891 0.3728
## JapaneseP 1.1202 0.5609 1.997 0.0458 *
## ChineseP 0.8032 0.3810 2.108 0.0350 *
## KoreanP 0.7744 0.4946 1.566 0.1174
## VietnameseP 0.6666 0.3275 2.035 0.0418 *
## AsianBelowPovertyLevelP 0.3029 0.1665 1.819 0.0690 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) LctnSb JapnsP ChinsP KorenP VtnmsP
## LocatinSbrb -0.647
## JapaneseP -0.956 0.596
## ChineseP -0.997 0.605 0.958
## KoreanP -0.502 0.202 0.260 0.507
## VietnameseP -0.844 0.409 0.751 0.848 0.641
## AsnBlwPvrLP -0.842 0.605 0.852 0.827 0.217 0.483
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## Model failed to converge with max|grad| = 0.00655929 (tol = 0.002, component 1)
## Model is nearly unidentifiable: large eigenvalue ratio
## - Rescale variables?
exp(1.1202)
## [1] 3.065467
exp(0.667)
## [1] 1.948383
exp(0.8032)
## [1] 2.232674
# Prediction grid: other covariates held at their sample means, Vietnamese share varied from 0 to 20.
# na.rm = TRUE guards against missing values, as in the earlier prediction grid built from df2.
new_obs3 <- expand.grid(
  Theaters = "Pagoda Cinema",
  Location = c("Downtown", "Suburb"),
  JapaneseP = mean(df2$JapaneseP, na.rm = TRUE),
  AsianIndianP = mean(df2$AsianIndianP, na.rm = TRUE),
  VietnameseP = seq(0, 20, by = 0.1),
  ChineseP = mean(df2$ChineseP, na.rm = TRUE),
  KoreanP = mean(df2$KoreanP, na.rm = TRUE),
  AsianBelowPovertyLevelP = mean(df2$AsianBelowPovertyLevelP, na.rm = TRUE))
predictions3 <- predict(mod3, new_obs3, allow.new.levels = TRUE, type = "response")
prediction_data3 <- cbind(new_obs3, predictions3)
# Reshape to long format for plotting.
prediction_data3_long <- melt(prediction_data3, id = c("JapaneseP", "ChineseP", "AsianIndianP", "KoreanP", "VietnameseP", "AsianBelowPovertyLevelP", "Location", "Theaters"))
prediction_data3_long$status <- factor(prediction_data3_long$variable)
ggplot(data = prediction_data3_long) +
  geom_area(aes(x = VietnameseP, y = value, fill = "Opening")) +
  geom_line(aes(x = VietnameseP, y = value)) +
  scale_fill_manual("Theater Status", values = c("#52854C")) +
  facet_grid(Location ~ ., labeller = label_both) +
  ylab("Predicted Cumulative Proportion of Theaters in Opening Status") +
  xlab("Proportion of Vietnamese in The Total Population") +
  theme_clean()
I also ran another regression on the smaller dataset that covers 1980 to 2000, so that we can learn more about the variables that only appear in this period. Besides the two variables “ChineseP” and “JapaneseP,” whose positive relationship with the opening status we already know from the ordinal regression, the new variable “VietnameseP” also shows a significant positive relationship with the opening status of theaters. This is consistent with the large number of Vietnamese who immigrated to America, especially to suburban areas, from 1980 onward, alongside the well-known trend of Chinese immigration. Note that the fit produced a convergence warning, so these estimates should be read with some caution.
More specifically, as the proportion of Chinese in the population increases by 1 unit, the odds that a theater is open are multiplied by exp(0.8032) = 2.23, holding other variables fixed. As the proportion of Japanese or Vietnamese in the population increases by 1 unit, the odds that a theater is open are multiplied by exp(1.1202) = 3.07 or exp(0.667) = 1.95, respectively, holding other variables fixed.
The graph (again using one theater as an example) makes the positive relationship between the Vietnamese population and the opening status of Chinese movie theaters easy to see, and this relationship does not differ noticeably between downtown and the suburbs.
Let’s compare the two models again. This time the AIC for the hierarchical logistic model is 68.2, while the ordinal logistic regression’s AIC is 46.92. Although the ordinal model’s AIC is lower, there are other considerations in choosing the model. In the ordinal model, location is a factor that could affect theaters’ opening status, but in the multilevel hierarchical logistic model, location is no longer an important factor. This is because the multilevel model accounts for clustering and takes the theaters’ distribution into account. Therefore, it is more reasonable in our case to account for the clusters and use the multilevel logistic regression.
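The odds ratios quoted here are point estimates. As a minimal sketch of how approximate 95% Wald intervals could be attached to them, the block below uses only the estimates and standard errors printed in the summary(mod3) output. The Wald interval is a normal approximation and is an assumption of this sketch, not something the original analysis reports; given the convergence warning on mod3, the intervals should be treated as rough.

```r
# Estimates and standard errors copied from the summary(mod3) fixed-effects table.
coefs <- data.frame(
  term = c("JapaneseP", "ChineseP", "VietnameseP"),
  estimate = c(1.1202, 0.8032, 0.6666),
  std_error = c(0.5609, 0.3810, 0.3275)
)

# Critical value for a two-sided 95% interval under the normal approximation.
z_crit <- qnorm(0.975)

# Exponentiate the log-odds estimate and its interval bounds to get odds ratios.
or_table <- data.frame(
  term = coefs$term,
  odds_ratio = exp(coefs$estimate),
  lower = exp(coefs$estimate - z_crit * coefs$std_error),
  upper = exp(coefs$estimate + z_crit * coefs$std_error)
)

print(or_table, digits = 3)
```

Under these assumptions, all three lower bounds sit only slightly above 1, which matches p-values just under 0.05 and suggests that the size of each effect is quite uncertain.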
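One way to quantify how much the clustering matters is the share of variance attributable to each grouping level on the latent logit scale, where the level-1 residual variance is conventionally fixed at pi^2/3. The sketch below plugs in the random-intercept variances printed in the two glmer summaries. The latent-scale formula is a standard approximation that is not part of the original analysis, and the helper name latent_share is introduced here for illustration.

```r
# Share of latent-scale variance attributable to one random-effect component.
# The logistic level-1 residual variance is taken as pi^2 / 3.
latent_share <- function(var_component, var_all_random) {
  var_component / (var_all_random + pi^2 / 3)
}

# Random-intercept variances copied from summary(mod2).
var_theater_mod2 <- 18.45
var_location_mod2 <- 1.307e-08
total_mod2 <- var_theater_mod2 + var_location_mod2

# Random-intercept variances copied from summary(mod3) (fit had a convergence warning).
var_theater_mod3 <- 16.82
var_location_mod3 <- 7.654e-05
total_mod3 <- var_theater_mod3 + var_location_mod3

share_theater_mod2 <- latent_share(var_theater_mod2, total_mod2)
share_location_mod2 <- latent_share(var_location_mod2, total_mod2)
share_theater_mod3 <- latent_share(var_theater_mod3, total_mod3)
share_location_mod3 <- latent_share(var_location_mod3, total_mod3)

print(round(c(theater_mod2 = share_theater_mod2, location_mod2 = share_location_mod2,
              theater_mod3 = share_theater_mod3, location_mod3 = share_location_mod3), 4))
```

Under these assumptions, the theater level accounts for roughly 85% of the latent variance in mod2 and roughly 84% in mod3, while the location level accounts for essentially none. This is consistent with the finding that location is not an important factor once theater-level clustering is modeled.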
Conclusion
Based on the ordinal models and multilevel logistic regressions, we found, surprisingly, that among many demographic factors, such as education, age, marital status, population, race, income, household value, industry, and occupation, two variables are significantly negatively related to the theaters’ opening status. As more people get married, they are less likely to go to the theaters, which makes us think about how married people allocate their recreational activities and leisure time. After checking the number of married people in different decades, we found that the marriage rate in the 1980s and the 1990s was lower than in other decades, which makes us consider the influence of immigration policies and people’s changing lifestyles. As for the business industry, the result differs depending on the model. In the ordinal model it is positively related to the opening status, while the logistic model shows that it is negatively related to the opening status. Since our tests show that the business industry is negatively associated with the theaters’ financial health, and the AIC for the logistic model is lower than the one from the ordinal model, it is more reasonable to assume that the business industry harms the theaters’ success. The more people join the business industry, or the more an area is focused on business, the less likely the theaters are to be open or financially successful.
A study shows that when the local labor market becomes tighter, the minority labor force, which normally concentrates at the lower level of the job hierarchy, is most likely to be squeezed out of employment. This could partially explain the decline of the Chinese movie theaters in the 1990s and the negative relationship between business industry prosperity and theaters’ success. The movie industry in Chinatown is mainly operated by self-employed minorities and concentrated in low-skill sectors with easy entry, so it could be regarded as sitting low in the job hierarchy, and it is easily crowded out when competing with businesses that are operated by whites or that sit higher in the hierarchy. Therefore, Chinese movie theaters quickly decreased in number as other businesses developed in the 1990s. This pattern is echoed by another finding: the manufacturing industry and a low education level are also negatively related to the theaters’ financial health. It seems that people who go to Chinese movie theaters have a higher education and literacy level, which makes us consider the target audiences of Chinese movie theaters and their backgrounds.
Although we did not find a direct correlation between the Asian population and the theater opening status, or prove our assumption that immigration is a significant factor contributing to the theaters’ success (AsianP is not significantly related to the dependent variable), we have shown that the Chinese, Japanese, and Vietnamese populations are positively correlated with theaters’ opening status. The target audiences of these Chinese movie theaters are Asians, and most of the theaters are operated by Chinese. Many Chinese and Vietnamese immigrants arrived between the 1970s and the 2000s, which likely contributed greatly to the theaters’ financial success and kept them open for a long time. We cannot test this relationship because we lack detailed information about individual Asian groups from the 1940s to the 2000s, but we can still assume the two are possibly related. Another interesting finding is that location is not an essential factor influencing the opening status of theaters. This means that whether a theater is downtown or in the suburbs, its site does not greatly affect its operating condition.
Therefore, the business industry, Asian population (Japanese, Chinese, Vietnamese), married population, education level, and manufacturing industry are all important factors that could affect the opening status of theaters and their financial health. However, location is not a significant factor affecting theaters’ survival status. Our findings could be more accurate if we included more theaters in LA and in other counties or states, which would give a larger sample and reveal the general factors that influence the financial health of Chinese-language movie theaters. More research on immigration policy and theaters’ histories is needed to provide more concrete evidence for these correlations. As for the choice of model, I tried ordinal, multinomial, and hierarchical logistic regression. The hierarchical logistic regression is likely the most accurate, since it is cluster-based and can accommodate several levels of groups. We have to control for those variables to identify the possible factors better. Post-estimation analysis could yield deeper insights, and I am interested in exploring more findings after the regression.
Acknowledgement:
I would like to thank Prof. Dombrowski for her continued guidance and support and Sophie Gilbert for her help. I would also like to thank Prof. Kaparakis, Prof. Nazarro, Prof. Kabacoff, Prof. Rose, Prof. Gooyabadi, Prof. Oleinikov, and the QAC.
References
Acs, Z.J. 2007. Entrepreneurship, economic growth and public policy. Small Business Economics 28: 109–122.
Allen, J. P. and Turner, E. (1997). The Ethnic Quilt. Northridge, California: The Center for Geographical Studies, California State University, Northridge.
Social Explorer. Los Angeles. From https://www.socialexplorer.com/explore-tables
1950 Census of Population. From https://www2.census.gov/library/publications/decennial/1950/population-volume-1/vol-01-01.pdf
Resources:
Data Source: https://www.socialexplorer.com/explore-tables
1950 Census: https://www2.census.gov/library/publications/decennial/1950/population-volume-3/41557421v3p2ch07.pdf
Census Tract Geocoder: https://geocoding.geo.census.gov/geocoder/geographies/address?form
Articles:
CHINESE IMMIGRATION AND ITS IMPLICATIONS ON URBAN MANAGEMENT IN LOS ANGELES: https://www.jstor.org/stable/pdf/24872617.pdf
1990s: The Golden Decade : CHINATOWN LOS ANGELES : Revitalized Community Rises From Shock Waves of Change: https://www.latimes.com/archives/la-xpm-1990-01-15-ss-97-story.html