Now this is not the end. It is not even the beginning of the end. But it is, perhaps, the end of the beginning.

-Winston Churchill

Though my journey through HST 251 has come to a close for the semester, my journey into a much larger world has only just begun—Churchill’s words capture this bittersweet moment best.

Learning how to think like a historian has taught me how to better think; learning how to question like a historian has taught me how to better question; learning how to reflect like a historian has taught me how to better reflect; learning how to learn like a historian has taught me how to better learn. Over the course of the past 15 weeks, learning how to process information like a historian has led me to become a better all-around scholar, researcher, communicator, leader and innovator; over the course of the past 15 weeks, learning how to be a historian has led me to become an all-around better human being.

Deconstructing, unpacking, reorganizing, and rebuilding my mental models of the world to incorporate a historical viewpoint has led me to see a new light in the dull, a new spark in the dark, and a new neutrality in the polarized; bringing a historical mindset to the table has led me to make better decisions, think more rationally, appreciate more deeply, and doubt more strategically.

I wasn’t expecting to get all of that out of an elective history course, but here I stand.

Churchill’s words reflect the fact that today’s post marks my last of the spring 2019 semester in Michigan State University’s HST 251: Doing Digital History course with Dr. Sharon Leon while simultaneously reflecting a more important fact: today’s post merely marks the end of a beginning. The end of a beginning of a brighter, wiser, more measured, more nuanced, more sophisticated future. The end of a beginning of a more meaningful future.

Let’s jump to the agenda at hand before I get too emotional.

Winston Churchill, Prime Minister of the United Kingdom (1940-1945).

For our final assignment, we were tasked with revisiting a past project from the semester and reinvigorating it with new data, new knowledge, new purpose, new questions, and a new perspective; hence, I naturally set my eyes on my work in Geospatial Analysis. Comparing the spread of urban settlements with the spread of slavery from 1790 to 1850 across a time-series compilation of choropleths fascinated me the first time around, but left me with more questions than answers: what other geospatial metrics correlated with the spread of slavery in the United States? What other factors facilitated the spread of slavery, and what other characteristics were the product of slavery’s creep?

I sought a more holistic understanding of the spatial dynamics surrounding slavery’s spread, and thus decided to aim my final deep dive of the semester into the treasure trove of 1850 county-level census data from NHGIS. As curiosity took the wheel, I found myself downloading 20 different county-level census tables, merging the 20 variables on each county’s GISJOIN ID, and manufacturing a mammoth Flourish project like a kid in a candy shop—the thrill of mystery followed by discovery induced an addiction. The product of such an addiction lies in the visualizations below: browse through them yourself, and we’ll reconvene to discuss shortly.

Ask yourself the questions I pondered over the course of the project’s creation: which variables correlated with the spatial footprint of slavery in 1850, and why? Which variables might have driven slavery’s growth, and which were more likely a result of its growth? How do we still see the spatial artifacts of slavery today, 150 years later? What do the maps below tell us, and what do they not tell us? How are the models below insightful, and how are they wrong? What other questions do the maps lead you to ask? How might you go about answering those?

Through the lens of a historian, enjoy: I present to you a spatial comparison of the percentage of the 1850 United States population that was enslaved and a plethora [1] of other spatial variables. Scroll through each visualization in the bottom compilation [2], and compare its topology, geometry, and character to that of the percentage enslaved visualization [3] on top; then, we’ll discuss. If you’re on a slow connection, check out the static maps here.

Before jumping to conclusions, I’d like to first be completely transparent about the process by which I produced these visualizations; I’ve mentioned before the dangers inherent in the creation of any visualization. Before we may discuss what the visualizations above tell us, we must discuss how such visualizations were created and how they fail to tell the full story; then, we’re free to hypothesize, question, postulate, propose, and infer.

Before we try to look smart, we must understand and accept how we could, potentially, look dumb. History’s never as simple as it seems; we must not treat it as such.

Data Retrieval & Processing

The entire dataset used to create the above visualizations was downloaded from the publicly-accessible NHGIS website—but if you’d like to take a look at the specific nuts and bolts behind the above visualizations, feel free to download and investigate my cleaned, processed CSV, GEOJSON and static visualization files for yourself. Transparency’s the name of the game when it comes to honest data journalism, and I’ve got nothing to hide.

Although NHGIS packaged most of the data used above in the same CSV file, I had to join a few additional columns by their GISJOIN identifiers to create a master CSV file in the development of the project, and thereby lost approximately 20 of the 1600+ counties visualized above due to incomplete matching between the tables. Luckily, the majority of such losses were in frontier territories of 1850. That’s not to say that the losses are insignificant, but we should naturally expect the census metrics of frontier territories to deviate largely from national averages; thus, the losses in the data were primarily among outlier counties, not among wholly-representative counties. If I had to lose 20 counties, those were probably the best ones to lose.
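To make the join step concrete, here is a minimal sketch of merging tables on a shared county key and reporting which counties fall out. It is not the actual script behind the project: the function name `merge_on_gisjoin`, the placeholder IDs, and the column names are all assumptions made up for illustration, and the tables are plain dictionaries rather than real NHGIS files.

```python
def merge_on_gisjoin(tables):
    """Inner-join a list of {gisjoin_id: {column: value}} tables.

    Returns the merged rows plus the set of county IDs dropped because
    they were missing from at least one table.
    """
    all_ids = set().union(*(table.keys() for table in tables))
    shared_ids = set.intersection(*(set(table) for table in tables))
    merged = {}
    for gisjoin in sorted(shared_ids):
        row = {}
        for table in tables:
            row.update(table[gisjoin])
        merged[gisjoin] = row
    return merged, all_ids - shared_ids


# Hypothetical placeholder IDs and columns, not real census values.
population = {
    "G0000001": {"total_pop": 12000},
    "G0000002": {"total_pop": 800},
    "G0000003": {"total_pop": 45000},
}
schools = {
    "G0000001": {"public_schools": 18},
    "G0000003": {"public_schools": 95},
}

merged, dropped = merge_on_gisjoin([population, schools])
print(len(merged), "counties kept;", sorted(dropped), "dropped")
```

Returning the dropped IDs alongside the merged rows makes it easy to check where the losses fall, for example whether they cluster in frontier territories.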

Normalizing

You’ll note that every metric in the above visualizations is normalized by population to some degree—the reason for this is to avoid high-population washouts, where population-dependent metrics of counties with higher populations overshadow population-dependent metrics of counties with lower populations. Charting the absolute number of illiterate white adults is sure to cause Charleston and Richmond to light up—charting instead the illiteracy rate as a percentage tells us much more from a relative standpoint. Not all metrics express themselves nicely as a percentage, however, so I’ve normalized some metrics—like the number of colleges and libraries—per 10,000 residents, while I’ve normalized others—like the annual dollar value of agricultural output—per capita. The goal here was to keep normalized metrics in a range of numbers that makes sense; humans are bad when it comes to understanding both large and small numbers, so I aimed to normalize the metrics above into a general range of 0.1 to 1000 when choosing each statistic’s respective quantity of normalization.
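The three normalizations described above can be sketched in a few lines. The helper names and the sample county figures below are hypothetical, chosen only to show how each choice of denominator keeps the result in a readable range.

```python
def percent(count, population):
    """Share of the population, expressed as a percentage."""
    return 100.0 * count / population


def per_10k(count, population):
    """Rate per 10,000 residents, for rare things like colleges or libraries."""
    return 10_000.0 * count / population


def per_capita(amount, population):
    """Amount per resident, for dollar totals."""
    return amount / population


# A made-up county of 20,000 residents (illustrative numbers only).
county_pop = 20_000
print(percent(1_500, county_pop))       # illiterate adults as a percentage
print(per_10k(4, county_pop))           # libraries per 10,000 residents
print(per_capita(900_000, county_pop))  # agricultural dollars per resident
```

Expressed as a raw fraction, four libraries among 20,000 people would be 0.0002; scaling to "per 10,000" brings the same fact into a range that is easier to read on a legend.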

Binning

Because many population-dependent metrics obey a power-law distribution, I’ve binned several quantities by a pseudo-logarithmic scale to better show that metric’s gradient—in English, I decided to bin the quantities in each visualization to counteract exponential behavior and prevent a handful of high-scoring counties from overshadowing the valuable information of lower-scoring counties on each metric.

It’s important to realize that picking different bin boundaries can create a wholly different map and tell a wildly different story; hence, I exercised my best judgement when selecting bin boundaries with the goal of reflecting an insightful and honest representation in each map. For the sake of transparency, I’ve included the bin boundaries in the subtitle of each visualization to objectify an otherwise subjective decision.
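As a rough illustration of pseudo-logarithmic binning, the sketch below assigns values to bins whose edges follow a 1-2-5 pattern. The edges are an assumption for demonstration, not the boundaries used in any of the maps (those appear in each visualization’s subtitle).

```python
from bisect import bisect_right

# Hypothetical pseudo-logarithmic bin edges (1-2-5 pattern).
BOUNDARIES = [1, 2, 5, 10, 20, 50]


def bin_index(value, boundaries=BOUNDARIES):
    """Return the bin number for a value.

    Bin 0 holds values below the first edge; bin len(boundaries) holds
    values at or above the last edge.
    """
    return bisect_right(boundaries, value)


values = [0.4, 3, 7, 45, 400]
bins = [bin_index(v) for v in values]
print(bins)
```

With linear bins of equal width, the county scoring 400 would force nearly every other county into the lowest bin; with these edges, the lower-scoring counties still spread across several colors.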

Outliers

One problem that comes with visualizing normalized data is most apparent in counties with small populations: the smaller the sample size, the higher the variance. You’ll notice that some of the frontier territory counties in the visualizations above exhibit both extremely high and extremely low measures for various metrics, and may think those counties have something special going on; however, the more likely explanation is that they’re statistically-insignificant outliers. Taking an average-sized number and dividing it by a relatively small population yields a large normalized metric, while taking zero and dividing by anything returns zero; hence, it shouldn’t come as a surprise that frontier counties and counties with small populations appear to span the extremes of each metric in each map. The county in Southern California which lights up on the annual dollar value of manufacturing product per capita map (#13) is likely the result of one man’s gold boom spread across few neighbors, while Oregon and Washington’s absurdly high number of academies and other schools per 10,000 residents (#4) is likely due to a shortage of residents, not an excess of academies.

With this in mind, take the frontier counties with a grain of salt; focus instead on America’s colonial heart and near-west, where statistical noise has had a chance to quiet down over time.
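A quick worked example shows how a small denominator produces both extremes. The populations and counts below are invented for illustration and do not come from the census tables.

```python
def per_10k(count, population):
    """Rate per 10,000 residents."""
    return 10_000 * count / population


# Invented figures: a sparsely settled frontier county versus a settled one.
frontier_with_one = per_10k(1, 400)      # a single academy, 400 residents
frontier_with_none = per_10k(0, 400)     # no academies, 400 residents
settled_county = per_10k(30, 60_000)     # thirty academies, 60,000 residents

print(frontier_with_one, frontier_with_none, settled_county)
```

One academy more or less swings the frontier county between the top and the bottom of the scale, while the settled county with thirty times as many academies sits quietly in the middle.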

Granularity

Analyzing these metrics at a county-level tells us more than we would learn from a state-level analysis; however, it’s worth noting that even county-level analyses are extremely abstract in comparison to a more granular analysis. All models are wrong, and a model’s accuracy is inversely related to its level of abstraction; thus, it’s our responsibility to recall that no human is the average. By no means is the lived experience of the thousands of individuals within each county reflected by a single number, or a collection of 20; so, while our models above are useful, they only convey a high-level fragment of the story.

Style & Presentation

While something as simple as color and font choice shouldn’t make a difference in the transmission of information, our subconscious biases argue otherwise; psychology, like history, is anything but simple. I grouped metrics from similar families (i.e. education, agriculture, manufacturing, etc.) by color, but chose those colors arbitrarily—generally speaking, however, I stuck to the principle that darker = more within each map.

Although I’m not a huge fan of Flourish’s story layout for a non-time-series set of visualizations, beggars can’t be choosers. Using the scrolling-story layout was the best way for me to fit 20 interactive, dynamic, embedded visualizations each with their own binning and coloring onto a single webpage—alas, enjoy the story layout. A dropdown menu would be preferable, but this will have to do.

If you’re tired of scrolling through 19 other visualizations to see the 20th, feel free to check out the static maps here.

Correlation ≠ Causation

As a final note, keep in mind that correlation and causation are two distinct phenomena. The presence of slavery in a region may, in fact, have been reinforced as a result of poor education or poor industrialization in a region, just as a high per capita annual agricultural product in a region may have been the result of slavery. The purpose of this visualization is to facilitate the analysis of correlation between various census metrics and each county’s percentage enslaved population; to make an argument with respect to causation is much more difficult. To argue in terms of causation requires primary source investigation, secondary source consultation, and many hours of synthesis; to argue in terms of correlation is as simple as matching colors on a map.
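For readers who would rather put a number on "matching colors", a Pearson correlation coefficient is one common option. The sketch below is an assumption about how one might quantify it, not a calculation performed for this project, and the two short lists are invented values rather than census data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Invented values for five hypothetical counties.
pct_enslaved = [0, 5, 20, 40, 60]
schools_per_10k = [30, 25, 12, 8, 3]

r = pearson_r(pct_enslaved, schools_per_10k)
print(round(r, 2))  # negative: more enslavement pairs with fewer schools
```

A strongly negative coefficient says only that the two measures move in opposite directions across counties; it says nothing about which one, if either, drove the other.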

In this sense, it’s worth contextualizing the above visualizations as a research tool rather than an argument; to quote Stanford Professor of History Richard White,

 …visualization and spatial history are not about producing illustrations or maps to communicate things that you have discovered by other means. It is a means of doing research; it generates questions that might otherwise go unasked, it reveals historical relations that might otherwise go unnoticed, and it undermines, or substantiates, stories upon which we build our own versions of the past.

Keeping the above considerations in mind, let’s discuss the insights which our visualizations do convey; due to the sheer number of them, however, I’ll stay high-level.

Urban Concentration

% Enslaved, 1850 US Population (by county) vs. various metrics of urban growth. For a larger, interactive map, see embedded visualizations above, or standalone visualizations linked in the footnotes below. [data: NHGIS, visualization: Flourish]
My first adventure in geospatial analysis compared the spread of the city with the spread of slavery in the United States from 1790 to 1850; thus, the blue maps in the compilation aren’t anything new. Regardless, I’ll reiterate that it’s interesting to note the clear delineation between a more urban North and more rural South in each map. Aside from the bright spots of New Orleans, Mobile, Savannah, Charleston, Richmond, and Baltimore, urban growth south of the Mason-Dixon Line appears to lag behind the urban growth of the North—at least given the snapshot of 1850.

While there’s no sense arguing that the spread of slavery suppressed urban growth or that urban growth suppressed the spread of slavery given a single map and my note on correlation ≠ causation above, I find it fascinating to ponder how this early lack of towns, cities, and communities in slaveholding regions has shaped the nation in which we live today. Society is a chaotic system in which small changes to initial conditions lead to large deviations in outcomes, and history is somewhat of a stochastic process; hence, I encourage you to consider how different our nation might look today if the script were flipped: if urban growth were higher in slaveholding regions. This line of thought naturally conflicts with the agrarian roots of chattel slavery in the United States, but what if chattel slavery were instead rooted in manufacturing and craftsmanship? What would our nation look like today?

Education

% Enslaved, 1850 US Population (by county) vs. various metrics of education. For a larger, interactive map, see embedded visualizations above, or standalone visualizations linked in the footnotes below. [data: NHGIS, visualization: Flourish]
One can clearly see a boundary between North and South, between free America and enslaved America, in the visualization of the number of public schools per 10,000 residents. Whether this is a product of slavery’s legality or a driver of its propagation (whether the arrow of causation points in one direction or another) is beyond our scope of argument; regardless, it’s fascinating to note the strong correlation evident in the choropleth. Perhaps public school density was a function of urban density, in which case this visualization naturally follows the trajectory of the preceding urban growth maps; or, perhaps it was a standalone variable subject to its own laws. Either way, it’s clear to see that slaveholding regions tended to have fewer public schools per capita.

Whether such a metric is indicative of the quality of education isn’t as straightforward, especially when considering that the per-capita distribution of colleges and academies was more balanced. The visualizations of the percent of white and nonwhite free persons attending school suggest that access to education was more open in the North than in the South, as one would expect, but the disparity between these two maps tells a concerning story: in spite of legal status, persons of color were denied equal educational opportunities almost universally across the United States. In only a few counties of Maine, New York, and the frontier territories of Wisconsin and Illinois is the rate of nonwhite free persons attending school comparable to the rate of white persons attending school, reflective of a sad racist reality. De jure oppression may have been eliminated across the North, but de facto oppression clearly persisted, and still persists to this day.

Literacy

% Enslaved, 1850 US Population (by county) vs. various metrics of literacy. For a larger, interactive map, see embedded visualizations above, or standalone visualizations linked in the footnotes below. [data: NHGIS, visualization: Flourish]
As one would expect, the distribution of illiterate populations opposes the distribution of public schools per 10,000 persons: here, the direction of causation is more obvious. Still, it’s worth considering that support for public education is likely more valued by literate citizens, and thus illiteracy as a product of a lack of education probably serves to reinforce a future lack of access to education.

The distribution of libraries mirrors the distribution of public schools per 10,000 residents, and opposes the distribution of illiterate adults, as one would—again—expect. Were libraries not as valued by illiterate populations, or did populations deem learning to read as a waste of time in a library vacuum? Here, we may tentatively propose a direction of causation: because literacy and access to libraries facilitated the spread of ideas, including those of rebellion, natural equality, and revolution, it makes sense for literacy rates to be suppressed in regions of high enslavement as a product of nervous enslavers.

How this historic suppression of literacy, either as a product of slavery’s legality or as a driver of its perpetuation, changed the course of our nation’s history is both fascinating and terrifying to consider.

Manufacturing

% Enslaved, 1850 US Population (by county) vs. various metrics of manufacturing activity. For a larger, interactive map, see embedded visualizations above, or standalone visualizations linked in the footnotes below. [data: NHGIS, visualization: Flourish]
Manufacturing’s geographic distribution again follows a North-South split; what I find interesting, however, is the high density of manufacturing activity along the Mississippi and Alabama Gulf Coast, and in the more urban areas of Northern Virginia and Maryland. Calvin Schermerhorn discusses the adaptation of slavery to industrial settings, particularly in Virginia and Maryland, in his book Unrequited Toil, providing an explanation for some of these disparities, but I’m not familiar with the 1850 drivers of industry along the Gulf Coast. Perhaps most insightful is the fact that these counties along the Gulf Coast which light up as hotspots of manufacturing activity tended to have lower rates of enslavement than their neighboring counties, suggesting that dethroning cotton was, in some sense, synonymous with debasing the horrific institution of slavery.

Religion

% Enslaved, 1850 US Population (by county) vs. various metrics of religion. For a larger, interactive map, see embedded visualizations above, or standalone visualizations linked in the footnotes below. [data: NHGIS, visualization: Flourish]
While the distribution of church property value per capita follows a North-South split, the distribution of churches per 10,000 residents is much more even across the entire United States—with the exception of regions along the Mississippi River and South Atlantic Coast where the rate of enslavement was highest. If it were possible to overlay the maps of each county’s enslavement rate and number of churches per 10,000 residents, these regions would seem to be photographic negatives of one another. Whether religion spread the notion that slavery was immoral, thereby weakening its practice in a region, or that the practice of slavery discouraged the spread of religion because of its revolutionary potential is an impossible question to answer given only a map—but it’s an important question to consider. Like literacy, religion was a means which facilitated the spread of information and organization, two of an enslaver’s worst nightmares—thus, it makes sense that enslavers might try to suppress the influence of religion on a region. Such suppression stands in stark contrast to the high church density of Appalachian Georgia, North Carolina, and Tennessee, where slavery’s grip was much weaker.

Agriculture

% Enslaved, 1850 US Population (by county) vs. various metrics of agricultural activity. For a larger, interactive map, see embedded visualizations above, or standalone visualizations linked in the footnotes below. [data: NHGIS, visualization: Flourish]
The visualizations associated with agriculture tell a story of Southern monopolization and concentration contrasted by Northern multiplicity—where stolen labor encouraged the formation of fewer, larger, illegitimately-profitable plantations in the South, tradition, legitimate labor, and entrepreneurship encouraged the persistence of more plentiful, moderately-sized family farms across the North. Such Southern concentration is most easily visible along the Mississippi River where the annual dollar value of agricultural output per capita was among the highest in the nation, while the number of farms per 10,000 residents was among the lowest; such Northern multiplicity is best visible in Pennsylvania, Ohio, Indiana and Illinois, where both the annual dollar value of agricultural output per capita and the number of farms per 10,000 residents were middle-of-the-road. The map of per-acre farmland and farm building value reflects this as well, though more implicitly—recall that dividing any number by a large number returns a small number, while dividing any number by a small number returns a large one. Per-acre farmland and farm building value is, therefore, likely low across the South due to large tracts under the control of a single plantation owner, and high across the North due to smaller tracts managed by families.

Transportation

% Enslaved, 1850 US Population (by county) vs. various metrics of transportation. For a larger, interactive map, see embedded visualizations above, or standalone visualizations linked in the footnotes below. [data: NHGIS, visualization: Flourish]
Due to the boolean (true = 1, false = 0) nature of the rail and water transport access maps, we’re more limited in the arguments we can make and the insights we can glean—still, we’re better off discussing the limited data we have than we are simply ignoring it. Slavery, like all growth in the formative years of the United States, appears to have spread along the arteries of our nation’s rivers, likely due to a) the resulting ease of slave importation, b) the resulting ease of crop exportation, and c) the associated fertile land along riverbanks and floodplains. As a result, one can clearly see the veins of slavery follow the Mississippi and Ohio Rivers northward through the heart of the United States while its capillaries permeate across coastal Georgia, Carolina, and Virginia where small rivers drain into the Atlantic. Even in Missouri, the effect of water on slavery’s spread is evident: where the Missouri River cuts through the center of the state, a corresponding patch of orange streaks across the percentage enslaved visualization.

Railroads, on the other hand, seem to exhibit a distribution more independent from that of slavery, weaving arbitrarily throughout Alabama, Georgia, South Carolina, North Carolina, and Virginia in the South, and through the majority of New England in the North. Such independence is to be (somewhat) expected: by the time railroads began their rise to become our nation’s preferred method of transportation, land had been partitioned, plantations established, and seeds of slavery sown decades prior. Instead, railroads grew where they delivered the most benefit: to the industrial centers of the North, and to landlocked regions where accessible water transport was unavailable, both antithetical to regions where slavery was dominant. In the long run, this would turn out to be detrimental to the economic development within these regions; in the short run, perhaps their early Southern stagnation was for the better. After all, aside from their potential to spread ideas and information, railroads were far from a godsend to the majority of the enslaved population within a region.

Causation aside, it appears as though slavery’s spread was positively correlated with a county’s access to water transport, and negatively correlated with a county’s access to rail transport.
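Because the transport variables are boolean, one simple way to check a claim like this is to compare the average percentage enslaved between counties with and without access. The sketch below assumes that approach; the function name, field names, and four sample rows are hypothetical and are not drawn from the project’s data.

```python
def mean_by_flag(rows, flag, value):
    """Average `value` separately for rows where `flag` is 0 and where it is 1."""
    groups = {0: [], 1: []}
    for row in rows:
        groups[row[flag]].append(row[value])
    return {key: sum(vals) / len(vals) for key, vals in groups.items() if vals}


# Invented rows: 1 = county has water transport access, 0 = it does not.
counties = [
    {"water_access": 1, "pct_enslaved": 40},
    {"water_access": 1, "pct_enslaved": 30},
    {"water_access": 0, "pct_enslaved": 5},
    {"water_access": 0, "pct_enslaved": 15},
]

averages = mean_by_flag(counties, "water_access", "pct_enslaved")
print(averages)
```

A gap between the two averages points the same direction as the map, though it carries all the caveats above about outliers, granularity, and causation.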

• • •

Now comes the part I’ve been dreading. Even if this post marks only the end of the beginning, it’s still bittersweet.

Over the course of the past 15 weeks, I’ve grown as a writer, reader, thinker, judge, investigator, analyst, detective, and theorist—not just as a historian. I’ve changed the way I approach the world, seeing the complex behind the simple, the system behind the actor, the assumptions behind the obvious, and the lies behind the truth. I’ve learned to seek questions over answers, curiosity over complacency, and comfort in the discomfort of not knowing but wanting to. I’ve learned to appreciate grey over black or white, to search for the continuous over the discrete, and to value the qualitative over the quantitative.

I’ve learned that all models are wrong, but that some are useful; I’ve learned that a forest is not wholly defined by its trees. I’ve learned that it’s our duty to embrace the second half of “digital humanities” over the first; I’ve learned that nothing can convey emotion like a primary source.

I’ve learned that history is more than just the study of the past; I’ve learned that history is instead the study of what led and continues to lead humans to make decisions which shape the future. I’ve learned that history is the study of what motivated our finest hours and trapped us in our darkest; I’ve learned that history is the study of countless improbable circumstances collaborating to transform the unpredictable into the trivial.

I’ve learned that history is both ugly, twisted, mutilated, scarred, bruised, battered, and beautiful, alluring, enchanting, seducing, engaging, fascinating, impressing. I’ve learned that history is by nature dual: for every positive, there exists a negative; for every right, there exists a wrong; for every up, there exists a down.

Above all, I’ve learned that history has value. It is the collective study of us, and we all, therefore—by definition—have the most to benefit from it.

Agree or disagree, I rest my case.

We are not makers of history; we are made by history.

-Dr. Martin Luther King, Jr.

    1. % Population in Urban Centers of at least 2,500 Persons
    2. % Population in Urban Centers of at least 25,000 Persons
    3. Number of Public Schools per 10,000 Persons
    4. Number of Academies and Other Schools per 10,000 Persons
    5. Number of Colleges per 10,000 Persons
    6. % White Population Attending School
    7. % Nonwhite Free Population Attending School
    8. Number of Libraries per 10,000 Persons
    9. % Illiterate White Adults
    10. % Illiterate Nonwhite Free Adults
    11. % Population Employed in Manufacturing Establishments
    12. $ Annual Investment in Manufacturing Establishments Per Capita
    13. $ Annual Product of Manufacturing Establishments Per Capita
    14. Number of Churches per 10,000 Persons
    15. $ Value of Church Property Per Capita
    16. Number of Farms per 10,000 Persons
    17. $ Annual Output of Agriculture Per Capita
    18. Average $ Value of Farmland and Farm Buildings per Acre
    19. Access to Water Transport
    20. Access to Rail Transport

  2. https://public.flourish.studio/story/38386/
  3. https://public.flourish.studio/visualisation/325345/