Pandas - GeoPandas

 

GeoPandas
As you can see, the type of data is not limited: integers, floats, strings, pictures, or even a map. On this page, we will show how GeoPandas is a strong tool for managing map data. The data set used here contains county-level cumulative cases and deaths resulting from Covid-19 in the US.
Using the information above, we want to animate a visualization of how Covid-19 spread throughout the states.
1. States Plot:
The first plot we want is a states map, which will be the base for everything that follows. First, I obtained a county data set with the US county data needed for plotting. Since some states, such as Hawaii, are hard to see because of their size, I dropped those almost invisible states from the data frame. Then I dissolved the data frame by State FP, since many rows (one per county) share the same State FP number. At this point, it is extremely important that county_df and states_map have the same CRS. CRS stands for Coordinate Reference System, which defines how geographic features are represented and located on the Earth's surface. Layers in different CRSs will not line up when plotted together, so it is crucial to use the appropriate CRS. Lastly, we plot only the boundaries, because we don't want filled state colors to be confused with the colors we will use for the cases later.
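The steps above can be sketched roughly as follows. This is a minimal example on made-up data, not the original notebook code: the four square "counties", the STATEFP and COUNTYFP column names, and the target CRS (EPSG:5070) are all assumptions chosen for illustration.

```python
import geopandas as gpd
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from shapely.geometry import box

# Toy stand-in for the county shapefile: four square "counties" in two states.
# Column names and geometries are assumptions, not the real data.
county_df = gpd.GeoDataFrame(
    {
        "STATEFP": ["01", "01", "02", "02"],
        "COUNTYFP": ["001", "003", "001", "003"],
    },
    geometry=[
        box(-100, 30, -99, 31), box(-99, 30, -98, 31),
        box(-100, 31, -99, 32), box(-99, 31, -98, 32),
    ],
    crs="EPSG:4326",
)

# Drop states that are hard to see at this scale (Hawaii has State FP "15").
county_df = county_df[~county_df["STATEFP"].isin(["15"])]

# Merge all counties of each state into a single geometry per state.
states_map = county_df.dissolve(by="STATEFP")

# Put both layers in the same CRS before plotting them together.
TARGET_CRS = "EPSG:5070"  # assumed projection; any shared CRS would do
county_df = county_df.to_crs(TARGET_CRS)
states_map = states_map.to_crs(TARGET_CRS)

# Draw outlines only, so no fill color competes with the case colors later.
fig, ax = plt.subplots(figsize=(8, 5))
states_map.boundary.plot(ax=ax, color="black", linewidth=0.5)
ax.set_axis_off()
```

With a real county shapefile, the toy GeoDataFrame would be replaced by the result of gpd.read_file(...), and the rest should carry over largely unchanged.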


2. Plot Cases:
Now that we have a base to draw cases on, let's draw the Covid cases on March 21st, 2020. I obtained the Covid data and excluded cases from Hawaii. After that, we want to merge the two data frames, covid and county_df, but they don't have a shared column. So we created a column called "fips", a unique 5-digit code: the first two digits are the State FP number and the last three digits are the County FP number. After the merge, we convert the new data frame to the same CRS as the states map by using to_crs(). Finally, we plot the states boundary map and then plot the cases on March 21st, 2020.
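Here is a small sketch of how the shared "fips" key could be built and used for the merge. It uses plain pandas and made-up rows so that it runs on its own; the column names (STATEFP, COUNTYFP, date, cases) and the case counts are assumptions, not values from the real data.

```python
import pandas as pd

# Hypothetical county table: FP codes stored as zero-padded strings.
county_df = pd.DataFrame({
    "STATEFP": ["01", "01", "06"],
    "COUNTYFP": ["001", "003", "037"],
    "NAME": ["Autauga", "Baldwin", "Los Angeles"],
})

# Hypothetical Covid table: fips stored as integers, so leading zeros are lost.
# The case counts are invented for illustration.
covid = pd.DataFrame({
    "date": ["2020-03-21", "2020-03-21", "2020-03-22"],
    "fips": [6037, 1003, 6037],
    "cases": [351, 1, 409],
})

# Build the 5-digit key: 2-digit state code followed by 3-digit county code.
county_df["fips"] = county_df["STATEFP"] + county_df["COUNTYFP"]

# Bring the Covid key into the same zero-padded string form.
covid["fips"] = covid["fips"].astype(str).str.zfill(5)
covid["date"] = pd.to_datetime(covid["date"])

# Keep a single day, then attach the cases to the matching counties.
one_day = covid[covid["date"] == "2020-03-21"]
merged = county_df.merge(one_day, on="fips", how="inner")
print(merged[["NAME", "fips", "cases"]])
```

When county_df is a GeoDataFrame, calling merge on it (rather than on the Covid frame) should keep the geometry column, so the result can then be passed through to_crs() and plotted with column="cases".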



3. Norm Scale:
Note that in the map above, we can't really see the differing levels of Covid cases across the states. For that, we can use LogNorm, which maps the case counts to colors on a logarithmic scale so that we can see them in more detail. I created a LogNorm whose range is based on the covid_us.cases data. Then I plotted the cases over the states map as before, which gives a much more detailed picture.
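To see why a logarithmic norm helps, the following sketch compares matplotlib's linear Normalize with LogNorm on a few made-up case counts spanning several orders of magnitude. The numbers are illustrative only.

```python
import numpy as np
from matplotlib.colors import LogNorm, Normalize

# Made-up case counts spanning four orders of magnitude.
cases = np.array([1.0, 10.0, 100.0, 1_000.0, 10_000.0])

linear = Normalize(vmin=cases.min(), vmax=cases.max())
log = LogNorm(vmin=cases.min(), vmax=cases.max())

# Each norm maps the data into [0, 1], the range a colormap consumes.
linear_scaled = np.asarray(linear(cases))
log_scaled = np.asarray(log(cases))

# On the linear scale almost everything is squeezed near 0;
# on the log scale the values are evenly spread.
print(linear_scaled)
print(log_scaled)
```

In the map itself, such a norm can be passed along when plotting, for example covid_us.plot(column="cases", norm=log, ax=ax), assuming covid_us is the merged data frame. Note that LogNorm only handles positive values, so counties with zero cases would need to be filtered out or treated separately.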

4. Animation:
In the beginning, we set our goal: to visualize how Covid-19 spread. We plot in much the same way as before, but inside a new function, Update. Instead of fixing the date to March 21st, 2020, the function takes a single date as its input, so that we can show the Covid-19 cases for each date. Then we use the animation.FuncAnimation function to build the animation, calling Update once per date.
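A minimal sketch of this pattern is shown below, again on made-up data rather than the real county and Covid tables. The two square "counties", the case counts, the colormap, and the lowercase function name update are assumptions for illustration.

```python
import geopandas as gpd
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib import animation
from matplotlib.colors import LogNorm
from shapely.geometry import box

# Toy geometry and toy case counts (both invented for illustration).
counties = gpd.GeoDataFrame(
    {"fips": ["01001", "01003"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs="EPSG:4326",
)
covid = pd.DataFrame({
    "date": pd.to_datetime(
        ["2020-03-21", "2020-03-21", "2020-03-22", "2020-03-22"]
    ),
    "fips": ["01001", "01003", "01001", "01003"],
    "cases": [1, 10, 5, 100],
})
covid_us = counties.merge(covid, on="fips")

# One frame per date; one shared norm so colors are comparable across frames.
dates = sorted(covid_us["date"].unique())
norm = LogNorm(vmin=covid_us["cases"].min(), vmax=covid_us["cases"].max())

fig, ax = plt.subplots()

def update(date):
    """Redraw the map for a single date."""
    ax.clear()
    counties.boundary.plot(ax=ax, color="black", linewidth=0.5)
    day = covid_us[covid_us["date"] == date]
    day.plot(column="cases", cmap="Reds", norm=norm, ax=ax)
    ax.set_title(pd.Timestamp(date).strftime("%Y-%m-%d"))
    ax.set_axis_off()

# FuncAnimation calls update(date) once for every entry in `dates`.
ani = animation.FuncAnimation(fig, update, frames=dates, interval=200)

# Saving is optional and needs a writer to be installed, e.g.:
# ani.save("covid.gif", writer="pillow")
```

Redrawing every frame from scratch with ax.clear() is simple but slow, which is likely part of why rendering the full data set takes a while.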
After a long few minutes of running, we get this as a result:




*GitHub:  https://github.com/KwakSukyoung/coding/blob/master/ACME/Pandas4/pandas4.ipynb












