`PySAL`

¶- Dani Arribas-Bel (@darribas)

This notebook is available as a gist, or viewable online as static html.

Local Indicators of Spatial Association (LISAs, Anselin, 1995) are a common tool to explore spatial heterogeneity, identify spatial concentration of (dis)similar values and test for the probability such agglomerations originate from a random process.

`PySAL`

has had functionality to run LISAs since its very beginning. However, the output of such computation is not very useful just by itself as, being a local statistic, LISAs imply running a test for every single observation. The usual approach has been to visualize significant clusters on a map using a simple color coding for each class (High-High, HH; Low-Low, LL; High-Low, HL; Low-High, LH). Such visualiations have been available for a long time in packages such as GeoDa, but this would imply breaking the workflow to switch between Python and GeoDa, plus with the annoyance that the workflow cannot thus be automated, a convenient feature for reproducible science.

In this notebook, we show how to leverage the visualization layer that is being built in `PySAL`

to create generic and custom LISA cluster maps.

For this example, we will use the `NAT`

dataset, included in `PySAL`

by default, exploring the variable `HR90`

, homicide rates in 1990 at the county level.

Let's start by importing the required code. This is all code that will be made available in the upcoming release of `PySAL`

in July 2015 (if you're reading this afterwards, it's all in there by default!).

In [5]:

```
%matplotlib inline
import pysal as ps
import numpy as np
from pysal.contrib.viz import mapping as maps
shp_link = ps.examples.get_path('NAT.shp')
print 'Reading from ', shp_link
hr90 = np.array(ps.open(shp_link.replace('.shp', '.dbf')).by_col('HR90'))
```

Before any further analysis, it is always good practice to visualize the distribution of values. We can now conveniently do that in `PySAL`

:

In [15]:

```
_ = maps.plot_choropleth(shp_link, hr90, 'quantiles', cmap='Greens', figsize=(9, 6))
```

Before we can plot a map, we need to run a LISA. In the spirit of flexibility and modularity envisioned in `PySAL`

, the two operations (computation, visualization) are decoupled. We will use a simple contiguity matrix for this.

In [45]:

```
w = ps.queen_from_shapefile(shp_link)
lisa = ps.Moran_Local(hr90, w, permutations=9999)
```

The simplest way to obtain a quick visualization of LISA results is with the "one-liner" `plot_lisa_cluster`

:

In [46]:

```
_ = maps.plot_lisa_cluster(shp_link, lisa, figsize=(9, 6))
```

`cartopy`

¶The previous map is a very quick, painless way to obtain a visualization of LISA results. This is incredibly useful for interactive work where one wants to quickly iterate to get a sense of the data. However, once the researcher settles on a particular visualiation, there's some "eye-candy" that can be applied that improves notably the final result.

In this section, we use the fantastic library `cartopy`

to produce a publication-ready LISA plot. This, in part, uses some of the internal machinery in `pysal.contrib.viz`

, combined with plain `matplotlib`

functionality.

**NOTE**: besides `PySAL`

, you will need `cartopy`

installed in your system to reproduce this example.

In [24]:

```
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
```

In [47]:

```
orig_crs = ccrs.PlateCarree()
projection = ccrs.LambertConformal()
p_thres = 0.01
shp = ps.open(shp_link)
polys = maps.map_poly_shp(shp)
polys = maps.base_lisa_cluster(polys, lisa, p_thres=p_thres)
polys.set_edgecolor('1')
polys.set_linewidth(0.2)
polys.set_transform(orig_crs)
f = plt.figure(figsize=(12, 8))
ax = plt.axes(projection=projection)
extent = [shp.bbox[0], shp.bbox[2], shp.bbox[1], shp.bbox[3]]
ax.set_extent(extent, crs=ccrs.PlateCarree())
ax.add_collection(polys)
ax.outline_patch.set_visible(False)
boxes, labels = maps.lisa_legend_components(lisa, p_thres=p_thres)
plt.legend(boxes, labels, loc='lower left', fancybox=True)
plt.title('HR90 | LISA clusters | P-value = %.2f'%p_thres)
plt.show()
```

Having this kind of maps produced in an automated way is handy because it is very straightforward then to plot a series of them. As an example, we will plot a different map for solutions that consider different significance thresholds.

In [48]:

```
orig_crs = ccrs.PlateCarree()
projection = ccrs.LambertConformal()
p_thresS = [0.1, 0.05, 0.01, 0.001]
f = plt.figure(figsize=(16, 10))
for i, p_thres in enumerate(p_thresS):
shp = ps.open(shp_link)
polys = maps.map_poly_shp(shp)
polys = maps.base_lisa_cluster(polys, lisa, p_thres=p_thres)
polys.set_edgecolor('1')
polys.set_linewidth(0.2)
polys.set_transform(orig_crs)
ax = plt.subplot(2, 2, i+1, projection=projection)
extent = [shp.bbox[0], shp.bbox[2], shp.bbox[1], shp.bbox[3]]
ax.set_extent(extent, crs=ccrs.PlateCarree())
ax.add_collection(polys)
ax.outline_patch.set_visible(False)
boxes, labels = maps.lisa_legend_components(lisa, p_thres=p_thres)
plt.legend(boxes, labels, loc='lower left', frameon=False)
ax.set_title('P-value = %.3f'%p_thres)
plt.show()
```