Commit 8d1237c: Merge pull request #12 from philpatton/clean

Clean up

philpatton authored May 17, 2024
2 parents 1538180 + 6a17a1f
Showing 69 changed files with 2,605 additions and 896 deletions.
Binary file modified .DS_Store
498 changes: 267 additions & 231 deletions .ipynb_checkpoints/occupancy-checkpoint.ipynb

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion _quarto.yml
@@ -35,7 +35,7 @@ website:
text: Occupancy
- href: closed_cmr.qmd
text: Closed capture-recapture
- href: dist.qmd
- href: distance.qmd
text: Distance sampling
- href: scr.qmd
text: Spatial capture-recapture
23 changes: 0 additions & 23 deletions about.qmd

This file was deleted.

33 changes: 12 additions & 21 deletions closed_cmr.qmd
@@ -37,25 +37,22 @@ import pandas as pd
import pymc as pm
import arviz as az
import matplotlib.pyplot as plt
import seaborn as sns
from pymc.distributions.dist_math import binomln, logpow
# plotting parameters
plt.style.use('fivethirtyeight')
plt.rcParams['axes.facecolor'] = 'white'
plt.rcParams['figure.facecolor'] = 'white'
# pal = sns.color_palette("Set2")
# sns.set_palette(pal)
# hyperparameters
SEED = 808
RNG = np.random.default_rng(SEED)
M = 1500
```

I explore fitting the simplest closed capture-recapture model, Model $M_0,$ through parameter-expanded data augmentation [PX-DA, @royle2008]. The idea with PX-DA is to augment the capture histories with $M-n$ all-zero capture histories, where $M$ is a hyperparameter that should be much greater than the true population size $N,$ and $n$ is the total number of individuals captured during the study. This lets us model the augmented data with a zero-inflated binomial distribution (see below).
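
Concretely, writing $y_i$ for the number of occasions (out of $T$) on which augmented individual $i$ was detected, the zero-inflated binomial marginalizes over a latent inclusion indicator $z_i \sim \text{Bernoulli}(\psi).$ A sketch of the implied likelihood, consistent with the description above:

$$
P(y_i = 0) = (1 - \psi) + \psi \, (1 - p)^T, \qquad
P(y_i = y) = \psi \binom{T}{y} p^{y} (1 - p)^{T - y}, \quad y \geq 1.
$$

The all-zero histories added by augmentation can arise either from exclusion (probability $1-\psi$) or from inclusion with no detections, which is exactly the zero-inflation structure.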

```{python}
def augment_history(history):
def augment_history(history, M):
'''Augment a capture history with all-zero histories.'''
animals_captured, T = history.shape
@@ -70,7 +67,7 @@ def augment_history(history):
return augmented
```

To demonstrate this approach, I use the salamander dataset from @bailey2004, as demonstrated in @hooten2019, Chapter 24. These data were collected on two salamander species, the red-cheeked salamander (*Plethodon jordani*) and the pygmy salamander (*Desmognathus wrighti*), in Great Smoky Mountains National Park. The salamanders were counted in 15m by 15m square plots. In this case, we augment the history by setting $M=1500$ (see above). There were $n=92$ individual red-cheeked and $n=132$ pygmy salamanders captured during the course of the survey.
To demonstrate this approach, I use the salamander dataset from @bailey2004, as demonstrated in @hooten2019, Chapter 24. These data were collected on two salamander species, the red-cheeked salamander (*Plethodon jordani*) and the pygmy salamander (*Desmognathus wrighti*), in Great Smoky Mountains National Park. The salamanders were counted in 15m by 15m square plots. In this case, we augment the history by setting $M=1500$. There were $n=92$ individual red-cheeked and $n=132$ pygmy salamanders captured during the course of the survey.

```{python}
def get_histories():
@@ -89,19 +86,12 @@ def get_histories():
pyg = sal_data.loc[is_pyg, col_labs].to_numpy()
red = sal_data.loc[is_red, col_labs].to_numpy()
# # augment each set separately since they differ in length
# pyg_augmented = augment_history(pyg)
# red_augmented = augment_history(red)
# # recombine into one history
# history = np.concatenate((pyg_augmented, red_augmented))
return {'pyg': pyg, 'red': red}
def augment_histories(histories):
def augment_histories(histories, M):
pyg_augmented = augment_history(histories['pyg'])
red_augmented = augment_history(histories['red'])
pyg_augmented = augment_history(histories['pyg'], M=M)
red_augmented = augment_history(histories['red'], M=M)
# recombine into one history
history = np.concatenate((pyg_augmented, red_augmented))
@@ -114,11 +104,12 @@ n_red, T = histories['red'].shape
n_pyg, T = histories['pyg'].shape
# # summarize into binomial data
history_augmented = augment_histories(histories)
M = 1500
history_augmented = augment_histories(histories, M=M)
history_summarized = history_augmented.sum(axis=1)
```

For this model, I use the `pm.ZeroInflatedBinomial` class, just as I did in the [occupancy notebook](https://philpatton.github.io/occ.html). That said, the parameters here are different. First, $p$ represents the probability of capturing a given individual during the survey. Second, $\psi$ represents a mysterious entity known as the inclusion probability. That is, the probability that an individual from the hypothetical superpopulation $M$ is included in the population of interest $N.$ Then, we can estimate the population size as $\hat{N}=M\hat{\psi},$ or generate posterior draws of $N,$ e.g., $N^{(s)} \sim \text{Bin}(M,\psi^{(s)})$
For this model, I use the `pm.ZeroInflatedBinomial` class, just as I did in the [occupancy notebook](https://philpatton.github.io/occ.html). That said, the parameters here are different. First, $p$ represents the probability of capturing a given individual during the survey. Second, $\psi$ represents a mysterious entity known as the inclusion probability. That is, the probability that an individual from the hypothetical superpopulation $M$ is included in the population of interest $N.$ Then, we can simulate the posterior distribution for $N$ using $M$ and the posterior distributions of $\psi.$

In this example, I combine the two species into one `pm.Model` object, making use of `coords`. That said, the parameters for each species are treated as independent. In other words, this is a "no-pooling" model.

@@ -164,7 +155,7 @@ az.plot_trace(M0_idata, figsize=(8,4));

For faster sampling, it's better to separate the two species into two separate models. On my machine, the individual species models finish sampling in 2-3 seconds, compared to 15-20 seconds for the two species model. That said, the two species model is somewhat more convenient.

Of course, the trace plots lack our true parameter of interest: the population size $N.$ We can simulate the posterior of $N$ as a *derived quantity*, using the posterior distribution of $\psi$ and $p.$
Of course, the trace plots lack our true parameter of interest: the population size $N.$ We can simulate the posterior of $N$ as a *derived quantity*, using $M$ and the posterior distribution of $\psi$.

```{python}
# az.extract flattens the chains
@@ -203,7 +194,7 @@ ax.legend()
plt.show()
```
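
In outline, the derived quantity amounts to one binomial draw per posterior draw of $\psi.$ A sketch, with simulated draws standing in for the real flattened posterior:

```python
import numpy as np

rng = np.random.default_rng(808)
M = 1500
# stand-in for the flattened posterior draws of psi (e.g., from az.extract)
psi_draws = rng.beta(20, 220, size=4000)

# N^(s) ~ Binomial(M, psi^(s)): one draw of N per posterior draw of psi
N_draws = rng.binomial(M, psi_draws)
N_hat = N_draws.mean()
```

Drawing $N^{(s)}$ rather than reporting only $M\hat{\psi}$ propagates the posterior uncertainty in $\psi$ into the abundance estimate.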

We might expect estimates of capture probability $p$ and the abundance $N$ to be somewhat correlated. We can explore this relationship visually by plotting the posterior draws. For a more custom look to the plots, I plot the draws using matplotlib.
We might expect estimates of capture probability $p$ and the abundance $N$ to be somewhat correlated. We can explore this relationship visually by plotting the posterior draws.

```{python}
#| fig-cap: Posterior draws of $N$ and $p$ for both species of salamander.
@@ -250,7 +241,7 @@ n, T = micro_hist.shape
# augment with all zero histories
M = 100
micro_augmented = augment_history(micro_hist)
micro_augmented = augment_history(micro_hist, M=M)
# note the occasion when each individual was first seen
first_seen = (micro_hist != 0).argmax(axis=1)
6 changes: 3 additions & 3 deletions cv.qmd
@@ -46,7 +46,7 @@ description: Graduate Research Assistant & NOAA QUEST Fellow

1. **Patton, P.T.** Some hierarchical and machine learning models for wildlife science. Invited talk at University of Natural Resources and Life Sciences, Vienna (BOKU). July 2023.
2. **Patton, P.T.** et al. The effect of fully automated photo--identification on mark-recapture estimates. Paper presented at the EURING Analytical Meeting. Montpellier, France. April 2023
3. **Patton, P.T.** Assessing populations of resident cetaceans. HIMB Scholarship Symposium. K\=ane`ohe, Hawai`i. April 2022.
3. **Patton, P.T.** Assessing populations of resident cetaceans. HIMB Scholarship Symposium. Kāneʻohe, Hawaiʻi. April 2022.
4. **Patton, P. T.** & Gardner, B. Misspecifying movement models in spatial capture recapture studies. Paper presented at The Ecological Society of America Conference. Portland, OR, USA. August 2017
5. **Patton, P. T.** et al. Modeling and estimating co--occurrence between generalist brood parasites and host communities. Paper presented at the EURING Analytical Meeting. Barcelona, Spain. June 2017
6. **Patton, P. T.** et al. Multi--species occupancy models that incorporate false positive and false negative sampling errors. Paper presented at The Wildlife Society Conference. Raleigh, NC, USA. October 2016
@@ -70,7 +70,7 @@ description: Graduate Research Assistant & NOAA QUEST Fellow

- Referee: *Wildlife Society Bulletin*, *Marine Mammal Science*
- Member: British Ecological Society, The Wildlife Society (biometrics working group), The Ecological Society of America (statistical ecology section)
- Representative to the Faculty, Marine Biology Graduate Program, University of Hawai`i at Mānoa
- Representative to the Graduate Student Organization, Marine Biology Graduate Program, University of Hawai`i at Mānoa
- Representative to the Faculty, Marine Biology Graduate Program, University of Hawaiʻi at Mānoa
- Representative to the Graduate Student Organization, Marine Biology Graduate Program, University of Hawaiʻi at Mānoa

---
10 changes: 0 additions & 10 deletions dist.qmd → distance.qmd
@@ -116,16 +116,6 @@ az.plot_trace(
This model samples slower than the models in the other notebooks, presumably because of the unobserved $x.$ As in the closed capture-recapture notebook, we will have to simulate the posterior for $N$ using the posterior distribution of $\psi$ and $M.$

```{python}
# RNG = np.random.default_rng()
# this does not make a copy of the posterior
# post = distance_idata.posterior
# # simulate draws of N using the posterior of psi
# N_samples = RNG.binomial(M, post.psi)
# N_samples = N_samples.flatten()
# N_hat = N_samples.mean()
RNG = np.random.default_rng()
posterior = az.extract(distance_idata)
Binary file added docs/.DS_Store
30 changes: 26 additions & 4 deletions docs/cjs.html
@@ -2,7 +2,7 @@
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>

<meta charset="utf-8">
<meta name="generator" content="quarto-1.4.549">
<meta name="generator" content="quarto-1.4.554">

<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">

@@ -246,7 +246,7 @@
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./dist.html" class="sidebar-item-text sidebar-link">
<a href="./distance.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Distance sampling</span></a>
</div>
</li>
@@ -728,6 +728,24 @@ <h1>Cormack-Jolly-Seber</h1>
// clear code selection
e.clearSelection();
});
var localhostRegex = new RegExp(/^(?:http|https):\/\/localhost\:?[0-9]*\//);
var mailtoRegex = new RegExp(/^mailto:/);
var filterRegex = new RegExp('/' + window.location.host + '/');
var isInternal = (href) => {
return filterRegex.test(href) || localhostRegex.test(href) || mailtoRegex.test(href);
}
// Inspect non-navigation links and adorn them if external
var links = window.document.querySelectorAll('a[href]:not(.nav-link):not(.navbar-brand):not(.toc-action):not(.sidebar-link):not(.sidebar-item-toggle):not(.pagination-link):not(.no-external):not([aria-hidden]):not(.dropdown-item):not(.quarto-navigation-tool)');
for (var i=0; i<links.length; i++) {
const link = links[i];
if (!isInternal(link.href)) {
// undo the damage that might have been done by quarto-nav.js in the case of
// links that we want to consider external
if (link.dataset.originalHref !== undefined) {
link.href = link.dataset.originalHref;
}
}
}
function tippyHover(el, contentFn, onTriggerFn, onUntriggerFn) {
const config = {
allowHTML: true,
@@ -762,7 +780,11 @@ <h1>Cormack-Jolly-Seber</h1>
try { href = new URL(href).hash; } catch {}
const id = href.replace(/^#\/?/, "");
const note = window.document.getElementById(id);
return note.innerHTML;
if (note) {
return note.innerHTML;
} else {
return "";
}
});
}
const xrefs = window.document.querySelectorAll('a.quarto-xref');
@@ -1038,7 +1060,7 @@ <h1>Cormack-Jolly-Seber</h1>
</script>
<nav class="page-navigation">
<div class="nav-page nav-page-previous">
<a href="./msom.html" class="pagination-link aria-label=" community="" occupancy"="">
<a href="./msom.html" class="pagination-link" aria-label="Community occupancy">
<i class="bi bi-arrow-left-short"></i> <span class="nav-page-text">Community occupancy</span>
</a>
</div>
0 comments on commit 8d1237c