update load json

lorenz-peter · Nov 4, 2024 · f8de48d
1 parent cc4ce4a
commit f8de48d

Showing 4 changed files with 334 additions and 20 deletions.
diff --git a/_posts/2023-03-21-tables.md b/_posts/2023-03-21-rw-ms.md
@@ -28,6 +28,28 @@ A note on the data: This list updates automatically with new papers, sometimes b
 
 Below, you'll find the comprehensive paper list. I've also provided a [JSON](https://github.com/lorenz-peter/lorenz-peter.github.io/blob/master/assets/json/model_stealing_papers.json) file containing the same data, including one with abstracts. If you use this data for any interesting projects, I'd love to hear about your experiences.
 
+{::nomarkdown}
+{% assign jupyter_path = "assets/jupyter/load_json.ipynb" | relative_url %}
+{% capture notebook_exists %}{% file_exists assets/jupyter/load_json.ipynb %}{% endcapture %}
+{% if notebook_exists == "true" %}
+{% jupyter_notebook jupyter_path %}
+{% else %}
+
+<p>Sorry, the notebook you are looking for does not exist.</p>
+{% endif %}
+{:/nomarkdown}
+
+## Acknowledgment
+
+The idea is derived from Nicholas Carlini:
+[nicholas.carlini.com/writing/2019/all-adversarial-example-papers.html](https://nicholas.carlini.com/writing/2019/all-adversarial-example-papers.html).
+
+Recently, another website was deployed to discover research trends, [researchtrend.ai](https://researchtrend.ai/communities/AAML).
+
+
+## Table
+
+
 <table
   data-toggle="table"
   data-show-fullscreen="true"
@@ -54,9 +76,3 @@ Below, you'll find the comprehensive paper list. I've also provided [JSON](https
   }
 </script>
 
-## Acknowledgment
-
-The idea is derived from Nicholas Carlini:
-[nicholas.carlini.com/writing/2019/all-adversarial-example-papers.html](https://nicholas.carlini.com/writing/2019/all-adversarial-example-papers.html).
-
-Recently, another website was deployed to discover research trends, [researchtrend.ai](https://researchtrend.ai/communities/AAML).
diff --git a/_posts/2024-11-04-notebook.md b/_posts/2024-11-04-notebook.md
@@ -0,0 +1,45 @@
+---
+layout: post
+title: a post with jupyter notebook
+date: 2023-07-04 08:57:00-0400
+description: an example of a blog post with jupyter notebook
+tags: formatting jupyter
+categories: sample-posts
+giscus_comments: true
+related_posts: false
+---
+
+To include a jupyter notebook in a post, you can use the following code:
+
+{% raw %}
+
+```liquid
+{::nomarkdown}
+{% assign jupyter_path = 'assets/jupyter/blog.ipynb' | relative_url %}
+{% capture notebook_exists %}{% file_exists assets/jupyter/blog.ipynb %}{% endcapture %}
+{% if notebook_exists == 'true' %}
+  {% jupyter_notebook jupyter_path %}
+{% else %}
+  <p>Sorry, the notebook you are looking for does not exist.</p>
+{% endif %}
+{:/nomarkdown}
+```
+
+{% endraw %}
+
+Let's break it down: this is possible thanks to the [Jekyll Jupyter Notebook plugin](https://github.com/red-data-tools/jekyll-jupyter-notebook), which lets you embed Jupyter notebooks in your posts. It basically calls [`jupyter nbconvert --to html`](https://nbconvert.readthedocs.io/en/latest/usage.html#convert-html) to convert the notebook to an HTML page and then includes that page in the post. Since [Kramdown](https://jekyllrb.com/docs/configuration/markdown/) is the default Markdown renderer for Jekyll, the call to the plugin must be wrapped in the [::nomarkdown](https://kramdown.gettalong.org/syntax.html#extensions) extension so that Kramdown stops processing this part and outputs the content as-is.
+
+The plugin takes the path to the notebook as input, but it assumes the file exists. To check that the file exists before calling the plugin, use the `file_exists` tag. This avoids a 404 error from the plugin, which would otherwise end up displaying the site's main page inside the embed. If the file does not exist, you can show a message to the user instead. The code displayed above outputs the following:
+
+{::nomarkdown}
+{% assign jupyter_path = "assets/jupyter/blog.ipynb" | relative_url %}
+{% capture notebook_exists %}{% file_exists assets/jupyter/blog.ipynb %}{% endcapture %}
+{% if notebook_exists == "true" %}
+{% jupyter_notebook jupyter_path %}
+{% else %}
+
+<p>Sorry, the notebook you are looking for does not exist.</p>
+{% endif %}
+{:/nomarkdown}
+
+Note that the jupyter notebook supports both light and dark themes.
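
The guard described in the post above (check that the notebook exists, then hand it to `jupyter nbconvert --to html`, otherwise show a message) can be sketched in Python. This is only an illustration of the idea, not the plugin's actual code: the function names `nbconvert_command` and `describe_embed` are hypothetical, and the command is built but never executed.

```python
from pathlib import Path

# Same fallback message the Liquid snippet prints when the notebook is missing.
FALLBACK_HTML = "<p>Sorry, the notebook you are looking for does not exist.</p>"


def nbconvert_command(notebook_path):
    """Return the nbconvert command for an existing notebook, or None if it is missing."""
    path = Path(notebook_path)
    if not path.is_file():
        return None
    # `--stdout` asks nbconvert to write the HTML to standard output.
    return ["jupyter", "nbconvert", "--to", "html", "--stdout", str(path)]


def describe_embed(notebook_path):
    """Mirror the if/else of the Liquid snippet without running anything."""
    command = nbconvert_command(notebook_path)
    if command is None:
        return FALLBACK_HTML
    return "would run: " + " ".join(command)


print(describe_embed("assets/jupyter/does_not_exist.ipynb"))
```

Checking first keeps the failure visible as a short message in the post instead of an error page rendered inside the embed.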
diff --git a/assets/jupyter/load_json.ipynb b/assets/jupyter/load_json.ipynb
@@ -0,0 +1,237 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Load JSON FILE"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "import json\n",
+    "import pandas as pd"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>date</th>\n",
+       "      <th>title</th>\n",
+       "      <th>author</th>\n",
+       "      <th>link</th>\n",
+       "      <th>abstract</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>2014-12-30</td>\n",
+       "      <td>Detecting Malicious Code by Exploiting Depende...</td>\n",
+       "      <td>Stavros D. Nikolopoulos, and Iosif Polenakis</td>\n",
+       "      <td>http://arxiv.org/abs/1412.8712v1</td>\n",
+       "      <td>In this paper we present an elaborated graph-b...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>2014-12-30</td>\n",
+       "      <td>Percolation Model of Insider Threats to Assess...</td>\n",
+       "      <td>Jeremy Kepner, Vijay Gadepally, and Pete Micha...</td>\n",
+       "      <td>http://arxiv.org/abs/1412.8699v1</td>\n",
+       "      <td>Rules, regulations, and policies are the basis...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>2014-12-29</td>\n",
+       "      <td>Bloom Filters in Adversarial Environments</td>\n",
+       "      <td>Moni Naor, and Eylon Yogev</td>\n",
+       "      <td>http://arxiv.org/abs/1412.8356v5</td>\n",
+       "      <td>Many efficient data structures use randomness,...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>2014-12-27</td>\n",
+       "      <td>Attacks exploiting deviation of mean photon nu...</td>\n",
+       "      <td>Shihan Sajeed, Igor Radchenko, Sarah Kaiser, J...</td>\n",
+       "      <td>http://arxiv.org/abs/1412.8032v2</td>\n",
+       "      <td>The security of quantum communication using a ...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>2014-12-24</td>\n",
+       "      <td>Balancing Isolation and Sharing of Data for Th...</td>\n",
+       "      <td>Florian Schröder, Raphael M. Reischuk, and Joh...</td>\n",
+       "      <td>http://arxiv.org/abs/1412.7641v2</td>\n",
+       "      <td>In the landscape of application ecosystems, to...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>...</th>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>196</th>\n",
+       "      <td>2014-01-15</td>\n",
+       "      <td>Multipath Private Communication: An Informatio...</td>\n",
+       "      <td>Hadi Ahmadi, and Reihaneh Safavi-Naini</td>\n",
+       "      <td>http://arxiv.org/abs/1401.3659v1</td>\n",
+       "      <td>Sending private messages over communication en...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>197</th>\n",
+       "      <td>2014-01-15</td>\n",
+       "      <td>Intelligent Systems for Information Security</td>\n",
+       "      <td>Ayman M. Bahaa-Eldin</td>\n",
+       "      <td>http://arxiv.org/abs/1401.3592v1</td>\n",
+       "      <td>This thesis aims to use intelligent systems to...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>198</th>\n",
+       "      <td>2014-01-13</td>\n",
+       "      <td>A reduced semantics for deciding trace equival...</td>\n",
+       "      <td>David Baelde, Stéphanie Delaune, and Lucca Hir...</td>\n",
+       "      <td>http://arxiv.org/abs/1401.2854v2</td>\n",
+       "      <td>Many privacy-type properties of security proto...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>199</th>\n",
+       "      <td>2014-01-12</td>\n",
+       "      <td>Practical and fast quantum random number gener...</td>\n",
+       "      <td>You-Qi Nie, Hong-Fei Zhang, Zhen Zhang, Jian W...</td>\n",
+       "      <td>http://arxiv.org/abs/1401.2594v1</td>\n",
+       "      <td>We present a practical high-speed quantum rand...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>200</th>\n",
+       "      <td>2014-01-06</td>\n",
+       "      <td>Power Grid Defense Against Malicious Cascading...</td>\n",
+       "      <td>Paulo Shakarian, Hansheng Lei, and Roy Lindelauf</td>\n",
+       "      <td>http://arxiv.org/abs/1401.1086v1</td>\n",
+       "      <td>An adversary looking to disrupt a power grid m...</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "<p>201 rows × 5 columns</p>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "          date                                              title  \\\n",
+       "0   2014-12-30  Detecting Malicious Code by Exploiting Depende...   \n",
+       "1   2014-12-30  Percolation Model of Insider Threats to Assess...   \n",
+       "2   2014-12-29          Bloom Filters in Adversarial Environments   \n",
+       "3   2014-12-27  Attacks exploiting deviation of mean photon nu...   \n",
+       "4   2014-12-24  Balancing Isolation and Sharing of Data for Th...   \n",
+       "..         ...                                                ...   \n",
+       "196 2014-01-15  Multipath Private Communication: An Informatio...   \n",
+       "197 2014-01-15       Intelligent Systems for Information Security   \n",
+       "198 2014-01-13  A reduced semantics for deciding trace equival...   \n",
+       "199 2014-01-12  Practical and fast quantum random number gener...   \n",
+       "200 2014-01-06  Power Grid Defense Against Malicious Cascading...   \n",
+       "\n",
+       "                                                author  \\\n",
+       "0         Stavros D. Nikolopoulos, and Iosif Polenakis   \n",
+       "1    Jeremy Kepner, Vijay Gadepally, and Pete Micha...   \n",
+       "2                           Moni Naor, and Eylon Yogev   \n",
+       "3    Shihan Sajeed, Igor Radchenko, Sarah Kaiser, J...   \n",
+       "4    Florian Schröder, Raphael M. Reischuk, and Joh...   \n",
+       "..                                                 ...   \n",
+       "196             Hadi Ahmadi, and Reihaneh Safavi-Naini   \n",
+       "197                               Ayman M. Bahaa-Eldin   \n",
+       "198  David Baelde, Stéphanie Delaune, and Lucca Hir...   \n",
+       "199  You-Qi Nie, Hong-Fei Zhang, Zhen Zhang, Jian W...   \n",
+       "200   Paulo Shakarian, Hansheng Lei, and Roy Lindelauf   \n",
+       "\n",
+       "                                 link  \\\n",
+       "0    http://arxiv.org/abs/1412.8712v1   \n",
+       "1    http://arxiv.org/abs/1412.8699v1   \n",
+       "2    http://arxiv.org/abs/1412.8356v5   \n",
+       "3    http://arxiv.org/abs/1412.8032v2   \n",
+       "4    http://arxiv.org/abs/1412.7641v2   \n",
+       "..                                ...   \n",
+       "196  http://arxiv.org/abs/1401.3659v1   \n",
+       "197  http://arxiv.org/abs/1401.3592v1   \n",
+       "198  http://arxiv.org/abs/1401.2854v2   \n",
+       "199  http://arxiv.org/abs/1401.2594v1   \n",
+       "200  http://arxiv.org/abs/1401.1086v1   \n",
+       "\n",
+       "                                              abstract  \n",
+       "0    In this paper we present an elaborated graph-b...  \n",
+       "1    Rules, regulations, and policies are the basis...  \n",
+       "2    Many efficient data structures use randomness,...  \n",
+       "3    The security of quantum communication using a ...  \n",
+       "4    In the landscape of application ecosystems, to...  \n",
+       "..                                                 ...  \n",
+       "196  Sending private messages over communication en...  \n",
+       "197  This thesis aims to use intelligent systems to...  \n",
+       "198  Many privacy-type properties of security proto...  \n",
+       "199  We present a practical high-speed quantum rand...  \n",
+       "200  An adversary looking to disrupt a power grid m...  \n",
+       "\n",
+       "[201 rows x 5 columns]"
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "file_path = \"../json/model_stealing_papers.json\"\n",
+    "df = pd.read_json(file_path)\n",
+    "df"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "p310",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.11"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
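
The notebook above loads the paper list with `pd.read_json`. For readers without pandas, a minimal standard-library sketch follows. It assumes the JSON file is a list of records with the five columns shown in the notebook output (`date`, `title`, `author`, `link`, `abstract`); that layout is an assumption, and the inline sample (two rows copied from the output above, abstracts left truncated as displayed) stands in for the real file. The helper names `load_papers` and `papers_with_date_prefix` are hypothetical.

```python
import json

# Two rows copied from the notebook output; the real file is assumed to hold many more.
SAMPLE_JSON = """
[
  {"date": "2014-12-29",
   "title": "Bloom Filters in Adversarial Environments",
   "author": "Moni Naor, and Eylon Yogev",
   "link": "http://arxiv.org/abs/1412.8356v5",
   "abstract": "Many efficient data structures use randomness,..."},
  {"date": "2014-01-15",
   "title": "Intelligent Systems for Information Security",
   "author": "Ayman M. Bahaa-Eldin",
   "link": "http://arxiv.org/abs/1401.3592v1",
   "abstract": "This thesis aims to use intelligent systems to..."}
]
"""


def load_papers(text):
    """Parse the JSON text and check that it is a list of records."""
    papers = json.loads(text)
    if not isinstance(papers, list):
        raise ValueError("expected a JSON list of paper records")
    return papers


def papers_with_date_prefix(papers, prefix):
    """Keep papers whose date string starts with the prefix, e.g. '2014' or '2014-12'."""
    return [paper for paper in papers if paper["date"].startswith(prefix)]


papers = load_papers(SAMPLE_JSON)
december = papers_with_date_prefix(papers, "2014-12")
print(len(papers), [paper["title"] for paper in december])
```

Matching on a string prefix works for both `YYYY-MM-DD` and `YYYY-MM` dates, so the filter does not depend on which format the fetch script writes.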
diff --git a/papers/fetch_papers.py b/papers/fetch_papers.py
@@ -4,11 +4,8 @@
 from dateutil.parser import parse
 from datetime import datetime
 
-# Construct the default API client.
-# client = Client()
+# https://info.arxiv.org/help/api/user-manual.html
 # https://lukasschwab.me/arxiv.py/arxiv.html
-# Perform the search using arxiv.Search
-
 
 def create_author_str(authors):
     # Join authors with ", " and handle the last author differently
@@ -19,21 +16,40 @@ def create_author_str(authors):
 
     return authors_str
 
-curr_year = datetime.now().year
-submittedDate = "submittedDate:[2014 TO {curr_year}]"
-search = arxiv.Search(
-    query=f"{submittedDate} AND (cat:cs.CR) AND (model stealing OR model extraction OR high-fidelity)",
-    # max_results=500,
-    sort_by=arxiv.SortCriterion.SubmittedDate,
-    sort_order=arxiv.SortOrder.Descending
-)
+
+submittedDate = f"submittedDate:[2017 TO {datetime.now().year}]"
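+# query = f"{submittedDate} AND (cat:cs.CR) AND (model steal* OR model extract* OR high-fidelity)"  # date-filtered variant, unused
+query = "(cat:cs.CR) AND (model stealing OR model extract OR high-fidelity)"  # no trailing comma: a comma here would turn `query` into a tuple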
+
+# query='"quantum dots"'
+
+# id_list = [240610011]
+
+results_generator = arxiv.Client(
+  page_size=1000,
+  delay_seconds=3,
+  num_retries=3
+).results(arxiv.Search(
+  query=query,
+  id_list=[],
+  sort_by=arxiv.SortCriterion.SubmittedDate,
+  sort_order=arxiv.SortOrder.Descending
+))
+
+
+# search = arxiv.Search(
+#     query=f"{submittedDate} AND (cat:cs.CR) AND (model stealing OR model extraction OR high-fidelity)",
+#     # max_results=500,
+#     sort_by=arxiv.SortCriterion.SubmittedDate,
+#     sort_order=arxiv.SortOrder.Descending
+# )
 
 papers_data = []
 
 # Iterate over the results from search
-for result in search.results():
+for result in results_generator:
     # breakpoint()
-    formatted_date = result.published.strftime("%Y-%m-%d")
+    formatted_date = result.published.strftime("%Y-%m")
     authors = [author.name for author in result.authors]
 
     # papers_data.append({'id': result.entry_id, 'title': result.title, 'authors': ', '.join(authors)})
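
The body of `create_author_str` is mostly elided in this diff, so the following is a hypothetical stand-in, inferred only from the author strings in the notebook output (for example `Moni Naor, and Eylon Yogev`). It also shows the effect of the date-format change from `%Y-%m-%d` to `%Y-%m`. The name `join_authors` is an assumption, not the repository's code.

```python
from datetime import datetime


def join_authors(authors):
    """Join names with ', ' and put ', and ' before the last one."""
    if not authors:
        return ""
    if len(authors) == 1:
        return authors[0]
    return ", ".join(authors[:-1]) + ", and " + authors[-1]


published = datetime(2014, 12, 29)

print(join_authors(["Moni Naor", "Eylon Yogev"]))  # Moni Naor, and Eylon Yogev
print(published.strftime("%Y-%m-%d"))              # 2014-12-29 (old format)
print(published.strftime("%Y-%m"))                 # 2014-12 (new format)
```

With the `%Y-%m` format, all papers published in the same month share one date string, so day-level ordering is no longer recoverable from the stored `date` field alone.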