Jupyter Notebook
Jupyter is a powerful open-source platform for interactive computing. This page covers installation, configuration, Docker integration, basic usage, and troubleshooting.
Further installation instructions can be found on the jupyter notebook
- Python 3.7 or later
- pip (Python package manager)
- Install Jupyter Notebook via pip
pip install notebook
- Verify the installation
jupyter --version
You should see the Jupyter version along with additional tool versions.
- Launch Jupyter Notebook
jupyter notebook
This will open a browser window where you can create and manage notebooks.
- You can also access the notebook at this URL: http://localhost:6001/team1/jupyter
- Using Anaconda: Install Jupyter as part of the Anaconda distribution
conda install -c conda-forge notebook
- Docker: Run Jupyter Notebooks inside a Docker container
docker run -p 8888:8888 jupyter/base-notebook
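The installation can also be checked from Python instead of the shell. The snippet below is a minimal sketch that assumes Python 3.8 or later (for `importlib.metadata`); `installed_version` is a hypothetical helper name, not part of Jupyter.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# "notebook" is the PyPI package installed by `pip install notebook`.
print(f"notebook: {installed_version('notebook') or 'not installed'}")
```

A `None` result usually means the package was installed into a different Python environment than the one running the check.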
- By default, Jupyter saves files in the directory from which it was launched.
- You can configure Jupyter by editing the jupyter_notebook_config.py file. If the file does not exist yet, generate it with:
jupyter notebook --generate-config
- Set a password
jupyter notebook password
- Change the default notebook directory
- Edit jupyter_notebook_config.py to set the desired folder:
c.NotebookApp.notebook_dir = '/path/to/project-folder'
On Notebook 7 and later, which run on Jupyter Server, the setting is named c.ServerApp.root_dir.
- Install Jupyter themes for a customized interface:
pip install jupyterthemes
jt -t <theme-name>
- Use JupyterLab, the next-generation Jupyter interface, for an enhanced experience:
pip install jupyterlab
- To include Jupyter and the IPython kernel, add the following lines to your Dockerfile:
RUN /bin/bash -c "source ~/.bashrc && mamba install -c conda-forge jupyter ipykernel"
RUN /root/miniconda3/envs/team1_env/bin/python -m ipykernel install --name team1_env --display-name "Python (team1_env)"
EXPOSE 6001
jupyter notebook
- Configure the default command to launch Jupyter
CMD ["jupyter", "notebook", "--port=6001", "--no-browser", "--ip=0.0.0.0"]
- Open Jupyter in a browser.
- Click on "New" → "Python 3" to create a new Python notebook.
- Each notebook consists of cells where you can enter Python code or markdown for the documentation.
# Sample Python code in a Jupyter cell
print("Hello world-Team 1!!!")
- Save your work by clicking the Save button or pressing Ctrl+S.
- Export notebooks as PDFs, HTML, or LaTeX via File → Download as.
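A notebook file is a JSON document whose main content is the list of cells described above. As a rough illustration, the sketch below builds a minimal notebook with one markdown cell and one code cell using only the standard library. It assumes the nbformat 4 layout; `make_notebook` and the file name `hello_team1.ipynb` are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def make_notebook(cells):
    """Build a minimal nbformat 4 notebook dict from (cell_type, source) pairs."""
    nb_cells = []
    for cell_type, source in cells:
        cell = {"cell_type": cell_type, "metadata": {}, "source": source}
        if cell_type == "code":
            # Code cells additionally carry an execution counter and outputs.
            cell["execution_count"] = None
            cell["outputs"] = []
        nb_cells.append(cell)
    return {"cells": nb_cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


notebook = make_notebook([
    ("markdown", "# Team 1 demo"),
    ("code", 'print("Hello world-Team 1!!!")'),
])

# Written to the temp directory here; in practice, use your notebook directory.
out_path = Path(tempfile.gettempdir()) / "hello_team1.ipynb"
out_path.write_text(json.dumps(notebook, indent=1), encoding="utf-8")
print(f"Wrote {out_path}")
```

Because the format is plain JSON, notebooks can be diffed and generated by scripts, although cell outputs tend to make diffs noisy.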
- Creates the Milvus directory if it doesn't already exist, then connects to the database file. Returns a boolean indicating whether the "IT_support" collection exists in that database.
import os

from pymilvus import connections, utility


def vector_store_check(uri):
    """
    Check whether the vector store already exists.

    Returns:
        bool: True if the "IT_support" collection exists, False otherwise.
    """
    # Create the parent directory if the URI has one and it does not exist
    parent_dir = os.path.dirname(uri)
    if parent_dir:
        os.makedirs(parent_dir, exist_ok=True)

    # Connect to the Milvus database
    connections.connect("default", uri=uri)

    # Return True if the collection exists, False otherwise
    return utility.has_collection("IT_support")


print("Function `vector_store_check` defined.")
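The directory-creation step can be tried on its own, without a Milvus installation. The sketch below is illustrative only: `ensure_parent_dir` is a hypothetical helper and the path `milvus/it_support.db` is made up. It also shows why a bare file name needs a guard, since `os.makedirs("")` raises an error.

```python
import os
import tempfile


def ensure_parent_dir(uri):
    """Create the parent directory of a file path if the path has one."""
    parent = os.path.dirname(uri)
    if parent:  # a bare file name such as "milvus.db" has no parent to create
        os.makedirs(parent, exist_ok=True)
    return parent


# Demo inside a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "milvus", "it_support.db")
    ensure_parent_dir(db_path)
    print(os.path.isdir(os.path.join(tmp, "milvus")))  # True
```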
- This function strips leading and trailing whitespace from each line of the input and drops blank lines, returning a more readable, compact version of the text.
def clean_text(text):
    """Further clean the text by removing extra whitespace and new lines."""
    lines = (line.strip() for line in text.splitlines())
    cleaned_lines = [line for line in lines if line]
    return '\n'.join(cleaned_lines)


print("Function `clean_text` defined.")
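A quick way to see what `clean_text` does is to run it on a small string. The function is repeated here in condensed form so the snippet runs on its own; the sample text is invented.

```python
def clean_text(text):
    """Strip each line and drop the lines that end up empty."""
    lines = (line.strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)


sample = "  Reset your password  \n\n\n   Open the IT portal.\t\n  "
print(clean_text(sample))
# Reset your password
# Open the IT portal.
```

Only the ends of each line are trimmed, so spacing inside a line is left as it was.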
- This function parses HTML content, removes unnecessary elements (scripts, styles, headers, footers, and navigation elements), and extracts the main text. If a <main> element is present, the function uses only its content. The cleaned content is returned as plain text, free from HTML tags and unnecessary whitespace.
from bs4 import BeautifulSoup


def clean_text_from_html(html_content):
    """Clean HTML content to extract main text."""
    soup = BeautifulSoup(html_content, 'html.parser')

    # Remove unnecessary elements
    for element in soup(['script', 'style', 'header', 'footer', 'nav']):
        element.decompose()

    # Prefer the <main> element when the page has one
    main_content = soup.find('main')
    if main_content is not None:
        content = main_content.get_text(separator='\n')
    else:
        content = soup.get_text(separator='\n')

    return clean_text(content)


print("Function `clean_text_from_html` defined.")
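To experiment with the same idea where BeautifulSoup is not installed, the sketch below approximates it with `html.parser` from the standard library. This is a swapped-in technique, not the function above: `MainTextExtractor` and `extract_main_text` are hypothetical names, and edge cases such as malformed or unusually nested markup may behave differently from BeautifulSoup.

```python
from html.parser import HTMLParser

# Tags whose text is dropped, mirroring the list in clean_text_from_html.
SKIPPED_TAGS = {"script", "style", "header", "footer", "nav"}


class MainTextExtractor(HTMLParser):
    """Collect page text, tracking separately what sits inside <main>."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside a skipped element
        self.main_depth = 0  # > 0 while inside <main>
        self.all_text = []
        self.main_text = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1
        elif tag == "main":
            self.main_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth:
            self.skip_depth -= 1
        elif tag == "main" and self.main_depth:
            self.main_depth -= 1

    def handle_data(self, data):
        if self.skip_depth:
            return
        self.all_text.append(data)
        if self.main_depth:
            self.main_text.append(data)


def extract_main_text(html_content):
    """Return text from <main> if any was found, else from the whole page."""
    parser = MainTextExtractor()
    parser.feed(html_content)
    parser.close()
    chunks = parser.main_text or parser.all_text
    lines = (line.strip() for chunk in chunks for line in chunk.splitlines())
    return "\n".join(line for line in lines if line)


page = (
    "<html><head><style>p {color: red}</style></head><body>"
    "<nav>Menu</nav><main><p> Hello </p><script>track()</script>"
    "<p>World</p></main><footer>Contact</footer></body></html>"
)
print(extract_main_text(page))
```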
- Recursively loads documents from the web starting at CORPUS_SOURCE, retrieving only pages under that base URL. Each page is cleaned with clean_text_from_html, and the function returns the cleaned documents.
# Import paths assume a recent LangChain release; adjust them for older versions
from langchain_community.document_loaders import RecursiveUrlLoader
from langchain_core.documents import Document


def load_documents_from_web():
    """
    Load the documents from the web and store the cleaned page contents.

    Returns:
        list: The cleaned documents loaded from the web
    """
    # CORPUS_SOURCE is expected to be defined earlier in the notebook
    loader = RecursiveUrlLoader(
        url=CORPUS_SOURCE,
        prevent_outside=True,
        base_url=CORPUS_SOURCE
    )
    raw_documents = loader.load()

    # Ensure documents are cleaned
    cleaned_documents = []
    for doc in raw_documents:
        cleaned_text = clean_text_from_html(doc.page_content)
        cleaned_documents.append(Document(page_content=cleaned_text, metadata=doc.metadata))

    return cleaned_documents


print("Function `load_documents_from_web` defined.")
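The "only pages within the base URL" rule can be pictured as a prefix check on resolved links. The sketch below is one plausible way to express it and is not necessarily how RecursiveUrlLoader implements it; `is_within_base` is a hypothetical helper and the example.com URLs are placeholders.

```python
from urllib.parse import urljoin


def is_within_base(url, base_url):
    """Return True if `url` resolves to a location at or under `base_url`."""
    absolute = urljoin(base_url, url)  # resolve relative links first
    return absolute.startswith(base_url)


base = "https://example.com/docs/"
for link in ("faq/printers.html", "../blog/", "https://other.org/docs/"):
    print(link, is_within_base(link, base))
```

Resolving relative links before comparing matters: `../blog/` looks harmless but points outside the base path once resolved.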
- Unable to access Jupyter in browser:
- Verify port configuration: Ensure that port 6001 is properly exposed and correctly mapped in your docker run command.
- Check firewall settings: Make sure no firewall rules are restricting access to port 6001.
- Kernel Errors: Restart the kernel by clicking Kernel → Restart.
- Ensure the IP address is correct: Confirm you are using the appropriate IP address or localhost if the service is running locally.
- Notebook Missing: Verify the directory you used to launch the Jupyter Notebook.
- Rebuild the container: If changes to Jupyter are not applied after updating the Dockerfile, rebuild the Docker image.
docker build -t team1-app .
- View detailed error logs in the terminal where Jupyter is running.
- Use the %debug magic command to step into any errors within the notebook.
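For the port check in the troubleshooting list above, a small script can tell you whether anything is accepting connections on port 6001 before you look at firewall rules. This is a minimal sketch; `is_port_open` is a hypothetical helper, and the host and port are the ones used on this page.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures
        return False


print(is_port_open("localhost", 6001))
```

A `False` result from the host machine while the container is running usually points to a missing or wrong port mapping in the docker run command rather than a problem inside Jupyter.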