DOC: Generate the charset tables dynamically from codes #3409

Merged: 9 commits, Sep 1, 2024
16 changes: 16 additions & 0 deletions doc/_static/style.css
@@ -203,3 +203,19 @@ a.copybtn {
.sphx-glr-single-img {
max-width: 80%!important;
}

/*
Review comment (Member, Author):

Here are the CSS styles from myst-nb:

div.cell_output tr,
div.cell_output th,
div.cell_output td {
  text-align: right;
  vertical-align: middle;
  padding: 0.5em 0.5em;
  line-height: normal;
  white-space: normal;
  max-width: none;
  border: none;
}

Here text-align is set to right, which makes sense for displaying objects such as pandas.DataFrame (see the example at https://myst-nb.readthedocs.io/en/latest/render/format_code_cells.html#remove-stdout-or-stderr).

In MyST, table cell alignment is controlled with `:` characters in the delimiter row (https://myst-parser.readthedocs.io/en/latest/syntax/tables.html). MyST implements this by assigning the class text-left, text-center, or text-right to each cell. These CSS classes are not defined in sphinx_rtd_theme, so we have to define them here.

* Styles for aligning table cells.
* https://myst-parser.readthedocs.io/en/latest/syntax/tables.html#markdown-syntax
*/
th.text-left, td.text-left {
    text-align: left !important;
}

th.text-center, td.text-center {
    text-align: center !important;
}

th.text-right, td.text-right {
    text-align: right !important;
}
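
To make the mapping described in the review comment concrete, the following sketch derives an alignment class for each column from a Markdown delimiter row. The helper name `alignment_classes` is hypothetical and the parsing is a simplification written for illustration; it is not MyST-Parser's actual implementation. Only the class names text-left, text-center, and text-right come from the comment above.

```python
def alignment_classes(delimiter_row):
    """Map a Markdown table delimiter row to per-column CSS class names.

    Hypothetical helper for illustration only; not part of MyST-Parser.
    """
    classes = []
    # Drop the outer pipes, then inspect each cell such as ":---" or ":---:".
    for cell in delimiter_row.strip().strip("|").split("|"):
        cell = cell.strip()
        left = cell.startswith(":")
        right = cell.endswith(":")
        if left and right:
            classes.append("text-center")
        elif right:
            classes.append("text-right")
        elif left:
            classes.append("text-left")
        else:
            classes.append(None)  # no explicit alignment requested
    return classes


# A shortened version of the delimiter row used by the generated tables.
print(alignment_classes("|:---|:---:|:---:|"))
```

Each class returned here would need a matching rule in the stylesheet, which is what the three rules in this hunk provide.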
3 changes: 3 additions & 0 deletions doc/conf.py
@@ -61,6 +61,9 @@
"requires": requirements,
}

# MyST-NB configurations: https://myst-nb.readthedocs.io/en/latest/configuration.html
nb_render_markdown_format = "myst" # The format to use for text/markdown rendering

Review comment (Member, Author):

The default value is 'commonmark', which doesn't support tables.


# Make the list of returns arguments and attributes render the same as the
# parameters list
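
For context, a minimal `conf.py` fragment using this option might look as follows. This is a sketch that assumes `myst_nb` is the only extension enabled; the project's real extension list is longer and is not shown in this diff.

```python
# Hypothetical minimal Sphinx configuration, for illustration only.
extensions = ["myst_nb"]

# Render text/markdown cell outputs with the MyST parser so that Markdown
# tables are rendered; the default "commonmark" does not support tables.
nb_render_markdown_format = "myst"
```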
141 changes: 53 additions & 88 deletions doc/techref/encodings.md
@@ -1,3 +1,35 @@
---
file_format: mystnb
---

```{code-cell}
---
tags: [remove-input]
---
from IPython.display import display, Markdown
from pygmt.encodings import charset


def get_charset_mdtable(name):
Review comment (Member, Author):

This function is modified from the original script posted in a comment on #3206.

    """
    Create a markdown table for a charset.
    """
    mappings = charset[name]

    undefined = "\ufffd"
    text = "| Octal | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |\n"
    text += "|:---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|\n"
    for i in range(0o00, 0o400, 8):
        chars = [mappings.get(j, undefined) for j in range(i, i + 8)]
        if chars == [undefined] * 8:
            continue
        chars = [f"&#x{ord(char):04x};" for char in chars]
        row = f"\\{i:03o}"[:-1] + "x"
        text += f"| **{row}** | {' | '.join(chars)} |\n"
    text += "\n"
    return Markdown(text)
```
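
The function above needs `pygmt` and IPython to run. The following standalone sketch applies the same row-building logic to a small made-up mapping, so the output format can be inspected without either dependency. The name `charset_table` and the toy mapping are assumptions for illustration; they are not part of PyGMT.

```python
def charset_table(mappings, undefined="\ufffd"):
    """Build a Markdown table (as a string) from a {code: character} mapping."""
    lines = [
        "| Octal | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |",
        "|:---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|",
    ]
    for start in range(0o000, 0o400, 8):
        chars = [mappings.get(code, undefined) for code in range(start, start + 8)]
        if all(char == undefined for char in chars):
            continue  # skip rows with no defined characters
        # Emit HTML numeric entities so Markdown syntax characters stay literal.
        cells = " | ".join(f"&#x{ord(char):04x};" for char in chars)
        label = f"\\{start:03o}"[:-1] + "x"  # e.g. 0o100 -> "\10x"
        lines.append(f"| **{label}** | {cells} |")
    return "\n".join(lines)


# Toy mapping (an assumption, not real PyGMT data): octal 101-103 -> A, B, C.
toy = {0o101: "A", 0o102: "B", 0o103: "C"}
print(charset_table(toy))
```

Only the `\10x` row is emitted for this mapping, because every other block of eight codes is entirely undefined and is skipped.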

# Supported Encodings and Non-ASCII Characters

GMT supports a number of encodings and each encoding contains a set of ASCII and
@@ -10,100 +42,33 @@ that the character is not defined in the encoding.

## Adobe ISOLatin1+ Encoding

| octal | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| **\03x** | � | • | … | ™ | — | – | fi | ž |
| **\04x** |   | ! | " | # | $ | % | & | ’ |
| **\05x** | ( | ) | * | + | , | - | . | / |
| **\06x** | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| **\07x** | 8 | 9 | : | &#x003b; | < | = | > | ? |
| **\10x** | @ | A | B | C | D | E | F | G |
| **\11x** | H | I | J | K | L | M | N | O |
| **\12x** | P | Q | R | S | T | U | V | W |
| **\13x** | X | Y | Z | [ | \ | ] | ^ | _ |
| **\14x** | ‘ | a | b | c | d | e | f | g |
| **\15x** | h | i | j | k | l | m | n | o |
| **\16x** | p | q | r | s | t | u | v | w |
| **\17x** | x | y | z | { | \| | } | ~ | š |
| **\20x** | Œ | † | ‡ | Ł | ⁄ | ‹ | Š | › |
| **\21x** | œ | Ÿ | Ž | ł | ‰ | „ | “ | ” |
| **\22x** | ı | ` | ´ | ^ | ˜ | ¯ | ˘ | ˙ |
| **\23x** | ¨ | ‚ | ˚ | ¸ | ' | ˝ | ˛ | ˇ |
| **\24x** | � | ¡ | ¢ | £ | ¤ | ¥ | ¦ | § |
| **\25x** | ¨ | © | ª | « | ¬ | ­ | ® | ¯ |
| **\26x** | ° | ± | ² | ³ | ´ | µ | ¶ | · |
| **\27x** | ¸ | ¹ | º | » | ¼ | ½ | ¾ | ¿ |
| **\30x** | À | Á | Â | Ã | Ä | Å | Æ | Ç |
| **\31x** | È | É | Ê | Ë | Ì | Í | Î | Ï |
| **\32x** | Ð | Ñ | Ò | Ó | Ô | Õ | Ö | × |
| **\33x** | Ø | Ù | Ú | Û | Ü | Ý | Þ | ß |
| **\34x** | à | á | â | ã | ä | å | æ | ç |
| **\35x** | è | é | ê | ë | ì | í | î | ï |
| **\36x** | ð | ñ | ò | ó | ô | õ | ö | ÷ |
| **\37x** | ø | ù | ú | û | ü | ý | þ | ÿ |
```{code-cell}
---
tags: [remove-input]
---
display(get_charset_mdtable("ISOLatin1+"))
```

## Adobe Symbol Encoding

| octal | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| **\04x** |   | ! | ∀ | # | ∃ | % | & | ∋ |
| **\05x** | ( | ) | ∗ | + | , | − | . | / |
| **\06x** | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| **\07x** | 8 | 9 | : | &#x003b; | < | = | > | ? |
| **\10x** | ≅ | Α | Β | Χ | ∆ | Ε | Φ | Γ |
| **\11x** | Η | Ι | ϑ | Κ | Λ | Μ | Ν | Ο |
| **\12x** | Π | Θ | Ρ | Σ | Τ | Υ | ς | Ω |
| **\13x** | Ξ | Ψ | Ζ | [ | ∴ | ] | ⊥ | _ |
| **\14x** |  | α | β | χ | δ | ε | φ | γ |
| **\15x** | η | ι | ϕ | κ | λ | μ | ν | ο |
| **\16x** | π | θ | ρ | σ | τ | υ | ϖ | ω |
| **\17x** | ξ | ψ | ζ | { | \| | } | ∼ | � |
| **\24x** | € | ϒ | ′ | ≤ | ∕ | ∞ | ƒ | ♣ |
| **\25x** | ♦ | ♥ | ♠ | ↔ | ← | ↑ | → | ↓ |
| **\26x** | ° | ± | ″ | ≥ | × | ∝ | ∂ | • |
| **\27x** | ÷ | ≠ | ≡ | ≈ | … | ⏐ | ⎯ | ↵ |
| **\30x** | ℵ | ℑ | ℜ | ℘ | ⊗ | ⊕ | ∅ | ∩ |
| **\31x** | ∪ | ⊃ | ⊇ | ⊄ | ⊂ | ⊆ | ∈ | ∉ |
| **\32x** | ∠ | ∇ | ® | © | ™ | ∏ | √ | ⋅ |
| **\33x** | ¬ | ∧ | ∨ | ⇔ | ⇐ | ⇑ | ⇒ | ⇓ |
| **\34x** | ◊ | 〈 | ® | © | ™ | ∑ | ⎛ | ⎜ |
| **\35x** | ⎝ | ⎡ | ⎢ | ⎣ | ⎧ | ⎨ | ⎩ | ⎪ |
| **\36x** | � | 〉 | ∫ | ⌠ | ⎮ | ⌡ | ⎞ | ⎟ |
| **\37x** | ⎠ | ⎤ | ⎥ | ⎦ | ⎫ | ⎬ | ⎭ | � |

**Note**: The octal code `\140` represents the RADICAL EXTENDER character, which is not available in
the Unicode character set.
```{code-cell}
---
tags: [remove-input]
---
display(get_charset_mdtable("Symbol"))
```

**Note**: The octal code `\140` represents the RADICAL EXTENDER character, which is not
available in the Unicode character set.

## Adobe ZapfDingbats Encoding

| octal | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| **\04x** |   | ✁ | ✂ | ✃ | ✄ | ☎ | ✆ | ✇ |
| **\05x** | ✈ | ✉ | ☛ | ☞ | ✌ | ✍ | ✎ | ✏ |
| **\06x** | ✐ | ✑ | ✒ | ✓ | ✔ | ✕ | ✖ | ✗ |
| **\07x** | ✘ | ✙ | ✚ | ✛ | ✜ | ✝ | ✞ | ✟ |
| **\10x** | ✠ | ✡ | ✢ | ✣ | ✤ | ✥ | ✦ | ✧ |
| **\11x** | ★ | ✩ | ✪ | ✫ | ✬ | ✭ | ✮ | ✯ |
| **\12x** | ✰ | ✱ | ✲ | ✳ | ✴ | ✵ | ✶ | ✷ |
| **\13x** | ✸ | ✹ | ✺ | ✻ | ✼ | ✽ | ✾ | ✿ |
| **\14x** | ❀ | ❁ | ❂ | ❃ | ❄ | ❅ | ❆ | ❇ |
| **\15x** | ❈ | ❉ | ❊ | ❋ | ● | ❍ | ■ | ❏ |
| **\16x** | ❐ | ❑ | ❒ | ▲ | ▼ | ◆ | ❖ | ◗ |
| **\17x** | ❘ | ❙ | ❚ | ❛ | ❜ | ❝ | ❞ | � |
| **\20x** | ❨ | ❩ | ❪ | ❫ | ❬ | ❭ | ❮ | ❯ |
| **\21x** | ❰ | ❱ | ❲ | ❳ | ❴ | ❵ | � | � |
| **\24x** | � | ❡ | ❢ | ❣ | ❤ | ❥ | ❦ | ❧ |
| **\25x** | ♣ | ♦ | ♥ | ♠ | ① | ② | ③ | ④ |
| **\26x** | ⑤ | ⑥ | ⑦ | ⑧ | ⑨ | ⑩ | ❶ | ❷ |
| **\27x** | ❸ | ❹ | ❺ | ❻ | ❼ | ❽ | ❾ | ❿ |
| **\30x** | ➀ | ➁ | ➂ | ➃ | ➄ | ➅ | ➆ | ➇ |
| **\31x** | ➈ | ➉ | ➊ | ➋ | ➌ | ➍ | ➎ | ➏ |
| **\32x** | ➐ | ➑ | ➒ | ➓ | ➔ | → | ↔ | ↕ |
| **\33x** | ➘ | ➙ | ➚ | ➛ | ➜ | ➝ | ➞ | ➟ |
| **\34x** | ➠ | ➡ | ➢ | ➣ | ➤ | ➥ | ➦ | ➧ |
| **\35x** | ➨ | ➩ | ➪ | ➫ | ➬ | ➭ | ➮ | ➯ |
| **\36x** | � | ➱ | ➲ | ➳ | ➴ | ➵ | ➶ | ➷ |
| **\37x** | ➸ | ➹ | ➺ | ➻ | ➼ | ➽ | ➾ | � |
```{code-cell}
---
tags: [remove-input]
---
display(get_charset_mdtable("ZapfDingbats"))
```
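
To read any of the tables above, split a three-digit octal code into its first two digits (the row label) and its last digit (the column). The sketch below performs that split; `locate` is a hypothetical helper written for illustration, not a PyGMT function.

```python
def locate(octal_code):
    """Return the (row label, column) of an octal code such as "\\260"."""
    digits = octal_code.lstrip("\\")
    value = int(digits, 8)  # e.g. "260" -> 176
    row = f"\\{value:03o}"[:-1] + "x"  # first two octal digits plus "x"
    column = value % 8  # last octal digit
    return row, column


row, column = locate("\\260")
print(row, column)
```

For example, `\260` falls in row `\26x`, column 0, which is the degree sign in the ISOLatin1+ table.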

## ISO/IEC 8859
