Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

moogle 2.0 + leandocsearch #548

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

jesse-michael-han
Copy link
Contributor

Video demonstrating new UI / features:

moogle-2-and-doc-search-v2.mp4

formatting
formatting
formatting
formatting

textbook links

2.0
Copy link
Collaborator

@mhuisi mhuisi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other comments:

  • Entering &a into the query field leads to a TypeError: hits.map is not a function error that is not cleared when sending a new query.
  • The header of each doc search entry should tell users which documentation the entry is from, and ideally even what Lean version those docs are for.
  • Doc quotes should not use a monospace font.
  • In the theorem search, the start of the doc comment appears to be cut off (i.e. it displays as -).
  • The UI should be explicit about which version of Mathlib it is targeting.
  • Most doc searches yield very broken formatting. E.g. when searching for "difference between theorem and lemma", the first entry immediately has very broken formatting.
  • Entering "introduction" into the doc search produces the TPIL ToC as a result, including links that don't work.
  • Searching for "what does the infoview debounce setting do" in the doc search and clicking the "Introduction" entry yields an empty result.
  • Most FPiL entries contain broken tags, e.g. {{#example_in Examples/Intro.lean NatDoubleFour}}.
  • When searching for "what is mathlib" and clicking the "The Standard Library" entry, the "view online" link leads to a 404.
  • The spinner should be directly next to the input container.
  • I'd place the "Searching" widget below the search box, not next to it. It should also not be indented relative to the input box when the line needs to be broken in two.
  • When clicking the search mode toggle, the change from a regular font to a bold font moves the toggle and the "Docs" text either to the left or the right. It should be static when clicking this button.

Broken doc formatting

Comment on lines 27 to 76
<ol type="1">
<li>
<p>
By constant:<br>
<a href="javascript:void(0)" class="query-suggestion">Real.sin</a> finds all lemmas whose statement somehow mentions the sinefunction.
By constant:<br />
<a href="javascript:void(0)" class="query-suggestion">Real.sin</a> finds all lemmas whose statement somehow
mentions the sinefunction.
</p>
</li>
<li>
<p>
By lemma name substring:<br>
<a href="javascript:void(0)" class="query-suggestion">"differ"</a> finds all lemmas that have <code>"differ"</code> somewhere in their lemma <em>name</em>.
By lemma name substring:<br />
<a href="javascript:void(0)" class="query-suggestion">"differ"</a> finds all lemmas that have
<code>"differ"</code> somewhere in their lemma <em>name</em>.
</p>
</li>
<li>
<p>
By subexpression:<br>
<a href="javascript:void(0)" class="query-suggestion">_ * (_ ^ _)</a> finds all lemmas whose statements somewhere include a product where the second argument is raised to some power.
By subexpression:<br />
<a href="javascript:void(0)" class="query-suggestion">_ * (_ ^ _)</a> finds all lemmas whose statements
somewhere include a product where the second argument is raised to some power.
</p>
<p>
The pattern can also be non-linear, as in <a href="javascript:void(0)" class="query-suggestion">Real.sqrt ?a * Real.sqrt ?a</a>
The pattern can also be non-linear, as in
<a href="javascript:void(0)" class="query-suggestion">Real.sqrt ?a * Real.sqrt ?a</a>
</p>
<p>
If the pattern has parameters, they are matched in any order. Both of these will find <code>List.map</code>:<br>
<a href="javascript:void(0)" class="query-suggestion">(?a -&gt; ?b) -&gt; List ?a -&gt; List ?b</a><br>
If the pattern has parameters, they are matched in any order. Both of these will find
<code>List.map</code>:<br />
<a href="javascript:void(0)" class="query-suggestion">(?a -&gt; ?b) -&gt; List ?a -&gt; List ?b</a><br />
<a href="javascript:void(0)" class="query-suggestion">List ?a -&gt; (?a -&gt; ?b) -&gt; List ?b</a>
</p>
</li>
<li>
<p>
By main conclusion:<br>
<a href="javascript:void(0)" class="query-suggestion">|- tsum _ = _ * tsum _</a> finds all lemmas where the conclusion (the subexpression to the right of all <code>→</code> and <code>∀</code>) has the given shape.
By main conclusion:<br />
<a href="javascript:void(0)" class="query-suggestion">|- tsum _ = _ * tsum _</a> finds all lemmas where the
conclusion (the subexpression to the right of all <code>→</code> and <code>∀</code>) has the given shape.
</p>
<p>
As before, if the pattern has parameters, they are matched against the hypotheses of the lemma in any order; for example, <a href="javascript:void(0)" class="query-suggestion">|- _ &lt; _ → tsum _ &lt; tsum _</a> will find <code>tsum_lt_tsum</code> even though the hypothesis <code>f i &lt; g i</code> is not the last.
As before, if the pattern has parameters, they are matched against the hypotheses of the lemma in any order;
for example, <a href="javascript:void(0)" class="query-suggestion">|- _ &lt; _ → tsum _ &lt; tsum _</a> will
find <code>tsum_lt_tsum</code> even though the hypothesis <code>f i &lt; g i</code> is not the last.
</p>
</li>
</ol>
<p>
If you pass more than one such search filter, separated by commas Loogle will return lemmas which match <em>all</em> of them. The search <a href="javascript:void(0)" class="query-suggestion">Real.sin, "two", tsum, _ * _, _ ^ _, |- _ &lt; _ → _</a>
would find all lemmas which mention the constants <code>Real.sin</code>
and <code>tsum</code>, have <code>"two"</code> as a substring of the
lemma name, include a product and a power somewhere in the type,
<em>and</em> have a hypothesis of the form <code>_ &lt; _</code> (if
there were any such lemmas). Metavariables (<code>?a</code>) are
assigned independently in each filter.
If you pass more than one such search filter, separated by commas Loogle will return lemmas which match
<em>all</em> of them. The search
<a href="javascript:void(0)" class="query-suggestion">Real.sin, "two", tsum, _ * _, _ ^ _, |- _ &lt; _ → _</a> would
find all lemmas which mention the constants <code>Real.sin</code> and <code>tsum</code>, have <code>"two"</code> as
a substring of the lemma name, include a product and a power somewhere in the type, <em>and</em> have a hypothesis
of the form <code>_ &lt; _</code> (if there were any such lemmas). Metavariables (<code>?a</code>) are assigned
independently in each filter.
</p>

<h2 id="source-code">Source code</h2>
<p>You can find the source code for this service at <a href="https://github.com/nomeata/loogle" class="uri">https://github.com/nomeata/loogle</a>. The <a href="https://loogle.lean-lang.org/" class="uri">https://loogle.lean-lang.org/</a> service is currently
provided by Joachim Breitner &lt;<a href="mailto:[email protected]" class="email">[email protected]</a>&gt;.</p>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't touch LoogleView files in this PR.

@@ -3,25 +3,44 @@ import { InputAbbreviationRewriter } from '@leanprover/unicode-input-component'

const vscodeApi = acquireVsCodeApi()

interface MoogleHit {
interface BaseHit {
id: string
displayHtml: string
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this field dead code? It's only set in runMoogleQuery but never read in displayTheoremHit.

Comment on lines 107 to 108
private theoremText = document.querySelector('.theorem-text') as HTMLElement
private docText = document.querySelector('.doc-text') as HTMLElement
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this using classes instead of IDs?

const baseUrl = 'https://morph-cors-anywhere.pranavnt.workers.dev/?https://www.moogle.ai/api'

if (this.currentSearchMode === 'theorem') {
const result = await fetch(`${baseUrl}/moogle2?query=${query}`, {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this input needs to be escaped?

Comment on lines 273 to 290
private transformDisplayHTML(input: string): string {
// Replace <code>lean with <pre><code class="language-lean">
let result = input.replace(/<code>lean\n/g, '<pre><code class="language-lean">')
// Replace <code>output info with <pre><code class="language-text">
result = result.replace(/<code>output info\n/g, '<pre><code class="language-text">')
// Replace standalone </code> with </code></pre>
result = result.replace(/<\/code>(?!<\/pre>)/g, '</code></pre>')
// Replace {{#example_in ...}} and {{#example_out ...}} with placeholders
result = result.replace(/{{#example_in [\w/.]+}}/g, '<span class="example-in">[Example Input]</span>')
result = result.replace(/{{#example_out [\w/.]+}}/g, '<span class="example-out">[Example Output]</span>')
// Replace {{#example_decl ...}} with a placeholder
result = result.replace(/{{#example_decl [\w/.]+}}/g, '<span class="example-decl">[Example Declaration]</span>')
// Replace {{#example_eval ...}} with a placeholder
result = result.replace(/{{#example_eval [\w/.]+}}/g, '<span class="example-eval">[Example Evaluation]</span>')
// Remove any remaining newline character immediately after opening <code> tags
result = result.replace(/<code>(\s*)\n/g, '<code>')
return result
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are already having the server serve HTML, why is this kind of post-processing that every client of the API needs to do implemented in the client, and not the server?
Furthermore, why apply the same post-processing to all documentation sources, instead of applying the ones you need for each individual documentation source?

Comment on lines 304 to 311
// Add Lean syntax highlighting
element.querySelectorAll('code.language-lean').forEach(block => {
const code = block.innerHTML
block.innerHTML = code
.replace(/\b(def|fun|let|structure|where|match|with|Type|class)\b/g, '<span class="keyword">$1</span>')
.replace(/\b(Nat|String|Bool|List|Option|Type)\b/g, '<span class="type">$1</span>')
.replace(/(:=|-&gt;|→|←|↔|⟹|⟸|⟺)/g, '<span class="symbol">$1</span>')
})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as above: Why is this done in the client?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants