Sync release 0.5.13 into dev #4408

Merged
merged 13 commits into from
Jan 6, 2025
2 changes: 1 addition & 1 deletion extensions/inference-cortex-extension/bin/version.txt
@@ -1 +1 @@
- 1.0.7
+ 1.0.8
@@ -18,6 +18,16 @@
"placeholder": "4"
}
},
{
"key": "cpu_threads",
"title": "CPU Threads",
"description": "The number of CPU threads to use (when in CPU mode)",
"controllerType": "input",
"controllerProps": {
"value": "",
"placeholder": "Number of CPU threads"
}
},
{
"key": "flash_attn",
"title": "Flash Attention enabled",
10 changes: 10 additions & 0 deletions extensions/inference-cortex-extension/src/index.ts
@@ -43,6 +43,7 @@ export enum Settings {
flash_attn = 'flash_attn',
cache_type = 'cache_type',
use_mmap = 'use_mmap',
cpu_threads = 'cpu_threads',
}

/**
@@ -66,6 +67,7 @@ export default class JanInferenceCortexExtension extends LocalOAIEngine {
flash_attn: boolean = true
use_mmap: boolean = true
cache_type: string = 'f16'
cpu_threads?: number

/**
* The URL for making inference requests.
@@ -105,6 +107,10 @@ export default class JanInferenceCortexExtension extends LocalOAIEngine {
this.flash_attn = await this.getSetting<boolean>(Settings.flash_attn, true)
this.use_mmap = await this.getSetting<boolean>(Settings.use_mmap, true)
this.cache_type = await this.getSetting<string>(Settings.cache_type, 'f16')
const threads_number = Number(
await this.getSetting<string>(Settings.cpu_threads, '')
)
if (!Number.isNaN(threads_number)) this.cpu_threads = threads_number

this.queue.add(() => this.clean())

@@ -150,6 +156,9 @@ export default class JanInferenceCortexExtension extends LocalOAIEngine {
this.cache_type = value as string
} else if (key === Settings.use_mmap && typeof value === 'boolean') {
this.use_mmap = value as boolean
} else if (key === Settings.cpu_threads && typeof value === 'string') {
const threads_number = Number(value)
if (!Number.isNaN(threads_number)) this.cpu_threads = threads_number
}
}

@@ -207,6 +216,7 @@ export default class JanInferenceCortexExtension extends LocalOAIEngine {
flash_attn: this.flash_attn,
cache_type: this.cache_type,
use_mmap: this.use_mmap,
...(this.cpu_threads ? { cpu_threads: this.cpu_threads } : {}),
},
timeout: false,
signal,
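The `cpu_threads` handling in the hunks above can be traced with a small standalone sketch. The helper names `parseCpuThreads` and `cpuThreadsField` are hypothetical and do not appear in the extension; only the `Number(...)`, `Number.isNaN(...)`, and conditional-spread logic mirrors the diff. One difference to keep in mind: in the diff a non-numeric value leaves the previous `this.cpu_threads` untouched, whereas this sketch returns `undefined` because it has no previous state.

```typescript
// Hypothetical helpers for illustration; not part of the extension's API.

/** Parse the raw string setting as the diff does: Number(), then reject NaN. */
function parseCpuThreads(raw: string): number | undefined {
  const threads = Number(raw)
  return Number.isNaN(threads) ? undefined : threads
}

/** Build the optional load-model field; a falsy count (undefined or 0) is omitted. */
function cpuThreadsField(cpuThreads?: number): { cpu_threads?: number } {
  return cpuThreads ? { cpu_threads: cpuThreads } : {}
}

// Number('') evaluates to 0, so an empty input passes the NaN check
// and is only dropped later by the truthiness guard.
console.log(cpuThreadsField(parseCpuThreads('')))    // {}
console.log(cpuThreadsField(parseCpuThreads('8')))   // { cpu_threads: 8 }
console.log(cpuThreadsField(parseCpuThreads('abc'))) // {}
```

Note that nothing in this logic rejects negative or fractional values such as `'-2'` or `'1.5'`; whether the engine validates them downstream is not visible in this diff.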
2 changes: 1 addition & 1 deletion web/hooks/useCreateNewThread.ts
@@ -5,12 +5,12 @@
ExtensionTypeEnum,
Thread,
ThreadAssistantInfo,
ThreadState,

Check warning on line 8 in web/hooks/useCreateNewThread.ts (GitHub Actions: test-on-ubuntu, test-on-macos, coverage-check, test-on-windows-pr): 'ThreadState' is defined but never used
AssistantTool,
Model,
Assistant,
} from '@janhq/core'
import { atom, useAtom, useAtomValue, useSetAtom } from 'jotai'

Check warning on line 13 in web/hooks/useCreateNewThread.ts (GitHub Actions: test-on-ubuntu, test-on-macos, coverage-check, test-on-windows-pr): 'atom' is defined but never used

import { useDebouncedCallback } from 'use-debounce'

@@ -33,7 +33,7 @@
threadsAtom,
updateThreadAtom,
setThreadModelParamsAtom,
isGeneratingResponseAtom,

Check warning on line 36 in web/hooks/useCreateNewThread.ts (GitHub Actions: test-on-ubuntu, test-on-macos, coverage-check, test-on-windows-pr): 'isGeneratingResponseAtom' is defined but never used
createNewThreadAtom,
} from '@/helpers/atoms/Thread.atom'

@@ -98,7 +98,7 @@
// Use ctx length by default
const overriddenParameters = {
max_tokens: !isLocalEngine(defaultModel?.engine)
- ? (defaultModel?.parameters.token_limit ?? 8192)
+ ? (defaultModel?.parameters.max_tokens ?? 8192)
: defaultContextLength,
}

@@ -136,19 +136,19 @@
//TODO: Why do we have thread list then thread states? Should combine them
try {
const createdThread = await persistNewThread(thread, assistantInfo)
if (!createdThread) throw 'Thread created failed.'
createNewThread(createdThread)

Check warning on line 140 in web/hooks/useCreateNewThread.ts (GitHub Actions / coverage-check): 139-140 lines are not covered with tests

setSelectedModel(defaultModel)
setThreadModelParams(createdThread.id, {

Check warning on line 143 in web/hooks/useCreateNewThread.ts (GitHub Actions / coverage-check): 142-143 lines are not covered with tests
...defaultModel?.settings,
...defaultModel?.parameters,
...overriddenSettings,
})

// Delete the file upload state
setFileUpload(undefined)
setActiveThread(createdThread)

Check warning on line 151 in web/hooks/useCreateNewThread.ts (GitHub Actions / coverage-check): 150-151 lines are not covered with tests
} catch (ex) {
return toaster({
title: 'Thread created failed.',
@@ -159,7 +159,7 @@
}

const updateThreadExtension = (thread: Thread) => {
return extensionManager

Check warning on line 162 in web/hooks/useCreateNewThread.ts (GitHub Actions / coverage-check): 162 line is not covered with tests
.get<ConversationalExtension>(ExtensionTypeEnum.Conversational)
?.modifyThread(thread)
}
@@ -168,7 +168,7 @@
threadId: string,
assistant: ThreadAssistantInfo
) => {
return extensionManager

Check warning on line 171 in web/hooks/useCreateNewThread.ts (GitHub Actions / coverage-check): 171 line is not covered with tests
.get<ConversationalExtension>(ExtensionTypeEnum.Conversational)
?.modifyThreadAssistant(threadId, assistant)
}
@@ -205,13 +205,13 @@
.get<ConversationalExtension>(ExtensionTypeEnum.Conversational)
?.createThread(thread)
.then(async (thread) => {
await extensionManager

Check warning on line 208 in web/hooks/useCreateNewThread.ts (GitHub Actions / coverage-check): 208 line is not covered with tests
.get<ConversationalExtension>(ExtensionTypeEnum.Conversational)
?.createThreadAssistant(thread.id, assistantInfo)
.catch(console.error)
return thread

Check warning on line 212 in web/hooks/useCreateNewThread.ts (GitHub Actions / coverage-check): 212 line is not covered with tests
})
.catch(() => undefined)

Check warning on line 214 in web/hooks/useCreateNewThread.ts (GitHub Actions / coverage-check): 214 line is not covered with tests
}

return {
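The `max_tokens` override in the `@@ -98,7 +98,7 @@` hunk can be read as a small pure function. The sketch below assumes the second line of the changed pair (the one reading `parameters.max_tokens`) is the new version. `ModelLike`, `LOCAL_ENGINES`, `resolveMaxTokens`, and the engine names are hypothetical stand-ins; the real `Model` type and `isLocalEngine` helper are not shown in this diff.

```typescript
// Minimal stand-in for the model shape; the real type comes from '@janhq/core'.
interface ModelLike {
  engine?: string
  parameters: { max_tokens?: number }
}

// Assumption: a hypothetical replacement for the app's isLocalEngine helper.
const LOCAL_ENGINES = new Set<string>(['llama-cpp'])
const isLocalEngine = (engine?: string): boolean =>
  engine !== undefined && LOCAL_ENGINES.has(engine)

/** Remote engines use the model's max_tokens (default 8192); local ones use the context length. */
function resolveMaxTokens(
  model: ModelLike | undefined,
  defaultContextLength: number
): number {
  return !isLocalEngine(model?.engine)
    ? (model?.parameters.max_tokens ?? 8192)
    : defaultContextLength
}

console.log(resolveMaxTokens({ engine: 'openai', parameters: { max_tokens: 4096 } }, 2048)) // 4096
console.log(resolveMaxTokens({ engine: 'openai', parameters: {} }, 2048))                   // 8192
console.log(resolveMaxTokens({ engine: 'llama-cpp', parameters: {} }, 2048))                // 2048
```

An undefined model is treated as non-local here, so it also falls back to 8192, which matches how the optional chaining in the hunk would evaluate.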
2 changes: 1 addition & 1 deletion web/screens/Settings/Advanced/index.tsx
@@ -78,7 +78,7 @@
const selectedGpu = gpuList
.filter((x) => gpusInUse.includes(x.id))
.map((y) => {
return y['name']

Check warning on line 81 in web/screens/Settings/Advanced/index.tsx (GitHub Actions / coverage-check): 81 line is not covered with tests
})

/**
@@ -87,7 +87,7 @@
* there is also a case where state update persist everytime user type in the input
*/
const updatePullOptions = useDebouncedCallback(
() => configurePullOptions(),

Check warning on line 90 in web/screens/Settings/Advanced/index.tsx (GitHub Actions / coverage-check): 90 line is not covered with tests
300
)
/**
@@ -417,7 +417,7 @@
)}

{/* Vulkan for AMD GPU/ APU and Intel Arc GPU */}
- {!isMac && gpuList.length > 0 && experimentalEnabled && (
+ {!isMac && experimentalEnabled && (
<div className="flex w-full flex-col items-start justify-between gap-4 border-b border-[hsla(var(--app-border))] py-4 first:pt-0 last:border-none sm:flex-row">
<div className="space-y-1">
<div className="flex gap-x-2">
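The effect of the condition change in the `@@ -417,7 +417,7 @@` hunk can be compared with two predicates. This is a sketch that assumes the first line of the changed pair (with `gpuList.length > 0`) is the removed one; `VisibilityInputs`, `showVulkanBefore`, and `showVulkanAfter` are hypothetical names used only for illustration.

```typescript
// Hypothetical input shape; the component reads these values from its own state.
interface VisibilityInputs {
  isMac: boolean
  gpuCount: number
  experimentalEnabled: boolean
}

// Condition as it appears on the first line of the changed pair.
const showVulkanBefore = (s: VisibilityInputs): boolean =>
  !s.isMac && s.gpuCount > 0 && s.experimentalEnabled

// Condition as it appears on the second line of the changed pair.
const showVulkanAfter = (s: VisibilityInputs): boolean =>
  !s.isMac && s.experimentalEnabled

// The two differ only when no GPU is listed on a non-Mac machine
// with experimental features enabled.
const noGpuDetected: VisibilityInputs = {
  isMac: false,
  gpuCount: 0,
  experimentalEnabled: true,
}
console.log(showVulkanBefore(noGpuDetected)) // false
console.log(showVulkanAfter(noGpuDetected))  // true
```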