moving current wip to main #6

b08x · 2024-09-25T11:37:21Z

No description provided.

…training process:
- Now trains in iterations, printing progress.
- Outputs more detailed model statistics.

Updated the infer_topics method:
- Now uses make_doc method.
- Handles case where topic inference fails.
- Identifies and returns the most probable topic.
- Prints full topic distribution.
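The "most probable topic" step described above can be sketched in plain Ruby. This is only an illustration under assumptions: `most_probable_topic` is a hypothetical helper, and `distribution` stands in for whatever array of per-topic probabilities the inference call returns. The actual infer_topics code is not shown in this PR.

```ruby
# Hypothetical helper, not the project's infer_topics implementation.
# `distribution` is assumed to be an Array of per-topic probabilities.
def most_probable_topic(distribution)
  # Mirror the "handles case where topic inference fails" bullet:
  # return nil instead of raising when there is nothing to rank.
  return nil if distribution.nil? || distribution.empty?

  # Pair each probability with its topic index and keep the largest.
  probability, index = distribution.each_with_index.max_by { |p, _i| p }
  { topic: index, probability: probability }
end

# Print the full distribution, then the winner.
distribution = [0.1, 0.7, 0.2]
distribution.each_with_index { |p, i| puts "topic #{i}: #{p}" }
puts most_probable_topic(distribution).inspect
```

Returning nil on failure keeps the caller in charge of logging or retrying, which fits the error-handling emphasis in the later commits.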

- Update GrammarProcessor to use Treetop grammar file
- Simplify markdown_yaml.treetop grammar for better YAML parsing
- Enhance PreprocessTextFileTask with improved error handling and logging
- Modify TextSegmentTask to use preprocessed content
- Add parallel processing support to flowbots.rb
- Update CLI to use TopicModelTrainerWorkflow instead of test version
- Improve error logging and context in GrammarProcessor
- Enhance WorkflowOrchestrator cleanup process

This commit significantly improves the text processing pipeline, particularly in handling YAML front matter in Markdown files. It also adds better error handling and logging throughout the workflow.
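For readers unfamiliar with YAML front matter, the idea can be sketched as below. Note the swap: this uses a regular expression and Ruby's standard `yaml` library, not the Treetop grammar the commit refers to, and `split_front_matter` is a hypothetical name. Treat it as a minimal illustration of the input shape, not the GrammarProcessor's behavior.

```ruby
require "yaml"

# Front matter is assumed to be a block delimited by `---` lines at the
# very start of the file.
FRONT_MATTER = /\A---[ \t]*\r?\n(.*?)\r?\n---[ \t]*(?:\r?\n|\z)/m

# Hypothetical helper: returns [metadata, body].
# Falls back to empty metadata when no front matter is found or the YAML
# cannot be loaded safely.
def split_front_matter(text)
  match = FRONT_MATTER.match(text)
  return [{}, text] unless match

  metadata = YAML.safe_load(match[1]) || {}
  [metadata, match.post_match]
rescue Psych::Exception => e
  warn "front matter ignored: #{e.message}"
  [{}, text]
end

metadata, body = split_front_matter("---\ntitle: Notes\n---\n# Heading\n")
puts metadata.inspect
puts body
```

Falling back to the untouched text on a parse error means a malformed header does not stop the rest of the pipeline.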

Renaming the Textfile model to FileObject. Updating all references to Textfile to FileObject. Modifying the FileLoader class to use the FileObject model. Updating the InputRetrieval module to retrieve FileObject instances. Adjusting the RedisKeys module to use keys related to FileObject. Updating tasks and workflows to use the FileObject model.

- Reduce log file max size to 2,145,728 bytes
- Increase max number of log files to 100
- Comment out flush_redis_cache in unified_file_processing
- Add batch mode to TextProcessingWorkflow
- Implement separate processing for batch and single file modes
- Add methods for fetching unprocessed file IDs and creating/fetching file objects
- Update perform_additional_tasks to work with specific file IDs
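The two log settings above map directly onto the rotation arguments of Ruby's standard `Logger`. The sketch below assumes the project uses the stdlib logger. `build_rotating_logger` and the file name are hypothetical, and only the two numbers come from the commit message.

```ruby
require "logger"
require "tmpdir"

# Values taken from the commit message above.
LOG_MAX_BYTES = 2_145_728
LOG_MAX_FILES = 100

# Hypothetical helper; the project's real logger setup is not shown in this PR.
# Logger.new(logdev, shift_age, shift_size): with an integer shift_age, the
# logger keeps that many files and rotates once the current one reaches
# shift_size bytes.
def build_rotating_logger(path)
  Logger.new(path, LOG_MAX_FILES, LOG_MAX_BYTES)
end

# Demonstrate in a throwaway directory so nothing is left behind.
Dir.mktmpdir do |dir|
  logger = build_rotating_logger(File.join(dir, "flowbots.log"))
  logger.info("rotation configured")
  logger.close
end
```

With these values the worst-case disk footprint is roughly 100 × 2 MB, which is worth keeping in mind when running batch mode over many files.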

b08x added 30 commits July 5, 2024 17:44

wip: putting together the scaffolding

f2fca2b

demo working

9f8153a

wip: examples

f76e2d6

added examples command

8636655

wip

1dfca70

setting interface output colors here results in ascii chars sent to redis

1cda980

added gems

e26a3ff

added example

ac9939f

added provider placeholder

9a52ed2

wip: flowise api

4b92754

added helper module from monadic-chat, wip: flowise api working

b8ac68e

added setup instructions for python libs

fcb1478

wip ToT example, workflow architecture

dd90e4c

added original example

da07c8c

moved cartridges to nano-bot registry

64b593a

wip

cac3135

added singleton class for spacy tasks

1209b36

wip ERROR -- : No valid words found in the provided documents

0752561

Moved the require statement for text_processing_workflow to after other component requires.

baa9def

Changed logging level from DEBUG to INFO.
Commented out most of the binding.pry breakpoints.
Updated the AdvancedAnalysisTask:
- Modified the file path for the advanced_analysis_cartridge.yml.
- Changed the prompt for analysis to generate a short narrative.

8d3e423

- Removed unused imports and dependencies
- Reorganized require statements in flowbots.rb
- Deleted topic_modeler.rb file
- Simplified TextProcessor and TextSegmenter classes
- Updated TextProcessingWorkflow to use get_topics
- Removed Redis initialization from WorkflowOrchestrator

5165cbf

- Modularized topic modeling functionality
- Improved error handling and logging
- Updated Docker configuration
- Removed unused segmentation code
- Enhanced configuration management
- Adjusted file paths and dependencies
- Updated nano-bots submodule

ee0a4db

- Extracted train_model and infer_topics methods
- Improved error handling and logging throughout
- Removed redundant code and improved readability
- Added logger initialization in the constructor

3250ccc

adding tty-box functions

378d8e1

moved workflows, renamed components

f4aec89

future utils

174fc77

wip: error handler

ca378e4

wip: ui

4e4c74d

added error handling cartridge

04972bf

b08x added 28 commits July 27, 2024 06:34

this works at least

41e3886

update readme

31c4bfb

set preprocess task to get the current_textfile_id in the workflow

cdb9758

add engtagger task wip: text compressor

e03904e

added rdocs

da110bf

documentation

dc8377f

extras

7614aad

fix: linear logic for detecting file type

1173a8e

wip

27b7ee1

Refactor tasks and implement uniform input retrieval (Epics 1 & 2)

e5085a2

added lemmas ohm model

05ca137

ui improvements

c228855

UI improvements

bf55826

cartridge updates

fe27538

ui improvements

a266a7e

adjusted readme

456f219

updated readme, results

3a628cf

wip

bf83333

adjusted nano-bots

d4a4431

added SI chars

0dea453

doc updates

499ef9b

snapshot

56f80ff

snapshot

768279f

readme edits

076390e

snapshot

9c51b8f

b08x merged commit 844ca60 into main Sep 25, 2024
0 of 5 checks passed

b08x deleted the topicmodeler branch September 25, 2024 12:08
