✨ Access to Bedrock and Textract for CJS & Capability Data Science Teams #4740

K1Br · 2024-07-22T15:31:29Z

Describe the feature request.

Headline: Request for the CJS & Capability Data Science team to have access to use textract to get data in the right format from pdfs and the Claude generative AI models on the AP.

Describe the context.

One current projects, and potentially others in the pipelins for the Parole Board. The current proposed data is publically available. However, there be some additional internal data added subject to adding to the DPIA/sharing agreements.

For all projects, we need to be able to:

Read and extract text and non text data from pdf stored in s3, ideally using amazon textract. We've tried other methods to little success.
Test the capabilities of the different models. To retrieveand/or summarise relevant information.
Run the models for inference in production. If the models meet our evaluation thresholds, we then want to be able to use them in production. In practice, this means things like the below for future development:
3.1 Running a script on schedule using Airflow to send data to the models, receive outputs, and save those to an s3 bucket or database on AWS.
3.2 Calling the models via API from a deployed streamlit application

Project details:
Parole Policy binders: Parole Board colleagues currently have a lot of guidance on SharePoint. Members must sift through lots of policy documents. This is problematic as it is not always clear which document to look in and the SharePoint search functionality isn't adequate.  They also may need bespoke advice for their case. We would deliver:

Text analysis of policies and flagging which documents the advice has come from.
Potentially served as an app pulling data from the pdfs and linking members to the places in the documents
Members might also provide details of the case and have related policies flagged to them to consider.

Value / Purpose

We have tested open source methods of extracting text and other mediums from data. Due to the size and complexity of the documents (the documents are very long and may include a wide range of diagrams etc).

Impact:
Members more quickly able to access the right advice needed to progress the case they are working on and get through the parole board faster without decreasing quality of decisions. There are people of varied abilities and this additional tolling may provide extra support in their work.

User Types

Data scientists in CJS & Capability Data Science

julialawrence · 2024-07-23T08:05:24Z

Hiya, unless you need some kind of additional support in implementing Bedrock in your usecase, Bedrock can now be requested via a support ticket. When you open a ticket please list users and/or apps that need that access and region you want to use it in. A support request can be opened here: https://github.com/ministryofjustice/data-platform-support/issues

RolakeO-mojo · 2024-07-29T06:24:27Z

Hiya, unless you need some kind of additional support in implementing Bedrock in your usecase, Bedrock can now be requested via a support ticket. When you open a ticket please list users and/or apps that need that access and region you want to use it in. A support request can be opened here: https://github.com/ministryofjustice/data-platform-support/issues

Thanks for the response, would it be the same method of request for Amazon Textract?

RolakeO-mojo · 2024-08-06T13:14:49Z

@julialawrence can we get the access to Textract and implement Bedrock seperately?

julialawrence · 2024-08-12T07:22:27Z

Apologies, I missed your question.

We don't currently offer textract which is why we don't provide it via a support request.

We have not had a chance to assess this request yet but are happy for you to request bedrock via our support process.

RolakeO-mojo · 2024-08-27T09:18:54Z

Thank you for your response @julialawrence , we will investigate other methods instead of Textract and request bedrock via the support process once we need to use it, I think this FR can be closed now.

simon-pope · 2024-09-20T10:21:59Z

@RolakeO-mojo Thanks for the update, I will close this Feature Request.

K1Br added the feature-request label Jul 22, 2024

github-project-automation bot added this to Analytical Platform Jul 22, 2024

github-project-automation bot moved this to 👀 TODO in Analytical Platform Jul 22, 2024

github-actions bot mentioned this issue Aug 1, 2024

Monthly issue metrics report #4823

Closed

simon-pope moved this from 👀 TODO to 🎉 Done in Analytical Platform Sep 20, 2024

simon-pope closed this as completed by moving to 🎉 Done in Analytical Platform Sep 20, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

✨ Access to Bedrock and Textract for CJS & Capability Data Science Teams #4740

✨ Access to Bedrock and Textract for CJS & Capability Data Science Teams #4740

K1Br commented Jul 22, 2024

julialawrence commented Jul 23, 2024 •

edited

Loading

RolakeO-mojo commented Jul 29, 2024

RolakeO-mojo commented Aug 6, 2024

julialawrence commented Aug 12, 2024

RolakeO-mojo commented Aug 27, 2024

simon-pope commented Sep 20, 2024

✨ Access to Bedrock and Textract for CJS & Capability Data Science Teams #4740

✨ Access to Bedrock and Textract for CJS & Capability Data Science Teams #4740

Comments

K1Br commented Jul 22, 2024

Describe the feature request.

Describe the context.

Value / Purpose

User Types

julialawrence commented Jul 23, 2024 • edited Loading

RolakeO-mojo commented Jul 29, 2024

RolakeO-mojo commented Aug 6, 2024

julialawrence commented Aug 12, 2024

RolakeO-mojo commented Aug 27, 2024

simon-pope commented Sep 20, 2024

julialawrence commented Jul 23, 2024 •

edited

Loading